Big Data and Data Science
1. What is Big Data?
The 2016 CIO survey by Gartner highlights that Business Intelligence and Data Analytics remains the number one IT investment priority for the second year in a row. The last decade has seen an explosion in the scale of data, the speed at which it is both produced and consumed, and its variety. These are commonly known as the 3Vs (Volume, Velocity and Variety) and are the defining characteristics of Big Data.
The other core characteristic is that it is usually made up of both structured and unstructured data. The unstructured data is generated from sources such as social media, emails or newsfeeds, which don’t have a defined data model and are not pre-organised.
Big Data is defined as having large Volume, Velocity and Variety and is made up of structured and unstructured data.
The value comes not from the sheer volume and variety of the data itself, but from the insights that it is able to provide organisations, whether that is in support of better and more timely decision making, optimising retail services, personalising customer experiences or reducing costs.
2. What’s a Data Scientist?
A Data Scientist is someone who makes sense of all of this data and provides people with insights into what can be done with it and its value. A Data Scientist is smart and has a breadth of abilities: academic curiosity, storytelling and software engineering experience. But, most importantly, they have deep expertise in Statistics and Machine Learning. They often have a background in mathematics and scientific research.
A Data Scientist starts with an opportunity, then takes raw data, transforms it, cleans it, filters it, mines it, visualises it and then validates it. It’s a continual cycle in which more and more data sets can be added and models refined.
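The cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (`income`, `missed_payments`) and the function names are hypothetical, and the visualisation step is left out to keep the sketch short.

```python
from statistics import mean

# Hypothetical raw records, as they might arrive from different sources.
RAW = [
    {"customer": "a", "income": "42000", "missed_payments": "0"},
    {"customer": "b", "income": "", "missed_payments": "2"},    # missing income
    {"customer": "c", "income": "31000", "missed_payments": "3"},
    {"customer": "d", "income": "-5", "missed_payments": "1"},  # implausible income
    {"customer": "e", "income": "58000", "missed_payments": "0"},
]


def to_int(text):
    """Parse a text field as an integer; return None if it cannot be parsed."""
    try:
        return int(text)
    except ValueError:
        return None


def transform(records):
    """Convert the text fields into numbers."""
    return [
        {
            "customer": r["customer"],
            "income": to_int(r["income"]),
            "missed_payments": to_int(r["missed_payments"]),
        }
        for r in records
    ]


def clean(records):
    """Drop records that have any missing value."""
    return [r for r in records if None not in r.values()]


def filter_plausible(records):
    """Keep only records with a positive income."""
    return [r for r in records if r["income"] > 0]


def mine(records):
    """Compare average income for customers with and without missed payments."""
    missed = [r["income"] for r in records if r["missed_payments"] > 0]
    no_missed = [r["income"] for r in records if r["missed_payments"] == 0]
    return {
        "avg_income_missed": mean(missed),
        "avg_income_no_missed": mean(no_missed),
    }


def validate(records, minimum_records=3):
    """A crude check that enough data survived to support the summary."""
    return len(records) >= minimum_records


prepared = filter_plausible(clean(transform(RAW)))
summary = mine(prepared)
is_valid = validate(prepared)
print(summary, is_valid)
```

In practice each step would be repeated as new data sets are added, which is why the text describes a cycle and not a one-off process.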
Statistical and Machine Learning is the process of acquiring data from different sources, creating a model, optimising its accuracy, validating its purpose and confirming the significance of the insights. Machine Learning is an Artificial Intelligence technique that uses a training dataset to build a software model that can predict values of target variables, for example, future sales volumes or the likelihood that someone will default on a loan.
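To make the idea of building a model from a training dataset concrete, here is a minimal sketch that fits a logistic regression by gradient descent and uses it to estimate the likelihood of a loan default. Everything in it is an assumption made for illustration: the tiny dataset is invented, the two features (missed payments and debt-to-income ratio) are hypothetical, and logistic regression is just one simple technique among many.

```python
import math

# Hypothetical training set:
# (missed payments in the last year, debt-to-income ratio) -> defaulted (1) or not (0)
TRAINING = [
    ((0, 0.10), 0), ((0, 0.25), 0), ((1, 0.20), 0), ((1, 0.30), 0),
    ((2, 0.55), 1), ((3, 0.45), 1), ((3, 0.60), 1), ((4, 0.70), 1),
]


def sigmoid(z):
    """Squash any real number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the model's estimated probability of default."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(data, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(TRAINING)

# Score two applicants the model has not seen before.
low_risk = predict(weights, bias, (0, 0.15))
high_risk = predict(weights, bias, (4, 0.65))
print(f"low-risk applicant:  {low_risk:.3f}")
print(f"high-risk applicant: {high_risk:.3f}")
```

A real project would also hold back part of the data to validate the model, as the process described above requires; that step is omitted here for brevity.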
3. How does Data Science differ from traditional Analytics?
Financial Services organisations have been using data analytics for decades. What has changed over the last few years is the volume, variety and velocity of the data available, and the technology available to support new ways to analyse and make sense of it.
Data Scientists understand how to undertake analytics, but they also have deep software engineering skills coupled with great storytelling. They are able to take previously unusable data sets, like social media and email, and use them to refine and improve business insights and predictions.
Data Science provides better insights, more quickly.
Whilst traditional analytics focuses on structured data sets, running static models on silos of data, usually on expensive hardware, Big Data and Data Science focus on a wide range of data sets (usually held in a Data Lake) and typically use Open Source technologies on cloud-based platforms.
4. Data Science in Financial Services
Financial services companies have long used data and analytics in their day-to-day operations, but now many are turning to Big Data and Data Science to gain competitive advantage. Credit scoring is a great example of where Big Data and Data Science are disrupting a long-established market.
The traditional approach to risk-scoring relies on a person’s credit history, as distilled in FICO scores. Tech start-ups, however, use alternative data sources such as social network profiles, bill payment histories, public records, online communications, even how applicants fill out forms on the web.
Affirm is one such start-up. It is a lending company that provides instalment financing, aimed squarely at Millennials and people who don’t have an established credit rating. A customer enters a minimal amount of personal data in a smartphone app and Affirm automatically sends a text to authenticate them. Once the customer is confirmed, the Affirm service runs its algorithm using non-traditional data sources and makes an instant decision, all via the customer’s smartphone and in real time. Affirm can offer services and products to a wider customer base than traditional lenders at a much lower cost to serve. It has recently raised $325M in debt and equity finance for expansion.
Tech Start-ups are using Big Data and Data Science to disrupt lending markets.
A similar example is ZestFinance, which uses Big Data and Data Science to make lending decisions. It uses specific data points on a person’s credit score plus unstructured social media data and mobile phone information. This approach delivers a 40% improvement in default rates over current industry scoring methods.