What is a data scientist?
Over the last few years you might have heard about a new job title that has grown in conjunction with big data and big data analytics: ‘data scientist’.
The role isn’t brand new. Back in October 2012, Harvard Business Review labelled the data scientist “the sexiest job of the 21st century”, an accolade that certainly makes the job sound more appealing than most other modern jobs, especially in the big data field. Yet little is really known about data scientists outside of the companies that are utilising this skillset.
So what exactly are data scientists and what do they do?
First, it should be noted that data scientists are not tied exclusively to big data projects. The role is largely associated with big data because it suits the field so well, given the increased variety and scale of the data being examined.
The data scientist role is essentially an evolution of the data analyst role, a position that has been commonplace in many large companies for some time now. Generally the formal training for a data scientist is the same as that of a data analyst: typically a background in computer science, statistics, analytics, modelling and mathematics.
But there are distinct differences between the more traditional data analyst and data scientist roles. Data analysts usually concentrate on smaller or more specific sets of data, often collected via a single third-party CRM (customer relationship management) system. They tend to have a set goal: determining where a company has been and what it has achieved in the past, and where it is going and what it is achieving right now.
Data scientists, however, have a deeper connection with the systems they are using. Many data scientists will be involved in some way in the design and development of the systems that they use to collect and process the data (note: this does not necessarily mean the data scientist is programming or developing the software, but they will have some input into how it works and how information is delivered to the end-user).
A key distinction is that data scientists will not view data from a single source. They will likely look at data collected across multiple, unrelated sources and use their skillset to spot trends and problems that could affect their company.
IBM’s Anjul Bhambhri, vice president of big data products, has described the data scientist role as being “part analyst, part artist”. A data scientist does not simply collect and report on data, but looks at it from multiple angles to determine what it means and then recommends ways to apply it.
The data scientist’s role is to sift through all incoming data and discover any previously hidden insights that can address a business problem and/or gain the company a competitive advantage.
Why are data scientists needed?
As mentioned, the data scientist is seen as an evolution of the data analyst role, and evolution occurs naturally; it is not something that can be forced. The same is true of data: as our technology evolved, so did our use of it and, in turn, so did the data. Where once we would store files solely on our home PCs, everyday users are now creating, storing and sharing information online, which has led to the evolution of big data.
To fully understand and make proper use of big data, the data analyst role has evolved to match the variety and volume of data that is now being collected, becoming the data scientist role.
For many SMEs, data scientists are not really a necessity for the moment (that’s not to say that things won’t change down the line), but for larger enterprises dealing with vast amounts of data, a data scientist can become one of the most valuable members of the organisation.
Larger companies need to do a lot with their data. They need to gather, collate and store vast amounts of information, and often clean, analyse, visualise and share that information. It’s the data scientists who do this; they add value to raw information by turning data into insights and products. They help companies to make data-driven decisions and (an important distinction from most data analysts) know when to make data-driven decisions for the benefit of the company.
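To make the idea of gathering, collating, cleaning and analysing data from unrelated sources a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two data sources, the customer names, the figures, the helper functions and the ticket threshold are all invented, and a real data scientist’s tooling would be far more involved.

```python
from collections import defaultdict

# Hypothetical records from two unrelated sources (all values invented).
crm_orders = [
    {"customer": "Acme ", "month": "2024-01", "spend": "1200"},
    {"customer": "acme", "month": "2024-02", "spend": "900"},
    {"customer": "Birch", "month": "2024-01", "spend": ""},  # missing value
    {"customer": "Birch", "month": "2024-02", "spend": "450"},
]
support_tickets = [
    {"customer": "ACME", "month": "2024-02", "tickets": 7},
    {"customer": "birch", "month": "2024-02", "tickets": 1},
]


def clean_name(name):
    """Normalise customer names so the two sources can be matched."""
    return name.strip().lower()


def collate(orders, tickets):
    """Clean both sources and merge them on (customer, month)."""
    merged = defaultdict(lambda: {"spend": None, "tickets": 0})
    for row in orders:
        if row["spend"] == "":
            continue  # cleaning step: skip rows with no spend recorded
        key = (clean_name(row["customer"]), row["month"])
        merged[key]["spend"] = float(row["spend"])
    for row in tickets:
        key = (clean_name(row["customer"]), row["month"])
        merged[key]["tickets"] += row["tickets"]
    return dict(merged)


def flag_at_risk(merged, ticket_threshold=5):
    """Return customers with an unusually high ticket count in any month."""
    return sorted(
        {customer for (customer, _month), values in merged.items()
         if values["tickets"] >= ticket_threshold}
    )


merged = collate(crm_orders, support_tickets)
print(flag_at_risk(merged))  # → ['acme']
```

Neither source on its own shows that a customer whose spend is falling is also raising a lot of support tickets. Combining the two is what surfaces the potential problem, which is the kind of insight the role is expected to deliver.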