What is big data? Part 2: Why we need big data tools
Welcome to the second post in our series: ‘What is big data?’. In this post we look at why big data needs to be handled differently from the data that we might deal with on a daily basis.
We also look at how big data has been used in the real world through two very different examples, one in a major political campaign and the other in medical research that helps save lives, and see what both have in common when it comes to gaining the edge with big data.
Why do we need big data analytics tools?
Using a search engine is a great way of finding information. However, rather than scouring the Internet for live data, most search engines (in a nutshell) work from relatively ‘older’ data, and the sites that rank higher in the results tend to be the ones that receive the most traffic and happen to match the keywords used in the search.
This is perfect for the average user. But, for example, what if a person or company needed to know what is being said right then, at that moment in time, about them on social media?
A search in Google, Bing or Yahoo wouldn’t bring back a social media result from that minute or even that hour, unless it happened to have received a very high volume of likes, shares and comments. So to keep track of what is being said, that company would effectively have to sit and watch the social media channels manually, so that as and when a comment comes in, someone is there to see it and (if need be) respond.
But while the company could potentially sit there looking at Twitter and Facebook all day, they are most definitely not going to be able to monitor every forum or blog post happening around the world as it would take up too much personnel time and (therefore) money.
And this is where the concept of big data management comes in. Big data management tools are able to monitor multiple sources from around the Internet and pick up on information as and when it comes in and return those results instantly.
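To make the idea more concrete, here is a minimal Python sketch of what ‘monitoring multiple sources’ can look like at a very small scale. It is not how any particular big data product works: the `Post` record, the `merge_mentions` function, the sample feeds and the ‘Acme’ brand name are all made up for illustration, and the only assumption is that each source delivers its posts in time order.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Post:
    """A single post from any source (hypothetical record layout)."""
    timestamp: int  # seconds since an arbitrary starting point
    source: str
    text: str


def merge_mentions(feeds: Iterable[Iterable[Post]], keyword: str) -> Iterator[Post]:
    """Merge time-ordered feeds into one stream and keep posts mentioning the keyword."""
    # heapq.merge lazily interleaves already-sorted inputs without loading them all at once.
    merged = heapq.merge(*feeds, key=lambda post: post.timestamp)
    needle = keyword.lower()
    for post in merged:
        if needle in post.text.lower():
            yield post


# Made-up sample feeds; each one is already sorted by timestamp.
twitter = [Post(10, "twitter", "Loving the new Acme phone"), Post(40, "twitter", "Lunch time")]
forum = [Post(25, "forum", "Has anyone had trouble with ACME support?")]
blog = [Post(5, "blog", "Ten gardening tips"), Post(30, "blog", "Acme review: first impressions")]

mentions = list(merge_mentions([twitter, forum, blog], "acme"))
for post in mentions:
    print(post.timestamp, post.source, post.text)
```

The point of the sketch is that the person watching no longer has to look at each channel separately: every source feeds one time-ordered stream, and only the relevant posts come out the other end.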
There’s a world full of important data that you are missing
Using standard search engines, users tend to cherry pick the results they want to look at. This is partly because the top search results tend to be the most relevant to the search terms entered, but also because nobody really has time to read every single result that the search engine comes back with.
But here’s the crux of the situation: users tend to be cherry picking results that have already been ‘cherry picked’ by the search engine itself. Yes, you may find what you are looking for in the results delivered, but given the lack of time and patience, it is doubtful that anyone is delving through and examining every single source; they will just look at the most popular results. And from that, one could argue that users are missing out on the bigger picture.
You could compare this to looking for fruit and vegetables at a supermarket. Supermarkets gather their fresh produce from multiple farms across the world and, knowing that customers are fussy, they will only display the best looking produce on their shelves to entice customers.
But customers will then go and choose what they deem to be the best looking out of the available selection. They are ‘cherry picking’ food that has already been ‘cherry picked’ for them and they are not seeing the full range of fruit that’s possibly available in the world, just a small fraction – which is exactly what is happening when people use search engines like Google, Bing or Yahoo to find information.
Real world examples of big data applications
Don’t be fooled into thinking that big data is just for big companies or corporations to use to increase profits. When used correctly, big data has the potential to give you the edge in almost any application.
How Obama gained the edge with big data
In 2012 Barack Obama is said to have utilised big data to help win his re-election campaign against his Republican rival, Mitt Romney. His team set out to understand what the voters wanted from a candidate and how best to accommodate those needs in order to gain votes.
While the Republicans were only just starting to use social media to help drive their campaign, Barack Obama already had a strong following through social media and emails from his original 2008 election campaign, and pushed forward using big data to help gain the edge.
By taking the data the Democrats had acquired in the 2008 election, Obama’s team was able to analyse it and break it down into differing target audiences – not just by ethnicity or national origin, but by individual lifestyles: people with families, those with children attending public or private schools, and whether or not they were environmentally conscious. The level of detail was very high.
And one key element of this strategy was that Obama’s campaign team wasn’t just gathering all this data; they were able to process it and act on it immediately.
In doing this, Obama’s campaign was able to send out messages tailored to the various groups of voters, rather than sending one message, or many mixed messages, to all of them at once. All of this was made possible by using big data management and analytics to his advantage.
Big data analytics helps save lives
In 2008 the technology and consulting company, IBM, and the University of Ontario Institute of Technology (UOIT) announced a research project to help doctors detect subtle changes in the conditions of critically ill, premature babies, by using big data analytics.
Led by Dr. Carolyn McGregor, the project utilised IBM’s InfoSphere Streams technology – an advanced analytic platform that is able to collect, analyse and correlate information from thousands of sources.
Prior to this research, Dr. McGregor is said to have highlighted an issue surrounding the way that premature babies were being monitored in intensive care units.
The premature babies were constantly being monitored, but due to the busy nature of hospitals, their vitals were only being recorded once an hour and the results were used to represent a trend for that hour, with any notable changes looked into once a pattern was seen.
However, the average baby is said to breathe more than 2,000 times an hour and its heart to beat 7,000 times an hour, while its blood oxygen levels generate 3,600 readings per hour.
This meant that any subtle changes to an infant’s condition may not have been picked up on until it was too late; a life-changing or potentially fatal condition could have worsened by the time a doctor realised the symptoms were there.
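A small, made-up Python example shows how much an hourly recording can miss. It takes the figure of 3,600 blood oxygen readings per hour (one per second) from above; everything else, including the reading values, the five-minute dip and the threshold of 90, is invented purely for illustration and is not clinical guidance.

```python
READINGS_PER_HOUR = 3600  # one blood oxygen reading per second, as quoted above

# Hypothetical values, chosen only to illustrate the point.
NORMAL_LEVEL = 97
LOW_LEVEL = 85
THRESHOLD = 90

# Build one hour of readings with a five-minute dip starting 20 minutes in.
readings = [NORMAL_LEVEL] * READINGS_PER_HOUR
for second in range(1200, 1500):
    readings[second] = LOW_LEVEL

# What a once-an-hour recording sees: the latest value, or at best an hourly average.
hourly_snapshot = readings[-1]
hourly_mean = sum(readings) / len(readings)

# What continuous monitoring sees: every second spent below the threshold.
low_seconds = [second for second, value in enumerate(readings) if value < THRESHOLD]

print("Hourly snapshot:", hourly_snapshot)
print("Hourly mean:", hourly_mean)
print("Seconds below threshold:", len(low_seconds))
print("Dip first seen at second:", low_seconds[0])
```

In this invented hour, both the snapshot and the average sit comfortably above the threshold, yet the baby spent five minutes below it. Only the second-by-second view reveals the dip, and it does so 40 minutes before the hourly recording is even taken.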
The use of IBM InfoSphere Streams allowed Dr. McGregor to constantly monitor the conditions of the premature babies, with the data being continually stored.
Able to interface with the hospital’s clinical systems, the InfoSphere environment could monitor and analyse multiple streams in real time, draw information from a disparate range of formats, provide it in the format the doctors needed and generate alerts should the system note a change in a baby’s condition.
This created opportunities for much earlier proactive intervention and for predicting the onset of illness in premature babies, and this use of big data is now more commonplace in neonatal ICUs.
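As a rough illustration of the general idea (and emphatically not of IBM InfoSphere Streams itself, whose programming model is not shown here), the following Python sketch watches several interleaved streams of readings and raises an alert when the recent average of a signal drops below a limit. The `Reading` and `Alert` records, the `watch` function, the limits and the three-reading window are all assumptions made for this example.

```python
from collections import deque
from typing import Iterable, Iterator, NamedTuple


class Reading(NamedTuple):
    second: int
    signal: str
    value: float


class Alert(NamedTuple):
    second: int
    signal: str
    average: float


# Hypothetical lower limits per signal; not clinical guidance.
LOWER_LIMITS = {"heart_rate": 100.0, "spo2": 90.0}
WINDOW = 3  # number of recent readings averaged per signal (an assumption)


def watch(readings: Iterable[Reading]) -> Iterator[Alert]:
    """Yield an alert whenever a signal's rolling average falls below its limit."""
    windows = {signal: deque(maxlen=WINDOW) for signal in LOWER_LIMITS}
    for reading in readings:
        window = windows.get(reading.signal)
        if window is None:
            continue  # ignore signals we have no limit for
        window.append(reading.value)
        if len(window) == WINDOW:
            average = sum(window) / WINDOW
            if average < LOWER_LIMITS[reading.signal]:
                yield Alert(reading.second, reading.signal, average)


# Made-up interleaved readings from two monitors.
stream = [
    Reading(0, "spo2", 96), Reading(0, "heart_rate", 140),
    Reading(1, "spo2", 92), Reading(1, "heart_rate", 138),
    Reading(2, "spo2", 88), Reading(2, "heart_rate", 139),
    Reading(3, "spo2", 86), Reading(3, "heart_rate", 137),
    Reading(4, "spo2", 85), Reading(4, "heart_rate", 141),
]

alerts = list(watch(stream))
for alert in alerts:
    print(f"second {alert.second}: {alert.signal} average {alert.average:.1f} below limit")
```

Averaging over a short window rather than reacting to a single reading is a deliberate choice in this sketch: it means one stray value does not trigger an alert, while a sustained decline does. Because `watch` is a generator, it handles readings one at a time as they arrive instead of waiting for a complete batch.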
It should be noted, though, that simply having or using a big data analytics tool will not automatically give you an advantage over competitors or help improve your business; you need one key element: a goal.
Without a defined objective, big data is still an unstructured mess of information. It’s having a target to work towards that allows users to exploit the results to their advantage.
That’s exactly what Obama’s team had: a goal. They wanted to determine who was backing him and who wasn’t, what could be done to cement the opinions of those in favour of him, and how he could change the minds of those who didn’t want him in for a second term as President.
In our next post we offer an overview of how big data analytics tools can give you more of a competitive edge than standard search engine results and summarise this introduction to big data.