In the 1950s, science-fiction writer Isaac Asimov introduced the strange fictional science of psychohistory in his Foundation series. The story is set in the far future and revolves around a mathematician who practises this intriguing blend of social science and statistical mathematics. By analysing vast amounts of data with the right formulas and algorithms, his characters can predict the future. The larger the dataset, whether a bigger group of people or a longer span of time, the more accurate the predictions. Like a gas, the analogy goes: the more molecules there are, the more visible the cloud becomes. As with science in the real world, it’s about including and excluding different factors so that the experiment fits the hypotheses being tested.
Today’s Big Data
Organizations are struggling to hold on to and analyse the flood of data created each day. According to IBM, 2.5 quintillion bytes of data have been created every day since 2012, and 90% of the world’s data was created in the last two years alone. That daily 2.5 quintillion bytes works out to roughly 78 million 32GB iPads. This is changing the ways in which data can be applied to science, whether social, business or medical. Sure, most of the data consists of social media’s dinner posts and holiday photos, but it still holds the potential for interesting innovations, especially in marketing, where industries currently thrive on it. This year, though, we should hope to see more innovation in the fields of humanitarian and environmental aid rather than the marketing ponds.
Big Data vs. psychohistory
In the Foundation series, big events such as the death of the Emperor are predicted using the data of an entire Galactic Empire comprising quadrillions of humans. Today, however, Big Data is mostly used for predicting, adapting to and influencing consumer behaviour, not the fall of Galactic Emperors.
So when does prediction using Big Data become a science? Take the example of a person recording his daily routines in order to predict, and hopefully prevent, illness. There are too many potentially relevant ‘outside’ factors that could determine the causes of a disease: geography, demography, genetics and so on. But say you feed your family’s history of illness into the doctor’s formula as well, and the longer that record stretches back, the better. The prediction becomes more accurate, and you have a better chance of foreseeing illnesses you otherwise would have missed.
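The intuition that a longer record yields a better prediction can be sketched in a few lines of Python. This is a toy simulation, not a medical model: the 12% ‘true risk’ and the simulated records are invented, and the only point is that an estimate drawn from more samples lands closer to the underlying rate.

```python
import random

random.seed(42)

TRUE_RISK = 0.12  # hypothetical underlying chance of the illness (invented)

def estimated_risk(n_records: int) -> float:
    """Estimate the risk from n simulated family-history records."""
    cases = sum(random.random() < TRUE_RISK for _ in range(n_records))
    return cases / n_records

# The longer the record, the closer the estimate tends to sit to the truth.
for n in (10, 100, 10_000):
    est = estimated_risk(n)
    print(f"{n:>6} records -> estimated risk {est:.3f} (true {TRUE_RISK})")
```

With ten records the estimate can easily be off by half; with ten thousand it rarely strays far, which is the same law-of-large-numbers effect Asimov’s gas-cloud analogy gestures at.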
A company such as Walmart uses Big Data to lean on the economic law of supply and demand. The more data it curates and processes into something relevant, the smaller the ceteris paribus (‘all other things being equal’) x-factor in its equation becomes, and the more accurately it can adjust its supplies. In other words, to make perfect predictions about the future we would want a perfect reconstruction of our world in which all factors are known. Then the ceteris paribus factor would become moot, and the field of economics could justly be called a science.
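The shrinking ceteris paribus factor can be made concrete with a minimal sketch, in which every number is invented: simulated demand depends on price and weather, and merely measuring one extra factor (the weather) leaves less unexplained error in the forecast.

```python
import random
import statistics

random.seed(1)

# Toy data: units sold depend on price and weather, plus random noise.
rows = []
for _ in range(1000):
    price = random.uniform(1.0, 5.0)
    sunny = random.choice([0, 1])            # an 'outside' factor
    demand = 100 - 10 * price + 15 * sunny + random.gauss(0, 3)
    rows.append((price, sunny, demand))

def rmse(residuals):
    """Root-mean-square forecast error."""
    return (sum(r * r for r in residuals) / len(residuals)) ** 0.5

# Model 1: know nothing, predict the overall average demand.
overall = statistics.mean(d for _, _, d in rows)
err_naive = rmse([d - overall for _, _, d in rows])

# Model 2: one more factor measured, so condition on the weather.
group_mean = {
    s: statistics.mean(d for _, sunny, d in rows if sunny == s)
    for s in (0, 1)
}
err_informed = rmse([d - group_mean[s] for _, s, d in rows])

print(f"naive RMSE:    {err_naive:.1f}")
print(f"informed RMSE: {err_informed:.1f}")  # smaller: less is left to 'ceteris paribus'
```

Each factor moved from the unknown ‘all else’ into the model shaves error off the forecast; a perfect reconstruction of all factors would, in this toy world, leave only the irreducible noise.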
Comparative examples: Walmart, FutureMed and the Oscars
The US retail giant Walmart was, as InformationWeek put it, “Big Data before Big Data was cool”. Since 2011, when the biggest retailer in the US launched its social media startup SocialGenome, “[the company] can even filter [this] data to understand location-based preferences and hold inventory that is preferred in certain locations. Monitoring social media can even help Walmart create or stock products that are in demand.” Simply put: Big Data being used to supply products that adequately meet the demands of millions, using predictions based on streaming data.
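A rough illustration of that idea, assuming a hypothetical stream of (location, product) mentions; this is not Walmart’s actual pipeline, just a sketch of tallying location-based demand signals from social chatter.

```python
from collections import Counter, defaultdict

# Hypothetical stream of (location, product) mentions from social media.
mentions = [
    ("austin", "fans"), ("austin", "fans"), ("boston", "shovels"),
    ("austin", "sunscreen"), ("boston", "shovels"), ("boston", "fans"),
]

# Tally the demand signals per store location.
demand = defaultdict(Counter)
for location, product in mentions:
    demand[location][product] += 1

# Stock each location with whatever its customers talk about most.
for location, counts in sorted(demand.items()):
    product, hits = counts.most_common(1)[0]
    print(f"{location}: prioritise '{product}' ({hits} mentions)")
```

In a real system the stream would be continuous and the counts decayed over time, but the core move is the same: filter the chatter by location, then let the tallies drive inventory.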
This method of making predictions could be taken further, though. The more data we can get our hands on and control, the more accurate the predictions will be. For example, if your daily consumption, exercise, blood pressure and so on could be made available to your doctor (or his computer server), he would be able to predict certain illnesses (or the chances thereof) by analysing your patterns and comparing them to the records of millions of other patients worldwide. This is exactly the kind of area that has become popular through FutureMed.
The conference focuses on technology and its developments in healthcare and, as discussed by The Economist, advocates the idea of the “quantified self”, where individuals “take ownership” of their lives. It is a strange concept: the more personal data you release to a third party, the more control you supposedly gain. The same goes for companies like Facebook, where the more active and personal we become, the more information (and in effect control) a third party potentially has.
Another area where Big Data has been used to make predictions is politics. The much-praised ‘rainmaker’ of data, Nate Silver, runs the popular FiveThirtyEight blog for the New York Times and used Big Data to predict the outcome of the 2012 US elections. Another company, Farsight, uses historical models, adjusting its data daily, to predict the outcome of this year’s Oscars. “The model [also] is informed by 40 years of Academy Awards history, such as the fact that only three movies have ever won Best Picture that didn’t also get at least a nomination in the Best Director category”, for example.
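Farsight’s actual model is not public, but a historical rule like the Best Director one can be folded into a simple weighted score. Everything below (the film names, the signals and the weights) is invented purely for illustration.

```python
# Hypothetical per-film signals; none of these are real contenders or data.
films = {
    "Film A": {"guild_wins": 3, "director_nom": True},
    "Film B": {"guild_wins": 4, "director_nom": False},
    "Film C": {"guild_wins": 1, "director_nom": True},
}

# Best Picture winners almost always carry a Best Director nomination,
# so that signal gets a heavy (invented) weight in the score.
DIRECTOR_WEIGHT = 5

def score(signals: dict) -> int:
    """Combine precursor-award signals into a single score."""
    bonus = DIRECTOR_WEIGHT if signals["director_nom"] else 0
    return signals["guild_wins"] + bonus

ranked = sorted(films, key=lambda f: score(films[f]), reverse=True)
print(ranked[0])  # Film B has more guild wins, but the director nod outweighs them
```

A real model would estimate such weights from the 40 years of Academy Awards history rather than hard-coding them, but the principle is the same: rare historical exceptions translate into strong priors.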
Today we are still too disconnected to supply, and too small to acquire, all the data that would help us accurately predict events such as the fall of someone as powerful as an Emperor. With the right software and the relevant skills, however, we do have the ability to lessen our chances of becoming ill, for example, or to work out who is going to be awarded the Oscar for Best Picture. But the more data is collected, the less private you become and the more control an outside party potentially has. Should a line be drawn when it comes to personal data and the “quantified self”?