Twitter is listening: data mining, Gnip and the fabled ‘firehose’


Twitter is currently making a ton of money by selling access to your tweets. Now you might be wondering, “Why would someone pay to read my tweets?” I mean, Twitter is free, right? You just need an account and presto, you have access.

The answer is a broad one that I will try to explain over the course of this article. Let me start by summing it up in three words…

Information is power

The most valuable thing the micro-blogging platform offers is information and insight into a group’s social psyche. More than 230 million users are active on Twitter each month. Every day they provide snippets of personal information and, at no cost at all, anyone can gain access to this vast, sprawling network of data.

This information is not only easily accessible but also easily digestible thanks to the character limit imposed on tweets, which makes it an extremely powerful research tool. Within those 140 characters lies deep insight into the individual who wrote them.

But in truth, as a standard user, you can only gain access to the most recent 3 200 tweets on a given account’s timeline. By using Twitter’s search API you can push that number up to about 5 000 tweets. That’s not a very large number considering that, on average, over 350 000 tweets are posted every minute.
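
To put those limits in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above and assumes, purely for the sake of the arithmetic, that the per-minute average holds steady around the clock. The function name is made up for illustration.

```python
# Figures quoted in the article, treated here as steady averages (an assumption).
TWEETS_PER_MINUTE = 350_000
TIMELINE_LIMIT = 3_200   # tweets a standard user can reach
SEARCH_LIMIT = 5_000     # approximate reach via the search API


def seconds_of_traffic(tweet_count, tweets_per_minute=TWEETS_PER_MINUTE):
    """Return how many seconds of global traffic `tweet_count` tweets represent."""
    tweets_per_second = tweets_per_minute / 60
    return tweet_count / tweets_per_second


timeline_seconds = seconds_of_traffic(TIMELINE_LIMIT)
search_seconds = seconds_of_traffic(SEARCH_LIMIT)
tweets_per_day = TWEETS_PER_MINUTE * 60 * 24

print(f"{TIMELINE_LIMIT} tweets cover about {timeline_seconds:.2f} s of traffic")
print(f"{SEARCH_LIMIT} tweets cover about {search_seconds:.2f} s of traffic")
print(f"Tweets per day at this rate: {tweets_per_day:,}")
```

Under these assumptions, even the 5 000 tweets reachable through the search API amount to less than one second of global activity, while a single day produces roughly 504 million tweets.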

To sift through the entire history of tweets you need access to something known as Twitter’s “firehose”. And this is what the entities in question are paying for: unlimited access to every tweet that has ever been made… in real time.

The nature of these entities and their intentions vary greatly, as the data they are harvesting covers any subject you can think of. It is important to note that they are focused on collecting data about groups rather than specific individuals. To give a few examples:

  • A sports equipment manufacturer looked at which athletes were popular on Twitter so that it could decide whose jerseys to produce and sell.
  • A music entertainment firm examined where Justin Bieber was popular so that it could plan where to stop on his tour in Turkey.
  • Academics are using this data to see if they can predict the outcome of sports matches and even elections.
  • Police forces are using it to identify criminal threats so that they can take more informed preventive measures, kind of like a realistic version of the movie Minority Report.
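
The group-level focus in these examples can be sketched with a toy keyword count per city, loosely modelled on the tour-planning case. Everything here is hypothetical: the sample records are invented for illustration, and the field names (`city`, `text`) and the function name are assumptions, not the format of Twitter’s or Gnip’s data.

```python
from collections import Counter

# Hypothetical sample records, invented purely for illustration.
sample_tweets = [
    {"city": "Istanbul", "text": "Can't wait for the Bieber concert!"},
    {"city": "Ankara", "text": "Traffic is terrible today"},
    {"city": "Istanbul", "text": "bieber tickets sold out already?"},
    {"city": "Izmir", "text": "Justin Bieber on the radio again"},
    {"city": "Ankara", "text": "New Bieber single is out"},
    {"city": "Istanbul", "text": "Lovely weather by the Bosphorus"},
]


def mentions_by_city(tweets, keyword):
    """Count tweets per city whose text contains `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    counts = Counter()
    for tweet in tweets:
        if keyword in tweet["text"].lower():
            counts[tweet["city"]] += 1
    return counts


# Rank cities by number of matching tweets, most mentions first.
ranking = mentions_by_city(sample_tweets, "bieber").most_common()
print(ranking)
```

Note that the result says nothing about any single user. It only shows where a topic is being talked about most, which is the kind of aggregate picture the examples above describe.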

So how do you gain access to Twitter’s “firehose”?

Access to this firehose is gained via third-party analytics services such as DataSift and Gnip, the latter of which was acquired by Twitter for US$130 million in April 2014. Access to their services, though, is not something that the average person would be able to afford.


But it is also not something that the average person would really need. The companies that are using this service are paying huge sums of money to receive incredibly in-depth insight into very specific information that fulfils very precise criteria.

“Twitter gives this fascinating ability to understand people in context like we’ve never been able to before”, says Chris Moody, former CEO of Gnip and current Data Strategy Chief at Twitter.
‘It’s not “I know that Chris Moody is a 48-year-old male” – which is how we’ve thought about marketing in the past – but “I understand that Chris Moody is dealing with the death of a parent because he’s talking about it on this public platform”.’

Twitter isn’t the only platform that provides “data mining” services but it does offer unique information in comparison. Where Facebook and Google give insight into individuals, Twitter’s data gives insight into the collective consciousness of a large social group.

For many users this raises uncomfortable privacy issues. But Moody insists that Twitter’s transparency sets it apart from the controversial data collecting practices of Facebook and Google:

“From our perspective, the vast majority of our data is public and we are very clear about what’s public and what’s not. Tweets are public. Direct messages between your friends are not. Other platforms are not always clear on that.”


Moody also says that “we did the world a giant disservice in the 1950s when we introduced call centres. Big companies thought about customer service as an operational expense, reducing the cost of interacting with customers. Twitter is about brand building, and brand value often gets measured in billions of dollars for some brands, it’s not about making a call cost 50 cents less”.

It seems Twitter and its enthusiastic advocate, Chris Moody, have great plans for the future. But is it all as innocent as it seems?

With such great power comes great responsibility. Data collection can have an immensely positive impact on our society, but it can have just as much of a negative one.

Let us know your thoughts in the comments below…

Wiehahn Diederichs
