Where did all this Big Data come from?

October 29, 2013

As an industry, we constantly talk about what we can gain from analyzing the enormous amounts of data now being collected, but where does it all come from?

Though the subject of Big Data has seeped into a significant number of today’s business conversations, many professionals are still working to wrap their heads around the concept and, more specifically, where all of this data is coming from. According to IBM, 90% of the world’s data has been created in the past two years, which suggests that the amount of Big Data is only going to keep growing.

While we have written about the three V’s of Big Data, deciding where to collect the data for your analysis warrants its very own post. The potential sources for the kind of data sets that require Big Data analysis are almost endless. Today we’ll discuss a few of the more popular collection points.

Social networking

With the growth of social networks, most notably Facebook, Twitter and Pinterest, the data mining opportunities for businesses keep growing. These first-hand accounts of personal sentiment about brands are a wealth of information when it comes to determining the motivations of your customers. Because one of the main aims of Big Data mining is to increase conversions, this type of intel can shed some light on the considerations that go into those decisions. Because so much of the content on these sites is public, social networking data is readily available to marketers. A few examples of the types of possible data:

  • Follower/following accounts
  • Interactions with posts (comments, likes, pins, retweets, etc.)
  • Sentiment analysis of posts and tweets
  • Geography of social media users

Review networks

Online review sites give businesses an unparalleled look into the way they are viewed by their consumers. These opinion-based sites are a windfall for companies that are looking to glean insights from data that can be easily segmented. Sites like Foursquare have the added benefit of attached geolocation data, helping marketers target their efforts by geography. Alternatively, TripAdvisor and Google Reviews (which are becoming more and more important as Google fully integrates them with Google+) can provide sentiment data through analysis of the tone and language of entries.
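To make "analyzing the tone and language of entries" a little more concrete, here is a minimal sketch of word-list sentiment scoring in Python. The word lists, the sample reviews and the `sentiment_score` helper are all made up for illustration; real sentiment tools use far larger lexicons or trained models.

```python
import re

# Tiny hand-picked word lists, purely illustrative (an assumption, not a real lexicon).
POSITIVE = {"great", "love", "friendly", "clean", "excellent"}
NEGATIVE = {"bad", "rude", "dirty", "slow", "terrible"}

def sentiment_score(text):
    """Return the number of positive words minus the number of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(1 for w in words if w in POSITIVE)
    negatives = sum(1 for w in words if w in NEGATIVE)
    return positives - negatives

# Hypothetical review snippets standing in for entries from a review site.
reviews = [
    "Great location and friendly staff, we love it",
    "Slow service and a dirty room",
    "It was fine",
]

scores = [sentiment_score(r) for r in reviews]
for review, score in zip(reviews, scores):
    print(score, review)
```

A score above zero suggests a favorable entry and a score below zero an unfavorable one. Simple counting like this misses negation and sarcasm, so treat it as a starting point only.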

Government data sources

While government data sources may seem like the boring ones at the Big Data party, they are the originals and still play a big part in a great number of analyses. Population, healthcare, and legal information, along with weather and disaster forecasts, are imperative when looking to apply Big Data research across the board. Luckily, several countries are embracing the free data mindset and making their data publicly available on the internet. In May 2013, American President Barack Obama signed an Open Data Executive Order to promote data sharing within the US.


Blogs

In the Big Data paradigm, blogs are a useful source of data due to their targeted audiences. Whether your company sells children's toys or consulting services to other businesses, there is most likely a blog out there with a strong audience of users who constantly interact with its posts. Using RSS technology, a staple on most blogs, data can be quickly and easily extracted, analyzed and refreshed to reflect the most recent site information, giving you the ability to gain real-time intel. The comments sections of popular blogs are another rich source of data, capturing the thoughts and opinions of interested visitors and potentially giving you insight into upcoming trends.
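As a rough illustration of how RSS data can be extracted, the sketch below parses a small RSS 2.0 snippet with Python's standard library. The feed contents, the example.com links and the `parse_feed` helper are invented for this example; in practice you would download the XML from a blog's feed URL first.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for a real blog feed (assumption).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Toy Blog</title>
    <item>
      <title>Top wooden toys</title>
      <link>http://example.com/wooden-toys</link>
      <pubDate>Mon, 28 Oct 2013 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Holiday gift guide</title>
      <link>http://example.com/gift-guide</link>
      <pubDate>Tue, 29 Oct 2013 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def parse_feed(xml_text):
    """Return one dict per <item> with its title, link and publication date."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return posts

posts = parse_feed(SAMPLE_FEED)
for post in posts:
    print(post["published"], post["title"], post["link"])
```

Running the same parsing step on a schedule is one simple way to keep the extracted data in line with the latest posts.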

Media outlets

Today's user-interaction-driven media websites resemble blogs as data sources: they generally have targeted audiences and active visitor feedback. Using various data mining techniques, marketers can collect information on the wide breadth of subject matter carried by newspaper and news magazine sites. The comment sections of these sites are brimming with insight into the views and opinions of the general public that could be relevant to marketers.

User interaction with your website

The easiest place to start collecting Big Data today is your company's own website. We have looked at the top three considerations when starting to wade into Big Data, which can help you begin amassing actionable intel from your first source of visitor information. Going beyond vanity metrics like pageviews and exploring more complex data points like conversion rates and visitor profiles should be your first step toward capitalizing on the Big Data revolution. Have you started looking at ways to collect Big Data? What have you found to be the most effective sources?
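For readers who want to try the conversion-rate idea on their own numbers, here is a minimal Python sketch. The visit records, the `source` and `converted` field names and both helper functions are hypothetical; your analytics export will have its own format.

```python
from collections import defaultdict

def conversion_rate(visits):
    """Share of visits that ended in a conversion, as a fraction between 0 and 1."""
    if not visits:
        return 0.0
    converted = sum(1 for v in visits if v["converted"])
    return converted / len(visits)

def conversion_rate_by_source(visits):
    """Group visits by traffic source and compute a conversion rate for each group."""
    grouped = defaultdict(list)
    for v in visits:
        grouped[v["source"]].append(v)
    return {source: conversion_rate(group) for source, group in grouped.items()}

# Made-up visit records for illustration only.
visits = [
    {"source": "social", "converted": True},
    {"source": "social", "converted": False},
    {"source": "blog", "converted": False},
    {"source": "blog", "converted": False},
]

overall = conversion_rate(visits)
by_source = conversion_rate_by_source(visits)
print(overall, by_source)
```

Breaking the rate down by source is a small first step from a vanity metric toward a visitor profile, since it shows which channels bring visitors who actually convert.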
