There’s More than Meets the Eye

Big Data Infographic

This is a link to the Infographic

        This infographic presents a number of statistics about the collection and transfer of data on the internet, giving the audience an idea of how massive “The World of Data” really is. The information is presented in a way that leads the audience to believe it instead of questioning its sources. Viewers, myself included, get attached to the point the infographic is making because it homes in on specific facts: Google collects 24 petabytes of data per day, 20 hours of video are uploaded to YouTube every minute, and 2.9 million emails are sent every second. Those specifics lead us to trust the information in what is otherwise a random image. But where does this information come from, and how can we trust its sources? To find out, we will take a look at one specific piece of data: “Google collects 24 petabytes of data per day.” By analyzing the source of that claim, we can judge the reliability and value of the infographic itself.

Big Data Infographic


The claim that “Google processes 24 petabytes of data per day” must have come from research or information that Google itself presented. To find this research, I began by searching the web for “Google’s Data Consumption” (I actually used Bing as my search engine, just in case Google was not willing to freely release this information to the public). I was redirected a couple of times to new websites, but it didn’t take long before I found an article about MapReduce, the software Google uses to sort and process its large quantities of data. The article includes an image comparing the amount of data Google processed from August 2004 to September 2007. If you look at the numbers for 2007 and add up the amount of input data with the number of machines used, it does indeed come out to over 20 petabytes.

Google MapReduce Statistics
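
The unit arithmetic behind a check like this can be sketched in a few lines of Python. This is only an illustration, assuming decimal units (1 PB = 1,000 TB) and a 30-day month. The function names are hypothetical, and the 600,000 TB figure is a made-up round number, not a value taken from the article.

```python
TB_PER_PB = 1_000                # assumption: decimal units, 1 PB = 1,000 TB
GB_PER_PB = 1_000_000            # assumption: decimal units, 1 PB = 1,000,000 GB
SECONDS_PER_DAY = 24 * 60 * 60


def monthly_tb_to_daily_pb(monthly_tb, days_in_month=30):
    """Convert a monthly total in terabytes to an average in petabytes per day."""
    return monthly_tb / days_in_month / TB_PER_PB


def daily_pb_to_gb_per_second(daily_pb):
    """Convert petabytes per day to gigabytes per second."""
    return daily_pb * GB_PER_PB / SECONDS_PER_DAY


# Hypothetical round number for illustration only, not taken from the article.
example_monthly_tb = 600_000
print(monthly_tb_to_daily_pb(example_monthly_tb))

# The infographic's headline figure, expressed per second.
print(round(daily_pb_to_gb_per_second(24), 1))
```

Under these assumptions, the infographic’s 24 petabytes per day works out to roughly 278 gigabytes every second.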


Here’s the link to the magazine

        This article was published in 2008 in the magazine “Communications of the ACM.” “ACM (Association for Computing Machinery) is the world’s largest educational and scientific computing society, and they deliver resources that advance computing as a science and a profession.” Because this source was researched by a reliable association, reviewed by a publisher, and then published, I believe it is highly credible. The original infographic also lists MapReduce as one of its sources, so I think the infographic uses reliable information and can be trusted.

ACM’s website is here

Big Data Infographic 2

        This infographic takes the reliable information that “Google collects 24 petabytes of data per day” and puts it in context to make a strong claim about “How Big the World of Data” really is. Most infographics work this way, so the source of the information usually seems irrelevant: the strong claims and visual evidence lead the audience to believe and consider the claim being made. However, the sources really do matter, especially when the information is used in other contexts, such as a lawsuit against Google or a scientific study about how information is collected online. It’s therefore important to judge the reliability and value of a piece of information by knowing its source. There’s a reason you cite all of your sources in a research paper, or any other academic paper for that matter. It’s not just so you can sound smarter; it shows that your work is credible and that your facts come from actual data and are not made up. This infographic may have turned out to be reliable, but not all infographics are. Depending on the context in which the information is used, most infographics should not be trusted without a little background research.

7 thoughts on “There’s More than Meets the Eye”

  1. One thing that stood out to me that you didn’t really cover was the actual citation itself. An interesting challenge for infographics seems to be finding a way to properly cite the sources. Even though this graphic cites the sources, it only lists the titles of the pages or organizations. Citing “MapReduce” as a source is really vague and it seems like it took at least a little effort to find and track down the exact source of the MapReduce data. It seems to be a delicate balance between making it graphically interesting and informational.

    1. Yeah, that is true. They also listed Facebook and Twitter as sources, but Facebook and Twitter have millions of people on their sites, so how could they have found that information? That seems to make the infographic less reliable, but I did end up finding the correct source on MapReduce, which helps a little. I feel like they were more concerned with the information than with the source listings, so their citations were very weak.

      1. The edited images of the infographics make the blog post a lot more detailed and interesting to read. It is really impressive to me that you put in so much effort to trace the information back to its source and carried out calculations to prove its reliability. One suggestion: since there are many long sentences in the last paragraph, moving your main argument to the beginning of the paragraph would show your main idea more clearly.

  2. I realize these are very large amounts of information, but I find it hard to fully grasp the extremity of those figures. Did you consider researching the per capita of the values given?

  3. This is a very interesting infographic. Each number really stands out as a huge amount of data. I am curious as to why you picked the Google one and also how you came up with your title. You did a very good job tracking down little bits of information in the picture to find the real authority of it.

  4. I chose the Google statistic to research because, honestly, it was the data that stood out to me the most. That is a LOT of data they collect, and I wanted to see if it was actually correct. My title is based on the fact that when you look at an infographic you get attracted to it and believe what’s on it; however, there’s actually more behind the infographic in its sources, and they aren’t always true. I thought that title was pretty catchy so I went for it, and I was tired of the generic titles on all my posts.

    The per capita figure for this data is an interesting question, but I feel like that information would be very hard to find. You could divide the whole number by the number of users on Google’s sites and get a rough estimate, but even then, Google controls so much of the internet that it might be hard to get a number for that.
