In this week’s reading, Franco Moretti examines trends in the novel over a span of several decades, arguing that literary history is defined by data sets rather than by individual works. He states:
“Quantitative research provides… data, not interpretation. Quantitative data can tell us when Britain produced one new novel per month, or week, or day, or hour for that matter, but where… and why–is something that must be decided on a different basis.”
While I agree that quantitative research provides data, such data sets are nonetheless capable of exhibiting a limited form of interpretation. Large data sets can be presented to outside viewers as completely unbiased, but data mining rarely exists for its own sake; the motivation is almost never that circular. Big data therefore offers a diluted interpretation of its subject, both through the data it includes and through what has been purposefully omitted. Pairing a data set with a more concrete argument strengthens the claims of both, but the fact remains that data sets are assembled for specific, situational purposes, and they therefore carry implicit arguments to be uncovered by the viewer or reader.