Data Has Always Been Big

Thursday, December 8, 2016 - 11:00am–11:45am

Kyle Erf, Software Engineer, MongoDB, Inc.


In the tech sector, we pride ourselves on innovating. But if beating the past is our goal, why do most of us have a view of the past that extends back only a few decades? Seemingly every day, another article is published explaining Big Data, our generation’s new struggle to manage more information than it can readily process. Big Data, however, is nothing new: most societies in human history struggled with this very same problem, from the Fertile Crescent to the Industrial Revolution. My talk will present a brief overview of how and why the amount of available information has always outpaced our ability to fully comprehend it.

This talk will give a brief history of handling information and how humanity’s solutions for dealing with information always end with the creation of even more information, leading to the perpetual feeling of having "too much data to deal with" that brings us to our current Big Data situation. I will include historical points such as

  • the birth of the written word and the death of memorizing everything (and how mad people were about it)
  • early means of "backing up" the world’s writing
  • how Catholic monks invented alphabetical ordering
  • the explosion of knowledge due to Gutenberg’s printing press
  • the wacky tools researchers of the Renaissance used to organize the information overload of their day
  • early census machines and electronic databases

in order to place our current issues with information overload into this much larger timeline.

As a programmer at one of the leading "Big Data" companies, I'm sometimes uncomfortable with how the marketing speak and conference talks surrounding data storage always introduce Big Data as a new problem. While the technical specifics certainly are new, this feeling of information overload has existed since the dawn of written information. With this historical context in mind, we can better build for the future and avoid recreating the mistakes of those who came before.

LISA16 Open Access Sponsored by Bloomberg

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone.

@conference {201526,
author = {Kyle Erf},
title = {Data Has Always Been Big},
year = {2016},
address = {Boston, MA},
publisher = {USENIX Association},
month = dec
}
