Big Data Phrenology


In their excellent book Big Data: A Revolution That Will Transform How We Live, Work, and Think, Kenneth Cukier and Viktor Mayer-Schonberger made the distinction between datafication and digitization.  Datafication is the process of putting a phenomenon into “a quantified format so it can be tabulated and analyzed.  This is very different from digitization, the process of converting analog information into the zeros and ones of binary code so that computers can handle it.”

“Digitization wasn’t the first thing we did with computers,” Cukier and Mayer-Schonberger explained.  “The initial era of the computer revolution was computational, as the etymology of the word suggests.  We used machines to do calculations that had taken a long time to do by previous methods: such as missile trajectory tables, censuses, and the weather.  Only later came taking analog content and digitizing it.  The arrival of computers brought digital measuring and storage devices that made datafying vastly more efficient.  It also greatly enabled mathematical analysis of data to uncover its hidden value.  In short, digitization turbocharges datafication.”

Although big data turbocharges information overload, a lot of what we relentlessly datafy and digitize also turbocharges information underwhelm.  So, while you are scratching your head trying to tell the difference between signal and noise, don’t try to understand every bump you will no doubt discover atop your head.

“The zeal to understand nature through quantification,” Cukier and Mayer-Schonberger explained, “defined science in the nineteenth century, as scholars invented new tools and units to measure and record electric currents, air pressure, temperature, sound frequency, and the like.  It was an era when absolutely everything had to be defined, demarcated, and denoted.  The fascination went so far as measuring people’s skulls as a proxy for their mental ability.  Fortunately the pseudo-science of phrenology has mostly withered away, but the desire to quantify has only intensified.”

Our desire to quantify everything has definitely intensified: every day, it seems, new big data startups emerge that throw big data analytics at everything possible.  While I certainly agree with the general premise, since data separates science from superstition, I am also concerned about organizations practicing what I call cargo cult data science.


The distinguishing feature of phrenology was the idea that the sizes of brain areas were meaningful and could be inferred by examining the skull of an individual.  Since a lot of the big data hype rests on the similar assumption that any large volume of data is meaningful and that its analysis must therefore provide business insights, the pseudo-science of big data phrenology seems inevitable.

Phrenologists believed that examining the shape of people’s skulls told a credible story about their intelligence.  Data does shape our perception of the real world, and data storytelling is one of the most important aspects of data science, but big data phrenologists believe that examining the shape of any large dataset will tell an actionable story about the future of their organization’s business intelligence.

Keep your head about you with big data analytics, bearing in mind that not every close encounter of the big data kind necessarily means something important.

About Jim Harris

Jim Harris is the Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ), which is an independent blog offering a vendor-neutral perspective on data quality and its related disciplines. He is a recognized thought leader with more than 20 years of enterprise data management industry experience. Jim Harris is an independent consultant, professional speaker, and freelance writer for hire.
