Rage against the Machines Learning


As a new year dawns, the big data hype cycle is descending from the peak of inflated expectations, and a number of recent blog posts have tried to provide a more balanced perspective on the potential of big data analytics.


In his post Why big data might be more about automation than insights, Derrick Harris wrote about how smart data scientists “have figured out how to make machines work for humans instead of replacing them.  Yes, machine learning algorithms and big data technologies analyze a volume of data points that humans could never do, uncovering complex relationships the naked eye could never spot.  But once the heavy lifting is done, humans come in and use their subject-matter expertise and logic to prune off bad connections, add context and maybe even inject a little serendipity into the final algorithms.”

“Whether it’s corporate business intelligence or the consumer web, though,” Harris concluded, “all of this is about automation.  Data-minded people have always used data to aid in decision-making without ignoring their instincts.  Big data just lets them learn a lot more, a lot faster.”

Automation raises concerns that the role of the human in the learning process is being diminished, or simply that the role of the machine in the learning process is being overstated in the era of big data.  Machines are mathematical wizards, and machine learning automated by math models is at the heart of data science.  But that doesn’t mean the machine is the only brain involved in data science.

“The problem is that a math model, like a metaphor, is a simplification,” Steve Lohr wrote in his post Sure, Big Data Is Great, But So Is Intuition.  “This type of modeling came out of the sciences, where the behavior of particles in a fluid, for example, is predictable according to the laws of physics.  In so many big data applications, a math model attaches a crisp number to human behavior, interests and preferences.”  Lohr referenced an example of how the peril of that approach became brutally obvious during the global financial crisis.

“Listening to the data is important, but so is experience and intuition,” Lohr concluded.  “After all, what is intuition at its best but large amounts of data of all kinds filtered through a human brain rather than a math model?”

I made a similar point in my post Data-Driven Intuition.  Because of the way our brains work, our intuition has always been more data-driven than we give it credit for, so, in a way, humans have always been data scientists.  However, a human brain has limitations, most notably storage capacity and retrieval speed, that a machine brain does not.  And although big data can’t ignore its qualitative aspects, we can’t ignore big data’s daunting quantitative aspects.

“Without a doubt, more things can be quantified than ever before,” wrote Sam Ford in his post Without Human Insight, Big Data Is Just A Bunch Of Numbers.  “With the wealth of data we can now collect and analyze in increasingly sophisticated ways, we have only scratched the surface as to the vast number of advances we might find.  However, in any era with rapid technological change, it’s easy to start slipping into what has been termed technological determinism, to start speaking of the technology as if it drives culture and humanity, rather than thinking of technology as a tool.”

Ford argues that presuming everything can be quantified, that culture can be culled down to quantitative data, “supposes the world is infinitely knowable.  It posits that context and particularity is only so useful inasmuch as it can be captured by machines.  And that’s where the tail starts to wag the dog.”

Making a point similar to the one I made in my post Darth Vader, Big Data, and Predictive Analytics, Ford concludes “perhaps the answer is that our organizations must become cyborgs: combining what can be gathered technologically with the humanity that can help us balance and make sense of what the quantitative can tell us, lest we lose our humanity and just become robots.”

As T.S. Eliot raged: “Where is the wisdom we have lost in knowledge?  Where is the knowledge we have lost in information?”  Therefore, if you must rage against the machines learning, at least acknowledge the machines are only learning information, not knowledge or wisdom.  Humans have historically had a hard time with the last two.  Our learning machines are only tools, perhaps the best ones we have built yet, to help us with our quest for knowledge and wisdom.

About Jim Harris

Jim Harris is the Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ), which is an independent blog offering a vendor-neutral perspective on data quality and its related disciplines. He is a recognized thought leader with more than 20 years of enterprise data management industry experience. Jim Harris is an independent consultant, professional speaker, and freelance writer for hire.
