Read a paragraph on analytics or follow an infographic on the challenges of software-as-a-service, and you’ll find the term “big data.” Business models are being upended by a digital environment built on big data. So what’s at stake, especially for small businesses that are finding as many competitive uses for data as larger corporations?
Authors Viktor Mayer-Schönberger and Kenneth Cukier have set out to answer that and more in Big Data: A Revolution That Will Transform How We Live, Work and Think. Mayer-Schönberger is a professor of Internet governance and regulation at Oxford University, and the author of several books. His most recent is Delete: The Virtue of Forgetting in the Digital Age. Cukier is a prominent commentator and the data editor at The Economist. Both authors have written extensively on data analysis from the perspective of many industries, organizations and situations.
I picked up a copy of this big data book at Barnes and Noble. I wanted to see how well the authors sum up today’s digital data environment.
Adding to the Big Data Discussion in a Simple Way
Mayer-Schönberger and Cukier attempt to simplify the background behind the book’s theme. Essentially, Big Data is a perspective on the “datafication” of things: more and more processes can be recorded as data, and the book helps readers understand how that data is growing and being collected. The ten chapters have one-word titles, such as Now, Correlation and Messy. These, along with the stories contained within the chapters, are meant to illuminate the impact data has on societal problems and business opportunities.
Data is no longer used only to confirm or disprove a hypothesis. Instead, organizations must accept some messiness in their data, that is, worry less about exactness and look more broadly at which events may influence an outcome:
“Big data transforms how we understand and explore the world. In the age of small data, we were driven by hypotheses about how the world worked, which we then attempted to validate by collecting and analyzing data. In the future, our understanding will be driven more by the abundance of data rather than hypotheses.”
This “no-more-sample-size” idea is similar to Wired Editor Chris Anderson’s assertions about the “end of theory.” In fact, the authors do look at the debate Anderson raised when he declared that hypothesizing and modeling from small data sizes were becoming obsolete.
Other takes on the data revolution include some twists on well-known subjects, such as Steve Jobs’ choices of treatment for his cancer and Amazon’s investment in data to understand customer purchase behavior. Avid technology readers may have read these examples before, but they may be new to those with only a cursory familiarity with tech happenings. There are some interesting data applications, such as Con Edison’s effort to prevent exploding manhole cover incidents in New York City, as well as FlyOnTime.us, an open data application.
The sheer volume of data being created certainly permits new solutions, but it also yields new challenges. At first blush, small business owners reading this book may feel they will bear the lion’s share of those challenges (the chapter on Amazon may not bring back warm and fuzzy memories for local bookstore owners).
But Mayer-Schönberger and Cukier expect midsize companies to be the ones on the chopping block: a firm will either need to scale with data or stay small and nimble. Along those lines, the subject matter expert has become less influential in many industries:
“In media, the content that gets created and publicized on websites like Huffington Post, Gawker and Forbes is regularly determined by data, not just the judgement of human editors…. Jeff Bezos got rid of in-house book reviewers at Amazon when the data showed that algorithmic recommendations drove more sales. This means the skills necessary to succeed in the workplace are changing.”
Small business readers may not feel that the material offers actionable ideas for their environment. The book gives a short historical context for the big data subject, with notes pointing to references from the past 10 years or so. But there’s no IT-level discussion of databases and nothing on planning or management, at least in relation to technological features. Readers expecting NoSQL vs. SQL debates should look elsewhere.
The most thought-provoking perspective the book gives small business owners is an alert to how the utility of technology has evolved. This differs from the age-old debates on the viability of a particular technology, debates that can stall budget decisions. Instead of focusing on whether email is better than social media, business strategists should watch the trends in their own marketing and look for useful associations between a marketing medium and customer response.
It’s this kind of thought process Big Data encourages. Thus the book’s ultimate value lies in its stories about how organizations are accepting data and modeling solutions that improve operations.
The chapters on Risk and Control apply these concepts to realistic scenarios. These chapters cover privacy from a current perspective and are probably the most actionable for deciding what to do with technology. Mayer-Schönberger and Cukier distinguish profiling from selecting suitable predictors of customer behavior. They also take the right step in outlining societal complications, such as “penalties based on propensities”, which they call “nauseating.” The authors note as well the rise of the algorithmists, professionals with math, science and computer science backgrounds who help ensure accountability for the very systems we create:
“We envision algorithmists as providing a market-oriented approach to problems like these that may head off more intrusive forms of regulation…. To ensure that people are protected at the same time as the technology is promoted, we must not let big data develop beyond the reach of human ability to shape the technology.”
The authors write in a hopeful tone, tempered with pragmatism about the potential future outcomes of big data research.
But for today’s business climate, reading Big Data will help innovative small businesses think differently about the causes of human behavior and how that behavior is recorded. That, in turn, makes it easier to weigh improvements to existing services or the launch of new ones. There are other books that go deeper into the debate about sample size and correlation, but as a primer for business, Big Data works to make a misunderstood topic more understandable.
Pierre: When is small data becoming big data?! 😉
IMO, it’s becoming big when it’s impossible for me to analyze it without software 🙂
Ivan: Thanks for your definition! It sounds like a clear-cut explanation! I am afraid that big data will turn into a buzzword without real meaning in the near future. You could sift through a huge collection of data for a long time without coming up with a real solution to the problem. That is the danger with stats of all kinds.
Maybe Viktor Mayer-Schönberger’s and Kenneth Cukier’s book will help me understand the Big Idea! 😉
Pierre, nice article on Big Data. Designed by data scientists, HPCC Systems is an open source data-intensive supercomputing platform to process and solve Big Data analytical problems. It is a mature platform and provides for a data delivery engine together with a data transformation and linking system. The real-time delivery of data queries of the Roxie component is a big advantage for marketers needing to take action from data insights. More info at http://hpccsystems.com.