The Problem of Analysis Illiteracy Among Users

I’ve worked on several "Big Data" products over the years. I’ve always inspired myself and others (or at least attempted to inspire others) with the goal of helping regular people gain true insight from the hoards of data that their organizations collect. Not just data, but real, actionable insight.

But all the while I’ve known that it can be very difficult to offer real insight from data unless the problem is simple, or unless you unleash a PhD-credentialed data scientist on the task.

Deriving insight makes searching for a needle in a haystack seem trivial; after all, with a needle you know ahead of time what you are looking for. But to really gain insight into high-level business questions like “What factors drive sales?”, there are so many different avenues that you could explore. So instead of searching for a needle in a haystack, it’s more like you are searching for things that “seem interesting” in a whole bunch of haystacks. Except you shouldn’t restrict your search to the haystacks: what you want could be anywhere in the entire field. Oh yeah, and you also have to search the field in the dark with a flashlight.

The quagmire is that valuable business insight usually requires answering a question that has not been asked before. But how do you know which questions might be fruitful to ask? How do you do the initial exploration of data to find the markers or nuggets that ultimately lead to the new question and its answer? How do you use data snippets as inspiration for broader analysis?

The difficulty of this task is supported by a survey recently reported in Information Week (“Why Big Data Doesn’t Always Equal Big Insight,” by Shvetank Shah). The survey, conducted by the Corporate Executive Board, covered 5,000 employees at 22 global companies:

  • 62% of employees are making poor decisions when attempting to tease out insights from data.
  • Fewer than 40% of employees have the right process and skills to make good use of analysis.
  • 85% of companies’ data is unstructured, meaning that traditional Business Intelligence tools generally can’t get at this data.
  • 50% say that information from corporate sources is in unusable formats.
  • Two-thirds of respondents spend time on unproductive analysis.
  • Two-thirds of employees don’t trust data from other functions in the company.