Unlike your children, you don’t need to love all of your data equally. Data exhibits huge variability in its value to an organization. Some data is vitally important and some is just overhead. The value of data follows a power law distribution. This can be counterintuitive and uncomfortable to think about because we tend to assume most things in life follow a normal distribution.
Such distributions are often described by the 80-20 rule or the long tail: a small share of the data accounts for most of the value.
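To make the skew concrete, here is a minimal sketch that compares a power law with the "love everything equally" case. The numbers are purely illustrative assumptions, not measurements: 100 hypothetical data sets whose value falls off as 1/rank, and a helper named `top_share` invented for this example.

```python
def top_share(values, fraction=0.2):
    """Return the share of total value held by the top `fraction` of items."""
    ranked = sorted(values, reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Assumption for illustration: 100 data sets whose value falls off as 1/rank,
# which is one simple form of a power law.
power_law_values = [1 / rank for rank in range(1, 101)]

# Contrast case: every data set is worth exactly the same.
equal_values = [1.0] * 100

print(f"power law: top 20% hold {top_share(power_law_values):.0%} of the value")
print(f"equal:     top 20% hold {top_share(equal_values):.0%} of the value")
```

Under these assumptions the top fifth of the data sets holds well over half of the total value, while in the equal case it holds exactly a fifth. Real data will not follow a clean 1/rank curve, but the shape of the argument is the same.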
The key takeaway here is that such distributions require a different mindset. Business leaders need to think more like Venture Capitalists (VCs) and less like loving parents. A VC knows that most companies fail and go to zero. Similarly, business leaders need to be discerning about their investments in data. It is wiser to invest asymmetrically based on a nuanced understanding of what data is important rather than spraying your money and praying for the best.
While this concept may seem obvious to some, most organizations don’t have strategies that exploit this distribution. Usually, there is a department-level intuition about what data is important, but I have rarely seen an enterprise strategy built on such thinking.
Clichés like “we need a single version of the truth” or “we don’t care about log files and social media” tend to dominate discussions. An understanding of the Power Law of data will show you why such ideas, though well intentioned, don’t produce optimal outcomes.
So, the million-dollar question: how do you determine what data is most important? And then how do you exploit that to your advantage?
Step 0: Identify your high value data
Start by identifying what value discipline your organization competes on. Is it operational excellence, product superiority, customer intimacy or something else? Within these disciplines, identify the business processes and systems which help deliver on your promise. For example, if you compete on operational excellence, then perhaps your supply chain and logistical processes are worth examining. If you compete on customer intimacy, data around your customer-centric processes is important. The smartest organizations align their data efforts with strategy and then focus on the most important needs of their most important customers. Don’t try to do all things to all data.
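One rough way to turn this into a first-pass shortlist is to score each data set by how closely its business process supports your value discipline. The sketch below assumes a company competing on customer intimacy. Every name, weight and rating in it is hypothetical and would need to come from your own strategy discussions.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    process: str  # business process that produces or consumes the data
    usage: int    # hypothetical 1-5 rating of how heavily the data is used


# Hypothetical weights: how strongly each process supports a
# customer-intimacy strategy. Processes not listed count as 0.
PROCESS_WEIGHT = {"customer_service": 1.0, "marketing": 0.6, "logistics": 0.2}


def rank_by_strategic_value(datasets, weights):
    """Sort data sets from most to least aligned with the chosen strategy."""
    def score(ds):
        return weights.get(ds.process, 0.0) * ds.usage
    return sorted(datasets, key=score, reverse=True)


catalog = [
    DataSet("warehouse_scans", "logistics", 5),
    DataSet("support_tickets", "customer_service", 4),
    DataSet("badge_logs", "facilities", 3),
    DataSet("campaign_responses", "marketing", 5),
]

for ds in rank_by_strategic_value(catalog, PROCESS_WEIGHT):
    print(ds.name)
```

A simple weighted score like this is only a conversation starter. Its value is in forcing an explicit statement of which processes matter most, so that investment can follow the ranking rather than being spread evenly.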
Step 1: Get more nuanced and less standardized
Standardization is great if inter-departmental communication is the goal. But the processes core to your competitiveness should not be sacrificed on the altar of standardization. For example, don’t make your customer service representatives follow the definitions dictated by Supply Chain. Strive for as much nuance and subtlety as possible! Forget about a “single version of the truth”, which is really an attempt to standardize and treat all data the same.
Make the extra investments to support multi-faceted versions of the truth.
Step 2: Add more and diverse data sets
Augment the data generated in your core processes with new and diverse data sets. This enriches the data set because many data sets combined tell you more than they do separately. Add benchmark data, social media feeds, machine data, e-mail, log files, semi-structured data – whatever volume, variety or velocity the data comes in – grab it and assimilate.
Step 3: Throw in your superstars
When you have a rich enough data set, there will always be secrets.
In the hands of the right analysts, unexpected patterns, mysterious outliers, and other previously unknown insights will pop out of the data. So get your superstar analysts involved. Run experiments. Generate hypotheses. Disseminate your findings with narratives so that everybody can follow along. And then ensure your executives act upon the insights. Without such action, the whole exercise is moot.
Step 4: It’s a journey… keep going
There are still many secrets to discover and white spaces to fill. Keep trying new things and adapting as you go. In the world of data, you have to run faster just to stay in place.