The Six V’s of Big Data

22 September 2016


In the Harvard Business Review article ‘Big Data: The Management Revolution’, Andrew McAfee and Erik Brynjolfsson write about the differences between big data and the analytics of the past. Given the amount of information available to us today, it is not surprising that they conclude that using big data leads to better predictions and smarter decisions. The authors are convinced that almost no sphere of business activity will remain untouched by this movement, and they therefore suggest that companies manage the transition to using big data very carefully. Five particularly important areas in this process are leadership, talent management, technology, decision making, and company structure.

To describe the differences between big data and the so-called analytics of the past, McAfee and Brynjolfsson refer to the well-known three V’s of big data. Briefly summarized: the first V, volume, represents the enormous amount of data available on the internet; the second V, velocity, refers to the speed at which data is created; and the third V, variety, refers to the many different forms big data can take. Although these three V’s are arguably the most important features for describing big data, other characteristics are needed to complete the picture. In the following paragraphs, I will give a brief description of three further V’s that are also recognized as characteristics, or concerns, of big data but are not mentioned in the article by McAfee and Brynjolfsson: veracity, validity, and volatility.

The question behind the term veracity is: ‘is the data that is being stored and mined meaningful to the problem being analyzed?’ Compared with characteristics like volume or velocity, veracity is one of the biggest challenges in data analysis. The team assigned to analyze the data should keep the databases clean; in other words, meaningless data should be kept from accumulating in the systems.
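To make this concrete, here is a minimal sketch in Python of such a cleaning step. The record fields and the relevance criteria are hypothetical; a real pipeline would define its own rules for what counts as meaningful to the problem at hand.

```python
# Minimal sketch of a veracity filter: keep only records that are
# meaningful to the problem being analyzed before they are stored.
# The field names and relevance criteria below are hypothetical.

def is_meaningful(record: dict) -> bool:
    """Reject records that add noise rather than signal."""
    if not record.get("user_id"):          # missing origin: untraceable
        return False
    if record.get("value") is None:        # empty measurement: no signal
        return False
    if record.get("source") == "test":     # internal test traffic: irrelevant
        return False
    return True

def clean(batch: list[dict]) -> list[dict]:
    """Filter a batch so meaningless data never accumulates in the system."""
    return [r for r in batch if is_meaningful(r)]

batch = [
    {"user_id": "u1", "value": 42, "source": "web"},
    {"user_id": None, "value": 13, "source": "web"},   # dropped
    {"user_id": "u2", "value": None, "source": "app"}, # dropped
]
print(clean(batch))  # only the first record survives
```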

Validity concerns the exactness and accuracy of the data: data that is valid for the intended use is essential for making the right decisions. Volatility, on the other hand, refers to how long the data should be stored. This sixth V complements velocity and volume: in a world of real-time data, vast amounts of data are created every second. To take the volatility of big data into account, one should determine the point at which data is no longer relevant to the current analysis.
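As an illustration, the sketch below combines both ideas: a validity check against an expected range, and a volatility cutoff that discards records older than a retention window. The 30-day window, the accepted value range, and the record layout are all assumptions made for the example.

```python
# Sketch of validity and volatility checks. The retention window,
# the accepted value range, and the record layout are hypothetical.
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)          # volatility: keep 30 days of data
VALID_RANGE = (0.0, 100.0)              # validity: accepted measurement range

def is_valid(record: dict) -> bool:
    """Validity check: is the value plausible for the intended use?"""
    lo, hi = VALID_RANGE
    return lo <= record["value"] <= hi

def is_current(record: dict, now: datetime) -> bool:
    """Volatility check: is the record still relevant to the analysis?"""
    return now - record["timestamp"] <= RETENTION

now = datetime.now(timezone.utc)
records = [
    {"value": 57.0, "timestamp": now - timedelta(days=2)},    # kept
    {"value": 250.0, "timestamp": now - timedelta(days=2)},   # invalid value
    {"value": 30.0, "timestamp": now - timedelta(days=90)},   # past retention
]
usable = [r for r in records if is_valid(r) and is_current(r, now)]
print(len(usable))  # 1
```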

The most important lesson of the six V’s is that big data comes not only with many opportunities but also with some concerns. To fully trust the results of big data analyses, the process should be carefully managed. Advice on this management issue is presented in the full article, but in my opinion one quote from McAfee and Brynjolfsson summarizes it best:

“Big data’s power does not erase the need for vision or human insight.”

 

Sources:

https://hbr.org/2012/10/big-data-the-management-revolution
http://insidebigdata.com/2013/09/12/beyond-volume-variety-velocity-issue-big-data-veracity/
https://www-01.ibm.com/software/data/bigdata/what-is-big-data.html
