Big data. You’ve probably heard of it. It’s a buzzword, it’s complex, and it has many meanings and applications. In large organizations, managers use it to get quick insights and stay ahead of the competition, data scientists use it to get meaningful answers, and executives use it to act on market opportunities. Understandably, most start-ups and other small businesses do not have the resources to hire data scientists, system engineers or research firms to tackle their ‘Big data’ questions. But what does Big-data mean, and how does it relate to small firms?
What is Big Data?
Let me start by saying that there is no commonly accepted definition of Big-data. According to Gordon (2013), Big-data is data that can be defined by some combination of the following five characteristics: (i) volume, (ii) variety, (iii) velocity, (iv) value, and (v) veracity. Both governments and large data-intensive organizations, such as Amazon and Google, are interested in Big-data and are taking the lead in developing the technologies to handle it. They are often involved in the development of so-called NoSQL databases: databases designed to be more scalable and faster at processing large quantities of structured, semi-structured and/or unstructured data than conventional SQL databases. Storing the data is just one part of the story; it still requires analysis and visualisation. In large organizations, data scientists usually store, transform, and analyse the raw data and visualise it so that it becomes useful.
How does this relate to Small Businesses?
In my previous job, I saw many small to medium-sized businesses and spoke to lots of entrepreneurs. Some of them were, for example, traditional accountants who were comfortable with their day-to-day operations and used ‘just’ information from their website and CRM to make strategic decisions and acquire new clients. They didn’t see, or didn’t want to see, the Big-data potential. Most of the other entrepreneurs, however, wanted to process and analyse both their offline and online information but were struggling to do so. In many cases, small firms did possess variety and volume in valuable data (online information gathered through Google Analytics, CRM, market trends, KPIs, NPS scores and employees’ networks; and offline information from surveys, other field research, events, etc.); they simply lacked the tools to blend these sources and make them work together. For small firms, buying a cloud-based ‘Big-Data-ready’ NoSQL database is not the problem; nine times out of ten, the issue is storing, processing and analysing the structured and unstructured data. Luckily, there are now worthwhile ways for small firms, too, to make use of their Big-data.
Some practical recommendations
- First, entrepreneurs and managers of small firms must understand that Big-data is not only something for big firms; dealing with online, offline, and external data sources should be an integral part of a firm’s strategy.
- Second, it is important that firms continuously gather and enrich their information. Enriching data means more knowledge about (potential) customers, which allows firms to better personalize their offerings.
- Third, to cope with all the information, a third-party Big-data processing tool should be adopted. One powerful tool that helps to store, process, analyse, and visualize information is ‘ClearStory Data’. It combines all types of a business’s internal (online and offline) information with publicly available information (through data mining) and creates easy-to-understand, customizable dashboards to help make better decisions, whether on a strategic, tactical or operational level.
In conclusion, information management in small businesses should not be considered rocket science, right? The key lesson here is that every business has, and can make sense of, internal and external data; it is just a matter of acknowledging the potential and finding the right Big-data tools and solutions.
Sources:
Gordon, K. (2013). What is Big Data? ITNOW, 55(3), 12-13.
Hi Jurjen,
Great post! I like and agree with your statement that big data isn’t necessarily only a topic for big companies!
But conversely, big data might not (and perhaps should not) necessarily be a topic for big companies either, or even a topic at all. Complementing what you mentioned in your opening statement, big data is a buzzword and will be misused in 9 out of 10 cases :).
Let me explain how: many managers who started their jobs in the bottom ranks acquired their technological skills and knowledge at a time when 256 MB of RAM was almost a supercomputer, when the cloud was still a large collection of very tiny droplets of water and the Internet had just started to become a thing. These same managers might now find themselves with a few gigabytes of data per day from many different sources, adding up to hundreds of gigabytes or even terabytes a month. Naturally this seems like a lot, and fairly big. But too many start yelling the b-word too loudly and fall for companies that promise them big data solutions, with the holy grail thrown in.
But the amounts of data have simply grown along with computing power and its availability, and the terminology and definitions should evolve too. The key to handling this is good data management, not big data management. In my previous job I heard many customers and peers talk about their big data ambitions, but no one had a clue what they were actually looking for or really trying to achieve. The term is simply overused and “blown out of proportion”.
This leads me to the one thing companies should ask themselves before proclaiming their re-invention as the new big-data-powered company: are they really a data company, and do they want to operate in that area? If you’re a small retailer, a medium-sized manufacturer or even a start-up, you may never be a company that needs to get into big data.
Companies such as LinkedIn, Facebook and Google are true information companies: they are all built on a business model that makes money from buying, selling or delivering information to people. These companies rise and fall with their ability to mine massive amounts of data at all times in order to deliver relevant information or improve their products. The amount of data their networks generate is almost unimaginable: Facebook uploads are 100 TB per day already; imagine what network traffic for Snapchat or website logs for LinkedIn look like. This is what true big data is. Comparing your Google Analytics results with your sales numbers and correlating them with further external data certainly is a data scientist’s task, but it should not be called big data.
If data is not a company’s core business, it might just burn its fingers pursuing a big word that doesn’t match its business. Companies, both big and small, should rather be smart about where they put their money by being honest about how much data they truly juggle, what they can do with it now and how they can try to collect more (valuable) data in the future.
Last but not least, it is a matter of the type of data you are looking at: sales, geo, product and even web statistics will likely never amount to big data (unless you’re one of the giants). Web logs, sensor, social and network data, on the other hand, are much more likely to grow so big so fast that only cheap, distributed systems like Hadoop can and should handle them. Most companies selling “big data solutions” will offer products where users can visualize their sales data, split it into regions on a nice-looking map and then map competitors against their store locations. The true difficulty in those cases is not an alleged technological problem but rather the lack of data or its acquisition.
The key to success is not to jump on the big-word wave but to consider all solutions available. Analytics, databases and data management have been around for decades, and if you’re not amassing so much data that you need to rent or buy new infrastructure all the time, chances are that conventional technologies will do just fine for the analytics you want to run. Because if you want to filter through your sales data and run it against some pre-aggregated Google Analytics results, your old-school SQL database will outperform any Hadoop system by light years.
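To make that last point a bit more concrete, here is a minimal sketch of what such a "small data" analysis could look like with nothing more than Python's built-in sqlite3 module. Everything specific in it is an assumption for illustration: the table names (sales, web_sessions), their columns and the sample figures are made up, and a real Google Analytics export will look different.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up data (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales (day TEXT, region TEXT, revenue REAL);
        -- Assumed to be pre-aggregated: one row per day and region.
        CREATE TABLE web_sessions (day TEXT, region TEXT, sessions INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("2024-01-01", "north", 100.0),
            ("2024-01-01", "north", 50.0),
            ("2024-01-01", "south", 80.0),
            ("2024-01-02", "north", 150.0),
        ],
    )
    conn.executemany(
        "INSERT INTO web_sessions VALUES (?, ?, ?)",
        [
            ("2024-01-01", "north", 300),
            ("2024-01-01", "south", 200),
            ("2024-01-02", "north", 300),
        ],
    )
    conn.commit()
    return conn


def revenue_per_session(conn):
    """Return (region, revenue, sessions, revenue per session) per region."""
    query = """
        WITH daily_sales AS (
            -- Collapse individual sales to one row per day and region first,
            -- so the join below does not count web sessions twice.
            SELECT day, region, SUM(revenue) AS revenue
            FROM sales
            GROUP BY day, region
        )
        SELECT d.region,
               SUM(d.revenue) AS revenue,
               SUM(w.sessions) AS sessions,
               SUM(d.revenue) / SUM(w.sessions) AS revenue_per_session
        FROM daily_sales AS d
        JOIN web_sessions AS w
          ON w.day = d.day AND w.region = d.region
        GROUP BY d.region
        ORDER BY d.region
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in revenue_per_session(build_demo_db()):
        print(row)
```

The sales rows are aggregated per day and region before the join because the web data is assumed to arrive already aggregated at that level; joining raw sales rows directly would inflate the session counts. A single-file database like this is only one option, of course, and any conventional SQL database would handle the same query.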