*** The 3 Things Series aims to simplify – sometimes even oversimplify – technology concepts so that you learn 3 things about a topic ***. Opinions are my own.
The technology industry is full of “buzzwords”, and Big Data has been one of the most used in recent years. Organizations have always dealt with data and stored it in databases, but the chart below shows how Google searches for “Databases” compare with searches for “Big Data” over the years.
Big Data in general refers to the ability to gather, store, manage, manipulate and, most importantly, get insights out of vast amounts of data. The typical question is “how big does data need to be to be considered Big?” The answer is... it depends. When it comes to size, one organization’s Big Data may be another organization’s small data.
There are 3 things to remember that define “Big Data”:
- Volume. It refers to size. If you are capturing vast amounts of information, you probably have Big Data on your hands.
- Velocity. Are you working with data at rest or data in motion? For example, if you are analyzing sales figures for the past year, that data is at rest (it is not changing constantly). If, on the other hand, you are analyzing tweets to understand how your clients are reacting to a product announcement, that is data in motion, as it is continuously changing. A single day of such data may not be big in volume, but the fact that it is in motion is relevant to the definition of Big Data.
- Variety. As the ability to capture, store and analyze more data has increased, so has the interest in analyzing data that is more complex in nature. For example, an insurance company may want to analyze the recordings of customer service calls to determine what characteristics of the conversation led to a policy sale; a retailer may want to analyze videos to determine how people navigate the store and how that impacts sales; or a hospital may want to analyze X-rays to find patterns and correlations between common symptoms in patients.
So when it comes to the definition of Big Data, remember 3 things, or the 3 Vs:
- Volume (size)
- Velocity (frequency of data updates during analysis)
- Variety (complexity of the data to analyze: images, videos, text, log files, etc.)
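For readers who like to see ideas in code, here is a minimal Python sketch of the "data at rest" versus "data in motion" distinction from the Velocity point. It is only an illustration: the sales figures, the sentiment scores and the `RunningAverage` helper are all made up for this example and do not come from any real dataset or library.

```python
from statistics import mean

# Data at rest: last year's sales figures are complete and no longer change,
# so a single pass over the whole dataset gives the final answer.
monthly_sales = [120, 135, 128, 150]  # made-up figures for illustration
average_at_rest = mean(monthly_sales)
print(f"average sales (at rest): {average_at_rest}")


class RunningAverage:
    """Hypothetical helper that keeps an average current as values arrive."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, value):
        # Fold the new value in without re-reading everything seen so far.
        self.count += 1
        self.total += value
        return self.total / self.count


# Data in motion: sentiment scores for tweets arriving one at a time
# (hypothetical values between -1 and 1). The answer changes with each arrival.
incoming_scores = [0.5, -0.25, 1.0, 0.75]
tracker = RunningAverage()
for score in incoming_scores:
    current = tracker.update(score)
    print(f"average sentiment so far: {current:.2f}")
```

With data at rest you compute the answer once. With data in motion the answer is never really final, which is why the analysis has to keep up with the data as it arrives.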