To all of us gathered here, it sounds like a silly question. But is it?

According to data from 2018, over 3.7 billion people use the internet. That’s almost half of the world’s population. We can tell from the very beginning that this internet thing is kind of a big deal. For example, Google has to process (answer) over 40 thousand queries per second, which is around 40 trillion bytes of data (yes, 40 000 000 000 000 bytes, which is equivalent to 40 000 gigabytes). No wonder Google runs over 3 million servers, located mainly in the US, Asia and Europe.
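To make those numbers a little more tangible, here is a small Python sketch of the arithmetic. It takes the figures quoted above at face value (they are not verified here) and assumes decimal units, where 1 GB is 10^9 bytes. The constant and function names are made up for this illustration.

```python
# Figures quoted in the text, taken as given (not verified here).
QUERIES_PER_SECOND = 40_000
QUOTED_BYTES = 40_000_000_000_000  # "40 trillion bytes"

SECONDS_PER_DAY = 24 * 60 * 60


def to_gigabytes(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (assumption: 1 GB = 10**9 bytes)."""
    return n_bytes / 10**9


def to_terabytes(n_bytes: int) -> float:
    """Convert bytes to decimal terabytes (assumption: 1 TB = 10**12 bytes)."""
    return n_bytes / 10**12


# How many queries does 40 thousand per second add up to in one day?
queries_per_day = QUERIES_PER_SECOND * SECONDS_PER_DAY

print(queries_per_day)             # 3456000000
print(to_gigabytes(QUOTED_BYTES))  # 40000.0
print(to_terabytes(QUOTED_BYTES))  # 40.0
```

In other words, 40 thousand queries per second is roughly 3.5 billion queries a day, and 40 trillion bytes is the same thing as 40 000 gigabytes or 40 terabytes.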
Big Data is just big, varied data. Extracting, processing and analysing it is hard and takes a lot of effort, so the companies that gather all this information use very fast servers (computers) to manage and store such a large amount of it. Data in general is also very valuable. It’s easy to argue that it’s the most valuable thing in the whole world, because it leads to new information about clients (internet users). The concept of Big Data was created, or at least expanded, with the rise of the internet, because the speed and capacity of data processing have increased drastically.
The value of Big Data depends on how you manage it, not only on how much of it you have. Data generated by various sources can be used in an IT company to lower costs, save time, produce a new commercial offer or make new, better strategic decisions.
Mass data combined with advanced analysis and new high-tech tools also supports business processes such as finding the causes of errors and faulty work, generating coupons in all kinds of web stores, calculating the risk of transactions or detecting behaviour that could indicate abuse of a company or organization.
So, to summarize and generally speaking, big data is all the data that can be gathered through the internet about everything: people, companies, shares, transactions, conversations, preferences, any kind of goods, history, etc.
Now on to the second part of the topic: what is 5V? This abbreviation stands for Volume, Variety, Velocity, Veracity and Value. These are the key characteristics that make big data such a big deal.
Volume – the total amount of data gathered and stored as people use the world wide web. The size of the data determines its value and potential insight, and whether it can be considered big data or not.
Variety – the type and nature of the data. Knowing this helps the people who analyze it to use the resulting insight effectively. Big data draws from text, images, audio and video, and it fills in missing pieces through data fusion.
Velocity – the speed at which data is harvested and processed across the internet. As mentioned above, processing the data is painstaking, but doing it quickly is crucial. This forces companies interested in collecting big data to invest large amounts of money in hardware that can extract all the required information on time. To increase velocity, a process called sampling is widely used.
Veracity – a sort of extension to the definition of big data, which refers to both the quality and the value of the data. The quality of data can vary greatly, so it is very important to squeeze everything valuable out of it and store it in the correct order.
Value – the value of the end product, which matters specifically to the user. Raw data by itself has no value at all, but there are a lot of people, and especially companies and enterprises, who are willing to pay unbelievable amounts of money for the final product. Take Facebook, for example. Just look around it and you’ll quickly realize that there’s a suspiciously large number of advertisements connected to what you are interested in, what you have recently talked about or what you just googled. All these ads are based on big data, which tells advertisers who likes what kind of product.
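The sampling mentioned under Velocity can be pictured with a tiny Python sketch. The text does not say which sampling method companies actually use, so this is only one possible illustration: reservoir sampling, which keeps a fixed-size random sample of a stream without storing the whole stream. The function name, the fake "events" stream and the seed are all assumptions made for this example.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, rng: random.Random) -> List[T]:
    """Keep a uniform random sample of k items from a stream of unknown length.

    Illustrative sketch only (classic reservoir sampling, "Algorithm R").
    """
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item survives with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream: 100 000 numbered "events" standing in for real traffic.
events = range(100_000)
sample = reservoir_sample(events, k=10, rng=random.Random(42))
print(len(sample))  # 10
```

The point is that the analysis can run on 10 items instead of 100 000, and only one pass over the data is needed, which is what makes this kind of approach attractive when data arrives quickly.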
Big data seems to be a ground-breaking invention in terms of technological progress, but in my opinion it is too much to handle for such a simple creature as a human. The rise of Big Data was like a row of dominoes, something that couldn’t be stopped once it was set in motion. The demand for this product became so high that it can’t be undone at this point. These days, Instagram knows nearly everything about its users. According to one study, the average Instagram user has a profile containing over 900 pages of information gathered about them. This is how Big Data works. It’s something brilliant, but the level of surveillance is simply scary. To me, Big Data is just very controversial: it helps in many situations, such as predicting terrorist acts or preventing an environmental crisis or a market crash, but it can also violate our privacy. “They” just know literally everything about us. A movie called “Snowden” is a great visualization of what I mean, because what else is important to us if not privacy?
It’s interesting, and also a little bit scary, how often our data appears in different datasets.