Every day, large organizations adopt the technologies that best suit them, and in doing so they face challenges that push them to explore and analyze data beyond what their everyday tools allow. It was for these organizations that Big Data, also called massive data, was created: systems for storing data at very large scale.
This storage phenomenon belongs to the new information and communication technologies. Big Data covers all the activities related to systems that store large data sets. One of its main characteristics is that it handles a large amount of information: collecting it, classifying it, and then storing it. The purpose of this collection is to produce statistical reports for organizations to use, whether for analyzing business plans, advertising, espionage, or other purposes.
Storage capacity has grown over the years: since 2008, the volume of stored data has been measured on a scale ranging from petabytes to zettabytes. Experts periodically look for new storage measures, because some areas need to store large amounts of data and existing programs do not handle them well.
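To give a sense of the gap between petabytes and zettabytes, here is a minimal sketch that converts between storage units. It assumes decimal (SI) units, where each unit is 1000 times the previous one; the names `UNITS`, `to_bytes`, and `convert` are illustrative and do not come from any particular tool.

```python
# Decimal (SI) storage units, each 1000 times larger than the previous one.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(value, unit):
    """Convert a value expressed in the given SI unit to bytes."""
    return value * 1000 ** UNITS.index(unit)


def convert(value, src, dst):
    """Convert a value from one SI storage unit to another."""
    return to_bytes(value, src) / to_bytes(1, dst)


# Number of petabytes in one zettabyte.
print(convert(1, "ZB", "PB"))
```

Under this assumption, one zettabyte corresponds to one million petabytes.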
There are thousands of tools for managing Big Data. However, not all data is alike; there are three types of data:
- Structured data: data with a very specific, well-defined structure, such as dates and numbers. Spreadsheets are one example.
- Unstructured data: data that usually has no specific format, so it cannot be stored in a spreadsheet, much less manipulated there. PDF documents are one example.
- Semi-structured data: data that does not follow a fixed format but carries its own metadata describing its content. HTML code is one example.
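The difference between the three types shows in how a program can work with each one. The sketch below uses only Python's standard library. All sample values are hypothetical, and a plain sentence stands in for the text of a PDF document, since reading an actual PDF would need an external library.

```python
import csv
import io
from html.parser import HTMLParser

# Structured: every row follows the same columns, so fields can be read by name.
structured = "date,amount\n2020-01-01,10\n2020-01-02,25\n"
rows = list(csv.DictReader(io.StringIO(structured)))
total = sum(int(row["amount"]) for row in rows)

# Semi-structured: tags describe the content, but there is no fixed table layout.
semi_structured = "<html><body><h1>Report</h1><p>Sales rose.</p></body></html>"


class TagCollector(HTMLParser):
    """Collects the names of the opening tags found in an HTML string."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed(semi_structured)

# Unstructured: free text with no fields or tags, so only generic
# operations such as counting words apply directly.
unstructured = "Sales rose sharply after the campaign started."
word_count = len(unstructured.split())

print(total, collector.tags, word_count)
```

The structured sample can be summed by column name, the semi-structured sample can be navigated through its tags, and the unstructured sample offers nothing more specific than its raw text.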