Big Data: Understanding New Insights
Big Data: What's that?
The term "Big Data" was introduced a decade ago. It refers to a massive volume of data, both structured and unstructured, that is so large it is difficult to process using traditional software and database techniques.
Big Data is a broad classification for data sets that are large and complex. These data sets are analyzed for new insights, which makes big data well suited to predictive analysis. Big Data can reveal new correlations useful for 'preventing disease, spotting business trends, even predicting crimes.'
Large Doesn't Mean It's Big
Whether a data set counts as Big Data depends on the system handling it. Take an email system whose maximum attachment size is 25 MB. 25 MB is tiny compared with a 25 GB data set, yet for the email system a file exceeding 25 MB is effectively big data because the system cannot handle or store it easily.
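The idea that "big" is relative to the system can be sketched in a few lines of Python. This is only an illustration: the function name `is_big_for` and the 500 GB capacity of the second system are assumptions made for the example, not figures from any real product.

```python
MB = 1024 ** 2
GB = 1024 ** 3

# The 25 MB attachment limit from the email example above.
EMAIL_ATTACHMENT_LIMIT = 25 * MB

# Hypothetical capacity of a larger storage system (assumed for illustration).
WAREHOUSE_LIMIT = 500 * GB


def is_big_for(system_limit_bytes: int, data_size_bytes: int) -> bool:
    """Return True when the data exceeds what the given system can handle."""
    return data_size_bytes > system_limit_bytes


# A 30 MB file is "big" for the email system...
print(is_big_for(EMAIL_ATTACHMENT_LIMIT, 30 * MB))
# ...while a 25 GB data set is not "big" for the larger system.
print(is_big_for(WAREHOUSE_LIMIT, 25 * GB))
```

The same size check gives different answers depending on the limit passed in, which is the whole point: the data has not changed, only the system measuring it.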
Getting Started With Big Data
As new trends and technologies emerge, Big Data is becoming one of the most important of them, with the potential to change how information is used to enhance customer experience and create business models.
Big Data has enabled organizations to store, manage, and process vast amounts of data at the right speed and time to gain the right insights. Most organizations are at an early stage in their big data journey. They are experimenting with techniques that allow them to collect huge amounts of data and find the hidden treasures within it, which can give an early indication of an important change.
A Big Data solution requires quality infrastructure to support the scalability, distribution, and management of data.
Big Data Evolution And Characteristics
Managing and analyzing data has always offered great benefits to organizations. Traditionally, companies didn't have much data to deal with. Customers bought the same product in the same way, which kept things simple and straightforward. Over time, competition among companies led to new products being launched, and that complicated everything.
From spreadsheets to Relational Database Management Systems (RDBMS) to Distributed File Systems (DFS), data storage systems have changed, and so have the methods used to process and analyze the data.
Big Data is defined as any kind of data source that shares at least three characteristics:
- Extremely large Volumes of data: The sheer quantity of data being generated.
- Extremely high Velocity of data: How fast the data is generated and processed to meet demand.
- Extremely wide Variety of data: The data may be structured (database records, XML files, etc.) or unstructured (pictures, videos, etc.).
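The three characteristics above can be sketched as a simple check in Python. This is a rough illustration only: the `DatasetProfile` class, the `has_three_vs` function, and every threshold value are assumptions chosen for the example, since the text does not define numeric cut-offs.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    """Hypothetical summary of a data source along the three Vs."""
    total_bytes: int            # Volume: how much data there is
    records_per_second: float   # Velocity: how fast new data arrives
    formats: set                # Variety: kinds of data (e.g. "xml", "video")


def has_three_vs(profile: DatasetProfile,
                 min_bytes: int,
                 min_rate: float,
                 min_formats: int) -> bool:
    """Return True only if volume, velocity, and variety all meet the thresholds."""
    return (profile.total_bytes >= min_bytes
            and profile.records_per_second >= min_rate
            and len(profile.formats) >= min_formats)


TB = 1024 ** 4

# Example values are made up for illustration.
clickstream = DatasetProfile(
    total_bytes=40 * TB,
    records_per_second=12_000.0,
    formats={"xml", "image", "video"},
)
spreadsheet = DatasetProfile(
    total_bytes=5 * 1024 ** 2,
    records_per_second=0.01,
    formats={"csv"},
)

print(has_three_vs(clickstream, min_bytes=1 * TB, min_rate=1_000.0, min_formats=2))
print(has_three_vs(spreadsheet, min_bytes=1 * TB, min_rate=1_000.0, min_formats=2))
```

All three conditions have to hold at once, so a source that is large but slow and uniform would not qualify under this sketch.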
Types of Big Data
Big Data comes in many varieties, from dollar transactions to a tweeted image. All of this information needs to be integrated for analysis and data management.
Data management has been around for a long time. What makes it difficult now is:
- New data sources, such as data generated by sensors, smartphones, and tablets.
- Data that was generated in the past but never captured, because there was no cost-effective way to deal with it.
In structured data, the data has a defined length and format. It accounts for almost 20% of the data that is out there. Examples of structured data include numbers, dates, and groups of words and numbers called strings.
Structured data is the data we deal with most, and it is usually stored in a database. Advances in technology provide newer sources of structured data, often in real time and in large volumes.
Types of Structured Data
Computer or Machine Generated
Web Log Data
Click Stream Data
Gaming Related Data
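Web log data is a good illustration of what "defined length and format" means in practice. The sketch below assumes the log lines follow the widely used Common Log Format; the pattern, the `parse_log_line` helper, and the sample line are illustrative and not taken from any particular server configuration.

```python
import re
from datetime import datetime

# Pattern for one Common Log Format line (assumed format for this example).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line: str):
    """Split one log line into typed fields, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the text fields that have a natural type: a date and two numbers.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
record = parse_log_line(sample)
print(record["host"], record["status"], record["size"])
```

Because every line has the same fields in the same order, each one can be turned into numbers, dates, and strings and loaded straight into a database table.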
Unstructured data is everywhere. In fact, most individuals and organizations conduct their lives around unstructured data. Just as with structured data, unstructured data is either machine generated or human generated.
Types of Unstructured Data
Computer or Machine Generated
Text internal to your company
Social media data
Photographs and video
Radar or sonar data
Setting The Architectural Foundation
Before we start talking about architecture, let's take into account the functional requirements of big data.
To begin with, the data is captured, and then organized and integrated for analysis. Analysis is based on the problem being solved. Management then takes action on the results obtained from the analysis. Together, these steps form the big data management cycle.
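The cycle described above (capture, organize, integrate, analyze, act) can be sketched as a chain of small Python functions. Everything here is a toy stand-in: the function names, the sample product records, and the "restock" action are assumptions made for illustration, not part of any real big data platform.

```python
from collections import Counter


def capture():
    """Capture: collect raw, messy records (hypothetical sample data)."""
    return [" Laptop ", "phone", "LAPTOP", "", "tablet", "Phone", "laptop"]


def organize(raw_records):
    """Organize: clean the records and drop empty ones."""
    return [r.strip().lower() for r in raw_records if r.strip()]


def integrate(*sources):
    """Integrate: merge several cleaned sources into one data set."""
    merged = []
    for source in sources:
        merged.extend(source)
    return merged


def analyze(records):
    """Analyze: answer the question being asked, here 'what sells most?'."""
    return Counter(records).most_common(1)[0]


def act(finding):
    """Act: turn the analysis result into a decision."""
    product, count = finding
    return f"Restock {product} (seen {count} times)"


cleaned = organize(capture())
combined = integrate(cleaned, ["laptop"])  # second source is also made up
decision = act(analyze(combined))
print(decision)  # Restock laptop (seen 4 times)
```

Each stage only depends on the output of the one before it, so a stage can be replaced with something more capable without touching the rest of the cycle.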
In addition to these functional requirements, performance is also of utmost importance. We need the right amount of computational resources, such as processing power and speed.
The Big Data Journey
Companies have always had to deal with lots of data to outrank their competitors. With the right technology in place, companies can solve business problems and react to opportunities.
With big data, data patterns can be analyzed to manage cities, prevent failures, manage traffic, improve customer satisfaction, and more.