The first question that arises in one's mind on hearing the term Big Data is: how big is "BIG"?
Big data might be petabytes (1,024 terabytes) or exabytes (1,024 petabytes) of data, consisting of billions to trillions of records about millions of people, all from different sources (e.g. web, sales, customer contact centers, social media, mobile data and so on).
Hence it can be concluded that big data is similar to small data, only bigger in size.
An important example would be:
Walmart handles more than 1 million customer transactions every hour.
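To put these sizes in perspective, here is a small Python sketch of the unit arithmetic. It uses the binary units quoted above (1,024 per step); the function names are my own and purely illustrative.

```python
# Binary units, as quoted in the text: 1 PB = 1,024 TB and 1 EB = 1,024 PB.
TB_PER_PB = 1024
PB_PER_EB = 1024


def exabytes_to_terabytes(exabytes):
    """Convert exabytes to terabytes using the 1,024-based units."""
    return exabytes * PB_PER_EB * TB_PER_PB


def per_second(per_hour):
    """Convert an hourly rate into a per-second rate."""
    return per_hour / 3600


print(exabytes_to_terabytes(1))          # 1048576 terabytes in one exabyte
print(round(per_second(1_000_000), 1))   # 277.8 transactions per second
```

So one million transactions an hour works out to roughly 278 transactions every second, around the clock.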
The 3 major characteristics of Big Data:
Volume: Facebook ingests 500 terabytes of new data every day
Velocity: High-frequency stock trading algorithms reflect market changes within microseconds
Variety: Big data isn't just numbers, dates and strings. It is also 3D data, audio, video, unstructured text, log files and social media
Big data may be:
Structured: Traditional data warehousing
Semi-structured: Text Mining
Unstructured: Video Surveillance
Types of tools used in Big Data:
Where is processing hosted?
- Distributed servers/cloud (Amazon EC2)
Where is data stored?
- Distributed storage (Amazon S3)
What is the programming model?
- Distributed processing (MapReduce)
How is data stored and indexed?
- High-performance schema-free databases (MongoDB)
What operations are performed on data?
- Analytic processing
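To give a feel for the MapReduce programming model mentioned above, here is a minimal single-machine sketch of the classic word count in plain Python. It only imitates the map, shuffle and reduce steps in one process; a real framework would spread them across many servers. The function names and the sample lines are assumptions for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two short lines of text.
lines = ["big data is big", "data is everywhere"]
print(word_count(lines))
# {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

The point of the model is that each map call and each reduce call is independent of the others, which is what lets a cluster run them in parallel.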
Some important points:
Big data helps us capture the social media explosion of present times.
Unlike the traditional approach of analysing a subset of information collected as samples, here we analyse the entire data set, which results in better business decisions, both strategic and operational.
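The difference between working from a sample and working from the entire data set can be illustrated with a rough Python sketch. The purchase amounts are synthetic and generated only for this illustration (the distribution, its mean of 50 and the sample size of 500 are all assumptions, not real figures).

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical data: 100,000 synthetic purchase amounts with a mean near 50.
population = [random.expovariate(1 / 50) for _ in range(100_000)]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def sample_estimate(values, sample_size):
    """Estimate the mean from a random sample instead of every record."""
    return mean(random.sample(values, sample_size))


full = mean(population)                      # uses every record
estimate = sample_estimate(population, 500)  # traditional sampling approach

print(f"Mean over the entire data set: {full:.2f}")
print(f"Mean from a 500-record sample: {estimate:.2f}")
print(f"Sampling error: {abs(full - estimate):.2f}")
```

The sample gives only an estimate with some error, while the full pass gives the exact figure for the data at hand. Whether that difference matters depends on the decision being made.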
Big Data - The Indian Way
A Hyderabad-based analytics firm named Modak Analytics built India’s first Big Data-based electoral repository system.
The company brought together data on 81.4 crore Indian voters: 18 terabytes of data, of which 10 TB was in .pdf format.
Mentioned below are a few insights from their report:
Of the 13.4 crore voters in Uttar Pradesh, the country’s biggest State by number of voters, at least 1.2 crore people have Ram somewhere in their name.
In Andhra Pradesh, the name Srinivas is spelt 600 different ways. About three lakh women in Gujarat have Gita Ben as their first name, while Bihar is home to 3.27 lakh women with Sita as their first name and an almost equal number of women named Geeta. Ramesh seems to be the most common first name across the nation.
The other names that are quite popular are: Lakshmi (19.28 lakh, Andhra Pradesh), Fernandes (81,000, Goa), Shankar (11.41 lakh) and Patil (24 lakh, Maharashtra).
The two longest voter names are registered in Andhra Pradesh: E Janake Sathya Surya Vijaya Durga Maheshvari in Sangareddy constituency and Venkata Sathya Suriya Maitreyi Kumari Toleti in Narsapur constituency.
In Chhattisgarh, the age of one voter is recorded as 19,545 years, while 64 voters in AP have ‘0’ years of age.
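Findings like the impossible ages and the many spellings of Srinivas are typical data-quality problems. The sketch below shows one simple way such records could be flagged in Python. It is not Modak Analytics' method: the records are made up, and the age range (18 to 120) and the similarity threshold (0.8) are my own assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical voter records, invented for this illustration.
voters = [
    {"name": "Srinivas", "age": 34},
    {"name": "Sreenivas", "age": 0},
    {"name": "Srinivaas", "age": 19545},
    {"name": "Ramesh", "age": 52},
]

# Assumed plausible age range for a registered voter.
MIN_AGE, MAX_AGE = 18, 120


def invalid_ages(records):
    """Return the records whose age falls outside the plausible range."""
    return [r for r in records if not MIN_AGE <= r["age"] <= MAX_AGE]


def is_variant(name, canonical, threshold=0.8):
    """Treat a name as a spelling variant if it is similar enough to canonical."""
    ratio = SequenceMatcher(None, name.lower(), canonical.lower()).ratio()
    return ratio >= threshold


print([r["name"] for r in invalid_ages(voters)])
print([r["name"] for r in voters if is_variant(r["name"], "Srinivas")])
```

On a real electoral roll the matching would need far more care (transliteration, initials, honorifics), but the idea of validating ranges and grouping near-identical spellings stays the same.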
Thus I conclude with the words of Andrew McAfee: "The world is one big data problem."
Please visit the following links:
http://datasceptre.blogspot.in/
datadosage.blogspot.in
http://analyticsyatra.blogspot.in/
http://praxis.ac.in/