Apache Hadoop is an open source software platform managed by the Apache Software Foundation has proven to be very helpful in storing and managing vast amounts of data cheaply and efficiently.
Hadoop is an open source JAVA framework.
Hadoop is not a database:
- Which allows you to store large amount of data on clusters of low cost commodity hardware.
- Provides you the capability to process that distributed data using simple programming model.
Hadoop an efficient distributed file system and not a database. It is designed specifically for information that comes in many forms, such as social media, server log files or other documents. Anything that can be stored as a file can be placed in a Hadoop repository.
Origin of Apache Hadoop:
The origin of Apache Hadoop Projects is from Google White paper series on BigTable, MapReduce and GFS. Google has written white papers on these topics but the code and implementation was never done.
Later on Yahoo and many other contributors implemented Google's white paper and that's how Hadoop framework developed.
Doug Cutting, Hadoop's creator, named the framework after his child's stuffed toy elephant.