Apache Hadoop has two core components:
- HDFS (Hadoop Distributed File System)
- MapReduce (a distributed programming framework)
The most important aspect of Hadoop is that HDFS and MapReduce are designed with each other in mind: computation is performed at the location of the data, rather than moving the data to the compute location.
Because data storage and compute reside on the same physical nodes in the cluster, MapReduce does not need to ship large datasets across the network, so network bandwidth and similar constraints are far less of a bottleneck when processing data at scale.
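The map–shuffle–reduce flow behind this model can be sketched in plain Python. This is a conceptual illustration of the programming model only (a word count, the classic example), not Hadoop's actual Java API; the function names here are hypothetical:

```python
from collections import defaultdict

def map_phase(records):
    # Each mapper emits (key, value) pairs; here, (word, 1) per word.
    # In Hadoop, mappers run on the nodes where the HDFS blocks live.
    for record in records:
        for word in record.split():
            yield word, 1

def shuffle(pairs):
    # Group values by key, as the framework does between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Each reducer aggregates one key's values; here, summing counts.
    return {key: sum(values) for key, values in grouped.items()}

lines = ["hdfs stores blocks", "mapreduce processes blocks"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["blocks"])  # 2
```

The shuffle step is the only point where data crosses the network in a real cluster; the map phase runs against local blocks, which is exactly the data-locality property described above.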