Apache Hadoop

A software framework for distributed storage and processing that uses a network of many computers to solve problems involving massive amounts of data and computation, based on the MapReduce programming model.

+ Distributed Computing: Hadoop enables the processing of large data sets across clusters of computers.
+ Scalability: It can scale from single servers to thousands of machines.
+ Storage: Each machine offers local computation and storage.
+ Programming Model: Uses simple programming models for distributed data processing.
+ Fault Tolerance: Designed to handle failures at the application layer.
+ HDFS: High-throughput access to application data via the Hadoop Distributed File System.
+ YARN: Short for "Yet Another Resource Negotiator"; manages resources and job scheduling across the cluster.
+ MapReduce: A system for parallel processing of large data sets, running on top of YARN.
- Complex Setup: Intricate setup process, particularly challenging for beginners.
- Batch Processing: Slower batch-processing model compared to alternatives like Apache Spark.
- No Real-Time Processing: No support for real-time data processing.
- High Latency: Elevated latency stemming from batch processing.
- Single Point of Failure: Vulnerability due to reliance on a single master node.
- Data Locality Constraints: Difficulties in ensuring data locality.
- Scalability Challenges: Complications associated with scaling Hadoop clusters.
- Resource-Intensive: Significant hardware resources required for operation.
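
To make the MapReduce programming model mentioned above more concrete, here is a minimal in-memory sketch of a word count in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration. On a real cluster the framework would run the map and reduce steps in parallel on many machines and perform the shuffle between them.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key (done by the framework in a real job)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce sequentially over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Illustrative input only; real jobs read their input from HDFS.
counts = word_count(["big data big cluster", "data node"])
```

Because each map call depends only on its own line and each reduce call only on its own key, both steps could in principle be spread across machines, which is the property the model relies on.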

System Requirements

Minimum
1. Java, depending on the Hadoop version:
  • Apache Hadoop 3.3 and later supports Java 8 and Java 11 (runtime only). Compile Hadoop with Java 8; compiling Hadoop with Java 11 is not supported.
  • Apache Hadoop 3.0.x to 3.2.x supports only Java 8.
  • Apache Hadoop 2.7.x to 2.10.x supports both Java 7 and Java 8.
  • Apache Hadoop 2.6 and earlier supports Java 6.
2. SSH installed and sshd running, so that the Hadoop scripts that manage remote Hadoop daemons can be used.
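
The Java requirements listed above can be expressed as a small lookup. This is a sketch only: supported_java_versions is a hypothetical helper, not part of Hadoop, and it encodes nothing beyond the four rules in the list (runtime support, not compile support).

```python
def supported_java_versions(hadoop_version):
    """Return the Java runtime versions listed for a given Hadoop version string.

    Hypothetical helper that encodes only the rules in the requirements list above.
    """
    parts = hadoop_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0

    if (major, minor) >= (3, 3):
        return {8, 11}   # Java 11 is runtime only; compile with Java 8
    if major == 3:
        return {8}       # 3.0.x to 3.2.x
    if (major, minor) >= (2, 7):
        return {7, 8}    # 2.7.x to 2.10.x
    return {6}           # 2.6 or earlier, as stated in the list


# Example: check whether a planned runtime matches the listed requirement.
java_ok = 11 in supported_java_versions("3.3.6")
```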

Ratings

Overall: 4.15 / 5
G2CROWD: 4.3 / 5 (based on 81 reviews)
TrustRadius: 8.0 / 10 (based on 214 reviews)

Written in

Java, C++, C

Initial Release

1 April 2006

Alternatives

Distributed File System
No alternative software available under 'Distributed File System' category.
Cloud Computing
Apache Mahout, Apache Spark

Notes