
Apache Hadoop

A software framework for distributed storage and processing. It facilitates using a network of many computers to solve problems involving massive amounts of data and computation with the MapReduce programming model.


+
Distributed Computing
Hadoop enables the processing of large data sets across clusters of computers.
+
Scalability
It can scale from single servers to thousands of machines.
+
Storage
Each machine in the cluster offers local computation and storage.
+
Programming Model
Utilizes simple programming models for distributed data processing.
+
Fault Tolerance
Designed to handle failures at the application layer.
+
HDFS
High-throughput access to application data via the Hadoop Distributed File System.
+
YARN
Short for “Yet Another Resource Negotiator”; manages resources and job scheduling across the cluster.
+
MapReduce
A system for parallel processing of large data sets within the YARN framework.
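
The MapReduce model named above can be illustrated with a small word count. This is a minimal sketch in plain Python, not Hadoop's own API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and everything runs in one process, whereas Hadoop would distribute the map and reduce steps across cluster nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key, the step a framework
    performs between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: collapse all values for one key into a single result."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(word_count(sample))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, both steps can in principle run in parallel on different machines, which is the property the model relies on.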
-
Complex Setup
Intricate setup process, particularly challenging for beginners.
-
Batch Processing
Slower batch-processing model compared to alternatives like Apache Spark.
-
No Real-Time Processing
Absence of support for real-time data processing.
-
High Latency
Elevated latency stemming from batch processing.
-
Single Point of Failure
Reliance on a single master node (the NameNode) creates a single point of failure unless high availability is configured.
-
Data Locality Constraints
Difficulties in ensuring data locality.
-
Scalability Challenges
Complications associated with scaling Hadoop clusters.
-
Resource-Intensive
Significant hardware resources required for operation.

Platform

Desktop

Social

System Requirements

Minimum
1. Java
  • Apache Hadoop 3.3 and later support Java 8 and Java 11 (runtime only). Compile Hadoop with Java 8; compiling Hadoop with Java 11 is not supported.
  • Apache Hadoop 3.0.x to 3.2.x support only Java 8.
  • Apache Hadoop 2.7.x to 2.10.x support both Java 7 and Java 8.
  • Apache Hadoop 2.6 and earlier support Java 6.
2. SSH installed and sshd running, so that the Hadoop scripts that manage remote Hadoop daemons can be used.
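
The Java version rules in the list above can be expressed as a small lookup. This is an illustrative sketch only: `supported_java` is a hypothetical helper, not part of Hadoop, and it encodes nothing beyond what the list states. For 2.6 and earlier it returns only Java 6 because that is the only version the list names for those releases.

```python
def supported_java(hadoop_version):
    """Return the Java major versions listed for a Hadoop release
    string such as "3.3.6" (hypothetical helper, based on the list above)."""
    major, minor = (int(part) for part in hadoop_version.split(".")[:2])
    release = (major, minor)
    if release >= (3, 3):
        return [8, 11]  # Java 11 is runtime only; compile with Java 8
    if release >= (3, 0):
        return [8]
    if release >= (2, 7):
        return [7, 8]
    return [6]  # 2.6 and earlier, as stated in the list


if __name__ == "__main__":
    for version in ("3.3.6", "3.1.0", "2.10.2", "2.6.0"):
        print(version, supported_java(version))
```

Comparing `(major, minor)` tuples rather than raw strings keeps releases such as 2.10 correctly ordered after 2.7.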

Ratings

4.15 / 5

G2CROWD: 4.3 / 5, based on 81 reviews
TrustRadius: 8.0 / 10, based on 214 reviews

Developer

Written in

Java, C++, C

Initial Release

1 April 2006

Repository

License

Categories

Alternatives

Distributed File System
No alternative software available under 'Distributed File System' category.
Cloud Computing

Notes