Apache Hadoop
A software framework for distributed storage and processing of big data using the MapReduce programming model. It facilitates using a network of many computers to solve problems involving massive amounts of data and computation.
+ | Distributed Computing | Hadoop enables the processing of large data sets across clusters of computers. |
---|---|---|
+ | Scalability | It can scale from single servers to thousands of machines. |
+ | Storage | Offers local computation and storage capabilities. |
+ | Programming Model | Utilizes simple programming models for distributed data processing. |
+ | Fault Tolerance | Designed to handle failures at the application layer. |
+ | HDFS | High-throughput access to application data via the Hadoop Distributed File System. |
+ | YARN | Short for “Yet Another Resource Negotiator”; manages resources and job scheduling across the cluster. |
+ | MapReduce | A system for parallel processing of large data sets within the YARN framework. |
- | Complex Setup | Intricate setup process, particularly challenging for beginners. |
- | Batch Processing | Slower batch-processing model compared to alternatives like Apache Spark. |
- | No Real-Time Processing | Absence of support for real-time data processing. |
- | High Latency | Elevated latency stemming from batch processing. |
- | Single Point of Failure | Vulnerability due to reliance on a single master node when high availability is not configured. |
- | Data Locality Constraints | Difficulties in ensuring data locality. |
- | Scalability Challenges | Complications associated with scaling Hadoop clusters. |
- | Resource-Intensive | Significant hardware resources required for operation. |
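The MapReduce programming model listed above can be illustrated with a small, single-process sketch. This is not Hadoop's API; it only imitates the map, shuffle, and reduce steps in plain Python on one machine. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(word_count(sample))
```

In an actual cluster the map and reduce steps would run in parallel on different machines, with the shuffle moving data between them; the sketch only shows how the three steps fit together.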
System Requirements
# | Minimum |
---|---|
1 | |
2 | SSH installed and SSHD running to use the Hadoop scripts that manage remote Hadoop daemons |
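The SSH requirement above can be checked before running the Hadoop scripts. The following is a minimal sketch, assuming the `ssh` client is on the `PATH` and that sshd listens on the default port 22 of `localhost`; both helper names are hypothetical, and a successful TCP connection only suggests that some service is listening on that port.

```python
import shutil
import socket


def ssh_client_installed():
    """Return True if an `ssh` executable is found on the PATH."""
    return shutil.which("ssh") is not None


def sshd_listening(host="localhost", port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Assumption: sshd uses the default port 22; adjust if configured otherwise.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("ssh client installed:", ssh_client_installed())
    print("sshd reachable on localhost:22:", sshd_listening())
```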
License
Categories
Alternatives
Distributed File System
No alternative software available under the 'Distributed File System' category.
Cloud Computing
Apache Mahout Apache Spark
Notes
- The Apache and Apache Hadoop names and logos are trademarks of the Apache Software Foundation.
- The optimal hardware system requirements are not taken from the official website.