Apache SystemDS
A machine learning system for the end-to-end data science lifecycle, encompassing data integration, cleaning, feature engineering, efficient model training, and deployment.
+ | Algorithm Customizability | Allows algorithms to be customized through R-like and Python-like languages. |
---|---|---|
+ | Hybrid Execution Plans | Combines local, in-memory CPU and GPU operations with distributed operations on Apache Spark. |
+ | Multiple Execution Modes | Includes Spark MLContext, Spark Batch, Hadoop Batch, Standalone, and JMLC for varied operational needs. |
+ | Automatic Optimization | Optimizes based on data and cluster characteristics for efficiency and scalability. |
+ | Declarative Languages | Provides R-like syntax for various data science tasks. |
+ | Compressed Linear Algebra | Improves large-scale machine learning through compressed linear algebra. |
+ | Principal Component Analysis | Provides scripts for statistical analysis, including principal component analysis. |
+ | Compatibility | Supports Java 11 and Python 3.5+, so it can be used from popular programming languages. |
+ | Integration | Works with Hadoop 3.3.x and Spark 3.5.x for big data processing. |
+ | Nvidia CUDA and Intel MKL Support | Utilizes Nvidia CUDA 10.2 and Intel MKL (<=2019.x) for enhanced performance. |
- | Limited Monitoring | Lacks a comprehensive set of monitoring and management tools, making it difficult to diagnose and troubleshoot issues. |
- | Performance Impact of Message Tweaking | Modifying messages can significantly decrease performance, limiting flexibility. |
- | No Wildcard Topic Selection | Only supports matching exact topic names, which can be restrictive for complex messaging needs. |
- | Potential Performance Reduction | Data compression and decompression by brokers and consumers can negatively impact throughput. |
- | Instability with a High Number of Queues | As the number of queues in the Kafka cluster increases, SystemDS may become unstable. |
- | Missing Message Paradigms | Doesn’t support certain message paradigms, such as point-to-point queues, hindering its use in specific scenarios. |
System Requirements
Not available, but we appreciate help! You can help us improve this page by contacting us.
Developer
Written in
Java, R, Python, Scala
Initial Release
Not available, but we appreciate help! You can help us improve this page by contacting us.
Repository
License
Categories
Alternatives
Machine Learning
Massive Online Analysis, TensorFlow, Apache Mahout, Apache Spark, Apache MXNet, Eclipse Deeplearning4j, MALLET, mlpack, OpenCV, Orange, PyTorch, scikit-learn, The Microsoft Cognitive Toolkit, Torch, Weka, Yooreeka
Deep Learning
TensorFlow, Apache MXNet, Caffe, Eclipse Deeplearning4j, OpenNN, PyTorch, The Microsoft Cognitive Toolkit, Torch, Weka
Notes
- Apache Spark is a prerequisite for installing Apache SystemDS, so the platforms listed for Apache SystemDS are those on which Apache Spark is available.
- Apache SystemML is now Apache SystemDS.