Accord.NET is a framework for scientific computing in .NET. It comprises multiple libraries covering a wide range of scientific computing applications, such as statistical data processing, machine learning, artificial intelligence and pattern recognition, including but not limited to computer vision and computer audition. The framework offers a large number of probability distributions, hypothesis tests and kernel functions, and supports most popular performance measurement techniques. The libraries are available as source code as well as via executable installers and NuGet packages.
AForge.NET is a computer vision and artificial intelligence library originally developed by Andrew Kirillov for the .NET Framework. The source code and binaries of the project are available under the terms of the Lesser GPL and the GPL (GNU General Public License). Another (unaffiliated) project called Accord.NET was created to extend the features of the original AForge.NET library.
Apache(TM) Hadoop(R) is a library framework that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides distributed storage and processing of big data using the MapReduce programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.
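The MapReduce programming model mentioned above can be illustrated with a minimal single-process sketch. This is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would distribute each phase across machines and handle failures.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence (hypothetical driver)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data big cluster", "data at rest"])
print(counts)  # {'big': 2, 'data': 2, 'cluster': 1, 'at': 1, 'rest': 1}
```

Because the map and reduce steps only see one record or one key at a time, each can in principle run on a different machine, which is what lets the model scale out.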
Apache Mahout(TM) is an open source project that is primarily used for creating scalable machine learning algorithms. It implements machine learning techniques such as collaborative filtering, clustering, recommendation and classification. It also provides Java libraries for common math operations (focused on linear algebra and statistics) and primitive Java collections. Mahout is a word used in South Asian countries for a person who drives an elephant as its master. The name comes from the project's close association with Apache Hadoop, which uses an elephant as its logo.
Apache MXNet is an open source, multi-language machine learning (ML) library designed to train and deploy deep neural networks on a wide array of devices. Embedded in the host language, it blends declarative symbolic expression with imperative tensor computation. It is built on a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient.
Apache Spark(TM) is an open-source, distributed, general-purpose cluster computing framework with a (mostly) in-memory data processing engine. It can do ETL, analytics, machine learning and graph processing on large volumes of data at rest (batch processing) or in motion (streaming processing), with rich, concise high-level APIs for Scala, Python, Java, R, and SQL. In contrast to Hadoop’s two-stage disk-based MapReduce computation engine, Spark’s multi-stage (mostly) in-memory computing engine allows for running most computations in memory, and hence most of the time provides better performance for certain applications.
Caffe (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework originally developed at the University of California, Berkeley. It is made with expression, speed, and modularity in mind, and is developed by Berkeley AI Research (BAIR) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley (source: official website). Caffe allows switching between CPU and GPU by setting a single flag.
Eclipse Deeplearning4j is a deep learning programming library written for Java and Scala, and a computing framework with wide support for deep learning algorithms. There are many knobs to turn when training a distributed deep-learning network. The project states that it has done its best to explain them, so that Eclipse Deeplearning4j can serve as a DIY tool for Java, Scala and Clojure programmers working on Hadoop and other file systems (source: official website).
ELKI (Environment for Developing KDD-Applications Supported by Index-Structures, where KDD stands for Knowledge Discovery in Databases, or "data mining") is open source (AGPLv3) data mining software written in Java. The focus of ELKI is research in algorithms, with an emphasis on unsupervised methods in cluster analysis and outlier detection. To achieve high performance and scalability, ELKI offers data index structures such as the R*-tree that can provide major performance gains. ELKI is designed to be easy to extend for researchers and students in this domain, and welcomes contributions of additional methods.
MALLET (MAchine Learning for LanguagE Toolkit) is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text (source: official website). MALLET includes tools for document classification, sequence tagging, and topic modeling. Many of the algorithms in MALLET depend on numerical optimization, and MALLET includes an efficient implementation of Limited Memory BFGS, among many other optimization methods.
MOA (Massive Online Analysis) is an open source framework for data stream mining. It includes machine learning algorithms for classification, regression, clustering, outlier detection, concept drift detection and recommender systems, as well as tools for evaluation. MOA is written in Java and is related to the WEKA project. MOA makes it possible to build and run machine learning or data mining experiments on evolving data streams. It is also possible to use WEKA classifiers from MOA, and MOA classifiers and streams from WEKA.
Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits. … You can generate plots, histograms, power spectra, bar charts, errorcharts, scatterplots, etc., with just a few lines of code (source: official website).
mlpack is a C++ machine learning library with an emphasis on scalability, speed, and ease of use. Its aim is to make machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and maximum flexibility for expert users. It does this by providing a set of command-line executables that can be used as black boxes, and a modular C++ API that lets expert users and researchers easily change the internals of the algorithms.
NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. NumPy is the fundamental package for scientific computing with Python. It contains, among other things: a powerful N-dimensional array object; sophisticated (broadcasting) functions; tools for integrating C/C++ and Fortran code; and useful linear algebra, Fourier transform, and random number capabilities. Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data.
OpenCog is a project that aims to build an open source artificial intelligence framework. OpenCog Prime is an architecture for robot and virtual embodied cognition that defines a set of interacting components designed to give rise to human-equivalent artificial general intelligence (AGI) as an emergent phenomenon of the whole system. OpenCog Prime’s design is primarily the work of Ben Goertzel while the OpenCog framework is intended as a generic framework for broad-based AGI research.
OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. It has C++, Python and Java interfaces and supports Windows, Linux, Mac OS, iOS and Android. OpenCV was designed for computational efficiency and with a strong focus on real-time applications.
OpenNN (Open Neural Networks Library) implements neural networks, a main area of deep learning research. OpenNN implements data mining methods as a bundle of functions. These functions can be embedded in other software tools through an application programming interface (API) that handles the interaction between the software tool and the predictive analytics tasks. A graphical user interface (GUI) is still missing, but some functions can support the integration of specific visualization tools.
pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series (source: Wikipedia).
PyTorch is an open-source machine learning library for Python, based on Torch, used for applications such as natural language processing. One can also reuse Python packages such as NumPy, SciPy and Cython to extend PyTorch when needed. PyTorch provides two high-level features: tensor computation (like NumPy) with strong GPU acceleration, and deep neural networks built on a tape-based autodiff system (source: README.md in the GitHub repository). Automatic differentiation is done with a tape-based system at both a functional and neural network layer level.
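To make the idea of tape-based automatic differentiation concrete, here is a minimal pure-Python sketch for scalars. It is an illustration of the general technique only, not PyTorch's implementation or API: the `Tape` and `Var` classes and the `backprop` function are hypothetical names, and only addition and multiplication are supported.

```python
class Tape:
    """Records one backward closure per operation, in execution order."""

    def __init__(self):
        self.steps = []


class Var:
    """A scalar value that logs every operation applied to it on a tape."""

    def __init__(self, value, tape):
        self.value = value
        self.grad = 0.0
        self.tape = tape

    def __add__(self, other):
        out = Var(self.value + other.value, self.tape)

        def backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        self.tape.steps.append(backward)
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value, self.tape)

        def backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad

        self.tape.steps.append(backward)
        return out


def backprop(output):
    """Replay the tape in reverse, accumulating gradients via the chain rule."""
    output.grad = 1.0
    for step in reversed(output.tape.steps):
        step()


tape = Tape()
x = Var(3.0, tape)
y = Var(4.0, tape)
z = x * y + x  # forward pass records two steps on the tape
backprop(z)
print(z.value, x.grad, y.grad)  # 15.0 5.0 3.0
```

The forward pass builds the tape as a side effect of ordinary arithmetic, so the differentiated program can use normal control flow; the reverse replay then yields dz/dx = y + 1 and dz/dy = x.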
scikit-learn is an open source machine learning library featuring classification, regression, clustering, dimensionality reduction, model selection and preprocessing. It has tools for data mining and data analysis, and is built on NumPy, SciPy, and matplotlib. According to the official website, it features: classification (identifying which category an object belongs to); regression (predicting a continuous-valued attribute associated with an object); clustering (automatic grouping of similar objects into sets); dimensionality reduction (reducing the number of random variables to consider); model selection (comparing, validating and choosing parameters and models); and preprocessing (feature extraction and normalization).
SciPy is a free and open-source Python library used for scientific computing and technical computing. The SciPy library is one of the core packages that make up the SciPy stack. It provides many user-friendly and efficient numerical routines, such as routines for numerical integration and optimization (source: official website). SciPy refers to several related but distinct entities, including the SciPy ecosystem, a collection of open source software for scientific computing in Python.
TensorFlow is an open source software library for high-performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. TensorFlow was originally developed by researchers and engineers from the Google Brain team within Google’s AI organization. It comes with strong support for machine learning and deep learning, and its flexible numerical computation core is used across many other scientific domains.
The Microsoft Cognitive Toolkit (previously known as CNTK) is an open-source toolkit for commercial-grade distributed deep learning. It describes neural networks as a series of computational steps via a directed graph. The toolkit makes it possible to leverage the information within massive data sets through deep learning by providing scaling, speed, and accuracy with commercial-grade quality and compatibility with the programming languages and algorithms already in use.
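Describing a network as computational steps in a directed graph can be sketched in a few lines of Python. This is not the Cognitive Toolkit's API: the node names, the `graph` layout and the `evaluate` helper are assumptions made for illustration, and the graph encodes a single sigmoid neuron, y = sigmoid(w * x + b).

```python
import math
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each node maps to (operation, names of its input nodes).
# Leaf nodes have no operation; their values are supplied at evaluation time.
graph = {
    "x": (None, []),
    "w": (None, []),
    "b": (None, []),
    "wx": (lambda a, b: a * b, ["w", "x"]),
    "z": (lambda a, b: a + b, ["wx", "b"]),
    "y": (lambda a: 1.0 / (1.0 + math.exp(-a)), ["z"]),
}


def evaluate(graph, inputs):
    """Evaluate every node after all of the nodes it depends on."""
    dependencies = {name: set(args) for name, (_, args) in graph.items()}
    values = dict(inputs)
    for name in TopologicalSorter(dependencies).static_order():
        operation, args = graph[name]
        if operation is not None:
            values[name] = operation(*(values[arg] for arg in args))
    return values


values = evaluate(graph, {"x": 2.0, "w": 0.5, "b": -1.0})
print(values["z"], values["y"])  # 0.0 0.5
```

Keeping the computation as an explicit graph is what allows a framework to reorder, parallelize or differentiate the steps before running them.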
Torch is a scientific computing framework with support for machine learning algorithms. It provides N-dimensional arrays, with routines for indexing, slicing, transposing, etc. Torch puts GPUs first. It offers an interface to C via LuaJIT, linear algebra and numeric optimization routines, and neural network and energy-based models. It is embeddable, with ports to iOS and Android backends.
Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rules mining, and visualization (source: official website). Weka (Waikato Environment for Knowledge Analysis) provides access to deep learning through WekaDeeplearning4j, which uses Deeplearning4j.
Yooreeka is a library for data mining, machine learning, soft computing, and mathematical analysis. It also provides examples.