
scikit-learn

A SciPy-based Python library for machine learning tasks such as classification, regression, and clustering

+
Supervised Learning Algorithms
Tools for common supervised learning algorithms such as linear regression, support vector machines, and random forests, allowing you to build models for prediction tasks
+
Unsupervised Learning Algorithms
Implements unsupervised learning methods such as clustering, factor analysis, and principal component analysis for exploring unlabeled data and uncovering hidden patterns
+
Cross-validation
Techniques for assessing the predictive performance of models, choosing the best model, and detecting overfitting
+
Preprocessing
Functions for preprocessing data, such as scaling, centring, normalization, binarization, and imputation of missing values
+
Model Evaluation
Metrics and scoring functions to evaluate the performance of models
+
Pipeline
Streamlining the machine learning workflow by chaining transformations and models
+
Grid Search
Exhaustive search over a specified grid of hyperparameter values, scored by cross-validation, to find the best settings without manual trial and error
+
Persistence
Allows saving and loading models for later use, facilitating deployment and reusability
+
Scalability
Supports larger datasets through efficient algorithms, incremental (out-of-core) learning via partial_fit on estimators that provide it, and joblib-based parallelism
+
Visualization
Offers tools for visualizing data and model performance through integration with libraries like Matplotlib
+
Feature Extraction
Tools for extracting features from data such as text and images for machine learning algorithms
+
Dimensionality Reduction
Methods like PCA and feature selection techniques to reduce the number of features
+
Ensemble Methods
Combines the predictions of several base estimators to improve generalizability and robustness over a single estimator
+
Feature Selection
Techniques for feature selection to improve estimators’ accuracy scores or to boost their performance on very high-dimensional datasets
+
Datasets
Provides several toy datasets to practice machine-learning techniques
+
Metrics
Offers a wide range of performance metrics for classification, regression, and clustering, plus pairwise distance and kernel functions
+
Semi-Supervised Learning
Algorithms for semi-supervised learning problems
+
Nearest Neighbors
Unsupervised and supervised neighbors-based learning methods
+
Gaussian Processes
Tools for Gaussian process regression and classification
+
Manifold Learning
Algorithms for manifold learning with an emphasis on non-linear dimensionality reduction
+
Covariance Estimation
Methods for empirical, shrunk, and robust covariance estimation, including Mahalanobis distances computed from the fitted estimate
+
Isotonic Regression
Implements isotonic regression to fit a non-decreasing function to data
+
Multiclass and Multilabel Algorithms
Strategies to solve multiclass and multilabel classification problems
+
Random Projection
Reduces dimensionality by projecting data onto a randomly generated (Gaussian or sparse) matrix
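
Several of the features listed above (Datasets, Preprocessing, Pipeline, Cross-validation, Model Evaluation) are typically used together. The following is a minimal sketch of how they can be combined, not a recommended configuration: the choice of the iris dataset, LogisticRegression, the 75/25 split, and 5 folds are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy dataset bundled with scikit-learn (150 samples, 4 features, 3 classes).
X, y = load_iris(return_X_y=True)

# Hold out a test set so the final score comes from data the model never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Chain scaling and a classifier; inside cross-validation the scaler is
# re-fit on each training fold, so no statistics leak from validation folds.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation on the training portion only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# Fit on the full training portion, then evaluate once on the held-out set.
model.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))

print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
print(f"Test accuracy: {test_accuracy:.3f}")
```

Wrapping the scaler and the classifier in one pipeline object means the same preprocessing is applied at fit and predict time without extra bookkeeping.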
-
Limited Deep Learning Support
Offers only basic multi-layer perceptron models and is not designed for large-scale deep learning
-
High-Dimensional Data
Challenges in effectively handling high-dimensional data
-
Graph Algorithms
Not optimized for graph algorithms
-
String Processing
Not very efficient at processing strings
-
Hyperparameter Spaces
Awkward definition of hyperparameters and search spaces in models
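
The listing does not say what makes hyperparameter definitions awkward. One plausible reading, offered here as an assumption, is the naming convention used when tuning a pipeline: each parameter is addressed as the step name, two underscores, then the parameter name. The sketch below shows that convention with GridSearchCV; the step names "scale" and "svc" and the grid values are chosen for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Step names ("scale", "svc") are arbitrary labels picked for this example.
pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_grid = {
    "svc__C": [0.1, 1.0, 10.0],
    "svc__kernel": ["linear", "rbf"],
}

# Every combination in the grid (3 x 2 = 6) is scored with 5-fold CV.
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(f"Best CV accuracy: {search.best_score_:.3f}")
```

Because the keys are plain strings, a typo in a step or parameter name is only caught when the search is fitted, which may be part of what the listing is pointing at.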

Platform

Desktop
Language
Python

System Requirements

Minimum
Dependencies:
  • numpy >= 1.19.5
  • scipy >= 1.6.3
  • joblib >= 1.2
  • threadpoolctl >= 3.1
Python version support:
  • Scikit-learn 0.20 was the last version to support Python 2.7 and Python 3.4
  • Scikit-learn 0.21 supported Python 3.5-3.7
  • Scikit-learn 0.22 supported Python 3.5-3.8
  • Scikit-learn 0.23-0.24 required Python 3.6 or newer
  • Scikit-learn 1.0 supported Python 3.7-3.10
  • Scikit-learn 1.1, 1.2 and 1.3 support Python 3.8-3.12
  • Scikit-learn 1.4 requires Python 3.9 or newer
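
To check an existing environment against the minimum dependency versions listed above, something like the following sketch could be used. It relies only on the standard library; the helper names (release_tuple, meets_minimum, check_environment) are hypothetical, and the version parser is deliberately simple: it compares only the leading numeric components and ignores pre-release tags. The third-party packaging library would be a more robust choice.

```python
from importlib import metadata

# Minimum versions as given in the requirements list.
MINIMUMS = {
    "numpy": "1.19.5",
    "scipy": "1.6.3",
    "joblib": "1.2",
    "threadpoolctl": "3.1",
}


def release_tuple(version):
    """Return the leading numeric components, e.g. '1.2.0rc1' -> (1, 2, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old (...)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


for package, status in check_environment().items():
    print(f"{package}: {status}")
```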

Ratings

4.70 out of 5

G2CROWD: 4.9 out of 5, based on 30 reviews
InfoWorld: 4.5 out of 5, based on a professional's opinion

Developer

Written in

Python, Cython, C, C++

Initial Release

June 2007
