scikit-learn

A SciPy-based Python library for machine learning tasks such as classification, regression, and clustering

+ Supervised Learning Algorithms: Tools for common supervised learning algorithms such as linear regression, support vector machines, and random forests, for building predictive models
+ Unsupervised Learning Algorithms: Unsupervised learning methods such as clustering, factor analysis, and principal component analysis, for exploring unlabeled data and uncovering hidden patterns
+ Cross-validation: Techniques to assess the predictive performance of models, choose the best model, and prevent overfitting
+ Preprocessing: Functions for preprocessing data, such as scaling, centering, normalization, binarization, and imputation of missing values
+ Model Evaluation: Metrics and scoring functions to evaluate the performance of models
+ Pipeline: Streamlines the machine learning workflow by chaining transformations and models
+ Grid Search: Methods for parameter tuning that determine the best model parameters and avoid manual exploration
+ Persistence: Saves and loads models for later use, facilitating deployment and reusability
+ Scalability: Supports handling large datasets through efficient algorithms and integration with tools like joblib for parallel execution
+ Visualization: Tools for visualizing data and model performance through integration with libraries like Matplotlib
+ Feature Extraction: Tools for extracting features from data such as text and images for use by machine learning algorithms
+ Dimensionality Reduction: Methods like PCA and feature selection techniques to reduce the number of features
+ Ensemble Methods: Combines the predictions of several base estimators to improve generalizability and robustness over a single estimator
+ Feature Selection: Techniques that improve estimators’ accuracy scores or boost their performance on very high-dimensional datasets
+ Datasets: Several toy datasets for practicing machine learning techniques
+ Metrics: A wide range of performance metrics for classification, regression, and clustering, plus pairwise metrics
+ Semi-Supervised Learning: Algorithms for semi-supervised learning problems
+ Nearest Neighbors: Algorithms for unsupervised and supervised neighbors-based learning
+ Gaussian Processes: Tools for Gaussian process regression and classification
+ Manifold Learning: Algorithms for manifold learning, with an emphasis on non-linear dimensionality reduction
+ Covariance Estimation: Methods for robust covariance estimation and for computing Mahalanobis distances
+ Isotonic Regression: Fits a non-decreasing function to data
+ Multiclass and Multilabel Algorithms: Strategies for solving multiclass and multilabel classification problems
+ Random Projection: Methods for reducing dimensionality by generating random projection matrices
- Limited Deep Learning Support: Limited capabilities for deep learning tasks
- High-Dimensional Data: Challenges in effectively handling high-dimensional data
- Graph Algorithms: Not optimized for graph algorithms
- String Processing: Not very efficient at processing strings
- Hyperparameter Spaces: Awkward definition of hyperparameters and search spaces in models
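
Several of the features listed above (Datasets, Preprocessing, Pipeline, Grid Search, Cross-validation, Model Evaluation, and Persistence) are typically used together. The following is a minimal sketch of one way to combine them, not a recommended configuration: the choice of the iris toy dataset, the SVC estimator, the parameter grid, and the file name iris_svc.joblib are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Datasets: load a bundled toy dataset (illustrative choice).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Pipeline: chain a preprocessing step and a supervised model.
pipeline = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# Grid Search + Cross-validation: try each parameter combination
# with 5-fold cross-validation on the training data.
# Step-name prefixes ("svc__") route parameters to the right pipeline step.
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Model Evaluation: mean accuracy of the refitted best model on held-out data.
test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("held-out accuracy:", round(test_accuracy, 3))

# Persistence: save the best pipeline and load it back.
# The file name is hypothetical; a temporary directory keeps the sketch tidy.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "iris_svc.joblib"
    joblib.dump(search.best_estimator_, model_path)
    restored = joblib.load(model_path)
```

Because the scaler sits inside the pipeline, it is refitted on each training fold during the search, so scaling statistics from the validation folds do not leak into model selection.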

System Requirements

Minimum requirements

1. Dependencies
  • numpy >= 1.19.5
  • scipy >= 1.6.3
  • joblib >= 1.2
  • threadpoolctl >= 3.1
2. Python version support
  • Scikit-learn 0.20 was the last version to support Python 2.7 and Python 3.4
  • Scikit-learn 0.21 supported Python 3.5-3.7
  • Scikit-learn 0.22 supported Python 3.5-3.8
  • Scikit-learn 0.23-0.24 required Python 3.6 or newer
  • Scikit-learn 1.0 supported Python 3.7-3.10
  • Scikit-learn 1.1, 1.2 and 1.3 support Python 3.8-3.12
  • Scikit-learn 1.4 requires Python 3.9 or newer
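
One way to check an environment against the minimum dependency versions listed above is sketched below, using only the Python standard library. The helper names (as_tuple, meets_minimum, check_requirements) are hypothetical, and the comparison is deliberately simplified: it looks only at the leading numeric part of each version string and ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the requirements listed above.
MINIMUMS = {
    "numpy": "1.19.5",
    "scipy": "1.6.3",
    "joblib": "1.2",
    "threadpoolctl": "3.1",
}


def as_tuple(ver):
    """Return the leading numeric part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", ver)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def meets_minimum(installed, minimum):
    """Compare two version strings, padding the shorter one with zeros."""
    a, b = as_tuple(installed), as_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUMS):
    """Map each package name to (installed version or None, requirement met)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        status = "ok" if ok else "missing or too old"
        print(f"{name}: {installed} ({status}, needs >= {MINIMUMS[name]})")
```

In practice pip enforces these constraints when installing scikit-learn, so a check like this is mainly useful for diagnosing an environment that was assembled by hand.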

Ratings

Overall: 4.70 / 5

G2 Crowd: 4.9 / 5 (based on 30 reviews)
InfoWorld: 4.5 / 5 (based on a professional's opinion)

Written in

Python, Cython, C, C++

Initial Release

June 2007