Scikit-learn (Sklearn) provides a selection of efficient tools for machine learning and statistical modeling including classification, regression, clustering, and dimensionality reduction via a consistent interface in Python.
Prerequisites for Scikit-learn
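Before using scikit-learn, make sure the following are installed. The minimum versions below are the ones that applied when this was written; newer scikit-learn releases require more recent versions, so check the installation guide for the release you install: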
- Python (>=3.5)
- NumPy (>= 1.11.0)
- SciPy (>= 0.17.0)
- joblib (>= 0.11)
- Matplotlib (>= 1.5.1), required only for scikit-learn's plotting capabilities.
- Pandas (>= 0.18.0), required only for some scikit-learn examples that use its data structures and analysis tools.
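To compare an existing environment against the list above, a small script along these lines may help. It is only a sketch: the helper name `installed_versions` is made up for this illustration, and it assumes each package exposes a `__version__` attribute, which is a common convention rather than a guarantee.

```python
import importlib

# Import names of scikit-learn and the dependencies listed above.
PACKAGES = ["sklearn", "numpy", "scipy", "joblib", "matplotlib", "pandas"]


def installed_versions(packages=PACKAGES):
    """Return {import name: version string}, with None for missing packages.

    Hypothetical helper written for this example; it is not part of scikit-learn.
    """
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None  # package is not installed
        else:
            # Fall back to "unknown" if the package has no __version__ attribute.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Scikit-learn also ships its own `sklearn.show_versions()` function, which prints similar information once the library itself is installed.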
Rather than focusing on loading, manipulating, and summarising data, the scikit-learn library focuses on modeling it. Some of the most popular groups of models and tools provided by Sklearn are as follows:
Supervised Learning algorithms: Most of the popular supervised learning algorithms, such as Linear Regression, Support Vector Machine (SVM), and Decision Tree, are part of scikit-learn.
Unsupervised Learning algorithms: It also includes many popular unsupervised learning algorithms, ranging from clustering, factor analysis, and PCA (Principal Component Analysis) to unsupervised neural networks.
Clustering: These models are used for grouping unlabeled data.
Cross-Validation: It is used to estimate how well supervised models perform on unseen data.
Dimensionality Reduction: It is used for reducing the number of attributes in data, and the reduced data can then be used for summarization, visualization, and feature selection.
Ensemble methods: As the name suggests, they are used for combining the predictions of multiple supervised models.
Feature extraction: It is used to derive attributes from raw data such as images and text.
Feature selection: It is used to identify the attributes that are most useful for building supervised models.
Open Source: Scikit-learn is an open-source library and is also commercially usable under a BSD license.
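Several of the groups above can be seen together in one short script. The following is a minimal sketch rather than a recommended workflow: the Iris dataset, the choice of estimators, and every parameter value (`cv=5`, `k=2`, `n_clusters=3`, and so on) are assumptions made for illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Small built-in dataset: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Supervised learning + cross-validation: score a decision tree on 5 folds.
tree = DecisionTreeClassifier(random_state=0)
tree_scores = cross_val_score(tree, X, y, cv=5)

# Ensemble method: a random forest combines the predictions of many trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest_scores = cross_val_score(forest, X, y, cv=5)

# Feature selection: keep the 2 highest-scoring features, then fit a tree.
# Wrapping both steps in a pipeline re-runs the selection inside each fold.
selected_tree = make_pipeline(
    SelectKBest(f_classif, k=2),
    DecisionTreeClassifier(random_state=0),
)
selected_scores = cross_val_score(selected_tree, X, y, cv=5)

# Dimensionality reduction: project the 4 features onto 2 principal components.
X_2d = PCA(n_components=2).fit_transform(X)

# Clustering: group the projected samples without using the labels y.
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_2d)

print("Decision tree mean accuracy:  ", tree_scores.mean())
print("Random forest mean accuracy:  ", forest_scores.mean())
print("Tree on 2 selected features:  ", selected_scores.mean())
print("Shape after PCA:              ", X_2d.shape)
print("Number of clusters found:     ", len(set(cluster_labels)))
```

Every estimator here is created, fitted, and applied in the same way (`fit`, `predict`, `transform`, or their combined forms), which is what makes it easy to swap one model for another.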