RandomForestClassifier

Classifier sklearn-compatible 🌲 Tree-Based

Random Forest classifier: bagging of CART trees with feature subsampling.
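The two sources of randomness named in this description, bootstrap resampling of rows (bagging) and random feature subsets, can be sketched in plain Python. This illustrates the general technique only; it is not seraplot's Rust implementation. The helper names `bootstrap_indices` and `feature_subset` are hypothetical.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def feature_subset(n_features, k, rng):
    """Pick k distinct feature indices out of n_features."""
    return sorted(rng.sample(range(n_features), k))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
rows = bootstrap_indices(10, rng)   # indices may repeat; some rows are left out
cols = feature_subset(4, 2, rng)    # 2 of the 4 feature indices
print(rows)
print(cols)
```

In a typical Random Forest, each tree is trained on its own bootstrap sample, and a fresh feature subset is drawn when a split is searched.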

⚡ Rust-native ✓ sklearn parity
Quick start — Python
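import seraplot as sp
from sklearn.datasets import load_iris
# Load the Iris dataset as a feature matrix X and a label vector y.
X, y = load_iris(return_X_y=True)
# 100 trees, each limited to depth 6.
rf = sp.RandomForestClassifier(n_estimators=100, max_depth=6)
rf.fit(X, y)
# Score on the training data, followed by the per-feature importances.
print(rf.score(X, y), rf.feature_importances_)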
💡 Drop-in replacement: sp.RandomForestClassifier has the same API as sklearn's RandomForestClassifier, so only the import needs to change.

API Reference

JSON function name

ml_random_forest_classifier — aliases: random_forest_classifier, rf_cls

Python class
sp.RandomForestClassifier(n_estimators=100, max_depth=∞, min_samples_split=2, min_samples_leaf=1, max_features=sqrt)
Constructor Parameters
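| Parameter | Type | Default | Description |
|---|---|---|---|
| `n_estimators` | int | 100 | Number of trees. |
| `max_depth` | int | ∞ (no limit) | Maximum tree depth. |
| `min_samples_split` | int | 2 | Minimum number of samples required to split a node. |
| `min_samples_leaf` | int | 1 | Minimum number of samples in a leaf. |
| `max_features` | str or int | `sqrt` | Features considered per split: `sqrt`, `log2`, `all`, or an int. |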
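As a rough illustration of the `max_features` options, the sketch below turns each setting into a feature count per split. It assumes the common convention of rounding `sqrt` and `log2` down and keeping at least one feature; the source does not state how seraplot rounds, and `resolve_max_features` is a hypothetical helper, not part of the library.

```python
import math

def resolve_max_features(max_features, n_features):
    """Translate a max_features setting into a feature count per split."""
    if isinstance(max_features, int):
        k = max_features
    elif max_features == "sqrt":
        k = math.isqrt(n_features)          # floor of the square root (assumed)
    elif max_features == "log2":
        k = int(math.log2(n_features))      # floor of log base 2 (assumed)
    elif max_features == "all":
        k = n_features
    else:
        raise ValueError(f"unsupported max_features: {max_features!r}")
    # Clamp so that at least one and at most n_features are considered.
    return max(1, min(k, n_features))

for setting in ("sqrt", "log2", "all", 3):
    print(setting, resolve_max_features(setting, 16))
```

With 16 features, `sqrt` and `log2` both resolve to 4 under these assumptions, while `all` uses every feature.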
Returns

JSON with predictions, feature_importances.

Algorithm

$$\hat{y} = \text{majority}\{h_b(x)\}_{b=1}^{B}$$
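The majority vote over the B per-tree predictions can be sketched as follows. The formula does not say how ties are broken, so the tie rule here (smallest label wins) is an assumption made only to keep the sketch deterministic; `majority_vote` is a hypothetical name.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the most common label among the per-tree predictions h_b(x)."""
    counts = Counter(tree_predictions)
    top = max(counts.values())
    # Assumption: ties go to the smallest label.
    return min(label for label, c in counts.items() if c == top)

votes = [0, 2, 2, 1, 2]      # predictions of B = 5 trees for one sample
print(majority_vote(votes))  # 2
```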



RandomForestRegressor

Regressor sklearn-compatible 🌲 Tree-Based

Random Forest regressor: averaged ensemble of CART trees.

⚡ Rust-native ✓ sklearn parity
Quick start — Python
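import numpy as np
import seraplot as sp
# Synthetic regression data: 500 samples, 6 features, nonlinear target plus Gaussian noise.
X = np.random.randn(500, 6)
y = X[:, 0] ** 2 + X[:, 1] * X[:, 2] + np.random.randn(500) * 0.3
# 50 trees, each limited to depth 8.
rf = sp.RandomForestRegressor(n_estimators=50, max_depth=8)
rf.fit(X, y)
# Score on the training data.
print(rf.score(X, y))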
💡 Drop-in replacement: sp.RandomForestRegressor has the same API as sklearn's RandomForestRegressor, so only the import needs to change.
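The source does not say which metric `rf.score(X, y)` returns. Assuming it follows the scikit-learn convention for regressors, the value is the coefficient of determination R². A minimal sketch of that metric, with the hypothetical helper name `r_squared`:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(r_squared(y_true, y_pred))
```

A value of 1.0 means perfect predictions; scoring on the training data, as in the quick start, tends to overstate how well the model generalizes.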

API Reference

JSON function name

ml_random_forest_regressor — aliases: random_forest_regressor, rf_reg

Python class
sp.RandomForestRegressor(n_estimators=100, max_depth=∞, min_samples_split=2, min_samples_leaf=1, max_features=sqrt)
Constructor Parameters
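| Parameter | Type | Default | Description |
|---|---|---|---|
| `n_estimators` | int | 100 | Number of trees. |
| `max_depth` | int | ∞ (no limit) | Maximum tree depth. |
| `min_samples_split` | int | 2 | Minimum number of samples required to split a node. |
| `min_samples_leaf` | int | 1 | Minimum number of samples in a leaf. |
| `max_features` | str | `sqrt` | Features considered per split. |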
Returns

JSON with predictions, feature_importances.

Algorithm

$$\hat{y} = \frac{1}{B}\sum_{b=1}^{B} h_b(x)$$
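The averaging step in this formula can be sketched with stand-in trees. Here each "tree" is just a Python function playing the role of h_b; real trees would be fitted CART models, and `forest_predict` is a hypothetical name.

```python
def forest_predict(trees, x):
    """Average the outputs h_b(x) of all B trees."""
    return sum(h(x) for h in trees) / len(trees)

# Stand-in "trees": plain functions of the feature vector x.
trees = [
    lambda x: x[0] + 1.0,
    lambda x: x[0] * 2.0,
    lambda x: 3.0,
]
print(forest_predict(trees, [2.0]))  # mean of 3.0, 4.0 and 3.0
```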

