RandomForestClassifier
Random Forest classifier: bagging of CART trees with feature subsampling.
import seraplot as sp
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
rf = sp.RandomForestClassifier(n_estimators=100, max_depth=6)
rf.fit(X, y)
print(rf.score(X, y), rf.feature_importances_)
Drop-in replacement: sp.RandomForestClassifier has the same API as the sklearn estimator, so only the import needs to change.
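The description above mentions bagging with feature subsampling. As a rough illustration of those two sampling steps (not seraplot's actual implementation), the sketch below draws one bootstrap sample of row indices and one random feature subset using only the standard library. The helper names `bootstrap_indices` and `feature_subset` are hypothetical.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at one split."""
    return sorted(rng.sample(range(n_features), k))

rng = random.Random(0)             # fixed seed so the run is repeatable
rows = bootstrap_indices(150, rng)  # Iris has 150 rows
cols = feature_subset(4, 2, rng)    # 4 features, 2 candidates per split
print(len(rows), len(set(rows)), cols)
```

Because rows are drawn with replacement, some rows appear several times and others not at all, which is what makes each tree see a slightly different training set.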
API Reference
ml_random_forest_classifier — aliases: random_forest_classifier, rf_cls
sp.RandomForestClassifier(n_estimators=100, max_depth=∞, min_samples_split=2, min_samples_leaf=1, max_features=sqrt)
| Parameter | Type | Default | Description |
|---|---|---|---|
| n_estimators | int | 100 | Number of trees in the forest. |
| max_depth | int | ∞ | Maximum depth of each tree. |
| min_samples_split | int | 2 | Minimum number of samples required to split a node. |
| min_samples_leaf | int | 1 | Minimum number of samples required in a leaf. |
| max_features | str or int | sqrt | Number of features considered per split: `sqrt`, `log2`, `all`, or an int. |
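To make the `max_features` options concrete, here is a minimal sketch of how such a setting could be turned into a feature count. The helper `resolve_max_features` is hypothetical, and the rounding and clamping rules (floor, then clamp to at least 1) are an assumption borrowed from common practice, not confirmed behavior of seraplot.

```python
import math

def resolve_max_features(max_features, n_features):
    """Hypothetical helper: translate a max_features setting into a feature count."""
    if max_features == "all":
        k = n_features
    elif max_features == "sqrt":
        k = int(math.sqrt(n_features))   # assumed: floor of the square root
    elif max_features == "log2":
        k = int(math.log2(n_features))   # assumed: floor of log base 2
    elif isinstance(max_features, int):
        k = max_features
    else:
        raise ValueError(f"unsupported max_features: {max_features!r}")
    # Clamp to the valid range [1, n_features].
    return max(1, min(k, n_features))

for setting in ("sqrt", "log2", "all", 3):
    print(setting, resolve_max_features(setting, 16))
```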
Returns: JSON with `predictions` and `feature_importances`.
$$\hat{y} = \text{majority}\{h_b(x)\}_{b=1}^{B}$$
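The majority rule above can be sketched in a few lines. This is an illustration of the formula only: `majority_vote` is a hypothetical helper, and the tie-breaking rule (first label seen wins) is an assumption, since the formula does not specify one.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the most common label among the per-tree predictions for one sample."""
    counts = Counter(tree_predictions)
    # most_common(1) returns [(label, count)]; on ties the label seen first wins.
    return counts.most_common(1)[0][0]

votes = [0, 2, 2, 1, 2]   # predictions h_b(x) from B = 5 trees
print(majority_vote(votes))
```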
RandomForestRegressor
Random Forest regressor: averaged ensemble of CART trees.
import seraplot as sp, numpy as np
X = np.random.randn(500, 6)
y = X[:, 0] ** 2 + X[:, 1] * X[:, 2] + np.random.randn(500) * 0.3
rf = sp.RandomForestRegressor(n_estimators=50, max_depth=8)
rf.fit(X, y)
print(rf.score(X, y))
Drop-in replacement: sp.RandomForestRegressor has the same API as the sklearn estimator, so only the import needs to change.
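If `score` follows the sklearn convention for regressors, which is an assumption based on the same-API statement above, the printed value is the coefficient of determination R². A minimal sketch of that computation, with a hypothetical helper named `r2_score`:

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
print(r2_score(y_true, y_true))        # perfect predictions
print(r2_score(y_true, [2.5] * 4))     # always predicting the mean
```

A perfect fit gives 1.0, and a model that always predicts the mean of `y_true` gives 0.0.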
API Reference
ml_random_forest_regressor — aliases: random_forest_regressor, rf_reg
sp.RandomForestRegressor(n_estimators=100, max_depth=∞, min_samples_split=2, min_samples_leaf=1, max_features=sqrt)
| Parameter | Type | Default | Description |
|---|---|---|---|
| n_estimators | int | 100 | Number of trees in the forest. |
| max_depth | int | ∞ | Maximum depth of each tree. |
| min_samples_split | int | 2 | Minimum number of samples required to split a node. |
| min_samples_leaf | int | 1 | Minimum number of samples required in a leaf. |
| max_features | str | sqrt | Number of features considered per split. |
Returns: JSON with `predictions` and `feature_importances`.
$$\hat{y} = \frac{1}{B}\sum_{b=1}^{B} h_b(x)$$
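The averaging formula above amounts to taking the mean of the per-tree outputs for a sample. A minimal sketch, where `average_prediction` is a hypothetical helper and the per-tree values are made up for illustration:

```python
def average_prediction(tree_predictions):
    """Mean of the per-tree predictions h_b(x) for one sample."""
    return sum(tree_predictions) / len(tree_predictions)

preds = [2.0, 3.0, 4.0]   # illustrative outputs from B = 3 trees
print(average_prediction(preds))
```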