DecisionTreeClassifier
Decision tree classifier: CART with Gini or entropy criterion and binned splits.
import seraplot as sp
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)  # feature matrix and class labels
tree = sp.DecisionTreeClassifier(max_depth=4)  # cap the tree at depth 4
tree.fit(X, y)
print(f"Accuracy: {tree.score(X, y):.3f}")  # accuracy on the training data
Drop-in replacement: `sp.DecisionTreeClassifier` has the same API as the sklearn classifier, so you only need to change the import.
API Reference
ml_decision_tree_classifier — aliases: decision_tree_classifier, dt_cls
sp.DecisionTreeClassifier(max_depth=∞, min_samples_split=2, min_samples_leaf=1, max_features=null, criterion=gini)
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_depth | int | ∞ | Maximum tree depth. |
| min_samples_split | int | 2 | Minimum number of samples required to split a node. |
| min_samples_leaf | int | 1 | Minimum number of samples required in a leaf. |
| max_features | int or str | null | Maximum number of features considered per split (an int, or `sqrt`/`log2`). |
| criterion | str | gini | Split criterion: `gini` or `entropy`. |
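To make the `max_features` options concrete, the sketch below resolves a setting to the number of features examined at each split. It assumes seraplot follows scikit-learn's convention (`sqrt` and `log2` are floored with a minimum of one feature, and `null` means all features). `resolve_max_features` is a hypothetical helper, not part of the seraplot API.

```python
import math


def resolve_max_features(max_features, n_features):
    """Return how many features are examined at each split.

    Assumption: mirrors scikit-learn's rules; seraplot may differ.
    """
    if max_features is None:  # null: consider every feature
        return n_features
    if max_features == "sqrt":
        return max(1, int(math.sqrt(n_features)))
    if max_features == "log2":
        return max(1, int(math.log2(n_features)))
    if isinstance(max_features, int):
        if not 1 <= max_features <= n_features:
            raise ValueError("max_features must be in [1, n_features]")
        return max_features
    raise ValueError(f"unsupported max_features: {max_features!r}")


# With 16 input features, both string options resolve to 4.
for setting in (None, "sqrt", "log2", 3):
    print(setting, resolve_max_features(setting, 16))
```

Restricting the candidate features adds randomness to each split, which matters mostly when trees are combined into an ensemble.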
Returns JSON with `predictions`, `feature_importances`, and `classes`.
$$\text{Gini}(t) = 1 - \sum_{k} p_k^2$$
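As a worked example of the formula, where $p_k$ is the fraction of samples in node $t$ that belong to class $k$, the sketch below computes Gini impurity from a node's labels, with entropy for comparison. The helper names are illustrative and not part of the seraplot API, and measuring entropy in bits (base-2 logarithm) is an assumption.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of samples per class (the p_k in the formula)."""
    counts = Counter(labels)
    n = len(labels)
    return [c / n for c in counts.values()]


def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: -sum_k p_k * log2(p_k)."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def split_gain(parent, left, right, impurity=gini):
    """Impurity decrease: parent impurity minus size-weighted child impurity."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


pure = ["a", "a", "a", "a"]
mixed = ["a", "a", "b", "b"]

print(gini(pure))      # 0.0
print(gini(mixed))     # 0.5
print(entropy(mixed))  # 1.0
print(split_gain(mixed, ["a", "a"], ["b", "b"]))  # 0.5
```

A pure node scores 0 under both criteria, and a split that separates the classes perfectly recovers the parent's entire impurity as gain.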
DecisionTreeRegressor
Decision tree regressor: CART with MSE variance reduction and binned splits.
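import numpy as np
import seraplot as sp
X = np.random.randn(400, 4)  # 400 samples, 4 standard-normal features
y = X[:, 0] ** 2 + X[:, 1] - X[:, 2] + np.random.randn(400) * 0.5  # nonlinear target plus Gaussian noise
tree = sp.DecisionTreeRegressor(max_depth=5)  # cap the tree at depth 5
tree.fit(X, y)
print(tree.score(X, y))  # score on the training data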
Drop-in replacement: `sp.DecisionTreeRegressor` has the same API as the sklearn regressor, so you only need to change the import.
API Reference
ml_decision_tree_regressor — aliases: decision_tree_regressor, dt_reg
sp.DecisionTreeRegressor(max_depth=∞, min_samples_split=2, min_samples_leaf=1, max_features=null)
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_depth | int | ∞ | Maximum tree depth. |
| min_samples_split | int | 2 | Minimum number of samples required to split a node. |
| min_samples_leaf | int | 1 | Minimum number of samples required in a leaf. |
| max_features | int or str | null | Maximum number of features considered per split. |
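The two sample-count parameters gate splitting at different points. The sketch below assumes the usual CART semantics, as in scikit-learn: a node is eligible for splitting only if it holds at least `min_samples_split` samples, and a candidate split is kept only if both children hold at least `min_samples_leaf` samples. `split_allowed` is a hypothetical helper, not part of the seraplot API.

```python
def split_allowed(n_left, n_right, min_samples_split=2, min_samples_leaf=1):
    """Check whether a candidate split satisfies both sample-count limits.

    Assumption: semantics follow scikit-learn; seraplot may differ.
    """
    n_node = n_left + n_right
    if n_node < min_samples_split:  # node too small to be split at all
        return False
    # each child must be large enough to stand as a leaf
    return n_left >= min_samples_leaf and n_right >= min_samples_leaf


print(split_allowed(1, 1))                        # True
print(split_allowed(9, 1, min_samples_leaf=2))    # False
print(split_allowed(3, 2, min_samples_split=10))  # False
```

Raising either value yields shallower trees with larger leaves, which tends to smooth predictions on noisy targets.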
Returns JSON with `predictions` and `feature_importances`.
$$\text{MSE}(t) = \frac{1}{n_t}\sum_{i \in t}(y_i - \bar{y}_t)^2$$
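To connect the MSE formula with variance reduction over binned splits, the sketch below scores candidate thresholds for a single feature. Thresholds are taken at equal-width bin edges instead of at every distinct value. The binning scheme, the bin count, and the helper names are assumptions for illustration, since the source does not say how seraplot builds its bins.

```python
def mse(values):
    """Mean squared deviation from the node mean, as in the formula."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def best_binned_split(x, y, n_bins=4):
    """Return (threshold, mse_reduction) for the best bin-edge split, or None.

    Assumption: equal-width bins between min(x) and max(x).
    """
    lo, hi = min(x), max(x)
    if lo == hi:  # constant feature: nothing to split on
        return None
    width = (hi - lo) / n_bins
    parent = mse(y)
    best = None
    for b in range(1, n_bins):  # interior bin edges only
        threshold = lo + b * width
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        if not left or not right:
            continue
        # size-weighted MSE of the two children
        weighted = (len(left) * mse(left) + len(right) * mse(right)) / len(y)
        reduction = parent - weighted
        if best is None or reduction > best[1]:
            best = (threshold, reduction)
    return best


x = [0.0, 1.0, 2.0, 3.0, 5.0, 6.0, 7.0, 8.0]
y = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
print(best_binned_split(x, y))  # (4.0, 4.0)
```

With four bins only three thresholds are evaluated instead of seven midpoints, which is the usual motivation for binning: fewer candidates per feature at the cost of coarser split positions.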