Model Selection — GridSearch

API Reference

GridSearchCV

gs = sp.GridSearchCV(estimator, param_grid, cv=5, seed=42, scoring="auto")
gs.fit(X, y)

RandomizedSearchCV

rs = sp.RandomizedSearchCV(estimator, param_distributions, n_iter=10, cv=5, seed=42, scoring="auto")
rs.fit(X, y)

HalvingGridSearchCV

hgs = sp.HalvingGridSearchCV(estimator, param_grid, cv=5, factor=3, seed=42, scoring="auto")
hgs.fit(X, y)

HalvingRandomSearchCV

hrs = sp.HalvingRandomSearchCV(estimator, param_distributions, n_candidates=256, cv=5, factor=3, seed=42, scoring="auto")
hrs.fit(X, y)

Constructor parameters — GridSearchCV

| Parameter | Type | Default | Description |
|---|---|---|---|
| `estimator` | `str` | required | Model name, e.g. `"Ridge"`, `"RandomForestClassifier"` |
| `param_grid` | `dict[str, list]` | required | Exhaustive grid of hyperparameters |
| `cv` | `int` | `5` | Number of cross-validation folds |
| `seed` | `int` | `42` | Random seed for fold shuffling |
| `scoring` | `str` | `"auto"` | Scoring metric (see below) |

Constructor parameters — RandomizedSearchCV

| Parameter | Type | Default | Description |
|---|---|---|---|
| `estimator` | `str` | required | Model name |
| `param_distributions` | `dict[str, list]` | required | Parameter distributions to sample from |
| `n_iter` | `int` | `10` | Number of random parameter combinations |
| `cv` | `int` | `5` | Number of cross-validation folds |
| `seed` | `int` | `42` | Random seed |
| `scoring` | `str` | `"auto"` | Scoring metric |

Constructor parameters — HalvingGridSearchCV

| Parameter | Type | Default | Description |
|---|---|---|---|
| `estimator` | `str` | required | Model name |
| `param_grid` | `dict[str, list]` | required | Exhaustive grid of hyperparameters |
| `cv` | `int` | `5` | Number of cross-validation folds |
| `factor` | `int` | `3` | Halving factor — eliminates $1 - \frac{1}{\text{factor}}$ of the candidates per round |
| `seed` | `int` | `42` | Random seed |
| `scoring` | `str` | `"auto"` | Scoring metric |

Constructor parameters — HalvingRandomSearchCV

| Parameter | Type | Default | Description |
|---|---|---|---|
| `estimator` | `str` | required | Model name |
| `param_distributions` | `dict[str, list]` | required | Parameter distributions to sample from |
| `n_candidates` | `int` | `256` | Initial number of random candidates |
| `cv` | `int` | `5` | Number of cross-validation folds |
| `factor` | `int` | `3` | Halving factor |
| `seed` | `int` | `42` | Random seed |
| `scoring` | `str` | `"auto"` | Scoring metric |

Attributes (all classes)

| Attribute | Type | Description |
|---|---|---|
| `best_params_` | `dict` | Best hyperparameter combination found |
| `best_score_` | `float` | Mean CV score of the best combination |
| `n_iterations_` | `int` | Number of halving iterations (Halving variants only) |

Scoring

"auto" selects the default metric: "r2" for regressors, "accuracy" for classifiers.

Regression metrics

| Value | Formula |
|---|---|
| `"r2"` | $R^2 = 1 - \frac{\sum(y_i - \hat y_i)^2}{\sum(y_i - \bar y)^2}$ |
| `"neg_mean_squared_error"` | $-\frac{1}{n}\sum(y_i - \hat y_i)^2$ |
| `"neg_mean_absolute_error"` | $-\frac{1}{n}\sum\lvert y_i - \hat y_i\rvert$ |

Classification metrics

| Value | Formula |
|---|---|
| `"accuracy"` | $\frac{\text{correct}}{n}$ |
| `"f1"` / `"f1_weighted"` / `"f1_macro"` | $F_1 = \frac{2 \cdot P \cdot R}{P + R}$ |
| `"precision"` / `"precision_weighted"` / `"precision_macro"` | $P = \frac{TP}{TP + FP}$ |
| `"recall"` / `"recall_weighted"` / `"recall_macro"` | $R = \frac{TP}{TP + FN}$ |
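For binary classification, the metrics above reduce to simple confusion-matrix arithmetic. A quick standalone illustration in plain Python (the helper `classification_metrics` is hypothetical, not part of the seraplot API):

```python
# Illustrative only: binary-classification metrics from confusion-matrix counts.
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    n = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0  # P = TP / (TP + FP)
    recall = tp / (tp + fn) if tp + fn else 0.0     # R = TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# 40 true positives, 10 false positives, 20 false negatives, 30 true negatives
m = classification_metrics(tp=40, fp=10, fn=20, tn=30)
# accuracy = 0.7, precision = 0.8, recall = 2/3, f1 = 8/11
```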

Example — GridSearchCV (regression)
import seraplot as sp
import numpy as np

X = np.random.randn(500, 5)
y = X @ np.array([1.0, -2.0, 0.5, 1.5, -0.8]) + np.random.randn(500) * 0.5

gs = sp.GridSearchCV(
    "Ridge",
    {"alpha": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
    scoring="neg_mean_squared_error",
)
gs.fit(X, y)
print(f"Best params: {gs.best_params_}")
print(f"Best score:  {gs.best_score_:.4f}")

Example — RandomizedSearchCV (classification)
import seraplot as sp
import numpy as np

X = np.random.randn(500, 10)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

rs = sp.RandomizedSearchCV(
    "RandomForestClassifier",
    {
        "n_estimators": [50, 100, 200],
        "max_depth": [3, 5, 10, 15],
        "min_samples_split": [2, 5, 10],
    },
    n_iter=20,
    cv=5,
    scoring="f1",
)
rs.fit(X, y)
print(f"Best params: {rs.best_params_}")
print(f"Best score:  {rs.best_score_:.4f}")

Example — HalvingRandomSearchCV
import seraplot as sp
import numpy as np

X = np.random.randn(1000, 8)
y = X @ np.random.randn(8) + np.random.randn(1000) * 0.3

hrs = sp.HalvingRandomSearchCV(
    "Lasso",
    {"alpha": [0.001, 0.01, 0.1, 0.5, 1.0, 5.0, 10.0]},
    n_candidates=256,
    factor=3,
    cv=5,
)
hrs.fit(X, y)
print(f"Best params:  {hrs.best_params_}")
print(f"Best score:   {hrs.best_score_:.4f}")
print(f"Iterations:   {hrs.n_iterations_}")

Algorithmic Functioning

Cross-Validation

Each candidate is evaluated using stratified k-fold (classification) or shuffled k-fold (regression). For $k$ folds, the dataset is split into $k$ subsets; the model trains on $k-1$ folds and scores on the held-out fold. The final score is the mean over all $k$ folds:

$$\text{score} = \frac{1}{k}\sum_{i=1}^{k} S\bigl(\hat f_{-i},\, X_i,\, y_i\bigr)$$
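The averaging above can be sketched in plain numpy. This illustrates the formula only; it is not seraplot's internal implementation, and `kfold_mean_score`, `fit`, and `score` are hypothetical names:

```python
import numpy as np

def kfold_mean_score(X, y, fit, score, k=5, seed=42):
    """Shuffled k-fold: train on k-1 folds, score the held-out fold, average."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        scores.append(score(model, X[test], y[test]))
    return float(np.mean(scores))

# Toy "model": predict the training mean; score with negative MSE.
X = np.arange(100, dtype=float).reshape(-1, 1)
y = X[:, 0] * 2.0
fit = lambda X, y: y.mean()
score = lambda m, X, y: -np.mean((y - m) ** 2)
s = kfold_mean_score(X, y, fit, score, k=5)  # mean of 5 held-out scores
```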

Exhaustive vs. Random Search

GridSearchCV evaluates every combination in the Cartesian product of the parameter grid:

$$N_{\text{combos}} = \prod_{j=1}^{d} |V_j|$$

where $|V_j|$ is the number of values for the $j$-th parameter.

RandomizedSearchCV samples $n_{\text{iter}}$ combinations uniformly at random.
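The difference between the two strategies can be sketched with the standard library. This is illustrative only: independent per-parameter sampling may repeat combinations, whereas a real implementation would typically sample without replacement.

```python
import itertools
import random

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 10, 15],
    "min_samples_split": [2, 5, 10],
}

# Grid-search style: every combination in the Cartesian product (3 * 4 * 3 = 36).
keys = list(param_grid)
grid = [dict(zip(keys, values)) for values in itertools.product(*param_grid.values())]

# Randomized style: sample n_iter combinations uniformly at random.
rng = random.Random(42)
sampled = [{k: rng.choice(v) for k, v in param_grid.items()} for _ in range(10)]
```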

Successive Halving

HalvingGridSearchCV and HalvingRandomSearchCV use the successive halving strategy. Starting with a small resource budget $r_0$ and all candidates, each round:

  1. Evaluate all remaining candidates with resource $r_i$
  2. Keep the top $\frac{1}{\text{factor}}$ fraction of candidates
  3. Increase the resource: $r_{i+1} = r_i \times \text{factor}$

The initial resource is:

$$r_0 = \max\!\left(\left\lfloor\frac{n}{\text{factor}^{n_{\text{iters}}}}\right\rfloor,\, 1\right) \quad\text{where}\quad n_{\text{iters}} = \lceil\log_{\text{factor}}(C)\rceil$$

and $C$ is the number of candidates. This eliminates weak configurations early while spending full resources only on promising ones.
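The schedule implied by these formulas can be sketched as a small helper (`halving_schedule` is a hypothetical name, not seraplot's actual code):

```python
import math

def halving_schedule(n_samples: int, n_candidates: int, factor: int = 3):
    """Return (candidates, resource) pairs for each successive-halving round."""
    n_iters = math.ceil(math.log(n_candidates, factor))
    r = max(n_samples // factor**n_iters, 1)  # initial resource r_0
    candidates = n_candidates
    schedule = []
    while candidates > 1:
        schedule.append((candidates, min(r, n_samples)))
        candidates = max(candidates // factor, 1)  # keep top 1/factor
        r *= factor                                # grow the resource
    return schedule

# 256 candidates, 1000 samples, factor 3:
# n_iters = ceil(log_3 256) = 6, r_0 = max(1000 // 729, 1) = 1
sched = halving_schedule(1000, 256, 3)
# → [(256, 1), (85, 3), (28, 9), (9, 27), (3, 81)]
```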

Optimizations

| Optimization | Models | Description |
|---|---|---|
| Gram matrix cache | Lasso, ElasticNet, LinearRegression | Precompute $X^TX$ and $X^Ty$ once per fold, reuse across all $\alpha$ values |
| KNN distance cache | KNeighborsClassifier, KNeighborsRegressor | Precompute the full distance matrix per fold, reuse across all $k$ values |
| IRLS fast path | LogisticRegression | Warm-started iteratively reweighted least squares |
| Parallel evaluation | All | Rayon `par_iter` over parameter combinations |
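The Gram matrix cache is worth a sketch: for ridge regression the normal equations $(X^TX + \alpha I)\,w = X^Ty$ depend on $\alpha$ only through the diagonal term, so $X^TX$ and $X^Ty$ can be computed once per fold and reused for every candidate $\alpha$. A minimal numpy illustration (`ridge_path` is a hypothetical name, not seraplot's actual code):

```python
import numpy as np

def ridge_path(X, y, alphas):
    """Solve (X^T X + alpha*I) w = X^T y for each alpha, reusing the Gram matrix."""
    G = X.T @ X    # computed once per fold
    Xty = X.T @ y  # computed once per fold
    eye = np.eye(X.shape[1])
    return {a: np.linalg.solve(G + a * eye, Xty) for a in alphas}

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 1.5, -0.8])  # noise-free linear target
ws = ridge_path(X, y, [0.01, 0.1, 1.0, 10.0])  # one weight vector per alpha
```

Only the cheap `solve` varies across candidates; the expensive $O(nd^2)$ products are paid once.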

Benchmarks vs scikit-learn GridSearchCV

| Model | SeraPlot | scikit-learn | Speedup |
|---|---|---|---|
| Ridge | 5.6 ms | 82 ms | 15× |
| Lasso | 2.9 ms | 1,200 ms | 418× |
| ElasticNet | 3.3 ms | 2,283 ms | 686× |
| LogisticRegression | 137 ms | 5,745 ms | 42× |
| KNN | 12.9 ms | 1,543 ms | 119× |
| RandomForest | 6.9 s | 96.8 s | 14× |
| GradientBoosting | 23.3 s | 320 s | 14× |
