RandomizedSearchCV vs GridSearchCV

April 24, 2024

RandomizedSearchCV is a method provided by scikit-learn for hyperparameter tuning and model selection through cross-validation. It’s similar to GridSearchCV, but instead of exhaustively searching through all possible combinations of hyperparameters, it randomly samples a fixed number of hyperparameter settings from specified distributions.
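
For comparison, here is a minimal GridSearchCV sketch (the estimator and grid values are illustrative, not from this post). GridSearchCV fits every combination in an explicit grid, so its cost grows multiplicatively with each added parameter:

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

# Explicit grid: every combination is tried exhaustively
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 10],
    'max_features': ['sqrt', 'log2']
}

# 3 * 3 * 2 = 18 candidates, each fit cv=5 times -> 90 fits in total
grid_search = GridSearchCV(RandomForestClassifier(), param_grid, cv=5)
# grid_search.fit(X_train, y_train)  # X_train, y_train: your training data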

Here’s a basic overview of how RandomizedSearchCV works:

  1. Define a parameter grid or a distribution for each hyperparameter you want to tune.
  2. Specify the number of iterations (random samples) you want to perform.
  3. Pass the estimator (model), parameter grid/distributions, and number of iterations to RandomizedSearchCV.
  4. RandomizedSearchCV performs cross-validation for each random combination of hyperparameters and selects the best combination based on the scoring metric.
  5. After the search is complete, you can access attributes like best_params_, best_score_, and best_estimator_ to retrieve information about the best-performing model.

Here’s a basic example of how to use RandomizedSearchCV:

from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from scipy.stats import randint

# Toy data so the example runs end to end; replace with your own X_train, y_train
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Define parameter distributions to sample from
param_dist = {
    'n_estimators': randint(10, 100),
    'max_depth': randint(1, 10),
    'min_samples_split': randint(2, 20),
    'min_samples_leaf': randint(1, 20),
    'max_features': ['sqrt', 'log2', None]  # 'auto' was removed in scikit-learn 1.3
}

# Create a RandomForestClassifier instance
clf = RandomForestClassifier()

# Create RandomizedSearchCV instance
random_search = RandomizedSearchCV(clf, param_distributions=param_dist, n_iter=10, cv=5)

# Fit the model
random_search.fit(X_train, y_train)

# Get the best parameters
best_params = random_search.best_params_

In this example, RandomizedSearchCV is used to search for the best hyperparameters for a RandomForestClassifier by randomly sampling from the specified parameter distributions. The n_iter parameter controls the number of random combinations to try, and cv specifies the number of folds for cross-validation.
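
After the search completes, the attributes mentioned in step 5 are available on the fitted object. Since refit=True by default, best_estimator_ has already been retrained on the full training set and can be used for prediction directly (X_test here comes from the train/test split above):

# Inspect the outcome of the search
print(random_search.best_params_)   # best hyperparameter combination found
print(random_search.best_score_)    # mean cross-validated score of that combination

# With refit=True (the default), best_estimator_ is retrained on all of X_train
best_model = random_search.best_estimator_
predictions = best_model.predict(X_test)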

Here is another, more elaborate example from a notebook cell (the %%time magic reports how long the search takes):

%%time

import numpy as np
from sklearn import metrics
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Choose the type of classifier
rf2 = RandomForestClassifier(random_state=1)

# Grid of parameters to choose from
parameters = {
    "n_estimators": [150, 200, 250],
    "min_samples_leaf": np.arange(5, 10),
    "max_features": np.arange(0.2, 0.7, 0.1),
    "max_samples": np.arange(0.3, 0.7, 0.1),
    "max_depth": np.arange(3, 6),  # depths 3, 4, 5
    "class_weight": ['balanced', 'balanced_subsample'],
    "min_impurity_decrease": [0.001, 0.002, 0.003]
}

# Scorer used to compare parameter combinations (recall, in this case)
acc_scorer = metrics.make_scorer(metrics.recall_score)

# Run the random search
grid_obj = RandomizedSearchCV(rf2, parameters, n_iter=30, scoring=acc_scorer,
                              cv=5, random_state=1, n_jobs=-1, verbose=2)
# n_iter=30 makes the randomized search try 30 different combinations
# of hyperparameters (the default is n_iter=10)

grid_obj = grid_obj.fit(X_train, y_train)

# Print the best combination of parameters
grid_obj.best_params_
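
For scale: the grid above defines 3 × 5 × 5 × 4 × 3 × 2 × 3 = 5,400 possible combinations, so an exhaustive GridSearchCV at cv=5 would require 27,000 model fits. With n_iter=30, the randomized search performs only 30 × 5 = 150 fits, trading an exhaustive sweep for a fraction of the compute.
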
Tags: cross validation, gridsearch, randomizedsearch
