Neptune Blog

Best Tools for Model Tuning and Hyperparameter Optimization

Bunmi Akinremi

9 min

25th April, 2025

ML Tools

I vividly remember a machine learning hackathon that I participated in two years ago, when I was at the beginning of my data science career. It was a pre-qualification hackathon for a bootcamp organised by Data Science Nigeria.

The dataset had information about certain employees. I had to predict if an employee should get a promotion or not. After days of trying to improve and engineer features, the model’s accuracy seemed to oscillate around 80%.

I needed to do something to improve my score on the leaderboard. I started tuning the model manually and got slightly better results. The accuracy grew to 82% by changing a single parameter (this move is really important, as anyone who’s done a hackathon will attest!). Excited, I started tuning other hyperparameters, but not all turned out so well. I was already exhausted. Imagine working 7 hours straight to improve a model. Pretty tiring.

I knew about GridSearchCV and RandomizedSearchCV. I tried out GridSearchCV, and it took more than 3 hours to give me results from the range of values I provided. Even worse, the results from GridSearchCV weren’t better. Frustrated, I decided to try RandomizedSearchCV. This brought a little joy: my accuracy moved from 82% to 86%.

After a lot of trials and no improvement, I went back to manual tuning to see what I could gain. I ended up with about 90% accuracy by the end of the hackathon. I wish I had known about tools for optimizing hyperparameters faster! Luckily, even though I wasn’t part of the top 50, I still qualified for the bootcamp.

That was in the past. Now, I know that there are good hyperparameter tuning tools I could’ve used, and I’m excited to share them with you.

Before you start hypertuning, make sure these things are done:

Get a baseline. You can get this with smaller models, fewer iterations, default parameters, or a manually tuned model.
Separate your data into training, validation and test sets.
Use early stopping rounds with large epochs to prevent overfitting.
Set up your full model pipeline before training.
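
As a rough illustration of the data-splitting item in the checklist above, here is a minimal standard-library sketch. The function name `train_val_test_split` and the 70/15/15 proportions are assumptions made for this example, not something the checklist prescribes.

```python
import random


def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle the rows once, then cut them into train/validation/test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed so the split is reproducible
    n_test = round(len(rows) * test_frac)
    n_val = round(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

The validation set is what you tune hyperparameters against; the test set stays untouched until the very end, so the final score is not biased by the tuning itself.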

Now, I’d like to discuss some terms that I’ll be using in the article:

Model parameter – A model parameter is a value your model learns from the data, such as weights or coefficients. You can’t tune it manually (and it’s not feature engineering).
Model hyperparameter – Hyperparameters are the values you set manually on the model itself, like the learning rate, number of estimators, type of regularization, etc.
Optimization – A process of adjusting hyperparameters in order to minimize the cost function by using one of the optimization techniques.
Hyperparameter optimization – Hyperparameter optimization is simply a search to get the best set of hyperparameters that gives the best version of a model on a particular dataset.
Bayesian optimization – Part of a class of sequential model-based optimization (SMBO) algorithms for using results from a previous experiment to improve the next.
Hyperparameter sampling – Simply specifying the parameter sampling method to use over the hyperparameter space.

Learn more

Hyperparameter Tuning in Python: a Complete Guide 2021
How to Track Hyperparameters of Machine Learning Models?

I’m not against using GridSearchCV. It’s a good option, only that it’s really time-consuming and computationally expensive. If you’re like me, with a busy schedule, you’ll definitely find better options.

A better alternative is RandomizedSearchCV, which samples random combinations of hyperparameter values and keeps the best one. It’s way faster than GridSearchCV. The downside is that, since it only tries random values, we can’t be sure the best combination is among them.
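
To make the trade-off between grid search and random search concrete, here is a small standard-library sketch. It does not use scikit-learn; the `score` function is a made-up stand-in for "train a model and measure validation accuracy", and the parameter names and ranges are assumptions chosen for illustration.

```python
import itertools
import random


def score(params):
    """Hypothetical validation score that peaks at learning_rate=0.1, max_depth=6."""
    return 1.0 - abs(params["learning_rate"] - 0.1) - 0.02 * abs(params["max_depth"] - 6)


def grid_search(grid):
    """Evaluate every combination of the listed values."""
    names = list(grid)
    candidates = [dict(zip(names, values)) for values in itertools.product(*grid.values())]
    return max(candidates, key=score), len(candidates)


def random_search(space, n_iter, seed=0):
    """Evaluate only n_iter randomly sampled combinations."""
    rng = random.Random(seed)
    candidates = [{name: draw(rng) for name, draw in space.items()} for _ in range(n_iter)]
    return max(candidates, key=score), len(candidates)


grid = {"learning_rate": [0.001, 0.01, 0.1, 0.3], "max_depth": [2, 4, 6, 8, 10]}
space = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, -0.5),  # log-uniform sample
    "max_depth": lambda rng: rng.randint(2, 10),
}

grid_best, grid_evals = grid_search(grid)
rand_best, rand_evals = random_search(space, n_iter=8)
print("grid:  ", grid_evals, "evaluations ->", grid_best)
print("random:", rand_evals, "evaluations ->", rand_best)
```

The grid needs 4 × 5 = 20 evaluations and its cost multiplies with every extra hyperparameter, while random search runs for exactly the budget you give it but offers no guarantee that the best combination was sampled.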

But really, when do I know I need to do hyperparameter optimization?

One of the mistakes we often make as data scientists is using the default parameters of a model. With the defaults, you’re probably not getting the best version of your model.

Sometimes, when your model is overfitting (performing well on the training set and poorly on the test set), or underfitting (performing poorly on both the training and test sets), optimizing your hyperparameters can really help. A little tweak can make a large difference, from 60% accuracy to 80% accuracy, or even more!

Okay, let’s wrap up the introduction. By the end of this article, you’ll learn:

The best hyperparameter tuning tools,
The various open-source tools (free to use) and paid services,
Their features and advantages,
The frameworks they support,
How to choose the best tool for your project,
How you can add them to your project.

We’ll start with a TL;DR comparison of all the tools discussed below.

Comparing tools for model tuning and hyperparameter optimization

If you’re strapped for time, this table should help you pick a good tool to try in your use case. For detailed descriptions of each tool, keep reading below the table.

Ray Tune

Optuna

Hyperopt

Scikit-Optimize

Google Vizier

Microsoft’s NNI

AWS SageMaker

Azure Machine Learning

Open sourced

Complete package

Algorithms

Ax/Botorch, HyperOpt, and Bayesian Optimization

AxSearch, DragonflySearch, HyperOptSearch, OptunaSearch, BayesOptSearch

Random Search, Tree of Parzen Estimators, Adaptive TPE

Bayesian Hyperparameter Optimization

Black-box optimization algorithms

Bayesian optimization, heuristic search, exhaustive search

Bayesian or random search

Automated early stopping algorithms, automated stopping algorithms, transfer learning

Supported frameworks

PyTorch, TensorFlow, XGBoost, LightGBM, Scikit-Learn, and Keras

Any ML or deep learning framework, PyTorch, TensorFlow, Keras, MXNet, Scikit-Learn, LightGBM

Scikit-Learn, XGBoost, TensorFlow, PyTorch, etc.

Machine learning algorithms offered by the scikit-learn library

Vizier supports multiple different algorithms under the cover, with a default of ‘Batched Gaussian Process Bandits’

PyTorch, TensorFlow, Keras, Theano, Caffe2

Any

Few lines of code

Low or no code

Uses GPUs

Cloud

Easy scalability with little or no changes to the code

Parallelized training

Yes, depending on the capacity of your configured training platforms

Distributed optimization

Auto filled metrics

Handling large datasets

Free or paid

Free

Paid

Free

Paid

Moving on, I’ll start with some open-source tools. Each tool will be described in the following way:

Brief introduction of the tool,
Core features/ Advantages of the tool,
Steps on how to use the tool,
Additional links on how to use the tool in your project.

1. Ray Tune

Ray provides a simple, universal API for building distributed applications. Tune, one of Ray’s many packages, is a Python library for experiment execution and hyperparameter tuning at any scale. It speeds up hyperparameter tuning by running cutting-edge optimization algorithms at scale.

Why should you use Ray Tune?

Here are some features:

It integrates easily with many optimization libraries, such as Ax/Botorch and HyperOpt.
Scaling can be done without changing your code.
Tune leverages a variety of cutting-edge optimization algorithms, such as Ax/Botorch, HyperOpt, and Bayesian Optimization, and lets you scale them transparently.
Tune parallelizes across multiple GPUs and multiple nodes, so you don’t have to build your own distributed system to speed up training.
You can visualise results automatically with tools like Tensorboard.
It provides a flexible interface for optimization algorithms, so you can easily implement and scale new optimization algorithms with a few lines of code.
It supports any machine learning framework, including PyTorch, TensorFlow, XGBoost, LightGBM, Scikit-Learn, and Keras.

Using it takes five simple steps (I’m supposing you already have your data preprocessed):

Install Tune with `pip install "ray[tune]"`.
Choose a search algorithm. There are many to choose from, including AxSearch, DragonflySearch, HyperOptSearch, OptunaSearch, BayesOptSearch, and many more. Here’s the full list of available search algorithms.
Set up and train your model.
Define a search space.
Run and evaluate your model.

Whether you want to implement Ray Tune in your ML project using Tensorflow, Pytorch, or any other framework, a lot of tutorials are available. Here are some to check out:

Machine learning and reinforcement learning projects from Ray.
“Hyperparameter Tuning” to implement the steps listed above in Tensorflow.
Hyperparameter tuning with Keras and Ray Tune.

You can learn more about configuring Ray Tune and its capabilities from this article: “Ray Tune: a hyperparameter library for fast hyperparameter tuning at any scale”.

2. Optuna

Optuna is designed specifically for machine learning. It’s a black-box optimizer, so it needs an objective function. This function takes the hyperparameter values sampled for a trial and returns a numerical value (the performance of those hyperparameters), which Optuna then uses to decide where to sample in upcoming trials. It uses different algorithms, such as grid search, random search, Bayesian, and evolutionary algorithms, to find the optimal hyperparameter values.

Some of the features are:

Efficient sampling and pruning algorithms.
Easy to install, needs few requirements.
Easier to use than Hyperopt.
Uses distributed optimization.
You can define search spaces using Python syntax, including conditionals and loops.
You can analyze optimization results visually.
Easy scalability with little or no changes to the code.

Optuna also supports pruning. This is not the decision-tree pruning you may know from machine learning, where non-critical and redundant sections of a tree are removed to reduce its size.

Pruning in Optuna automatically stops unpromising trials at the early stages of the training, which you can also call automated early-stopping. Optuna provides the following pruning algorithms:

Asynchronous Successive Halving algorithm.
Hyperband algorithm.
Median pruning algorithm which uses the median stopping rule.
Threshold pruning algorithm, used to detect outlying metrics of the trials.
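
To give a feel for how this kind of pruning works, here is a standard-library sketch of plain (synchronous) successive halving. It is a simplified relative of the asynchronous variant listed above, not Optuna's implementation, and the `partial_score` learning curve is invented for the example.

```python
import random


def partial_score(config, budget):
    """Hypothetical learning curve: the score approaches a config-specific ceiling."""
    ceiling = 1.0 - abs(config["lr"] - 0.1)
    return ceiling * (1 - 0.5 ** budget)


def successive_halving(configs, min_budget=1, eta=2):
    """Each round, keep the top 1/eta of the configs and give them eta times more budget."""
    survivors = list(configs)
    budget = min_budget
    history = []  # (budget, configs evaluated, configs kept) per round
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: partial_score(c, budget), reverse=True)
        keep = max(1, len(ranked) // eta)
        history.append((budget, len(ranked), keep))
        survivors = ranked[:keep]
        budget *= eta
    return survivors[0], history


rng = random.Random(0)
configs = [{"lr": rng.uniform(0.0, 0.5)} for _ in range(8)]
best, history = successive_halving(configs)
print(best, history)
```

Most of the compute goes to the few configurations that survive the early rounds, which is the same idea behind stopping unpromising trials early.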

I’ll highlight the simple steps you need to use Optuna:

First, install Optuna with `pip install optuna`, if it’s not already installed.
Define your model.
Choose parameters to optimize.
Create a study.
Define objective function.
Optimize.
Check trial results.

Tutorials and example code to check out:

“How to make your model awesome with Optuna” for a step-by-step walkthrough on how to use Optuna in your project.
Optuna’s GitHub page has a list of code examples written using Optuna.

You can also check out this read: “Optuna Guides how to monitor hyperparameter optimization runs”, to better understand how Optuna optimizes your hyperparameters.

3. HyperOpt

From the official documentation, Hyperopt is a Python library for serial and parallel optimization over awkward search spaces, which may include real-valued, discrete, and conditional dimensions.

Hyperopt uses Bayesian optimization algorithms for hyperparameter tuning, to choose the best parameters for a given model. It can optimize a large-scale model with hundreds of hyperparameters.

Hyperopt currently implements three algorithms:

Random Search,
Tree of Parzen Estimators,
Adaptive TPE.

Hyperopt has been designed to accommodate Bayesian optimization algorithms based on Gaussian processes and regression trees, but unfortunately they’re not currently implemented.

Features of Hyperopt:

HyperOpt requires 4 essential components for the optimization of hyperparameters:

the search space,
the loss function,
the optimization algorithm,
a database for storing the history (score, configuration)

Steps to use Hyperopt in your project:

Initialize the space over which to search.
Define the objective function.
Select the search algorithm to use.
Run the hyperopt function.
Analyze the evaluation outputs stored in the trials object.

There’s also a good Kaggle notebook you can try out.

Optuna vs Hyperopt: Which Hyperparameter Optimization Library Should You Choose?

4. Scikit-Optimize

Scikit-Optimize is an open-source library for hyperparameter optimization in Python. It was developed by the team behind Scikit-learn. It’s relatively easy to use compared to other hyperparameter optimization libraries.

It provides sequential model-based optimization methods, also known as Bayesian Hyperparameter Optimization (BHO). The advantage of BHO is that it finds better model settings than random search in fewer iterations.

What really is Bayesian optimization?

Bayesian optimization is a sequential design strategy for global optimization of black box functions that does not assume any functional forms. It’s usually used to optimize computationally expensive functions. At least that’s what Wikipedia says.

But, in plain English, BO uses past results to pick the hyperparameters that appear most promising to evaluate next, so it finds better settings in fewer iterations than random search. The performance of past hyperparameters affects future decisions.

Features of Scikit-Optimize:

Sequential model-based optimization,
Built on NumPy, SciPy, and Scikit-Learn,
Open source, commercially usable, BSD license.

Scikit-Optimize’s Bayesian optimization using a Gaussian process is available through a function called gp_minimize. You can learn more about it here. If you’re interested in how to build your own Bayesian Optimizer from scratch, you can also check out this tutorial: “How to Implement Bayesian Optimization From Scratch in Python”.

Here are the simple steps you need to follow to use Scikit-Optimize:

Start by installing skopt using `pip install scikit-optimize`, if it’s not already installed.
Define the model.
Decide the parameter to optimize.
Define search space.
Define the objective function.
Run the optimization.

For an in-depth explanation of Scikit-Optimize features, check out this article.

5. Microsoft’s NNI (Neural Network Intelligence)

NNI is a free, open-source AutoML toolkit developed by Microsoft. It’s used to automate feature engineering, model compression, neural architecture search, and hyper-parameter tuning.

How does it work?

The tool dispatches and runs trial jobs generated by tuning algorithms to search for the best neural architecture and/or hyper-parameters in different environments, like a local machine, remote servers, and the cloud.

Microsoft’s NNI supports frameworks like PyTorch, TensorFlow, Keras, Theano, Caffe2, etc., and libraries like Scikit-learn, XGBoost, CatBoost, and LightGBM for now.

Features of NNI:

Many popular automatic tuning algorithms (like TPE, Random Search, GP Tuner, Metis Tuner, and so on) and early stop algorithms (Medianstop, Curvefitting assessors).
NAS (Neural Architecture Search) framework for users to easily specify neural architectures they want to use.
Support for NAS algorithms like ENAS (Efficient Neural Architecture Search) and DARTS (Differentiable Architecture Search) through NNI trial SDK.
Automatic feature engineering through NNI trial SDK; you don’t have to create an NNI experiment, simply import a built-in auto-feature-engineering algorithm in your trial code and run!
Command line tools and a web UI to manage training experiments.
Extensible API to customize your AutoML models.
Experiments can run on your local machine, remote servers, Azure Machine Learning, and Kubernetes-based services like Kubeflow, AdaptDL, OpenPAI, etc.
Its methods for hyperparameter tuning include exhaustive search, heuristic search, Bayesian optimization, and RL-based approaches.
Some of its Bayesian optimization algorithms for hyperparameter tuning are TPE, GP Tuner, Metis Tuner, BOHB, and more.

Here are the steps you need to follow to use NNI:

Install NNI on either Windows or Linux and verify the installation.
Define and update the model.
Enable NNI API.
Define search space.
Define your experiment.
Prepare trial.
Prepare tuner.
Prepare config file.
Run the experiment.
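
To illustrate the "define search space" step, the sketch below writes a search space in the `_type` / `_value` layout that NNI's JSON search-space files use, here as a Python dictionary. The `sample_config` function is a hypothetical stand-in for what a tuner does, written with the standard library only; it is not part of NNI, and the hyperparameter names are assumptions.

```python
import math
import random

# Search space in NNI's JSON layout: every entry has a "_type" and a "_value".
search_space = {
    "lr": {"_type": "loguniform", "_value": [0.0001, 0.1]},
    "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]},
    "dropout": {"_type": "uniform", "_value": [0.1, 0.5]},
}


def sample_config(space, rng):
    """Hypothetical random tuner: draw one value per hyperparameter."""
    config = {}
    for name, spec in space.items():
        kind, value = spec["_type"], spec["_value"]
        if kind == "choice":
            config[name] = rng.choice(value)
        elif kind == "uniform":
            config[name] = rng.uniform(value[0], value[1])
        elif kind == "loguniform":
            low, high = math.log(value[0]), math.log(value[1])
            config[name] = math.exp(rng.uniform(low, high))
        else:
            raise ValueError(f"unsupported type: {kind}")
    return config


rng = random.Random(0)
trial_configs = [sample_config(search_space, rng) for _ in range(5)]
for config in trial_configs:
    print(config)
```

In a real experiment, the tuner you select in the config file generates these configurations, and each one is handed to a separate trial job.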

To learn more about these steps, check out NNI’s official documentation. You can also learn more about Microsoft’s NNI algorithms from Github.

Looking for how to implement this in your project? Check out this tutorial: “How to add Microsoft’s NNI to your Project”.

6. Google’s Vizier

AI Platform Vizier is a black-box optimization service for tuning hyperparameters in complex machine learning models.

It not only optimizes your model’s output by tuning the hyperparameters, it can also be used effectively to tune parameters in a function.

How does it work?

Determine the study configuration by setting the result you want to optimize and the hyperparameters that affect it.
Create a study from the configuration values already set, and use it to run experiments that produce results.

Why should you use Vizier?

It’s easy to use.
Requires minimal user configuration and setup.
Hosts state-of-the-art black-box optimization algorithms.
High availability.
Scalable to millions of trials per study, thousands of parallel trial evaluations per study, and billions of studies.

Follow these steps to use Vizier:

Ensure you have a Google account and you’re logged in.
Create a new cloud project from the console.
Enable billing for your account if it’s not enabled already.
Activate AI Platform Vizier.
Install and initialize the Cloud SDK.

On the documentation page, you’ll find steps showing how to make API requests using curl.

7. AWS SageMaker

AWS SageMaker is a fully managed machine learning service. With SageMaker, you can build and train machine learning models quickly and with ease. You can directly deploy them into a production-ready hosted environment right after building, so it works as a complete package.

It also provides machine learning algorithms that are optimized to run efficiently against extremely large data in a distributed environment. SageMaker natively supports bring-your-own-algorithms and frameworks, also offering flexible distributed training options that adjust to your specific workflows.

SageMaker uses Random Search or Bayesian Search for model hyperparameter tuning. For Bayesian Search, it either improves performance with a combination of hyperparameter values close to the combination from the best previous training job, or it chooses a set of hyperparameter values far removed from those it has tried.

Why should you use AWS SageMaker?

AWS SageMaker abstracts away much of the software engineering needed to get the job done, while remaining effective, flexible, and cost-effective. You can focus on what matters most, the core ML experiments, while SageMaker covers the rest with easy-to-use tools that resemble your existing workflow. All your tools are in one place, so you can move easily from data preprocessing to model building and model deployment on a single platform.

In a nutshell, you can use SageMaker’s automatic model tuning with built-in algorithms, custom algorithms, and SageMaker pre-built containers for machine learning frameworks.

8. Azure Machine Learning

Azure was created by Microsoft, leveraging its constantly-expanding worldwide network of data centers. Azure is a cloud platform for building, deploying, and managing services and applications, anywhere.

Azure Machine Learning is a separate, modernized service that delivers a complete data science platform. It’s complete in the sense that it covers the whole data science journey on a single platform, from data preprocessing to model building to model deployment and maintenance. It supports both code-first and low-code experiences. If you’re someone who likes little or no code, you should consider using Azure Machine Learning Studio.

Azure Machine Learning Studio is a web portal on Azure Machine Learning that contains low-code and no-code options (drag-and-drop) for project authoring and asset management.

Azure Machine Learning supports the following hyperparameter sampling methods:

Random sampling is used to randomly select a value for each hyperparameter, which can be a mix of discrete and continuous values. It also supports early termination of low-performance runs, just like early stopping in tree-based models.
Grid sampling can only be employed when all hyperparameters are discrete, and is used to try every possible combination of parameters in the search space.
Bayesian Sampling chooses hyperparameter values based on the Bayesian optimization algorithm, which tries to select parameter combinations that will result in improved performance from the previous selection.

With a really large hyperparameter search space (hundreds of hyperparameters and more), it would take a lot of iterations to try out every single combination. To save time, you can stop experiments (iterations) early when their results are poorer than earlier ones. Azure has early termination policies to help you with that:

Bandit policy. You can use a bandit policy to stop a run (experiment or iteration) if the target performance metric underperforms the best run so far by a specified margin.
Median stopping policy. Like the bandit policy, it abandons a run when its target performance metric is worse than the median of the running averages for all runs.
Truncation selection policy. A truncation selection policy cancels the lowest-performing runs at each evaluation interval, based on the truncation percentage value you specified.
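
The three policies can be sketched in a few lines of plain Python. This is an illustration of the rules as described above, not Azure's implementation: the function names, the absolute `slack_amount` margin, and the sample run histories are all assumptions made for the example, and a higher metric is taken to be better.

```python
from statistics import mean, median


def bandit_stop(current, best_so_far, slack_amount):
    """Stop a run that trails the best run by more than the allowed margin."""
    return current < best_so_far - slack_amount


def median_stop(run_history, all_histories):
    """Stop a run whose best metric is below the median of all runs' running averages."""
    running_averages = [mean(history) for history in all_histories]
    return max(run_history) < median(running_averages)


def truncation_stop(latest_by_run, truncation_percentage):
    """Return the names of the lowest-performing runs to cancel at this interval."""
    n_cancel = int(len(latest_by_run) * truncation_percentage / 100)
    ranked = sorted(latest_by_run, key=latest_by_run.get)  # worst first
    return set(ranked[:n_cancel])


# Hypothetical accuracy histories for four runs at three evaluation intervals.
histories = {
    "run_a": [0.60, 0.70, 0.80],
    "run_b": [0.55, 0.60, 0.62],
    "run_c": [0.40, 0.42, 0.45],
    "run_d": [0.65, 0.72, 0.78],
}
latest = {name: history[-1] for name, history in histories.items()}
best = max(latest.values())

bandit_stopped = [n for n, v in latest.items() if bandit_stop(v, best, slack_amount=0.1)]
median_stopped = [n for n, h in histories.items() if median_stop(h, list(histories.values()))]
truncated = truncation_stop(latest, truncation_percentage=50)

print(bandit_stopped)     # ['run_b', 'run_c']
print(median_stopped)     # ['run_b', 'run_c']
print(sorted(truncated))  # ['run_b', 'run_c']
```

On this toy data all three policies agree, but they differ in what they compare against: the single best run, the median of all runs, or a fixed share of the ranking.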

How can you start using Azure for hyperparameter tuning in your project?

Define a search space.
Configure sampling. You can choose from grid sampling, Bayesian sampling, or random sampling.
Configure early termination. You can use a bandit policy, a median stopping policy, or a truncation selection policy.
Run a hypertuning training experiment.

Check out this tutorial module by Microsoft: “Tune Hyperparameters with Azure Machine Learning”.

Optimization in deep learning

Have a look at other articles on our blog exploring aspects of optimization in deep learning:

Deep Learning Model Optimization Methods: Deep learning models exhibit excellent performance but require high computational resources. Techniques like pruning, quantization, and knowledge distillation are vital for improving computational efficiency.
How to Optimize GPU Usage During Model Training with neptune.ai: Since GPUs are expensive resources, it is paramount to utilize them to their fullest degree. Metrics like GPU usage, memory utilization, and power consumption provide insight into resource utilization and potential for improvement.
Deep Learning Optimization Algorithms: Training deep learning models means solving an optimization problem: The model is incrementally adapted to minimize an objective function. A range of optimizers are used in deep learning, each addressing a particular shortcoming of the basic gradient descent approach.

Conclusion

I hope I was able to teach you one or two things about hyperparameter tools. Don’t just let it sit there in your head, try them out! And feel free to reach out to me, I’d love to learn your opinions and preferences. Thanks for reading!
