Neptune Blog

How to Save Trained Model in Python

Gourav Bais

6th May, 2025

When working on real-world machine learning (ML) use cases, finding the best algorithm/model is not the end of your responsibilities. It is crucial to save, store, and package these models for their future use and deployment to production.

These practices are needed for a number of reasons:

Backup: A trained model can be saved as a backup in case the original data is damaged or destroyed.

Reusability & reproducibility: Building ML models is time-consuming by nature. To save cost and time, it becomes essential that your model gets you the same results every time you run it. Saving and storing your model the right way takes care of this.

Deployment: When deploying a trained model in a real-world setting, it becomes necessary to package it for easy deployment. This makes it possible for other systems and applications to use the same model without much hassle.

To reiterate, saving and storing ML models makes them easy to share, reuse, and reproduce, while packaging them enables quick and painless deployment. These three operations work together to simplify the whole model management process.

In this article, you will learn about different methods of saving, storing, and packaging a trained machine-learning model, along with the pros and cons of each method. But before that, you must understand the distinction between these three terms.

Save vs package vs store ML models

Although all these terms look similar, they are not the same.

Saving vs Storing vs Packaging ML Models | Source: Author

Saving a model refers to the process of writing the model’s parameters, weights, etc., to a file. Most ML and DL libraries provide some kind of method (e.g., model.save()) for this. Be aware, though, that saving is a single action that produces only a model binary file, so you still need code around it to make your ML application production-ready.

Packaging, on the other hand, refers to the process of bundling or containerizing the necessary components of a model, such as the model file, dependencies, configuration files, etc., into a single deployable package. The goal of a package is to make it easier to distribute and deploy the ML model in a production environment.

Once packaged, a model can be deployed across different environments, which allows the model to be used in various production settings such as web applications, mobile applications, etc. Docker is one of the tools which allows you to do this.
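To make the idea of a package concrete, here is a minimal sketch that bundles a model file, a dependency list, and a configuration file into one archive. It uses only the Python standard library, and a plain zip file stands in for a real container image. The file names, the dependency list, and the configuration keys are assumptions made up for illustration.

```python
import json
import pickle
import tempfile
import zipfile
from pathlib import Path

# Stand-in for a trained model: any picklable Python object works for this sketch.
model = {"algorithm": "knn", "n_neighbors": 5}

work_dir = Path(tempfile.mkdtemp())

# 1. The saved model binary.
model_path = work_dir / "model.pkl"
model_path.write_bytes(pickle.dumps(model))

# 2. The dependencies needed to load and run it (hypothetical list).
requirements_path = work_dir / "requirements.txt"
requirements_path.write_text("scikit-learn\npandas\n")

# 3. Configuration the serving code would read (hypothetical keys).
config_path = work_dir / "config.json"
config_path.write_text(json.dumps({"model_file": "model.pkl", "version": "1.0.0"}))

# Bundle everything into one distributable archive.
package_path = work_dir / "iris_model_package.zip"
with zipfile.ZipFile(package_path, "w") as archive:
    for path in (model_path, requirements_path, config_path):
        archive.write(path, arcname=path.name)

with zipfile.ZipFile(package_path) as archive:
    packaged_files = sorted(archive.namelist())
print(packaged_files)  # ['config.json', 'model.pkl', 'requirements.txt']
```

A Docker image follows the same principle, but it also ships the runtime and the installed dependencies rather than just listing them.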

Storing the ML model refers to the process of saving the trained model files in a centralized storage that can be accessed anytime when needed. When storing a model, you normally choose some sort of storage from where you can fetch your model and use it anytime. The model registry is a category of tools that solve this issue for you.
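As a rough illustration of storing, the sketch below keeps every saved model under a name and a version inside one shared directory, so it can be fetched again later. This is a toy stand-in for a model registry, not how any particular registry tool works. The helper names store_model() and fetch_model() and the directory layout are assumptions.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical "registry": a shared directory with one sub-folder per model version.
registry_root = Path(tempfile.mkdtemp()) / "model_registry"

def store_model(model, name, version):
    """Save the model under <registry>/<name>/<version>/model.pkl and return the path."""
    target_dir = registry_root / name / version
    target_dir.mkdir(parents=True, exist_ok=True)
    target_path = target_dir / "model.pkl"
    target_path.write_bytes(pickle.dumps(model))
    return target_path

def fetch_model(name, version):
    """Load a previously stored model by name and version."""
    return pickle.loads((registry_root / name / version / "model.pkl").read_bytes())

# Stand-in objects are used here instead of real trained models.
store_model({"n_neighbors": 5}, "iris_classifier", "v1")
store_model({"n_neighbors": 7}, "iris_classifier", "v2")

available_versions = sorted(p.name for p in (registry_root / "iris_classifier").iterdir())
restored = fetch_model("iris_classifier", "v2")
print(available_versions, restored)  # ['v1', 'v2'] {'n_neighbors': 7}
```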

Now let’s see how we can save our model.

How to save a trained model in Python?

In this section, you will see different ways of saving machine learning (ML) as well as deep learning (DL) models. To begin with, let’s create a simple classification model using the well-known Iris dataset.

Note: The focus of this article is not to show you how you can create the best ML model but to explain how effectively you can save trained models.

You first need to load the required dependencies and the iris dataset as follows:

# load dependencies
import pandas as pd 

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler 
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix

# load the dataset
url = "iris.data"

# column names to use
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# read the dataset from the URL
dataset = pd.read_csv(url, names=names) 

# check the first few rows of iris-classification data
dataset.head()

Next, you need to split the data into training and testing sets and apply the required preprocessing stages, such as feature standardization.

# separate the independent and dependent features
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values 

# Split dataset into random training and testing subsets
X_train, X_test, y_train, y_test = train_test_split(X, 
                                                    y, test_size=0.20) 
# feature standardization
scaler = StandardScaler()
scaler.fit(X_train)

X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

Finally, you need to train a classification model (feel free to choose any) on training data and check its performance on testing data.

# training a KNN classifier
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train) 

# make predictions on the testing data
y_predict = model.predict(X_test)

# check results
print(confusion_matrix(y_test, y_predict))
print(classification_report(y_test, y_predict))

Iris classification results | Source: Author

Now you have an ML model that you want to save for future use. The first way to save an ML model is by using the pickle file.

Saving trained model with pickle

The pickle module can be used to serialize and deserialize the Python objects. Pickling is the process of converting a Python object hierarchy into a byte stream, while Unpickling is the process of converting a byte stream (from a binary file or other object that appears to be made of bytes) back to an object hierarchy.
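The round trip described above can be sketched with pickle.dumps() and pickle.loads(), which work on bytes in memory instead of a file. The dictionary below is only an illustrative stand-in for a model object.

```python
import pickle

# A small object hierarchy: a dict holding a list, a tuple, and a nested dict.
original = {
    "name": "iris_classifier",
    "classes": ["setosa", "versicolor", "virginica"],
    "shape": (150, 4),
    "params": {"n_neighbors": 5},
}

# Pickling: object hierarchy -> byte stream.
byte_stream = pickle.dumps(original)

# Unpickling: byte stream -> a new, equal object hierarchy.
restored = pickle.loads(byte_stream)

print(type(byte_stream).__name__)  # bytes
print(restored == original)        # True
print(restored is original)        # False
```

The restored object is equal to the original but is a separate copy, which is what you get when a saved model is loaded in a new session.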

To save an ML model as a pickle file, you use the pickle module, which is part of the Python standard library.

To save your iris classifier model you simply need to decide on a filename and dump your model to a pickle file like this:

import pickle

# save the iris classification model as a pickle file
model_pkl_file = "iris_classifier_model.pkl"  

with open(model_pkl_file, 'wb') as file:  
    pickle.dump(model, file)

As you can see the file is opened in wb (write binary) mode for saving the model as bytes. Also, the dump() method stores the model in the given pickle file.

You can also load this model using the load() method of the pickle module. Now you need to open the file in rb (read binary) mode to load the saved model.

# load model from pickle file
with open(model_pkl_file, 'rb') as file:  
    model = pickle.load(file)

# evaluate model 
y_predict = model.predict(X_test)

# check results
print(classification_report(y_test, y_predict))

Once loaded you can use this model to make predictions.

Iris classification result | Source: Author
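One detail is worth keeping in mind here: the classifier was trained on standardized features, so the fitted scaler is needed at prediction time as well. A possible way to handle this, sketched below, is to wrap the scaler and the classifier in a scikit-learn Pipeline and pickle that single object. The sketch loads the data with load_iris() only to stay self-contained, and the variable names are assumptions rather than part of the code above.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# load_iris is used only to keep this sketch self-contained
X, y = load_iris(return_X_y=True)

# the pipeline standardizes the features and then classifies them
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])
pipeline.fit(X, y)

# one pickle now holds both the fitted scaler and the fitted classifier
pipeline_bytes = pickle.dumps(pipeline)
restored_pipeline = pickle.loads(pipeline_bytes)

# the restored pipeline accepts raw (unscaled) features directly
predictions_match = bool((restored_pipeline.predict(X) == pipeline.predict(X)).all())
print(predictions_match)
```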

Pros of the Python pickle approach

1 Pickling comes as the standard module in Python which makes it easy to use for saving and restoring ML models.
2 Pickle files can handle most Python objects including custom objects, making it a versatile way to save models.
3 For small models, the pickle approach is quite fast and efficient.
4 When an ML model is unpickled, it is restored to its previous state, including any variables or configurations. This makes Python pickle files one of the best alternatives for saving ML models.

Cons of the Python Pickle Approach

1 If you unpickle untrusted data, pickling could pose a security threat. Unpickling an object can execute malicious code, so it’s crucial to only unpickle information from reliable sources.
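2 Pickled objects may not load correctly in a different environment. A pickle stores references to classes rather than their code, so a model pickled with one version of Python or of a library such as scikit-learn is not guaranteed to load, or to behave identically, with another version.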
3 For models with a big memory footprint, pickling can result in the creation of huge files, which can be problematic.
4 Pickling can make it difficult to track changes to a model over time, especially if the model is updated frequently and it is not feasible to create multiple pickle files for different versions of models that you try.
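The first point above is easy to demonstrate. In the sketch below, a class defines __reduce__() so that unpickling calls a function chosen by whoever created the pickle. The callable used here (str.upper) is harmless, and the Payload class is a made-up name for illustration, but the same mechanism can call any importable function.

```python
import pickle

class Payload:
    """Illustrative class whose pickle triggers a function call on load."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call str.upper(<this string>)".
        return (str.upper, ("code ran during unpickling",))

untrusted_bytes = pickle.dumps(Payload())

# Loading does not return a Payload; it returns the result of the call.
result = pickle.loads(untrusted_bytes)
print(result)  # CODE RAN DURING UNPICKLING
```

This is why a pickle file should be treated like executable code and only loaded when you trust where it came from.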

Pickle is best suited to small models and has some security issues, which is reason enough to look at alternatives for saving ML models. Next, let’s discuss Joblib to save and load ML models.

Note: In the upcoming sections, you will see the same iris classifier model being saved using different techniques.

Saving trained model with Joblib

Joblib is a set of tools (typically part of the Scipy ecosystem) that provide lightweight pipelining in Python. It mainly focuses on disk caching, memoization, and parallel computing, and it can also save and load Python objects. Joblib is specifically optimized for NumPy arrays, which makes it fast and reliable for ML models that have a lot of parameters.

To save large models with Joblib, you use the joblib package. It is not part of the Python standard library, but it is installed as a dependency of scikit-learn, so it is usually already available (otherwise, run pip install joblib).

import joblib 

# save model with joblib 
filename = 'joblib_model.sav'
joblib.dump(model, filename)

To save the model, you need to define a filename with a ‘.sav’ or ‘.pkl’ extension and call the dump() method from Joblib.

Similar to pickle, Joblib provides the load() method to load the saved ML model.

# load model with joblib
loaded_model = joblib.load(filename)

# evaluate the loaded model
y_predict = loaded_model.predict(X_test)

# check results
print(classification_report(y_test, y_predict))

After loading the model with Joblib you are free to use it on the data to make predictions.
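For large models, joblib.dump() also accepts a compress argument. The sketch below compares the file size with and without compression. The dictionary is only a stand-in for a model with many parameters, and the file names are assumptions.

```python
import os
import tempfile

import joblib

# Stand-in for a large model: a dict holding a long, repetitive list.
big_object = {"weights": [0.5] * 100_000}

work_dir = tempfile.mkdtemp()
plain_path = os.path.join(work_dir, "model_plain.joblib")
compressed_path = os.path.join(work_dir, "model_compressed.joblib")

# Same object, saved without and with compression (levels range from 0 to 9).
joblib.dump(big_object, plain_path)
joblib.dump(big_object, compressed_path, compress=3)

plain_size = os.path.getsize(plain_path)
compressed_size = os.path.getsize(compressed_path)

# Loading works the same way for both files.
restored = joblib.load(compressed_path)
print(plain_size, compressed_size, restored == big_object)
```

Higher compression levels give smaller files at the cost of slower saving and loading, so the right value depends on the model.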

Pros of saving ML models with Joblib

1 Fast and effective performance is a key component of Joblib, especially for models with substantial memory requirements.
2 Joblib also provides parallel-computing helpers (joblib.Parallel) that can enhance performance on multi-core machines.
3 For models that demand a lot of memory, joblib.load() can memory-map the NumPy arrays in an uncompressed file (the mmap_mode argument) to reduce memory utilization.
4 dump() can compress the file it writes (the compress argument), which keeps large models smaller on disk.

Cons of Saving ML Models with Joblib

1 Joblib is optimized for NumPy arrays, and may not work as well with other object types.
2 Joblib offers less flexibility than Pickle because there are fewer options available for configuring the serialization process.
3 Compared to Pickle, Joblib is less well known, which can make it more difficult to locate help and documentation around it.
4 Joblib is built on top of pickle, so it carries the same security risk: loading a file can execute arbitrary code, so only load files from sources you trust.

Although Joblib handles large models better than pickle, it has some issues of its own. Next, you will see how you can manually save and restore the models using JSON.

Saving trained model with JSON

When you want to have full control over the save and restore procedure of your ML model, JSON comes into play. Unlike the other two methods, this method does not directly dump the ML model to a file; instead, you need to explicitly define the different parameters of your model to save them.

To use this method, you need to use the Python json module that again comes along with the default Python installation. Using the JSON method requires additional effort to write all parameters that an ML model contains. To save the model using JSON, let’s create a function like this:

import json 

# create json save function
def save_json(model, filepath, X_train, y_train): 
    params = model.get_params()
    saved_model = {}
    # store each hyperparameter under its own name
    saved_model["algorithm"] = params['algorithm']
    saved_model["leaf_size"] = params['leaf_size']
    saved_model["metric"] = params['metric']
    saved_model["metric_params"] = params['metric_params']
    saved_model["n_jobs"] = params['n_jobs']
    saved_model["n_neighbors"] = params['n_neighbors']
    saved_model["p"] = params['p']
    saved_model["weights"] = params['weights']
    # store the training data as plain lists (None is written as JSON null)
    saved_model["X_train"] = X_train.tolist() if X_train is not None else None
    saved_model["y_train"] = y_train.tolist() if y_train is not None else None
    
    json_txt = json.dumps(saved_model, indent=4)
    with open(filepath, "w") as file: 
        file.write(json_txt)

# save the iris-classification model in a json file
file_path = 'json_model.json'
save_json(model, file_path, X_train, y_train)

As you can see, you need to define each model parameter and the data explicitly to store them in JSON. Different models expose their parameters in different ways. For example, get_params() on KNeighborsClassifier returns a dictionary of all the hyperparameters of the model. You save these hyperparameters and data values in a dictionary, which is then dumped into a file with the ‘.json’ extension.

To read this JSON file you just need to open it and access the parameters as follows:

# create json load function 
def load_json(filepath): 
    with open(filepath, "r") as file:
        saved_model = json.load(file)
    
    return saved_model

# load model configurations
saved_model = load_json('json_model.json')
saved_model

In the above code, a function load_json() is created that opens the JSON file in read mode and returns all the parameters and data as a dictionary.

Unfortunately, you cannot use the saved model directly with JSON; you need to read these parameters and data and retrain the model yourself.
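A minimal sketch of that rebuild step is shown below, assuming the JSON file holds the hyperparameters and the training data. It is self-contained, so it loads the data with load_iris() and writes its own file. The "params" key and the file name are assumptions that differ slightly from the layout used above.

```python
import json
import os
import tempfile

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# load_iris is used only to keep this sketch self-contained
X, y = load_iris(return_X_y=True)
original = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# save the hyperparameters and the training data as JSON
payload = {
    "params": original.get_params(),
    "X_train": X.tolist(),
    "y_train": y.tolist(),
}
path = os.path.join(tempfile.mkdtemp(), "knn_model.json")
with open(path, "w") as file:
    json.dump(payload, file)

# read the file back and retrain a new model from its contents
with open(path) as file:
    saved = json.load(file)

rebuilt = KNeighborsClassifier(**saved["params"])
rebuilt.fit(saved["X_train"], saved["y_train"])

predictions_match = bool((rebuilt.predict(X) == original.predict(X)).all())
print(predictions_match)
```

For k-nearest neighbors, retraining is cheap because fitting mostly stores the data. For models with expensive training, you would need to write out the learned parameters themselves, which takes more model-specific code.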

Pros of saving ML models with JSON

1 Models that need to be exchanged between various systems can be done so using JSON, which is a portable format that can be read by a wide variety of programming languages and platforms.
2 JSON is a text-based format that is easy to read and understand, making it a good choice for models that need to be inspected or edited by humans.
3 For models with few parameters, JSON is a lightweight format that is easy to transfer over the internet, although for large numeric arrays a text format is usually bigger than a binary one.
4 Unlike pickle, which executes code during deserialization, JSON is a secure format that minimizes security threats.

Cons of Saving ML Models with JSON

1 Because JSON only supports a small number of data types, it may not be compatible with sophisticated machine learning models that employ custom data types.
2 In particular, for large models, JSON serialization and deserialization can be slower than other formats.
3 Compared to alternative formats, JSON offers less flexibility and may take more effort to tailor the serialization procedure.
4 JSON can be lossy: anything you do not write out explicitly (for example, the fitted internal state of a model) is not preserved, which can be a problem for models that require exact replication.
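The first and last points can be seen with two common Python types. In the sketch below, a set is rejected by the default encoder, and a tuple is silently turned into a list. The config dictionary and the encode_extra_types() helper are made up for illustration.

```python
import json

config = {"shape": (150, 4), "classes": {"setosa", "versicolor"}}

# Sets are not a JSON type, so the default encoder refuses them.
try:
    json.dumps(config)
    set_rejected = False
except TypeError:
    set_rejected = True

def encode_extra_types(value):
    """Fallback for json.dumps: turn sets into sorted lists."""
    if isinstance(value, set):
        return sorted(value)
    raise TypeError(f"Cannot serialize {type(value).__name__}")

# With the fallback the dump succeeds, but the original types are not restored.
restored = json.loads(json.dumps(config, default=encode_extra_types))
print(set_rejected)  # True
print(restored)      # {'shape': [150, 4], 'classes': ['setosa', 'versicolor']}
```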

So far, you have seen how to save classical ML models. Next, you will see how to save deep learning models, starting with TensorFlow Keras. Storing a saved model in a dedicated database is covered later in the article.

Saving deep learning model with TensorFlow Keras

TensorFlow is a popular framework for training DL-based models, and Keras is its high-level API. Deep learning models are trained using a neural network architecture with numerous layers and a set of labeled data. These models have two major components, the weights and the network architecture, and you need to save both to restore a model for future use. Typically, there are two ways to save deep learning models:

Save the model architecture in a JSON or YAML file and weights in an HDF5 file.
Save both the weights and the architecture together in an HDF5, protobuf, or tflite file.

You can use either approach, but the widely used one is to save the model weights and architecture together in an HDF5 file. (Recent Keras releases also offer a native ‘.keras’ format and recommend it over HDF5.)

To save a deep learning model in TensorFlow Keras, you can use the save() method of the Keras Model object. This method saves the entire model, including the model architecture, optimizer, and weights, in a format that can be loaded later to make predictions.

Here’s an example code snippet that shows how to save a TensorFlow Keras-based DL model:

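# import dependencies
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# encode the string class labels as integers (0, 1, 2)
y_train_encoded = LabelEncoder().fit_transform(y_train)

# define model architecture: 4 input features, 3 output classes
model = Sequential()
model.add(Dense(12, input_dim=4, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(3, activation='softmax'))

# compile model (sparse loss because the labels are integers, not one-hot vectors)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# fit the model
model.fit(X_train, y_train_encoded, epochs=150, batch_size=10, verbose=0)

# save the architecture, weights, and optimizer state in one HDF5 file
model.save('model.h5')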

That is it: you just need to define the model architecture, train the model with appropriate settings, and finally save it using the save() method.

Loading the saved models with Keras is as easy as reading a file in Python. You just need to call the load_model() method by providing the model file path and your model will be loaded.

# define dependency 
from tensorflow.keras.models import load_model

# load model 
model = load_model('model.h5')

# check model info 
model.summary()

Your model is now loaded for use.

Tensorflow loaded model | Source: Author

Pros of saving models with TensorFlow Keras

1 Saving and loading models in TensorFlow Keras is very straightforward using the save() and load_model() functions. This makes it easy to save and share models with others or to deploy them to production.
2 The whole model architecture, optimizer, and weights are saved in one file when you save a Keras model. With no need to bother about loading the architecture and weights separately, it is simple to load the model and generate predictions.
3 TensorFlow Keras supports several file formats for saving models, including the HDF5 format (.h5), the TensorFlow SavedModel format (.pb), and the TensorFlow Lite format (.tflite). This gives you flexibility in choosing the format that best suits your needs.

Cons of Saving Models with TensorFlow Keras

1 When you save a Keras model, the resulting file can be quite large, especially if you have a large number of layers or parameters. This can make it challenging to share or deploy the model, especially in situations where bandwidth or storage space is limited.
2 Models saved with one version of TensorFlow Keras may not work with another. If you try to load a model that was saved with a different version of Keras or TensorFlow, this may result in problems.
3 Although it’s simple to save a Keras model, you’re only able to use the features that Keras offers for storing models. A different framework or strategy may be required if you require more flexibility in the way models are saved or loaded.

There is one more widely used framework for training DL-based models: PyTorch. Let’s check how you can save PyTorch-based deep learning models with Python.

Saving deep learning model with PyTorch

Developed by Facebook, PyTorch is one of the most widely used frameworks for developing DL-based solutions. It provides a dynamic computational graph, which allows you to modify your model on the fly, making it ideal for research and experimentation. By convention, saved models use the ‘.pt’ or ‘.pth’ file extension.

To save a deep learning model in PyTorch, you can pass the torch.nn.Module object to the torch.save() function. This saves the entire model, including the model architecture and weights, in a format that can be loaded later to make predictions. The file is a pickle that references the model class, so the class definition must be available when you load it.

Here’s an example code snippet that shows how to save a PyTorch model:

# import dependencies
import torch
import torch.nn as nn
import numpy as np

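# encode the string class labels as integers, then convert the NumPy arrays to tensors
from sklearn.preprocessing import LabelEncoder

label_encoder = LabelEncoder().fit(y_train)
X_train = torch.FloatTensor(X_train)
X_test = torch.FloatTensor(X_test)
y_train = torch.LongTensor(label_encoder.transform(y_train))
y_test = torch.LongTensor(label_encoder.transform(y_test))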

# define model architecture
class NeuralNetworkClassificationModel(nn.Module):
    def __init__(self,input_dim,output_dim):
        super(NeuralNetworkClassificationModel,self).__init__()
        self.input_layer    = nn.Linear(input_dim,128)
        self.hidden_layer1  = nn.Linear(128,64)
        self.output_layer   = nn.Linear(64,output_dim)
        self.relu = nn.ReLU()
    
    
    def forward(self,x):
        out =  self.relu(self.input_layer(x))
        out =  self.relu(self.hidden_layer1(out))
        out =  self.output_layer(out)
        return out

# define input and output dimensions
input_dim  = 4 
output_dim = 3
model = NeuralNetworkClassificationModel(input_dim,output_dim)

# create our optimizer and loss function object
learning_rate = 0.01
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(),lr=learning_rate)

# define training steps
def train_network(model,optimizer,criterion,X_train,y_train,X_test,y_test,num_epochs,train_losses,test_losses):
    for epoch in range(num_epochs):
        # clear out the gradients from the last step loss.backward()
        optimizer.zero_grad()
        
        # forward feed
        output_train = model(X_train)

        # calculate the loss
        loss_train = criterion(output_train, y_train)

        # backward propagation: calculate gradients
        loss_train.backward()

        # update the weights
        optimizer.step()
        
        output_test = model(X_test)
        loss_test = criterion(output_test,y_test)

        train_losses[epoch] = loss_train.item()
        test_losses[epoch] = loss_test.item()

        if (epoch + 1) % 50 == 0:
            print(f"Epoch {epoch + 1}/{num_epochs}, Train Loss: {loss_train.item():.4f}, Test Loss: {loss_test.item():.4f}")

# train model
num_epochs = 1000
train_losses = np.zeros(num_epochs)
test_losses  = np.zeros(num_epochs)
train_network(model,optimizer,criterion,X_train,y_train,X_test,y_test,num_epochs,train_losses,test_losses)

# save model 
torch.save(model, 'model_pytorch.pt')

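Unlike TensorFlow, PyTorch gives you more control over the model training, as seen in the above code. After training the model, you can save its weights and architecture using the torch.save() function.

Loading the saved model with PyTorch requires the torch.load() function.

# load model (the model class must be defined or importable in this session)
# recent PyTorch versions default to weights_only=True, which rejects a fully pickled model;
# pass weights_only=False only for files you trust
model = torch.load('model_pytorch.pt', weights_only=False)
# switch the model to evaluation mode (in a notebook this also displays its layers)
model.eval()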
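A commonly used alternative is to save only the learned parameters through state_dict() and to rebuild the architecture in code before loading them. The sketch below uses a small stand-in network rather than the class defined above. The build_model() helper and the file name are assumptions.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Small stand-in network with the same input and output sizes as the iris example.
def build_model():
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))

model = build_model()

# Save only the learned parameters (a dictionary of tensors).
weights_path = os.path.join(tempfile.mkdtemp(), "model_weights.pt")
torch.save(model.state_dict(), weights_path)

# To restore, recreate the architecture in code, then load the parameters into it.
restored_model = build_model()
restored_model.load_state_dict(torch.load(weights_path))
restored_model.eval()

# Both models should now produce identical outputs for the same input.
sample = torch.randn(5, 4)
with torch.no_grad():
    outputs_match = torch.equal(model(sample), restored_model(sample))
print(outputs_match)
```

Because the file contains tensors rather than a pickled class instance, it is less tied to the exact code layout that existed when the model was saved.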

Pros of saving models with PyTorch

1 The computational graph used by PyTorch is dynamic, meaning it is built as the program is run. This allows for more flexibility in modifying the model during training or inference.
2 For dynamic models, such as those with variable-length inputs or outputs, which are frequent in natural language processing (NLP) and computer vision, PyTorch offers improved support.
3 Given that PyTorch is written in Python and functions well with other Python libraries like NumPy and pandas, manipulating data both before and after training is simple.

Cons of Saving Models with PyTorch

1 Even though PyTorch provides an accessible API, there may be a steep learning curve for newcomers to deep learning or Python programming.
2 Since PyTorch is essentially a framework for research, it might not have as many tools for production deployment as other deep learning frameworks like TensorFlow or Keras.

That is not all: you can also use model registry platforms to save DL-based models, especially large ones. This makes it easy to deploy and maintain them without requiring extra effort from developers.

You can find the dataset and code used in this article here.

How to package ML models?

An ML model is typically optimized for performance on the training dataset and the specific environment in which it is trained. But, when it comes to deploying the models in different environments, such as a production environment, there could be various challenges.

These challenges include, but are not limited to, differences in hardware, software, and data inputs. Packaging the model makes it easier to address these problems, as it allows the model to be exported or serialized into a standard format that can be loaded and used in various environments.

There are various options available for packaging right now. By packaging the model in a standard format such as PMML (Predictive Model Markup Language), ONNX, TensorFlow SavedModel format, etc. it becomes easier to share and collaborate on a model without being concerned about different libraries and tools used by different teams. Now, let’s check a few examples of packaging an ML model with different frameworks in Python.

Note: For this section as well, you will see the same iris-classification example.

Packaging models with PMML

Using a PMML library in Python, you can export your machine learning models to PMML format and then deploy them as a web service, a batch processing system, or a data integration platform. This can make it easier to share and collaborate on machine learning models, as well as to deploy them in various production environments.

To package an ML model using PMML you can use different modules like sklearn2pmml, jpmml-sklearn, jpmml-tensorflow, etc.

Note: To use PMML, you must have Java Runtime installed on your system.

Here is an example code snippet that allows you to package the trained iris classifier model using PMML.

from sklearn2pmml import PMMLPipeline, sklearn2pmml

# package iris classifier model with PMML
pipeline = PMMLPipeline([("estimator", model)])
sklearn2pmml(pipeline, "iris_model.pmml", with_repr=True)

In the above code, you create a PMMLPipeline object by passing your model object, and then save it using the sklearn2pmml() function. That is it: you can now use this “iris_model.pmml” file across different environments.

Pros of using PMML

1 Since PMML is a platform-independent format, PMML models can be integrated with numerous data processing platforms and used in a variety of production situations.
2 PMML can reduce vendor lock-in as it allows users to export and import models from different machine-learning platforms.
3 PMML models can be easily deployed in production environments as they can be integrated with various data processing platforms and systems.

Cons of using PMML

1 Some machine learning models and algorithms may not be able to be exported in PMML format as a result of the limited support.
2 PMML is an XML-based format that can be verbose and inflexible, which may make it difficult to modify or update models after they have been exported in PMML format.
3 It might be difficult to create PMML models, especially for complicated models with several features and interactions.

Packaging models with ONNX

Developed by Microsoft and Facebook, ONNX (Open Neural Network Exchange) is an open format for representing machine learning models. It allows for interoperability between different deep-learning frameworks and tools.

ONNX models can be deployed efficiently on a variety of platforms, including mobile devices, edge devices, and the cloud. Models from frameworks such as Caffe2, TensorFlow, PyTorch, and MXNet can be converted to ONNX, which allows you to deploy them on different devices and platforms with minimal effort.

To save the model using ONNX, you need to have the onnxmltools, skl2onnx, and onnxruntime packages installed on your system.

Here is an example of how you can convert the existing ML model to ONNX format.

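# load dependencies
import onnxmltools
import onnxruntime
from skl2onnx.common.data_types import FloatTensorType

# describe the model input: a float tensor named "X" with 4 features per row
initial_types = [("X", FloatTensorType([None, 4]))]

# Convert the KNeighborsClassifier model to ONNX format
onnx_model = onnxmltools.convert_sklearn(model, initial_types=initial_types)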

# Save the ONNX model in a file
onnx_file = "iris_knn.onnx"
onnxmltools.utils.save_model(onnx_model, onnx_file)

You just need to import the required modules and use the convert_sklearn() function to convert the sklearn model to an ONNX model. Once the conversion is done, you can use the save_model() function to store the ONNX model in a file with the “.onnx” extension. Although this example uses a classical ML model, ONNX is mostly used for DL models.

You can also load this model using the ONNX Runtime module.

# Load the ONNX model into ONNX Runtime
sess = onnxruntime.InferenceSession(onnx_file)

# Evaluate the model on some test data
input_data = {"X": X_test[:10].astype('float32')}
output = sess.run(None, input_data)

You need to create a session using InferenceSession() method to load the ONNX model from a file and then use sess.run() method to make predictions from the model.

Pros of using ONNX

1 With little effort, ONNX models can easily be deployed on a number of platforms, including mobile devices and the cloud. It is simple to deploy models on various hardware and software platforms thanks to ONNX’s support for a wide range of runtimes.
2 ONNX models are optimized for performance, which means that they can run faster and consume fewer resources than models in other formats.

Cons of using ONNX

1 ONNX is primarily designed for deep learning models and may not be suitable for other types of machine learning models.
2 ONNX models may not be compatible with all versions of different deep learning frameworks, which may require additional effort to ensure compatibility.

Packaging models with Tensorflow SavedModel

TensorFlow’s SavedModel format allows you to easily save and load your deep learning models, and it ensures compatibility with other TensorFlow tools and platforms. Additionally, it provides a streamlined and efficient way to deploy your models in production environments.

SavedModel supports a wide range of deployment scenarios, including serving models with Tensorflow Serving, deploying models to mobile devices with Tensorflow Lite, and exporting models to other ML libraries such as ONNX.

It provides a simple and streamlined way to save and load Tensorflow models. The API is easy to use and well-documented, and the format is designed to be efficient and scalable.

Note: You can use the same TensorFlow model trained in the above section.

To save the model in SavedModel format, you can use the following lines of code:

import tensorflow as tf

# using SavedModel format to save the model
tf.saved_model.save(model, "my_model")

You can also load the model with load() method.

# Load the model
loaded_model = tf.saved_model.load("my_model")

Pros of using Tensorflow SavedModel

1 SavedModel is platform-independent and version-compatible, which makes it easy to share and deploy models across different platforms and versions of TensorFlow.
2 A variety of deployment scenarios are supported by SavedModel, including exporting models to other ML libraries like ONNX, serving models with TensorFlow Serving, and distributing models to mobile devices using TensorFlow Lite.
3 SavedModel is optimized for training and inference, with support for distributed training and the ability to use GPUs and TPUs to accelerate training.

Cons of using Tensorflow SavedModel

1 SavedModel files can be large, particularly for complex models, which can make them difficult to store and transfer.
2 Given that SavedModel is exclusive to TensorFlow, its compatibility with other ML libraries and tools may be constrained.
3 A SavedModel is a directory of protobuf and binary checkpoint files that can be difficult to inspect, making it harder to understand the details of the model’s architecture and operation.

Now that you have seen multiple ways of packaging ML and DL models, you must also be aware that there are various tools available that provide infrastructure to package, deploy and serve these models. Two of the popular ones are BentoML and MLflow.

BentoML

BentoML is a flexible framework for building and deploying production-ready machine learning services. It allows data scientists to package their trained models, their dependencies, and the infrastructure code required to serve the model into a reusable package called a “Bento”.

BentoML supports various machine learning frameworks and deployment platforms and provides a unified API for managing the lifecycle of the model. Once a model is packaged as a Bento, it can be deployed to various serving platforms like AWS Lambda, Kubernetes, or Docker. BentoML also offers an API server that can be used to serve the model via a REST API. You can learn more about it here.

MLflow

MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. It provides a comprehensive set of tools for tracking experiments, packaging code and dependencies, and deploying models.

MLflow allows data scientists to easily package their models in a standard format that can be deployed to various platforms like AWS SageMaker, Azure ML, and Google Cloud AI Platform. The platform also provides a model registry to manage model versions and track their performance over time. Additionally, MLflow offers a REST API for serving models, which can be easily integrated into web applications or other services.

How to store ML models?

Now that we know about saving models, let’s see how we can store them to facilitate their quick and easy retrieval.

Storing ML models in a database

You can also save your ML models in relational databases such as PostgreSQL, MySQL, and Oracle, or in NoSQL databases such as MongoDB and Cassandra. The choice of database depends on factors such as the type and volume of data being stored, the performance and scalability requirements, and the specific needs of the application.

PostgreSQL, which supports storing and manipulating structured data, is a popular choice when working with ML models. Storing ML models in PostgreSQL provides an easy way to keep track of different versions of a model and manage them in a centralized location.

It also makes it easy to share models across a team or organization. However, storing large models in a database can increase database size and query times, so consider the storage capacity and performance of your database before storing models in PostgreSQL.

To save an ML model in a database like PostgreSQL, first convert the trained model into a serialized format, such as a byte stream (pickle object) or JSON.

import pickle

# serialize the model
model_bytes = pickle.dumps(model)

Then open a connection to the database and create a table or collection to store the serialized model. For this, you need to use the psycopg2 library of Python, which lets you connect to the PostgreSQL database. You can download this library with the help of the Python package installer.

$ pip install psycopg2-binary

Then you need to establish a connection to the database to store the ML model like this:

import psycopg2

# establish the connection to the database
# (all values below are placeholders and must be passed as strings)
conn = psycopg2.connect(
    dbname="database-name",
    user="user-name",
    password="your-password",
    host="127.0.0.1",
    port="5432",
)

To perform any operation on the database, you need to create a cursor object that will help you to execute queries in your Python program.

# create a cursor
cur = conn.cursor()

With the help of this cursor, you can now execute the CREATE TABLE query to create a new table.

cur.execute("CREATE TABLE models (id INT PRIMARY KEY NOT NULL, name CHAR(50), model BYTEA)")

Note: Make sure that the type of the model column is BYTEA, since the serialized model is binary data.

Finally, you can store the model and other metadata information using the INSERT INTO command.

# Insert the serialized model into the database
cur.execute("INSERT INTO models (id, name, model) VALUES (%s, %s, %s)", (1, 'iris-classifier', model_bytes))
conn.commit()

# Close the database connection
cur.close()
conn.close()

Once all the operations are done, close the cursor and connection to the database.

Finally, to read the model from the database, you can use the SELECT command by filtering the model either on name or id.

import psycopg2
import pickle

# Connect to the database (all values below are placeholders)
conn = psycopg2.connect(
    dbname="database-name",
    user="user-name",
    password="your-password",
    host="127.0.0.1",
    port="5432",
)

# Retrieve the serialized model from the database
cur = conn.cursor()
cur.execute("SELECT model FROM models WHERE name = %s", ('iris-classifier',))
model_bytes = cur.fetchone()[0]

# Deserialize the model (BYTEA values come back as a memoryview, so convert to bytes)
# Only unpickle data that comes from a source you trust
model = pickle.loads(bytes(model_bytes))

# Close the database connection
cur.close()
conn.close()

Once the model is loaded from the database, you can use it to make predictions as follows:

from sklearn.metrics import classification_report

# test loaded model
y_predict = model.predict(X_test)

# check results
print(classification_report(y_test, y_predict))

That’s it: you have stored the model in the database and loaded it back.
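The version-tracking idea mentioned earlier can be sketched in a few lines. This is a minimal illustration rather than a production setup: it uses Python’s built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a database server, and the table name model_versions, the helpers save_version and load_latest, and the dictionaries standing in for trained models are all hypothetical. With PostgreSQL and psycopg2, the BLOB column would be BYTEA and the ? placeholders would be %s.

```python
import pickle
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE model_versions ("
    "name TEXT NOT NULL, version INTEGER NOT NULL, model BLOB NOT NULL, "
    "PRIMARY KEY (name, version))"
)


def save_version(conn, name, model):
    """Store the model under the next free version number and return it."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM model_versions WHERE name = ?",
        (name,),
    ).fetchone()
    version = row[0] + 1
    conn.execute(
        "INSERT INTO model_versions (name, version, model) VALUES (?, ?, ?)",
        (name, version, pickle.dumps(model)),
    )
    conn.commit()
    return version


def load_latest(conn, name):
    """Return (version, model) for the highest stored version of a model."""
    row = conn.execute(
        "SELECT version, model FROM model_versions "
        "WHERE name = ? ORDER BY version DESC LIMIT 1",
        (name,),
    ).fetchone()
    if row is None:
        raise KeyError(f"no model named {name!r}")
    return row[0], pickle.loads(row[1])


# Plain dictionaries stand in for trained models (hypothetical).
v1 = save_version(conn, "iris-classifier", {"coef": [0.1, 0.2]})
v2 = save_version(conn, "iris-classifier", {"coef": [0.3, 0.4]})
latest_version, latest_model = load_latest(conn, "iris-classifier")
conn.close()
```

Using a composite key of name and version keeps every earlier version retrievable, which is the property that makes a database useful for tracking models over time.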

Pros of storing ML models in a database

1 Storing ML models in a database provides a centralized storage location that can be easily accessed by multiple applications and users.
2 Since most organizations already have databases in place, integrating ML models into the existing infrastructure becomes easier.
3 Databases are optimized for data retrieval, so looking up a stored model by its name or id is typically fast for reasonably sized models.
4 Databases are designed to provide robust security features such as authentication, authorization, and encryption. This ensures that the stored ML models are secure.

Cons of storing ML models in a database

1 Databases are designed for storing structured data and are not optimized for storing unstructured data such as ML models. As a result, there may be limitations in terms of model size, file formats, and other aspects of ML models that cannot be accommodated by databases.
2 Storing ML models in a database can be complex and requires expertise in both database management and machine learning.
3 If the ML models are large, storing them in a database may lead to scalability issues. Additionally, the retrieval of large models may impact the performance of the database.
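One rough way to gauge whether a model is small enough to keep in a database column is to measure its serialized size before inserting it. The sketch below assumes a hypothetical 50 MB budget (MAX_MODEL_BYTES) and uses a list of floats in place of a trained model; the helper names are illustrative, and the right limit depends on your database and workload.

```python
import pickle

# Hypothetical size budget; choose a value that suits your database.
MAX_MODEL_BYTES = 50 * 1024 * 1024


def serialized_size(model):
    """Return the size in bytes of the pickled model."""
    return len(pickle.dumps(model))


def fits_in_database(model, limit=MAX_MODEL_BYTES):
    """Return True if the pickled model is within the size budget."""
    return serialized_size(model) <= limit


# A list of floats stands in for a trained model (hypothetical).
toy_model = [0.0] * 1000
size = serialized_size(toy_model)
ok = fits_in_database(toy_model)
too_big = fits_in_database(toy_model, limit=10)
```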

While pickle, joblib, and JSON are common ways to save machine learning models, they have limitations when it comes to versioning, sharing, and managing those models. This is where ML model registries come in, as they address many of these limitations.

Next, you will see how saving ML models in the model registry can help you achieve reproducibility and reusability.

Storing ML models in model registry

A model registry is a central repository that can store, version, and manage machine learning models.
It typically includes features like model versioning, metadata control, comparing model runs, etc.
When working on any ML or DL projects, you can save and retrieve the models and their metadata from the model registry anytime you want.
Above all, model registries enable high collaboration among team members.
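To make these features concrete, here is a toy in-memory registry that tracks versions, metrics, and stage tags. It only illustrates the concepts: the ToyModelRegistry class and its methods are hypothetical, the accuracy values are made up, and nothing here mirrors the API of MLflow, Neptune, or any other real registry. It also tracks metadata only, not the model files themselves.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    """Metadata recorded for one registered version of a model."""
    version: int
    metrics: dict
    tags: set = field(default_factory=set)


class ToyModelRegistry:
    """Hypothetical in-memory registry; real registries persist this data."""

    def __init__(self):
        self._versions = {}  # model name -> list of ModelVersion

    def register(self, name, metrics):
        """Record a new version of the model with its metrics."""
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(version=len(versions) + 1, metrics=dict(metrics))
        versions.append(entry)
        return entry

    def best(self, name, metric):
        """Compare versions and return the one with the highest metric value."""
        return max(self._versions[name], key=lambda v: v.metrics[metric])

    def set_stage(self, name, version, stage):
        """Tag exactly one version of the model with the given stage."""
        for entry in self._versions[name]:
            entry.tags.discard(stage)
        self._versions[name][version - 1].tags.add(stage)


registry = ToyModelRegistry()
registry.register("iris-classifier", {"accuracy": 0.91})
registry.register("iris-classifier", {"accuracy": 0.95})
registry.register("iris-classifier", {"accuracy": 0.93})

# Pick the best version by accuracy and promote it.
best = registry.best("iris-classifier", "accuracy")
registry.set_stage("iris-classifier", best.version, "production")
```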

There are various options for the model registry, such as MLflow or Kubeflow. You can also use tools like neptune.ai: even though it’s an experiment tracker, it covers model registry and model versioning capabilities to a great extent. Although each of these platforms has its own unique features, it is wise to choose a registry that provides a comprehensive set of features.

Storing models with MLflow

MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. It includes a model registry component that allows you to centrally manage models.

You can register a model with MLflow either in the UI or programmatically.

Registering a model via UI in MLflow | Source

Once registered, you can:

Version your models,
Transition models through stages (e.g., Staging, Production),
Add descriptions and tags,
Compare model versions,
Fetch registered models from the model registry.

Storing models with Neptune

Neptune is an experiment tracker designed with a strong focus on collaboration and scalability. It lets you monitor months-long model training, track massive amounts of data, and compare thousands of metrics in the blink of an eye.

You can log, store, and organize your model metadata with Neptune’s flexible Python API. To log the model metadata, use the run object. Depending on your setup, you can separate the model and training metadata by creating multiple runs or log everything together.

With Neptune, you can:

Track models and model versions, along with the associated metadata.
Filter, sort, and compare the versioned data easily.
Manage model stages using tags.
Query and download any stored model files and metadata.

Pros of storing models with model registry

1 A centralized location for managing, storing, and version-controlling machine learning models.
2 Model registries usually store metadata about each model, such as its version and performance metrics, making it simpler to follow changes and understand the model’s history.
3 Model registries allow team members to collaborate on models and share their work easily.
4 Some model registries provide automated deployment options, which can simplify the process of deploying models to production environments.
5 Model registries often provide security features such as access control, encryption, and authentication, ensuring that models are kept secure and only accessible to authorized users.

Cons of storing models with model registry

1 A paid subscription is necessary for some model registries, which raises the cost of machine learning projects.
2 Model registries often have a learning curve, and it may take time to get up to speed with their functionality and features.
3 Using a model registry may require integrating with other tools and systems, which can create additional dependencies.

You have now seen different ways of saving, packaging, and storing ML and DL models, with the model registry being the most full-featured storage option. Now it is time to look at some best practices.

Best practices

In this section, you will see some of the best practices for saving ML and DL models.

Ensure Library Versions: Using different library versions for saving and loading a model may create compatibility issues, as a library update can introduce structural changes. Make sure the library versions used to load a model are the same as those used to save it.
Ensure Python Versions: It is a good practice to use the same Python version across all stages of your ML pipeline development. Sometimes changes in the Python version can create execution issues. For example, TensorFlow 1.x supports Python versions only up to 3.7, and you will run into errors if you try to use it with later versions.
Save Both Model Architecture and Weights: In the case of DL-based models, if you save only the model weights but not the architecture, you cannot reconstruct the model. Saving the model architecture along with the trained weights ensures that the model can be fully reconstructed and used later on.
Document the Model: The goal, inputs, outputs, and anticipated performance of the model should be documented. This can aid others in understanding the capabilities and constraints of the model.
Use Model Registry: Use a model registry like neptune.ai to keep track of models, their versions, and metadata and to collaborate with team members.
Keep the Saved Model Secure: Keep the saved model secure by encrypting it or storing it in a secure location, especially if it contains sensitive data.
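Several of these practices can be combined by writing a small metadata file next to the saved model. The sketch below is one possible approach, not a standard: the helpers save_with_metadata and load_with_checks, the JSON sidecar layout, and the scikit-learn version string are all hypothetical placeholders. Note that a checksum stored next to the model detects accidental corruption only; it does not protect against someone who can modify both files.

```python
import hashlib
import json
import pickle
import platform
import tempfile
from pathlib import Path


def save_with_metadata(model, path, library_versions):
    """Pickle the model and write a JSON sidecar describing how it was saved."""
    path = Path(path)
    payload = pickle.dumps(model)
    path.write_bytes(payload)
    metadata = {
        "python_version": platform.python_version(),
        "library_versions": library_versions,
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    path.with_suffix(".json").write_text(json.dumps(metadata, indent=2))
    return metadata


def load_with_checks(path):
    """Verify the checksum and Python version, then unpickle the model."""
    path = Path(path)
    metadata = json.loads(path.with_suffix(".json").read_text())
    payload = path.read_bytes()
    if hashlib.sha256(payload).hexdigest() != metadata["sha256"]:
        raise ValueError("model file does not match the recorded checksum")
    if metadata["python_version"] != platform.python_version():
        print("warning: model was saved with Python", metadata["python_version"])
    return pickle.loads(payload), metadata


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"
    # The dictionary and the version string are placeholders for illustration.
    saved = save_with_metadata(
        {"coef": [0.1, 0.2]}, model_path, {"scikit-learn": "1.4.2"}
    )
    loaded_model, loaded_meta = load_with_checks(model_path)
```

In a real project, the library versions could be read at save time with importlib.metadata.version instead of being typed by hand.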

Conclusions

In conclusion, saving machine learning models is an important step in the development process, as it allows you to reuse and share your models with others. There are several ways to save machine learning models, each with its own advantages and disadvantages. Some popular methods include using pickle, Joblib, JSON, TensorFlow save, and PyTorch save.

It is important to choose the appropriate file format for your specific use case and to follow best practices for saving and documenting models, such as using version control, keeping language and library versions consistent, and testing the saved model. By following the practices discussed in this article, you can ensure that your machine-learning models are saved correctly, are easy to reuse and deploy, and can be effectively shared with others.
