
How to Make your MLflow Projects Easy to Share and Collaborate on

If you use MLflow, you’re in for a treat! In this article, you’ll see how to make your MLflow projects much easier to share, and how to enable seamless collaboration with your teammates.

Creating a seamless workflow for your machine learning projects can be extremely challenging. 

A typical machine learning lifecycle includes: 

  • Data collection + preprocessing 
  • Training the model on data
  • Deploying the model to production 
  • Testing + improving the model with new data 

These four steps seem fairly straightforward, but each one comes with its own obstacles. You might need a different tool for each step: Kafka for data prep, TensorFlow as a model training framework, Kubernetes as a deployment environment, and so on.

Every time you switch to a new tool, you have to repeat the entire process, for example by running the same pipeline through Scikit-learn and deploying to Amazon SageMaker. This is not sustainable as APIs and organizations grow.

Plus, tuning hyperparameters is vital for building a strong model, so you need a thorough record of hyperparameter history, source code, performance metrics, dates, authors, and more. The machine learning lifecycle can be a formidable platform development challenge: you should be able to reproduce, revisit, and deploy your workflow to production easily, and you need a platform that standardizes the lifecycle.

Luckily, there’s MLflow, which is a great open-source solution built around 3 pillars: tracking, projects, and models.

  • MLflow Tracking

Build extensive logging around your model, and record specific metrics so you can compare runs.

  • MLflow Projects

Package your code in a reusable, reproducible format, so the same project can run locally or in the cloud.

  • MLflow Models

Package your machine learning models in a standard format for use in various downstream tools. For example, real-time serving with a REST API, or batch inference with Apache Spark. 

MLflow enables reproducibility and scalability for large organizations. The same model can execute in the cloud, locally, or in a notebook. You can work with any ML library, algorithm, deployment tool or language, and you can also add and share previous code. 

But, there’s something that MLflow doesn’t have: an easy way to organize work and collaborate. 

You would need to host an MLflow server, painstakingly organize team member access, store backups, and more. Plus, the MLflow UI, the part of MLflow Tracking that lets you compare experiments, is not easy to use, especially for large teams.

MLflow UI

Not to worry! We can use Neptune AI to solve this problem. 

Neptune’s intuitive UI lets you track experiments and collaborate with teammates, while also keeping your favorite parts from MLflow. 

Introducing Neptune & MLflow integration

Neptune MLflow integration

Neptune is a lightweight ML experiment management tool. It’s flexible and easy to integrate with all types of workflows. Your teammates can use different ML libraries and platforms, share results, and collaborate on a single dashboard with Neptune. You can even use its web platform, so you don’t have to deploy it on your own hardware.

Neptune’s main features are: 

  • Experiment Management: keep track of all your team’s experiments, and tag, filter, group, sort, and compare them
  • Notebook versioning and diffing: compare two notebooks or checkpoints in the same notebook; similarly to source code, you can do a side-by-side comparison 
  • Team Collaboration: add comments, mention teammates, and compare experiment results


Neptune and MLflow can be integrated with one simple command: 

neptune mlflow

Now, you can push all these MLflow run objects to a Neptune experiment:

  • Experiment id + name
  • Run id + name
  • Metrics
  • Parameters
  • Artifacts 
  • Tags
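
To preview what a sync would pick up, you can inspect the local runs first. The sketch below is only an illustration: it assumes MLflow is writing to its default local file store, where each experiment is a folder inside mlruns, each run is a subfolder with a meta.yaml file, and metrics, parameters, and tags are small text files. The helper names read_key_files and summarize_mlruns are hypothetical and are not part of MLflow or Neptune.

```python
from pathlib import Path


def read_key_files(folder: Path) -> dict:
    """Map each file name directly inside `folder` to its stripped text content."""
    if not folder.is_dir():
        return {}
    return {p.name: p.read_text().strip() for p in sorted(folder.iterdir()) if p.is_file()}


def summarize_mlruns(root: str = "mlruns") -> list:
    """Return one summary dict per run found under a local MLflow file store.

    Assumed layout (not read from any MLflow API):
    <root>/<experiment_id>/<run_id>/{meta.yaml, metrics/, params/, tags/, artifacts/}
    """
    runs = []
    for exp_dir in sorted(Path(root).iterdir()):
        # Skip hidden folders such as .trash and anything that is not a folder.
        if not exp_dir.is_dir() or exp_dir.name.startswith("."):
            continue
        for run_dir in sorted(exp_dir.iterdir()):
            # Treat a subfolder as a run only if it carries a meta.yaml file.
            if not run_dir.is_dir() or not (run_dir / "meta.yaml").is_file():
                continue
            metrics = {}
            for name, text in read_key_files(run_dir / "metrics").items():
                if text:
                    # Each line is assumed to start with "timestamp value";
                    # keep the value from the last logged line.
                    metrics[name] = float(text.splitlines()[-1].split()[1])
            artifacts_dir = run_dir / "artifacts"
            artifacts = sorted(p.name for p in artifacts_dir.iterdir()) if artifacts_dir.is_dir() else []
            runs.append({
                "experiment_id": exp_dir.name,
                "run_id": run_dir.name,
                "metrics": metrics,
                "params": read_key_files(run_dir / "params"),
                "tags": read_key_files(run_dir / "tags"),
                "artifacts": artifacts,
            })
    return runs
```

Calling summarize_mlruns("mlruns") from the folder that holds your runs returns a list you can print or compare against what shows up in Neptune afterwards. Metric names that contain slashes are stored in nested folders and are skipped by this simple version.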

Organization and collaboration with Neptune

Now let’s walk through how you will be able to share and collaborate on experiments from MLflow through Neptune’s beautiful and intuitive UI. 

Neptune setup (skip if you already have a Neptune account)

1. Sign up for a Neptune AI account. It’s free for individuals and non-organizational use, and you get a generous 100 GB of storage.

2. Get your API token by clicking the top right menu. 

Neptune getting started

3. Create a NEPTUNE_API_TOKEN environment variable by running this in your console:

export NEPTUNE_API_TOKEN='your_api_token'

4. Create a project. In your Projects dashboard, click “New Project” and fill in the following information. Pay attention to the privacy settings!

Neptune create new project

Sync Neptune and MLflow

First install Neptune-MLflow:

pip install neptune-mlflow

Next, set your NEPTUNE_PROJECT variable to USER_NAME/PROJECT_NAME:

export NEPTUNE_PROJECT=USER_NAME/PROJECT_NAME

Finally, sync your mlruns directory with Neptune: 

neptune mlflow
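
If you would rather run the sync from a script, the steps above can be wrapped in a small pre-flight check. This is a sketch under stated assumptions, not an official Neptune utility: the function names missing_settings, project_looks_valid, and sync_mlruns are made up for illustration, and the only external command used is the neptune mlflow command shown above.

```python
import os
import subprocess

# The two environment variables this article asks you to export.
REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_settings(env=None) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def project_looks_valid(project: str) -> bool:
    """Check for the USER_NAME/PROJECT_NAME shape used in this article."""
    parts = project.split("/")
    return len(parts) == 2 and all(parts)


def sync_mlruns(env=None) -> int:
    """Run `neptune mlflow` once the settings look complete; return its exit code."""
    env = dict(os.environ if env is None else env)
    missing = missing_settings(env)
    if missing:
        raise RuntimeError("Set these variables first: " + ", ".join(missing))
    if not project_looks_valid(env["NEPTUNE_PROJECT"]):
        raise ValueError("NEPTUNE_PROJECT should look like USER_NAME/PROJECT_NAME")
    # Same command as in the article, started in the current working directory.
    return subprocess.run(["neptune", "mlflow"], env=env).returncode
```

Calling sync_mlruns() with no arguments uses your current shell environment, so a missing token or a malformed project name is reported before the command starts.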

Collaborate with Neptune 

Your experiment metadata should now be stored in Neptune, and you can view it in your experiment dashboard:

Neptune MLflow collaborate

You can customize the dashboard by adding tags and grouping experiments with custom filters. 

Neptune lets you share ML experiments simply by sending a link. You can share:

  • MLflow experiment links
  • MLflow experiment charts
  • MLflow groups of experiments
  • MLflow diagnosis plots

Neptune also comes with workspaces, a central hub where you can manage projects, users, and subscriptions; there are individual and team workspaces. 

In the team workspace, team members can browse the content that’s related to their assigned role. You can assign various roles in projects and workspaces. In a team workspace, you can invite people either as admin or member, each with different privileges. 

Workspace settings can be changed by clicking the workspace name on the top bar:

Neptune workspaces

Under the Overview, Projects, People and Subscription tabs, you can see workspace settings:

Neptune workspace settings

There are three roles in a project: owner, contributor, and viewer. Depending on the role, users can run experiments, create notebooks, modify previously stored data, etc.

For more details, see User Management.

Learning more about Neptune

As you can see, MLflow and Neptune aren’t mutually exclusive. You can keep your favorite features from MLflow, while using Neptune as a central place for managing your experiments and collaborating on them with your team.

If you want to learn more about Neptune, check out the official documentation. If you want to try it out, create your account and start tracking your machine learning experiments with Neptune.


NEXT STEPS

How to get started with Neptune in 5 minutes

1. Create a free account
2. Install Neptune client library
pip install neptune-client
3. Add logging to your script
import neptune.new as neptune

run = neptune.init('Me/MyProject')
run['params'] = {'lr':0.1, 'dropout':0.4}
run['test_accuracy'] = 0.84
