Setting up a good tool stack for your Machine Learning team is important if you want to work efficiently and focus on delivering results. If you work at a startup, you know that setting up an environment that can grow with your team, the needs of your users, and the rapidly evolving ML landscape is especially important.
So we wondered: “What are the best tools, libraries, and frameworks that ML startups use to tackle this challenge?”
To answer that question, we asked 41 Machine Learning startups from all over the world.
We got a ton of great advice, which we grouped into:
- Software development setup
- Machine Learning frameworks
- Unexpected 🙂
Read on to figure out what will work for your machine learning team.
Good methodology is the key
Tools are only as strong as the methodology that employs them.
If you run around training models on some randomly acquired data and deploy whatever model you can get your hands on, sooner or later there will be trouble 🙂
Kai Mildenberger from psyML says that:
“To us, the careful versioning of all the training and testing data is probably the most essential tool/methodology. We expect that to remain one of the most key elements in our toolbox, even as all of the techniques and mathematical models iterate forever. A second aspect might be to be extremely hypothesis driven. We use that as the single most important methodology to develop models.”
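To make the idea of versioning training and testing data a bit more concrete, here is a minimal sketch that fingerprints a data directory by hashing its files. It is not psyML's method or any particular tool's API; the function names (`file_sha256`, `dataset_manifest`, `dataset_version`) are made up for illustration, and it uses only the Python standard library.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dataset_manifest(data_dir: Path) -> dict:
    """Map each file's path (relative to data_dir) to its content hash."""
    return {
        path.relative_to(data_dir).as_posix(): file_sha256(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def dataset_version(manifest: dict) -> str:
    """Collapse the manifest into one short ID that changes when any file changes."""
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]
```

Storing the resulting ID next to every trained model is one simple way to answer "which data was this trained on?" later.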
I think having a strong understanding of what you want to use your tools for (and that you actually need them) is the very first step.
That said, it is important to know what is out there and what people in similar situations use successfully.
Let’s dive right into that!
Software development tooling is the backbone of ML teams
The development environment is the foundation of every team’s workflow. So it was very interesting to learn which tools companies around the world consider the best in this area.
“Jupyter Notebook is very useful for quick experiments and visualization, especially when exchanging ideas between multiple team members. Because we use Tensorflow, Google Colab is a natural extension to share our code more easily.” – says Wenxi Chen from Juji.
Various flavours of Jupyter have been mentioned as well. Deepnote (a hosted Jupyter Notebook solution) is “loved for their ML stuff” by the team of Intersect Labs while Google Colab “is a natural extension to share our code more easily” for the Juji team.
Others choose more standard software development IDEs. Among those, PyCharm, touted by Or Izchak from Hotelmize as “the best Python IDE”, and Visual Studio Code, used by Scanta for its “ease of connectivity with Azure and many ML-based extensions provided”, were mentioned the most.
For teams that use the R language, like SimpleReport, RStudio was the clear winner when it comes to the IDE of choice. As Kenton White from Advanced Symbolics mentions:
“We mostly use R + RStudio for analysis and model building. The workhorse for our AI modeling is VARX for time series forecasts.”
When it comes to code versioning, GitHub is a clear favourite. As Daniel Hanchen from Umbra AI mentions:
“Github (now free for all teams!!) with its super robust version control system and easy repository sharing functionality is super useful for most ML teams.”
As for the environment/infrastructure setup notable mentions from ML startups are:
- “AWS as the platform for deployment” (Simple Report)
- “Anaconda serves as our goto tool for running ML experiments due to its *live code* feature wherein it can be used to combine software code, computational output, explanatory text, and multimedia resources in a single document.” (Scanta)
- “Redis dominates as an in-memory data structure store due to its support for different kinds of abstract data structures, such as strings, lists, maps, sets, sorted sets, HyperLogLogs, bitmaps, streams, and spatial indexes.” (Scanta)
- “Snowflake and Amazon S3 for data storage.” (Hypergiant)
- “Spark-pyspark – very simple api for distributing job to work on big data.” (Hotelmize)
Sooo many Machine Learning Frameworks
An integrated development environment is crucial, but you need a good ML framework on top of it to turn the vision into a project. The range of tools pointed out by the startups is quite diverse here.
For playing with tabular data, Pandas was mentioned the most.
An additional benefit of using Pandas, mentioned by Nemo D’Qrill, the CEO of Sigma Polaris, is:
“I’d say that Pandas is probably one of the most valuable tools, in particular when working in collaboration with external developers on various projects. Having all data files in the form of data frames, across teams and individual developers, makes for a much smoother collaboration and unnecessary hassle.”
Plotly was also a common choice, as developers from Wordnerds explain, “for great visualisations to make data understandable and look good”. Dash, a tool for building interactive dashboards on top of Plotly charts, was recommended by Theodoros Giannakopoulos from Behavioral Signals for ML teams that need to present their analytical results in a nice, user-friendly manner.
“It is one of the most popular toolkits used by machine learning researchers, engineers, and developers. The ease with which you can get what you want is amazing! From feature engineering to interpretability, scikit-learn provides you with every functionality.”
Truth be told, Pandas and scikit-learn are really the workhorses of ML teams all over the world.
As Michael Phillips, Data Scientist from Numerai says:
“Modern Python libraries like Pandas and Scikit-learn have 99% of the tools that an ML team needs to excel. Though simple, these tools have extraordinary power in the hands of an experienced data scientist”
In my opinion, while this may be true for ML teams in general, in the case of ML startups a lot of work goes into state-of-the-art methods, which usually means deep learning models.
When it comes to general deep learning frameworks we had many different opinions.
The team of ML experts from iSchoolConnect tells us why so many ML practitioners and researchers choose PyTorch.
“If you want to go deep into the waters, PyTorch is the right tool for you! Initially, it will take time to get accustomed to it but once you get comfortable with it there is nothing like it! The library is even optimized for quickly training and evaluating your ML-models.”
But it is still TensorFlow and Keras that lead in popularity.
Most teams, like Strayos and Repetere, choose them as their ML development frameworks. Cedar Milazzo from Trustium said:
“Tensorflow, of course. Especially with 2.0! Eager execution was what TF really needed and now it’s here. I should note that when I say “tensorflow” I mean “tensorflow + keras” since keras is now built into TF”.
It’s also important to mention that you don’t have to choose one framework and exclude others.
For example, Melodia’s Founder, Omid Aryan said that:
“The tools that have been most beneficial to us are TensorFlow, PyTorch, and Python’s old scikit-learn tools.”
There are some popular frameworks for more specialized applications.
In Natural Language Processing we’ve heard:
- “Huggingface: it’s the most advanced and highest performance NLP library ever created. It’s the first of its kind in that researchers are directly contributing to a highly scalable NLP library. It separates itself from other similar tools by having production level tools available a few months after a newer model is published” says Ben Lamm, the CEO of Hypergiant.
- “Spacy is a very cool natural language toolkit. NLTK is by far the most popular and I certainly use it, but spacy does lots of things NLTK can’t do so well, such as stemming and dependency parsing.” mentions Cedar Milazzo, the CEO of Trustium
- “Gensim is good for word vectors and document vectors too, and I believe it isn’t so popular.” adds Cedar Milazzo.
In Computer Vision:
Also it’s worth noting that not every team is implementing deep learning models themselves.
As Iuliia Gribanova and Lance Seidman from Munchron say, there are now API services where you can outsource some (or all) of the work:
“Google ML kit is currently one of the best easy-to-entry tools that lets mobile developers easily embed ML API services like face recognition, image labeling, and other items that Google offers into an Android or iOS App. But additionally, you can also bring in your own TF (TensorFlow) lite models to run experiments and then bring them into production using Google’s ML Kit.”
I think it’s important to mention that you can’t always choose the latest and greatest libraries. Often the tool stack gets handed to you when you join the team.
As Naureen Mahmood from Meshcapade shared:
“In the past, some important autodiff libraries that have made it possible for us to run multiple joint optimizations, and in doing so helped us build some of the core tech we still use today, are Chumpy & OpenDR. Now there are fancier and faster ones out there, like Pytorch and TensorFlow.”
When it comes to model deployment Patricia Thaine from Private AI mentions “tflite, flask, tfjs and coreml” as their frameworks of choice. She also suggests that visualizing models is very important to them and they are using Netron for that.
But there are tools that go beyond frameworks that can help ML teams deliver real value quickly.
This is where MLOps comes in.
MLOps is becoming more important for machine learning startups
You may be wondering what MLOps is or why you should care.
The term alludes to DevOps and describes tools used for operationalization of machine learning activities.
Jean-Christophe Petkovich CTO at Acerta provided us with an extremely thorough explanation of how their ML team approaches MLOps. It was so good that I decided to share it (almost) in full:
“I think most of the interesting tools that are going to see broader adoption in 2020 are centered around MLOps. There was a big push to build those tools last year, and this year we’re going to find out who the winners will be.
For me, MLflow seems to be in the lead for tracking experiments, artifacts, and outcomes. A lot of what we’ve built internally for this purpose are extensions to the functionality of MLflow to incorporate more data tracking similar to how DVC tracks data.
The other big names in MLOps are Kubeflow, Airflow and TFX with Apache Beam—all tools designed for capturing data science workflows and pipelines end-to-end.
There are several ingredients for a complete MLOps system:
- You need to be able to build model artifacts that contain all the information needed to preprocess your data and generate a result.
- Once you can build model artifacts, you have to be able to track the code that builds them, and the data they were trained and tested on.
- You need to keep track of how all three of these things, the models, their code, and their data, are related.
- Once you can track all these things, you can also mark them ready for staging, and production, and run them through a CI/CD process.
- Finally, to actually deploy them at the end of that process, you need some way to spin up a service based on that model artifact.
When it comes to tracking, MLflow is our pick, it’s tried-and true at Acerta, as several of our employees already used it as part of their personal workflows, and now it’s the de facto tracking tool for our data scientists.
For tracking data pipelines or workflows themselves, we are currently developing against Kubeflow since we’re already on Kubernetes making deployment a breeze, and our internal model pipelining infrastructure meshes well with the Kubeflow component concept.
On top of all of this MLOps development, there’s a shift toward building feature stores—basically specialized data lakes for storing preprocessed data in various forms—but I haven’t seen any serious contenders that really stand out yet.
These are all tools that need to be in place—I know a lot of places are doing their own home-baked solutions to this problem, but I think this year we’re going to see a lot more standardization around machine learning applications.”
Emily Kruger from Kaskada, which coincidentally is a startup building a feature store solution 🙂, adds:
“The most useful tools from our perspective are feature stores, automated deployment pipelines, and experimentation platforms. All these tools address challenges with MLOps, which is an important emerging space for data teams, especially those running ML models in production and at scale.”
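The ingredients Jean-Christophe lists above (a model artifact that bundles preprocessing with the model, lineage back to code and data, and a stage marker for CI/CD) can be sketched in a few lines. This is only an illustration under my own assumptions, not how Acerta, MLflow, or Kubeflow implement it: `ModelArtifact`, `promote`, and the stage names are hypothetical, and the "model" is a toy linear model so the example stays dependency-free.

```python
import json
from dataclasses import asdict, dataclass, replace
from pathlib import Path

# Hypothetical stage names for illustration.
STAGES = ("development", "staging", "production")


@dataclass
class ModelArtifact:
    # Preprocessing and model parameters travel together in one file.
    feature_means: list
    feature_scales: list
    weights: list
    bias: float
    # Lineage: which code and which data produced this model.
    code_version: str
    data_version: str
    stage: str = "development"

    def predict(self, features: list) -> float:
        """Apply the stored preprocessing, then the stored linear model."""
        scaled = [
            (x - mean) / scale
            for x, mean, scale in zip(features, self.feature_means, self.feature_scales)
        ]
        return sum(w * x for w, x in zip(self.weights, scaled)) + self.bias

    def save(self, path: Path) -> None:
        path.write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def load(cls, path: Path) -> "ModelArtifact":
        return cls(**json.loads(path.read_text()))


def promote(artifact: ModelArtifact) -> ModelArtifact:
    """Return a copy of the artifact moved to the next stage."""
    index = STAGES.index(artifact.stage)
    if index == len(STAGES) - 1:
        raise ValueError("artifact is already in production")
    return replace(artifact, stage=STAGES[index + 1])
```

Because the code and data versions live inside the artifact, the relationship between model, code, and data survives wherever the file ends up.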
The Best MLOps Tools
OK, so in light of this, what are other teams using to solve those problems?
Some teams prefer end-to-end platforms, others create everything in-house. Many teams are somewhere in between with a mix of some specific tools and home-grown solutions.
In terms of larger platforms, two names that were mentioned often were:
- Amazon SageMaker, which according to the ML team from VCV “has a variety of tools for distributed collaboration” and which SimpleReport chooses as their platform for deployment.
- Azure, which, as the Scanta team tells us, “serves as a way to build, train, and deploy our Machine Learning applications as well as it helps in adding intelligence in our applications via their Language, Vision, and Speech recognition support. Azure has been our choice of IaaS due to rapid deployments and low-cost Virtual Machines.”
When it comes to experiment tracking tools, we see ML startups use various options:
- Strayos uses Comet ML “for model collaboration and results sharing”.
- Hotelmize and others are going with tensorboard which “is the best tool to visualize your model behavior, specially for neural network models.”
- “MLflow seems to be in the lead for tracking experiments, artifacts, and outcomes.” as Jean-Christophe Petkovich CTO at Acerta mentioned before
- Other teams like Repetere try to keep it simple and say that “Our tooling is very simple, we use tensorflow and s3 to version model artifacts for analysis”.
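If you have never used an experiment tracker, the core idea is small: record the hyperparameters and metrics of every run somewhere you can query later. Here is a bare-bones sketch of that idea using a JSON-lines file. It does not reproduce the API of MLflow, Comet, or TensorBoard; `log_run` and `best_run` are hypothetical helpers written for this example.

```python
import json
import time
import uuid
from pathlib import Path


def log_run(log_path: Path, params: dict, metrics: dict) -> str:
    """Append one run (hyperparameters + metrics) as a JSON line and return its ID."""
    run_id = uuid.uuid4().hex[:8]
    record = {
        "run_id": run_id,
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
    }
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return run_id


def best_run(log_path: Path, metric: str, higher_is_better: bool = True) -> dict:
    """Scan the log and return the run with the best value for `metric`."""
    lines = log_path.read_text(encoding="utf-8").splitlines()
    runs = [json.loads(line) for line in lines if line]
    pick = max if higher_is_better else min
    return pick(runs, key=lambda run: run["metrics"][metric])
```

Dedicated tools add a lot on top of this (UI, artifact storage, collaboration), but a log like this is already enough to stop losing track of what you tried.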
Typically, experiment tracking tools keep track of metrics and hyperparameters but as James Kaplan from MeetKai points out:
“The most useful types of ML tools for us are anything that helps with dealing with model regressions caused by everything except the model architecture. Most of these are tools we have built ourselves, but I assume there are many existing options out there. We like to look at confusion matrices that can be visually diff’d under scenarios such as:
– new data added to the training set (and the providence of said data)
– quantization configurations
We have found that being able to track performance across new data additions is far more important than being able to just track performance across hyper parameters of the model itself. This is especially so when datasets grow/change far faster than model configurations”
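To illustrate what “diffing” confusion matrices between two scenarios could look like, here is a small sketch. MeetKai built their own tooling, which is not shown in the quote, so this is purely my assumption of the general idea: count (true label, predicted label) pairs for two model versions evaluated on the same test set and report the cells that changed. The function names are hypothetical.

```python
from collections import Counter


def confusion_counts(y_true, y_pred) -> Counter:
    """Count (true label, predicted label) pairs."""
    return Counter(zip(y_true, y_pred))


def confusion_diff(before: Counter, after: Counter) -> dict:
    """Per-cell change in counts between two evaluations of the same test set.

    Only cells whose count changed are returned. A positive value means the
    cell gained examples in `after`, a negative value means it lost some.
    """
    cells = set(before) | set(after)
    return {
        cell: after[cell] - before[cell]
        for cell in sorted(cells)
        if after[cell] != before[cell]
    }
```

For example, calling `confusion_diff` on the counts from a model trained before and after a new batch of data was added shows exactly which classes started (or stopped) being confused with each other, even if the overall accuracy barely moved.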
Speaking of pruning/distillation Malte Pietsch, Co-Founder of deepset explains that:
“We see an increasing need for tools that help us profile & optimize models in terms of speed and hardware utilization. With the growing size of NLP models, it becomes increasingly important to make training and inference more efficient.
While we are still looking for the ideal tooling here, we found pytest-benchmark, NVIDIA’s Nsight Systems and kernprof quite helpful.”
Experimenting with models is undoubtedly very important but putting models in front of end-users is where the magic happens (for most of us). On that front Rosa Lin from Tolstoy mentioned using streamlit.io which is a “great tool for building ML model web apps easily.”
A valuable word of warning when it comes to using ML-focused solutions comes from Gianvito Pio, Co-Founder of Sensitrust:
“There are also tools like KNIME and Orange that allow you to design an entire pipeline in a drag-and-drop fashion, as well as AutoML tools (see AutoWEKA, auto-sklearn and JADBio) that will automatically select the most appropriate model for a specific task.
However, in my opinion, a strong expertise in the Machine Learning and AI areas are still necessary. Even the “best, automated” tool can be misused, without a good background in the field.”
Ok, when I started working on this, some answers like PyTorch, Pandas or Jupyter Lab were what I expected.
But one answer we received was really out-of-the-box.
It put all the other things in perspective and made me think that perhaps we should take a step back and take a look at the larger picture.
Christopher Penn from Trust Insights suggested that ML teams should use a rather interesting “tool”:
“Wetware – the hardware and software combination that sits between your ears – is the most important, most useful, most powerful machine learning tool you have.
Far, FAR too many people are hoping AI is a magic wand that solves everything with little to no human input. The reverse is true; AI requires more management and scrutiny than ever, because we lack so much visibility into complex models.
Interpretability and explainability are the greatest challenges we face right now, in the wake of massive scandals about bias and discrimination. And AI vendors make this worse by focusing on post hoc explanations of models instead of building the expensive but worthwhile interpretations and checkpoints into models.
So, wetware – the human in the loop – is the most useful tool in 2020 and for the foreseeable future.”
Since we are building tools for ML teams and some of our customers are AI startups I think it makes sense to give you our perspective.
So we see:
- A lot of teams use the Jupyter ecosystem for exploration and PyCharm/VSCode for development
- For deep learning, people are using everything: TensorFlow, Keras, and PyTorch. Notably, we see more and more people using high-level PyTorch training libraries like Lightning, Ignite, Catalyst, fastai and Skorch
- For visual exploration people are using matplotlib, plotly, altair and hiplot (hyperparameter visualizations)
- For running hyperparameter sweeps and general run orchestration some teams like YNAP choose AWS SageMaker.
- For experiment tracking we see open-source packages like TensorBoard, MLflow and Sacred (Neptune integrates with all of them)
… and since those are our customers naturally they use neptune-notebooks for tracking explorations in jupyter notebooks and neptune for experiment tracking and organization of their machine learning projects.
Huge thanks to all the Machine Learning teams who took part in this roundup. Sharing your knowledge and experience brings an incredible value to the community!
Startups that contributed to this post:
Acerta is a software company that delivers machine learning technology to the manufacturing and automotive industries.
A market research firm that uses AI to analyze social media to monitor brand health while predicting trends and consumer behavior.
Analance is a self-serve advanced analytics platform that allows users to prepare, manage, model, and visualize data from a single platform.
Developing Emotionally Intelligent Conversations with AI. Oliver API is the fastest evolving robust emotion AI engine.
Machine learning agency with a focus on deep learning based Natural Language Processing (NLP).
Developing open-source tools and providing R&D services to help customers automate their complex AI, ML and quantum R&D.
A Big Data and Machine Learning platform that empowers CPG, Retail and ecommerce using predictions. It analyzes audience behavior patterns in real time through large-scale big data.
GenRocket is the Technology leader in Data Generation for Software Testing and Machine Learning.
Hazy generates smart synthetic data that’s safe to use and actually works as a drop in replacement for real data science and analytics workloads.
Hotelmize is a platform designed to improve the pricing and booking process of hotel rooms using big data analysis.
We are leaders in Machine Intelligence – with an impressive record of tailoring solutions and products for clients in fields ranging from oil drilling and fluid dynamics to satellite imaging, defense, and security.
Cognitive Computing using Artificial General Intelligence.
Intersect Labs offers services that enable intelligent decisions involving machine learning from spreadsheet data in 3 clicks.
Artificial Intelligence, international student admission, international universities and colleges.
The easiest way to build AI chatbots – simpler than making PowerPoint. Juji is the #1 chatbot platform for DIY AI chatbots.
Kaskada helps organizations make better predictions and drive more impact from machine learning by increasing speed of innovation and computing features in real time.
SaaS, Mobile App, Medtech, Biotech, Healthcare, IoT, AI, ML, Big Data.
Luden.io is an independent game developer focused on meaningful and educational games.
MeetKai is a voice-operated virtual assistant that makes your life easier through conversation, personalization, and curation.
Melodia is a smart music streaming platform, offering a simpler way to play and explore music.
SaaS, State of the art 3D body models. Automatically convert body scans, mocap, or measurements to rigged meshes.
Early Disease and Illness Detection using Machine Learning & A.I for Physicians & Laboratories.
Numerai transforms and regularizes financial data into machine learning problems for the global network of data scientists.
We want you to stop wasting time building reports, so we built a platform that automates report building while respecting your data privacy.
Empowering privacy-preserving software development.
psyML combines modern psychology, advanced psychometrics, machine learning and AI to enable unprecedented understanding of human behavior.
Repetere generates automated sales and product mix forecasts with machine learning and artificial intelligence.
SavantX uses analytics to bring order and understanding to all types of data.
Protecting machine learning algorithms and the businesses that use them.
Real-time software for intelligent traffic signals.
Platform where customers and professionals get in touch, make deals and design new projects. Every phase exploits blockchain technologies, managed by Smart Contracts, and supported by Artificial Intelligence.
Candidate assessment and shortlisting tech. Automates employment processes to increase efficiency, diversity and quality of hire.
SimpleReport is a safety reporting and analytics tool for companies that make OHS their priority.
Strayos is a 3D aerial intelligence platform for Mining and Quarry blasting operations to reduce cost.
Machine learning tools for your text.
Trust Insights helps companies light up dark data and help you take action through analysis and insight.
SaaS B2B platform for determining Credibility and protecting Brand image.
Simulations, Artificial Intelligence, Data Search Engine, Question and Answer, Whole Earth Modelling.
VCV is an AI-powered robot recruiter.
A text analysis and insight platform, which combines cutting-edge artificial intelligence (AI) and old-school linguistics allows computers to read – and genuinely understand – what people actually mean, not just count the words they use.
Real-time video surveillance software monitored by AI 24/7.