According to a report by WaveStone, over 90% of the leading companies now have ongoing investments in Artificial Intelligence – a significant yet not-so-surprising result reflecting the blooming of the machine learning era. Its applications rapidly expand due to its data-agnostic nature (e.g., a fully-connected network can be used for weather forecasting just as well as house price prediction). Following the success of conglomerates, numerous startups have also emerged providing modern AI solutions for conventional problems.
However, just like any other technology, successful adoption requires organized collaboration followed by well-planned execution. As machine learning is relatively niche and drastically different from many other computer science domains, companies, especially startups, can struggle to find the best strategy for incorporating machine learning into daily production. Meetings thus become a crucial factor in determining the final outcome of an ML project.
This article dives into a few small techniques that can boost the effectiveness of your ML meetings. In particular, we provide tips for meetings at different phases of an ML project to keep the entire journey smooth.
The article focuses particularly on larger-scale ML models (e.g., deep learning models) instead of conventional smaller models (e.g., linear/logistic regression), as larger models raise more issues that need to be managed for meetings to be effective. To give this a more specific perspective, we first provide an overview of how a deep learning model works.
Deep learning is a branch of machine learning that uses a particular type of model termed a neural network. Neural networks are loosely inspired by the structure of the brain and its many neurons: the activations of the network's neurons combine to produce its final prediction.
To train a neural network, one feeds a large dataset through it, computes the difference between its predictions and the ground truth answers, and “backpropagates” that error to update the network's weights. Backpropagation has to be repeated many times for the model to slowly converge to a near-optimal solution. The processing time increases as more data become available or as the complexity of the model increases.
For this reason, organizing meetings in a timely manner and avoiding wasted time becomes particularly important.
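The training cycle described above (predict, compare against the ground truth, update, repeat) can be illustrated with a deliberately tiny sketch. It assumes a single linear unit in place of a real network, and the gradients are written out by hand rather than computed by backpropagation through layers; the data and the `train` function are made up for illustration.

```python
# Toy data generated from y = 2x + 1; these targets play the role of ground truth.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def train(xs, ys, epochs=1000, lr=0.05):
    """Fit y = w * x + b by repeating forward pass, loss gradient, and update."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Forward pass: predictions with the current parameters.
        preds = [w * x + b for x in xs]
        # Errors against the ground truth (the loss is their mean square).
        errors = [p - y for p, y in zip(preds, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Update step: move the parameters against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")
```

Even this toy needs hundreds of passes over five data points to converge; a real network repeats the same cycle over millions of weights and samples, which is where the long waiting times come from.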
Prior stage – before the implementation of the ML project
When proposing an ML project, no matter what benefits the project would ultimately bring to the company, one should first consider the feasibility of the project from a real-world perspective. The following are important points to address (some prior to and some during the meeting):
1. Are there any datasets available?
Prior to entering a meeting to discuss your project, you should have a fair understanding of what datasets are potentially applicable to your task. This is particularly important when your task is supervised, that is, when ground truth labels are required to compute the loss against your predictions.
Data collection is both labor- and time-intensive, so it would be inefficient to jump into an ML project and start designing architectures before confirming whether there is even sufficient data to perform the training.
How to perform a rough check?
- Public datasets are often available on Kaggle/publications. The first step would be to scan through and see if there are any datasets identical or similar to the application you are aiming towards.
- If the dataset is somewhat similar yet not fully identical, it might be necessary for your team to collect some additional data and add it to your training set. Note that this can take a substantial amount of time depending on the size of your team.
- Finally, if none of the datasets is similar, it is best to raise this issue at the start of the series of meetings so that the whole team is aware of it.
2. Research state-of-the-art architectures before any implementation & make your meeting a sharing session of approaches
- While libraries such as PyTorch and TensorFlow have already made implementing neural networks relatively straightforward, it is important to gather and analyze prior art instead of building an entirely new network from scratch. ML is a fast-moving community with thousands of papers (and often accompanying code) published per conference. This pace means that adopting a published method can greatly shorten your design stage and leave more time for testing.
- One effective way to research is to skim related work at top ML venues such as ICLR, ICML, NeurIPS, and AAAI, and at domain-specific conferences such as CVPR/ICCV for computer vision and ACL/EMNLP for natural language processing. While this task may seem time-consuming, knowing the correct tool to use can dramatically reduce the implementation time of projects.
- Looking up these architectures beforehand and making your meeting a sharing session is also an effective way to bring every teammate up to speed. In fact, such meetings are already widely adopted in research groups across academia. Many papers involve somewhat complex mathematical derivations, so a presentation by one team member can greatly improve the whole group's understanding.
3. Is the computation feasible?
After deciding on the project and method, it is also important to address the feasibility of the project at the beginning stage. Feasibility extends beyond “is training this model possible” to “can we afford to train the model”. This is a critical issue in modern deep learning; large transformer models can require multiple high-end GPUs just for inference! Below we describe the memory and time limitations worth discussing in your early ML meetings.
While the rule of thumb is that a more complex model generally performs better in more complex scenarios, there is often a tradeoff between accuracy and memory requirements (e.g., a vision transformer (ViT) generally performs better than a ResNet but is also much larger). It is therefore important to dive into the method you are considering and cross-check the paper for any implementation details regarding the GPUs used. Often it is viable (and wise) to sacrifice some accuracy to get production rolling.
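To make the memory discussion concrete before the meeting, a back-of-the-envelope estimate helps. The sketch below assumes 32-bit floats and an Adam-style optimizer that keeps two extra values per parameter; it counts only weights, gradients, and optimizer states, so activations (which often dominate) come on top. The function name and the parameter counts are illustrative approximations, not figures from any specific paper.

```python
def training_memory_gib(num_params, bytes_per_value=4, optimizer_states=2):
    """Rough lower bound on training memory for the model state, in GiB.

    Counts weights, gradients, and optimizer states (two per parameter for
    Adam-style optimizers). Activations are NOT included.
    """
    copies = 1 + 1 + optimizer_states  # weights + gradients + optimizer states
    return num_params * bytes_per_value * copies / 1024**3


# Approximate parameter counts; check the paper or model card for exact values.
candidates = {"ResNet-50": 25.6e6, "ViT-B/16": 86e6}
for name, params in candidates.items():
    print(f"{name}: ~{training_memory_gib(params):.2f} GiB before activations")
```

A number like this is only a floor, but it is usually enough to tell whether a candidate model is in the right range for the GPUs the team actually has.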
Another drawback of deep learning models is their training time. Unlike academia, where a longer training time may be acceptable, startups have a deadline for putting the model into production. Approaches notable for their long training time, such as unsupervised pre-training, may not be viable under these circumstances. It is also important to consider training time and computational resources together: a 30-minute TPU training run could mean 2 weeks on an Nvidia 1080Ti!
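A similar rough calculation works for training time: time a few dozen steps on the hardware you actually have, then extrapolate. The sketch below is a minimal version of that arithmetic; the function name and every number in the example call are hypothetical placeholders.

```python
import math


def estimate_training_hours(num_samples, batch_size, epochs, seconds_per_step):
    """Extrapolate total training time from a measured per-step time."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    total_seconds = steps_per_epoch * epochs * seconds_per_step
    return total_seconds / 3600


# Hypothetical numbers: measure seconds_per_step on your own hardware first.
hours = estimate_training_hours(
    num_samples=1_000_000, batch_size=256, epochs=90, seconds_per_step=0.5
)
print(f"Estimated training time: {hours:.1f} hours")
```

Bringing an estimate like this to the meeting turns "is it feasible?" into a concrete comparison against the project deadline.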
One solution is to find and use pre-trained checkpoints to save training costs. Checkpoints trained on large-scale datasets (e.g., ImageNet) for popular network architectures such as ViT, ResNet, and BERT are often available. Training from a pre-trained checkpoint usually leads to faster convergence even when the task is slightly different.
On the other hand, using pre-trained models usually means less flexibility in choosing the model architecture, and fine-tuning is still an expensive process if the targeted domain is very different.
Intermediate stage – during the implementation of the ML project
Unlike other algorithms whose accuracy can be validated quickly, deep learning algorithms go through cycles of training and validation to understand what can be further improved. In this section, we describe several tips that can speed up your meetings and the implementation despite the time spent waiting on training.
4. Set up a meeting to ensure that all environments are consistent
Depending on the task at hand, a deep learning model could comprise multiple modules. It is, therefore, crucial to ensure that all sides of development are done in a consistent environment with the same libraries so that the models don’t crash when combined. A meeting is a good way to set up every contributor’s environment in that sense. This also allows people to quickly use another person’s code/checkpoints for training/fine-tuning and significantly reduces the need for further meetings.
Some tips to do so include:
- Deciding on PyTorch or TensorFlow, and ensuring that everyone uses the same version of the library. If you are using more advanced packages such as PyTorch Lightning, carefully consider the difficulty it may create when packaging all modules together into one big model.
- Using the same single- or multi-GPU setting. Code written for multi-GPU training might not work on a single-GPU machine and vice versa. The memory requirement when combining more modules also has to be carefully considered. Agree on the data parallelism or distributed training strategy up front to avoid extra meetings just to sort it out and rework the code.
- Keeping detailed notes on the libraries and versions required to run a particular module. In other words, keep a README if others may need to install and run your code to build on it.
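The version agreement from the tips above can also be checked automatically rather than by eye. The following is a minimal sketch using Python's standard `importlib.metadata`; the package names and pinned versions in `AGREED_VERSIONS` are hypothetical and stand in for whatever your team decides on.

```python
from importlib import metadata

# Hypothetical pins agreed on in the environment meeting.
AGREED_VERSIONS = {"torch": "2.1.0", "numpy": "1.26.4"}


def find_mismatches(agreed, get_version=metadata.version):
    """Return {package: (expected, installed)} for every package that differs."""
    mismatches = {}
    for package, expected in agreed.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


for package, (expected, installed) in find_mismatches(AGREED_VERSIONS).items():
    print(f"{package}: expected {expected}, found {installed or 'not installed'}")
```

Each contributor can run a script like this before sharing code or checkpoints; an empty output means the environment matches the agreed pins.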
5. Investigate the tasks that can be done concurrently and make the task distributions clear
Frankly, while distributing the job to different roles is good practice for any coding project, it is particularly important for an ML project because the processing and training cycles in the experimental stage can take days or even weeks. If someone fails to recognize this and duplicates work, it is a dramatic waste of time.
Many of the tasks, such as data processing, model architecture design, and metric evaluation implementation, can be done concurrently to reduce the overhead of deep learning model training. Communication is key in this stage to prevent redoing the same task. Hence, deciding in a meeting what can be done concurrently and distributing tasks to different team members can also save a lot of time later on.
6. Have a single place to store all the information about your projects
If two people are working on the same network architecture but testing different hyperparameters, make sure to keep a shared note of the settings already run and the settings still to be run. This can significantly reduce the time required to update each other and gives each team member a rough idea of the current progress and what still needs to be done.
There are many tools in the MLOps space that can help with that. Platforms like Neptune.ai, Weights & Biases, or Comet can be a good source of truth for the whole team. They let you keep track of all metadata and experiments, register models, compare them and collaborate with other people.
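Independently of any particular platform, the core idea of a shared record of settings can be sketched in plain Python. This is only an in-memory illustration under our own assumptions: `RunLog`, `config_key`, the owner names, and the metric value are all hypothetical, and a real team would back this with one of the tracking tools above or a shared file.

```python
import hashlib
import json


def config_key(config):
    """Stable fingerprint of a hyperparameter setting (key order does not matter)."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


class RunLog:
    """In-memory stand-in for the team's shared record of settings already run."""

    def __init__(self):
        self.runs = {}

    def claim(self, config, owner):
        """Register a setting; return False if it was already run or claimed."""
        key = config_key(config)
        if key in self.runs:
            return False
        self.runs[key] = {"config": config, "owner": owner, "metrics": None}
        return True

    def report(self, config, metrics):
        """Attach results to a previously claimed setting."""
        self.runs[config_key(config)]["metrics"] = metrics


log = RunLog()
first = log.claim({"lr": 1e-3, "batch_size": 64}, owner="alice")
duplicate = log.claim({"batch_size": 64, "lr": 1e-3}, owner="bob")  # same setting
log.report({"lr": 1e-3, "batch_size": 64}, {"val_acc": 0.91})  # placeholder metric
print(first, duplicate)
```

The second claim is rejected because it describes the same setting, which is exactly the duplicate training run the shared note is meant to prevent.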
Diving a bit deeper into Neptune.ai and how it is useful in the context of ML team meetings:
- It stores individual runs with the metadata in clear tables. You may simply add columns of hyperparameters you would like to see when evaluating the results and avoid repeatedly running the same settings just to confirm the results.
- You can easily organize, filter, and search through the runs table or models table and discuss specific experiments or models during the meeting.
- All the historic data is stored in the app, so you can come back to the runs your team did weeks ago.
- You can also create custom dashboards ahead of the meeting to have all the necessary charts and tables ready for discussion.
- Finally, with tools like these, the team can discuss checkpoint results and share checkpoints directly, which eliminates excessive meetings and makes the project pipeline more efficient. Each person can build on slight improvements from others and incorporate them into their own testing.
7. Constantly keep an eye on new architectures/data augmentations in top venues. Perform rigorous testing on multiple test cases.
Implementation and testing of neural network models can take months, during which new architectures and code can rapidly come out to tackle particular aspects of the problem you are trying to solve. Thus, a weekly or monthly check on new techniques, whether architectures or data augmentations, should be conducted and presented in a meeting. In this way, a method that is potentially helpful and orthogonal to the team's current implementation can be rapidly incorporated into the production cycle, leading to faster development.
One important idea to keep in mind is that many architectures and augmentation schemes that work great on academic datasets may not work as well in real-world testing. Therefore, even though not all datasets or real-world scenarios are available at this stage, one must test carefully and rigorously in different cases and try to mimic as many of the “accidents” that may take place as possible.
8. Don’t call a meeting without results
A data science team meeting should only happen to discuss tangible results
This is perhaps the most important tip in making your ML meetings effective. Unlike conventional software or algorithms, evaluation and visualization of a deep learning model take a significantly long time to run, not to mention the long training cycles during which one can do nothing but quietly wait. It would simply be a waste of time to meet up if training is still in progress or if it takes an hour or two for the evaluation to appear. If the results are just not ready yet, postponing the meeting is better than speculating about a result that can’t be sped up.
On the other hand, because of the aforementioned constraints of ML model training time, setting up spontaneous meetings to address newly found results (both in terms of architecture and hyperparameters) can prevent a lot of dead ends for people working on different settings of similar tasks. In particular, if the team has a designated accuracy threshold to hit, then once it is reached one should quickly set up a meeting to stop further training and move on to other aspects of the project.
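The accuracy-threshold idea can be built into the training loop itself, so that training stops and the team is notified as soon as the target is met. Below is a minimal sketch; `train_until_target`, the `run_epoch` callback, and the accuracy curve are hypothetical stand-ins for your real training code.

```python
def train_until_target(run_epoch, target_accuracy, max_epochs):
    """Run epochs until validation accuracy reaches the target or the budget ends.

    run_epoch is a caller-supplied function: epoch number -> validation accuracy.
    Returns (epochs_run, best_accuracy, target_reached).
    """
    best = 0.0
    for epoch in range(1, max_epochs + 1):
        best = max(best, run_epoch(epoch))
        if best >= target_accuracy:
            return epoch, best, True  # stop early; time to call the meeting
    return max_epochs, best, False


# Placeholder accuracy curve standing in for real training.
fake_curve = [0.62, 0.75, 0.83, 0.88, 0.91, 0.92]
epochs_run, best, reached = train_until_target(
    lambda epoch: fake_curve[epoch - 1], target_accuracy=0.90, max_epochs=6
)
print(epochs_run, best, reached)
```

In practice the `True` branch is where you would trigger a notification to the team instead of letting the remaining epochs burn compute.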
Final stage – deployment
9. Separate work into deployment and continuous improvement and discuss them closely.
After your ML model has hit the pre-set goals in the implementation stage, the final step in making your ML meetings effective is to divide the remaining tasks roughly into three categories:
- Checking the results of domain shift in real-world applications: There is inevitably a shift between the training data and the data seen in deployment, no matter how rigorous and meticulous the training has been. It is helpful for part of the team to quickly test and report the discrepancy between validation and real-world test results. Flagging large discrepancies allows the team to quickly jump back to the intermediate stage and continue the implementation process.
- Deployment to applications: The deployment into applications (e.g., app, platform, website) is also a time-consuming process. One should begin this process early and implement most parts of the application so that plugging in the model is smooth.
- Minor improvements to current models: This means small hyperparameter tweaks that don’t require a drastic model change. These changes can provide a final boost to model performance and can be quickly adopted into production in a plug-and-play manner.
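The first category, checking for domain shift, can start as simply as comparing validation accuracy with accuracy on a small set of labeled real-world samples. The sketch below assumes a classification task; the helper names, the 0.05 tolerance, and the toy predictions and labels are all hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def shift_report(val_acc, field_acc, max_drop=0.05):
    """Compare validation accuracy with accuracy on labeled real-world samples."""
    drop = val_acc - field_acc
    return {"drop": drop, "needs_review": drop > max_drop}


# Toy predictions and labels standing in for real evaluation data.
val_acc = accuracy([1, 0, 1, 1, 0, 1, 0, 1, 1, 0], [1, 0, 1, 1, 0, 1, 0, 1, 0, 0])
field_acc = accuracy([1, 1, 0, 1, 0, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 0, 1, 0, 0, 1])
report = shift_report(val_acc, field_acc)
print(report)
```

A flagged report is the signal to bring the discrepancy to the team and decide whether to return to the intermediate stage.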
One common approach widely adopted in the machine learning and software engineering community is to run the project with CI/CD (continuous integration/continuous delivery), which aims to automate building, testing, and deployment. It allows incremental changes to the current code for further improvements.
In addition, extensive QA should also be run on models to ensure that they work smoothly in the production stage. This allows for faster deployment at the initial stage where there are multiple models to be deployed. Tools like Jenkins may be a great help in executing many of these tasks.
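As one example of an automated QA check that a CI job could run before deployment, the sketch below smoke-tests a classifier's outputs. It assumes the model exposes a `predict` function returning a list of class probabilities; that interface, the function name, and the dummy model are our assumptions for illustration.

```python
import math


def smoke_test(predict, sample_inputs, num_classes):
    """Return a list of failure messages; an empty list means the checks passed."""
    failures = []
    for i, x in enumerate(sample_inputs):
        probs = predict(x)
        if len(probs) != num_classes:
            failures.append(f"input {i}: expected {num_classes} outputs, got {len(probs)}")
        elif any(p < 0 or p > 1 or math.isnan(p) for p in probs):
            failures.append(f"input {i}: probability outside [0, 1]")
        elif not math.isclose(sum(probs), 1.0, abs_tol=1e-6):
            failures.append(f"input {i}: probabilities do not sum to 1")
    return failures


# Dummy model standing in for the real one.
failures = smoke_test(lambda x: [0.2, 0.8], sample_inputs=[1, 2, 3], num_classes=2)
print("OK" if not failures else "\n".join(failures))
```

In a pipeline, the script would exit with a non-zero status when `failures` is non-empty so that the build is marked as failed and the model is not deployed.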
And there you have it! 9 small tips to help you along the way in making your ML meetings more effective. While all projects are different, keeping an eye on new research, distributing tasks, and setting up spontaneous meetings will certainly make your ML meetings a little better and production a lot faster!