4 Free Machine Learning Tools You Must Know (+ 2 That You Probably Never Heard of)

Machine learning is one of the most dynamic fields in technology. Much of its growth in usage and popularity comes from new, more sophisticated and handy tools that support data scientists and engineers in their daily work.

Reading overhyped press articles, one might think that artificial intelligence is an unprecedented novelty. It is not. The concept of the artificial neural network dates back to World War II, when Warren McCulloch and Walter Pitts proposed a computational model of one in 1943. By 1980 there were working models.

Yep – artificial neural networks were operating before anyone could run the Windows operating system on their machine. The first edition arrived five years later (1985 – Blue Screen of Death – nice to meet you!)

So why have we only recently started to enjoy practical applications of ML and neural networks, rather than since the dawn of computing? The first reason is computing power. It gets cheaper every year, and modern washing machines are powered by faster and better chips than the ones used in the spacecraft that took astronauts to the moon.

And the second reason? Tools. Building every model from scratch is a tedious and exhausting job that wears out all but the most resilient. Luckily, today we have many tools to use, many of them free, so enjoy the list below.

1. TensorFlow

TensorFlow is an open-source platform for developing machine learning and deep learning models. It was developed by the Google Brain team and released in 2015.

TensorFlow accepts data in the form of multi-dimensional arrays known as tensors and works with a data-flow graph made up of nodes (operations) and edges (the tensors flowing between them).

Graphs make it possible to execute TensorFlow code in a distributed manner across a cluster of computers, with GPU acceleration.
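To make the "nodes and edges" idea concrete, here is a minimal sketch of a data-flow graph in plain Python. It is a toy illustration of the build-first, run-later pattern, not TensorFlow's API: the `Node` class and the `run` function are hypothetical names invented for this example.

```python
# Toy data-flow graph: nodes are operations, edges carry values between them.
# This illustrates the build-then-run idea only; it is not TensorFlow's API.


class Node:
    def __init__(self, op, inputs=()):
        self.op = op          # a callable, or None for an input placeholder
        self.inputs = inputs  # incoming edges: the nodes this node reads from


def run(node, feed):
    """Evaluate `node`, taking placeholder values from the `feed` dict."""
    if node.op is None:
        return feed[node]
    args = [run(parent, feed) for parent in node.inputs]
    return node.op(*args)


# Step 1: build the graph. Nothing is computed yet.
x = Node(None)
w = Node(None)
b = Node(None)
product = Node(lambda a, c: a * c, (x, w))
output = Node(lambda a, c: a + c, (product, b))

# Step 2: run the same graph with different inputs.
first = run(output, {x: 2.0, w: 3.0, b: 1.0})
second = run(output, {x: 4.0, w: 0.5, b: -1.0})
print(first, second)  # prints 7.0 1.0
```

Because the whole computation is described before anything runs, a framework can inspect the graph, split it across machines, or place parts of it on a GPU.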

Pros

  • Easy to implement,
  • Graph visualization using TensorBoard for developing and debugging the models,
  • Model Checkpointing. Train a model. Stop to evaluate. Reload from the checkpoint and keep training,
  • Provides a simple interface with a library like Keras,
  • GPU and TPU support,
  • Supports almost all deep learning models such as ConvNets, RNN, LSTMs, Word2Vec, etc.
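The checkpointing workflow from the list above (train, stop, reload, keep training) can be sketched without any framework. The example below is a toy in plain Python that stores its state as JSON. The `train`, `save_checkpoint`, and `load_checkpoint` helpers are hypothetical names for this illustration and are not TensorFlow functions.

```python
import json
import os
import tempfile


def train(state, steps, lr=0.1):
    """Run gradient descent on f(w) = (w - 3)^2, updating `state` in place."""
    for _ in range(steps):
        grad = 2.0 * (state["w"] - 3.0)
        state["w"] -= lr * grad
        state["step"] += 1
    return state


def save_checkpoint(state, path):
    with open(path, "w") as f:
        json.dump(state, f)


def load_checkpoint(path):
    with open(path) as f:
        return json.load(f)


checkpoint_path = os.path.join(tempfile.mkdtemp(), "toy_checkpoint.json")

# Train for a while, then stop and save.
state = train({"w": 0.0, "step": 0}, steps=10)
save_checkpoint(state, checkpoint_path)

# Later: reload from the checkpoint and keep training.
resumed = train(load_checkpoint(checkpoint_path), steps=10)

# Reference run: 20 uninterrupted steps should end in the same place.
uninterrupted = train({"w": 0.0, "step": 0}, steps=20)
print(resumed["step"], abs(resumed["w"] - uninterrupted["w"]) < 1e-12)
```

A real checkpoint would also hold things like optimizer state, but the principle is the same: everything needed to resume goes into the saved file.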

Cons

  • Steep learning curve. Understanding computational graphs and tensors can be a huge pain for a beginner,
  • Doesn’t support dynamic model building in its classic graph mode. First we have to build the model and then run it, which means the model is static (TensorFlow 2.x runs eagerly by default, which eases this),
  • Implementation of complex architecture can be daunting.

2. PyTorch

PyTorch is an open-source framework developed by Facebook for building machine learning and deep learning models. It is based on the Torch library.

PyTorch uses multi-dimensional arrays called tensors to store data, with CUDA support to make machine learning computations faster. It is best suited for deep learning research work.

PyTorch provides dynamic computational graphs, the lack of which is the biggest drawback of TensorFlow.
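What "dynamic" means is easiest to see in code. The sketch below is a toy define-by-run autodiff written in plain Python. It is not PyTorch's API, and the `Value` class and `forward` function are hypothetical names for this example. The graph is recorded while ordinary Python code runs, so a `while` loop that depends on the data can change the graph's shape from one call to the next.

```python
# Toy define-by-run autodiff on scalars. Illustrative only; not PyTorch's API.


class Value:
    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents    # nodes this value was computed from
        self.grad_fns = grad_fns  # local derivative rule for each parent

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def backward(self, upstream=1.0):
        # Accumulate the gradient arriving along this path, then pass it on.
        self.grad += upstream
        for parent, grad_fn in zip(self.parents, self.grad_fns):
            parent.backward(grad_fn(upstream))


def forward(x, w):
    # Ordinary Python control flow decides the graph's shape on every call.
    y = x * w
    steps = 0
    while y.data < 10.0:
        y = y * w
        steps += 1
    return y, steps


w = Value(2.0)
y, steps = forward(Value(1.5), w)
y.backward()
print(y.data, steps, w.grad)  # prints 12.0 2 18.0
```

Here the loop ran twice, so the recorded graph computes x * w^3, and the gradient with respect to w is 3 * x * w^2 = 18. With a larger x the loop would run fewer times and a different graph would be recorded.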

For distributed training, it uses the Gloo backend on CPU and NCCL on GPU.

Pros

  • Dynamic model building,
  • Data Parallelism,
  • Good for research purposes,
  • Provides GPU accelerated computation.

Cons

  • Historically it has lagged behind TensorFlow when it comes to serving models, especially in edge/mobile and browser serving setups. It is getting better by the day though.

3. Scikit-learn

Scikit-learn is an open-source library for developing machine learning models that shines in numerical computations.

It is built upon NumPy and SciPy for scientific computing, and it works well alongside Matplotlib for plotting and Pandas for data munging and wrangling.

It provides a wide variety of algorithms for both supervised and unsupervised learning, such as regression, classification, clustering, dimensionality reduction, and so on.
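As a rough sketch of what this looks like in practice, the snippet below runs one supervised and two unsupervised algorithms on the Iris dataset bundled with scikit-learn. The choice of dataset, models, and parameters is just an assumption for illustration, and it requires scikit-learn to be installed.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Supervised learning: classification with a held-out test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
classifier = LogisticRegression(max_iter=1000)
classifier.fit(X_train, y_train)
accuracy = classifier.score(X_test, y_test)

# Unsupervised learning: clustering without using the labels.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Dimensionality reduction: project the four features onto two components.
X_2d = PCA(n_components=2).fit_transform(X)

print(f"test accuracy: {accuracy:.2f}")
print("cluster sizes:", sorted(int((clusters == k).sum()) for k in range(3)))
print("reduced shape:", X_2d.shape)
```

All three follow the same pattern: create an estimator, then call `fit` (or `fit_predict` / `fit_transform`) on the data. That consistency is a big part of why the library is easy to pick up.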

Pros

  • Provides a wide range of machine learning methods such as regression, classification, clustering, dimensionality reduction, etc.,
  • Easy to implement,
  • Open-source.

Cons

  • Some algorithms don’t support hardware acceleration,
  • Doesn’t do much when it comes to deep learning and reinforcement learning.

4. OpenNN

OpenNN (Open Neural Networks) is an open-source library written in C++ for developing deep learning models. The library implements any number of layers of non-linear processing units for supervised learning.

Because it is developed in C++, it offers better memory management and higher processing speed. It also implements CPU parallelization by means of OpenMP and GPU acceleration with CUDA.

Pros

  • High processing speed,
  • Provides algorithms for tasks like regression, classification, forecasting,
  • Open-source.

Cons

  • Debugging could be a pain,
  • Lack of proper documentation.

5. Microsoft Cognitive Toolkit (CNTK)

The Microsoft Cognitive Toolkit (CNTK) is an open-source toolkit for distributed deep learning that describes neural networks as a series of computational steps via a directed graph.

CNTK is implemented in C++ and Python and is also available in C# and Java. CNTK provides both a low-level and high-level API for building neural networks.

CNTK provides parallelism with high accuracy on multiple machines via 1-bit SGD and automatic hyperparameter tuning. It can implement CNNs, FNNs, RNNs, batch normalization, and sequence-to-sequence models with attention.
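The idea behind 1-bit SGD is to shrink the gradients exchanged between machines to one bit per value and to carry the quantization error over to the next step so that nothing is lost in the long run. Below is a simplified sketch of that idea in plain Python. It is not CNTK's implementation: the `quantize_one_bit` helper and the single shared scale per gradient are assumptions made for this illustration.

```python
def quantize_one_bit(gradient, residual):
    """Quantize each entry to its sign, carrying the error to the next step."""
    # Add back whatever the previous quantization step failed to transmit.
    corrected = [g + r for g, r in zip(gradient, residual)]
    # Assumption for this sketch: one shared magnitude for the whole vector.
    scale = sum(abs(c) for c in corrected) / len(corrected)
    bits = [c >= 0.0 for c in corrected]
    decoded = [scale if bit else -scale for bit in bits]
    new_residual = [c - d for c, d in zip(corrected, decoded)]
    return bits, scale, new_residual


gradients = [
    [0.30, -0.10, 0.05],
    [0.20, -0.40, 0.10],
    [-0.10, 0.30, 0.25],
]

residual = [0.0, 0.0, 0.0]
sent_total = [0.0, 0.0, 0.0]
for gradient in gradients:
    # Only `bits` and `scale` would need to travel over the network.
    bits, scale, residual = quantize_one_bit(gradient, residual)
    decoded = [scale if bit else -scale for bit in bits]
    sent_total = [s + d for s, d in zip(sent_total, decoded)]

true_total = [sum(column) for column in zip(*gradients)]
recovered = [s + r for s, r in zip(sent_total, residual)]
print(true_total)
print(recovered)
```

The two printed vectors agree up to floating-point rounding: what was transmitted plus what is still held in the residual equals the sum of the true gradients.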

Pros

Cons

  • Lack of documentation,
  • Doesn’t have a large community.

6. Caffe2

Caffe2, owned by Facebook, is an open-source framework for developing machine learning models. It is the successor to Caffe (Convolutional Architecture for Fast Feature Embedding).

Caffe was developed by Yangqing Jia during his Ph.D. program at the University of California, Berkeley. It is written in C++ with a Python interface. In April 2017, Facebook announced Caffe2 with features such as recurrent neural networks, and Caffe2 was later merged into PyTorch.

Pros

  • Good for large scale machine learning models,
  • Well suited for deploying models in production,
  • Well suited for tasks like image classification and object detection,
  • Well suited for mobile applications,
  • Open-source.

Cons

  • Steep learning curve.

Conclusion

With the tools described in the article, an ambitious data scientist or data engineer can build new, exciting and bold projects, limited only by his or her imagination, dataset, storage, and computing power. To be honest, these are usually serious constraints to consider.

But fear not. We have already visited the moon and introduced the Blue Screen of Death to humanity. No matter how bold and ambitious the next step is, the tools listed above will make taking it much easier.

Python & Machine Learning Instructor | Founder of probog.com
