Sentiment Analysis in Python: TextBlob vs Vader Sentiment vs Flair vs Building It From Scratch

Sentiment analysis is one of the most widely known Natural Language Processing (NLP) tasks. This article aims to give the reader a very clear understanding of sentiment analysis and different methods through which it is implemented in NLP. So let’s dive in.

The field of NLP has evolved a lot in the last five years. Open-source packages like spaCy and TextBlob provide ready-to-use functionality for NLP tasks such as sentiment analysis. So many of these packages are freely available that it can be hard to decide which one to use for your application.

In this article, I will discuss the most popular NLP sentiment analysis packages, along with a custom model:

  • TextBlob
  • VADER
  • Flair
  • a custom model built from scratch

At the end, I will also compare the performance of each of them on a common dataset.

What is sentiment analysis?

Sentiment analysis is the task of determining the emotional value of a given expression in natural language. 

It is essentially a multiclass text classification task where the given input text is classified into positive, neutral, or negative sentiment. The number of classes can vary according to the nature of the training dataset.

For example, sometimes it is formulated as a binary classification problem with 1 as the positive label and 0 as the negative label.

Application of sentiment analysis

Sentiment analysis has applications in a wide variety of domains including analyzing user reviews, tweet sentiment, etc. Let’s go through some of them here:

  • Movie reviews: analyzing online movie reviews to get insights from the audience about the movie.
  • News sentiment analysis: analyzing news sentiment about a particular organization to get insights.
  • Social media sentiment analysis: analyzing the sentiment of Facebook posts, tweets, etc.
  • Online food reviews: analyzing the sentiment of food reviews from user feedback.

Sentiment analysis in Python

There are many packages available in Python which use different methods to do sentiment analysis. In the next section, we shall go through some of the most popular methods and packages.

Rule-based sentiment analysis

Rule-based sentiment analysis is one of the most basic approaches to calculating text sentiment. It requires minimal pre-work and the idea is quite simple: this method does not use any machine learning to figure out the text sentiment. For example, we can estimate the sentiment of a tweet by counting the number of times the user has used the word “sad” in it.
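To make the counting idea concrete, here is a minimal sketch of a rule-based scorer in plain Python. The two word lists and the function name are made up for illustration; real packages ship much larger, hand-curated lexicons.

```python
import re

# Tiny hypothetical word lists, for illustration only.
POSITIVE_WORDS = {"good", "great", "happy", "love"}
NEGATIVE_WORDS = {"bad", "sad", "terrible", "hate"}


def rule_based_sentiment(text: str) -> str:
    """Label text by counting positive and negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positives - negatives
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(rule_based_sentiment("I feel sad, so sad today"))  # negative
print(rule_based_sentiment("The food was great!"))       # positive
```

No training data is involved: the result depends only on which words appear in the lists.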

Now, let’s check out some Python packages that work using this method.


TextBlob

TextBlob is a simple Python library that offers API access to different NLP tasks such as sentiment analysis, spelling correction, etc.

TextBlob’s sentiment analyzer returns two properties for a given input sentence:

  • Polarity is a float in the range [-1, 1], where -1 indicates negative sentiment and +1 indicates positive sentiment.
  • Subjectivity is also a float, in the range [0, 1], where 0 is very objective and 1 is very subjective. Subjective sentences generally refer to personal opinion, emotion, or judgment.

Let’s see how to use TextBlob:

from textblob import TextBlob

testimonial = TextBlob("The food was great!")
print(testimonial.sentiment)
# Output: Sentiment(polarity=1.0, subjectivity=0.75)

TextBlob ignores words that it doesn’t know. It considers only the words and phrases that it can assign a polarity to and averages them to get the final score.
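The averaging behaviour can be sketched roughly as follows. This is a simplified illustration, not TextBlob’s actual implementation (which also handles things like intensifiers and negation), and the lexicon values below are hypothetical.

```python
import re

# Hypothetical polarity lexicon; the values are illustrative only.
TOY_LEXICON = {"great": 0.8, "good": 0.7, "terrible": -1.0}


def average_polarity(text: str) -> float:
    """Average the polarity of known words; unknown words are ignored."""
    tokens = re.findall(r"[a-z']+", text.lower())
    known = [TOY_LEXICON[token] for token in tokens if token in TOY_LEXICON]
    if not known:
        return 0.0  # nothing recognisable, so treat the text as neutral
    return sum(known) / len(known)


print(average_polarity("The food was great!"))  # 0.8
```

Words such as “the” and “food” are not in the lexicon, so they do not dilute the score.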

VADER sentiment

Valence Aware Dictionary and sEntiment Reasoner (VADER) is another popular rule-based sentiment analyzer.

It uses a list of lexical features (e.g. words) which are labeled as positive or negative according to their semantic orientation to calculate the text sentiment.

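For a given input sentence, VADER returns four scores: pos, neu, and neg, which are the proportions of the text that fall into the positive, neutral, and negative categories, and compound, a normalized overall score between -1 (most negative) and +1 (most positive).

For example:

“The food was great!”
Positive : 59.4%
Neutral : 40.6%
Negative : 0%

The three proportions add up to 100%.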

Let’s see how to use VADER:

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
sentence = "The food was great!" 
vs = analyzer.polarity_scores(sentence)
print(vs)
# Output (key order may vary by version):
# {'compound': 0.6588, 'neg': 0.0, 'neu': 0.406, 'pos': 0.594}

VADER is optimized for social media data and can yield good results when used with data from Twitter, Facebook, etc.
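To turn VADER’s scores into a single label, the compound score is usually thresholded. The sketch below assumes the ±0.05 cut-offs commonly suggested in the VADER documentation; the helper name is my own, and you may want to tune the threshold for your data.

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Scores copied from the example output above.
scores = {'compound': 0.6588, 'neg': 0.0, 'neu': 0.406, 'pos': 0.594}

print(label_from_compound(scores['compound']))  # positive
```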

The main drawback of the rule-based approach to sentiment analysis is that it only considers individual words and completely ignores the context in which they are used.

For example, “the party was savage” will be scored as negative by any token-based algorithm, even though “savage” is used here as praise.

Embedding-based models

Text embeddings are a form of word representation in NLP in which semantically similar words are represented by similar vectors, so that they lie close to each other in an n-dimensional space.

[Figure: word vectors]

Embedding-based Python packages use this form of text representation to predict text sentiment. Because the vectors capture similarity between words, this often yields better model performance than word lists alone.
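“Close to each other” is typically measured with cosine similarity. Here is a small sketch with made-up 3-dimensional vectors; real embeddings have hundreds of dimensions and are learned from data, so the numbers below are purely illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, not taken from any trained model.
toy_vectors = {
    "good": [0.9, 0.8, 0.1],
    "great": [0.85, 0.9, 0.05],
    "terrible": [-0.8, -0.7, 0.2],
}

similar = cosine_similarity(toy_vectors["good"], toy_vectors["great"])
dissimilar = cosine_similarity(toy_vectors["good"], toy_vectors["terrible"])
print(similar > dissimilar)  # True
```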

One such package is Flair.

Flair

Flair is a simple-to-use framework for state-of-the-art NLP.

It provides various functionalities such as:

  • pre-trained sentiment analysis models, 
  • text embeddings, 
  • NER, 
  • and more.

Let’s see how to do sentiment analysis using Flair.

Flair’s pre-trained sentiment analysis model is trained on the IMDB dataset. To load it and make a prediction, simply do:

from flair.models import TextClassifier
from flair.data import Sentence

classifier = TextClassifier.load('en-sentiment')
sentence = Sentence('The food was great!')

# predict the sentiment label for the sentence
classifier.predict(sentence)

# print sentence with predicted labels
print('Sentence above is: ', sentence.labels)
# Output: [POSITIVE (0.9961)]

If you would like a custom sentiment analyzer for your domain, you can train a classifier with Flair on your own dataset.

The drawback of using Flair’s pre-trained model for sentiment analysis is that it is trained on IMDB data, so it might not generalize well to data from other domains like Twitter.

Building sentiment analysis model from scratch 

In this section, you will learn when and how to build a sentiment analysis model from scratch using TensorFlow. So, let’s check how to do it.

Why a custom model?

Let’s first understand when you will need a custom sentiment analysis model. For example, you might have a niche application like analyzing the sentiment of airline reviews, where off-the-shelf models trained on other domains may not perform well.

By building a custom model you can also get more control over the output.


TensorFlow Hub is a repository of trained machine learning models ready for fine-tuning and deployable anywhere. 

For our purpose, we will use the Universal Sentence Encoder (USE), which encodes text into high-dimensional vectors. You can also use any of your preferred text representation models, like GloVe, fastText, word2vec, etc.


As we are using the Universal Sentence Encoder to vectorize our input text, we don’t need an embedding layer in the model. If you are planning to use any other embedding model like GloVe, feel free to follow one of my previous posts for a step-by-step guide. Here I will just build a simple model for our purpose.
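The idea of training only a small classifier on top of fixed sentence vectors can be sketched without TensorFlow. The example below is a plain-Python logistic regression, not the model from the notebook: the 3-dimensional toy vectors stand in for the encoder’s outputs, and all names and hyperparameters are assumptions made for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, vector):
    """Probability that a sentence vector belongs to the positive class."""
    return sigmoid(sum(w * x for w, x in zip(weights, vector)) + bias)


def train_head(vectors, labels, epochs=200, learning_rate=0.5):
    """Fit a logistic-regression head with per-example gradient descent."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for vector, label in zip(vectors, labels):
            # Gradient of binary cross-entropy with respect to the logit.
            error = predict_proba(weights, bias, vector) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, vector)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical stand-ins for sentence embeddings (1 = positive, 0 = negative).
toy_vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1],
               [0.1, 0.9, 0.0], [0.2, 0.8, 0.1]]
toy_labels = [1, 1, 0, 0]

weights, bias = train_head(toy_vectors, toy_labels)
predicted = [round(predict_proba(weights, bias, v)) for v in toy_vectors]
print(predicted)
```

The sentence vectors are never updated here; only the weights of the small head are learned, which is why no embedding layer is needed.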


For our example, I will be using the Twitter sentiment analysis dataset from Kaggle. This dataset contains 1.4 million labeled tweets.

You can download the dataset from here


For running the example in Colab just upload your Kaggle API key when prompted by the notebook and it will automatically download the dataset for you.

Example: Twitter sentiment analysis with Python

Here is the link to the Colab notebook.

In the same notebook, I have implemented all the algorithms we discussed above.

Comparing results

Now, let’s compare the results from the notebook.

Algorithm    Accuracy
TextBlob     56%
Flair        50%
USE model    77.5%

You can see that our custom model, without any hyperparameter tuning, yields the best results.
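Accuracy figures like the ones above can be computed along these lines. This is a sketch, not the notebook’s exact code: it assumes binary labels (1 = positive, 0 = negative) and maps a continuous polarity score to a label by checking whether it is above zero, which is one possible convention among several.

```python
def polarity_to_binary(polarity: float) -> int:
    """Assumed convention: polarity above 0 counts as positive (1)."""
    return 1 if polarity > 0 else 0


def accuracy(true_labels, predicted_labels) -> float:
    """Fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up labels and polarity scores, for illustration only.
true_labels = [1, 0, 1, 0]
polarities = [0.8, -0.4, 0.0, 0.3]

predictions = [polarity_to_binary(p) for p in polarities]
print(accuracy(true_labels, predictions))  # 0.5
```

Note how a polarity of exactly 0.0 is forced into the negative class here. On a dataset with only two classes, neutral predictions count as errors for roughly half of the examples they cover.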


Note that I have only trained the USE model on the Twitter data; the other ones are used out of the box.

You can see that none of the above packages generalizes well to Twitter data. I have been working on an open-source project to develop a package specifically for Twitter data, and it is under active development.

Feel free to check out my project on GitHub.

Final thoughts

In this article, I discussed sentiment analysis and different approaches to implementing it in Python.

I also compared their performance on a common dataset. 

Hopefully, you will find them useful in some of your projects.

