The International Conference on Learning Representations (ICLR) took place last week, and I had the pleasure of participating in it. ICLR is an event dedicated to research on all aspects of representation learning, commonly known as deep learning. This year the event was a bit different, as it went virtual due to the coronavirus pandemic. However, the online format didn’t change the great atmosphere of the event. It was engaging and interactive, and attracted 5,600 attendees (twice as many as last year). If you’re interested in what the organizers think about the unusual online arrangement of the conference, you can read about it here.
Over 1300 speakers presented many interesting papers, so I decided to create a series of blog posts summarizing the best of them in four main areas. You can catch up with the first post with the best deep learning papers here, the second post with reinforcement learning papers here, and the third post with generative models papers here.
This is the last post of the series, in which I want to share the 10 best Natural Language Processing/Understanding contributions from ICLR.
Best Natural Language Processing/Understanding Papers
1. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
A new pretraining method that establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters compared to BERT-large.
(TL;DR, from OpenReview.net)

First author: Zhenzhong Lan
2. A Mutual Information Maximization Perspective of Language Representation Learning
Learning word representations is a core task in NLP. Here, the authors formulate a framework, based on mutual information maximization, that unifies classical word embedding techniques (like Skip-gram) with more modern approaches based on contextual embeddings (BERT, XLNet).
3. Mogrifier LSTM
An LSTM extension with state-of-the-art language modelling results.
(TL;DR, from OpenReview.net)
4. High Fidelity Speech Synthesis with Adversarial Networks
We introduce GAN-TTS, a Generative Adversarial Network for Text-to-Speech, which achieves Mean Opinion Score (MOS) 4.2.
(TL;DR, from OpenReview.net)
5. Reformer: The Efficient Transformer
Efficient Transformer with locality-sensitive hashing and reversible layers.
(TL;DR, from OpenReview.net)
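To give a rough feel for the locality-sensitive hashing idea, here is a minimal NumPy sketch (my own simplification, not the authors’ implementation): queries and keys are hashed into buckets with random projections, and each token only attends within its own bucket instead of over the full sequence.

import numpy as np

def lsh_buckets(x, n_buckets, seed=0):
    # Assign each vector in x (seq_len, d_model) to a bucket via random
    # projections (angular LSH): argmax over [proj, -proj] picks the bucket.
    rng = np.random.default_rng(seed)
    proj = x @ rng.normal(size=(x.shape[-1], n_buckets // 2))
    return np.argmax(np.concatenate([proj, -proj], axis=-1), axis=-1)

def lsh_attention(q, k, v, n_buckets=8):
    # Each query attends only to keys that fall into the same hash bucket,
    # rather than to every position in the sequence.
    buckets = lsh_buckets(q, n_buckets)
    out = np.zeros_like(v)
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]
        scores = q[idx] @ k[idx].T / np.sqrt(q.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[idx] = weights @ v[idx]
    return out

# Toy usage: 64 tokens with 32-dimensional heads (Reformer shares Q and K).
q = k = np.random.randn(64, 32)
v = np.random.randn(64, 32)
print(lsh_attention(q, k, v).shape)  # (64, 32)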
6. DeFINE: Deep Factorized Input Token Embeddings for Neural Sequence Modeling
DeFINE uses a deep, hierarchical, sparse network with new skip connections to learn better word embeddings efficiently.
(TL;DR, from OpenReview.net)
7. Depth-Adaptive Transformer
Sequence model that dynamically adjusts the amount of computation for each input.
(TL;DR, from OpenReview.net)
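As a rough illustration of the idea (a toy PyTorch sketch under my own simplifying assumptions, not the paper’s implementation), a stack of layers can be paired with a small halting classifier that decides, per input, when the representation is good enough to exit early:

import torch
import torch.nn as nn

class AdaptiveDepthEncoder(nn.Module):
    def __init__(self, d_model=128, n_layers=6, threshold=0.9):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
             for _ in range(n_layers)]
        )
        self.halt = nn.Linear(d_model, 1)  # tiny halting classifier
        self.threshold = threshold

    def forward(self, x):
        # x: (batch=1, seq_len, d_model); stop as soon as the halting score
        # for the pooled representation exceeds the threshold.
        depth = 0
        for layer in self.layers:
            x = layer(x)
            depth += 1
            confidence = torch.sigmoid(self.halt(x.mean(dim=1))).item()
            if confidence > self.threshold:
                break
        return x, depth

model = AdaptiveDepthEncoder()
hidden, used_layers = model(torch.randn(1, 10, 128))
print(f"stopped after {used_layers} of 6 layers")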
8. On Identifiability in Transformers
We investigate the identifiability and interpretability of attention distributions and tokens within contextual embeddings in the self-attention based BERT model.
(TL;DR, from OpenReview.net)
9. Mirror-Generative Neural Machine Translation
Neural Machine Translation (NMT) models depend on the availability of large parallel corpora for each language pair. Here, a new method is proposed that handles translation in both directions within a single generative neural machine translation model.
10. FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Here, the authors propose a new algorithm, called FreeLB, which formulates a novel approach to adversarial training of language models.
Summary
The depth and breadth of the ICLR publications are quite inspiring. This post focuses on “Natural Language Processing/Understanding”, one of the main areas discussed during the conference. To give a more complete overview of the top papers at ICLR, we have built a series of posts, each focused on one of the main topics covered: deep learning, reinforcement learning, generative models, and NLP/NLU. This is the last one, so you may want to check out the others for the full picture.
We would be happy to extend our list, so feel free to share other interesting NLP/NLU papers with us.
In the meantime – happy reading!
READ NEXT
Exploratory Data Analysis for Natural Language Processing: A Complete Guide to Python Tools
11 mins read | Author Shahul ES | Updated July 14th, 2021
Exploratory data analysis is one of the most important parts of any machine learning workflow, and Natural Language Processing is no different. But which tools should you choose to explore and visualize text data efficiently?
In this article, we will discuss and implement nearly all the major techniques that you can use to understand your text data, and give you a complete(ish) tour of the Python tools that get the job done.
Before we start: dataset and dependencies
In this article, we will use the “A Million News Headlines” dataset from Kaggle. If you want to follow the analysis step-by-step, you may want to install the following libraries:
pip install \
    pandas matplotlib numpy \
    nltk seaborn sklearn gensim pyldavis \
    wordcloud textblob spacy textstat
Now, we can take a look at the data.
import pandas as pd

# Load the first 10,000 headlines for a quick look at the data
news = pd.read_csv('data/abcnews-date-text.csv', nrows=10000)
news.head(3)

The dataset contains only two columns: the publish date and the headline text.
For simplicity, I will be exploring the first 10,000 rows of this dataset. Since the headlines are sorted by publish_date, they actually span about two months, from February 19, 2003 until April 7, 2003.
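If you want to verify that date range yourself, a quick sanity check with pandas (assuming the publish_date column is stored as an integer in YYYYMMDD format, as it is in the Kaggle dataset) could look like this:

import pandas as pd

# Parse publish_date (YYYYMMDD integers) and print the range covered by the
# first 10,000 headlines
news = pd.read_csv('data/abcnews-date-text.csv', nrows=10000)
dates = pd.to_datetime(news['publish_date'], format='%Y%m%d')
print(dates.min(), '->', dates.max())  # expected: 2003-02-19 -> 2003-04-07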
Ok, I think we are ready to start our data exploration!
Continue reading ->