Understanding XLNet and its implications for NLP

Nov. 19, 2019

Last year, BERT revolutionized NLP, and since then a large number of improvements over the original implementation have appeared: MT-DNN, RoBERTa, ALBERT. The main feature of these models is their autoencoding nature. In parallel, a group of autoregressive methods has been proposed, such as Transformer-XL, GPT-2 and XLNet. In this short post we give an overview of XLNet, its nature, and how it compares with BERT.


In this post we revisit The Unreasonable Effectiveness of Data in the hope that it empowers the deep learning community to keep revising old ideas.


Cloud-Scale Text Classification with Convolutional Neural Networks

March 3, 2019

Natural Language Processing (NLP) is one of the fields in which deep learning has made significant progress, specifically in the area of text classification, where the objective is to categorize documents, paragraphs or individual sentences into classes. In this post, we review an article we published in 2017, Cloud-Scale Text Classification with Convolutional Neural Networks on Microsoft Azure, and share the code we used to create the models.
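To make the idea concrete, here is a minimal NumPy sketch of the forward pass of a character-level CNN text classifier (embedding, 1-D convolution, max-over-time pooling, linear layer, softmax). The character vocabulary, layer sizes and random weights are illustrative assumptions, not the architecture or code from the original article.

```python
import numpy as np

# Untrained, illustrative forward pass of a character-level CNN
# classifier. A real model would learn these parameters.
rng = np.random.default_rng(0)

VOCAB = "abcdefghijklmnopqrstuvwxyz ,.!?"   # toy character set
EMB_DIM, KERNEL, FILTERS, CLASSES = 16, 3, 32, 4

embedding = rng.normal(0, 0.1, (len(VOCAB), EMB_DIM))
conv_w = rng.normal(0, 0.1, (FILTERS, KERNEL, EMB_DIM))
conv_b = np.zeros(FILTERS)
fc_w = rng.normal(0, 0.1, (FILTERS, CLASSES))
fc_b = np.zeros(CLASSES)

def classify(text):
    """Return class probabilities for a piece of text."""
    ids = [VOCAB.index(c) for c in text.lower() if c in VOCAB]
    x = embedding[ids]                       # (seq_len, EMB_DIM)
    # 1-D convolution over time, followed by a ReLU.
    windows = np.stack([x[i:i + KERNEL] for i in range(len(ids) - KERNEL + 1)])
    feats = np.maximum(np.einsum("tke,fke->tf", windows, conv_w) + conv_b, 0)
    pooled = feats.max(axis=0)               # max-over-time pooling
    logits = pooled @ fc_w + fc_b
    e = np.exp(logits - logits.max())        # numerically stable softmax
    return e / e.sum()

probs = classify("deep learning for text classification")
print(probs.shape)  # (4,)
```

The max-over-time pooling step is what lets the model handle variable-length inputs: however long the text, each filter contributes a single feature.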


Top 35 AI solutions for today's key industries

Dec. 21, 2018

Artificial Intelligence is leading the new technological revolution and is considered by many to be the new electricity. AI is transforming every industry and driving new ways for companies to be more efficient, profitable and innovative. In this post, we analyze the most popular AI solutions across 16 different industries.


A Gentle Explanation of Dimensionality Reduction with t-SNE

Oct. 31, 2018

t-SNE is one of the most popular dimensionality reduction algorithms; it represents a high-dimensional dataset in a two- or three-dimensional space, making it easy to visualize high-dimensional data points. In this post, we explain the algorithm and give a light overview of the math. Together with this post, we wrote a Jupyter notebook where we show an example of t-SNE using sklearn and CUDA implementations.
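As a minimal sketch of the sklearn usage, the following embeds a subset of the digits dataset into two dimensions. The dataset, subset size and hyperparameters here are illustrative choices, not necessarily those used in the notebook.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# 500 handwritten-digit images, each a 64-dimensional vector.
X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]

# Embed into 2-D; perplexity roughly controls the effective
# number of neighbors each point attends to.
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(X)
print(emb.shape)  # (500, 2)
```

Scattering `emb` colored by `y` (e.g. with matplotlib) typically shows the ten digit classes separating into visible clusters.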