Last year BERT revolutionized NLP, and since then a number of improvements on the original model have appeared: MT-DNN, RoBERTa, ALBERT. The main feature these models share is their autoencoding nature. On the other hand, a family of autoregressive methods has been proposed, such as Transformer-XL, GPT-2, and XLNet. In this short post we give a short overview of XLNet, its autoregressive nature, and how it compares with BERT.
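To make the autoencoding vs. autoregressive distinction concrete, here is a toy sketch (no real model, just the shape of each pretraining objective): an autoencoding model like BERT masks tokens and predicts them from the full bidirectional context, while an autoregressive model like GPT-2 predicts each token from the tokens to its left. The sentence and the masked position are arbitrary illustrative choices.

```python
tokens = ["the", "cat", "sat", "on", "the", "mat"]

# Autoencoding (BERT-style) objective: corrupt the input by masking a
# token, then predict the original token from the whole sequence.
masked_pos = 2  # hypothetical choice of position to mask
ae_input = [t if i != masked_pos else "[MASK]" for i, t in enumerate(tokens)]
ae_target = tokens[masked_pos]
# ae_input  -> ["the", "cat", "[MASK]", "on", "the", "mat"]
# ae_target -> "sat"

# Autoregressive (GPT-style) objective: predict each token given only
# the tokens that precede it (left-to-right factorization).
ar_pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
# ar_pairs[0] -> (["the"], "cat")
```

The key difference visible even in this sketch: the autoencoding target sees context on both sides of the gap, while each autoregressive prediction only ever sees its left context, which is exactly the trade-off XLNet tries to resolve.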