A Gentle Introduction to Contextual Bandits

Markus Cozowicz and Miguel González-Fierro
April 1, 2020

Contextual bandits are a simplified form of reinforcement learning: the algorithm uses contextual information about the environment to make decisions in real time, and it requires a reward signal at every step. Applications include news recommendation, ad placement on a website, and financial portfolio design, to name a few. In this post, we give an overview of contextual bandits and explain how they can be used.
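
To make the loop of "observe a context, choose an action, receive a reward" concrete, here is a minimal sketch of an epsilon-greedy contextual bandit on a toy news-recommendation problem. Everything in it is an assumption made for illustration: the contexts, the actions, the click probabilities, and the function name `run_epsilon_greedy` are hypothetical, and epsilon-greedy is just one simple strategy among many, not necessarily the one used in practice.

```python
import random


def run_epsilon_greedy(n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate a tiny contextual bandit and return (learned policy, average reward)."""
    rng = random.Random(seed)
    contexts = ["morning", "evening"]   # hypothetical user contexts
    actions = ["news", "sports"]        # hypothetical articles to recommend

    # Hypothetical click probabilities; unknown to the learner.
    click_prob = {
        ("morning", "news"): 0.9, ("morning", "sports"): 0.1,
        ("evening", "news"): 0.1, ("evening", "sports"): 0.9,
    }

    # Running estimate of the reward for every (context, action) pair.
    counts = {(c, a): 0 for c in contexts for a in actions}
    values = {(c, a): 0.0 for c in contexts for a in actions}

    total_reward = 0
    for _ in range(n_rounds):
        # 1. Observe the context.
        context = rng.choice(contexts)

        # 2. Choose an action: explore with probability epsilon, else exploit.
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: values[(context, a)])

        # 3. Receive a reward for the chosen action only (click = 1, no click = 0).
        reward = 1 if rng.random() < click_prob[(context, action)] else 0

        # 4. Update the estimate with an incremental mean.
        key = (context, action)
        counts[key] += 1
        values[key] += (reward - values[key]) / counts[key]
        total_reward += reward

    # The learned policy maps each context to its best estimated action.
    policy = {c: max(actions, key=lambda a: values[(c, a)]) for c in contexts}
    return policy, total_reward / n_rounds


if __name__ == "__main__":
    policy, average_reward = run_epsilon_greedy()
    print(policy, average_reward)
```

Note that the learner only ever sees the reward for the action it actually took, which is why some exploration is needed: without it, an action that looked bad early on would never get a second chance.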


contextual bandits; online learning; recommendation systems
