Disentanglement in VAEs with the Spatial Broadcast Decoder

The world around us is noisy and constantly changing, yet our brain manages to filter out irrelevant details, and learn what we need. Think of sea shells, for example. If you weren’t born close to the sea, you probably learned first about them in school, or you saw them in a movie, or in a photograph. Then, when going for the first time to t... Read more

Approximating Wasserstein distances with PyTorch

Many problems in machine learning deal with the idea of making two probability distributions to be as close as possible. In the simpler case where we only have observed variables $\mathbf{x}$ (say, images of cats) coming from an unknown distribution $p(\mathbf{x})$, we’d like to find a model $q(\mathbf{x}\vert\theta)$ (like a neural network) tha... Read more

The Variational Autoencoder

In this notebook we are interested in the problem of inference in a probabilistic model that contains both observed and latent variables, which can be represented as the following graphical model: For this model the joint probability distribution factorizes as For any distribution $q(\mathbf{z}\vert\mathbf{x},\boldsymbol{\phi})$, we can ... Read more

The Multi-armed Bandit Problem

In this problem we are faced with a set of $k$ bandits, each of which gives an uncertain reward. The main purpose is to find the set of actions that lead to the highest long term reward, that is, the expected value of the reward. We express the dependency between an action $A_t$ taken at time $t$ and the expected reward through a value function ... Read more

Expectation Maximization for latent variable models

In all the notebooks we’ve seen so far, we have made the assumption that the observations correspond directly to realizations of a random variable. Take the case of linear regression: we are given observations of the random variable $t$ (plus some noise), which is the target value for a given value of the input $\mathbf{x}$. Under some criterion... Read more