Generative Adversarial Networks (GANs)

In [1]:
%pylab inline
from ipypublish import nb_setup
Populating the interactive namespace from numpy and matplotlib

Overview and Intuition

Suppose you had only a small number of samples for a classification exercise. How might you use these samples to generate more? There are, of course, oversampling approaches, but these typically produce convex combinations of existing points in the data set. Can we create a system that generates original samples that are not combinations of existing ones? That is, can we generate something new from the distribution of the existing samples? Using a neural net, this is possible, and the approach is known as Generative Deep Learning (GDL).
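
To make the "convex combination" point concrete, here is a minimal sketch of interpolation-based oversampling. It is not the method of any particular library; the function name `interpolate_samples` and the toy data are assumptions made purely for illustration. Each synthetic point is $\lambda x_i + (1-\lambda) x_j$ with $\lambda \in [0,1]$, so it always lies on the segment between two existing points.

```python
import numpy as np

def interpolate_samples(X, n_new, rng):
    """Create n_new synthetic rows as convex combinations of random pairs of rows of X.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    i = rng.integers(0, len(X), size=n_new)        # first endpoint of each pair
    j = rng.integers(0, len(X), size=n_new)        # second endpoint of each pair
    lam = rng.uniform(0.0, 1.0, size=(n_new, 1))   # mixing weight in [0, 1]
    return lam * X[i] + (1.0 - lam) * X[j]

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                       # toy "existing" samples (assumed)
X_new = interpolate_samples(X, n_new=100, rng=rng)

# Every synthetic point stays inside the bounding box (indeed the convex hull) of X.
inside = np.all((X_new >= X.min(axis=0)) & (X_new <= X.max(axis=0)))
print(X_new.shape, inside)
```

Because the new points never leave the convex hull of the original data, this kind of oversampling cannot produce anything genuinely new, which is the limitation that motivates generative models.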

GDL is a form of unsupervised learning, although, as we will see, it may also be implemented using supervised learning techniques. It is nevertheless quite different from standard supervised learning, which is concerned mostly with classification models. A supervised learning model such as a classifier produces an output $Pr(Y|{\bf X})$ conditional on a feature set input ${\bf X} = \{X_1,X_2,...,X_n\}$. A generative model instead produces a new ${\bf X}$ from the existing set of ${\bf X}$s, using the individual distributions of the features and the correlations amongst them, i.e., the joint distribution $Pr({\bf X})$. Therefore, a generative model attempts to mimic the original data generating process. Ideally, we want the model to generate observations that look like they came from the same source as the other ${\bf X}$ values, but that are also sufficiently dissimilar that they do not look like replicas of existing data.

Generative modeling does not need a neural net per se. Several other approaches may be set up to generate observations from the existing data.
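
As one illustration of a generative model without a neural net, the sketch below estimates the joint distribution $Pr({\bf X})$ by fitting a multivariate normal to the data and then samples new observations from it. The Gaussian assumption, the toy data, and all variable names are choices made here for the example, not something prescribed by the text; real data may need a richer model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "existing" data: 200 observations of 3 correlated features.
true_cov = np.array([[1.0, 0.8, 0.2],
                     [0.8, 1.0, 0.3],
                     [0.2, 0.3, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 1.0, -1.0], cov=true_cov, size=200)

# "Fit" the generative model: estimate the mean vector and covariance matrix.
mu_hat = X.mean(axis=0)
cov_hat = np.cov(X, rowvar=False)

# Generate new observations from the estimated joint distribution.
X_gen = rng.multivariate_normal(mu_hat, cov_hat, size=1000)

# The generated data should roughly reproduce the correlations in the original data.
print(np.round(np.corrcoef(X, rowvar=False), 2))
print(np.round(np.corrcoef(X_gen, rowvar=False), 2))
```

Unlike interpolation, the sampled rows are not tied to any pair of existing points, yet they share the means and correlations of the original data. The limitation is that the model can only capture what a Gaussian can express, which is one reason to turn to more flexible neural approaches.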
