Notes taken from Chapter 15 of the book:
We want to generate a new sample $x_j^*$ that doesn't belong to the training dataset $\{x_i\}$, using a generator $g$ with latent variable $z_j$ and parameters $\theta$. We want to find the parameters that let us generate new samples "similar" to the training samples.
$$ x_j^* = g[z_j,\theta] $$
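To make this concrete, here is a minimal sketch of such a generator in PyTorch. The MLP architecture, `latent_dim`, and `data_dim` are my own placeholder assumptions, not from the book:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a latent vector z to a sample x* = g(z, theta).
    The architecture is a hypothetical example, not from the book."""
    def __init__(self, latent_dim=64, data_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, data_dim),
        )

    def forward(self, z):
        return self.net(z)

# Draw a latent variable z_j and generate a new sample x_j* = g(z_j, theta)
g = Generator()
z_j = torch.randn(1, 64)
x_star = g(z_j)
```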
The parameters will be obtained from a discriminator that randomly receives either the output of the generator or a sample from the dataset, and has to classify whether it is fake or real. When the discriminator can no longer tell which one is real, we have succeeded. So, the discriminator, with parameters $\phi$, returns a probability of the input being real (close to 1):
$$ \hat{y} = \sigma(d(x, \phi)) $$
We want to find the parameters $\phi$ that minimize this loss function:
$$ \mathcal{L}_D = \sum_i \left( -(1 - y_i) \log\left[ 1 - \hat{y}_i \right] - y_i \log\left[ \hat{y}_i \right] \right) $$
$$ \hat{y}_i = \sigma(d(x_i, \phi)) $$
where $\sigma$ is the sigmoid function, $y_i$ is the true label (0 for fake and 1 for real) and $\hat{y}_i$ is the predicted probability.
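As a minimal sketch, this loss can be written out directly, assuming a hypothetical discriminator `d` that returns the raw score $d(x, \phi)$ (the architecture and batch below are placeholder assumptions):

```python
import torch
import torch.nn as nn

# Hypothetical discriminator: returns the raw score d(x, phi)
d = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1))

def discriminator_loss(x, y):
    """Binary cross-entropy: y=1 for real samples, y=0 for fakes.
    sigma(d(x, phi)) is the predicted probability y_hat."""
    y_hat = torch.sigmoid(d(x))
    return (-(1 - y) * torch.log(1 - y_hat) - y * torch.log(y_hat)).sum()

# Example: a batch mixing real and fake samples, with their labels
x = torch.randn(8, 784)
y = torch.tensor([1., 0., 1., 1., 0., 0., 1., 0.]).unsqueeze(1)
loss = discriminator_loss(x, y)
```

In practice one would use `nn.BCEWithLogitsLoss` for numerical stability; the explicit form above just mirrors the equation.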
As you can see, the loss function is the binary cross-entropy loss adapted to this setting. In this form we distinguish between the fake and real cases via the label $y_i$. But it can be reorganized by separating the $i$ real samples from the $j$ generated (fake) samples.
$$ \mathcal{L}_D = \sum_j -\log\left[ 1 - \sigma(d(x_j^*, \phi)) \right] + \sum_i -\log\left[ \sigma(d(x_i, \phi)) \right] $$
$$ \mathcal{L}_D = \mathcal{L}_{\text{fake}} + \mathcal{L}_{\text{true}} $$
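The same split can be sketched directly in code, reusing the same hypothetical stand-in networks as before (`x_real` is a placeholder for a real data batch):

```python
import torch
import torch.nn as nn

# Same hypothetical stand-ins as in the earlier sketches
g = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784))
d = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1))

# L_fake: generated samples x_j* = g(z_j, theta); minimizing it pushes
# sigma(d(x_j*, phi)) toward 0 ("classified as fake")
z = torch.randn(8, 64)
loss_fake = -torch.log(1 - torch.sigmoid(d(g(z)))).sum()

# L_true: real samples x_i; minimizing it pushes sigma(d(x_i, phi)) toward 1
x_real = torch.randn(8, 784)  # placeholder for a real data batch
loss_true = -torch.log(torch.sigmoid(d(x_real))).sum()

loss_d = loss_fake + loss_true
```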
We want to MINIMIZE this loss with respect to $\phi$ (argmin)!
So, to get the generator's objective, we just generate the image $x_j^* = g(z_j, \theta)$; the goal of the generator is to MAXIMIZE the loss $\mathcal{L}_{\text{fake}}$ with respect to $\theta$ (argmax), i.e. to fool the discriminator into classifying the generated samples as real. A sketch of one full adversarial step follows below.
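Putting the two objectives together, here is a hedged sketch of one adversarial training step; the networks, optimizers, learning rates, and batch data are all placeholder assumptions:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in networks, as in the earlier sketches
g = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784))
d = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1))
opt_d = torch.optim.Adam(d.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-4)

x_real = torch.randn(8, 784)  # stand-in for a batch of real training data

# Discriminator step: argmin_phi (L_fake + L_true)
z = torch.randn(8, 64)
x_fake = g(z).detach()  # freeze theta: only phi is updated here
loss_fake = -torch.log(1 - torch.sigmoid(d(x_fake))).sum()
loss_true = -torch.log(torch.sigmoid(d(x_real))).sum()
loss_d = loss_fake + loss_true
opt_d.zero_grad()
loss_d.backward()
opt_d.step()

# Generator step: argmax_theta L_fake, i.e. minimize -L_fake
z = torch.randn(8, 64)
loss_g = torch.log(1 - torch.sigmoid(d(g(z)))).sum()  # = -L_fake
opt_g.zero_grad()
loss_g.backward()
opt_g.step()
```

As a side note, many implementations replace $-\mathcal{L}_{\text{fake}}$ with the non-saturating form $-\log\sigma(d(g(z), \phi))$, which gives the generator stronger gradients early in training.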