Description
The Inception Score calculation has 3 mistakes.
It uses an outdated Inception network that in fact outputs a 1008-vector of classes (see the following GitHub issue):
It turns out that the 1008 size softmax output is an artifact of dimension back-compatibility with an older, Google-internal system. Newer versions of the inception model have 1001 output classes, where one is an "other" class used in training. You shouldn't need to pay any attention to the extra 8 outputs.
Fix: See link for the new inception Model.
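It calculates the KL divergence directly using logs, which leads to numerical instabilities (can output nan instead of inf). For example, whenever a predicted probability is exactly 0, the term 0 * log(0) evaluates to nan in NumPy. Instead, scipy.stats.entropy should be used.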
kl = part * (np.log(part) - np.log(np.expand_dims(np.mean(part, 0), 0)))
kl = np.mean(np.sum(kl, 1))
Fix: Replace the above with something along the lines of the following:
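from scipy.stats import entropy  # entropy(pk, qk) returns KL(pk || qk)
py = np.mean(part, axis=0)  # marginal class distribution over the split
kl = np.mean([entropy(part[i, :], py) for i in range(part.shape[0])])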
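To illustrate the difference, here is a minimal sketch comparing the direct log formula with scipy.stats.entropy. The `part` array is made-up data chosen to contain exact zeros; it is not taken from the repository.

```python
import numpy as np
from scipy.stats import entropy

# Hypothetical softmax outputs for 3 images over 4 classes (note the exact zeros).
part = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
])
py = np.mean(part, axis=0)  # marginal distribution: [0.5, 0.5, 0.0, 0.0]

# Direct formula from the issue: 0 * log(0) becomes 0 * -inf, which is nan.
with np.errstate(divide="ignore", invalid="ignore"):
    naive = part * (np.log(part) - np.log(np.expand_dims(py, 0)))
    naive_kl = np.mean(np.sum(naive, 1))

# scipy.stats.entropy(pk, qk) computes KL(pk || qk) and treats 0 * log(0) as 0.
stable_kl = np.mean([entropy(part[i, :], py) for i in range(part.shape[0])])

print(naive_kl)   # nan
print(stable_kl)  # finite; equals 2 * log(2) / 3 for this data
```

With this input the direct formula returns nan, while the entropy-based version returns a finite value.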
It calculates the mean of the exponential of the split rather than the exponential of the mean:
Here is the code in inception_score.py which does this:
scores.append(np.exp(kl))
return np.mean(scores), np.std(scores)
This is clearly problematic, as a very simple case shows: for a random variable x ~ Bernoulli(0.5), E[e^x] = 0.5(e^0 + e^1) != e^(0.5(0) + 0.5(1)) = e^(E[x]). The same effect appears in an example with a uniform random variable, where the split-mean overestimates the exponential.
import numpy as np
data = np.random.uniform(low=0., high=15., size=1000)
split_data = np.split(data, 10)
np.mean([np.exp(np.mean(x)) for x in split_data])  # e.g. 1608.25 (varies with the random draw)
np.exp(np.mean(data))  # e.g. 1477.25 (varies with the random draw)
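The direction of the gap is not an accident of this particular draw. As a sketch of the reasoning, assume K equal-sized splits, with m_k the mean of split k and m the mean over all the data (these symbols are introduced here for illustration). Because the exponential is convex, Jensen's inequality gives:

```latex
\frac{1}{K}\sum_{k=1}^{K} e^{m_k}
\;\ge\;
\exp\!\left(\frac{1}{K}\sum_{k=1}^{K} m_k\right)
= e^{m}
```

Equality holds only when every split has the same mean, so the split-mean is never below the exponential of the overall mean.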
Fix: Do not calculate the mean of the exponential of the split, and instead calculate the exponential of the mean of the KL-divergence over all 50,000 inputs.
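Putting the second and third fixes together, a minimal sketch of the corrected computation might look like the following. The function name `inception_score` and its single-array interface are assumptions made for illustration, not the repository's actual API, and the function expects softmax outputs that have already been computed.

```python
import numpy as np
from scipy.stats import entropy


def inception_score(preds):
    """Return exp(mean KL(p(y|x) || p(y))) over all rows of preds.

    preds: (N, C) array of softmax outputs, one row per image.
    Hypothetical helper; no splits are used.
    """
    preds = np.asarray(preds, dtype=np.float64)
    py = np.mean(preds, axis=0)  # marginal class distribution p(y)
    # KL divergence of each conditional distribution from the marginal.
    kls = [entropy(preds[i, :], py) for i in range(preds.shape[0])]
    # Exponential of the mean, not the mean of per-split exponentials.
    return float(np.exp(np.mean(kls)))


# Sanity checks on toy data: confident, evenly spread predictions over
# 4 classes score 4; completely uniform predictions score 1.
confident = np.tile(np.eye(4), (5, 1))
uniform = np.full((10, 4), 0.25)
print(inception_score(confident))
print(inception_score(uniform))
```

This returns a single score rather than a mean and standard deviation over splits. If a spread estimate is still wanted, it would have to be computed separately.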