While metrics like Inception Score (IS) and Fréchet Inception Distance (FID) provide valuable insights by comparing features extracted from pre-trained networks, they have limitations. FID, for instance, models the extracted features (typically from an Inception network) as multivariate Gaussian distributions and compares their means and covariances. This assumption might not always hold, and FID estimates can be biased, especially when calculated on small sets of samples.
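Concretely, FID is the Fréchet distance between the two fitted Gaussians, $\mathcal{N}(\mu_r, \Sigma_r)$ for real features and $\mathcal{N}(\mu_g, \Sigma_g)$ for generated features:

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\left(\Sigma_r \Sigma_g\right)^{1/2}\right)
$$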
To address these issues, the Kernel Inception Distance (KID) offers an alternative approach for comparing the distributions of real and generated data features. Instead of assuming Gaussianity, KID leverages the Maximum Mean Discrepancy (MMD), a non-parametric statistic used in two-sample tests to determine whether two sets of samples originate from the same distribution.
At its core, MMD measures the distance between the mean embeddings of two distributions in a high-dimensional feature space known as a Reproducing Kernel Hilbert Space (RKHS). The intuition is that if two distributions are identical, their mean representations in this space will also be identical. The further apart the mean embeddings, the larger the MMD, indicating greater dissimilarity between the distributions.
The squared MMD between two distributions, $P_r$ (real) and $P_g$ (generated), using a kernel function $k$, is defined as:

$$
\mathrm{MMD}^2(P_r, P_g) = \mathbb{E}_{x, x' \sim P_r}\!\left[k(x, x')\right] - 2\,\mathbb{E}_{x \sim P_r,\, y \sim P_g}\!\left[k(x, y)\right] + \mathbb{E}_{y, y' \sim P_g}\!\left[k(y, y')\right]
$$

Here, $x$ and $x'$ are samples from the real distribution, $y$ and $y'$ are samples from the generated distribution, and $k(a, b)$ is the kernel function evaluating the similarity between samples $a$ and $b$.
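In practice, the expectations are replaced by finite-sample averages. Given $m$ real feature vectors $x_1, \dots, x_m$ and $n$ generated feature vectors $y_1, \dots, y_n$, the standard unbiased estimator is:

$$
\widehat{\mathrm{MMD}}^2 = \frac{1}{m(m-1)} \sum_{i \neq i'} k(x_i, x_{i'}) - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j) + \frac{1}{n(n-1)} \sum_{j \neq j'} k(y_j, y_{j'})
$$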
KID applies the MMD calculation specifically to the features extracted from an Inception network, similar to how FID operates. The typical steps are:

1. Pass the real and generated images through a pre-trained Inception network and collect the feature activations for each set (commonly the 2048-dimensional pool3 features).
2. Compute the unbiased estimate of the squared MMD between the two feature sets. The standard choice is the cubic polynomial kernel $k(x, y) = \left(\frac{1}{d} x^\top y + 1\right)^3$, where $d$ is the feature dimension (sketched below).
3. To reduce variance, repeat the estimate over several randomly drawn subsets of the features and average the results, usually reporting the mean and standard deviation.
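As an illustration, here is a minimal from-scratch sketch of step 2 in NumPy; the function and variable names are illustrative, not a library API:

```python
import numpy as np

def polynomial_kernel(a, b):
    # Default KID kernel: k(a, b) = (a . b / d + 1)^3, with d the feature dimension.
    d = a.shape[1]
    return (a @ b.T / d + 1.0) ** 3

def kid_squared_mmd(real_feats, gen_feats):
    # real_feats: (m, d) array, gen_feats: (n, d) array of Inception features.
    m, n = real_feats.shape[0], gen_feats.shape[0]
    k_rr = polynomial_kernel(real_feats, real_feats)
    k_gg = polynomial_kernel(gen_feats, gen_feats)
    k_rg = polynomial_kernel(real_feats, gen_feats)
    # Drop the diagonal terms so the within-set averages are unbiased.
    term_rr = (k_rr.sum() - np.trace(k_rr)) / (m * (m - 1))
    term_gg = (k_gg.sum() - np.trace(k_gg)) / (n * (n - 1))
    term_rg = k_rg.mean()
    return term_rr + term_gg - 2.0 * term_rg

# Example with random stand-in features:
real = np.random.randn(500, 2048)
fake = np.random.randn(500, 2048)
print(kid_squared_mmd(real, fake))
```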
The final KID value is typically reported as the squared MMD estimate, sometimes multiplied by a scaling factor (e.g., 100). As with FID, lower KID values indicate that the distribution of generated image features is closer to that of real image features, suggesting higher quality and diversity.
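For instance, the torch_fidelity package exposes KID through its calculate_metrics entry point. The sketch below assumes two directories of images; the argument and result-key names reflect recent versions of the package, so check them against your installed version:

```python
import torch_fidelity

# Compute KID between two image folders; the paths are placeholders.
metrics = torch_fidelity.calculate_metrics(
    input1='path/to/real_images',
    input2='path/to/generated_images',
    kid=True,
    kid_subset_size=1000,  # size of each random subset averaged over
)
# Mean and standard deviation of the estimate over subsets.
print(metrics['kernel_inception_distance_mean'])
print(metrics['kernel_inception_distance_std'])
```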
Off-the-shelf implementations like this are available in packages such as torch_fidelity, or the metric can be built from scratch using scientific computing libraries, as sketched earlier.

In summary, KID serves as a powerful distributional metric for evaluating generative models, offering robustness, particularly with limited data, and avoiding the Gaussian assumptions inherent in FID. It provides a complementary perspective on the similarity between the distributions of real and generated data features.