In previous chapters, we examined how Variational Autoencoders (VAEs) learn latent representations of data. A significant goal in representation learning is to discover underlying factors of variation that are not only compact but also meaningful. This chapter addresses disentangled representation learning, an area concerned with training models, particularly VAEs, to learn representations in which individual latent dimensions correspond to distinct, interpretable generative factors in the data. For instance, with images of faces, a disentangled representation might isolate factors such as hair color, pose, or emotion in distinct latent variables.
We will begin by considering different ways to define disentanglement and the associated difficulties in achieving it. You will learn about common metrics for quantifying the degree of disentanglement, such as Mutual Information Gap (MIG), Separated Attribute Predictability (SAP), and Disentanglement, Completeness, and Informativeness (DCI). The chapter will then investigate how the KL divergence term in the VAE objective influences disentanglement, leading to models like β-VAEs. We will also cover techniques designed to improve disentanglement, including FactorVAEs and Total Correlation VAEs (TCVAEs), which more directly address statistical independence in the latent code. Theoretical connections to Information Bottleneck theory and group-theoretic perspectives will be discussed, along with the inherent limitations and identifiability issues in this field. Finally, you'll gain practical experience by training VAEs for disentanglement and evaluating their performance using the established metrics.
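To make the role of the KL term concrete before the detailed treatment later in the chapter, the sketch below shows a minimal NumPy version of the β-VAE objective: the standard ELBO with the KL divergence between a diagonal-Gaussian posterior and a standard normal prior scaled by a factor β. The function name and the scalar `recon_error` argument are illustrative, not from any particular library.

```python
import numpy as np

def beta_vae_loss(recon_error, mu, logvar, beta=4.0):
    """Per-sample beta-VAE objective (to be minimized):
    reconstruction error plus beta * KL(q(z|x) || N(0, I)).
    Setting beta > 1 strengthens the pull toward the factorized
    prior, which is associated with more disentangled latents."""
    # Analytic KL between N(mu, diag(exp(logvar))) and N(0, I),
    # summed over the latent dimensions.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)
    return recon_error + beta * kl

# Example: a 2-dimensional latent code for a single sample
mu = np.array([[0.5, -0.3]])
logvar = np.array([[0.0, 0.0]])   # unit posterior variances
loss = beta_vae_loss(recon_error=1.0, mu=mu, logvar=logvar, beta=4.0)
```

With β = 1 this reduces to the usual VAE ELBO; larger β trades reconstruction quality for a latent code that stays closer to the isotropic prior.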
5.1 Defining Disentanglement: Formulations and Difficulties
5.2 Metrics for Quantifying Disentanglement
5.3 The Influence of KL Regularization on Disentanglement
5.4 Information Bottleneck Theory and VAEs for Disentanglement
5.5 Adversarial Training for Disentanglement
5.6 Group-Theoretic Approaches to Disentanglement
5.7 Identifiability and Limitations in Disentanglement Learning
5.8 Hands-on Practical: Training and Evaluating Disentangled VAEs
© 2025 ApX Machine Learning