After training an autoencoder, we are left with an encoder function that maps high-dimensional input data x to a lower-dimensional latent representation z, and a decoder that attempts to reconstruct x from z. While the reconstruction loss gives us a measure of how well the autoencoder preserves information, it doesn't tell us much about the structure of the latent space itself. Does the space organize the data meaningfully? Are similar inputs mapped to nearby points in the latent space?
Visualizing the latent space can provide valuable intuition about the representations learned by the model. However, latent spaces, while lower-dimensional than the input, are often still too high-dimensional (e.g., 16, 32, 64 dimensions or more) to be plotted directly. We need techniques to project these high-dimensional latent vectors into a 2D or 3D space that we can readily visualize.
Two widely used and effective algorithms for this purpose are t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP). These are non-linear dimensionality reduction techniques specifically designed to preserve the local structure of the data, making them excellent tools for visualizing how our autoencoder has organized the latent representations. It's important to remember that these techniques are primarily for visualization and exploration, not necessarily for creating embeddings for downstream tasks, as they can distort global structures or distances.
t-SNE is a popular technique for visualizing high-dimensional datasets. It works by converting high-dimensional Euclidean distances between data points into conditional probabilities representing similarities. Specifically, the similarity of data point $x_j$ to data point $x_i$ is modeled as the conditional probability $p_{j|i}$ that $x_i$ would pick $x_j$ as its neighbor if neighbors were picked in proportion to their probability density under a Gaussian centered at $x_i$.
These conditionals are symmetrized into joint probabilities $p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}$, where $n$ is the number of data points. In the low-dimensional space (typically 2D), t-SNE defines the joint probability $q_{ij}$ directly, using a Student's t-distribution with one degree of freedom (which is equivalent to a Cauchy distribution): $q_{ij} = \frac{(1 + \|y_i - y_j\|^2)^{-1}}{\sum_{k \neq l} (1 + \|y_k - y_l\|^2)^{-1}}$, where $y_i$ is the low-dimensional point representing $x_i$. The goal of t-SNE is then to find a low-dimensional embedding of the data points that minimizes the Kullback-Leibler (KL) divergence between the joint distributions $P$ and $Q$:

$$KL(P \,\|\, Q) = \sum_{i<j} p_{ij} \log \frac{p_{ij}}{q_{ij}}$$

Minimizing this KL divergence encourages points that are similar (high $p_{ij}$) in the high-dimensional latent space to be represented by points that are close together (high $q_{ij}$) in the low-dimensional map. The use of the heavy-tailed t-distribution in the low-dimensional space helps to alleviate the crowding problem (where points tend to clump together in the center of the map) and allows dissimilar points to be modeled further apart.
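To make the quantities in this objective concrete, here is a minimal NumPy sketch that computes the Gaussian conditionals, the symmetrized joint probabilities, the Student's t similarities, and the resulting KL divergence for two candidate layouts. It is an illustration only: the function names are hypothetical, the data is synthetic, and it assumes a single fixed Gaussian bandwidth `sigma`, whereas t-SNE itself tunes a separate bandwidth per point to match the chosen perplexity.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by round-off

def high_dim_affinities(X, sigma=1.0):
    """Symmetrized p_ij built from Gaussian conditionals p_{j|i}.

    Assumption: one fixed sigma for all points (real t-SNE tunes sigma_i
    per point via the perplexity).
    """
    n = X.shape[0]
    logits = -pairwise_sq_dists(X) / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)             # a point is not its own neighbor
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    cond = np.exp(logits)
    cond /= cond.sum(axis=1, keepdims=True)       # row i holds p_{j|i}
    return (cond + cond.T) / (2.0 * n)

def low_dim_affinities(Y):
    """Joint q_ij from a Student's t kernel with one degree of freedom."""
    kernel = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(kernel, 0.0)
    return kernel / kernel.sum()

def kl_divergence(P, Q, eps=1e-12):
    """Sum of p_ij * log(p_ij / q_ij) over pairs i < j."""
    upper = np.triu_indices_from(P, k=1)
    p, q = P[upper], Q[upper]
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Synthetic "latent vectors": two well-separated blobs in 16 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 16)), rng.normal(6.0, 1.0, (20, 16))])

# Layout A keeps the two blobs apart; layout B assigns positions so that
# each blob is split across both locations.
Y_good = X[:, :2]
scramble = np.r_[0:40:2, 1:40:2]
Y_bad = Y_good[scramble]

P = high_dim_affinities(X, sigma=2.0)
Q_good = low_dim_affinities(Y_good)
kl_good = kl_divergence(P, Q_good)
kl_bad = kl_divergence(P, low_dim_affinities(Y_bad))
print(f"KL for cluster-preserving layout: {kl_good:.3f}")
print(f"KL for scrambled layout:          {kl_bad:.3f}")
```

The layout that keeps high-dimensional neighbors together yields the lower divergence, which is exactly the quantity t-SNE reduces by gradient descent on the low-dimensional coordinates.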
Key Considerations for t-SNE:
Using t-SNE for Latent Space Visualization:
Pass your data through the trained encoder to collect the latent vectors, then apply t-SNE (e.g., using `sklearn.manifold.TSNE` in Python) to this collection of latent vectors, specifying `n_components=2`.
Figure: A hypothetical 2D t-SNE visualization of a latent space for data with four distinct classes, indicated by color. Note the separation into distinct clusters.
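As a rough sketch of that step, the snippet below runs `sklearn.manifold.TSNE` on placeholder data. The `latent_vectors` array here is synthetic (four Gaussian blobs standing in for real encoder outputs), and the `perplexity` and `init` settings are illustrative choices rather than recommendations.

```python
import numpy as np
from sklearn.manifold import TSNE

# Placeholder latent vectors: four Gaussian blobs in a 16-dimensional space.
# In practice, replace this with the outputs of your trained encoder.
rng = np.random.default_rng(42)
n_per_class, latent_dim = 30, 16
centers = rng.uniform(-10, 10, size=(4, latent_dim))
latent_vectors = np.vstack(
    [rng.normal(loc=c, scale=1.5, size=(n_per_class, latent_dim)) for c in centers]
)
labels = np.repeat(np.arange(4), n_per_class)

# Project to 2D. perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=42)
embedding_tsne = tsne.fit_transform(latent_vectors)

print(embedding_tsne.shape)  # one 2D point per latent vector
```

The two columns of `embedding_tsne` can then be drawn as a scatter plot, colored by `labels`, with whichever plotting library you prefer.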
UMAP is a more recent dimensionality reduction technique that has gained significant popularity, often producing higher-quality visualizations than t-SNE while also being computationally faster. It is grounded in Riemannian geometry and algebraic topology.
The core idea of UMAP involves two main steps:
UMAP tends to be better at preserving the global structure of the data compared to t-SNE, meaning the relative positioning of clusters might be more meaningful. It's also generally faster and scales better to larger datasets.
Key Considerations for UMAP:
`n_neighbors`: Similar to perplexity in t-SNE, this parameter controls how UMAP balances local versus global structure. Smaller values focus more on local structure, while larger values incorporate more global information. Typical values range from 5 to 50.
`min_dist`: This parameter controls the minimum distance between points in the low-dimensional embedding. Lower values result in tighter clusters, while higher values allow points to spread out more. It primarily affects the visual density of the embedding.
Using UMAP for Latent Space Visualization:
The process is analogous to using t-SNE: collect the latent vectors from the encoder, then apply UMAP (e.g., using the `umap-learn` library in Python) to the latent vectors, specifying `n_components=2`. Ensure you install the library first (`pip install umap-learn`). You might also need `pynndescent` for performance.

```python
# Example using umap-learn (conceptual)
# 'latent_vectors' is a NumPy array of shape (n_samples, latent_dim)
# and 'labels' is an array of corresponding class labels.
import numpy as np
import plotly.graph_objects as go
import umap

# Placeholder data for demonstration: four Gaussian blobs in latent space.
np.random.seed(42)
n_samples_per_class = 30
latent_dim = 16
centers = np.random.rand(4, latent_dim) * 20 - 10
latent_vectors = np.vstack([
    np.random.randn(n_samples_per_class, latent_dim) * 1.5 + centers[i]
    for i in range(4)
])
labels = np.repeat(np.arange(4), n_samples_per_class)

# Project the latent vectors to 2D.
reducer = umap.UMAP(n_neighbors=15,
                    min_dist=0.1,
                    n_components=2,
                    random_state=42)
embedding = reducer.fit_transform(latent_vectors)

# Plot the embedding, one trace per class so each class gets its own color.
fig = go.Figure()
for cls in np.unique(labels):
    mask = labels == cls
    fig.add_trace(go.Scatter(x=embedding[mask, 0], y=embedding[mask, 1],
                             mode="markers", name=f"Class {cls}"))
fig.update_layout(title="UMAP projection of latent vectors",
                  xaxis_title="UMAP 1", yaxis_title="UMAP 2")
fig.show()
```

A hypothetical 2D UMAP visualization of the same latent space shown previously. Compare the cluster shapes and relative positions to the t-SNE plot. UMAP often preserves more global structure.
While t-SNE and UMAP are powerful tools, interpretation requires care:
Run the projection with several different `perplexity` or `n_neighbors` values. Stable clusters that appear across different settings are more likely to represent real structure in the data.
By applying t-SNE and UMAP to the latent vectors generated by your autoencoder's encoder, you gain a visual window into the structure of the learned representations. This helps you understand how the model organizes information and whether it successfully captures the underlying factors of variation in the data, paving the way for the more advanced analysis and manipulation discussed next.
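One way to act on the advice about comparing several settings is to put a number next to each projection. The sketch below runs t-SNE at a few perplexity values on synthetic placeholder data and scores each result with scikit-learn's `trustworthiness`, which measures how well local neighborhoods from the original space survive in the embedding (values near 1 are better). This metric is a supplement introduced here for illustration, not part of t-SNE or UMAP, and the perplexity values and `n_neighbors=10` are arbitrary choices.

```python
import numpy as np
from sklearn.manifold import TSNE, trustworthiness

# Synthetic stand-in for encoder outputs: four blobs in 16 dimensions.
rng = np.random.default_rng(0)
centers = rng.uniform(-10, 10, size=(4, 16))
latent_vectors = np.vstack(
    [rng.normal(loc=c, scale=1.5, size=(30, 16)) for c in centers]
)

scores = {}
for perplexity in (5, 15, 30):
    embedding = TSNE(
        n_components=2, perplexity=perplexity, init="pca", random_state=0
    ).fit_transform(latent_vectors)
    # Fraction-like score in [0, 1]: how well 10-nearest-neighbor
    # relationships from the latent space are kept in the 2D map.
    scores[perplexity] = trustworthiness(latent_vectors, embedding, n_neighbors=10)
    print(f"perplexity={perplexity:>2}: trustworthiness={scores[perplexity]:.3f}")
```

A score only summarizes local neighborhood preservation, so it is best read alongside the plots themselves rather than in place of them.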
© 2025 ApX Machine Learning