Unsupervised learning forms the bedrock upon which many representation learning techniques, including autoencoders, are built. Unlike supervised learning, which relies on labeled data (input-output pairs) to train models for prediction or classification, unsupervised learning algorithms work with datasets containing only input features, X, without corresponding target variables, y. The primary objective is to discover inherent structures, patterns, or relationships within the data itself.
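The distinction is easiest to see in code. The following sketch uses scikit-learn (an assumed library choice; the random data and specific models are purely illustrative) to contrast a supervised fit, which requires labels y, with an unsupervised fit that receives only X.

```python
# Minimal contrast between supervised and unsupervised fitting.
# Data and model choices here are illustrative, not prescribed by the text.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X = np.random.rand(100, 5)           # input features only
y = np.random.randint(0, 2, 100)     # labels, available only in the supervised case

# Supervised: the model needs both X and y.
clf = LogisticRegression().fit(X, y)
print(clf.predict(X[:5]))            # predictions for new inputs

# Unsupervised: the model sees only X and must find structure on its own.
km = KMeans(n_clusters=3, n_init=10).fit(X)
print(km.labels_[:10])               # cluster assignments discovered from X alone
```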
The overarching goal is to model the underlying structure or distribution of the data. This can manifest in several ways, such as grouping similar examples into clusters, reducing the dimensionality of the data, or estimating its underlying distribution.
Unsupervised learning algorithms process unlabeled input data to uncover underlying patterns, such as clusters or manifolds, resulting in a structured representation.
Autoencoders are fundamentally unsupervised neural networks. They learn by attempting to reproduce their own input. The core components are an encoder that maps the input x to a latent representation z, and a decoder that reconstructs an approximation $\hat{x}$ of the input from z. The model is trained to minimize a reconstruction loss that measures the difference between x and $\hat{x}$, such as the Mean Squared Error:
$$L(x, \hat{x}) = \lVert x - \hat{x} \rVert^2$$

Because the network is trained only on the input data x without any associated labels y, it operates entirely within the unsupervised paradigm. The key component is the bottleneck layer, where the latent representation z typically has a lower dimension than the input x. This forces the encoder to learn a compressed representation that captures the most important variations and structures in the data distribution, so that the decoder can reconstruct the input accurately. This compression-and-reconstruction process is a form of self-supervised learning, a subset of unsupervised learning in which the supervision signal is derived from the data itself.
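As a concrete illustration, here is a minimal autoencoder sketch in PyTorch (an assumed framework choice; the layer sizes, the 32-dimensional bottleneck, and the random batch are purely illustrative). It shows the encoder, the decoder, and a single training step that minimizes the Mean Squared Error between x and its reconstruction, with no labels involved anywhere.

```python
# A minimal autoencoder sketch in PyTorch (assumed framework; sizes are illustrative).
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: maps x to the lower-dimensional latent code z (the bottleneck).
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder: reconstructs x_hat from z.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)

model = Autoencoder()
criterion = nn.MSELoss()                       # reconstruction loss ||x - x_hat||^2
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(64, 784)                        # a batch of unlabeled inputs
optimizer.zero_grad()
x_hat = model(x)
loss = criterion(x_hat, x)                     # no labels y appear anywhere
loss.backward()
optimizer.step()
print(loss.item())
```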
The latent representation z is the representation the autoencoder learns. Its quality determines how well the autoencoder performs its primary task (reconstruction) and how useful z is for downstream tasks such as classification, generation, or anomaly detection. Effective unsupervised learning therefore aims to find representations that are compact, informative, and useful for these downstream tasks.
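To make these downstream uses concrete, the sketch below builds on the hypothetical Autoencoder class from the previous example (assumed to be already trained; the anomaly threshold is illustrative). It extracts z as features for a downstream model and flags anomalies by reconstruction error.

```python
# Two common downstream uses of a trained autoencoder
# (reuses the Autoencoder class from the previous sketch; threshold is illustrative).
import torch

model = Autoencoder()                 # assume this model has already been trained
model.eval()

x = torch.rand(16, 784)
with torch.no_grad():
    z = model.encoder(x)              # compact features for a downstream classifier
    x_hat = model(x)

# Anomaly detection: inputs the autoencoder reconstructs poorly are flagged.
errors = ((x - x_hat) ** 2).mean(dim=1)
threshold = 0.1                        # illustrative; usually chosen from validation data
anomalies = errors > threshold
print(z.shape, anomalies.sum().item())
```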
As mentioned in the chapter introduction, traditional linear methods like PCA also perform unsupervised dimensionality reduction. However, they assume the data lies on or near a linear subspace. Many real-world datasets, such as images or natural language, exhibit complex, non-linear structure (manifolds). Deep-learning-based unsupervised methods, such as autoencoders, are far more flexible and can learn these intricate non-linear representations. Understanding the basic principles of unsupervised learning (learning from unlabeled data, identifying structure, and finding informative representations) provides the necessary context for appreciating the design and function of the autoencoder architectures we will examine throughout this course.
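For comparison, a PCA baseline takes only a few lines with scikit-learn (an assumed library choice; the random data and component count are illustrative). Both the projection and the reconstruction are linear maps, which is exactly the limitation that non-linear autoencoders address.

```python
# A brief PCA sketch for comparison: a purely linear projection and reconstruction,
# unlike the non-linear mappings an autoencoder can learn.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(200, 50)                # unlabeled data

pca = PCA(n_components=10)
Z = pca.fit_transform(X)                   # linear "encoding" into 10 dimensions
X_hat = pca.inverse_transform(Z)           # linear reconstruction

mse = np.mean((X - X_hat) ** 2)
print(Z.shape, mse)                        # PCA captures only linear structure
```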