Let's translate the theoretical components of the classic autoencoder discussed earlier into a practical implementation. We will build a simple, fully-connected autoencoder using TensorFlow/Keras and train it on the MNIST dataset of handwritten digits. This hands-on example will solidify your understanding of how the encoder, bottleneck, and decoder work together to learn a compressed representation of the data.
Our goal is to train a network that takes a 784-dimensional vector (a flattened 28x28 MNIST image) as input x, encodes it into a much lower-dimensional latent representation z, and then decodes z back into a 784-dimensional vector x̂ that closely resembles the original input x.
First, we need to import the necessary libraries and load the MNIST dataset. We'll normalize the pixel values to the range [0, 1], a standard practice for image data that helps stabilize training. We will also flatten the 28x28 images into vectors of size 784.
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import mnist
# Load the MNIST dataset
(x_train, _), (x_test, _) = mnist.load_data() # We only need the images, not the labels
# Normalize pixel values to [0, 1] and flatten images
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(f"Training data shape: {x_train.shape}")
print(f"Test data shape: {x_test.shape}")
Training data shape: (60000, 784)
Test data shape: (10000, 784)
Keras lets us build models with either the Functional API or the Sequential API. Because this autoencoder is a plain stack of layers, the Sequential API is sufficient.
# Define input shape and latent dimension
input_dim = 784
latent_dim = 32
# --- Encoder ---
encoder = keras.Sequential(
    [
        keras.Input(shape=(input_dim,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(latent_dim, activation="relu", name="bottleneck"),  # Bottleneck layer
    ],
    name="encoder",
)
# --- Decoder ---
decoder = keras.Sequential(
    [
        keras.Input(shape=(latent_dim,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(128, activation="relu"),
        layers.Dense(input_dim, activation="sigmoid"),  # Output layer, values in [0, 1]
    ],
    name="decoder",
)
# --- Autoencoder (Encoder + Decoder) ---
autoencoder = keras.Sequential(
    [
        encoder,
        decoder,
    ],
    name="autoencoder",
)
# Display model summaries
encoder.summary()
decoder.summary()
autoencoder.summary()
Here's a structural representation of our simple autoencoder:
A simple fully-connected autoencoder architecture. The encoder maps the 784-dimensional input to a 32-dimensional bottleneck, and the decoder reconstructs the 784-dimensional output.
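As a rough cross-check on the model summaries, the number of trainable parameters can be worked out by hand. The sketch below assumes each `Dense` layer keeps its default bias term, so a layer with `n_in` inputs and `n_out` outputs holds `n_in * n_out + n_out` parameters. The helper names `dense_params` and `mlp_params` are illustrative and not part of Keras.

```python
# Parameters of one Dense layer: one weight per (input, output) pair plus one bias per output.
def dense_params(n_in: int, n_out: int) -> int:
    return n_in * n_out + n_out


def mlp_params(layer_sizes):
    """Total parameters of a stack of Dense layers with the given widths."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))


encoder_sizes = [784, 128, 64, 32]  # input -> bottleneck, matching the encoder above
decoder_sizes = [32, 64, 128, 784]  # bottleneck -> output, matching the decoder above

encoder_total = mlp_params(encoder_sizes)
decoder_total = mlp_params(decoder_sizes)

print(encoder_total)                  # 110816
print(decoder_total)                  # 111568
print(encoder_total + decoder_total)  # 222384
```

If the totals printed by `summary()` differ from these numbers, the layer widths or the bias setting probably differ from the ones assumed here.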
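Before training, we need to compile the autoencoder model by specifying the optimizer and the loss function. Here we use the Adam optimizer and the mean squared error (MSE) between each input and its reconstruction as the reconstruction loss. We train the model to minimize this loss, using the input data x_train as both the input and the target output.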
# Compile the autoencoder
autoencoder.compile(optimizer='adam', loss='mse') # Using Mean Squared Error loss
# Train the autoencoder
epochs = 20
batch_size = 256
history = autoencoder.fit(x_train, x_train,  # Input and target are the same
                          epochs=epochs,
                          batch_size=batch_size,
                          shuffle=True,
                          validation_data=(x_test, x_test))  # Evaluate reconstruction on test set
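To make the `'mse'` loss concrete, here is a minimal NumPy sketch of the quantity being minimized: the squared difference between input and reconstruction, averaged over the features of each sample and then over the batch. The function name `mse_reconstruction_loss` and the tiny 4-feature arrays are made up for illustration. They stand in for real 784-dimensional images.

```python
import numpy as np


def mse_reconstruction_loss(x, x_hat):
    """Mean squared error between inputs and reconstructions."""
    per_sample = np.mean((x - x_hat) ** 2, axis=1)  # average over features
    return float(np.mean(per_sample))               # average over the batch


# Toy batch of two "images" with four pixels each (illustrative values only)
x = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 1.0, 0.0, 0.0]], dtype=np.float32)
x_hat = np.array([[0.0, 0.5, 0.5, 0.0],
                  [1.0, 1.0, 0.0, 1.0]], dtype=np.float32)

loss = mse_reconstruction_loss(x, x_hat)
print(loss)  # 0.15625
```

The first sample contributes 0.0625 and the second 0.25, so the batch loss is their mean, 0.15625.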
During training, Keras will output the loss on the training set and the validation set for each epoch. We expect to see the loss decrease over time, indicating that the model is learning to reconstruct the input images more accurately.
We can visualize the training progress by plotting the loss curves:
Training and validation loss (MSE, logarithmic scale) over 20 epochs for the simple autoencoder on MNIST. Both losses decrease steadily, indicating successful learning.
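One way to produce such a plot is sketched below. It assumes a dictionary with `"loss"` and `"val_loss"` entries, which is what `history.history` contains when `fit` is called with `validation_data`. The helper name `plot_loss_curves` is hypothetical, and the styling is a guess rather than the exact figure described above.

```python
import matplotlib.pyplot as plt


def plot_loss_curves(history_dict):
    """Plot training and validation loss per epoch on a log-scaled y-axis."""
    epochs_range = range(1, len(history_dict["loss"]) + 1)
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(epochs_range, history_dict["loss"], label="Training loss")
    ax.plot(epochs_range, history_dict["val_loss"], label="Validation loss")
    ax.set_yscale("log")  # logarithmic scale, as in the caption above
    ax.set_xlabel("Epoch")
    ax.set_ylabel("MSE loss")
    ax.set_title("Autoencoder training and validation loss")
    ax.legend()
    return fig, ax
```

After training, calling `plot_loss_curves(history.history)` followed by `plt.show()` should display both curves.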
After training, we can evaluate the autoencoder's performance by visually comparing original test images with their reconstructions. We use the trained autoencoder model to predict the reconstructions for the test set x_test.
# Predict reconstructions for the test set
reconstructed_imgs = autoencoder.predict(x_test)
# --- Visualization ---
n = 10 # Number of digits to display
plt.figure(figsize=(20, 4))
for i in range(n):
    # Display original images
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    if i == 0:
        ax.set_title("Original", loc='left', fontsize=12, pad=10)
    # Display reconstructed images
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(reconstructed_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    if i == 0:
        ax.set_title("Reconstructed", loc='left', fontsize=12, pad=10)
plt.suptitle("Original vs. Reconstructed MNIST Digits", fontsize=16)
plt.show()
You should observe that the reconstructed digits are recognizable but slightly blurry compared to the originals. This loss of detail is expected because the information had to be compressed through the 32-dimensional bottleneck layer. The autoencoder learned to retain the most salient features necessary for reconstruction while discarding some finer details or noise.
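Visual inspection can be complemented with a per-image error, which shows which digits the model reconstructs worst. The sketch below is an assumption about how one might do this, not part of the original walkthrough. The helper `per_image_mse` is hypothetical, and the small arrays are placeholders for `x_test` and `reconstructed_imgs`.

```python
import numpy as np


def per_image_mse(originals, reconstructions):
    """Return one mean squared error value per flattened image (row)."""
    return np.mean((originals - reconstructions) ** 2, axis=1)


# Placeholder data: three "images" with four pixels each (illustrative values only)
originals = np.array([[0.0, 1.0, 1.0, 0.0],
                      [1.0, 0.0, 0.0, 1.0],
                      [0.5, 0.5, 0.5, 0.5]])
reconstructions = np.array([[0.0, 1.0, 1.0, 0.0],   # perfect reconstruction
                            [0.5, 0.0, 0.0, 0.5],   # two pixels off by 0.5
                            [0.5, 0.5, 0.5, 1.0]])  # one pixel off by 0.5

errors = per_image_mse(originals, reconstructions)
worst_first = np.argsort(errors)[::-1]  # indices sorted from highest to lowest error

print(errors.tolist())       # [0.0, 0.125, 0.0625]
print(worst_first.tolist())  # [1, 2, 0]
```

With the real data, `per_image_mse(x_test, reconstructed_imgs)` would give one error per test image, and the indices in `worst_first` could be passed to the plotting loop above to view the hardest cases.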
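In this practical section, we implemented and trained a basic fully-connected autoencoder on the MNIST dataset. We covered loading and preprocessing the data, defining the encoder and decoder, compiling and training the model with a reconstruction loss, and visually comparing original digits with their reconstructions.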
This example demonstrates the core functionality of an autoencoder: learning a compressed representation (encoding) and reconstructing the input from that representation (decoding). While effective for simple data, this basic architecture has limitations, particularly concerning overfitting and the structure of the learned latent space. In the next chapter, we will explore regularized autoencoders designed to address these issues and learn more robust representations.
© 2025 ApX Machine Learning