Keras provides several ways to define model architectures. The simplest is the Sequential API, designed for models built as a linear stack of layers. Think of it like building with LEGO® bricks: each layer connects directly to the previous one, and the output of one layer becomes the input to the next. This approach is intuitive and sufficient for many common network types, particularly feedforward networks such as standard Multi-Layer Perceptrons (MLPs) and many Convolutional Neural Networks (CNNs).
If your model follows a straightforward path from input to output without branching, merging, or multiple inputs/outputs, the Sequential model is often the most concise way to define it.
You can create a Sequential model by passing a list of layer instances to its constructor. Keras needs to know the shape of the input data the model should expect. You specify this only once, at the start of the model; each subsequent layer infers the shape of its input from the output shape of the preceding layer.
Let's build a simple example: a model with two fully connected (Dense) layers. Assume we're building a classifier for input vectors with 784 features (for example, flattened 28x28 pixel images).
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Define the model
model = keras.Sequential(
    [
        keras.Input(shape=(784,), name="input_layer"),  # Define input shape explicitly
        layers.Dense(64, activation="relu", name="dense_layer_1"),
        layers.Dense(10, activation="softmax", name="output_layer"),  # Output layer for 10 classes
    ],
    name="my_simple_mlp",
)

# Display the model's architecture
model.summary()
In this example:
- The model is an instance of keras.Sequential, which receives the layers as an ordered list.
- keras.Input(shape=(784,)): This isn't technically a layer, but it defines the expected input shape. Providing an Input object is the recommended way to specify the input shape for a Sequential model, because it makes the model's structure explicit from the start. The shape (784,) indicates that each input sample is a flat vector of 784 elements.
- layers.Dense(64, activation="relu"): The first hidden layer is a fully connected layer with 64 units that uses the Rectified Linear Unit (ReLU) activation function. Keras infers its input shape (784 features) from the preceding Input object.
- layers.Dense(10, activation="softmax"): The output layer has 10 units (one for each potential class in a classification problem) and uses the softmax activation function to produce probability-like outputs for each class. Keras infers its input shape (64 units) from the previous Dense layer.
- model.summary() prints a summary of the model.

The output of model.summary() details each layer, including its type, output shape, and number of trainable parameters, which makes it an excellent tool for verifying your architecture:
Model: "my_simple_mlp"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense_layer_1 (Dense)       (None, 64)                50240
 output_layer (Dense)        (None, 10)                650
=================================================================
Total params: 50,890
Trainable params: 50,890
Non-trainable params: 0
_________________________________________________________________
The None in the output shape represents the batch size, which is typically flexible and not fixed when the model architecture is defined.
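The parameter counts in the summary and the flexible batch dimension can both be checked directly. The following is a minimal sketch, assuming the same 784-64-10 architecture as above; the all-zero arrays are placeholder inputs invented for illustration, not real data.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Same architecture as the example above (layer names omitted for brevity).
model = keras.Sequential(
    [
        keras.Input(shape=(784,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ]
)

# A Dense layer holds (inputs * units) weights plus one bias per unit.
hidden_params = 784 * 64 + 64  # 50240
output_params = 64 * 10 + 10   # 650
assert model.count_params() == hidden_params + output_params  # 50890

# The batch dimension is left open, so batches of different sizes are accepted.
# These zero-filled arrays are placeholders, not real data.
small_batch = np.zeros((1, 784), dtype="float32")
large_batch = np.zeros((32, 784), dtype="float32")
small_out = model(small_batch)
large_out = model(large_batch)
print(small_out.shape)  # (1, 10)
print(large_out.shape)  # (32, 10)
```

Only the first dimension of the output changes with the input; the second is fixed by the 10 units of the output layer.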
Adding layers with add()
Alternatively, you can initialize an empty Sequential model and add layers incrementally using the .add() method. This can be more readable when program logic determines which layers to add.
Here's the same model built using .add():
# Alternative way: Initialize and add layers
model_added = keras.Sequential(name="my_simple_mlp_added")
model_added.add(keras.Input(shape=(784,), name="input_layer")) # Define input shape
model_added.add(layers.Dense(64, activation="relu", name="dense_layer_1"))
model_added.add(layers.Dense(10, activation="softmax", name="output_layer"))
# Display the model's architecture
model_added.summary()
This produces an identical model structure to the previous method.
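The case where logic decides which layers to add can be sketched with a small helper. This is an illustrative sketch only: build_mlp and its parameters (hidden_units, num_classes, input_dim) are hypothetical names introduced here, not part of Keras.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_mlp(hidden_units, num_classes=10, input_dim=784):
    """Hypothetical helper: one ReLU Dense layer per entry in hidden_units."""
    model = keras.Sequential(name="configurable_mlp")
    model.add(keras.Input(shape=(input_dim,)))
    # The number of hidden layers is decided at run time by the list length.
    for i, units in enumerate(hidden_units, start=1):
        model.add(layers.Dense(units, activation="relu", name=f"dense_layer_{i}"))
    model.add(layers.Dense(num_classes, activation="softmax", name="output_layer"))
    return model


mlp = build_mlp([64])              # same layer sizes as the example above
deeper_mlp = build_mlp([128, 64])  # two hidden layers instead of one
deeper_mlp.summary()
```

Calling build_mlp([64]) yields the same layer sizes as the hand-written model, while a longer list adds more hidden layers without changing the construction code.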
The Sequential API's strength is its simplicity, but this comes with limitations. It's not suitable for models that branch, merge, or have multiple inputs or outputs.
For these more complex architectures, TensorFlow offers the Functional API, which we will cover in a later section. However, for many standard tasks, the Sequential model provides a clean and efficient way to define your network.
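As a brief preview of the kind of architecture that falls outside the Sequential API, here is a minimal sketch of a two-input model written with the Functional API. The second input (extra_features, with 8 values per sample) is a hypothetical addition made up for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Two separate inputs: a linear stack of layers cannot express this.
image_features = keras.Input(shape=(784,), name="image_features")
extra_features = keras.Input(shape=(8,), name="extra_features")  # hypothetical input

# One branch processes the image vector, then the two paths are merged.
hidden = layers.Dense(64, activation="relu")(image_features)
merged = layers.Concatenate()([hidden, extra_features])
outputs = layers.Dense(10, activation="softmax")(merged)

two_input_model = keras.Model(
    inputs=[image_features, extra_features],
    outputs=outputs,
    name="two_input_sketch",
)
two_input_model.summary()
```

Because the merge step takes two tensors at once, there is no single chain of layers to hand to keras.Sequential.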
In the next sections, we'll look more closely at the common types of layers and activation functions you can stack within models like these.
© 2025 ApX Machine Learning