While the Sequential API provides a straightforward way to define models as a linear pipeline of layers, many real-world applications require more intricate network designs. Models might need multiple inputs (like an image and its metadata), produce multiple outputs (like classifying an object and estimating its bounding box), or involve layers that share information across different parts of the network (like Siamese networks or models with residual connections). For these scenarios, Keras offers the Functional API.
The Functional API treats layers as functions that you can call on tensors. It allows you to build models as directed acyclic graphs (DAGs) of layers. This graph-based approach provides significant flexibility compared to the Sequential model's linear stack.
Think of the Functional API like connecting LEGO bricks, but where each brick is a layer and the connections are tensors flowing between them. You start with an input tensor, pass it through a layer (which is like calling a function), get an output tensor, pass that to another layer, and so on, until you define your final output tensor(s).
Unlike the Sequential model, where the input shape is often inferred or specified in the first layer, the Functional API requires you to explicitly define the starting point of your graph using keras.Input. This creates a symbolic tensor-like object that contains information about the shape and data type of the input your model expects.
import keras
from keras import layers
# Define an input expecting 28x28 grayscale images (flattened)
# The shape tuple excludes the batch dimension, which Keras leaves as None (variable).
# For a flat vector like MNIST (784 pixels), the shape is (784,).
input_tensor = keras.Input(shape=(784,), name='image_input')
Here, input_tensor isn't actual data yet. It's a specification for the kind of data the model will receive. Giving inputs meaningful names (like image_input) is good practice, especially for complex models.
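As a quick check, the object returned by keras.Input is purely symbolic: it carries a shape and dtype but no data, and the batch dimension shows up as None:

```python
import keras

# Symbolic input: a specification of shape and dtype, holding no data.
inp = keras.Input(shape=(784,), name='image_input')

# The leading None is the (variable) batch dimension Keras adds for you.
print(tuple(inp.shape))  # (None, 784)
```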
Once you have an input tensor, you can connect layers by calling the layer instance on the tensor. The layer returns a new tensor, which you can then pass to the next layer.
Let's build a simple multi-layer perceptron (MLP), similar to what you might build with Sequential, but using the Functional API:
# Start with our input tensor
inputs = keras.Input(shape=(784,), name='img_input')
# First Dense layer: call the layer on the input tensor
x = layers.Dense(64, activation='relu', name='dense_layer_1')(inputs)
# Second Dense layer: call the layer on the output of the previous layer ('x')
x = layers.Dense(64, activation='relu', name='dense_layer_2')(x)
# Output layer: call the layer on the output of the second Dense layer
outputs = layers.Dense(10, activation='softmax', name='predictions')(x)
Notice the pattern: output_tensor = Layer(...)(input_tensor). The variable x is reused here to represent the tensor flowing through the network, being transformed by each layer.
After defining the graph of layers from inputs to outputs, you instantiate a keras.Model object. You need to tell the Model constructor which tensor(s) represent the input(s) and which represent the output(s) of your network graph.
# Create the Model by specifying the input and output tensors
model = keras.Model(inputs=inputs, outputs=outputs, name='simple_mlp_functional')
# Now you can inspect the model architecture
model.summary()
Running model.summary() will produce output similar to that of a Sequential model, showing the layers, output shapes, and parameter counts. However, behind the scenes, Keras has constructed a graph representation.
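One way to confirm the graph works end to end is to call the finished model like a function on a dummy batch. This sketch rebuilds the MLP above and runs a forward pass (the layer names and the random batch are illustrative):

```python
import numpy as np
import keras
from keras import layers

inputs = keras.Input(shape=(784,), name='img_input')
x = layers.Dense(64, activation='relu', name='dense_layer_1')(inputs)
x = layers.Dense(64, activation='relu', name='dense_layer_2')(x)
outputs = layers.Dense(10, activation='softmax', name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=outputs, name='simple_mlp_functional')

# The model itself is callable, just like a layer: feed a dummy batch of 32.
batch = np.random.rand(32, 784).astype('float32')
preds = model(batch)
print(tuple(preds.shape))  # (32, 10); each row is a softmax distribution
```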
The real power of the Functional API becomes apparent when you need architectures beyond a simple linear sequence:
Multi-Input Models: You can define multiple keras.Input tensors and feed them into different branches of your network, eventually merging them.
Diagram: a model with two distinct inputs processed through separate branches before being merged for final prediction.
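A minimal sketch of such a two-input model, assuming a flattened image plus a small metadata vector (the names, sizes, and branch depths here are illustrative):

```python
import numpy as np
import keras
from keras import layers

# Two separate symbolic inputs.
image_in = keras.Input(shape=(784,), name='image')
meta_in = keras.Input(shape=(10,), name='metadata')

# Each input flows through its own branch...
x = layers.Dense(64, activation='relu')(image_in)
m = layers.Dense(16, activation='relu')(meta_in)

# ...then the branches are merged for the final prediction.
merged = layers.Concatenate()([x, m])
out = layers.Dense(5, activation='softmax', name='prediction')(merged)

model = keras.Model(inputs=[image_in, meta_in], outputs=out)

# Calling the model takes one array per input, in the same order.
preds = model([np.random.rand(4, 784).astype('float32'),
               np.random.rand(4, 10).astype('float32')])
```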
Multi-Output Models: A model can produce multiple outputs by specifying a list or dictionary of output tensors when creating the keras.Model. This is useful for tasks like multi-label classification or joint regression and classification.
Shared Layers: A single layer instance can be called multiple times on different tensors. The layer instance maintains a single set of weights, which are updated based on all the places it's used. This is fundamental for models like Siamese networks that compare two inputs using the same processing branch.
Diagram: a single layer instance (Shared Layer) applied to multiple inputs, producing separate outputs that might be used independently or combined later.
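A minimal sketch of weight sharing: one Dense instance is called on two different inputs, so both branches use the same parameters, and identical inputs yield identical outputs:

```python
import numpy as np
import keras
from keras import layers

# One Dense instance, reused on two inputs: both calls share its weights.
shared = layers.Dense(32, activation='relu', name='shared_dense')
a_in = keras.Input(shape=(64,), name='left')
b_in = keras.Input(shape=(64,), name='right')

out_left = shared(a_in)
out_right = shared(b_in)

model = keras.Model(inputs=[a_in, b_in], outputs=[out_left, out_right])

# Feeding the same array through both branches gives identical results,
# since there is only a single set of weights.
x = np.random.rand(1, 64).astype('float32')
out_a, out_b = model([x, x])
```

Note that the model's parameter count reflects only one Dense layer (64 × 32 weights plus 32 biases), even though the layer appears in two branches of the graph.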
Non-Linear Topologies: You can create complex graphs, such as networks with residual connections (where the output of a layer is added back to its input) or inception-style modules with parallel convolutional branches.
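A small residual-connection sketch using layers.Add(); the two tensors being added must have matching shapes, which is why the Dense layer below keeps the same width as its input:

```python
import numpy as np
import keras
from keras import layers

inputs = keras.Input(shape=(64,))
x = layers.Dense(64, activation='relu')(inputs)

# Residual connection: add the block's input back to its output.
residual = layers.Add()([inputs, x])

outputs = layers.Dense(10, activation='softmax')(residual)
model = keras.Model(inputs=inputs, outputs=outputs)

preds = model(np.random.rand(3, 64).astype('float32'))
```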
While the Sequential API is convenient for simple, linear models, the Functional API is the tool of choice for defining the majority of sophisticated deep learning architectures used today. Mastering it opens the door to implementing a much wider range of network designs. In the following sections and chapters, you will see the Functional API used extensively, particularly when building more complex models like CNNs and RNNs with specific structural requirements.
© 2025 ApX Machine Learning