When constructing a neural network, one of the first practical details you encounter is telling the model what kind of data to expect. Keras needs to know the dimensionality of the input data before it can create the necessary parameters (weights and biases) for its layers, particularly the very first layer that receives the raw input. Subsequent layers can usually figure out their input size based on the output size of the layer before them, but the initial entry point needs explicit definition.
Think about a `Dense` layer, which we introduced earlier. Internally, it performs a matrix multiplication between the input features and its weight matrix. If an input sample has N features and the `Dense` layer has M units, the weight matrix needs to have a shape of (N, M). Keras cannot create this weight matrix unless it knows the value of N, the number of features in each input sample.
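To make the shape requirement concrete, here is a minimal NumPy sketch of the arithmetic a Dense layer performs. It is not Keras code: the sizes N = 784 and M = 64, the batch size, and the variable names are illustrative assumptions, and the activation function is left out.

```python
import numpy as np

N, M = 784, 64          # illustrative sizes: N input features, M units
batch_size = 32         # arbitrary; the same weights work for any batch size

rng = np.random.default_rng(0)
x = rng.normal(size=(batch_size, N))   # a batch of input samples
W = rng.normal(size=(N, M))            # weight matrix: cannot be allocated without N
b = np.zeros(M)                        # one bias per unit

y = x @ W + b                          # (batch_size, N) @ (N, M) -> (batch_size, M)
print(W.shape, y.shape)                # (784, 64) (32, 64)
```

Only N and M appear in the shape of `W`, which is why the number of input features must be known before the layer's weights can be created, while the batch size can stay open.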
This is where specifying the input shape comes in. You are essentially defining the contract for the data that will be fed into the model.
There are two primary ways to define the input shape in Keras, corresponding to the two APIs we've discussed:
`input_shape` in the Sequential API
When using the `Sequential` model, you tell Keras the shape of the input data at the start of the stack, either with an `Input` layer or with the `input_shape` argument on the first layer. In both cases the shape is a tuple representing the dimensions of a single input sample, excluding the batch dimension.
For example, if your input data consists of flat vectors of 784 features (like flattened MNIST images), you could define the model like this:
```python
import keras
from keras import layers

model = keras.Sequential(
    [
        layers.Input(shape=(784,)),  # Preferred way using Input layer
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),  # Output layer for 10 classes
    ],
    name="my_simple_sequential_model",
)

# Alternatively, using input_shape on the first Dense layer directly:
# model = keras.Sequential(
#     [
#         layers.Dense(64, activation="relu", input_shape=(784,)),  # Note the comma for a 1D shape
#         layers.Dense(10, activation="softmax"),
#     ],
#     name="my_alternative_sequential_model",
# )

model.summary()
```
Notice the comma in `(784,)`. This signifies a tuple with one dimension of size 784. If you omit the comma, Python evaluates `(784)` as the plain integer 784 rather than a tuple, which is not a valid shape specification here. While passing `input_shape` directly to the first layer works (Keras 3 emits a warning recommending an `Input` object instead), explicitly defining an `Input` layer as the first element in the `Sequential` list is often clearer and more consistent with the Functional API approach. `layers.Input` and `keras.Input` refer to the same object.
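The difference the trailing comma makes can be checked in plain Python, with no Keras involved. This is only a quick illustration of the language rule; the variable names are made up for the example.

```python
not_a_tuple = (784)     # parentheses only group the expression
one_d_shape = (784,)    # the trailing comma creates a one-element tuple

print(type(not_a_tuple).__name__)   # int
print(type(one_d_shape).__name__)   # tuple
print(len(one_d_shape))             # 1
```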
`keras.Input` in the Functional API
The Functional API provides a more explicit way to define the model's entry point using the `keras.Input` object. This object represents the symbolic tensor that will hold the input data. You specify the shape using the `shape` argument, again excluding the batch dimension.
```python
import keras
from keras import layers

# Define the input tensor
inputs = keras.Input(shape=(784,), name="input_features")

# Chain layers using the functional style
x = layers.Dense(64, activation="relu", name="hidden_layer_1")(inputs)
outputs = layers.Dense(10, activation="softmax", name="output_layer")(x)

# Create the model
model = keras.Model(inputs=inputs, outputs=outputs, name="my_functional_model")

model.summary()
```
Here, `keras.Input(shape=(784,))` creates the starting point for our graph of layers. All subsequent layers are called using the output of the previous layer as input. This approach makes the flow of data explicit.
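To see what excluding the batch dimension means in practice, the sketch below rebuilds the same two-layer model and inspects it. It assumes Keras 3 with a working backend and NumPy installed; the batch sizes 1 and 32 and the all-zero inputs are arbitrary choices for illustration.

```python
import numpy as np
import keras
from keras import layers

inputs = keras.Input(shape=(784,))
x = layers.Dense(64, activation="relu")(inputs)
outputs = layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

# Keras prepends the batch dimension as None, meaning "any size".
input_shape = tuple(model.input_shape)
output_shape = tuple(model.output_shape)
print(input_shape, output_shape)    # (None, 784) (None, 10)

# The same model accepts batches of different sizes.
small = model.predict(np.zeros((1, 784), dtype="float32"), verbose=0)
large = model.predict(np.zeros((32, 784), dtype="float32"), verbose=0)
print(small.shape, large.shape)     # (1, 10) (32, 10)
```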
The `shape` tuple describes the dimensions of one sample of your data. The batch size (how many samples are processed at once) is typically omitted because the model should be able to handle batches of any size. Keras adds it as a leading dimension of unspecified size, which `model.summary()` displays as `None`.
Here are common examples:
- Flat feature vectors: `(num_features,)`. Example: `(150,)` for 150 features.
- Images: `(height, width, channels)`. Example: `(28, 28, 1)` for a 28x28 grayscale image, or `(64, 64, 3)` for a 64x64 color image (RGB).
- Sequences: `(timesteps, features_per_timestep)`. Example: `(100, 50)` for a sequence of 100 time steps, where each step has 50 features.
The following diagram illustrates how the input shape defines the entry point for data flowing into the first layer (in this case, a Dense layer) in a Functional API model.
A conceptual flow showing an input specification leading into the first processing layer of a network. The `shape` defined in the Input determines the expected dimensionality of each data sample entering the Dense Layer.
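One practical way to get the shape tuples listed earlier right is to read them off the data itself: for an array whose first axis indexes samples, the per-sample shape is everything after that axis. The sketch below uses NumPy arrays of zeros as stand-ins for real datasets; the array names and the sample count of 8 are made up for illustration.

```python
import numpy as np

num_samples = 8  # hypothetical dataset size

# Placeholder arrays standing in for real datasets (first axis = samples).
tabular = np.zeros((num_samples, 150))           # flat feature vectors
grayscale = np.zeros((num_samples, 28, 28, 1))   # 28x28 grayscale images
sequences = np.zeros((num_samples, 100, 50))     # 100 time steps, 50 features each

# Drop the first axis to get the shape of a single sample.
tabular_shape = tabular.shape[1:]
grayscale_shape = grayscale.shape[1:]
sequence_shape = sequences.shape[1:]

print(tabular_shape)     # (150,)
print(grayscale_shape)   # (28, 28, 1)
print(sequence_shape)    # (100, 50)
```

A tuple obtained this way can be passed directly as the `shape` argument, for example `keras.Input(shape=tabular.shape[1:])`.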
Once the input shape is defined for the first layer (either via `input_shape` or `keras.Input`), Keras automatically infers the input shapes for all subsequent layers by tracking the output shapes as data flows through the network. You generally only need to worry about specifying the shape explicitly at the model's entrance. Getting this initial shape right is an important first step in building a correctly configured Keras model.
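The inference described above can be observed by printing each layer's output shape. This is a small sketch, assuming Keras 3; the image-shaped input, the `Flatten` layer, and the layer sizes are illustrative choices rather than part of the models shown earlier.

```python
import keras
from keras import layers

model = keras.Sequential(
    [
        layers.Input(shape=(28, 28, 1)),   # the only shape specified by hand
        layers.Flatten(),                  # 28 * 28 * 1 = 784 features
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ]
)

# Each layer's output shape was inferred from the layer before it.
inferred = [(layer.name, tuple(layer.output.shape)) for layer in model.layers]
for name, shape in inferred:
    print(name, shape)

# The first Dense kernel is (784, 64): N came from Flatten's output, not from us.
kernel_shape = tuple(model.layers[1].kernel.shape)
print(kernel_shape)
```

Only the `Input` line carries a hand-written shape; changing it to, say, `(32, 32, 3)` would change the first `Dense` kernel's row count without touching any other line.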
© 2025 ApX Machine Learning