The Dense layer is the most basic and frequently encountered layer type in neural networks, often referred to as a fully connected layer. It serves as a fundamental building block for neural network architectures.
A Dense layer implements the operation:

output = activation(dot(input, kernel) + bias)
Let's break this down:
input: This is the tensor input to the layer. For the very first layer in your model, you'll need to define its shape. For subsequent layers, Keras automatically infers the input shape from the output shape of the preceding layer.
kernel: This is the weights matrix of the layer. It's one of the core components learned during the training process. The layer multiplies the input tensor by this kernel matrix.
bias: This is a bias vector, another learnable parameter, which is added to the result of the dot product. Adding a bias increases the model's flexibility, allowing it to fit the data better. Think of it like the y-intercept b in the simple linear equation y = mx + b.
activation: This is an element-wise activation function applied to the result. Activation functions introduce non-linearity into the network, enabling it to learn complex patterns. We'll cover activation functions in detail in the next section, but common examples include ReLU, Sigmoid, and Softmax.

The defining characteristic of a Dense layer is its "fully connected" nature: every neuron (or unit) in the Dense layer receives input from every neuron in the previous layer. This comprehensive connectivity allows the layer to learn interactions between all features represented by the previous layer's outputs.
A visualization of a fully connected Dense layer. Each neuron (N1, N2, N3) from the previous layer connects to every unit (U1, U2) in the Dense layer.
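To make the computation concrete, here is a minimal NumPy sketch of the operation a Dense layer performs. The shapes and values are illustrative assumptions for this sketch, not Keras internals:

import numpy as np

def relu(x):
    # Element-wise ReLU: max(0, x)
    return np.maximum(0.0, x)

# Assumed shapes for illustration: a batch of 2 samples with 3 features
# each, feeding a Dense layer with 4 units.
inputs = np.random.randn(2, 3)   # (batch_size, input_dim)
kernel = np.random.randn(3, 4)   # (input_dim, units) -- learned during training
bias = np.zeros(4)               # (units,) -- learned during training

# output = activation(dot(input, kernel) + bias)
outputs = relu(inputs @ kernel + bias)
print(outputs.shape)  # (2, 4)

Each of the 4 output values depends on all 3 input features, which is exactly the "fully connected" behavior described above.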
In Keras, you can easily add Dense layers using keras.layers.Dense. The most important argument you'll specify is units.
units: This positive integer defines the dimensionality of the output space, which is equivalent to the number of neurons in the layer. For example, units=64 means the layer will output a tensor of shape (batch_size, 64). The choice of units influences the representational capacity of the layer. More units allow the layer to potentially learn more complex patterns, but also increase the number of parameters and the risk of overfitting.
activation: This argument specifies the activation function to use. You can provide the name of a built-in activation (like 'relu' or 'softmax') or pass a callable activation function object; both forms are shown in the short sketch after this list. If you don't specify an activation, none is applied and the layer computes a linear activation, a(x) = x.
input_shape: For the first layer in a Sequential model, Keras needs to know the shape of the input data. You specify this using the input_shape argument, which should be a tuple (e.g., input_shape=(784,) for flattened 28x28 images). You don't need to include the batch dimension. For subsequent layers, Keras infers the input shape automatically.
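As a quick illustration of the two ways to pass the activation argument (a sketch; both lines construct equivalent layers):

from keras import layers, activations

# Passing a built-in activation by name...
layer_a = layers.Dense(units=64, activation='relu')

# ...is equivalent to passing the callable directly.
layer_b = layers.Dense(units=64, activation=activations.relu)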
Here's how you might use Dense layers in a simple Sequential model:
import keras
from keras import layers

# Define input shape (e.g., for flattened MNIST images)
input_shape = (784,)

model = keras.Sequential(
    [
        # First Dense layer: needs input_shape
        layers.Dense(units=128, activation='relu', input_shape=input_shape, name='hidden_layer_1'),
        # Second Dense layer: input shape is inferred from the previous layer's output (128 units)
        layers.Dense(units=64, activation='relu', name='hidden_layer_2'),
        # Output layer: 10 units (e.g., for 10 digit classes), softmax activation for classification
        layers.Dense(units=10, activation='softmax', name='output_layer')
    ],
    name="simple_mlp"
)

# Display the model's architecture and parameters
model.summary()
Executing model.summary() would produce output detailing the layers, their output shapes, and the number of trainable parameters. Notice how the parameter count relates to the input dimension, the number of units, and the bias for each layer. For instance, hidden_layer_1 has 784 × 128 = 100,352 weights plus 128 biases, for a total of 100,480 trainable parameters.
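You can verify these counts directly; here is a small sketch, assuming the model defined above:

# Retrieve the learned variables of the first Dense layer
kernel, bias = model.get_layer('hidden_layer_1').get_weights()
print(kernel.shape)   # (784, 128): input_dim x units
print(bias.shape)     # (128,): one bias per unit
print(model.get_layer('hidden_layer_1').count_params())  # 100480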
Dense layers are versatile:
Hidden layers: Stacking Dense layers with non-linear activations allows the network to learn hierarchical representations of the input data.
Binary classification: For the output layer, a Dense layer with 1 unit and a sigmoid activation is typical.
Multi-class classification: An output Dense layer with N units (where N is the number of classes) and a softmax activation is used.
Regression: An output Dense layer with 1 unit (or more, for multi-output regression) and typically no activation (linear activation) is employed. These three output-layer configurations are sketched in the example below.
Classification or regression heads: Dense layers are often placed after the main feature extraction blocks (convolutional or recurrent layers) to perform the final classification or regression based on the extracted features.

The Dense layer is a fundamental component you'll use extensively when building various neural network architectures in Keras. Understanding its operation and how to configure it is an important step in mastering practical deep learning development.
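As a reference for the output-layer configurations listed above, here is a minimal sketch (the unit counts and layer names are illustrative assumptions):

from keras import layers

# Binary classification: 1 unit, sigmoid squashes the output into (0, 1)
binary_head = layers.Dense(units=1, activation='sigmoid', name='binary_output')

# Multi-class classification: N units (an assumed 10 classes here),
# softmax yields a probability distribution over the classes
multiclass_head = layers.Dense(units=10, activation='softmax', name='multiclass_output')

# Regression: 1 unit, no activation (linear), so outputs are unbounded
regression_head = layers.Dense(units=1, activation=None, name='regression_output')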