Now that you understand the overall structure of neural networks and have your Keras environment ready, let's start assembling the fundamental building blocks. The most basic and frequently encountered layer type in neural networks is the Dense layer, often referred to as a fully connected layer.
A Dense layer implements the operation:

output = activation(dot(input, kernel) + bias)
Let's break this down:

input: This is the tensor input to the layer. For the very first layer in your model, you'll need to define its shape. For subsequent layers, Keras automatically infers the input shape based on the output shape of the preceding layer.

kernel: This is the weights matrix of the layer. It's one of the core components learned during the training process. The layer multiplies the input tensor by this kernel matrix.

bias: This is a bias vector, another learnable parameter. It's added to the result of the dot product. Adding a bias increases the model's flexibility, allowing it to fit the data better. Think of it like the y-intercept in the simple linear equation y = mx + b.

activation: This is an element-wise activation function applied to the result. Activation functions introduce non-linearity into the network, enabling it to learn complex patterns. We'll cover activation functions in detail in the next section, but common examples include ReLU, Sigmoid, and Softmax.

The defining characteristic of a Dense layer is its "fully connected" nature. Every neuron (or unit) in the Dense layer receives input from every neuron in the previous layer. This comprehensive connectivity allows the layer to learn interactions between all features represented by the previous layer's outputs.
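To make the operation concrete, here is a minimal NumPy sketch of what a Dense layer computes in a forward pass. The shapes and the relu helper are illustrative assumptions, not Keras internals:

import numpy as np

def relu(x):
    # Element-wise ReLU activation: max(0, x)
    return np.maximum(0, x)

# Illustrative shapes: a batch of 2 samples with 4 features each
inputs = np.random.randn(2, 4)   # shape: (batch_size, input_dim)
kernel = np.random.randn(4, 3)   # shape: (input_dim, units)
bias = np.zeros(3)               # shape: (units,)

# The Dense operation: activation(dot(input, kernel) + bias)
output = relu(np.dot(inputs, kernel) + bias)
print(output.shape)  # (2, 3) -> (batch_size, units)

Keras handles all of this for you; the sketch just shows why the kernel has one row per input feature and one column per unit.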
Figure: A visualization of a fully connected Dense layer. Each neuron (N1, N2, N3) from the previous layer connects to every unit (U1, U2) in the Dense layer.
In Keras, you can easily add Dense layers using keras.layers.Dense. The most important argument you'll specify is units.
units: This positive integer defines the dimensionality of the output space, which is equivalent to the number of neurons in the layer. For example, units=64 means the layer will output a tensor of shape (batch_size, 64). The choice of units influences the representational capacity of the layer. More units allow the layer to potentially learn more complex patterns, but also increase the number of parameters and the risk of overfitting.

activation: This argument specifies the activation function to use. You can provide the name of a built-in activation (like 'relu' or 'softmax') or pass a callable activation function object. If you don't specify an activation, none is applied (the layer uses a linear activation, a(x) = x).

input_shape: For the first layer in a Sequential model, Keras needs to know the shape of the input data. You specify this using the input_shape argument, which should be a tuple (e.g., input_shape=(784,) for flattened 28x28 images). You don't need to include the batch dimension. For subsequent layers, Keras infers the input shape automatically.
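To see how units determines the shapes involved, here is a small sketch that builds a single standalone Dense layer and inspects its weights. The placeholder input array is an illustrative assumption:

import numpy as np
from keras import layers

# A standalone Dense layer with 64 units and ReLU activation
layer = layers.Dense(units=64, activation='relu')

# Calling the layer on data builds its weights from the input's last dimension
x = np.zeros((32, 784), dtype="float32")  # placeholder batch: (batch_size, input_dim)
y = layer(x)

print(y.shape)             # (32, 64): batch size preserved, last dim = units
print(layer.kernel.shape)  # (784, 64): one weight per input-output pair
print(layer.bias.shape)    # (64,): one bias per unit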
Here's how you might use Dense layers in a simple Sequential model:
import keras
from keras import layers

# Define input shape (e.g., for flattened MNIST images)
input_shape = (784,)

model = keras.Sequential(
    [
        # First Dense layer: needs input_shape
        layers.Dense(units=128, activation='relu', input_shape=input_shape, name='hidden_layer_1'),
        # Second Dense layer: input shape is inferred from the previous layer's output (128 units)
        layers.Dense(units=64, activation='relu', name='hidden_layer_2'),
        # Output layer: 10 units (e.g., for 10 digit classes), softmax activation for classification
        layers.Dense(units=10, activation='softmax', name='output_layer')
    ],
    name="simple_mlp"
)

# Display the model's architecture and parameters
model.summary()
Executing model.summary() produces output detailing the layers, their output shapes, and the number of trainable parameters. Notice how the parameter count relates to the input dimension, the number of units, and the bias for each layer. For instance, hidden_layer_1 has 784 × 128 weights plus 128 biases, for 100,480 parameters.
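The same arithmetic applies to every Dense layer: parameters = (input_dim × units) + units. As a sanity check, you can reproduce the summary's counts for this model by hand:

# Trainable parameters per Dense layer: (input_dim * units) + units (for the biases)
params_hidden_1 = 784 * 128 + 128  # 100,480
params_hidden_2 = 128 * 64 + 64    # 8,256
params_output = 64 * 10 + 10       # 650

total_params = params_hidden_1 + params_hidden_2 + params_output
print(total_params)  # 109,386 -- should match the total reported by model.summary()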
Dense layers are versatile:

Hidden layers: Stacking Dense layers with non-linear activations allows the network to learn hierarchical representations of the input data.

Binary classification output: A Dense layer with 1 unit and a sigmoid activation is typical (see the sketch after this list).

Multi-class classification output: A Dense layer with N units (where N is the number of classes) and a softmax activation is used.

Regression output: A Dense layer with 1 unit (or more, for multi-output regression) and typically no activation (linear activation) is employed.

Within larger architectures: Dense layers are often placed after the main feature extraction blocks (convolutional or recurrent layers) to perform the final classification or regression based on the extracted features.
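As a quick reference, here is a minimal sketch of those three common output-layer configurations. The layer names are illustrative, and units=10 simply assumes a ten-class problem:

from keras import layers

# Binary classification: 1 unit, sigmoid squashes the output to a probability in [0, 1]
binary_output = layers.Dense(units=1, activation='sigmoid', name='binary_output')

# Multi-class classification: one unit per class, softmax yields a probability distribution
multiclass_output = layers.Dense(units=10, activation='softmax', name='multiclass_output')

# Regression: 1 unit, no activation (linear), so the output range is unconstrained
regression_output = layers.Dense(units=1, activation=None, name='regression_output')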
The Dense layer is a fundamental component you'll use extensively when building various neural network architectures in Keras. Understanding its operation and how to configure it is an important step in mastering practical deep learning development.