Let's put the pieces together and visualize a basic feedforward neural network. We've discussed individual neurons, their parameters (weights and biases), activation functions, and how they are organized into layers. Now, imagine we want to build a simple network to, for instance, predict a single output value based on two input features.
Consider a network with the following structure:
- An Input Layer with 2 neurons, corresponding to our two input features ($x_1, x_2$). This layer doesn't perform computations; it simply passes the input values forward.
- A Hidden Layer with 3 neurons ($h_1, h_2, h_3$). Each hidden neuron receives input from all neurons in the input layer.
- An Output Layer with 1 neuron ($o_1$). This neuron receives input from all neurons in the hidden layer and produces the final prediction ($\hat{y}$).
This is called a "feedforward" network because information flows strictly in one direction: from input to hidden to output layers, without looping back.
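To make the 2-3-1 structure concrete, here is a minimal sketch of the parameters such a network would hold (Python with NumPy; the names `W1`, `b1`, `W2`, `b2` and the random initialization are illustrative assumptions, not fixed conventions):

```python
import numpy as np

# Illustrative parameters for the 2-3-1 network described above.
rng = np.random.default_rng(0)

W1 = rng.normal(size=(2, 3))  # input -> hidden weights; column j feeds hidden neuron h_j
b1 = np.zeros(3)              # one bias per hidden neuron (b_h1, b_h2, b_h3)

W2 = rng.normal(size=(3, 1))  # hidden -> output weights (w_h1,o1, w_h2,o1, w_h3,o1)
b2 = np.zeros(1)              # bias of the output neuron (b_o1)
```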
Here's how information propagates:
- Input to Hidden Layer:
  - Each input feature ($x_1, x_2$) is connected to each hidden neuron ($h_1, h_2, h_3$).
  - For the first hidden neuron ($h_1$), the input signals are multiplied by corresponding weights (e.g., $w_{11}$ for $x_1 \to h_1$, $w_{21}$ for $x_2 \to h_1$).
  - A weighted sum ($z_{h_1}$) is calculated: $z_{h_1} = (x_1 \times w_{11}) + (x_2 \times w_{21}) + b_{h_1}$, where $b_{h_1}$ is the bias for neuron $h_1$.
  - An activation function $f$ (like ReLU or Sigmoid) is applied to this sum to get the neuron's output: $a_{h_1} = f(z_{h_1})$.
  - Similar calculations ($z = \sum (\text{weight} \times \text{input}) + \text{bias}$ followed by $a = f(z)$) are performed independently for the other hidden neurons ($h_2, h_3$), each using its own set of weights and bias.
- Hidden to Output Layer:
  - The outputs of the hidden layer neurons ($a_{h_1}, a_{h_2}, a_{h_3}$) become the inputs for the output layer.
  - Each hidden neuron output is connected to the output neuron ($o_1$).
  - For the output neuron ($o_1$), a weighted sum ($z_{o_1}$) is calculated using the hidden layer activations, a new set of weights (e.g., $w_{h_1,o_1}$ for $a_{h_1} \to o_1$, etc.), and its own bias ($b_{o_1}$): $z_{o_1} = (a_{h_1} \times w_{h_1,o_1}) + (a_{h_2} \times w_{h_2,o_1}) + (a_{h_3} \times w_{h_3,o_1}) + b_{o_1}$.
  - A final activation function $g$ (which might be different from the hidden layer's activation function, depending on the task) is applied to get the network's prediction: $\hat{y} = a_{o_1} = g(z_{o_1})$. A code sketch of both steps appears below.
This entire process, from feeding the initial inputs $x_1, x_2$ to obtaining the final output $\hat{y}$, is called forward propagation, which we will examine in detail later.
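As a rough preview, and continuing the parameter sketch above, the whole forward pass could look like this in NumPy (the ReLU and sigmoid choices, and names like `forward`, are illustrative assumptions rather than fixed conventions):

```python
def relu(z):
    # Hidden-layer activation f; ReLU is one common choice.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Output activation g; sigmoid is one common choice for a single output.
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # Input -> hidden: weighted sums z_h, then activations a_h = f(z_h)
    z_h = x @ W1 + b1      # shape (3,): z_h1, z_h2, z_h3
    a_h = relu(z_h)

    # Hidden -> output: weighted sum z_o1, then prediction y_hat = g(z_o1)
    z_o = a_h @ W2 + b2    # shape (1,)
    return sigmoid(z_o)

x = np.array([0.5, -1.2])   # example input features x1, x2
y_hat = forward(x, W1, b1, W2, b2)
```

Note that each hidden neuron's weighted sum is just one column of the `x @ W1 + b1` product, so the three per-neuron calculations from the list above happen in a single matrix operation.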
The diagram below illustrates this simple network structure.
A simple feedforward neural network with 2 input neurons, 1 hidden layer of 3 neurons, and 1 output neuron. Arrows indicate the direction of information flow during forward propagation. Each connection represents a weight, and each neuron in the hidden and output layers has an associated bias (not explicitly drawn).
This example showcases the fundamental architecture. Real-world networks can have many more layers (making them "deep") and many more neurons per layer, but the core principles of weighted sums, activation functions, and layered connections remain the same. The specific number of layers and neurons, along with the choice of activation functions, are design decisions that depend on the problem you are trying to solve. In the upcoming chapters, we'll explore how to implement these calculations efficiently and how the network learns the optimal values for its weights and biases.
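To hint at how the same pattern scales to deeper networks, a rough sketch (the `layers` list and `activation` argument are illustrative assumptions, reusing `relu` and the parameters from the sketches above) simply repeats the weighted-sum-plus-activation step once per layer:

```python
def forward_deep(x, layers, activation=relu):
    # `layers` is a list of (W, b) pairs, one per layer after the input.
    # Every layer repeats the same pattern: z = a @ W + b, then a = f(z).
    a = x
    for W, b in layers:
        a = activation(a @ W + b)
    return a

# The 2-3-1 network above, expressed in this generic form
# (for brevity, a single activation is reused for every layer here).
y_hat_deep = forward_deep(x, [(W1, b1), (W2, b2)])
```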