The core idea of a Graph Neural Network is to learn a function that generates a new feature vector, or embedding, for each node. This new embedding is derived from the node's own features and the features of its immediate neighbors. This process, often referred to as "neighborhood aggregation," can be formalized by breaking down a single GNN layer into two distinct steps: an AGGREGATE step and an UPDATE step.
This two-step process is the fundamental computational block for almost every modern GNN. First, a node gathers the feature vectors from all of its direct neighbors. Then, it uses this aggregated information, along with its own current feature vector, to compute its new feature vector for the next layer.
A diagram of the two-step message passing process for a central node $v$. Information from neighbors is first combined in the AGGREGATE step. The result is then combined with the node's own information in the UPDATE step to produce its new representation $h_v^{(k+1)}$.
Let's examine each of these steps more closely.
The first challenge in designing a neural network for graphs is that nodes can have a variable number of neighbors. A node in a social network might have two friends or two thousand. Standard neural network layers, like a fully connected layer, require a fixed-size input vector. How can we process an arbitrary number of neighbor vectors?
The AGGREGATE function solves this problem. Its job is to take the set of neighbor feature vectors, $\{h_u^{(k)} : u \in \mathcal{N}(v)\}$, and combine them into a single, fixed-size vector. This vector acts as a summary of the node's entire neighborhood.
$$m_{\mathcal{N}(v)}^{(k)} = \text{AGGREGATE}^{(k)}\left(\left\{h_u^{(k)} : u \in \mathcal{N}(v)\right\}\right)$$

Here, $m_{\mathcal{N}(v)}^{(k)}$ represents the aggregated "message" from the neighborhood $\mathcal{N}(v)$ of node $v$ at layer $k$.
An important property of the AGGREGATE function is that it must be permutation invariant. This means the function should produce the same output regardless of the order in which the neighbor vectors are presented. Simple operations such as sum, mean, or maximum are common choices because they naturally have this property. We will explore these options in the next section.
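As a quick illustration, here is a minimal NumPy sketch (the feature values are made up for the example) showing that a mean aggregator produces a fixed-size output for any number of neighbors and is unaffected by their order:

```python
import numpy as np

# Feature vectors of three neighbors of some node v (dimension 2 each).
# These values are illustrative only.
neighbor_feats = [np.array([1.0, 0.0]),
                  np.array([0.0, 2.0]),
                  np.array([3.0, 1.0])]

def aggregate_mean(vectors):
    # Reduces any number of neighbor vectors to one fixed-size vector.
    return np.mean(vectors, axis=0)

msg = aggregate_mean(neighbor_feats)
msg_shuffled = aggregate_mean(neighbor_feats[::-1])

# Permutation invariance: neighbor order does not change the result.
assert np.allclose(msg, msg_shuffled)
print(msg.shape)  # (2,) regardless of how many neighbors there are
```

Swapping `np.mean` for `np.sum` or `np.max` (with `axis=0`) gives the other common aggregators; all three are permutation invariant.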
Once we have a single vector representing the neighborhood's message, the UPDATE step is responsible for creating the node's new feature vector for the next layer, $h_v^{(k+1)}$.
This function takes two inputs:
- The node's own current representation, $h_v^{(k)}$.
- The aggregated message from the AGGREGATE step, $m_{\mathcal{N}(v)}^{(k)}$.

It is significant that the node's own representation, $h_v^{(k)}$, is included in this step. If we used only the aggregated message, the node would lose its original information and become purely a reflection of its surroundings. By combining its existing state with the incoming message, the GNN allows the node to both retain its own identity and incorporate information from its local graph structure. The UPDATE function is typically implemented as a standard neural network layer, often a linear transformation followed by a non-linear activation function such as ReLU.
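A minimal NumPy sketch of such an UPDATE function, assuming separate weight matrices for the node's own vector and for the aggregated message (the dimensions and the random initialization are illustrative, not learned):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 2, 4  # illustrative feature dimensions

# In a real GNN these weights would be learned; here they are random.
W_self = rng.normal(size=(d_in, d_out))  # transforms the node's own vector
W_msg = rng.normal(size=(d_in, d_out))   # transforms the aggregated message

def update(h_v, m_v):
    # Linear transformation of both inputs, then a ReLU non-linearity.
    return np.maximum(0.0, h_v @ W_self + m_v @ W_msg)

h_v = np.array([1.0, -1.0])  # node's current representation
m_v = np.array([0.5, 2.0])   # aggregated message from its neighbors
h_next = update(h_v, m_v)    # the node's new representation
print(h_next.shape)  # (4,)
```

Because both inputs pass through their own weight matrix before being added, the node's identity and its neighborhood information each get a learnable contribution to the new embedding.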
By combining these two steps, we arrive at the general formula for a single layer in a message passing GNN. This equation describes how the feature vector of any node $v$ is transformed as it passes through layer $k$ to produce its output at layer $k+1$:

$$h_v^{(k+1)} = \text{UPDATE}^{(k)}\left(h_v^{(k)},\ \text{AGGREGATE}^{(k)}\left(\left\{h_u^{(k)} : u \in \mathcal{N}(v)\right\}\right)\right)$$
Let's break down each component:
- $h_v^{(k)}$ is the feature vector of node $v$ at layer $k$.
- $\mathcal{N}(v)$ is the set of immediate neighbors of node $v$.
- The AGGREGATE function, such as mean or sum, reduces the feature vectors of all neighbors into a single vector.
- The UPDATE function, often a neural network with learnable weights, combines the node's own vector with the aggregated neighbor vector.

The specific choices for the AGGREGATE and UPDATE functions are what differentiate GNN architectures such as GCN, GraphSAGE, and GAT. The learnable parameters of the GNN are contained within these two functions. By processing every node in the graph through this mechanism and stacking multiple such layers, a GNN can learn complex patterns from the graph structure.
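Putting the two steps together, a full message passing layer can be sketched in a few lines of NumPy. The graph, the feature dimension, and the random weights below are all illustrative; mean aggregation and a ReLU update are one common choice among many:

```python
import numpy as np

rng = np.random.default_rng(1)

# A small undirected graph given as an adjacency list (illustrative).
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

num_nodes, d = 4, 3
H = rng.normal(size=(num_nodes, d))  # current features h_v^(k) for all nodes
W_self = rng.normal(size=(d, d))     # learnable in practice, random here
W_msg = rng.normal(size=(d, d))

def gnn_layer(H, neighbors):
    H_next = np.zeros_like(H)
    for v, nbrs in neighbors.items():
        m_v = H[nbrs].mean(axis=0)  # AGGREGATE: mean of neighbor vectors
        # UPDATE: combine own vector and message, then apply ReLU.
        H_next[v] = np.maximum(0.0, H[v] @ W_self + m_v @ W_msg)
    return H_next

# Stacking layers lets information travel further across the graph
# (the same weights are reused across layers here for simplicity).
H1 = gnn_layer(H, neighbors)
H2 = gnn_layer(H1, neighbors)
print(H2.shape)  # (4, 3): one new embedding per node
```

After two layers, node 0's embedding is influenced by node 3, even though they are not direct neighbors: information reached it via node 2.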
© 2026 ApX Machine Learning