In this section, we translate the mathematical formulation of Graph Convolutional Networks into a working model and solidify our understanding of how the GCN's message passing mechanism operates at a practical level. We will build a two-layer GCN using PyTorch, focusing on the main matrix operations that define the graph convolution.
This approach gives you a clear view of the model's inner workings before we move on to higher-level libraries in Chapter 5, which abstract away many of these details.
To build a GCN, we need three primary components:
1. A normalized adjacency matrix, computed once from the graph structure.
2. A GCN layer with a trainable weight matrix that transforms and propagates node features.
3. A model that stacks these layers, with non-linear activations in between, to produce the final node representations.
Let's begin by defining a simple, small graph and preparing its matrices for computation. We will use a graph with four nodes.
First, we define our graph's adjacency matrix A and an initial feature matrix X. Then, we perform the normalization procedure discussed in the "Graph Convolutional Networks (GCN)" section. The propagation rule for a GCN layer is:

$$H^{(l+1)} = \sigma\left(\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2} H^{(l)} W^{(l)}\right)$$

where $\hat{A} = A + I$ is the adjacency matrix with added self-loops, $\hat{D}$ is its degree matrix, $H^{(l)}$ is the matrix of node features at layer $l$, $W^{(l)}$ is the layer's trainable weight matrix, and $\sigma$ is a non-linear activation.

Our first task is to compute the normalized adjacency matrix, $\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2}$. We will use PyTorch for all tensor operations.
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
# 1. Define graph structure and features
adj = torch.tensor([
    [0., 1., 1., 0.],
    [1., 0., 1., 1.],
    [1., 1., 0., 0.],
    [0., 1., 0., 0.]
])
features = torch.tensor([
    [10., 2.],
    [5., 15.],
    [20., 10.],
    [1., 1.]
])
# 2. Add self-loops to the adjacency matrix
# A_hat = A + I
adj_hat = adj + torch.eye(adj.shape[0])
# 3. Calculate the degree matrix D_hat
# D_ii = Sum_j A_ij
degree_hat = torch.diag(torch.sum(adj_hat, dim=1))
# 4. Compute D_hat^(-1/2)
# Add a small epsilon for numerical stability
degree_hat_inv_sqrt = torch.pow(degree_hat.diag() + 1e-5, -0.5)
degree_hat_inv_sqrt = torch.diag(degree_hat_inv_sqrt)
# 5. Calculate the normalized adjacency matrix
# norm_A = D_hat^(-1/2) * A_hat * D_hat^(-1/2)
norm_adj = torch.matmul(torch.matmul(degree_hat_inv_sqrt, adj_hat), degree_hat_inv_sqrt)
print("Normalized Adjacency Matrix:")
print(norm_adj)
The output norm_adj is the matrix that will be used to propagate information between nodes. It is symmetric and its values are scaled based on node degrees. This matrix is constant throughout the model's training process and only needs to be computed once.
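As a quick, optional sanity check (this snippet is only for illustration and is not part of the model), we can confirm two properties mentioned above: the matrix is symmetric, and each diagonal entry equals the inverse of that node's degree (including its self-loop), up to the small epsilon we added.

# Verify symmetry of the normalized adjacency matrix
print(torch.allclose(norm_adj, norm_adj.T))  # True
# Each diagonal entry is approximately 1 / d_i, the inverse degree with self-loop
print(norm_adj.diag())
print(1.0 / adj_hat.sum(dim=1))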
Next, we create a custom PyTorch module for a single GCN layer. This class will contain one trainable weight matrix W and its forward method will execute the core multiplication: norm_adj @ features @ W.
class GCNLayer(nn.Module):
    """
    A single Graph Convolutional Network layer.
    """
    def __init__(self, in_features, out_features):
        super(GCNLayer, self).__init__()
        # Define the trainable weight matrix W
        self.weights = nn.Parameter(torch.FloatTensor(in_features, out_features))
        # Initialize weights with a standard method
        nn.init.xavier_uniform_(self.weights)

    def forward(self, adj, features):
        # First, transform the features: X * W
        support = torch.matmul(features, self.weights)
        # Then, propagate features: norm_A * (X * W)
        output = torch.matmul(adj, support)
        return output
This GCNLayer class cleanly encapsulates the logic. It takes the pre-computed normalized adjacency matrix and the node features as input and produces new node embeddings. The nn.Parameter wrapper ensures that PyTorch's autograd system tracks gradients for our weight matrix during training.
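As a brief usage sketch (the hidden size of 4 here is arbitrary and chosen only for this demonstration), we can apply a single GCNLayer to the small graph defined earlier and check the shape of the resulting embeddings and the number of trainable values:

# Apply one layer: 2 input features per node -> 4 output features per node
layer = GCNLayer(in_features=2, out_features=4)
embeddings = layer(norm_adj, features)
print(embeddings.shape)  # torch.Size([4, 4]): one 4-dimensional embedding per node
# Only the 2 x 4 weight matrix is trainable, so the layer has 8 parameters
print(sum(p.numel() for p in layer.parameters()))  # 8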
With the GCNLayer defined, stacking them to form a deep GNN is straightforward. We will create a two-layer GCN for node classification. This model will:
1. Pass the input node features through a first GCN layer, projecting them into a hidden representation.
2. Apply a ReLU activation to introduce non-linearity.
3. Pass the result through a second GCN layer, which maps the hidden representation to one score per class for each node.

The data flows through the two layers in sequence, and the same normalized adjacency matrix is passed to each layer to guide the aggregation of messages.
Here is the implementation of the full model:
class GCN(nn.Module):
    """
    A two-layer Graph Convolutional Network.
    """
    def __init__(self, in_features, hidden_features, out_features):
        super(GCN, self).__init__()
        self.gc1 = GCNLayer(in_features, hidden_features)
        self.gc2 = GCNLayer(hidden_features, out_features)

    def forward(self, adj, features):
        # First layer and ReLU activation
        x = F.relu(self.gc1(adj, features))
        # Second layer
        x = self.gc2(adj, x)
        return x
This GCN class combines two instances of our GCNLayer. The forward method defines the computation flow: the output of the first layer becomes the input to the second, with a ReLU activation in between to introduce non-linearity, which allows the model to learn more complex functions.
Now we can put everything together. We will instantiate our GCN model and pass our prepared graph data through it to get the final node embeddings. Let's assume our task is to classify each node into one of three categories.
# Get the dimensions from our data
num_nodes = features.shape[0]
in_features = features.shape[1]
hidden_features = 16 # A common choice for hidden layer size
out_features = 3 # Number of classes
# Instantiate the model
model = GCN(in_features, hidden_features, out_features)
# Perform a forward pass
logits = model(norm_adj, features)
print("\nModel Output (Logits):")
print(logits)
print("\nOutput Shape:", logits.shape)
The output of this script will be a tensor of shape [4, 3]. This represents the raw, unnormalized scores (logits) for each of the 4 nodes belonging to each of the 3 classes.
Model Output (Logits):
tensor([[-1.9965, -3.2057,  2.0163],
        [-2.3150, -4.5161,  3.7142],
        [-2.2965, -4.0189,  2.9785],
        [-1.3857, -2.9754,  2.4187]], grad_fn=<MmBackward0>)
Output Shape: torch.Size([4, 3])
We have successfully built a GCN from scratch. Each row in the output tensor is a new representation for a node, calculated by aggregating information from its local neighborhood as defined by the graph structure. In the next chapter, we will see how to use these logits with a loss function to train the model's weights and make accurate predictions for a task like node classification.
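As a small preview (only a sketch; the model is still untrained, so these predictions carry no real meaning yet), the logits can already be converted into class probabilities with a softmax and into hard predictions with an argmax:

# Convert logits to per-node class probabilities and predicted labels
probs = F.softmax(logits, dim=1)
preds = probs.argmax(dim=1)
print(probs)
# With the randomly initialized weights that produced the logits above,
# every node happens to receive the highest score for class 2
print(preds)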