Building graph-based models in PyTorch Geometric involves assembling their core components. The process feels remarkably similar to building standard neural networks in PyTorch because it uses the same torch.nn.Module class structure; the difference is that instead of layers like nn.Linear or nn.Conv2d, you use the specialized GNN layers that PyG provides.
A GNN model in PyG is a Python class that inherits from torch.nn.Module. The layers of the network are defined in the __init__ method, and the logic for the forward pass, which describes how data flows through the network, is implemented in the forward method. This structure provides a familiar and organized way to define even complex architectures.
The primary difference lies in the signature of the forward method. While a standard feed-forward network might only need the feature tensor x, a GNN also needs the graph's structure to perform message passing. Consequently, the forward method typically accepts the node features x and the edge_index tensor.
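To make that signature concrete, here is a minimal sketch that applies a single GNN layer (using GCNConv, introduced below, as a stand-in) to a toy graph. The node count, feature size, and edges are invented purely for illustration:

import torch
from torch_geometric.nn import GCNConv

# Hypothetical toy graph: 4 nodes, each with 8 features
x = torch.randn(4, 8)

# Two undirected edges (0-1 and 1-2), stored as directed pairs in both directions
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)

conv = GCNConv(in_channels=8, out_channels=16)
out = conv(x, edge_index)   # both the features and the structure are required
print(out.shape)            # torch.Size([4, 16])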
PyTorch Geometric offers a rich collection of GNN layers in its torch_geometric.nn module. These are highly optimized implementations of the architectures we discussed in Chapter 3. Let's look at how to use a few of the most fundamental ones.
The GCNConv layer implements the Graph Convolutional Network operator. Its constructor is straightforward, requiring the number of input features per node and the desired number of output features.
gnn.GCNConv(in_channels: int, out_channels: int)
- in_channels: The dimensionality of the input node features. For the first layer, this is the number of features each node in your dataset has.
- out_channels: The dimensionality of the node embeddings produced by this layer.

The SAGEConv layer implements the GraphSAGE operator. It follows a similar pattern but offers more flexibility in its aggregation scheme.
gnn.SAGEConv(in_channels: int, out_channels: int, aggr: str = 'mean')
- in_channels and out_channels: These have the same meaning as in GCNConv.
- aggr: A string specifying the aggregation method to use, such as 'mean', 'max', or 'add'.

The GATConv layer implements the Graph Attention Network operator. This layer introduces attention mechanisms, allowing nodes to weigh the importance of their neighbors.
gnn.GATConv(in_channels: int, out_channels: int, heads: int = 1, concat: bool = True)
- heads: The number of parallel attention mechanisms, or "heads," to use. Multi-head attention often stabilizes the learning process.
- concat: If set to True, the embeddings from the multiple attention heads are concatenated, resulting in an output feature dimension of heads * out_channels. If False, they are averaged.

Let's put these components together to build a GNN for node classification. Our model will consist of two GCNConv layers. The first layer will transform the initial node features into an intermediate hidden representation, and the second layer will transform this hidden representation into the final output, which corresponds to the number of classes. We'll use a ReLU activation function between the layers to introduce non-linearity.
Here is the complete definition of the model:
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
class GCN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, out_channels)

    def forward(self, x, edge_index):
        # 1. First GCN layer + ReLU activation
        x = self.conv1(x, edge_index)
        x = F.relu(x)

        # Optional: Add dropout for regularization
        # x = F.dropout(x, p=0.5, training=self.training)

        # 2. Second GCN layer
        x = self.conv2(x, edge_index)
        return x
Let's break down the forward method:
1. The node features x and the graph structure edge_index are passed to the first convolution layer, self.conv1, which produces intermediate embeddings of size hidden_channels.
2. A ReLU activation is applied to these embeddings.
3. The activated embeddings are passed to self.conv2, which produces the final embeddings of size out_channels. For a classification task, out_channels would equal the number of classes.

A diagram of our two-layer GCN architecture. Node features x and the graph structure edge_index are fed into the first GCN layer. The resulting embeddings are passed through a ReLU activation before entering the second GCN layer to produce the final output logits.
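To see the model in action, here is a short sketch that instantiates it and runs a single forward pass. The Cora citation dataset, the root directory, and the hidden size of 16 are example choices, not requirements of the model:

from torch_geometric.datasets import Planetoid

# Load a standard citation dataset (Cora: 2,708 nodes, 1,433 features, 7 classes)
dataset = Planetoid(root='data/Planetoid', name='Cora')
data = dataset[0]

model = GCN(
    in_channels=dataset.num_features,   # 1433 input features per node
    hidden_channels=16,                 # size of the intermediate embeddings
    out_channels=dataset.num_classes,   # 7 output classes
)

logits = model(data.x, data.edge_index)
print(logits.shape)  # torch.Size([2708, 7]) -- one score per class for every node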
The modular design of PyG makes experimenting with different architectures simple. Want to see if a GraphSAGE model performs better? You only need to swap the layer types in the __init__ method. The forward pass logic remains identical.
For example, to change our GCN to a GraphSAGE model, we would make the following change:
# from torch_geometric.nn import GCNConv
from torch_geometric.nn import SAGEConv
class GraphSAGE(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super().__init__()
        # Swap GCNConv with SAGEConv
        self.conv1 = SAGEConv(in_channels, hidden_channels)
        self.conv2 = SAGEConv(hidden_channels, out_channels)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = self.conv2(x, edge_index)
        return x
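The same swap works for GATConv, with one caveat about dimensions: when concat=True, each layer emits heads * out_channels features, so the second layer's input size must account for that. The sketch below is one possible arrangement, not the only one:

from torch_geometric.nn import GATConv

class GAT(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels, heads=4):
        super().__init__()
        # First layer: 4 attention heads, concatenated -> hidden_channels * heads features
        self.conv1 = GATConv(in_channels, hidden_channels, heads=heads, concat=True)
        # Second layer: a single (non-concatenated) head to produce exactly out_channels logits
        self.conv2 = GATConv(hidden_channels * heads, out_channels, heads=1, concat=False)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = self.conv2(x, edge_index)
        return x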
This flexibility is a significant advantage of using a dedicated library like PyTorch Geometric. It allows you to focus on high-level model design and iteration rather than the low-level mathematical implementations of each layer. Having defined our model's structure, the next step is to write the script that will feed it data and train its parameters.
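As a brief preview of that step, the fragment below sketches one possible training iteration, reusing the dataset and data objects from the earlier example and assuming the data object provides labels y and a train_mask; the optimizer settings are illustrative rather than prescribed:

model = GCN(dataset.num_features, 16, dataset.num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

model.train()
optimizer.zero_grad()
out = model(data.x, data.edge_index)  # forward pass over the full graph
loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])  # loss on training nodes only
loss.backward()
optimizer.step()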