To apply machine learning models to graphs, we first need to convert their abstract structure of nodes and edges into a numerical format that algorithms can process. Just as we represent images as grids of pixel values or text as sequences of numerical vectors, we need a standard way to encode graphs. This is accomplished primarily through two matrices: the adjacency matrix, which captures the graph's topology, and the feature matrix, which holds the attributes of each node.
The most direct way to represent the connections within a graph is with an adjacency matrix, typically denoted as A. For a graph with N nodes, the adjacency matrix is a square matrix of size N×N.
The rule for populating this matrix is straightforward. For an unweighted graph, the element $A_{ij}$ at the i-th row and j-th column is:

$$
A_{ij} =
\begin{cases}
1 & \text{if there is an edge between node } i \text{ and node } j \\
0 & \text{otherwise}
\end{cases}
$$

By convention, a node is not considered to be connected to itself, so the diagonal elements $A_{ii}$ are usually set to 0.
Consider the simple social network graph below with four individuals.
Figure: An undirected graph with four nodes (0-3) representing individuals and edges representing friendships (0-1, 0-2, 1-3, and 2-3).
The corresponding adjacency matrix A for this 4-node graph would be:
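$$
A =
\begin{bmatrix}
0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0
\end{bmatrix}
$$

Notice a few properties of this matrix: it is symmetric ($A_{ij} = A_{ji}$) because the graph is undirected, and its diagonal is all zeros because no node is connected to itself.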
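A short NumPy sketch makes this matrix easy to build and check. It assumes the four friendships are stored as an edge list (the variable names `edges` and `num_nodes` are illustrative, not part of any library) and mirrors each edge because the graph is undirected.

```python
import numpy as np

# Friendships of the 4-node example graph, one (i, j) pair per undirected edge.
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
num_nodes = 4

A = np.zeros((num_nodes, num_nodes), dtype=int)
for i, j in edges:
    A[i, j] = 1
    A[j, i] = 1  # mirror the entry: an undirected edge goes both ways

print(A)
print("symmetric:", np.array_equal(A, A.T))         # undirected graph
print("zero diagonal:", bool(np.all(np.diag(A) == 0)))  # no self-connections
print("degrees:", A.sum(axis=1))                    # row sums count each node's neighbors
```

Row sums give each node's degree, which is a quick sanity check that the edge list was entered correctly.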
For weighted graphs, where edges have different strengths (e.g., interaction frequency), the matrix entries $A_{ij}$ would contain the edge weight instead of just 1.
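The weighted case can be sketched the same way. The weights below are made-up interaction counts chosen only for illustration; they do not come from the example graph.

```python
import numpy as np

# (i, j, weight) triples; the weights are hypothetical interaction frequencies.
weighted_edges = [(0, 1, 5.0), (0, 2, 2.0), (1, 3, 1.0), (2, 3, 4.0)]
num_nodes = 4

A_w = np.zeros((num_nodes, num_nodes))
for i, j, w in weighted_edges:
    A_w[i, j] = w
    A_w[j, i] = w  # keep the matrix symmetric for an undirected graph

# Thresholding the weights recovers the unweighted 0/1 adjacency matrix.
A_binary = (A_w > 0).astype(int)

print(A_w)
print(A_binary)
```

Note that a weight of 0 is indistinguishable from a missing edge in this dense layout, which is one reason the weights here are kept strictly positive.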
In most applications, nodes themselves contain useful information. A user in a social network has a profile (age, location). A protein in a biological network has chemical properties. This information is stored in a node feature matrix, commonly denoted as X.
For a graph with N nodes and F features for each node, the feature matrix X has dimensions N×F. Each row i of the matrix corresponds to node i and contains its feature vector.
Let's assign two features to each person in our example graph: their age and the group they belong to (encoded as 0 or 1).
This information can be organized into a 4×2 feature matrix X:
$$
X =
\begin{bmatrix}
25 & 0 \\
30 & 0 \\
22 & 1 \\
28 & 1
\end{bmatrix}
$$

The first column represents age, and the second represents group membership. This matrix provides the initial state or attributes for each node before any learning occurs. The goal of a GNN is often to use these features, along with the graph structure, to learn more expressive representations of the nodes.
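In code, the feature matrix is simply a 2D array with one row per node. The sketch below uses NumPy and stores the values as floats, which is an assumption made here because most learning libraries expect floating-point inputs.

```python
import numpy as np

# One row per node (0-3); columns are [age, group].
X = np.array(
    [
        [25, 0],
        [30, 0],
        [22, 1],
        [28, 1],
    ],
    dtype=float,
)

num_nodes, num_features = X.shape
print(X.shape)   # (4, 2): N = 4 nodes, F = 2 features

node_2_features = X[2]   # feature vector of node 2: age 22, group 1
ages = X[:, 0]           # the age column across all nodes
```

Row `i` of `X` must refer to the same node as row `i` and column `i` of the adjacency matrix; the two matrices share one node ordering.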
Together, the adjacency matrix A and the node feature matrix X provide a complete numerical representation of an attributed graph. They serve as the two primary inputs to nearly all Graph Neural Network models.
The core operation of a GNN, which we will examine in the next chapter, involves using the structure defined by A to propagate and transform the information contained in X. This allows each node to learn from its neighbors, integrating both its own attributes and its local network context.
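As a rough preview of what propagating $X$ along $A$ can look like, the product $AX$ sums the feature vectors of each node's neighbors. The sketch below is a simplified illustration under that assumption, not the specific layer covered in the next chapter; dividing by the degree to get a neighbor average is one common choice among several.

```python
import numpy as np

A = np.array(
    [
        [0, 1, 1, 0],
        [1, 0, 0, 1],
        [1, 0, 0, 1],
        [0, 1, 1, 0],
    ]
)
X = np.array([[25, 0], [30, 0], [22, 1], [28, 1]], dtype=float)

# Row i of A @ X is the sum of the feature vectors of node i's neighbors.
neighbor_sum = A @ X

# Dividing by each node's degree turns the sum into a neighbor average.
degrees = A.sum(axis=1, keepdims=True)
neighbor_mean = neighbor_sum / degrees

print(neighbor_sum)
print(neighbor_mean)
```

For node 0, whose neighbors are nodes 1 and 2, the summed features are [30 + 22, 0 + 1] = [52, 1], and the average is [26, 0.5]. Because the diagonal of `A` is zero, a node's own features are not included in this aggregate.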
In some graphs, edges also carry features. For example, in a molecular graph, the edges representing chemical bonds can have types (single, double). This information is typically stored in a separate edge feature tensor, adding a third component to the graph's representation. For now, we will focus on the fundamental pairing of A and X.
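One possible layout for such an edge feature tensor is a dense array of shape N×N×(number of edge features), sketched below for formaldehyde (H2C=O) as a toy molecule. The dense layout, the one-hot bond encoding, and the names `bonds` and `bond_types` are all assumptions made for illustration; libraries often use a sparse edge list with a matching per-edge feature array instead.

```python
import numpy as np

# Toy molecular graph for formaldehyde: 0 = C, 1 = O, 2 = H, 3 = H.
bonds = [(0, 1, "double"), (0, 2, "single"), (0, 3, "single")]
bond_types = ["single", "double"]  # one-hot positions for the edge feature

num_atoms = 4
E = np.zeros((num_atoms, num_atoms, len(bond_types)))
for i, j, kind in bonds:
    k = bond_types.index(kind)
    E[i, j, k] = 1.0
    E[j, i, k] = 1.0  # bonds are undirected, so fill both directions

# The adjacency matrix can be recovered from the edge tensor:
# an edge exists wherever any bond-type channel is set.
A_mol = (E.sum(axis=-1) > 0).astype(int)

print(E.shape)   # (4, 4, 2)
print(E[0, 1])   # bond-type encoding of the C=O edge
print(A_mol)
```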
© 2026 ApX Machine Learning