Theory provides the blueprint, but a solid understanding comes from working with data directly. In this section, we load a classic graph dataset with the Python library NetworkX and examine its properties, turning abstract representations such as the adjacency matrix and the feature matrix into concrete arrays. This process connects the mathematical definitions with their practical implementation.
Before we begin, you will need to install NetworkX and matplotlib for visualization. NetworkX is a powerful library for creating, manipulating, and studying the structure and dynamics of complex networks. You can install these packages using pip:
pip install networkx matplotlib
With the environment ready, we can proceed to load and inspect our first graph.
We will use a well-known social network dataset called "Zachary's Karate Club". This graph represents the social relationships between 34 members of a university karate club in the 1970s. A conflict between the club's administrator and its instructor led the club to split into two factions. The graph's edges represent friendships outside the club, and the task is often to predict which faction each member joined after the split.
Fortunately, this dataset is included with NetworkX, making it very easy to load.
import networkx as nx
import matplotlib.pyplot as plt
# Load the Zachary's Karate Club graph
G = nx.karate_club_graph()
# Print some basic information about the graph
print(f"Number of nodes: {G.number_of_nodes()}")
print(f"Number of edges: {G.number_of_edges()}")
Executing this code will produce the following output:
Number of nodes: 34
Number of edges: 78
This tells us our graph has 34 nodes (club members) and 78 edges (friendships).
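Beyond the raw counts, a few summary statistics help characterize the graph. The following is a minimal sketch using the standard NetworkX helpers `nx.density` and `nx.is_connected`; the variable names are illustrative and not part of the original walkthrough.

```python
import networkx as nx

G = nx.karate_club_graph()

n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()

# In an undirected graph, each edge contributes to the degree of two nodes
avg_degree = 2 * n_edges / n_nodes

# Density: fraction of all possible node pairs that are actually connected
density = nx.density(G)

print(f"Average degree: {avg_degree:.2f}")
print(f"Density: {density:.3f}")
print(f"Connected: {nx.is_connected(G)}")
```

A low density combined with a single connected component is typical of small social networks: most members know only a few others, yet everyone is reachable from everyone else.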
In NetworkX, nodes are more than just numbers. They can hold attributes, or features, that contain information about them. In the Karate Club graph, each node has a 'club' attribute indicating which faction the member joined ("Mr. Hi" or "Officer").
Let's inspect node 0 (the instructor) and node 33 (the administrator).
# Access attributes of a specific node
node_0_data = G.nodes[0]
print(f"Node 0 data: {node_0_data}")
# The club attribute represents the faction
print(f"Node 0 joined the '{node_0_data['club']}' faction.")
print(f"Node 33 joined the '{G.nodes[33]['club']}' faction.")
Output:
Node 0 data: {'club': 'Mr. Hi'}
Node 0 joined the 'Mr. Hi' faction.
Node 33 joined the 'Officer' faction.
This 'club' attribute is the ground-truth label we would try to predict in a node classification task. A GNN would use the graph's structure and potentially other node features to make these predictions.
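For a classification task, the string labels are usually converted into an integer vector. The sketch below shows one way to do this; the mapping `label_map` (0 for "Mr. Hi", 1 for "Officer") and the name `y` are assumptions made for illustration, not something the dataset prescribes.

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()

# Hypothetical encoding chosen for illustration: 0 = "Mr. Hi", 1 = "Officer"
label_map = {"Mr. Hi": 0, "Officer": 1}

# One integer label per node, in the graph's node order
y = np.array([label_map[G.nodes[n]["club"]] for n in G.nodes()])

print("Label vector shape:", y.shape)
print("Members per faction:", np.bincount(y))
```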
A great way to build intuition for a graph is to visualize it. We can use matplotlib along with NetworkX's drawing capabilities to plot the graph.
A small portion of the graph might look something like this, where nodes 0 and 33 are the central figures of the two factions.
[Figure: the Karate Club graph, showing connections between members of the two factions led by node 0 ("Mr. Hi") and node 33 ("Officer").]
Now, let's plot the entire graph. We can make our visualization more informative by coloring the nodes according to their club affiliation. This will give us a clear visual of the two communities.
# Create a color map based on the 'club' attribute
colors = []
for node in G:
    if G.nodes[node]['club'] == 'Mr. Hi':
        colors.append('#9775fa')  # Violet
    else:
        colors.append('#ffa94d')  # Orange
# Draw the graph
plt.figure(figsize=(8, 6))
nx.draw_spring(G, with_labels=True, node_color=colors, node_size=500)
plt.title("Zachary's Karate Club Social Network")
plt.show()
This script generates a plot where the two factions are clearly visible. You can see how the friendships (edges) tend to cluster within the two groups, with fewer connections between them. This structure is precisely what a GNN learns from.
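The visual impression of clustering can also be checked numerically. As a rough sketch, the loop below counts how many edges connect members of the same faction and how many cross between factions; the counter names are illustrative.

```python
import networkx as nx

G = nx.karate_club_graph()

intra_edges = 0  # both endpoints belong to the same faction
inter_edges = 0  # endpoints belong to different factions

for u, v in G.edges():
    if G.nodes[u]["club"] == G.nodes[v]["club"]:
        intra_edges += 1
    else:
        inter_edges += 1

print(f"Edges within a faction: {intra_edges}")
print(f"Edges between factions: {inter_edges}")
```

If most edges fall within a faction, a node's neighbors carry a strong signal about its own label, which is the kind of structure neighborhood-based models can exploit.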
As we discussed previously, GNNs don't operate on graph objects directly. They require numerical tensors: an adjacency matrix $A$ and a node feature matrix $X$. NetworkX makes it simple to derive the adjacency matrix.
import numpy as np
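# Get the adjacency matrix as a NumPy array.
# weight=None yields binary 0/1 entries; newer NetworkX releases store
# edge weights on this graph, which would otherwise appear in the matrix.
A = nx.to_numpy_array(G, weight=None)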
print("Adjacency Matrix Shape:", A.shape)
print("A few entries of A:")
print(A[:5, :5])
Output:
Adjacency Matrix Shape: (34, 34)
A few entries of A:
[[0. 1. 1. 1. 1.]
 [1. 0. 1. 1. 0.]
 [1. 1. 0. 1. 0.]
 [1. 1. 1. 0. 0.]
 [1. 0. 0. 0. 0.]]
The shape of the adjacency matrix is $34 \times 34$, or $N \times N$, where $N$ is the number of nodes. A 1 at $A_{ij}$ indicates an edge between node $i$ and node $j$, while a 0 indicates no direct connection.
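A few properties of the adjacency matrix of an undirected graph without self-loops can be verified directly. The checks below are a minimal sketch; they assume the nodes are labeled 0 to 33 in order (as they are in this dataset), so that row index and node label coincide, and they pass `weight=None` so that every entry is 0 or 1.

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
# weight=None ignores any stored edge weights, so every entry is 0 or 1
A = nx.to_numpy_array(G, weight=None)

# Undirected graph: the matrix equals its transpose
print("Symmetric:", np.array_equal(A, A.T))

# No self-loops: the diagonal is all zeros
print("Zero diagonal:", bool(np.all(np.diag(A) == 0)))

# Each edge appears twice, once at (i, j) and once at (j, i)
print("Sum of entries:", int(A.sum()), "= 2 x", G.number_of_edges())

# Row sums recover the node degrees
row_sums = A.sum(axis=1)
print("Row sums match degrees:", all(row_sums[n] == G.degree(n) for n in G.nodes()))
```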
What about the node feature matrix $X$? For this particular dataset, explicit node features are not provided. In such cases, a common strategy is to create features from the graph's structure itself. For example, we could use each node's degree (the number of connections it has) as a simple feature. Another approach is to use an identity matrix, which assigns each node a unique one-hot encoded vector.
For the purpose of this example, let's consider creating a simple feature matrix where each node's feature is just its degree.
# Create a feature matrix where each feature is the node's degree
degrees = [G.degree(n) for n in G.nodes()]
X = np.array(degrees).reshape(-1, 1)
print("Feature Matrix Shape:", X.shape)
print("First 5 features (node degrees):")
print(X[:5])
Output:
Feature Matrix Shape: (34, 1)
First 5 features (node degrees):
[[16]
[ 9]
[10]
[ 6]
[ 3]]
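The identity-matrix alternative mentioned earlier can be sketched in the same way. The name `X_identity` is illustrative; the idea is simply that node $i$ receives a vector with a single 1 in position $i$.

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
N = G.number_of_nodes()

# One-hot features: row i has a single 1 in column i
X_identity = np.eye(N)

print("Identity Feature Matrix Shape:", X_identity.shape)
print("Feature vector of node 0 (first 5 entries):", X_identity[0, :5])
```

Compared with the degree feature, this representation carries no structural information on its own, but it lets a model tell every node apart. The trade-off is that the feature dimension grows with the number of nodes.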
We now have our graph represented by two matrices: $A$, which describes the structure, and $X$, which describes the properties of the nodes. These are the fundamental inputs required by most GNN models. With this foundation, we are ready to explore how a GNN uses these matrices to learn from graph data.