Introduction to the NetworkX Library

Building and manipulating graph objects in code requires a dedicated tool. In the Python ecosystem, NetworkX is the de facto standard package for this task (it is a third-party package, not part of Python's standard library). It provides data structures for graphs along with a large collection of graph algorithms. Although NetworkX is not a deep learning library, its role in loading, preprocessing, and analyzing graph data makes it an essential part of the GNN toolkit.

Core Operations in NetworkX

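Getting started with NetworkX is straightforward. The primary object is the Graph class, which represents an undirected graph. Nodes can be any hashable Python object (integers, strings, tuples), and both nodes and edges can carry arbitrary attributes. Let's create a simple undirected graph.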

First, you import the library, typically aliased as nx:

import networkx as nx

Next, you can instantiate an empty graph object.

# Create an empty undirected graph
G = nx.Graph()

You can add nodes and edges individually, or in batches from any iterable, such as a list of node labels or a list of edge tuples.

# Add individual nodes
G.add_node(0)
G.add_node(1)

# Add multiple nodes from a list
G.add_nodes_from([2, 3])

# Add an edge between nodes 0 and 1
G.add_edge(0, 1)

# Add multiple edges from a list of tuples
G.add_edges_from([(1, 2), (2, 3), (3, 0)])

Once the graph is built, you can inspect its basic properties:

print(f"Total nodes: {G.number_of_nodes()}")
# Output: Total nodes: 4

print(f"Total edges: {G.number_of_edges()}")
# Output: Total edges: 4

# List all nodes
print(f"Nodes: {list(G.nodes)}")
# Output: Nodes: [0, 1, 2, 3]

# Find the neighbors of a specific node
print(f"Neighbors of node 1: {list(G.neighbors(1))}")
# Output: Neighbors of node 1: [0, 2]
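
Beyond counting nodes and edges, it is often useful to check degrees and test whether a given node or edge exists. The following is a minimal sketch that rebuilds the same four-node graph so it can run on its own; the variable name degrees is chosen here for illustration.

```python
import networkx as nx

# Rebuild the four-node cycle from the example above
G = nx.Graph()
G.add_nodes_from([0, 1, 2, 3])
G.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 0)])

# Degree of every node (number of incident edges)
degrees = dict(G.degree())
print(degrees)
# Output: {0: 2, 1: 2, 2: 2, 3: 2}

# Membership tests for nodes and edges
print(G.has_node(2))     # Output: True
print(G.has_edge(0, 2))  # Output: False
print(G.has_edge(1, 0))  # Output: True (edges are undirected)
```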

Attaching Data to Nodes and Edges

For machine learning, graphs are more than just connections. Nodes and edges often have associated features. NetworkX handles this through attributes, which are stored as Python dictionaries. This is how we begin to bridge the gap between a graph structure and the feature matrices ( $X$ ) used by GNNs.

You can add attributes when creating nodes or update them later.

# Add nodes with attributes (re-adding an existing node updates its attributes)
G.add_node(0, community=0, role='user')
G.add_node(1, community=0, role='user')
G.add_node(2, community=1, role='moderator')
G.add_node(3, community=1, role='user')

# You can also add attributes to edges, such as weights
G.add_edge(0, 1, weight=0.5)
G.add_edge(2, 3, weight=0.9)

To access these attributes, you can treat the G.nodes and G.edges objects like dictionaries.

# Access attributes for a specific node
print(G.nodes[2]['role'])
# Output: moderator

# Access attributes for a specific edge
print(G.edges[0, 1]['weight'])
# Output: 0.5
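
When you need the attributes of many nodes or edges at once, iterating is usually more convenient than indexing one element at a time. The sketch below uses a small standalone graph (not the one above) in which one node and one edge deliberately lack attributes, to show how missing values can be handled; the default weight of 1.0 is an assumption made for this illustration.

```python
import networkx as nx

G = nx.Graph()
G.add_node(0, community=0, role='user')
G.add_node(1)  # no attributes on this node
G.add_node(2, community=1, role='moderator')
G.add_edge(0, 1, weight=0.5)
G.add_edge(1, 2)  # no weight on this edge

# Iterate over (node, attribute dict) pairs
for node, data in G.nodes(data=True):
    print(node, data)

# Collect one attribute; nodes without it are simply omitted
roles = nx.get_node_attributes(G, 'role')
print(roles)
# Output: {0: 'user', 2: 'moderator'}

# Iterate over edges with one attribute, substituting a default when missing
weights = list(G.edges(data='weight', default=1.0))
print(weights)
# Output: [(0, 1, 0.5), (1, 2, 1.0)]
```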

The collection of these attributes across all nodes forms the basis for the node feature matrix we discussed earlier. For example, if each node had a feature vector, you could store it as a NumPy array or list in a 'features' attribute for each node.
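
As a rough illustration of that idea, the sketch below stores a short list under a 'features' attribute on each node and stacks those lists into a matrix $X$ with one row per node. The two-dimensional feature values and the name node_order are made up for this example.

```python
import networkx as nx
import numpy as np

G = nx.Graph()
G.add_node(0, features=[1.0, 0.0])
G.add_node(1, features=[0.0, 1.0])
G.add_node(2, features=[0.5, 0.5])

# Fix a node ordering so that row i of X corresponds to node_order[i]
node_order = list(G.nodes)

# Stack the per-node feature vectors into a (num_nodes, num_features) matrix
X = np.array([G.nodes[n]['features'] for n in node_order], dtype=float)
print(X.shape)
# Output: (3, 2)
```

Keeping an explicit node ordering matters because the same ordering must be used for the adjacency matrix, so that rows of both matrices refer to the same nodes.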

Figure: A simple graph created with NetworkX. Nodes are colored by their 'community' attribute, and some edges have an associated 'weight'.

Directed and Multigraphs

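NetworkX also supports other types of graphs. If the relationships in your data are directional (e.g., a "follows" relationship on a social network), you should use a DiGraph (directed graph). If the same pair of nodes can be connected by more than one edge, use a MultiGraph (undirected) or MultiDiGraph (directed).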

# Create a directed graph
DG = nx.DiGraph()

DG.add_edge('A', 'B') # This creates an edge from A to B
DG.add_edge('B', 'C')

# In a DiGraph, neighbors() is equivalent to successors() (outgoing edges);
# predecessors() follows incoming edges
print(list(DG.successors('A'))) # Output: ['B']
print(list(DG.predecessors('A'))) # Output: []

print(list(DG.successors('B'))) # Output: ['C']
print(list(DG.predecessors('B'))) # Output: ['A']

The syntax remains largely the same, but the underlying behavior respects the direction of edges.
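
The heading of this section also mentions multigraphs, which allow several parallel edges between the same pair of nodes. A minimal sketch, using made-up 'relation' attributes, contrasts a plain Graph with a MultiGraph:

```python
import networkx as nx

# In a plain Graph, adding the same edge twice keeps a single edge
G = nx.Graph()
G.add_edge('A', 'B', relation='friend')
G.add_edge('A', 'B', relation='coworker')  # overwrites the attribute
print(G.number_of_edges())
# Output: 1

# In a MultiGraph, each call creates a separate parallel edge
MG = nx.MultiGraph()
MG.add_edge('A', 'B', relation='friend')
MG.add_edge('A', 'B', relation='coworker')
print(MG.number_of_edges('A', 'B'))
# Output: 2

# Parallel edges are distinguished by an integer key: (u, v, key)
print(MG.edges['A', 'B', 1]['relation'])
# Output: coworker
```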

The Role of NetworkX in the GNN Pipeline

NetworkX is excellent for graph manipulation, visualization, and applying classical graph algorithms like centrality analysis or community detection. It provides a human-readable and flexible way to work with graph data.
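
To make the mention of classical algorithms concrete, here is a small sketch on a toy graph made of two triangles joined by a single edge. It uses degree centrality and greedy modularity maximization as examples; these are just two of many algorithms available, chosen here for illustration.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Two triangles (0-1-2 and 3-4-5) connected by the edge 2-3
G = nx.Graph()
G.add_edges_from([(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)])

# Degree centrality: degree divided by (number of nodes - 1)
centrality = nx.degree_centrality(G)
print(centrality[2])  # node 2 has degree 3 out of 5 possible neighbors

# Community detection by greedy modularity maximization
communities = greedy_modularity_communities(G)
print([sorted(c) for c in communities])
```

On this toy graph, the bridge nodes 2 and 3 receive the highest centrality, and the detected communities should correspond to the two triangles.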

However, it is not optimized for the high-performance numerical computations required for training neural networks. Deep learning frameworks operate on tensors, not NetworkX Graph objects. Therefore, a common workflow is to use NetworkX to load and preprocess a graph, and then convert its structural information (the adjacency matrix) and feature information (the node attributes) into tensors for a library like PyTorch or TensorFlow.
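
The conversion step can be sketched with NumPy arrays, which deep learning frameworks can then wrap as tensors. The example below assumes a small unweighted graph; the name edge_index and its (2, num_edges) layout follow a convention common in GNN libraries and are used here only for illustration.

```python
import networkx as nx
import numpy as np

G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 0)])

# Fix the node ordering so matrix rows and columns are well defined
node_order = sorted(G.nodes)

# Dense adjacency matrix of shape (num_nodes, num_nodes)
A = nx.to_numpy_array(G, nodelist=node_order)
print(A.shape)
# Output: (4, 4)

# Sparse alternative: an edge list of shape (2, num_edges).
# Each undirected edge is listed in both directions.
edges = np.array(list(G.edges)).T
edge_index = np.concatenate([edges, edges[::-1]], axis=1)
print(edge_index.shape)
# Output: (2, 8)

# From here, a framework call such as torch.from_numpy(A) would
# produce a tensor (not run here, to avoid a PyTorch dependency).
```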

In the next section, we will apply these skills to load a well-known graph dataset and inspect its properties, setting the stage for its use in a GNN model later in the course.

