Standard deep learning models like Multi-Layer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs) have achieved remarkable success on tasks involving images, text, and tabular data. However, their core architectural assumptions break down when applied directly to graph-structured data. The unique properties of graphs present several fundamental challenges that these conventional architectures are not equipped to handle.
Consider a basic MLP, a fully connected network designed to accept a fixed-size feature vector as input. This presents an immediate problem: graphs do not have a fixed size. A social network can grow from 1,000 to 1,001 users, and a molecular database might contain compounds with different numbers of atoms. There is no straightforward way to create a single, fixed-size input vector for a collection of graphs that vary in their number of nodes and edges.
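A minimal NumPy sketch makes the mismatch concrete (the two graphs here are hypothetical, represented by their adjacency matrices): flattening graphs of different sizes produces vectors of different lengths, so no single MLP input layer can accept them all.

```python
import numpy as np

# Two hypothetical graphs of different sizes, as adjacency matrices.
graph_small = np.zeros((3, 3))  # e.g., a 3-atom molecule
graph_large = np.zeros((5, 5))  # e.g., a 5-atom molecule

# Flattening yields inputs of different lengths: 9 vs. 25.
# An MLP's first layer expects exactly one input dimension,
# so it cannot handle both graphs.
print(graph_small.flatten().shape)  # (9,)
print(graph_large.flatten().shape)  # (25,)
```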
Even more significant is the problem of node ordering. Graphs have no canonical order. The two diagrams below represent the exact same graph structure, but the nodes are listed in a different order.
A simple directed graph structure.
If we represent this graph with an adjacency matrix, the matrix's appearance depends entirely on the arbitrary order we choose for the nodes. Suppose the graph's edges are A→B and B→C. For the ordering (A, B, C), the adjacency matrix is:

$$\mathbf{A}_1 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
However, if we choose the ordering (B, C, A), we get a completely different matrix, $\mathbf{A}_2$:

$$\mathbf{A}_2 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$
An MLP would treat the flattened versions of $\mathbf{A}_1$ and $\mathbf{A}_2$ as completely different inputs, failing to recognize that they describe the same object. A model for graphs must be permutation invariant (for graph-level tasks, the output is unchanged when the nodes are reordered) or permutation equivariant (for node-level tasks, the outputs are reordered in exactly the same way as the nodes), so that predictions do not depend on the arbitrary ordering of nodes. Standard MLPs lack both properties.
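This is easy to verify numerically. The sketch below builds $\mathbf{A}_1$ for the edge set assumed above, applies the reordering as a permutation matrix $\mathbf{P}$ via $\mathbf{P}\mathbf{A}_1\mathbf{P}^\top$, and confirms that the flattened inputs an MLP would see are different.

```python
import numpy as np

# Adjacency matrix for edges A->B and B->C under ordering (A, B, C).
A1 = np.array([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])

# Permutation matrix taking the ordering (A, B, C) to (B, C, A):
# row i of P selects the old index of the new i-th node.
P = np.array([[0, 1, 0],   # new node 0 is B (old index 1)
              [0, 0, 1],   # new node 1 is C (old index 2)
              [1, 0, 0]])  # new node 2 is A (old index 0)

A2 = P @ A1 @ P.T
print(A2)
# [[0 1 0]
#  [0 0 0]
#  [1 0 0]]

# Same graph, yet an MLP would receive two different flattened vectors.
print(np.array_equal(A1.flatten(), A2.flatten()))  # False
```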
Convolutional Neural Networks have become the standard for computer vision tasks because they are designed to exploit the spatial locality of pixels in a grid. A core operation in a CNN is the convolution, where a small, learnable filter (or kernel) slides across the image. This works because every pixel's neighborhood is a regular, fixed-size grid (e.g., 3x3 or 5x5).
Graphs do not share this grid-like regularity. A node's local neighborhood is defined by its connections, and its size can vary dramatically. One node in a citation network may be connected to two others, while a foundational paper might be connected to two thousand. There is no concept of "up," "down," "left," or "right," and no way to apply a fixed-size filter that makes sense for every node.
The regular, grid-like neighborhood processed by CNNs (left) versus the irregular, variable-sized neighborhood of a node in a graph (right).
Forcing a graph into a grid format would discard the precise relational information that defines its structure, which is often the most important information we want the model to learn from.
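A short sketch with a hypothetical citation network (stored as an adjacency list) shows why no fixed-size filter makes sense: every node has a differently sized, unordered neighborhood.

```python
# A hypothetical undirected citation network as an adjacency list:
# each node maps to the set of its neighbors.
graph = {
    "paper_a": {"paper_b", "paper_c"},
    "paper_b": {"paper_a", "paper_c"},
    "paper_c": {"paper_a", "paper_b", "paper_d", "paper_e"},
    "paper_d": {"paper_c"},
    "paper_e": {"paper_c"},
}

# A 3x3 CNN filter always sees exactly 9 pixels in a fixed spatial
# layout. Here, each node's neighborhood is a set of varying size
# with no notion of "up", "down", "left", or "right".
for node, neighbors in graph.items():
    print(node, "has", len(neighbors), "neighbors")
```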
Recurrent Neural Networks are built for sequences where order carries meaning, such as words in a sentence or events in a time series. One might attempt to apply an RNN to a graph by performing a "walk" to generate a sequence of nodes. However, just as there is no canonical node ordering, there is no canonical path through a graph.
Starting a walk from a different node, or choosing a different path at an intersection, produces a completely different sequence. Feeding these varied sequences into an RNN would yield inconsistent representations for the exact same graph, making it impossible for the model to learn stable and meaningful patterns.
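The instability is easy to demonstrate with a toy graph (hypothetical, stored as an adjacency list): random walks from different starting nodes over the exact same structure hand an RNN entirely different sequences.

```python
import random

# A hypothetical undirected graph as an adjacency list.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

def random_walk(graph, start, length):
    """Return one node sequence produced by walking randomly from start."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

random.seed(0)
# Two walks over the same graph yield different sequences, so an RNN
# would see inconsistent inputs for one and the same object.
print(random_walk(graph, "a", 5))
print(random_walk(graph, "d", 5))
```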
In summary, these limitations all point to a central challenge: standard neural networks are not designed to process the explicit relational structure encoded in a graph's topology. An MLP discards this structure by flattening it, a CNN requires a regular grid that graphs lack, and an RNN imposes a sequential order that is artificial. To perform machine learning on graphs effectively, we need a different class of models that operates directly on the graph, treating nodes and their connections as fundamental components of the computation. This is precisely the function of Graph Neural Networks.
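As a small preview of the property such models need, a sum over node features is permutation invariant: reordering the nodes leaves a graph-level readout unchanged. A minimal sketch with hypothetical node features:

```python
import numpy as np

# Hypothetical node feature matrix X: one row per node.
X = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 1.0]])

# Reorder the nodes as (B, C, A) by permuting the rows.
X_permuted = X[[1, 2, 0]]

# Summing over nodes gives the same graph-level vector either way,
# unlike the flattened adjacency inputs an MLP would consume.
print(X.sum(axis=0))           # [4. 3.]
print(X_permuted.sum(axis=0))  # [4. 3.]
```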