While Graph Convolutional Networks (GCNs) provide a powerful link between spectral graph theory and practical deep learning on graphs, their underlying spectral filter is relatively simple, often a first-order polynomial of the graph Laplacian. This restricts their ability to capture complex spectral patterns and to learn highly localized filters. To address this, more sophisticated spectral GNN architectures have been developed, primarily focusing on designing more expressive and flexible graph filters $g_\theta(\Lambda)$. Two significant examples are ChebNet, which uses Chebyshev polynomial approximations, and CayleyNets, which employ rational Cayley filters.

### ChebNet: Polynomial Filters for Localization

The core idea behind ChebNet [Defferrard et al., 2016] is to approximate the desired spectral filter $g_\theta(\Lambda)$ using a basis of orthogonal polynomials, specifically Chebyshev polynomials of the first kind, $T_k(x)$. These polynomials are defined by the recurrence relation
$$ T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x) $$
with base cases $T_0(x) = 1$ and $T_1(x) = x$.

Chebyshev polynomials are particularly suitable because they are defined on the interval $[-1, 1]$, and the eigenvalues of the normalized graph Laplacian can be scaled to fit this range. Let $L$ be the graph Laplacian (e.g., $L = I - D^{-1/2}AD^{-1/2}$). Its eigenvalues $\lambda_i$ lie in $[0, \lambda_{max}]$, where $\lambda_{max}$ is the largest eigenvalue ($\lambda_{max} \le 2$ for the symmetric normalized Laplacian). We define the scaled Laplacian $\tilde{L} = \frac{2}{\lambda_{max}} L - I$, whose eigenvalues $\tilde{\lambda}_i$ fall within $[-1, 1]$.

The spectral filter $g_\theta(\Lambda)$ acting on the diagonal matrix of eigenvalues $\Lambda$ is then approximated by a $K$-th order polynomial:
$$ g_\theta(\Lambda) \approx \sum_{k=0}^{K} \theta_k T_k(\tilde{\Lambda}) $$
Here, $\theta = (\theta_0, ..., \theta_K)$ are the learnable filter parameters (coefficients in the polynomial basis), and $T_k(\tilde{\Lambda})$ applies the $k$-th Chebyshev polynomial to each eigenvalue on the diagonal of $\tilde{\Lambda}$.

Applying this filter to a graph signal $x$ gives:
$$ y = g_\theta(L) x = U g_\theta(\Lambda) U^T x \approx U \left( \sum_{k=0}^{K} \theta_k T_k(\tilde{\Lambda}) \right) U^T x = \sum_{k=0}^{K} \theta_k \left( U T_k(\tilde{\Lambda}) U^T \right) x $$
Crucially, $U T_k(\tilde{\Lambda}) U^T = T_k(U \tilde{\Lambda} U^T) = T_k(\tilde{L})$. This means we can compute the filtered signal without an explicit eigendecomposition:
$$ y \approx \sum_{k=0}^{K} \theta_k T_k(\tilde{L}) x $$
The terms $T_k(\tilde{L})x$ can be computed efficiently by applying the Chebyshev recurrence directly to the scaled Laplacian $\tilde{L}$ and the signal $x$. A fundamental property is that $T_k(\tilde{L})$, being a polynomial of degree $k$ in the Laplacian, aggregates information only from nodes at most $k$ hops away, so the resulting filter is strictly localized within $K$ hops.
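To make the recurrence concrete, here is a minimal sketch in PyTorch. The function name `cheb_filter` and its arguments are illustrative (not from any library); it computes $y \approx \sum_{k=0}^{K} \theta_k T_k(\tilde{L})x$ without ever forming an eigendecomposition, using only repeated (sparse) matrix-vector products with $\tilde{L}$.

```python
import torch

def cheb_filter(L_scaled: torch.Tensor, x: torch.Tensor, theta: list) -> torch.Tensor:
    """Compute y = sum_k theta[k] * T_k(L_scaled) @ x via the Chebyshev recurrence.

    L_scaled : (N, N) scaled Laplacian  L~ = (2 / lambda_max) * L - I  (dense or sparse)
    x        : (N,) or (N, F) graph signal
    theta    : K+1 scalar coefficients theta_0, ..., theta_K
    """
    Tx_prev = x                          # T_0(L~) x = x
    y = theta[0] * Tx_prev
    if len(theta) == 1:
        return y
    Tx_curr = L_scaled @ x               # T_1(L~) x = L~ x
    y = y + theta[1] * Tx_curr
    for k in range(2, len(theta)):
        # Recurrence: T_k(L~) x = 2 L~ (T_{k-1}(L~) x) - T_{k-2}(L~) x
        Tx_next = 2 * (L_scaled @ Tx_curr) - Tx_prev
        y = y + theta[k] * Tx_next
        Tx_prev, Tx_curr = Tx_curr, Tx_next
    return y

# Toy example: path graph on 4 nodes with the symmetric normalized Laplacian
A = torch.tensor([[0., 1., 0., 0.],
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [0., 0., 1., 0.]])
deg = A.sum(dim=1)
D_inv_sqrt = torch.diag(deg.pow(-0.5))
L = torch.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt
lam_max = torch.linalg.eigvalsh(L).max()          # exact here; in practice an estimate suffices
L_scaled = (2.0 / lam_max) * L - torch.eye(4)
y = cheb_filter(L_scaled, torch.randn(4), theta=[0.5, 0.3, 0.2])   # K = 2 filter
```

Each loop iteration costs one multiplication by $\tilde{L}$, which is why the overall cost scales with $K$ times the number of edges for a sparse graph.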
A ChebNet layer with input features $H^{(l)} \in \mathbb{R}^{N \times F_{in}}$ and output features $H^{(l+1)} \in \mathbb{R}^{N \times F_{out}}$ is defined as:
$$ H^{(l+1)}_{:, j} = \sigma \left( \sum_{i=1}^{F_{in}} \sum_{k=0}^{K} \Theta^{(l)}_{k, i, j} \, T_k(\tilde{L}) H^{(l)}_{:, i} \right) \quad \text{for } j=1, ..., F_{out} $$
or, more compactly, viewing $\Theta_k^{(l)}$ as parameter matrices of size $F_{in} \times F_{out}$:
$$ H^{(l+1)} = \sigma\left( \sum_{k=0}^{K} T_k(\tilde{L}) H^{(l)} \Theta_k^{(l)} \right) $$
where $\sigma$ is a non-linear activation function.

**Relationship to GCN:** GCN can be viewed as a simplification of ChebNet with $K=1$. Setting $K=1$ gives $y \approx \theta_0 T_0(\tilde{L})x + \theta_1 T_1(\tilde{L})x = \theta_0 x + \theta_1 \tilde{L}x$. By further assuming $\lambda_{max} \approx 2$, so that $\tilde{L} \approx L - I$, and using a single parameter $\theta = \theta_0 = -\theta_1$, we recover an operation similar to the GCN propagation rule (after the renormalization trick). With $K>1$, ChebNet can learn much richer localized filters by controlling the polynomial order $K$ and the coefficients $\theta_k$.

### CayleyNets: Rational Filters for Spectral Zooming

CayleyNets [Levie et al., 2018] propose rational spectral filters based on the Cayley transform, aiming for better localization in the spectral domain, particularly for selecting specific frequency bands (spectral zooming).

The Cayley transform is defined as $c(x) = (x - i)/(x + i)$, where $i$ is the imaginary unit; applied to the rescaled Laplacian $hL$ with a real parameter $h > 0$, it gives $c(hL) = (hL - iI)(hL + iI)^{-1}$. A Cayley filter acting on the Laplacian $L$ is built from powers of this transform:
$$ g_\theta(L) = \theta_0 I + 2 \, \text{Re} \sum_{j=1}^{r} \theta_j (hL - iI)^j (hL + iI)^{-j} $$
Here, $r$ controls the filter's complexity (analogous to $K$ in ChebNet), and the learnable coefficients are $\theta_0$ (real) and $\theta_1, ..., \theta_r$ (complex). The parameter $h$ acts as a spectral zoom: rescaling the spectrum by $h$ before applying the Cayley transform controls which part of the spectrum is spread out over the unit circle, and hence which frequency band the filter can resolve most finely.

Applying this filter requires computing terms of the form $(hL + iI)^{-j}x$, which means solving complex linear systems $(hL + iI)y = z$. Because $hL+iI$ inherits the sparsity of the Laplacian, these systems can be solved efficiently and approximately with sparse iterative solvers (the original paper uses a fixed number of Jacobi iterations).

**Advantages:** Cayley filters, being rational functions, can approximate band-pass filters more effectively than polynomials of the same order. This allows CayleyNets to potentially isolate specific frequency ranges in a graph signal, which can be beneficial for tasks where distinguishing between low-frequency (smooth) and high-frequency (rapidly varying) components is important.
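To illustrate how a Cayley filter is applied, here is a minimal dense sketch. It uses a direct complex solve (`torch.linalg.solve`) purely for clarity; on large graphs one would use the sparse iterative schemes mentioned above. The function name and its arguments are hypothetical, not a library API.

```python
import torch

def cayley_filter(L: torch.Tensor, x: torch.Tensor, theta0: float,
                  theta: list, h: float) -> torch.Tensor:
    """Apply g(L) x = theta0 * x + 2 Re{ sum_j theta[j] * (hL - iI)^j (hL + iI)^{-j} x }.

    L      : (N, N) dense real graph Laplacian
    x      : (N,) real graph signal
    theta0 : real coefficient
    theta  : list of r complex coefficients theta_1, ..., theta_r
    h      : spectral zoom parameter (> 0)
    """
    N = L.shape[0]
    I = torch.eye(N, dtype=torch.cfloat)
    hL = (h * L).to(torch.cfloat)
    numer = hL - 1j * I                      # (hL - iI)
    denom = hL + 1j * I                      # (hL + iI)
    acc = torch.zeros(N, dtype=torch.cfloat)
    v = x.to(torch.cfloat)
    for c in theta:
        # (hL - iI) and (hL + iI) commute, so (hL - iI)^j (hL + iI)^{-j} x can be
        # built up by applying the Cayley transform once per power j.
        v = numer @ torch.linalg.solve(denom, v)
        acc = acc + c * v
    return theta0 * x + 2 * acc.real

# Toy usage (any small symmetric Laplacian L and real signal x of matching size):
# y = cayley_filter(L, torch.randn(L.shape[0]), theta0=0.5,
#                   theta=[0.2 + 0.1j, 0.05 - 0.02j], h=1.0)
```

The loop applies one Cayley transform per coefficient, so the dominant cost is the $r$ linear solves with $hL + iI$.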
### Comparing ChebNet and CayleyNets

| Feature | ChebNet | CayleyNets |
|---|---|---|
| Filter type | Polynomial (Chebyshev) | Rational (Cayley transform) |
| Localization | Spatial: strictly $K$-hop neighborhood | Spectral: better frequency band selection |
| Parameters | Polynomial order $K$, coefficients $\theta_k$ | Filter order $r$, zoom parameter $h$, coefficients $\theta_j$ |
| Computation | Recursive polynomial application ($T_k(\tilde{L})x$) | Solving complex linear systems ($(hL+iI)y=z$) |
| Flexibility | Good general-purpose filter approximation | Better spectral zooming capabilities |
| Complexity | $O(K\lvert\mathcal{E}\rvert)$ per feature per layer | Depends on linear solver efficiency |

Both ChebNet and CayleyNets represent significant advances over the basic GCN filter by enabling more complex and tailored spectral filtering. ChebNet provides spatially localized filters whose reach ($K$) is explicitly controlled; CayleyNets offer potentially sharper filters in the frequency domain. The choice between them depends on the application: if explicit control over the spatial neighborhood size is desired, ChebNet is a natural choice; if precise frequency selection is the goal, CayleyNets may be more suitable.

These advanced spectral methods demonstrate the richness of designing GNNs from a graph signal processing perspective. While spatial methods (discussed next) have become very popular due to their inductive capabilities and simpler formulation, understanding spectral approaches like ChebNet and CayleyNets provides valuable insight into the expressive power and theoretical underpinnings of graph convolutions. Implementations of layers like ChebConv are available in standard GNN libraries, allowing for practical experimentation with these architectures; a minimal usage sketch is shown below.
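As a pointer to practice, the following sketch builds a two-layer ChebNet for node classification with PyTorch Geometric's `ChebConv` layer. It assumes `torch_geometric` is installed; the model class, dimensions, and toy graph are illustrative, so check the library documentation for the exact constructor options.

```python
import torch
from torch_geometric.nn import ChebConv  # requires torch_geometric

class ChebNet(torch.nn.Module):
    """Two-layer ChebNet for node classification (sketch)."""
    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int, K: int = 3):
        super().__init__()
        self.conv1 = ChebConv(in_dim, hidden_dim, K=K)
        self.conv2 = ChebConv(hidden_dim, num_classes, K=K)

    def forward(self, x, edge_index):
        x = torch.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

# Toy usage: 4 nodes with 3-dimensional features on an undirected path graph
x = torch.randn(4, 3)
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]])
model = ChebNet(in_dim=3, hidden_dim=8, num_classes=2, K=3)
out = model(x, edge_index)   # shape (4, 2): class scores per node
```

The hyperparameter `K` plays the role of the polynomial order discussed above, so it directly controls how many hops each layer aggregates over.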