Lecture 3.5
Steerable GNNs
We review the state of the art in steerable GNNs. These methods use the full machinery of SO(3) representation theory (Wigner-D matrices, Clebsch-Gordan tensor products) to process 3D data equivariantly.
1. The Steerable GNN Recipe
Most steerable GNNs follow a similar pattern for their layers:
- Geometric Embedding: Encode the relative position $\Delta x_{ij} = x_j - x_i$ into a steerable vector $Y(\Delta x_{ij})$ using spherical harmonics.
- Feature Interaction: Combine neighbor features $h_j$ with geometric embedding using the Clebsch-Gordan tensor product: $$ m_{ij} = h_j \otimes_{\text{CG}} Y(\Delta x_{ij}) $$
- Aggregation: Sum messages from all neighbors.
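The recipe above can be sketched in a few lines. This is a minimal illustration, not any particular library's API: neighbor features are restricted to scalars ($l=0$), so the CG product $l=0 \otimes l=1 \to l=1$ reduces to scalar multiplication, and each message is simply a vector feature.

```python
import numpy as np

def sph_harm_l1(dx):
    """Real l=1 spherical harmonics of a direction: proportional to the
    unit vector (the constant factor sqrt(3 / 4 pi) is omitted)."""
    return dx / np.linalg.norm(dx, axis=-1, keepdims=True)

def steerable_message_layer(pos, h, w):
    """One steerable message-passing step, simplified to scalar (l=0)
    neighbor features so m_ij = h_j * Y_1(dx_ij) is an l=1 (vector) message.

    pos: (N, 3) node positions
    h:   (N,)   invariant scalar features
    w:   scalar learned weight (a stand-in for a learned radial filter)
    """
    n = pos.shape[0]
    out = np.zeros((n, 3))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = pos[j] - pos[i]                # geometric embedding input
            out[i] += w * h[j] * sph_harm_l1(dx)  # message, then summed
    return out
```

Because every ingredient is equivariant, rotating all input positions by an orthogonal matrix rotates the output vectors by the same matrix; this can be checked numerically with a random rotation.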
2. Key Architectures
- Tensor Field Networks (TFN): The pioneering work that introduced rotation-equivariant point convolutions. It uses the recipe above, interpretable as a continuous convolution with steerable kernels.
- SE(3)-Transformers: Adds an attention mechanism: the aggregation is weighted by invariant attention coefficients $\alpha_{ij}$, allowing the network to focus on specific neighbors while maintaining equivariance.
- Cormorant: Designed for molecular physics, explicitly modeling physical interactions (dipoles, quadrupoles) using high-order tensor products.
- PaiNN: An efficient architecture that restricts features to scalars ($l=0$) and vectors ($l=1$), avoiding expensive higher-order CG products while still achieving state-of-the-art results on many tasks.
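To make the PaiNN-style idea concrete, here is a hypothetical minimal update step (not PaiNN's actual layer) that keeps only scalar and vector features. Equivariance is preserved by mixing vector channels linearly and gating them with quantities built purely from invariants, so no CG product is ever needed.

```python
import numpy as np

def scalar_vector_update(s, v, W1, W2):
    """A scalar/vector interaction in the spirit of PaiNN (minimal sketch).

    s:  (F,)   invariant scalar features
    v:  (F, 3) equivariant vector features
    W1, W2: (F, F) weight matrices
    """
    u = W1 @ v                         # linear channel mixing: still equivariant
    gate = np.tanh(W2 @ s + np.linalg.norm(u, axis=-1))  # invariant gate
    s_new = s + gate                   # scalar update (invariant)
    v_new = v + gate[:, None] * u      # vector update (equivariant)
    return s_new, v_new
```

Rotating the input vectors rotates the output vectors identically while leaving the scalars unchanged, since the gate depends only on rotation-invariant norms.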
3. Physical Interpretation
Higher-order steerable features ($l=0, 1, 2, \dots$) align naturally with the multipole expansion in physics: charge ($l=0$), dipole ($l=1$), quadrupole ($l=2$). By using these features, the network learns to reason in the language of physics.