Lecture 1.4
Example: Histopathology
We demonstrate the practical utility of Group CNNs with an application to Histopathology: the detection of mitotic cells in tumor tissue sections.
1. The Problem: Mitosis Detection
Pathologists assess cancer grade by counting "mitotic figures" (cells undergoing division). These cells have a distinct appearance but can appear at any orientation. Whether a cell is "healthy" or "mitotic" does not depend on how it happens to be oriented under the microscope.
Input: RGB Image Patch.
Output: Binary Label (Mitotic / Non-Mitotic).
Constraint: The prediction must be Rotation Invariant.
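The constraint can be written as a one-line formula. The symbols are assumptions introduced here for illustration, since the lecture does not fix notation: $x$ is the input patch, $f$ is the classifier, and $R_\theta$ rotates the patch by angle $\theta$.

```latex
f(R_\theta\, x) = f(x) \qquad \text{for every rotation angle } \theta
```

A network built on a discrete set of rotations (for example multiples of 90 degrees) can satisfy this exactly only for those rotations; for angles in between it holds approximately.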
2. Architecture: Invariant via Equivariant
Although the final task is invariant, we solve it by being equivariant throughout the network.
- Lifting Layer: The input patch is lifted to a group feature map (positions + orientations).
- Group Convolutions: We process these maps, detecting patterns of patterns at specific relative poses. Point-wise non-linearities (like ReLU) are applied, which preserve equivariance.
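- Projection (Pooling): Only at the very end do we perform a Global Max Pooling over the rotation axis. This collapses the group feature map to a spatial map that no longer depends on which orientation channel responded. Pooling over the spatial positions as well yields a single vector that is invariant to the input's rotation.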
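The three steps above can be sketched in NumPy for the simplest case, the four 90-degree rotations. This is a minimal illustration under stated assumptions, not the architecture used in the experiments: the function names `lift_p4` and `invariant_score`, the random patch, and the untrained random kernel are all hypothetical, the input is single-channel, and the intermediate group convolution layers are omitted so that only lifting, ReLU, and pooling remain.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def lift_p4(image, kernel):
    """Lifting layer: correlate a 2D image with four rotated copies of one kernel.

    Returns a group feature map of shape (4, H - k + 1, W - k + 1):
    axis 0 indexes orientation, axes 1 and 2 index position.
    """
    windows = sliding_window_view(image, kernel.shape)        # (H', W', k, k)
    bank = np.stack([np.rot90(kernel, r) for r in range(4)])  # (4, k, k)
    return np.einsum("xyij,rij->rxy", windows, bank)


def invariant_score(image, kernel):
    """ReLU, then max over orientations and positions (the projection step)."""
    fmap = np.maximum(lift_p4(image, kernel), 0.0)  # point-wise ReLU
    per_position = fmap.max(axis=0)                 # pool over the rotation axis
    return per_position.max()                       # pool over positions


rng = np.random.default_rng(0)
image = rng.normal(size=(9, 9))   # hypothetical single-channel patch
kernel = rng.normal(size=(3, 3))  # hypothetical (untrained) filter

fmap = lift_p4(image, kernel)
fmap_rot = lift_p4(np.rot90(image), kernel)

# Equivariance: rotating the input rotates every spatial plane
# and cyclically shifts the orientation axis by one step.
expected = np.roll(np.rot90(fmap, axes=(1, 2)), shift=1, axis=0)
print(np.allclose(fmap_rot, expected))

# Invariance after pooling: one score for all four rotations of the patch.
scores = [float(invariant_score(np.rot90(image, r), kernel)) for r in range(4)]
print(np.allclose(scores, scores[0]))
```

The first check shows why pooling can wait until the end: a rotation of the input only permutes the entries of the group feature map, so a maximum over all orientations and positions cannot change.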
3. Results: Data Efficiency & Stability
We compared Regular Group CNNs (G-CNN) against standard CNNs trained with data augmentation.
Sample Efficiency
G-CNNs are significantly more sample-efficient. A G-CNN trained on only 25% of the data achieves comparable or better performance than a standard CNN trained on 100% of the data with augmentation. This is because the G-CNN shares weights across rotations—it doesn't need to "learn" that a rotated edge is still an edge; it knows this by design.
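The weight sharing can be made concrete with a small NumPy sketch, again assuming the four 90-degree rotations. The helper names `p4_filter_bank` and `shared_kernel_gradient` and the random gradient are hypothetical; the point is only that four orientation filters are generated from one stored kernel, and that the learning signal from every orientation is folded back onto that same kernel.

```python
import numpy as np


def p4_filter_bank(kernel):
    """Four orientation filters generated from one stored kernel."""
    return np.stack([np.rot90(kernel, r) for r in range(4)])


def shared_kernel_gradient(bank_gradient):
    """Fold per-orientation gradients back onto the single stored kernel.

    Since bank[r] = rot90(kernel, r), the chain rule undoes each rotation
    and sums the four contributions.
    """
    return sum(np.rot90(bank_gradient[r], -r) for r in range(4))


rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3))
bank = p4_filter_bank(kernel)

# Stored parameters versus filter values actually applied to the image.
print(kernel.size, bank.size)

# Hypothetical upstream gradient for each of the four orientation filters.
bank_gradient = rng.normal(size=bank.shape)
kernel_gradient = shared_kernel_gradient(bank_gradient)
print(kernel_gradient.shape)
```

A training example that excites only one orientation filter still updates the stored kernel, and therefore all four orientations at once. A standard CNN would have to see that pattern in each orientation separately, either in the data or through augmentation.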
Geometric Stability
If we rotate the input image and look at the predicted class probability:
- Standard CNN: The prediction fluctuates wildly (e.g., Healthy -> Mitotic -> Healthy) as the image rotates.
- Group CNN: The prediction remains constant across rotations (up to discretization artifacts).
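A toy version of this rotation sweep can be run in a few lines of NumPy. It is a sketch under strong assumptions and does not reproduce the experiment: the patch is a synthetic horizontal edge, the detector is hand-picked rather than trained, the "standard" model is a single fixed-orientation filter without augmentation, and only 90-degree rotations are tested. The name `max_response` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def max_response(image, kernels):
    """Largest ReLU response of any kernel at any position."""
    windows = sliding_window_view(image, kernels.shape[1:])
    responses = np.einsum("xyij,rij->rxy", windows, kernels)
    return float(np.maximum(responses, 0.0).max())


# Toy patch: bright top half, dark bottom half (a single horizontal edge).
image = np.zeros((8, 8))
image[:4, :] = 1.0

# Hand-picked detector for that edge.
edge = np.array([[1.0, 1.0, 1.0],
                 [0.0, 0.0, 0.0],
                 [-1.0, -1.0, -1.0]])

standard_bank = edge[None]                                    # one fixed orientation
group_bank = np.stack([np.rot90(edge, r) for r in range(4)])  # all four orientations

# Score each of the four rotated copies of the patch with both detectors.
standard_scores = [max_response(np.rot90(image, r), standard_bank) for r in range(4)]
group_scores = [max_response(np.rot90(image, r), group_bank) for r in range(4)]

print("standard:", standard_scores)
print("group:   ", group_scores)
```

The fixed-orientation detector responds only to the unrotated patch, while the four-orientation bank returns the same score for every rotation. For angles that are not multiples of 90 degrees, interpolation would introduce the small discretization artifacts mentioned above.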
4. Other Domains
The same principles apply to other domains:
- Medical Imaging: Vessel segmentation, lung nodule detection (scale equivariance).
- Audio Analysis: Pitch shifts in audio (e.g., a word spoken with high vs. low pitch) can be modeled, approximately, as a rescaling of the waveform along the time axis. Scale-equivariant CNNs can recognize sounds regardless of pitch.