Pooling routes
Pooling is relation reduction. Local pooling builds a KernelRelation and
reduces each output row’s input neighbors. Global pooling ignores kernel
geometry and reduces rows by batch metadata.
Local pooling contract
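For a relation edge set \(\mathcal{E}\) whose edges link an input row \(i\) to an output row \(o\), let \(\mathcal{N}(o) = \{\, i : (o, i) \in \mathcal{E} \,\}\) be the input neighbors of output row \(o\) and \(x_i\) the feature row of input \(i\). Local pooling computes:

\[
y_o^{\text{sum}} = \sum_{i \in \mathcal{N}(o)} x_i, \qquad
y_o^{\text{max}} = \max_{i \in \mathcal{N}(o)} x_i, \qquad
y_o^{\text{avg}} = \frac{1}{|\mathcal{N}(o)|} \sum_{i \in \mathcal{N}(o)} x_i
\]

The maximum is taken per feature channel.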
The denominator in average pooling is the sparse neighbor count for the output row. It is not the dense kernel volume unless every dense kernel position is active.
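The contract above can be sketched in plain Python. This is an illustration only, not the library's implementation: the function name `local_pool`, the `(out_row, in_row)` edge layout, and the choice to leave output rows without neighbors at zero are all assumptions made for the example.

```python
from collections import defaultdict


def local_pool(features, edges, num_out, mode):
    """Reduce input rows into output rows along (out_row, in_row) edges.

    Hypothetical helper: `features` is a list of per-row channel lists and
    `edges` is a list of (out_row, in_row) pairs standing in for a relation.
    """
    if mode not in ("sum", "max", "avg"):
        raise ValueError(f"unsupported mode: {mode}")
    channels = len(features[0]) if features else 0

    # Group the input neighbors of every output row.
    neighbors = defaultdict(list)
    for out_row, in_row in edges:
        neighbors[out_row].append(in_row)

    # Assumption: output rows with no neighbors stay at zero.
    out = [[0.0] * channels for _ in range(num_out)]
    for o, rows in neighbors.items():
        for c in range(channels):
            vals = [features[i][c] for i in rows]
            if mode == "max":
                out[o][c] = max(vals)
            elif mode == "sum":
                out[o][c] = sum(vals)
            else:
                # Divide by the sparse neighbor count, not the kernel volume.
                out[o][c] = sum(vals) / len(vals)
    return out


features = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]
edges = [(0, 0), (0, 1), (1, 2)]
pooled_sum = local_pool(features, edges, 2, "sum")
pooled_avg = local_pool(features, edges, 2, "avg")
pooled_max = local_pool(features, edges, 2, "max")
```

In this toy relation, output row 0 has two active neighbors, so its average divides by 2 even if the dense kernel covered more positions.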
Backend routes
| Route | Predicate | Implementation |
|---|---|---|
| CPU local pooling | Valid | CPU relation reduction over edge arrays. |
| Metal local pooling | Valid | |
| Local pooling VJP | Differentiating through local pooling | Sum/avg use direct gradient scatter; max uses max-tie policy. |
| Local pooling JVP | Forward-mode transform | |
| Global pooling | | MLX dense reductions or scatter reductions over batch ids. |
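
The "direct gradient scatter" used by the sum and average VJP routes can be illustrated with a small pure-Python sketch. The function name `local_pool_vjp`, the `(out_row, in_row)` edge layout, and the `counts` argument are hypothetical and chosen for the example. The max route is left out because its tie policy is not specified in this section.

```python
def local_pool_vjp(grad_out, edges, num_in, counts=None):
    """Scatter output gradients back to input rows along the edges.

    Hypothetical helper. With counts=None this is the sum-pooling VJP.
    Passing per-output neighbor counts scales each contribution by
    1 / count, which gives the average-pooling VJP.
    """
    channels = len(grad_out[0]) if grad_out else 0
    grad_in = [[0.0] * channels for _ in range(num_in)]
    for o, i in edges:
        scale = 1.0 if counts is None else 1.0 / counts[o]
        for c in range(channels):
            # Accumulate, because an input row may feed several outputs.
            grad_in[i][c] += scale * grad_out[o][c]
    return grad_in


edges = [(0, 0), (0, 1), (1, 1)]
grad_out = [[2.0], [4.0]]
grad_sum = local_pool_vjp(grad_out, edges, num_in=2)
grad_avg = local_pool_vjp(grad_out, edges, num_in=2, counts=[2, 1])
```

Input row 1 appears in two edges here, so its gradient is the sum of two contributions in both modes.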
Input-exclusive gradient path
The pooling backend carries an input_exclusive flag derived from kernel
geometry. When each input row contributes to at most one output row, the
gradient path can use an exclusive input-gradient kernel. Otherwise it uses the
sum/avg or max relation-gradient route.
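
To make the exclusivity condition concrete, the sketch below tests it directly on an edge list and shows why the exclusive case needs no accumulation. This is an assumption-laden illustration: the document says the flag is derived from kernel geometry, whereas this example inspects hypothetical `(out_row, in_row)` edges, and both function names are invented for the example.

```python
from collections import Counter


def is_input_exclusive(edges):
    """True when every input row feeds at most one output row."""
    fan_out = Counter(in_row for _, in_row in edges)
    return all(n <= 1 for n in fan_out.values())


def exclusive_sum_grad(grad_out, edges, num_in):
    """Sum-pooling input gradient for an input-exclusive relation."""
    if not is_input_exclusive(edges):
        raise ValueError("edges are not input-exclusive")
    channels = len(grad_out[0]) if grad_out else 0
    grad_in = [[0.0] * channels for _ in range(num_in)]
    for o, i in edges:
        # Each input row is written once, so a plain copy is enough.
        grad_in[i] = list(grad_out[o])
    return grad_in


# 1D windows of size 2 with stride 2 do not overlap.
non_overlapping = [(0, 0), (0, 1), (1, 2), (1, 3)]
# 1D windows of size 2 with stride 1 share input row 1.
overlapping = [(0, 0), (0, 1), (1, 1), (1, 2)]

exclusive_flag = is_input_exclusive(non_overlapping)
overlap_flag = is_input_exclusive(overlapping)
grad = exclusive_sum_grad([[1.0], [2.0]], non_overlapping, num_in=4)
```

Because no input row is written twice in the exclusive case, a parallel kernel could assign gradients without atomic or ordered accumulation, which is the usual motivation for a separate path.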
Validation boundaries
Local pooling currently validates:
- feature dtype is `float32`;
- Metal coordinates are `int32`;
- mode is `sum`, `max`, or `avg`;
- relation metadata includes output coordinates, counts, kernel count, and output capacity.
Global pooling validates:
- `batch_counts` is present;
- empty batches are allowed for sum and average;
- empty batches are rejected for max pooling.
Global pooling formulas
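For batch \(b\) with row set \(R_b\) and feature rows \(x_r\):

\[
y_b^{\text{sum}} = \sum_{r \in R_b} x_r, \qquad
y_b^{\text{avg}} = \frac{1}{|R_b|} \sum_{r \in R_b} x_r, \qquad
y_b^{\text{max}} = \max_{r \in R_b} x_r
\]

The maximum is taken per feature channel, and the average is written here for non-empty \(R_b\).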
global_max_pool requires \(|R_b| > 0\) for every batch because there is
no finite feature value that represents the maximum of an empty sparse set.
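
The batch-level reductions and the empty-batch rule can be sketched as follows. The sketch makes several assumptions that are not taken from the library: the name `global_pool`, the mode strings, rows stored contiguously in batch order, and a zero output for empty batches under sum and average.

```python
def global_pool(features, batch_counts, mode):
    """Reduce feature rows per batch using per-batch row counts.

    Hypothetical helper: assumes rows are stored contiguously in batch
    order, so batch b owns the next batch_counts[b] rows.
    """
    if batch_counts is None:
        raise ValueError("batch_counts is required")
    if mode not in ("sum", "max", "avg"):
        raise ValueError(f"unsupported mode: {mode}")
    if sum(batch_counts) != len(features):
        raise ValueError("batch_counts does not match the number of rows")
    if mode == "max" and any(n == 0 for n in batch_counts):
        # No finite value represents the maximum of an empty set.
        raise ValueError("max pooling rejects empty batches")

    channels = len(features[0]) if features else 0
    out, start = [], 0
    for n in batch_counts:
        rows = features[start:start + n]
        start += n
        if n == 0:
            # Assumption: empty batches produce zeros for sum and avg.
            out.append([0.0] * channels)
            continue
        cols = list(zip(*rows))  # one tuple of values per channel
        if mode == "sum":
            out.append([sum(col) for col in cols])
        elif mode == "avg":
            out.append([sum(col) / n for col in cols])
        else:
            out.append([max(col) for col in cols])
    return out


features = [[1.0, 4.0], [3.0, 2.0], [5.0, 6.0]]
batch_counts = [2, 0, 1]  # the middle batch is empty
global_sum = global_pool(features, batch_counts, "sum")
global_avg = global_pool(features, batch_counts, "avg")
```

Calling the same helper with `"max"` on these counts raises `ValueError`, mirroring the requirement that every batch be non-empty for max pooling.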