Quantized weights
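QuantizedWeight stores packed affine int4/int8 weights for inference. The
object pairs the packed storage with the metadata needed to execute it, so the
logical shape never has to be inferred from the storage shape. Each logical
value is reconstructed from its integer code q_{g,j}, the j-th code in group g,
using that group's scale s_g and bias b_g:
\[w_{g,j} = s_g q_{g,j} + b_g.\]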
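The reconstruction rule can be illustrated with a minimal pure-Python sketch. The helper name `dequantize_groups` and the flat-list representation are assumptions made for illustration; they are not part of mlx_lattice, which operates on packed arrays.

```python
def dequantize_groups(codes, scales, biases, group_size):
    """Reconstruct w[g][j] = s[g] * q[g][j] + b[g] for a flat row of codes."""
    out = []
    for idx, q in enumerate(codes):
        g = idx // group_size  # index of the group that owns this code
        out.append(scales[g] * q + biases[g])
    return out


# Two groups of four int4 codes, each with its own scale and bias.
codes = [0, 1, 2, 3, 0, 5, 10, 15]
scales = [0.5, 0.25]
biases = [-1.0, 0.25]
row = dequantize_groups(codes, scales, biases, group_size=4)
# row == [-1.0, -0.5, 0.0, 0.5, 0.25, 1.5, 2.75, 4.0]
```

Every code in a group shares one scale and one bias, so the per-group metadata adds only two values per `group_size` codes.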
Supported layouts map to the public operations:
| Layout | Logical source shape | Used by |
|---|---|---|
| linear | (C_out, C_in) | Sparse-feature linear projections. |
| kernel_major | (K, C_in, C_out) | Relation convolution with mapped kernel rows. |
| dense_5d | (C_out, Kx, Ky, Kz, C_in) | Public sparse convolution modules and functions. |
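Because the three layouts have different ranks, a shape alone is enough to tell them apart. The sketch below shows one way to do that. `infer_layout` is a hypothetical helper, and dispatching on rank is an assumption about how a layout could be chosen, not a description of the library's internals.

```python
def infer_layout(shape):
    """Map a logical weight shape to (layout, c_in, c_out)."""
    if len(shape) == 2:  # (C_out, C_in)
        c_out, c_in = shape
        return "linear", c_in, c_out
    if len(shape) == 3:  # (K, C_in, C_out)
        _, c_in, c_out = shape
        return "kernel_major", c_in, c_out
    if len(shape) == 5:  # (C_out, Kx, Ky, Kz, C_in)
        c_out, c_in = shape[0], shape[-1]
        return "dense_5d", c_in, c_out
    raise ValueError(f"unsupported weight shape: {shape}")


layout, c_in, c_out = infer_layout((27, 16, 32))
# layout == "kernel_major", c_in == 16, c_out == 32
```

Note that the position of the input-channel axis differs between layouts, which is why the channel counts are stored as explicit metadata rather than read back from the storage shape.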
- class mlx_lattice.core.quantized.QuantizedWeight(weight, scales, biases, group_size, bits, in_channels, out_channels, kernel_size, layout)
Bases: object
Packed affine INT4/INT8 inference weight.
The object stores packed uint32 integer codes plus per-group affine
scales and biases. Logical values are reconstructed as
scale * code + bias by quantized linear and convolution paths.
layout records the logical source shape:
linear for (C_out, C_in), kernel_major for
(K, C_in, C_out), and dense_5d for
(C_out, Kx, Ky, Kz, C_in).
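To make "packed uint32 integer codes" concrete, here is a minimal sketch of packing eight 4-bit codes (or four 8-bit codes) into one 32-bit word. The helper names are hypothetical, and the lowest-bits-first ordering is an assumption; the actual bit order used by the library's storage is not stated in this page.

```python
def pack_codes(codes, bits):
    """Pack unsigned integer codes into 32-bit words, lowest bits first."""
    if bits not in (4, 8):
        raise ValueError("bits must be 4 or 8")
    per_word = 32 // bits  # 8 codes per word for int4, 4 for int8
    if len(codes) % per_word:
        raise ValueError("code count must be a multiple of codes per word")
    mask = (1 << bits) - 1
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for slot, code in enumerate(codes[start:start + per_word]):
            word |= (code & mask) << (slot * bits)
        words.append(word)
    return words


def unpack_codes(words, bits):
    """Inverse of pack_codes: recover the flat list of codes."""
    per_word = 32 // bits
    mask = (1 << bits) - 1
    return [(w >> (slot * bits)) & mask for w in words for slot in range(per_word)]


words = pack_codes([1, 2, 3, 4, 5, 6, 7, 8], bits=4)
# words == [0x87654321]
```

Under this assumed ordering, int4 storage holds eight codes per word and int8 holds four, so halving the bit width halves the code storage.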
- Parameters:
weight (array)
scales (array)
biases (array)
group_size (int)
bits (int)
in_channels (int)
out_channels (int)
kernel_size (tuple[int, int, int])
layout (Literal['linear', 'kernel_major', 'dense_5d'])
- weight: array
- scales: array
- biases: array
- group_size: int
- bits: int
- in_channels: int
- out_channels: int
- kernel_size: tuple[int, int, int]
- layout: Literal['linear', 'kernel_major', 'dense_5d']
- property storage_in_channels: int
- property is_pointwise: bool
- property nbytes: int
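The derived properties are not described on this page, so the following is only a guess at what two of them plausibly compute. `WeightMeta` is a hypothetical stand-in class. It assumes `storage_in_channels` is `in_channels` rounded up to a multiple of `group_size` and that `is_pointwise` means a kernel size of (1, 1, 1). `nbytes` is left out because the page does not say which arrays it counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightMeta:
    """Hypothetical metadata holder; not the library's QuantizedWeight."""

    group_size: int
    in_channels: int
    kernel_size: tuple

    @property
    def storage_in_channels(self):
        # Assumption: round in_channels up to a whole number of groups.
        groups = -(-self.in_channels // self.group_size)  # ceiling division
        return groups * self.group_size

    @property
    def is_pointwise(self):
        # Assumption: a pointwise weight has a 1x1x1 kernel.
        return self.kernel_size == (1, 1, 1)


meta = WeightMeta(group_size=32, in_channels=40, kernel_size=(3, 3, 3))
# meta.storage_in_channels == 64, meta.is_pointwise is False
```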
- mlx_lattice.core.quantized.dequantize_weight(weight)
Restore the logical floating-point weight represented by weight.
The returned array uses the original logical layout recorded by
weight.layout and slices away any padded storage channels.
- Return type:
array
- Parameters:
weight (QuantizedWeight)
- mlx_lattice.core.quantized.quantize_weight(weight, *, group_size=None, bits=4)
Pack a linear or sparse-convolution weight for inference.
- Parameters:
weight (array) – Floating-point weight in float16 or float32. Accepted shapes are
(C_out, C_in), (K, C_in, C_out), or
(C_out, Kx, Ky, Kz, C_in).
group_size (int | None) – Quantization group size. None chooses 64 for
C_in >= 64 and 32 otherwise.
bits (int) – Packed integer width, either 4 or 8.
- Return type:
QuantizedWeight
- Returns:
A QuantizedWeight containing packed storage and affine metadata.
Input channels are padded in storage to a multiple of the selected group
size when needed; logical in_channels remains the original channel count.
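The default group size, the storage padding, and the slicing on dequantization can be sketched end to end for a single row of input channels. Only the 64/32 default rule comes from the documentation above. The min/max affine scheme, the zero padding, and the helper names are assumptions for illustration and may differ from what quantize_weight does internally.

```python
def default_group_size(c_in):
    # Documented default: 64 when C_in >= 64, otherwise 32.
    return 64 if c_in >= 64 else 32


def quantize_row(row, bits=4, group_size=None):
    """Min/max affine quantization of one row (assumed scheme)."""
    group_size = group_size or default_group_size(len(row))
    levels = (1 << bits) - 1
    pad = (-len(row)) % group_size  # channels added to fill the last group
    padded = list(row) + [0.0] * pad
    codes, scales, biases = [], [], []
    for start in range(0, len(padded), group_size):
        group = padded[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / levels or 1.0  # avoid a zero scale for constant groups
        scales.append(scale)
        biases.append(lo)
        codes.extend(round((v - lo) / scale) for v in group)
    return codes, scales, biases, group_size


def dequantize_row(codes, scales, biases, group_size, c_in):
    """Apply scale * code + bias, then slice away the padded channels."""
    values = [scales[i // group_size] * q + biases[i // group_size]
              for i, q in enumerate(codes)]
    return values[:c_in]


row = [i / 10 for i in range(40)]  # C_in = 40, so the default group size is 32
codes, scales, biases, group_size = quantize_row(row, bits=4)
restored = dequantize_row(codes, scales, biases, group_size, c_in=len(row))
```

Here 40 logical channels are stored as 64 (two groups of 32) and the result is sliced back to 40. With the library itself, the corresponding round trip would be a call to quantize_weight followed by dequantize_weight on the returned QuantizedWeight.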