hub / github.com/tinygrad/tinygrad / QuantizeLinear

Function QuantizeLinear

tinygrad/nn/onnx.py:1206–1216
(x:Tensor, y_scale:Tensor, y_zero_point:Tensor|int=0, axis:int=1, block_size:int=0, output_dtype:int=0, saturate=1)
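
In the source listed on this page, the output dtype is resolved in a fixed order: a Tensor zero point wins, then a nonzero output_dtype, then the uint8 default. The sketch below illustrates only that precedence. The name pick_out_dtype, the string dtype names, and the two-entry ONNX_TO_NAME table are hypothetical stand-ins, not tinygrad APIs; the codes 2 and 3 are the ONNX TensorProto values for UINT8 and INT8.

```python
# Hypothetical two-entry subset of the ONNX TensorProto.DataType enum.
ONNX_TO_NAME = {2: "uint8", 3: "int8"}

def pick_out_dtype(zero_point_dtype=None, output_dtype=0):
    """Mirror the precedence of the if/elif/else chain in QuantizeLinear.

    zero_point_dtype: dtype name of y_zero_point when it is a tensor,
                      or None when it was left as the plain-int default.
    output_dtype:     ONNX data type code, 0 meaning "not set".
    """
    if zero_point_dtype is not None:   # a tensor zero point decides the dtype
        return zero_point_dtype
    if output_dtype != 0:              # otherwise an explicit ONNX code is used
        return ONNX_TO_NAME[output_dtype]
    return "uint8"                     # fallback default

print(pick_out_dtype("int8", 2))   # zero point wins over output_dtype
print(pick_out_dtype(None, 3))     # explicit ONNX code
print(pick_out_dtype())            # default
```

One consequence worth noting: when y_zero_point is a tensor, output_dtype is never consulted.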

Source from the content-addressed store, hash-verified

# ***** Quantization Ops *****
def QuantizeLinear(x:Tensor, y_scale:Tensor, y_zero_point:Tensor|int=0, axis:int=1, block_size:int=0, output_dtype:int=0, saturate=1):
  if isinstance(y_zero_point, Tensor): out_dtype = y_zero_point.dtype
  elif output_dtype != 0: out_dtype = OnnxDataType(output_dtype).to_dtype()
  else: out_dtype = dtypes.uint8
  y_scale, y_zero_point = _prepare_quantize(x, y_scale, y_zero_point, axis, block_size)
  if out_dtype == dtypes.uchar:
    # this appears to work in practice, at least for uchar out_dtype. it folds with the quantize stuff
    ret = _clamp_cast((x / y_scale + 0.4999999 + y_zero_point).int(), out_dtype)
  else:
    ret = _clamp_cast(((x / y_scale).round() + y_zero_point), out_dtype)
  return ret.contiguous()

def DynamicQuantizeLinear(x: Tensor):
  # only support uint8
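
The two branches of QuantizeLinear compute the same quantity in different ways: the general branch rounds x / y_scale and then adds the zero point, while the uchar branch adds the zero point plus 0.4999999 and truncates. The following scalar sketch in plain Python shows both, under stated assumptions: quantize_round and quantize_trunc are hypothetical names, the clamp to [lo, hi] stands in for what _clamp_cast is presumed to do, and Python's float64 arithmetic replaces whatever precision the tensors actually use. It is an illustration of the arithmetic, not the library's implementation.

```python
def quantize_round(x, scale, zero_point=0, lo=0, hi=255):
    """General branch: round to nearest (ties to even), add zero point, clamp."""
    q = round(x / scale) + zero_point      # Python's round() is half-to-even
    return max(lo, min(hi, q))             # saturate to the target integer range

def quantize_trunc(x, scale, zero_point=0, lo=0, hi=255):
    """uchar branch: add 0.4999999 and the zero point, then truncate toward zero."""
    q = int(x / scale + 0.4999999 + zero_point)  # int() truncates toward zero
    return max(lo, min(hi, q))

# Both branches agree on these non-tie inputs.
print(quantize_round(1.4, 0.5), quantize_trunc(1.4, 0.5))              # 3 3
print(quantize_round(-1.6, 1.0, 128), quantize_trunc(-1.6, 1.0, 128))  # 126 126
print(quantize_round(300.0, 1.0), quantize_trunc(300.0, 1.0))          # 255 255

# Signed 8-bit range, general branch only.
print(quantize_round(-200.0, 1.0, 0, -128, 127))                       # -128
```

Exact ties (for example x / y_scale = 3.5) are deliberately left out of the usage lines: their outcome in the truncating branch depends on floating-point precision, which this float64 sketch does not reproduce.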

Callers

nothing calls this directly

Calls 7

OnnxDataType · Class · 0.85
_prepare_quantize · Function · 0.85
_clamp_cast · Function · 0.85
to_dtype · Method · 0.80
int · Method · 0.80
round · Method · 0.80
contiguous · Method · 0.45

Tested by

no test coverage detected
