hub / github.com/tensorflow/tfjs / quantize_weights

Function quantize_weights

tfjs-converter/python/tensorflowjs/quantization.py:93–147

Quantizes the weights by linearly re-scaling across available bits. The weights are quantized by linearly re-scaling the values between the minimum and maximum value, and representing them with the number of bits provided by the `quantization_dtype`. In order to guarantee that 0 is perfectly represented by one of the quantized values, the range is "nudged" in the same manner as in TF-Lite.
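To make the linear re-scaling concrete, here is a minimal sketch of an 8-bit quantize/de-quantize round trip. It is written in plain Python rather than NumPy so that it runs without dependencies. The helper names `quantize_uint8` and `dequantize` are hypothetical, and the range nudging is deliberately left out, so this illustrates the idea rather than reproducing `quantize_weights`.

```python
def quantize_uint8(values):
    """Map floats onto the integers 0..255 by linear re-scaling (no nudging)."""
    quant_max = 255  # largest value an unsigned 8-bit integer can hold
    min_val, max_val = min(values), max(values)
    if min_val == max_val:
        # Only one distinct value: every element is stored as 0.
        return [0] * len(values), {'min': min_val, 'scale': 1.0}
    scale = (max_val - min_val) / quant_max
    quantized = [round((v - min_val) / scale) for v in values]
    return quantized, {'min': min_val, 'scale': scale}


def dequantize(quantized, metadata):
    """Recover approximate floats: multiply by `scale`, then add `min`."""
    return [q * metadata['scale'] + metadata['min'] for q in quantized]


weights = [-1.0, -0.25, 0.5, 1.0]
quantized, metadata = quantize_uint8(weights)
restored = dequantize(quantized, metadata)

# Largest absolute round-trip error across all weights.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

In this un-nudged sketch, the round-trip error of each value should stay within half a quantization step (`scale / 2`), which is the usual accuracy cost of storing a weight in 8 bits.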

(data, quantization_dtype)


def quantize_weights(data, quantization_dtype):
  """Quantizes the weights by linearly re-scaling across available bits.

  The weights are quantized by linearly re-scaling the values between the
  minimum and maximum value, and representing them with the number of bits
  provided by the `quantization_dtype`.

  In order to guarantee that 0 is perfectly represented by one of the quantized
  values, the range is "nudged" in the same manner as in TF-Lite.

  Weights can be de-quantized by multiplying by the returned `scale` and adding
  `min`.

  Args:
    data: A numpy array of dtype 'float32' or 'int32'.
    quantization_dtype: A numpy dtype to quantize weights to. Only np.float16,
      np.uint8, and np.uint16 are supported.

  Returns:
    quantized_data: The quantized weights as a numpy array with dtype
      `quantization_dtype`.
    metadata: A dictionary with the corresponding metadata for the quantization
      type. There is no metadata associated with float16.
      For affine quantization there are two associated metadata values:
        scale: The linearly scaling constant used for quantization.
        min_val: The minimum value of the linear range.
  Raises:
    ValueError: if `quantization_dtype` is not a valid type.
  """
  if quantization_dtype in [np.uint8, np.uint16]:
    # Compute the min and max for the group.
    min_val = data.min().astype(np.float64)
    max_val = data.max().astype(np.float64)
    if min_val == max_val:
      # If there is only a single value, we can represent everything as zeros.
      quantized_data = np.zeros_like(data, dtype=quantization_dtype)
      scale = 1.0
    else:
      # Quantize data.
      scale, min_val, max_val = _get_affine_quantization_range(
          min_val, max_val, quantization_dtype)
      quantized_data = np.round(
          (data.clip(min_val, max_val) - min_val) / scale).astype(
              quantization_dtype)

    return quantized_data, {'min': min_val, 'scale': scale}
  elif quantization_dtype == np.float16:
    if data.dtype != np.float32:
      raise ValueError(
          'Invalid data dtype %r\n'
          'float16 quantization only supports float32 dtype' % data.dtype)
    quantized_data = data.astype(np.float16)
    return quantized_data, {}
  else:
    raise ValueError('Invalid `quantization_dtype`: %r' % quantization_dtype)
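The body of `_get_affine_quantization_range` is not shown on this page, so the following is only a sketch of how a range could be "nudged" so that 0.0 lands exactly on an integer level. The function name `nudged_range` and its `quant_max` parameter are hypothetical, and the arithmetic is an assumption based on the docstring's description, not the library's confirmed implementation. It expects `min_val != max_val`, a case the source handles before calling its helper.

```python
def nudged_range(min_val, max_val, quant_max=255):
    """Shift [min_val, max_val] so that 0.0 maps onto an exact integer level.

    Illustrative assumption only; not the code of
    `_get_affine_quantization_range`.
    """
    scale = (max_val - min_val) / quant_max
    if min_val <= 0 <= max_val:
        # Integer level closest to where 0.0 falls in the original range.
        zero_point = round((0 - min_val) / scale)
        # Solve 0 = zero_point * scale + nudged_min for nudged_min.
        nudged_min = -zero_point * scale
        nudged_max = quant_max * scale + nudged_min
    else:
        # 0.0 lies outside the range, so there is nothing to nudge.
        nudged_min, nudged_max = min_val, max_val
    return scale, nudged_min, nudged_max


scale, lo, hi = nudged_range(-0.3, 1.0)

# Quantize 0.0 against the nudged range, then de-quantize it again.
zero_level = round((0.0 - lo) / scale)
zero_restored = zero_level * scale + lo
```

Under these assumptions the nudged range can be slightly narrower than the data on one side (here `hi` falls just below 1.0), which is one plausible reason the source clips `data` to `[min_val, max_val]` before rounding.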

Callers

nothing calls this directly

Calls 5

ValueError · Class · 0.85
min · Method · 0.80
max · Method · 0.80
round · Method · 0.80

Tested by

no test coverage detected
