hub / github.com/tensorflow/tfjs / quantize_weights

Function quantize_weights

tfjs-converter/python/tensorflowjs/quantization.py:93–147

Quantizes the weights by linearly re-scaling across available bits. The weights are quantized by linearly re-scaling the values between the minimum and maximum value, and representing them with the number of bits provided by the `quantization_dtype`. In order to guarantee that 0 is perfectly represented by one of the quantized values, the range is "nudged" in the same manner as in TF-Lite.

(data, quantization_dtype)
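The docstring states that weights can be de-quantized by multiplying by the returned `scale` and adding `min`. A minimal sketch of that inverse step follows, assuming the metadata dictionary carries the keys `'min'` and `'scale'` (as in the function's return statement) and is empty for float16. The helper name `dequantize_weights` and its `original_dtype` parameter are hypothetical and are not taken from this module.

```python
import numpy as np


def dequantize_weights(quantized_data, metadata, original_dtype=np.float32):
  """Hypothetical inverse of quantize_weights, for illustration only."""
  if not metadata:
    # float16 path: no metadata is returned, so a cast is all that is needed.
    return quantized_data.astype(original_dtype)
  # Affine path: value is approximately code * scale + min.
  restored = quantized_data.astype(np.float64) * metadata['scale']
  return (restored + metadata['min']).astype(original_dtype)


# Hand-built example: three uint8 codes with min = -1.0 and scale = 0.5.
codes = np.array([0, 2, 4], dtype=np.uint8)
restored = dequantize_weights(codes, {'min': -1.0, 'scale': 0.5})
print(restored)
```

The codes are widened to float64 before scaling so that the multiplication does not happen in the narrow integer type.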

Source

def quantize_weights(data, quantization_dtype):
  """Quantizes the weights by linearly re-scaling across available bits.

  The weights are quantized by linearly re-scaling the values between the
  minimum and maximum value, and representing them with the number of bits
  provided by the `quantization_dtype`.

  In order to guarantee that 0 is perfectly represented by one of the quantized
  values, the range is "nudged" in the same manner as in TF-Lite.

  Weights can be de-quantized by multiplying by the returned `scale` and adding
  `min`.

  Args:
    data: A numpy array of dtype 'float32' or 'int32'.
    quantization_dtype: A numpy dtype to quantize weights to. Only np.float16,
      np.uint8, and np.uint16 are supported.

  Returns:
    quantized_data: The quantized weights as a numpy array with dtype
      `quantization_dtype`.
    metadata: A dictionary with the corresponding metadata for the quantization
      type. There is no metadata associated with float16.
      For affine quantization there are two associated metadata values:
        scale: The linearly scaling constant used for quantization.
        min_val: The minimum value of the linear range.
  Raises:
    ValueError: if `quantization_dtype` is not a valid type.
  """
  if quantization_dtype in [np.uint8, np.uint16]:
    # Compute the min and max for the group.
    min_val = data.min().astype(np.float64)
    max_val = data.max().astype(np.float64)
    if min_val == max_val:
      # If there is only a single value, we can represent everything as zeros.
      quantized_data = np.zeros_like(data, dtype=quantization_dtype)
      scale = 1.0
    else:
      # Quantize data.
      scale, min_val, max_val = _get_affine_quantization_range(
          min_val, max_val, quantization_dtype)
      quantized_data = np.round(
          (data.clip(min_val, max_val) - min_val) / scale).astype(
              quantization_dtype)

    return quantized_data, {'min': min_val, 'scale': scale}
  elif quantization_dtype == np.float16:
    if data.dtype != np.float32:
      raise ValueError(
          'Invalid data dtype %r\n'
          'float16 quantization only supports float32 dtype' % data.dtype)
    quantized_data = data.astype(np.float16)
    return quantized_data, {}
  else:
    raise ValueError('Invalid `quantization_dtype`: %r' % quantization_dtype)
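The affine branch in the listing depends on `_get_affine_quantization_range`, whose body is not shown on this page. The sketch below walks through the same quantize and de-quantize arithmetic using a stand-in helper, `nudged_range`. That helper is hypothetical and only an assumption about what "nudging" the range so that 0 maps to an exact integer code might look like. It should not be read as the library's implementation.

```python
import numpy as np


def nudged_range(min_val, max_val, dtype):
  """Illustrative stand-in for the range helper; NOT the library's code."""
  quant_max = np.iinfo(dtype).max
  scale = (max_val - min_val) / quant_max
  if min_val <= 0 <= max_val:
    # Shift the range so that 0.0 lands exactly on an integer code.
    zero_point = np.round(-min_val / scale)
    min_val = -zero_point * scale
    max_val = min_val + quant_max * scale
  return scale, min_val, max_val


data = np.array([-0.2, 0.0, 0.25, 0.8], dtype=np.float32)
scale, lo, hi = nudged_range(float(data.min()), float(data.max()), np.uint8)

# Same expression as the affine branch in the listing.
codes = np.round((data.clip(lo, hi) - lo) / scale).astype(np.uint8)

# De-quantize: multiply by scale and add the (nudged) minimum.
restored = codes * scale + lo

print(codes)
print(restored)
```

Because the nudged minimum is an integer multiple of `scale`, the entry that was 0.0 in `data` comes back as exactly 0.0. The other entries are recovered to within about half a quantization step.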

Callers

nothing calls this directly

Calls 5

_get_affine_quantization_range · Function · 0.85

ValueError · Class · 0.85

min · Method · 0.80

max · Method · 0.80

round · Method · 0.80

Tested by

no test coverage detected
