Class ParameterBakingPass

ppq/quantization/optim/baking.py:11–47

ParameterBakingPass accelerates quantization simulation. By default, the quantizer bakes network parameters once all quantization procedures have finished. For a typical Convolution or Gemm layer with a non-empty bias tensor, ParameterBakingPass speeds up layer execution by 30%-50%.
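The idea of baking can be illustrated with a toy model. The following is a minimal sketch in plain Python, not PPQ's implementation: `ToyState`, `ToyParameter`, `fake_quantize`, `read`, and `bake` are hypothetical names invented for this illustration, and the symmetric 8-bit rounding scheme is an assumption. It shows why reading a baked parameter is cheaper: the quantization arithmetic runs once, at baking time, instead of on every read.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class ToyState(Enum):
    """Hypothetical stand-in for a quantization state flag."""
    ACTIVATED = auto()  # value must be fake-quantized on every read
    BAKED = auto()      # value already holds its quantized form


@dataclass
class ToyParameter:
    """Hypothetical parameter: raw values plus a per-tensor scale."""
    values: List[float]
    scale: float
    state: ToyState = ToyState.ACTIVATED


def fake_quantize(values, scale, qmin=-128, qmax=127):
    """Round to the integer grid, clamp, then map back to float."""
    return [max(qmin, min(qmax, round(v / scale))) * scale for v in values]


def read(param: ToyParameter) -> List[float]:
    """Return the values a simulator would feed to the layer."""
    if param.state is ToyState.BAKED:
        return param.values  # already quantized, no extra work
    return fake_quantize(param.values, param.scale)


def bake(param: ToyParameter) -> None:
    """Overwrite the parameter in place with its quantized values."""
    param.values = fake_quantize(param.values, param.scale)
    param.state = ToyState.BAKED


weight = ToyParameter(values=[0.26, -0.74, 100.0], scale=0.5)
before = read(weight)  # quantized on the fly
bake(weight)           # the original float values are lost from here on
after = read(weight)   # returned directly from storage
```

In this sketch `before` and `after` are equal, so baking changes the cost of a read and not its result. The price is that the original float values are overwritten.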

Source from the content-addressed store, hash-verified

class ParameterBakingPass(QuantizationOptimizationPass):
    """ParameterBakingPass accelerates quantization simulation.

    By default, the quantizer bakes network parameters once all quantization
    procedures have finished. For a typical Convolution or Gemm layer with a
    non-empty bias tensor, ParameterBakingPass speeds up layer execution by
    30%-50%.

    ParameterBakingPass rewrites layer parameters with their quantized
    versions, strictly following each layer's quantization configuration.
    Once quantization has finished, this pass sets the state of every
    parameter quantization configuration to QuantizationStates.BAKED.

    QuantizationStates.BAKED indicates that the corresponding tensor has been
    pre-quantized and can be used without further quantization; the executor
    uses a baked value directly during execution.

    ATTENTION: values are baked in place; that is, this pass overwrites all
        network parameters.
    ATTENTION: on platforms that use an int32 accumulator, a float32 bias
        tensor might lose precision during simulation. To keep the PPQ
        simulator consistent with hardware, it is highly recommended to call
        ParameterBakingPass before deployment. Baking limits bias precision
        to 23 bits (float32 has only 23 fraction bits).

    The constructor takes no arguments; parameters are quantized with
    PPQuantFunction.
    """
    def __init__(self) -> None:
        super().__init__(name='PPQ Parameter Baking Pass')
        self._quantize_function = PPQuantFunction

    @empty_ppq_cache
    def optimize(self, graph: BaseGraph, **kwargs) -> None:
        # Bake the parameters of every quantable operation in the graph.
        for _, operation in graph.operations.items():
            if not isinstance(operation, QuantableOperation):
                continue
            operation.baking_parameters(self._quantize_function)
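The second ATTENTION note in the docstring concerns float32 precision. The sketch below uses only the Python standard library and is independent of PPQ. `to_float32` is a hypothetical helper written for this illustration. It shows that a value an int32 accumulator can hold exactly may not survive storage in float32, whose significand has 23 stored fraction bits plus one implicit leading bit.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack('<f', struct.pack('<f', value))[0]


# Integers up to 2**24 fit exactly in float32's 24-bit significand.
exact = to_float32(float(2**24))

# 2**24 + 1 needs 25 significant bits, so it is rounded to a neighbour.
rounded = to_float32(float(2**24 + 1))

print(exact, rounded)  # prints "16777216.0 16777216.0"
```

Both values fit comfortably in an int32, yet the second one cannot be represented in float32. This is the kind of mismatch between a float32 bias and an int32 accumulator that the note warns about.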

Callers 6

yolo6_sample.py · File · 0.90

fp8_sample.py · File · 0.90

bert_sample.py · File · 0.90

fp8_sample.py · File · 0.90

yolo_5.py · File · 0.85

build_quant_pipeline · Method · 0.85

Calls

no outgoing calls

Tested by

no test coverage detected