hub / github.com/OpenPPL/ppq / ParameterBakingPass

Class ParameterBakingPass

ppq/quantization/optim/baking.py:11–47

ParameterBakingPass is a useful tool for accelerating quantization simulation. By default, the quantizer bakes network parameters once all quantization procedures have finished. For a typical Convolution or Gemm layer with a non-empty bias tensor, ParameterBakingPass speeds up layer execution by 30%-50%.


class ParameterBakingPass(QuantizationOptimizationPass):
    """ParameterBakingPass is a useful tool for accelerating quantization
    simulation. By default, the quantizer bakes network parameters once all
    quantization procedures have finished. For a typical Convolution or Gemm
    layer with a non-empty bias tensor, ParameterBakingPass speeds up layer
    execution by 30%-50%.

    ParameterBakingPass rewrites layer parameters with their quantized versions;
    the quantization procedure strictly follows each layer's quantization
    configuration. Once quantization has finished, this pass changes the state of
    every parameter quantization configuration to QuantizationStates.BAKED.

    The state QuantizationStates.BAKED indicates that the corresponding tensor has
    been pre-quantized and that its value can be used without further
    quantization; the executor uses a baked value directly during execution.

    ATTENTION: values are baked in place, that is, this pass rewrites all network parameters.
    ATTENTION: For platforms using an int32 accumulator, a float32 bias tensor might lose precision
        during the simulation. If you want the PPQ simulator to produce results consistent with hardware,
        it is highly recommended to call ParameterBakingPass before deployment; the baking procedure
        limits bias precision to 23 bits (float32 only has 23 fraction bits).

    Note: the constructor takes no arguments; parameters are quantized with PPQuantFunction.
    """
    def __init__(self) -> None:
        super().__init__(name='PPQ Parameter Baking Pass')
        self._quantize_function = PPQuantFunction

    @empty_ppq_cache
    def optimize(
        self,
        graph: BaseGraph,
        **kwargs
    ) -> None:
        # Bake the parameters of every quantable operation in the graph.
        for _, operation in graph.operations.items():
            if not isinstance(operation, QuantableOperation):
                continue
            operation.baking_parameters(self._quantize_function)
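
The baking idea described in the docstring can be illustrated with a minimal, self-contained sketch. It does not use PPQ: `State`, `ParamConfig`, `fake_quantize`, `bake`, and `read_param` are hypothetical stand-ins, and the symmetric int8 scheme is an assumption chosen only for illustration. The sketch shows a parameter being overwritten once with its quantized value and marked as baked, so that later reads skip quantization.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class State(Enum):
    # Hypothetical stand-in for PPQ's QuantizationStates.
    ACTIVATED = 1
    BAKED = 2


@dataclass
class ParamConfig:
    # Hypothetical, simplified quantization config for one parameter.
    scale: float
    quant_min: int = -128
    quant_max: int = 127
    state: State = State.ACTIVATED


def fake_quantize(values: List[float], cfg: ParamConfig) -> List[float]:
    """Quantize then dequantize: round to the integer grid, clamp, scale back."""
    out = []
    for v in values:
        q = round(v / cfg.scale)
        q = max(cfg.quant_min, min(cfg.quant_max, q))
        out.append(q * cfg.scale)
    return out


def bake(values: List[float], cfg: ParamConfig) -> None:
    """Overwrite values in place with their quantized version and mark BAKED."""
    if cfg.state is not State.ACTIVATED:
        return  # already baked (or not quantized): nothing to do
    values[:] = fake_quantize(values, cfg)
    cfg.state = State.BAKED


def read_param(values: List[float], cfg: ParamConfig) -> List[float]:
    """What a simulator would do each time it needs the parameter."""
    if cfg.state is State.BAKED:
        return values  # pre-quantized, reuse as-is
    return fake_quantize(values, cfg)


weights = [0.26, -0.51, 100.0]
cfg = ParamConfig(scale=0.5)
bake(weights, cfg)
print(weights, cfg.state)
```

Because the rewrite happens in place, the original float values are gone after `bake`; this mirrors the first ATTENTION note in the docstring.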

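The remark about float32 fraction bits can be checked with the standard library alone. With 23 fraction bits plus an implicit leading bit, float32 represents every integer up to 2^24 exactly, but not every integer above it, so a bias value that fits comfortably in an int32 accumulator may not survive a round trip through float32. The helper `to_float32` below is a hypothetical name; the snippet illustrates only the number format, not how PPQ handles bias.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]


exact = 2 ** 24          # 16777216, exactly representable in float32
inexact = 2 ** 24 + 1    # 16777217, fits in int32 but not in float32

print(to_float32(float(exact)) == exact)      # True
print(to_float32(float(inexact)) == inexact)  # False: rounded to 16777216.0
```
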
Callers (6)

yolo6_sample.py · File · 0.90
fp8_sample.py · File · 0.90
bert_sample.py · File · 0.90
fp8_sample.py · File · 0.90
yolo_5.py · File · 0.85
build_quant_pipeline · Method · 0.85

Calls

no outgoing calls

Tested by

no test coverage detected