class PassiveParameterQuantizePass(QuantizationOptimizationPass):
    """
    ## PPQ Passive Parameter Quantization Pass (General Passive Quantization Pass)

    Passive parameters are parameters that must share their scale and offset with
    other tensors. By default, this pass processes 4 types of passive parameter:

    Bias value in Gemm: its scale must equal input scale * weight scale.

    Bias value in Conv: its scale must equal input scale * weight scale.

    Clip min & Clip max in Clip: their scale must equal the input scale.

    Padding value: its scale must equal the input scale.

    ### Parameters:

    * process_clip(bool)

        Whether to process clip min, max.

        If not processed, clip min, max will have their state = QuantizationState.FP32

    * process_bias(bool)

        Whether to process bias.

        If not processed, bias will have its state = QuantizationState.ACTIVED

    * process_pad(bool)

        Whether to process the pad value.

        If not processed, the pad value will have its state = QuantizationState.SOI

    * clip_visiblity(QuantizationVisibility)

        Whether to export quant info of clip min, max.

    * pad_visiblity(QuantizationVisibility)

        Whether to export quant info of the pad value.

    ### Usage
    This pass is included in the PPQ Quantization Setting. You can enable it with:

        setting = QuantizationSettingFactory.default_setting()

        setting.parameter_setting.quantize_passive_parameter = True
        # Pass this setting to a PPQ quantization entry point, here quantize_torch_model.
        ir = quantize_torch_model(
            model=model, calib_dataloader=load_calibration_dataset(), setting=setting,
            platform=TargetPlatform.PPL_CUDA_INT8, calib_steps=8, input_shape=INPUT_SHAPE,
            collate_fn=collate_fn)
    """
    def __init__(self, process_clip: bool = True, process_bias: bool = True, process_pad: bool = True,
                 clip_visiblity: QuantizationVisibility = QuantizationVisibility.INTERNAL,
                 pad_visiblity: QuantizationVisibility = QuantizationVisibility.INTERNAL):
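The scale rules listed in the docstring can be illustrated with a small standalone sketch. It does not use PPQ: the helper names `passive_bias_scales` and `quantize_with_scale` are hypothetical, the quantization is assumed to be symmetric (offset 0), and the 32-bit bias range and 8-bit activation range are assumptions chosen for illustration rather than values taken from the pass.

```python
# Illustrative sketch only: these helpers are hypothetical and not part of PPQ.
from typing import List


def passive_bias_scales(input_scale: float, weight_scales: List[float]) -> List[float]:
    """Bias scale for Gemm/Conv: input scale * weight scale, one entry per weight scale."""
    return [input_scale * w for w in weight_scales]


def quantize_with_scale(values: List[float], scale: float, qmin: int, qmax: int) -> List[int]:
    """Symmetric linear quantization (offset assumed to be 0): round(value / scale), clamped."""
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


input_scale = 0.5
weight_scales = [0.25, 0.125]  # assumed per-channel weight scales

# Rule for bias: the scale is derived, not calibrated.
bias_scales = passive_bias_scales(input_scale, weight_scales)

# Each bias channel is quantized with its own derived scale (32-bit range assumed).
bias = [1.0, -0.5]
bias_q = [
    quantize_with_scale([b], s, -2**31, 2**31 - 1)[0]
    for b, s in zip(bias, bias_scales)
]

# Rule for Clip min/max and the pad value: reuse the input scale directly (8-bit range assumed).
clip_q = quantize_with_scale([0.0, 6.0], input_scale, -128, 127)
pad_q = quantize_with_scale([0.0], input_scale, -128, 127)

print(bias_scales, bias_q, clip_q, pad_q)
```

Because every passive scale is computed from scales that already exist on other tensors, no separate calibration step is needed for these parameters in this sketch.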