Quantize a native ONNX model: given the file path of an ONNX model, return a quantized PPQ.IR.BaseGraph. The arguments, raised errors, and return value are described in the docstring of the source listing.
(
onnx_import_file: str,
calib_dataloader: DataLoader,
calib_steps: int,
input_shape: List[int],
platform: TargetPlatform,
input_dtype: torch.dtype = torch.float,
inputs: List[Any] = None,
setting: QuantizationSetting = None,
collate_fn: Callable = None,
device: str = 'cuda',
verbose: int = 0,
do_quantize: bool = True,
)
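A call matching the signature above might look like the following minimal sketch. It assumes that `ppq` and `torch` are installed, that `quantize_onnx_model` is importable from `ppq.api` and `TargetPlatform` from `ppq`, that `TargetPlatform.TRT_INT8` exists in the installed version, and that a plain list of tensors is accepted as `calib_dataloader`. The helpers `build_quantize_kwargs` and `quantize_example`, the file name `model.onnx`, and the input shape are hypothetical and serve only as illustration.

```python
from typing import Any, Dict, List


def build_quantize_kwargs(onnx_path: str, calib_batches: List[Any],
                          input_shape: List[int], platform: Any,
                          device: str = "cuda") -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for quantize_onnx_model."""
    return {
        "onnx_import_file": onnx_path,
        "calib_dataloader": calib_batches,
        # Assumption: run one calibration step per available batch.
        "calib_steps": len(calib_batches),
        "input_shape": input_shape,
        "platform": platform,
        # Assumption: each batch is a tensor that only needs moving to the device.
        "collate_fn": lambda batch: batch.to(device),
        "device": device,
    }


def quantize_example(onnx_path: str = "model.onnx"):
    """Hypothetical end-to-end call; requires torch and ppq at call time."""
    # Imports are deferred so this file loads even without torch or ppq.
    import torch
    from ppq import TargetPlatform
    from ppq.api import quantize_onnx_model

    input_shape = [1, 3, 224, 224]  # assumed single-input image model
    # Random tensors stand in for real calibration data.
    calib_batches = [torch.rand(size=input_shape) for _ in range(32)]
    kwargs = build_quantize_kwargs(onnx_path, calib_batches, input_shape,
                                   TargetPlatform.TRT_INT8, device="cuda")
    return quantize_onnx_model(**kwargs)
```

Random tensors only exercise the code path; meaningful quantization parameters need calibration batches drawn from representative input data.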
@empty_ppq_cache
def quantize_onnx_model(
    onnx_import_file: str,
    calib_dataloader: DataLoader,
    calib_steps: int,
    input_shape: List[int],
    platform: TargetPlatform,
    input_dtype: torch.dtype = torch.float,
    inputs: List[Any] = None,
    setting: QuantizationSetting = None,
    collate_fn: Callable = None,
    device: str = 'cuda',
    verbose: int = 0,
    do_quantize: bool = True,
) -> BaseGraph:
    """Quantize a native ONNX model: take the file path of an ONNX model and
    return a quantized PPQ.IR.BaseGraph.

    Args:
        onnx_import_file (str): path of the ONNX model file to quantize.

        calib_dataloader (DataLoader): calibration data loader.

        calib_steps (int): number of calibration steps.

        collate_fn (Callable): batch collate function used to preprocess calibration data.

        input_shape (List[int]): shape of the model input, used to run jit.trace. For a
            model with dynamic shapes, any shape the model accepts will do. If the model
            has multiple inputs, pass them through the keyword argument inputs and set
            this to None.

        input_dtype (torch.dtype): torch data type of the model input. If the model has
            multiple inputs, pass them through the keyword argument inputs and set this
            to None.

        inputs (List[Any], optional): for a model with multiple inputs, the inputs given
            directly as a list of arrays, used to trace the model.

        setting (QuantizationSetting): quantization setting that configures the
            quantization parameters; the default setting is used when None.

        do_quantize (bool, optional): whether to quantize the model, defaults to True.

        platform (TargetPlatform): target backend platform; required, as the signature
            declares no default.

        device (str, optional): execution device for quantization, defaults to 'cuda'.

        verbose (int, optional): whether to print details, defaults to 0.

    Raises:
        ValueError: the given platform doesn't support quantization.
        KeyError: the given platform is not supported yet.

    Returns:
        BaseGraph: the quantized IR, containing all information needed for backend execution.
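The docstring's rule for `input_shape` and `inputs` (one input: give a shape; several inputs: give `inputs` and set `input_shape` to None) can be expressed as a small caller-side check. This is a sketch only: `choose_input_spec` is a hypothetical helper, and raising `ValueError` when both or neither argument is set is an assumption of this example, not documented behavior of `quantize_onnx_model`.

```python
from typing import Any, List, Optional, Tuple


def choose_input_spec(input_shape: Optional[List[int]],
                      inputs: Optional[List[Any]]) -> Tuple[str, Any]:
    """Hypothetical check: decide which argument describes the model input."""
    if inputs is not None:
        # Multiple inputs: the docstring asks for input_shape to be None.
        if input_shape is not None:
            raise ValueError("set input_shape to None when passing inputs")
        return ("inputs", list(inputs))
    if input_shape is None:
        raise ValueError("provide either input_shape or inputs")
    # Single input: the shape alone describes the dummy input.
    return ("input_shape", list(input_shape))
```

Running such a check before the call surfaces a mixed-up argument combination early, rather than during tracing.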