Quantize a PPQ model that is already in memory: takes an unquantized PPQ.IR.BaseGraph and returns the quantized PPQ.IR.BaseGraph. Args: model (BaseGraph): the PPQ graph to quantize; calib_dataloader (DataLoader): calibration data loader; calib_steps (int): number of calibration steps
(
model: BaseGraph,
calib_dataloader: DataLoader,
calib_steps: int,
input_shape: List[int],
platform: TargetPlatform,
input_dtype: torch.dtype = torch.float,
inputs: List[Any] = None,
setting: QuantizationSetting = None,
collate_fn: Callable = None,
device: str = 'cuda',
verbose: int = 0,
do_quantize: bool = True,
)
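As a rough illustration of how `calib_dataloader`, `calib_steps`, and `collate_fn` relate, the sketch below draws a bounded number of batches and preprocesses each one. It assumes calibration consumes at most `calib_steps` batches and applies `collate_fn` to every batch first. The helper `take_calibration_batches` and the toy loader are hypothetical and not part of PPQ; the real calibration loop may differ.

```python
from itertools import islice
from typing import Any, Callable, Iterable, List, Optional


def take_calibration_batches(
    calib_dataloader: Iterable[Any],
    calib_steps: int,
    collate_fn: Optional[Callable[[Any], Any]] = None,
) -> List[Any]:
    """Collect at most `calib_steps` batches, applying `collate_fn` to each.

    Hypothetical helper for illustration only; not a PPQ function.
    """
    if calib_steps <= 0:
        raise ValueError("calib_steps must be a positive integer")
    batches = []
    for batch in islice(calib_dataloader, calib_steps):
        # A collate function typically strips labels or reshapes the batch.
        batches.append(collate_fn(batch) if collate_fn is not None else batch)
    return batches


# Toy "dataloader": each item is an (input, label) pair.
toy_loader = [([0.1, 0.2], 0), ([0.3, 0.4], 1), ([0.5, 0.6], 0)]

# Keep only the input part of each pair and stop after two batches.
calib_batches = take_calibration_batches(
    toy_loader, calib_steps=2, collate_fn=lambda pair: pair[0]
)
print(calib_batches)  # [[0.1, 0.2], [0.3, 0.4]]
```

In practice a `collate_fn` of this shape is a common way to drop labels from a dataset that yields `(input, label)` pairs, so that only model inputs reach the calibration pass.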
@empty_ppq_cache
def quantize_native_model(
    model: BaseGraph,
    calib_dataloader: DataLoader,
    calib_steps: int,
    input_shape: List[int],
    platform: TargetPlatform,
    input_dtype: torch.dtype = torch.float,
    inputs: List[Any] = None,
    setting: QuantizationSetting = None,
    collate_fn: Callable = None,
    device: str = 'cuda',
    verbose: int = 0,
    do_quantize: bool = True,
) -> BaseGraph:
    """Quantize a PPQ model that is already in memory.

    Takes an unquantized PPQ.IR.BaseGraph and returns the quantized PPQ.IR.BaseGraph.

    Args:
        model (BaseGraph): the PPQ graph to quantize.

        calib_dataloader (DataLoader): calibration data loader.

        calib_steps (int): number of calibration steps.

        collate_fn (Callable): batch collate function for preprocessing calibration data.

        input_shape (List[int]): a list of ints giving the size of the model input, used to run
            jit.trace. For a model with dynamic input sizes, any size the model accepts will do.
            For a model with multiple inputs, pass them through the keyword argument inputs
            instead and set this to None.

        input_dtype (torch.dtype): the torch datatype of the model input. For a model with
            multiple inputs, pass them through the keyword argument inputs instead and set
            this to None.

        inputs (List[Any], optional): for a model with multiple inputs, the inputs themselves,
            given directly as a list of arrays and used to trace the model.

        setting (QuantizationSetting): quantization setting that configures the quantization
            parameters. The default setting is used when this is None.

        do_quantize (bool, optional): whether to quantize the model, defaults to True.

        platform (TargetPlatform): target backend platform.

        device (str, optional): execution device, defaults to 'cuda'.

        verbose (int, optional): whether to print details, defaults to 0.

    Raises:
        ValueError: the given platform doesn't support quantization.
        KeyError: the given platform is not supported yet.

    Returns:
        BaseGraph: the quantized IR, containing all information needed for backend execution.
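The docstring describes two mutually exclusive ways to describe the model input: `input_shape` (with `input_dtype`) for a single-input model, or an explicit `inputs` list for a multi-input model with `input_shape` set to None. The sketch below only illustrates that convention. The helper `describe_trace_inputs` is hypothetical, and the strict error on conflicting arguments is a choice made for this example, not confirmed PPQ behavior.

```python
from typing import Any, List, Optional


def describe_trace_inputs(
    input_shape: Optional[List[int]] = None,
    inputs: Optional[List[Any]] = None,
) -> str:
    """Report which input convention a call would follow.

    Hypothetical helper for illustration only; not a PPQ function.
    """
    if inputs is not None:
        # Multi-input models pass the inputs directly and leave input_shape as None.
        if input_shape is not None:
            raise ValueError("set input_shape to None when passing inputs")
        return f"multi-input: {len(inputs)} explicit inputs"
    if input_shape is None:
        raise ValueError("give either input_shape or inputs")
    # Single-input models only need a shape from which a dummy input can be built.
    return f"single-input: dummy tensor of shape {input_shape}"


print(describe_trace_inputs(input_shape=[1, 3, 224, 224]))
# single-input: dummy tensor of shape [1, 3, 224, 224]
print(describe_trace_inputs(inputs=["image", "mask"]))
# multi-input: 2 explicit inputs
```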