hub / github.com/OpenPPL/ppq / quantize_onnx_model

Function quantize_onnx_model

ppq/api/interface.py:185–276

Quantizes a native ONNX model: takes the path to an ONNX model file and returns a quantized PPQ.IR.BaseGraph.

(
    onnx_import_file: str,
    calib_dataloader: DataLoader,
    calib_steps: int,
    input_shape: List[int],
    platform: TargetPlatform,
    input_dtype: torch.dtype = torch.float,
    inputs: List[Any] = None,
    setting: QuantizationSetting = None,
    collate_fn: Callable = None,
    device: str = 'cuda',
    verbose: int = 0,
    do_quantize: bool = True,
)
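The signature has five required parameters (onnx_import_file, calib_dataloader, calib_steps, input_shape, platform), and every other parameter has a default. The sketch below shows one way to collect those arguments before a call. `build_quantize_kwargs` is a hypothetical helper, not part of PPQ, and the file name, platform placeholder and calibration data are illustrative assumptions. Its "exactly one of input_shape or inputs" check follows the docstring's advice for multi-input models and is stricter than the signature itself.

```python
from typing import Any, Dict, Iterable, List, Optional


def build_quantize_kwargs(
    onnx_import_file: str,
    calib_dataloader: Iterable[Any],
    calib_steps: int,
    platform: Any,
    input_shape: Optional[List[int]] = None,
    inputs: Optional[List[Any]] = None,
    device: str = "cuda",
) -> Dict[str, Any]:
    """Hypothetical helper (not part of PPQ): collect arguments for quantize_onnx_model."""
    # Assumption of this sketch: describe the model input either by shape
    # (single input) or by an explicit list of inputs (multiple inputs).
    if (input_shape is None) == (inputs is None):
        raise ValueError("pass exactly one of input_shape or inputs")
    return {
        "onnx_import_file": onnx_import_file,
        "calib_dataloader": calib_dataloader,
        "calib_steps": calib_steps,
        "input_shape": input_shape,
        "platform": platform,
        "inputs": inputs,
        "device": device,
    }


# Placeholder values only: a real call needs an existing ONNX file, a
# TargetPlatform member and real calibration batches.
kwargs = build_quantize_kwargs(
    onnx_import_file="model.onnx",
    calib_dataloader=[[0.0] * 4 for _ in range(8)],
    calib_steps=8,
    platform="<a TargetPlatform member>",
    input_shape=[1, 4],
    device="cpu",
)
print(sorted(kwargs))
```

With PPQ installed, the dictionary would be passed on as quantize_onnx_model(**kwargs); parameters left out of it (input_dtype, setting, collate_fn, verbose, do_quantize) keep the defaults shown in the signature.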

Source from the content-addressed store, hash-verified

@empty_ppq_cache
def quantize_onnx_model(
    onnx_import_file: str,
    calib_dataloader: DataLoader,
    calib_steps: int,
    input_shape: List[int],
    platform: TargetPlatform,
    input_dtype: torch.dtype = torch.float,
    inputs: List[Any] = None,
    setting: QuantizationSetting = None,
    collate_fn: Callable = None,
    device: str = 'cuda',
    verbose: int = 0,
    do_quantize: bool = True,
) -> BaseGraph:
    """Quantize a native ONNX model: take the path to an ONNX model file and
    return a quantized PPQ.IR.BaseGraph.

    Args:
        onnx_import_file (str): path to the ONNX model file to quantize.

        calib_dataloader (DataLoader): calibration data loader.

        calib_steps (int): number of calibration steps.

        collate_fn (Callable): batch collate function used to preprocess the
            calibration data.

        input_shape (List[int]): shape of the model input as a list of ints,
            used for tracing (jit.trace). For a model with dynamic shapes, any
            shape the model accepts will do. For a model with multiple inputs,
            pass them through the keyword argument inputs and set this to None.

        input_dtype (torch.dtype): torch data type of the model input. For a
            model with multiple inputs, pass them through the keyword argument
            inputs and set this to None.

        inputs (List[Any], optional): for a model with multiple inputs, the
            inputs themselves, given directly as a list of arrays so that the
            model can be traced.

        setting (QuantizationSetting): quantization setting that configures the
            quantization parameters; the default setting is loaded when this
            is None.

        do_quantize (bool, optional): whether to quantize the model,
            defaults to True.

        platform (TargetPlatform): target backend platform. Required: the
            signature gives it no default.

        device (str, optional): execution device for the quantization process,
            defaults to 'cuda'.

        verbose (int, optional): whether to print details, defaults to 0.

    Raises:
        ValueError: the given platform doesn't support quantization
        KeyError: the given platform is not supported yet

    Returns:
        BaseGraph: the quantized IR, containing all the information the
            backend needs
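The Raises section documents two failure modes tied to the platform argument. As a rough illustration of how a caller might tell them apart, the sketch below wraps an arbitrary quantization callable. `try_quantize` is a hypothetical wrapper, not part of PPQ, and the stub functions only stand in for quantize_onnx_model so the snippet runs without PPQ. A ValueError or KeyError could in principle also come from elsewhere in the call, so the messages are a best guess rather than a diagnosis.

```python
from typing import Any, Callable, Optional, Tuple


def try_quantize(
    quantize_fn: Callable[..., Any], **kwargs: Any
) -> Tuple[Optional[Any], Optional[str]]:
    """Hypothetical wrapper: call quantize_fn and report the documented errors."""
    try:
        return quantize_fn(**kwargs), None
    except ValueError as err:
        # Documented as: the given platform doesn't support quantization.
        return None, f"platform cannot be quantized: {err}"
    except KeyError as err:
        # Documented as: the given platform is not supported yet.
        return None, f"platform not supported: {err}"


# Stubs standing in for quantize_onnx_model (assumptions for illustration).
def stub_ok(**kwargs: Any) -> str:
    return "quantized-graph"


def stub_not_quantizable(**kwargs: Any) -> str:
    raise ValueError("non-quantizable platform")


def stub_unsupported(**kwargs: Any) -> str:
    raise KeyError("unknown platform")


for fn in (stub_ok, stub_not_quantizable, stub_unsupported):
    graph, message = try_quantize(fn, platform="placeholder")
    print(fn.__name__, graph, message)
```

Passing the callable in as an argument keeps the wrapper independent of how PPQ is imported; in real use quantize_fn would be quantize_onnx_model itself.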

Callers 10

quantize_onnx_model.py · File · 0.90

quantize.py · File · 0.90

yolo_x.py · File · 0.85

02_Quantization.py · File · 0.85

Example_Benchmark.py · File · 0.85

Example_PTQ.py · File · 0.85

Benchmark_with_onnx.py · File · 0.85

quantize_torch_model · Function · 0.85

quantize · Function · 0.85

Calls 7

tracing_operation_meta · Method · 0.95

TorchExecutor · Class · 0.90

load_onnx_graph · Function · 0.85

dispatch_graph · Function · 0.85

default_setting · Method · 0.80

quantize · Method · 0.80

report · Method · 0.45

Tested by

no test coverage detected