Method optimize

ppq/samples/Tutorial/optimization.py:52–79

(self, graph: BaseGraph, dataloader: Iterable,
 collate_fn: Callable, executor: TorchExecutor, **kwargs)

from ppq import BaseGraph, QuantizationOptimizationPass, TorchExecutor
class MyOptim(QuantizationOptimizationPass):
    def optimize(self, graph: BaseGraph, dataloader: Iterable,
                 collate_fn: Callable, executor: TorchExecutor, **kwargs) -> None:
        # graph.operations is a dict containing every op in the graph
        for name, op in graph.operations.items():

            # Find every quantized convolution operator in the graph.
            # Not every operator in your network ends up quantized: that is constrained by both the dispatching policy and the Quantizer policy,
            # so we use isinstance(op, QuantableOperation) to check whether an op is quantized.
            if op.type == 'Conv' and isinstance(op, QuantableOperation):

                # A convolution may have 2-3 inputs: the second is the weight and the third is the bias.
                # Here we modify the scale in the weight's quantization config.
                op.input_quant_config[1].scale *= 2
                print(f'Input scale of Op {name} has been enlarged.')

            # Next we dequantize Gemm. The mobilenet_v2 network has only one Gemm layer,
            # so we dequantize every Gemm layer we encounter.
            if op.type == 'Gemm' and isinstance(op, QuantableOperation):

                # config_with_variable returns all quantization configs of the op, covering both inputs and outputs
                for cfg, _ in op.config_with_variable:

                    # PPQ offers many ways to switch an op's quantization state.
                    # Setting the state directly to FP32 removes quantization from the op.
                    cfg.state = QuantizationStates.FP32

                # Alternatively, call the op's dequantize method directly.
                # op.dequantize()

# ------------------------------------------------------------
# If you use the ENABLE_CUDA_KERNEL method
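
To make the effect of `scale *= 2` in the listing above more concrete, here is a small PPQ-free sketch of what doubling a quantization scale does to a value. It assumes a symmetric, per-tensor, signed 8-bit linear scheme with round-to-nearest. Those choices, the helper name `fake_quantize`, and the sample numbers are illustrative assumptions, not taken from PPQ, whose actual behavior depends on each config's policy, bit width, and rounding mode.

```python
def fake_quantize(x: float, scale: float, qmin: int = -128, qmax: int = 127) -> float:
    """Quantize then dequantize one value (hypothetical helper, not a PPQ API).

    Assumes symmetric linear quantization: q = clamp(round(x / scale)), x_hat = q * scale.
    """
    q = round(x / scale)           # map the float onto the integer grid
    q = max(qmin, min(qmax, q))    # clamp to the representable integer range
    return q * scale               # map back to a float


scale = 0.01                       # illustrative starting scale
samples = [0.006, 0.5, 1.9]        # small, mid-range, and out-of-range values

for x in samples:
    before = fake_quantize(x, scale)
    after = fake_quantize(x, scale * 2)   # same change as `scale *= 2`
    print(f'x={x}: error before={abs(x - before):.4f}, after={abs(x - after):.4f}')
```

Under these assumptions, doubling the scale doubles the representable range, so the large value is no longer clipped, but it also doubles the step size, so the small value is represented more coarsely. That trade-off is why a pass like the one above changes accuracy rather than being a neutral edit.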

Callers 1

optimization.py · File · 0.45

Calls

no outgoing calls

Tested by

no test coverage detected