hub / github.com/deepspeedai/DeepSpeed / quantize

Method quantize

deepspeed/runtime/quantize.py:51–73
(self, parameter_group, overflow, eigenvalue_enabled, block_eigenvalue={})

Source from the content-addressed store, hash-verified

49         return result
50
51     def quantize(self, parameter_group, overflow, eigenvalue_enabled, block_eigenvalue={}):
52
53         if overflow and not eigenvalue_enabled:
54             return
55
56         self.step()
57
58         self.update_fp16_ratio()
59
60         for i in range(len(parameter_group)):
61             for p in parameter_group[i]:
62                 if len(p.size()) > 1 and hasattr(p, "start_bits") and p.start_bits:
63                     param_id = id(p)
64                     if block_eigenvalue is None:
65                         eigenvalue, layer_id = None, 0
66                     else:
67                         eigenvalue, layer_id = block_eigenvalue[param_id] if param_id in block_eigenvalue else (None,
68                                                                                                                 0)
69                     if eigenvalue is not None:
70                         factor = 1 + math.floor(eigenvalue * 4)
71                         p.data = self.compute_quantization(p.data, layer_id, factor)
72                     else:
73                         p.data = self.compute_quantization(p, layer_id)
74
75     def step(self):
76         self.qsteps += 1
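
Lines 64 to 70 of the listing do two small things that are easy to misread inline: they look up an (eigenvalue, layer_id) pair keyed by id(p), falling back to (None, 0), and they turn the eigenvalue into an integer factor. The sketch below isolates both steps in plain Python. The helper names eigenvalue_factor and lookup_block_eigenvalue are hypothetical and do not exist in DeepSpeed; only the expressions inside them are taken from the listing.

```python
import math


def eigenvalue_factor(eigenvalue):
    """Same expression as line 70 of the listing: 1 + floor(eigenvalue * 4)."""
    return 1 + math.floor(eigenvalue * 4)


def lookup_block_eigenvalue(param_id, block_eigenvalue):
    """Same fallback behaviour as lines 64-68 of the listing.

    Returns (eigenvalue, layer_id); (None, 0) when the mapping is None
    or has no entry for this parameter id.
    """
    if block_eigenvalue is None:
        return None, 0
    return block_eigenvalue.get(param_id, (None, 0))


# Sample inputs only; the range of real eigenvalues is not stated in the listing.
for value in (0.0, 0.25, 0.5, 1.0):
    print(value, "->", eigenvalue_factor(value))
```

Because the lookup key is id(p), the mapping has to be built from the same parameter objects that are later passed in parameter_group; an equal but distinct object would miss and take the (None, 0) path.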

Callers 5

quantization_test_helper · Function · 0.95
run_quantize_ds · Function · 0.45
_ensure_quantized · Method · 0.45
_forward_prologue · Method · 0.45
_take_model_step · Method · 0.45

Calls 4

step · Method · 0.95
update_fp16_ratio · Method · 0.95
compute_quantization · Method · 0.95
size · Method · 0.45
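
The order and conditions under which these four calls happen can be traced with a small stand-in, without installing DeepSpeed or PyTorch. This is a sketch under stated assumptions: FakeParam and TracingQuantizer are hypothetical classes written for illustration, compute_quantization here only records its arguments (its factor default of None is an assumption made for the stand-in), and only the body of quantize follows the control flow of the source listing.

```python
import math


class FakeParam:
    """Hypothetical stand-in for a tensor parameter (not a torch object)."""

    def __init__(self, shape, start_bits=None):
        self._shape = tuple(shape)
        if start_bits is not None:
            self.start_bits = start_bits
        self.data = "full-precision"

    def size(self):
        return self._shape


class TracingQuantizer:
    """Hypothetical stand-in that records calls instead of quantizing."""

    def __init__(self):
        self.qsteps = 0
        self.calls = []

    def step(self):
        self.qsteps += 1
        self.calls.append("step")

    def update_fp16_ratio(self):
        self.calls.append("update_fp16_ratio")

    def compute_quantization(self, tensor, layer_id, factor=None):
        # Records arguments only; the real method returns a quantized tensor.
        self.calls.append(("compute_quantization", layer_id, factor))
        return "quantized"

    def quantize(self, parameter_group, overflow, eigenvalue_enabled, block_eigenvalue={}):
        # Control flow copied from the source listing.
        if overflow and not eigenvalue_enabled:
            return
        self.step()
        self.update_fp16_ratio()
        for i in range(len(parameter_group)):
            for p in parameter_group[i]:
                if len(p.size()) > 1 and hasattr(p, "start_bits") and p.start_bits:
                    param_id = id(p)
                    if block_eigenvalue is None:
                        eigenvalue, layer_id = None, 0
                    else:
                        eigenvalue, layer_id = block_eigenvalue.get(param_id, (None, 0))
                    if eigenvalue is not None:
                        factor = 1 + math.floor(eigenvalue * 4)
                        p.data = self.compute_quantization(p.data, layer_id, factor)
                    else:
                        p.data = self.compute_quantization(p, layer_id)


weight = FakeParam((4, 4), start_bits=8)  # 2-D with start_bits: eligible
bias = FakeParam((4,), start_bits=8)      # 1-D: skipped by the size check
plain = FakeParam((4, 4))                 # no start_bits attribute: skipped

q = TracingQuantizer()

# Overflow without eigenvalue support returns before step() is reached.
q.quantize([[weight, bias, plain]], overflow=True, eigenvalue_enabled=False)
calls_after_overflow = list(q.calls)

# Normal step: only `weight` reaches compute_quantization.
q.quantize([[weight, bias, plain]], overflow=False, eigenvalue_enabled=False,
           block_eigenvalue={id(weight): (0.5, 3)})
print(q.calls)
```

In the trace, step and update_fp16_ratio run once per non-skipped call, while compute_quantization runs once per eligible parameter. The size entry in the list above corresponds to the p.size() check that filters out one-dimensional parameters such as biases.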

Tested by 2

quantization_test_helper · Function · 0.76
run_quantize_ds · Function · 0.36