hub / github.com/pytorch/examples / save

Method save

distributed/FSDP2/checkpoint.py:199–209
(self, model: FSDPModule, optim: torch.optim.Optimizer)

Source from the content-addressed store, hash-verified

        return {}

    def save(self, model: FSDPModule, optim: torch.optim.Optimizer):
        model_state_dict = self._get_full_model_state_dict(model)
        optim_state_dict = self._get_full_optimizer_state_dict(model, optim)
        if torch.distributed.get_rank() == 0:
            new_training_time = int(time.time() * 1000)
            new_checkpoint_folder = f"{self.folder}/{'dcp_api' if self.dcp_api else 'dtensor_api'}/{new_training_time}"
            new_model_checkpoint = f"{new_checkpoint_folder}/{MODEL_CHECKPOINT}"
            new_optim_checkpoint = f"{new_checkpoint_folder}/{OPTIM_CHECKPOINT}"
            os.makedirs(new_checkpoint_folder, exist_ok=True)
            torch.save(model_state_dict, new_model_checkpoint)
            torch.save(optim_state_dict, new_optim_checkpoint)
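
The directory layout that save produces can be sketched without torch. The following is a minimal, standard-library-only illustration, not the repository's code: the file names assigned to MODEL_CHECKPOINT and OPTIM_CHECKPOINT are assumed placeholders (the real constants are defined elsewhere in checkpoint.py and are not shown here), and both checkpoint_paths and latest_checkpoint_folder are hypothetical helpers introduced only for this example. The second helper shows one plausible way to find the newest checkpoint, relying on the millisecond timestamp used as the folder name.

```python
import os
import tempfile
import time

# Assumed placeholder values: the real constants live elsewhere in
# checkpoint.py and are not visible in the excerpt above.
MODEL_CHECKPOINT = "model_state_dict.pt"
OPTIM_CHECKPOINT = "optim_state_dict.pt"


def checkpoint_paths(folder, dcp_api, now_ms=None):
    """Hypothetical helper: build the paths save() writes to.

    Returns (checkpoint_folder, model_path, optim_path).
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # millisecond timestamp, as in save()
    api_dir = "dcp_api" if dcp_api else "dtensor_api"
    ckpt_folder = f"{folder}/{api_dir}/{now_ms}"
    return (
        ckpt_folder,
        f"{ckpt_folder}/{MODEL_CHECKPOINT}",
        f"{ckpt_folder}/{OPTIM_CHECKPOINT}",
    )


def latest_checkpoint_folder(folder, dcp_api):
    """Hypothetical helper: return the subfolder with the largest
    numeric timestamp name, or None if no checkpoint exists yet."""
    api_root = os.path.join(folder, "dcp_api" if dcp_api else "dtensor_api")
    if not os.path.isdir(api_root):
        return None
    stamps = [name for name in os.listdir(api_root) if name.isdigit()]
    if not stamps:
        return None
    # Compare as integers so that "100" sorts after "20".
    return os.path.join(api_root, max(stamps, key=int))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        ckpt_folder, model_path, optim_path = checkpoint_paths(root, dcp_api=False)
        os.makedirs(ckpt_folder, exist_ok=True)
        print(model_path)
        print(optim_path)
        print(latest_checkpoint_folder(root, dcp_api=False))
```

In the original method only rank 0 creates the folder and writes the two files, while every rank calls the two _get_full_* helpers before the rank check. The sketch leaves that distributed part out on purpose and covers only the path construction.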

Callers 15

main · Function · 0.95
main.py · File · 0.80
main · Function · 0.80
save_checkpoint · Function · 0.80
train.py · File · 0.80
convert.py · File · 0.80
save_model_checkpoint · Function · 0.80
upload_to_s3 · Function · 0.80
_save_snapshot · Method · 0.80
_save_checkpoint · Method · 0.80
_save_checkpoint · Method · 0.80

Tested by

no test coverage detected