hub / github.com/deepspeedai/DeepSpeedExamples / load_training_checkpoint

Function load_training_checkpoint

bing_bert/deepspeed_train.py:59–69

Utility function for loading the model and optimizer state saved in a checkpoint, so that training can resume from the point at which the checkpoint was taken. It returns the saved epoch, last_global_step, and last_global_data_samples.

(args, model, PATH, ckpt_id)
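A minimal sketch of how the function might be called when resuming a run. `StubEngine`, the checkpoint directory, the tag, and the numeric values are all hypothetical stand-ins. The sketch assumes only that `model.network.load_checkpoint(PATH, ckpt_id)` returns a pair whose second element is the saved state dictionary, as the function body implies.

```python
import logging
from types import SimpleNamespace


def load_training_checkpoint(args, model, PATH, ckpt_id):
    # Body reproduced from bing_bert/deepspeed_train.py so the sketch runs on its own.
    logger = args.logger
    _, checkpoint_state_dict = model.network.load_checkpoint(PATH, ckpt_id)
    epoch = checkpoint_state_dict['epoch']
    last_global_step = checkpoint_state_dict['last_global_step']
    last_global_data_samples = checkpoint_state_dict['last_global_data_samples']
    del checkpoint_state_dict
    return (epoch, last_global_step, last_global_data_samples)


class StubEngine:
    """Hypothetical stand-in for model.network; not the real training engine."""

    def load_checkpoint(self, load_dir, tag):
        # Assumed contract: return (load_path, saved_state_dict).
        saved_state = {
            'epoch': 3,
            'last_global_step': 1200,
            'last_global_data_samples': 76800,
        }
        return f"{load_dir}/{tag}", saved_state


# args only needs a .logger attribute; model only needs a .network attribute.
args = SimpleNamespace(logger=logging.getLogger("resume_demo"))
model = SimpleNamespace(network=StubEngine())

epoch, last_global_step, last_global_data_samples = load_training_checkpoint(
    args, model, "/tmp/ckpts", "epoch3_step1200"  # hypothetical path and tag
)
args.logger.info(
    "Resuming at epoch %d, step %d, %d samples seen",
    epoch, last_global_step, last_global_data_samples,
)
```

The caller is responsible for feeding the three returned counters back into its training loop; the function itself only reads them from the saved state.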

Source from the content-addressed store, hash-verified

def load_training_checkpoint(args, model, PATH, ckpt_id):
    """Utility function for checkpointing model + optimizer dictionaries
    The main purpose for this is to be able to resume training from that instant again
    """
    logger = args.logger
    _, checkpoint_state_dict = model.network.load_checkpoint(PATH, ckpt_id)
    epoch = checkpoint_state_dict['epoch']
    last_global_step = checkpoint_state_dict['last_global_step']
    last_global_data_samples = checkpoint_state_dict['last_global_data_samples']
    del checkpoint_state_dict
    return (epoch, last_global_step, last_global_data_samples)

def get_effective_batch(args, total):
    if args.local_rank != -1:

Callers 1

load_checkpoint · Function · 0.85

Calls

no outgoing calls

Tested by

no test coverage detected
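Since no tests are detected, one way coverage could be added is sketched below using `unittest.mock`. This is an illustrative assumption, not a test from the repository: the test class name, the mocked return values, and the choice of cases are hypothetical. The function body is copied in so the sketch runs without the rest of `deepspeed_train.py`.

```python
import unittest
from types import SimpleNamespace
from unittest.mock import MagicMock


def load_training_checkpoint(args, model, PATH, ckpt_id):
    # Body reproduced from bing_bert/deepspeed_train.py.
    logger = args.logger
    _, checkpoint_state_dict = model.network.load_checkpoint(PATH, ckpt_id)
    epoch = checkpoint_state_dict['epoch']
    last_global_step = checkpoint_state_dict['last_global_step']
    last_global_data_samples = checkpoint_state_dict['last_global_data_samples']
    del checkpoint_state_dict
    return (epoch, last_global_step, last_global_data_samples)


class LoadTrainingCheckpointTest(unittest.TestCase):
    """Hypothetical test case; names and values are illustrative only."""

    def _make_model(self, state):
        # Mock the engine so no real checkpoint files are needed.
        network = MagicMock()
        network.load_checkpoint.return_value = ("unused/path", state)
        return SimpleNamespace(network=network)

    def setUp(self):
        self.args = SimpleNamespace(logger=MagicMock())

    def test_returns_progress_and_forwards_arguments(self):
        state = {
            'epoch': 2,
            'last_global_step': 500,
            'last_global_data_samples': 32000,
        }
        model = self._make_model(state)
        result = load_training_checkpoint(self.args, model, "ckpt_dir", "tag_1")
        self.assertEqual(result, (2, 500, 32000))
        # PATH and ckpt_id should be passed through unchanged.
        model.network.load_checkpoint.assert_called_once_with("ckpt_dir", "tag_1")

    def test_missing_key_raises_key_error(self):
        # The function indexes the state dict directly, so an incomplete
        # state surfaces as a KeyError rather than a default value.
        model = self._make_model({'epoch': 2})
        with self.assertRaises(KeyError):
            load_training_checkpoint(self.args, model, "ckpt_dir", "tag_1")
```

Saved to a file, the sketch can be run with `python -m unittest <file>`. The second case documents the current behavior on an incomplete state dictionary; whether that behavior is desirable is a separate design question.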