hub / github.com/deepspeedai/DeepSpeedExamples / forward_step

Function forward_step

Megatron-LM/pretrain_gpt2.py:275–291

Forward step.

(data_iterator, model, args, timers)

Source from the content-addressed store, hash-verified

def forward_step(data_iterator, model, args, timers):
    """Forward step."""

    # Get the batch.
    timers('batch generator').start()
    tokens, labels, loss_mask, attention_mask, position_ids = get_batch(
        data_iterator, args, timers)
    timers('batch generator').stop()

    # Forward model.
    output = model(tokens, position_ids, attention_mask)
    losses = mpu.vocab_parallel_cross_entropy(output.contiguous().float(),
                                              labels)
    loss_mask = loss_mask.view(-1)
    loss = torch.sum(losses.view(-1) * loss_mask) / loss_mask.sum()

    return loss
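
The last two statements of forward_step reduce per-token losses to a single scalar by averaging only over positions where loss_mask is nonzero. The sketch below illustrates that reduction in plain Python on one process. It is not the repository's code: the helper names token_cross_entropy and masked_mean_loss are hypothetical, and the model-parallel handling done by mpu.vocab_parallel_cross_entropy is deliberately left out.

```python
import math


def token_cross_entropy(logits, label):
    """Cross-entropy for one token: log-sum-exp of the logits minus the label's logit.

    Hypothetical single-process stand-in; it does not split the vocabulary
    across ranks the way a vocab-parallel implementation would.
    """
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def masked_mean_loss(losses, loss_mask):
    """Average the per-token losses over positions whose mask entry is nonzero.

    Mirrors sum(losses * loss_mask) / sum(loss_mask) on flat Python lists.
    Like that expression, it fails if every mask entry is zero.
    """
    total = sum(loss * mask for loss, mask in zip(losses, loss_mask))
    return total / sum(loss_mask)


# Usage: three tokens over a two-word vocabulary; the last position is masked out.
logits_per_token = [[0.0, 0.0], [2.0, 0.0], [0.0, 5.0]]
labels = [1, 0, 0]
loss_mask = [1, 1, 0]

losses = [token_cross_entropy(l, y) for l, y in zip(logits_per_token, labels)]
loss = masked_mean_loss(losses, loss_mask)
```

Because the mask entry for the third token is 0, its large loss does not contribute to the average, which is presumably why padded or otherwise ignored positions are masked before the division.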

Callers 2

train_step · Function · 0.70
evaluate · Function · 0.70

Calls 3

get_batch · Function · 0.70
start · Method · 0.45
stop · Method · 0.45
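
The start and stop calls listed above come from the timers('batch generator').start() / .stop() pattern in forward_step, where timers is called with a name and returns an object exposing start and stop. The repository's own timer class is not shown on this page, so the following is only a minimal sketch of one way such an object could work. The Timer and Timers classes and the elapsed attribute are assumptions made for illustration.

```python
import time


class Timer:
    """Hypothetical named stopwatch that accumulates elapsed wall-clock time."""

    def __init__(self, name):
        self.name = name
        self.elapsed = 0.0
        self._started_at = None

    def start(self):
        assert self._started_at is None, "timer is already running"
        self._started_at = time.perf_counter()

    def stop(self):
        assert self._started_at is not None, "timer is not running"
        self.elapsed += time.perf_counter() - self._started_at
        self._started_at = None


class Timers:
    """Hypothetical registry: calling it with a name returns that name's Timer."""

    def __init__(self):
        self._timers = {}

    def __call__(self, name):
        # Create the timer on first use so callers never register it explicitly.
        if name not in self._timers:
            self._timers[name] = Timer(name)
        return self._timers[name]


# Usage, following the call pattern in forward_step.
timers = Timers()
timers('batch generator').start()
batch = list(range(4))  # placeholder for the real batch-loading work
timers('batch generator').stop()
```

Looking timers up by name on each call means the start and stop sites only need to agree on a string, at the cost of a dictionary lookup per call.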

Tested by

no test coverage detected