hub / github.com/PRIME-RL/PRIME / load_fsdp_grad

Function load_fsdp_grad

training/verl/utils/fsdp_utils.py:78–82
(module, device_id)

Source from the content-addressed store, hash-verified

def load_fsdp_grad(module, device_id):
    for _, param in module.named_parameters():
        if param.grad is not None:
            param.grad = param.grad.to(device_id, non_blocking=True)
    torch.cuda.empty_cache()


def offload_fsdp_param_and_grad(module, offload_grad=False):

Callers

nothing calls this directly

Calls 2

named_parameters · Method · 0.80
to · Method · 0.80

Tested by

no test coverage detected
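
Since no tests are detected, here is a minimal sketch of how the gradient-reload loop might be checked without torch or a GPU. Everything in it is an assumption for illustration: `FakeGrad`, `FakeParam`, `FakeModule`, `load_grad_sketch`, and `run_check` are hypothetical names, and the function body is re-declared with the cache clear passed in as a parameter, which differs from the source (the source calls `torch.cuda.empty_cache()` directly).

```python
# Stand-ins for torch objects; all class names here are hypothetical.
class FakeGrad:
    def __init__(self, device):
        self.device = device

    def to(self, device, non_blocking=False):
        # Mirrors the call shape of torch.Tensor.to(device, non_blocking=...).
        return FakeGrad(device)


class FakeParam:
    def __init__(self, grad=None):
        self.grad = grad


class FakeModule:
    def __init__(self, params):
        self._params = params

    def named_parameters(self):
        # Yields (name, parameter) pairs, like torch.nn.Module.named_parameters().
        return iter(self._params.items())


def load_grad_sketch(module, device_id, empty_cache=lambda: None):
    # Same loop as load_fsdp_grad, but the cache clear is injected
    # (an assumption of this sketch) so no torch or CUDA is required.
    for _, param in module.named_parameters():
        if param.grad is not None:
            param.grad = param.grad.to(device_id, non_blocking=True)
    empty_cache()


def run_check():
    calls = []
    module = FakeModule({
        "weight": FakeParam(FakeGrad("cpu")),
        "bias": FakeParam(None),  # no gradient: must be left untouched
    })
    load_grad_sketch(module, "cuda:0", empty_cache=lambda: calls.append(1))
    params = dict(module.named_parameters())
    return params["weight"].grad.device, params["bias"].grad, len(calls)
```

The check covers the three behaviors visible in the source: existing gradients are moved to the target device, parameters without a gradient are skipped, and the cache clear runs once after the loop. A real test would presumably use an actual `torch.nn.Module` after a backward pass instead of these stand-ins.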