hub / github.com/hkust-nlp/simpleRL-reason / initialize_global_process_group

Function initialize_global_process_group

verl/utils/distributed.py:18–28
(timeout_second=36000)

Source from the content-addressed store, hash-verified

def initialize_global_process_group(timeout_second=36000):
    import torch.distributed
    from datetime import timedelta
    torch.distributed.init_process_group('nccl', timeout=timedelta(seconds=timeout_second))
    local_rank = int(os.environ["LOCAL_RANK"])
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])

    if torch.distributed.is_initialized():
        torch.cuda.set_device(local_rank)
    return local_rank, rank, world_size
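
The function reads LOCAL_RANK, RANK and WORLD_SIZE directly from os.environ (os is presumably imported at module level, outside the excerpt shown), so it raises KeyError when the process was not started by a launcher that sets them, such as torchrun. The sketch below is not part of the repository: read_dist_env is a hypothetical helper that performs the same three lookups with an explicit error message, and the dictionary passed to it stands in for the real environment. It also shows that the default timeout_second=36000 is a ten-hour timeout.

```python
import os
from datetime import timedelta


def read_dist_env(env=None):
    """Hypothetical helper (not in verl): read the launcher variables
    that initialize_global_process_group expects, with a clear error."""
    env = os.environ if env is None else env
    required = ("LOCAL_RANK", "RANK", "WORLD_SIZE")
    missing = [name for name in required if name not in env]
    if missing:
        raise RuntimeError(
            "missing launcher variables: " + ", ".join(missing)
            + " (start the script with a launcher such as torchrun)"
        )
    # Same conversions as the original function: each value is parsed as int.
    return int(env["LOCAL_RANK"]), int(env["RANK"]), int(env["WORLD_SIZE"])


# The default timeout_second=36000 expressed as a timedelta (ten hours).
default_timeout = timedelta(seconds=36000)

if __name__ == "__main__":
    # Simulated environment for one process in an 8-process job.
    fake_env = {"LOCAL_RANK": "1", "RANK": "5", "WORLD_SIZE": "8"}
    print(read_dist_env(fake_env))
    print(default_timeout)
```

Passing the environment as an argument keeps the sketch runnable without GPUs or an initialized process group; the real function additionally calls torch.distributed.init_process_group with the 'nccl' backend and binds the process to its local CUDA device.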

Callers 5

test_fsdp_ckpt · Function · 0.90
create_trainer · Function · 0.90
main · Function · 0.90
main · Function · 0.90

Calls

no outgoing calls

Tested by 2

test_fsdp_ckpt · Function · 0.72
create_trainer · Function · 0.72