This function first parses the configuration arguments, using :func:`parse_args()` in case one of the input arguments is not given. It then initializes and sets up the distributed environment by calling global_context's functions. Args: config (Union[str, Path, Config, dict]): Config object, dict, or path to a config file
launch(
    config: Union[str, Path, Config, Dict],
    rank: int,
    world_size: int,
    host: str,
    port: int,
    backend: str = "nccl",
    local_rank: int = None,
    seed: int = 1024,
    verbose: bool = True,
)
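
When processes are started with `torchrun`, the values for `rank`, `world_size`, `local_rank`, `host` and `port` are normally available in the environment variables `RANK`, `WORLD_SIZE`, `LOCAL_RANK`, `MASTER_ADDR` and `MASTER_PORT`. The following is a minimal sketch of collecting them into keyword arguments that match the signature above. The helper name `launch_kwargs_from_env` is hypothetical and not part of the library, and it only builds the dictionary; it does not call `launch` itself.

```python
import os
from typing import Any, Dict, Mapping, Optional


def launch_kwargs_from_env(
    config: Any,
    env: Optional[Mapping[str, str]] = None,
    backend: str = "nccl",
    seed: int = 1024,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for launch() from torchrun-style variables."""
    env = os.environ if env is None else env
    local_rank = env.get("LOCAL_RANK")
    return {
        "config": config,
        "rank": int(env["RANK"]),
        "world_size": int(env["WORLD_SIZE"]),
        "host": env["MASTER_ADDR"],
        "port": int(env["MASTER_PORT"]),
        "backend": backend,
        # Keep local_rank as None when it is unset, so that launch() can
        # calculate the device ordinal automatically.
        "local_rank": int(local_rank) if local_rank is not None else None,
        "seed": seed,
    }


if __name__ == "__main__":
    # Example values only; in a real job torchrun provides these.
    example_env = {
        "RANK": "0",
        "WORLD_SIZE": "2",
        "LOCAL_RANK": "0",
        "MASTER_ADDR": "127.0.0.1",
        "MASTER_PORT": "29500",
    }
    print(launch_kwargs_from_env({"example_key": 1}, env=example_env))
```

The resulting dictionary could then be passed on as `launch(**kwargs)`. Passing `env` explicitly keeps the helper easy to test without modifying the process environment.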
```python
def launch(
    config: Union[str, Path, Config, Dict],
    rank: int,
    world_size: int,
    host: str,
    port: int,
    backend: str = "nccl",
    local_rank: int = None,
    seed: int = 1024,
    verbose: bool = True,
):
    """This function first parses the configuration arguments, using :func:`parse_args()` in case one of the input
    arguments is not given. It then initializes and sets up the distributed environment by calling global_context's
    functions.

    Args:
        config (Union[str, Path, Config, dict]): Config object, dict, or path to a config file
        rank (int): Rank for the default process group
        world_size (int): World size of the default process group
        host (str): The master address for distributed training
        port (int): The master port for distributed training
        backend (str, optional): Backend for ``torch.distributed``, defaults to ``nccl``
        local_rank (int, optional):
            Rank of the process on the node, used to set the default CUDA device.
            Defaults to None. If local_rank is None, the default device ordinal is calculated automatically.
        seed (int, optional): Random seed used for every process. Defaults to 1024.
        verbose (bool, optional): Whether to print logs. Defaults to True.

    Raises:
        AssertionError: Raised when the config type is wrong
    """
    gpc.verbose = verbose

    # set config
    assert isinstance(
        config, (Config, str, Path, dict)
    ), f"expected argument config to be Config, str, Path or dict, but got {type(config)}"
    if not isinstance(config, Config) and isinstance(config, dict):
        config = Config(config)
    if isinstance(config, (str, Path)):
        config = Config.from_file(config)
    gpc.load_config(config)

    # init default process group
    gpc.init_global_dist(rank, world_size, backend, host, port)

    # init process groups for different parallel modes from config
    gpc.init_parallel_groups()

    # set cuda device
    if torch.cuda.is_available():
        # if local rank is not given, calculate automatically
        gpc.set_device(local_rank)

    # set the number of processes running on the same node
    gpc.detect_num_processes_on_current_node()

    gpc.set_seed(seed)
```
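
The config handling in the listing above accepts four input types and reduces them to a single `Config` object before loading it. The sketch below mirrors that branching with a stand-in `Config` class so that it runs without the library. The stand-in class, its JSON-based `from_file`, and the function name `normalize_config` are assumptions made for illustration; the real `Config.from_file` may read a different file format.

```python
import json
import tempfile
from pathlib import Path
from typing import Union


class Config(dict):
    """Stand-in for the library's Config class (assumed here to be dict-like)."""

    @classmethod
    def from_file(cls, filename: Union[str, Path]) -> "Config":
        # Assumption: a JSON file, chosen only to keep this sketch self-contained.
        with open(filename, "r", encoding="utf-8") as f:
            return cls(json.load(f))


def normalize_config(config: Union[str, Path, Config, dict]) -> Config:
    """Reduce any accepted input type to a Config, following the branches in launch()."""
    assert isinstance(
        config, (Config, str, Path, dict)
    ), f"expected argument config to be Config, str, Path or dict, but got {type(config)}"
    if isinstance(config, Config):
        # Already the target type: return it unchanged.
        return config
    if isinstance(config, dict):
        # Plain dict: wrap it.
        return Config(config)
    # str or Path: treat it as a file path.
    return Config.from_file(config)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.json"
        path.write_text(json.dumps({"seed": 1024}), encoding="utf-8")
        print(normalize_config(path))
    print(normalize_config({"seed": 1024}))
```

Checking for `Config` before `dict` matters when `Config` is itself a `dict` subclass, as in this stand-in: an existing `Config` instance is passed through rather than re-wrapped.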