hub / github.com/Turing-Project/WriteGPT / multi_card_run

Method multi_card_run

LanguageNetwork/BERT/train.py:37–58

Spawns 1 process per GPU

(self)

Source from the content-addressed store, hash-verified

        self.device_id = device_id

    def multi_card_run(self):
        """ Spawns 1 process per GPU """
        init_logger()

        nb_gpu = self.args.world_size
        mp = torch.multiprocessing.get_context('spawn')

        # Create a thread to listen for errors in the child processes.
        error_queue = mp.SimpleQueue()
        error_handler = ErrorHandler(error_queue)

        # Train with multiprocessing.
        process = []
        for i in range(nb_gpu):
            self.device_id = i
            process.append(mp.Process(target=self.multi_card_train, args=(self.args, self.device_id, error_queue),
                                      daemon=True))
            process[i].start()
            logger.info(" Starting process pid: %d " % process[i].pid)
            error_handler.add_child(process[i].pid)
        for p in process:
            p.join()

    def multi_card_train(self, error_queue):
        """ run process """
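The pattern in the listing (one spawned process per device, with a shared queue that children use to report failures) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the repository's implementation: `run_worker`, `launch`, and `train_fn` are hypothetical names, plain `multiprocessing` stands in for `torch.multiprocessing`, and the parent drains the queue after joining instead of using a listener thread as `ErrorHandler` is described as doing. Note also that the listing passes three positional arguments to `multi_card_train` while the visible signature accepts only `error_queue`; if both lines are as shown, the call would raise a `TypeError` in the child. The sketch avoids the question by using a module-level worker function.

```python
import multiprocessing as mp
import traceback


def run_worker(device_id, error_queue, train_fn):
    """Run train_fn(device_id); on failure, send (device_id, traceback) to the parent."""
    try:
        train_fn(device_id)
    except KeyboardInterrupt:
        pass  # the parent is shutting down; nothing to report
    except Exception:
        # Send the formatted traceback as a string, which is always picklable.
        error_queue.put((device_id, traceback.format_exc()))


def launch(world_size, train_fn):
    """Start one 'spawn' process per device, wait for all, and return reported errors.

    train_fn must be a module-level function so that it can be pickled
    and re-imported in the spawned child processes.
    """
    ctx = mp.get_context("spawn")
    error_queue = ctx.SimpleQueue()

    procs = []
    for device_id in range(world_size):
        p = ctx.Process(
            target=run_worker,
            args=(device_id, error_queue, train_fn),
            daemon=True,
        )
        p.start()
        procs.append(p)

    for p in procs:
        p.join()

    # Draining after join() is adequate for short messages only; a listener
    # thread would avoid blocking children on a full pipe.
    errors = []
    while not error_queue.empty():
        errors.append(error_queue.get())
    return errors
```

With the `spawn` start method, a call such as `launch(2, train_fn)` should sit under an `if __name__ == "__main__":` guard in the launching script, because each child re-imports the main module on startup.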

Callers 1

train.py · File · 0.80

Calls 4

add_child · Method · 0.95
init_logger · Function · 0.90
ErrorHandler · Class · 0.85
start · Method · 0.80

Tested by

no test coverage detected