hub / github.com/hpcaitech/ColossalAI / check_4gpu

Function check_4gpu

tests/test_fp8/test_fp8_all_to_all_single.py:18–24
(shape, dtype, fp8_format)

Source from the content-addressed store, hash-verified

@parameterize("dtype", [torch.bfloat16, torch.float16])
@parameterize("fp8_format", ["e4m3", "e5m2"])
def check_4gpu(shape, dtype, fp8_format):
    x = torch.rand(shape, dtype=dtype, device=get_accelerator().get_current_device())
    output = torch.empty_like(x)
    output_fp8 = torch.empty_like(x)
    all_to_all_single_fp8(output_fp8, x, group=_get_default_group(), fp8_format=fp8_format)
    dist.all_to_all_single(output, x, group=_get_default_group())
    assert_close(output, output_fp8, rtol=0.1, atol=0.1)


def run_dist(rank, world_size, port):
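The comparison in check_4gpu can be approximated in a single process, without GPUs or torch. The following is a minimal sketch, not the ColossalAI implementation: it assumes a world size of 4 (suggested only by the function name), equal splits along the first dimension, and a plain e5m2 round trip with no scaling in place of whatever all_to_all_single_fp8 does internally. The helper names (roundtrip_e5m2, simulate_all_to_all_single, is_close) are hypothetical. The closeness check uses the criterion documented for torch.testing.assert_close, |actual - expected| <= atol + rtol * |expected|.

```python
import random
import struct


def roundtrip_e5m2(value: float) -> float:
    """Round to fp16, keep its top 8 bits (1 sign, 5 exponent, 2 mantissa), decode back.

    Assumption: a plain e5m2 cast without scaling. Valid for finite inputs
    well below the fp16 maximum, such as values in [0, 1).
    """
    bits = struct.unpack("<H", struct.pack("<e", value))[0]
    upper, lower = bits >> 8, bits & 0xFF
    # Round to nearest, ties to even, on the 8 dropped mantissa bits.
    if lower > 0x80 or (lower == 0x80 and upper & 1):
        upper += 1
    return struct.unpack("<e", struct.pack("<H", upper << 8))[0]


def simulate_all_to_all_single(inputs, transform=lambda v: v):
    """Equal-split all-to-all: rank `dst` receives chunk `dst` from every rank, in rank order."""
    world_size = len(inputs)
    chunk = len(inputs[0]) // world_size
    outputs = []
    for dst in range(world_size):
        received = []
        for src in range(world_size):
            piece = inputs[src][dst * chunk:(dst + 1) * chunk]
            received.extend(transform(v) for v in piece)
        outputs.append(received)
    return outputs


def is_close(actual, expected, rtol, atol):
    """Elementwise criterion: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


random.seed(0)
WORLD_SIZE = 4  # assumption taken from the name check_4gpu
# Uniform values in [0, 1), mirroring torch.rand in the test.
inputs = [[random.random() for _ in range(8)] for _ in range(WORLD_SIZE)]

reference = simulate_all_to_all_single(inputs)
lossy = simulate_all_to_all_single(inputs, transform=roundtrip_e5m2)

all_within = all(
    is_close(a, e, rtol=0.1, atol=0.1)
    for ref_rank, lossy_rank in zip(reference, lossy)
    for e, a in zip(ref_rank, lossy_rank)
)
print(all_within)
```

With only 2 mantissa bits, an e5m2 round trip of a value below 1 moves it by at most 0.0625, so a tolerance as loose as rtol=0.1, atol=0.1 is plausible for this kind of comparison. Whether that is the reason the test uses these values is not stated in the source.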

Callers 1

run_dist · Function · 0.70

Calls 3

get_accelerator · Function · 0.90
all_to_all_single_fp8 · Function · 0.90
get_current_device · Method · 0.45

Tested by

no test coverage detected
