hub / github.com/ModelTC/LightLLM / tppart_model_infer

Function tppart_model_infer

test/benchmark/static_inference/test_vit.py:36–62
(model_kvargs, batch_size, image_size)

Source from the content-addressed store, hash-verified

def tppart_model_infer(model_kvargs, batch_size, image_size):
    # time, init_vision_distributed_env and VisionTransformer are not defined here;
    # they must come from the enclosing module's scope.
    import torch
    import torch.distributed as dist

    rank_id = model_kvargs["tp_rank_id"]
    init_vision_distributed_env(model_kvargs)

    torch.cuda.empty_cache()
    model_part = VisionTransformer(model_kvargs)
    test_data = torch.randn((batch_size, 3, image_size, image_size)).cuda().to(torch.bfloat16)
    # warm up
    torch.cuda.synchronize()
    for i in range(10):
        model_part.forward(test_data)
    torch.cuda.synchronize()

    torch.cuda.synchronize()
    start_time = time.time()
    for i in range(50):
        model_part.forward(test_data)
    torch.cuda.synchronize()
    end_time = time.time()
    if rank_id == 0:
        # despite the label, this is the average time per forward pass
        print("time total cost(ms):", (end_time - start_time) / 50 * 1000)
        print("image per second:", batch_size * 50 / (end_time - start_time))

    return


if __name__ == "__main__":
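
The measurement pattern in the listing (10 untimed warm-up passes, then 50 timed passes bracketed by synchronization) can be sketched in isolation. This is a minimal, CPU-runnable illustration and not code from LightLLM: the `benchmark` helper and its `step` and `sync` parameters are hypothetical stand-ins for `model_part.forward(test_data)` and `torch.cuda.synchronize()`, and `time.perf_counter()` is swapped in for `time.time()` because it is a monotonic clock.

```python
import time


def benchmark(step, batch_size, warmup_iters=10, timed_iters=50, sync=lambda: None):
    """Warm up, then time `timed_iters` calls of `step`.

    `step` and `sync` are hypothetical stand-ins; on a GPU they would be the
    model's forward pass and torch.cuda.synchronize() respectively.
    """
    # Warm-up: untimed iterations keep one-off startup costs out of the measurement.
    sync()
    for _ in range(warmup_iters):
        step()
    sync()

    # Timed region: synchronize before each clock read so that queued
    # asynchronous work is included in the interval.
    sync()
    start = time.perf_counter()
    for _ in range(timed_iters):
        step()
    sync()
    elapsed = time.perf_counter() - start

    return {
        # Average latency of one iteration, in milliseconds.
        "ms_per_iter": elapsed / timed_iters * 1000,
        # Throughput: total images processed divided by total elapsed seconds.
        "images_per_second": batch_size * timed_iters / elapsed,
    }


if __name__ == "__main__":
    stats = benchmark(lambda: sum(range(10_000)), batch_size=8)
    print("ms per iteration:", stats["ms_per_iter"])
    print("images per second:", stats["images_per_second"])
```

The two reported numbers are derived from the same elapsed time, so `images_per_second` always equals `batch_size * 1000 / ms_per_iter`.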

Callers

nothing calls this directly

Calls 4

forward · Method · 0.95
VisionTransformer · Class · 0.90
cuda · Method · 0.45

Tested by

no test coverage detected