Function build_and_run

tests/python/relax/test_codegen_cutlass.py:92–111

(mod, inputs_np, target, legalize=True, cuda_graph=False)

Source from the content-addressed store, hash-verified

def build_and_run(mod, inputs_np, target, legalize=True, cuda_graph=False):
    with tvm.transform.PassContext(
        config={
            "relax.backend.use_cuda_graph": cuda_graph,
            "relax.transform.apply_legalize_ops": legalize,
        }
    ):
        ex = tvm.compile(mod, target)

    dev = tvm.device(target, 0)
    vm = relax.VirtualMachine(ex, dev)
    f = vm["main"]
    inputs = [tvm.runtime.tensor(inp, dev) for inp in inputs_np]

    # For cuda graph, run the compiled function twice to make sure that we can launch the cached
    # graph on the second run.
    if cuda_graph:
        f(*inputs)

    return f(*inputs).numpy()
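The warm-up call guarded by `cuda_graph` is easier to see in isolation. The following is a minimal sketch of that call pattern only, written in plain Python without TVM or a GPU. `CachedGraphRunner` and `run` are hypothetical names introduced for illustration and are not part of TVM. The sketch assumes a callable that records a graph on its first invocation and replays it afterwards; real CUDA graph capture is considerably more involved.

```python
class CachedGraphRunner:
    """Hypothetical stand-in for a VM function compiled with CUDA graph support."""

    def __init__(self, fn):
        self._fn = fn
        self._captured = False
        self.events = []  # one entry per call: "capture" or "replay"

    def __call__(self, *inputs):
        # The first call plays the role of graph capture; later calls replay it.
        if not self._captured:
            self._captured = True
            self.events.append("capture")
        else:
            self.events.append("replay")
        return self._fn(*inputs)


def run(f, inputs, cuda_graph=False):
    # Mirrors the tail of build_and_run: when cuda_graph is set, call once
    # to warm up so that the returned result comes from the replay path.
    if cuda_graph:
        f(*inputs)
    return f(*inputs)


f = CachedGraphRunner(lambda a, b: [x + y for x, y in zip(a, b)])
result = run(f, ([1, 2], [3, 4]), cuda_graph=True)

g = CachedGraphRunner(lambda a, b: [x + y for x, y in zip(a, b)])
plain = run(g, ([1, 2], [3, 4]), cuda_graph=False)
```

With `cuda_graph=True` the stand-in is called twice, so the value handed back comes from the second (replayed) call. With `cuda_graph=False` it is called once and only the capture path runs.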

get_result_with_relax_cutlass_offload · Function · 0.70
test_kernel_sharing · Function · 0.70
test_conv2d_offload · Function · 0.70
test_matmul_offload · Function · 0.70
test_matmul_with_3d_bias_offload · Function · 0.70
test_attention_rewrite_offload · Function · 0.70
test_conv2d_residual_broadcast · Function · 0.70
test_layer_norm · Function · 0.70
test_rms_norm · Function · 0.70
test_conv2d_cuda_graph · Function · 0.70
test_attention_rewrite_multi_query · Function · 0.70
_test_batched_var_len_attention · Function · 0.70

numpy · Method · 0.80
f · Function · 0.70
compile · Method · 0.45
device · Method · 0.45

no test coverage detected
