hub / github.com/apache/tvm / _schedule

Function _schedule

python/tvm/exec/gpu_memory_bandwidth.py:156–175
(
    sch: s_tir.Schedule,
    len_bx: int,
    len_tx: int,
    len_vec: int,
)

Source from the content-addressed store, hash-verified

def _schedule(
    sch: s_tir.Schedule,
    len_bx: int,
    len_tx: int,
    len_vec: int,
):
    # pylint: disable=invalid-name
    block = sch.get_sblock("B")
    xo, xi, k = sch.get_loops(block)
    bx, xo = sch.split(xo, factors=[len_bx, None])
    xi, tx, vec = sch.split(xi, factors=[None, len_tx, len_vec])
    sch.reorder(bx, xi, tx, xo, k, vec)
    bx = sch.fuse(bx, xi)
    sch.bind(bx, "blockIdx.x")
    sch.bind(tx, "threadIdx.x")
    ldg = sch.cache_read(block, 0, "local")
    sch.compute_at(ldg, k, preserve_unit_loops=True)
    sch.vectorize(sch.get_loops(ldg)[-1])
    sch.decompose_reduction(block, k)
    # pylint: enable=invalid-name
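
The two split calls in the listing each leave one factor as None, which the scheduler infers from the loop extent. The following is a minimal pure-Python sketch of that index arithmetic, not TVM code: split_extents and recombine are hypothetical helper names, and the sizes are made up for illustration. It assumes the None factor is resolved by ceiling division, and it ignores the guard that a real schedule would need when the extent is not evenly divisible.

```python
from itertools import product


def split_extents(extent, factors):
    """Resolve a single None in `factors` so the product covers `extent`."""
    known = 1
    for f in factors:
        if f is not None:
            known *= f
    inferred = -(-extent // known)  # ceiling division
    return [inferred if f is None else f for f in factors]


def recombine(indices, extents):
    """Rebuild the original loop index from sub-indices, outermost first."""
    value = 0
    for idx, ext in zip(indices, extents):
        value = value * ext + idx
    return value


# Hypothetical sizes, chosen only for illustration.
xi_extent, len_tx, len_vec = 64, 4, 2

# Mirrors factors=[None, len_tx, len_vec] from the listing.
extents = split_extents(xi_extent, [None, len_tx, len_vec])

# Enumerate every (xi, tx, vec) triple and map it back to the original index.
covered = sorted(
    recombine(idx, extents) for idx in product(*(range(e) for e in extents))
)
```

With these sizes every original index from 0 to 63 is produced exactly once, which is why reordering and fusing the split loops afterwards changes only the iteration order and not the set of elements visited.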

Callers 1

main · Function · 0.85

Calls 10

get_sblock · Method · 0.80
reorder · Method · 0.80
fuse · Method · 0.80
cache_read · Method · 0.80
compute_at · Method · 0.80
vectorize · Method · 0.80
decompose_reduction · Method · 0.80
get_loops · Method · 0.45
split · Method · 0.45
bind · Method · 0.45

Tested by

no test coverage detected
