Method forward

tensorrt_llm/models/enc_dec/model.py:2093–2123
forward(self, input_features: Tensor, input_lengths=None, position_ids=None)

Source from the content-addressed store, hash-verified

        self.downsample_factor = 2

    def forward(self,
                input_features: Tensor,
                input_lengths=None,
                position_ids=None):
        if default_net().plugin_config.remove_input_padding:
            # BXT,D -> 1,BxT,D -> 1,D,BxT
            input_features = unsqueeze(input_features, 0)
            input_features = transpose(input_features, 1, 2)
        x_type = input_features.dtype
        input_features = cast(input_features, self._dtype)
        x = self.transformer.conv1(input_features)
        x = gelu(x)
        x = self.transformer.conv2(x)
        x = cast(x, x_type)
        x = gelu(x)
        x = transpose(x, 2, 1)
        x = x + cast(self.transformer.position_embedding(position_ids), x.dtype)

        if default_net().plugin_config.remove_input_padding:
            #B,T,D -> BxT,D
            x = x.view([-1, self.config.hidden_size])
        hidden_states = x
        input_lengths = input_lengths // self.downsample_factor
        for encoder_layer in self.transformer.layers:
            hidden_states = encoder_layer(hidden_states,
                                          input_lengths=input_lengths)

        x = hidden_states
        x = self.transformer.ln_f(x)
        x.mark_output('encoder_output', self._dtype)
        return x

    def prepare_inputs(self, max_batch_size=16):
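The shape comments in forward can be traced without building a network. The sketch below is plain Python, not TensorRT-LLM code. It follows the packed (remove_input_padding) path and the length bookkeeping done by input_lengths // self.downsample_factor. The helper names (conv1d_out_len, trace_packed_shapes, downsample_lengths) are hypothetical. The convolution geometry (kernel 3, padding 1, stride 1 for conv1 and stride 2 for conv2, both producing hidden_size channels) is an assumption borrowed from the usual Whisper-style encoder front end; the excerpt above does not show it.

```python
from typing import List, Tuple

Shape = Tuple[int, ...]


def conv1d_out_len(length: int, kernel: int = 3, stride: int = 1,
                   padding: int = 1) -> int:
    """Output length of a 1-D convolution with dilation 1."""
    return (length + 2 * padding - kernel) // stride + 1


def trace_packed_shapes(total_frames: int, n_feats: int,
                        hidden_size: int) -> List[Tuple[str, Shape]]:
    """Follow tensor shapes through the remove_input_padding branch.

    total_frames plays the role of BxT in the source comments.
    """
    steps: List[Tuple[str, Shape]] = []
    steps.append(("input", (total_frames, n_feats)))               # BxT, D
    steps.append(("unsqueeze(0)", (1, total_frames, n_feats)))     # 1, BxT, D
    steps.append(("transpose(1, 2)", (1, n_feats, total_frames)))  # 1, D, BxT
    t1 = conv1d_out_len(total_frames, stride=1)  # ASSUMED conv1 geometry
    steps.append(("conv1", (1, hidden_size, t1)))
    t2 = conv1d_out_len(t1, stride=2)            # ASSUMED conv2 geometry
    steps.append(("conv2", (1, hidden_size, t2)))
    steps.append(("transpose(2, 1)", (1, t2, hidden_size)))
    steps.append(("view([-1, hidden_size])", (t2, hidden_size)))
    return steps


def downsample_lengths(lengths: List[int], factor: int = 2) -> List[int]:
    """Mirror of `input_lengths // self.downsample_factor` on a plain list."""
    return [n // factor for n in lengths]


if __name__ == "__main__":
    for name, shape in trace_packed_shapes(3000, 80, 384):
        print(f"{name:26s} {shape}")
    print(downsample_lengths([3000, 1500]))
```

With these assumed parameters, an even frame count halves exactly, which agrees with the floor division applied to input_lengths. For an odd count the two differ: a stride-2 convolution with padding 1 turns 7 frames into 4, while 7 // 2 is 3. Whether that case can occur in the real model cannot be determined from this excerpt.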

Callers

nothing calls this directly

Calls (7)

default_net · Function · 0.90
unsqueeze · Function · 0.90
transpose · Function · 0.90
cast · Function · 0.90
gelu · Function · 0.90
mark_output · Method · 0.80
view · Method · 0.45

Tested by

no test coverage detected