hub / github.com/NVIDIA/TensorRT-LLM / forward

Method forward

tensorrt_llm/models/bert/model.py:302–367
(self,
                input_ids=None,
                input_lengths=None,
                position_ids=None,
                token_type_ids=None,
                hidden_states=None,
                max_input_length=None)

Source from the content-addressed store, hash-verified

        ])

    def forward(self,
                input_ids=None,
                input_lengths=None,
                position_ids=None,
                token_type_ids=None,
                hidden_states=None,
                max_input_length=None):
        # remove_input_padding requires these fields as explicit input
        mask = None
        if not default_net().plugin_config.remove_input_padding:
            seq_len_2d = concat([1, shape(input_ids, 1)])

            # create position ids
            position_ids_buffer = constant(
                np.expand_dims(
                    np.arange(self.max_position_embeddings).astype(np.int32),
                    0))
            tmp_position_ids = slice(position_ids_buffer,
                                     starts=[0, 0],
                                     sizes=seq_len_2d)
            tmp_position_ids = expand(tmp_position_ids, shape(input_ids))  # BxL
            tmp_input_lengths = unsqueeze(input_lengths, 1)  # Bx1
            tmp_input_lengths = expand(tmp_input_lengths,
                                       shape(input_ids))  # BxL
            mask = tmp_position_ids < tmp_input_lengths  # BxL
            mask = mask.cast('int32')

            if position_ids is None:
                if self.is_roberta:
                    # see create_position_ids_from_input_ids() in https://github.com/huggingface/transformers/blob/main/src/transformers/models/roberta/modeling_roberta.py
                    position_ids = (tmp_position_ids + 1) * mask
                    position_ids = position_ids + self.padding_idx
                else:
                    position_ids = slice(position_ids_buffer,
                                         starts=[0, 0],
                                         sizes=seq_len_2d)
                    position_ids = expand(position_ids, shape(input_ids))

            # create token_type_ids
            if token_type_ids is None:
                token_type_ids_buffer = constant(
                    np.expand_dims(
                        np.zeros(self.max_position_embeddings).astype(np.int32),
                        0))
                token_type_ids = slice(token_type_ids_buffer,
                                       starts=[0, 0],
                                       sizes=seq_len_2d)
                token_type_ids = expand(token_type_ids, shape(input_ids))

        hidden_states = self.embedding(input_ids, position_ids, token_type_ids)
        self.register_network_output('embedding_output', hidden_states)

        for idx, layer in enumerate(self.layers):
            hidden_states = layer(hidden_states=hidden_states,
                                  input_lengths=input_lengths,
                                  attention_mask=mask,
                                  max_input_length=max_input_length)
            # keep the last layer output name as hidden_states
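
The padded-input branch of the listing builds a BxL attention mask from `input_lengths` and, when `position_ids` is not supplied, derives default position ids from it. The arithmetic can be checked outside a TensorRT network with a minimal NumPy sketch. This is an illustration only, not the library's implementation: NumPy broadcasting stands in for the `slice`/`expand`/`unsqueeze` graph ops, the helper names `padding_mask` and `default_position_ids` are hypothetical, and `padding_idx=1` is an assumed value (the pad index used by Hugging Face RoBERTa checkpoints).

```python
import numpy as np


def padding_mask(input_lengths, seq_len):
    """Return (mask, positions), both BxL int32.

    mask is 1 for real tokens and 0 for padding, mirroring
    `tmp_position_ids < tmp_input_lengths` in the listing.
    """
    batch = len(input_lengths)
    positions = np.arange(seq_len, dtype=np.int32)[None, :]       # 1xL
    positions = np.broadcast_to(positions, (batch, seq_len))      # BxL
    lengths = np.asarray(input_lengths, dtype=np.int32)[:, None]  # Bx1
    mask = (positions < lengths).astype(np.int32)                 # BxL
    return mask, positions


def default_position_ids(input_lengths, seq_len, is_roberta=False,
                         padding_idx=1):
    """Default position ids, following the two branches in the listing.

    padding_idx=1 is an assumption made for this sketch.
    """
    mask, positions = padding_mask(input_lengths, seq_len)
    if is_roberta:
        # Real tokens count up from padding_idx + 1; padded slots
        # collapse to padding_idx.
        return (positions + 1) * mask + padding_idx
    # Plain BERT: 0..L-1 for every row, padding included.
    return positions.copy()


lengths = [3, 5]
mask, _ = padding_mask(lengths, seq_len=5)
bert_ids = default_position_ids(lengths, 5)
roberta_ids = default_position_ids(lengths, 5, is_roberta=True)
print(mask)
print(bert_ids)
print(roberta_ids)
```

With lengths `[3, 5]` and a sequence length of 5, the first row of the mask is `[1, 1, 1, 0, 0]`, the BERT position ids are `[0, 1, 2, 3, 4]` for both rows, and the RoBERTa variant gives `[2, 3, 4, 1, 1]` for the first row, so padded slots all map to the padding index.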

Callers 2

forward · Method · 0.45
forward · Method · 0.45

Calls 11

default_net · Function · 0.85
concat · Function · 0.85
constant · Function · 0.85
slice · Function · 0.85
expand · Function · 0.85
unsqueeze · Function · 0.85
cast · Method · 0.80
embedding · Method · 0.80
mark_output · Method · 0.80
shape · Function · 0.50

Tested by

no test coverage detected