hub / github.com/NVIDIA/TensorRT-LLM / infer_shapes

Method infer_shapes

tensorrt_llm/runtime/session.py:204–237

@brief: Sets the input shapes on the given context and infers the output shapes from them. Call this function before run() whenever the input shapes change, or call context.set_input_shape manually on every dynamic-shaped input tensor.

(
    self,
    inputs: List[TensorInfo],
    context: Optional[trt.IExecutionContext] = None
)
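A caller typically uses the returned list to size its output buffers before calling run(). The following is a minimal sketch of that pattern, not code from the repository: `infer_and_allocate` and `run_once` are hypothetical helpers, and the single float32 input tensor named "x" is an assumption about the engine. The TensorRT-LLM and PyTorch imports are deferred into `run_once`, which needs a CUDA device.

```python
from typing import Any, Callable, Dict, List


def infer_and_allocate(session: Any, input_infos: List[Any],
                       alloc: Callable[[Any], Any]) -> Dict[str, Any]:
    """Hypothetical helper: infer output shapes, then allocate one buffer each.

    `session` is any object exposing infer_shapes(inputs) as documented here;
    `alloc` maps one output TensorInfo (name, dtype, shape) to a buffer.
    """
    output_infos = session.infer_shapes(input_infos)
    return {info.name: alloc(info) for info in output_infos}


def run_once(engine_buffer: bytes, x: Any) -> Dict[str, Any]:
    """Usage sketch; requires tensorrt_llm, torch and a CUDA device.

    Assumes the engine has a single float32 input tensor named "x".
    """
    # Imported lazily so the helper above stays usable without a GPU stack.
    import tensorrt as trt
    import torch
    from tensorrt_llm._utils import trt_dtype_to_torch
    from tensorrt_llm.runtime import Session, TensorInfo

    session = Session.from_serialized_engine(engine_buffer)
    outputs = infer_and_allocate(
        session,
        [TensorInfo("x", trt.DataType.FLOAT, tuple(x.shape))],
        # One CUDA tensor per output, shaped and typed from the inferred info.
        lambda info: torch.empty(tuple(info.shape),
                                 dtype=trt_dtype_to_torch(info.dtype),
                                 device="cuda"),
    )
    stream = torch.cuda.current_stream()
    ok = session.run({"x": x}, outputs, stream.cuda_stream)
    if not ok:
        raise RuntimeError("Session.run reported failure")
    stream.synchronize()
    return outputs
```

Keeping the allocation step behind a callable means the same helper works with any buffer type, and the shape inference can be exercised without a GPU by passing a stand-in session.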

Source from the content-addressed store, hash-verified

def infer_shapes(
        self,
        inputs: List[TensorInfo],
        context: Optional[trt.IExecutionContext] = None
) -> List[TensorInfo]:
    '''
    @brief: Set input shapes to given context, and infer the output shapes from the given input shapes.
        This function should be called every time when the input shapes are changed before calling run().
        Or call the context.set_input_shape on all dynamic shaped input tensors manually.
    @param inputs: list of TensorInfo object, each item represents an input tensor
    @param context: TensorRT execution context, if None, use the default context
    @return: list of TensorInfo object, each item represents an output tensor, returns None if failed
    '''
    # set shape to the default context if context is not specified
    if context is None:
        context = self.context
    for i in inputs:
        if self.engine.get_tensor_mode(i.name) != trt.TensorIOMode.INPUT:
            raise ValueError(f"Tensor:{i.name} is not an input tensor")
        if self.engine.get_tensor_dtype(i.name) != i.dtype:
            raise ValueError(f"Tensor:{i.name} has wrong dtype")
        if not context.set_input_shape(i.name, i.shape):
            raise RuntimeError(
                f"Could not set shape {i.shape} for tensor {i.name}. Please check the profile range for which your model was build."
            )

    outputs = []
    for i in range(self.engine.num_io_tensors):
        name = self.engine.get_tensor_name(i)
        if self.engine.get_tensor_mode(name) == trt.TensorIOMode.OUTPUT:
            shape = context.get_tensor_shape(name)
            dtype = self.engine.get_tensor_dtype(name)
            outputs.append(TensorInfo(name, dtype, shape))
    return outputs
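In the listing, each failed check raises an exception (ValueError for a non-input tensor or a dtype mismatch, RuntimeError when the shape cannot be set), so callers should catch exceptions rather than test the result for None. The sketch below reproduces that control flow against stand-in classes so it runs without TensorRT. `TensorInfo`, `FakeSession`, and `classify` here are hypothetical stand-ins, and the batch limit of 8 and the "4 columns" output rule are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


@dataclass
class TensorInfo:
    """Stand-in for tensorrt_llm.runtime.TensorInfo (dtype is a plain string)."""
    name: str
    dtype: str
    shape: Shape


class FakeSession:
    """Stand-in with one input "x" and one output "y", both float32."""
    io = {"x": ("INPUT", "float32"), "y": ("OUTPUT", "float32")}
    max_batch = 8  # invented profile limit on the first dimension

    def __init__(self) -> None:
        self.shapes: Dict[str, Shape] = {}

    def _set_input_shape(self, name: str, shape: Shape) -> bool:
        # Mimics a context rejecting a shape outside the optimization profile.
        if not 1 <= shape[0] <= self.max_batch:
            return False
        self.shapes[name] = shape
        return True

    def infer_shapes(self, inputs: List[TensorInfo]) -> List[TensorInfo]:
        # Same three checks, in the same order, as the listing above.
        for i in inputs:
            mode, dtype = self.io[i.name]
            if mode != "INPUT":
                raise ValueError(f"Tensor:{i.name} is not an input tensor")
            if dtype != i.dtype:
                raise ValueError(f"Tensor:{i.name} has wrong dtype")
            if not self._set_input_shape(i.name, i.shape):
                raise RuntimeError(f"Could not set shape {i.shape} for tensor {i.name}")
        # Toy shape rule: "y" keeps the batch dimension of "x" and has 4 columns.
        return [TensorInfo("y", "float32", (self.shapes["x"][0], 4))]


def classify(session: FakeSession, info: TensorInfo) -> str:
    """Report which outcome a single input produces."""
    try:
        session.infer_shapes([info])
    except ValueError:
        return "ValueError"
    except RuntimeError:
        return "RuntimeError"
    return "ok"


if __name__ == "__main__":
    for info in (TensorInfo("x", "float32", (2, 3)),    # valid input
                 TensorInfo("y", "float32", (2, 4)),    # output tensor passed as input
                 TensorInfo("x", "float16", (2, 3)),    # dtype mismatch
                 TensorInfo("x", "float32", (16, 3))):  # outside the fake profile
        print(info.name, info.dtype, info.shape, "->", classify(FakeSession(), info))
```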

Callers 15

_debug_run · Method · 0.95
test_plugin_no_cache · Method · 0.80
run_session · Function · 0.80
run_vision_encoder · Method · 0.80
run_engine · Function · 0.80
run · Function · 0.80
run · Function · 0.80
prepare · Method · 0.80
_setup · Method · 0.80
vae_decode · Function · 0.80
run.py · File · 0.80
vit_process · Function · 0.80

Calls 2

TensorInfo · Class · 0.85
append · Method · 0.45

Tested by 1

test_plugin_no_cache · Method · 0.64