hub / github.com/NVIDIA/TensorRT-LLM / pack_strided_memory

Function pack_strided_memory

tensorrt_llm/_dlpack_utils.py:183–210

Pack GPU memory into a PyTorch tensor with a specified stride. The returned tensor references the provided memory.

(
    ptr: int, segment_size: int, segment_stride: int, num_segments: int, dtype: torch.dtype, dev_id
)
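The byte-based arguments can be read as describing a 2-D view over the allocation: one row per segment, with rows spaced `segment_stride` bytes apart. The sketch below only illustrates that arithmetic. `strided_layout` and `StridedLayout` are hypothetical names, and the resulting shape and strides are an assumption inferred from the parameter descriptions, not taken from `create_dlpack_capsule`. The divisibility and overlap checks are additions of this sketch and may not exist in the real code.

```python
from typing import NamedTuple, Tuple


class StridedLayout(NamedTuple):
    shape: Tuple[int, int]     # (segments, elements per segment)
    strides: Tuple[int, int]   # in elements, the unit PyTorch and DLPack use
    offsets: Tuple[int, ...]   # byte offset of each segment relative to ptr


def strided_layout(segment_size: int, segment_stride: int,
                   num_segments: int, itemsize: int) -> StridedLayout:
    """Hypothetical helper: convert byte-based arguments to an element layout."""
    if segment_size % itemsize or segment_stride % itemsize:
        raise ValueError("segment_size and segment_stride must be multiples of itemsize")
    if segment_stride < segment_size:
        # Assumption of this sketch: overlapping segments are treated as an error.
        raise ValueError("segment_stride is smaller than segment_size")
    shape = (num_segments, segment_size // itemsize)
    strides = (segment_stride // itemsize, 1)
    offsets = tuple(i * segment_stride for i in range(num_segments))
    return StridedLayout(shape, strides, offsets)


# Four 256-byte segments of a 2-byte dtype (e.g. float16), 1024 bytes apart.
layout = strided_layout(segment_size=256, segment_stride=1024,
                        num_segments=4, itemsize=2)
print(layout.shape)    # (4, 128)
print(layout.strides)  # (512, 1)
print(layout.offsets)  # (0, 1024, 2048, 3072)
```

Under this reading, the gap of `segment_stride - segment_size` bytes between consecutive segments is never touched by the tensor.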

Source

def pack_strided_memory(
    ptr: int, segment_size: int, segment_stride: int, num_segments: int, dtype: torch.dtype, dev_id
):
    """
    Pack GPU memory into a PyTorch tensor with specified stride.

    Parameters:
        ptr: GPU memory address obtained from cudaMalloc
        segment_size: Memory size of each segment in bytes
        segment_stride: Memory stride size between segments in bytes
        num_segments: Number of segments
        dtype: PyTorch data type for the resulting tensor
        dev_id: CUDA device ID

    Returns:
        PyTorch tensor that references the provided memory

    Note:
        This function creates a new DLPack capsule each time it's called,
        even with the same pointer. Each capsule is consumed only once.
    """
    # Create a new capsule each time
    capsule_wrapper = create_dlpack_capsule(
        ptr, segment_size, segment_stride, num_segments, dtype, dev_id
    )
    torch_tensor = torch.utils.dlpack.from_dlpack(capsule_wrapper.capsule)
    torch_tensor._capsule_wrapper = capsule_wrapper
    return torch_tensor
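Assigning the wrapper to `torch_tensor._capsule_wrapper` presumably ties the wrapper's lifetime to the tensor, so whatever the wrapper owns is not released while the tensor is still in use. The sketch below shows that keep-alive pattern with plain Python objects. `CapsuleWrapper`, `View`, and `make_view` are hypothetical stand-ins, not TensorRT-LLM or PyTorch APIs, and nothing here touches GPU memory.

```python
import gc
import weakref


class CapsuleWrapper:
    """Hypothetical stand-in for the object returned by create_dlpack_capsule."""

    def __init__(self, ptr: int):
        self.ptr = ptr


class View:
    """Hypothetical stand-in for the tensor that borrows the wrapper's data."""


def make_view(ptr: int) -> View:
    wrapper = CapsuleWrapper(ptr)
    view = View()
    # Same pattern as pack_strided_memory: the view holds the only
    # long-lived reference to the wrapper.
    view._capsule_wrapper = wrapper
    return view


view = make_view(0x1000)
wrapper_ref = weakref.ref(view._capsule_wrapper)

gc.collect()
alive_while_view_exists = wrapper_ref() is not None

del view
gc.collect()
alive_after_view_dropped = wrapper_ref() is not None

print(alive_while_view_exists, alive_after_view_dropped)  # True False
```

The wrapper survives exactly as long as the object that references it, which is why each call can safely build a fresh wrapper and capsule without tracking them elsewhere.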

Callers 1

as_torch_strided_tensor · Method · 0.85

Calls 1

create_dlpack_capsule · Function · 0.85

Tested by

no test coverage detected