hub / github.com/NVIDIA/TensorRT-LLM / pack_strided_memory

Function pack_strided_memory

tensorrt_llm/_dlpack_utils.py:183–210

Pack GPU memory into a PyTorch tensor with a specified stride.

(
    ptr: int, segment_size: int, segment_stride: int, num_segments: int, dtype: torch.dtype, dev_id
)
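The three byte-based parameters describe a regular layout: segment `i` starts at `ptr + i * segment_stride` and spans `segment_size` bytes. The sketch below works through that arithmetic in plain Python. It assumes the resulting tensor is 2-D with shape `(num_segments, segment_size // itemsize)`. The source shown on this page does not confirm this, since the actual layout is decided inside `create_dlpack_capsule`. The helper names `strided_layout` and `gather_segments` are hypothetical and not part of TensorRT-LLM.

```python
from typing import List, Tuple


def strided_layout(
    segment_size: int, segment_stride: int, num_segments: int, itemsize: int
) -> Tuple[Tuple[int, int], Tuple[int, int], List[int]]:
    """Return (shape, strides in elements, byte offset of each segment).

    Hypothetical helper: assumes a 2-D view with one row per segment.
    """
    if segment_size % itemsize or segment_stride % itemsize:
        raise ValueError("segment_size and segment_stride must be multiples of itemsize")
    shape = (num_segments, segment_size // itemsize)
    # Tensor strides are counted in elements, while the parameters are in bytes.
    strides = (segment_stride // itemsize, 1)
    offsets = [i * segment_stride for i in range(num_segments)]
    return shape, strides, offsets


def gather_segments(
    buf: bytes, segment_size: int, segment_stride: int, num_segments: int
) -> List[bytes]:
    """Copy each segment out of a flat host buffer (CPU stand-in for GPU memory)."""
    return [
        buf[i * segment_stride : i * segment_stride + segment_size]
        for i in range(num_segments)
    ]


# Three 16-byte segments of float32 (4 bytes each), spaced 64 bytes apart.
shape, strides, offsets = strided_layout(16, 64, 3, itemsize=4)
```

With these inputs each row holds four elements, and consecutive rows sit 16 elements (64 bytes) apart, so the bytes between segments are skipped rather than copied.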

Source from the content-addressed store, hash-verified

def pack_strided_memory(
    ptr: int, segment_size: int, segment_stride: int, num_segments: int, dtype: torch.dtype, dev_id
):
    """
    Pack GPU memory into a PyTorch tensor with specified stride.

    Parameters:
        ptr: GPU memory address obtained from cudaMalloc
        segment_size: Memory size of each segment in bytes
        segment_stride: Memory stride size between segments in bytes
        num_segments: Number of segments
        dtype: PyTorch data type for the resulting tensor
        dev_id: CUDA device ID

    Returns:
        PyTorch tensor that references the provided memory

    Note:
        This function creates a new DLPack capsule each time it's called,
        even with the same pointer. Each capsule is consumed only once.
    """
    # Create a new capsule each time
    capsule_wrapper = create_dlpack_capsule(
        ptr, segment_size, segment_stride, num_segments, dtype, dev_id
    )
    torch_tensor = torch.utils.dlpack.from_dlpack(capsule_wrapper.capsule)
    torch_tensor._capsule_wrapper = capsule_wrapper
    return torch_tensor
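The last two statements attach the capsule wrapper to the returned tensor as `_capsule_wrapper`. A plausible reading is that this ties the wrapper's lifetime to the tensor's, so whatever the wrapper holds is not released while the tensor is still in use. The source does not state this intent, so treat it as an assumption. The sketch below shows the general keep-alive pattern in plain Python. `CapsuleWrapper` and `View` are hypothetical stand-ins for the real wrapper and for `torch.Tensor`, and the immediate release after `del` relies on CPython's reference counting.

```python
import gc
import weakref
from typing import Tuple


class CapsuleWrapper:
    """Hypothetical stand-in for the object returned by create_dlpack_capsule."""

    def __init__(self, name: str) -> None:
        self.name = name


class View:
    """Hypothetical stand-in for the tensor; accepts arbitrary attributes."""


def make_view(name: str) -> Tuple[View, "weakref.ref[CapsuleWrapper]"]:
    wrapper = CapsuleWrapper(name)
    view = View()
    # Same pattern as the source: the view holds the only strong reference.
    view._capsule_wrapper = wrapper
    return view, weakref.ref(wrapper)


def demo() -> Tuple[bool, bool]:
    view, wrapper_ref = make_view("segment-buffer")
    alive_while_view_exists = wrapper_ref() is not None
    del view          # drop the view, and with it the last strong reference
    gc.collect()      # not strictly needed on CPython, kept for clarity
    alive_after_view_deleted = wrapper_ref() is not None
    return alive_while_view_exists, alive_after_view_deleted
```

Under this pattern the caller only needs to keep the tensor alive. Because each call creates a fresh capsule, two tensors built from the same `ptr` each carry their own wrapper.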

Callers 1

Calls 1

create_dlpack_capsule · Function · 0.85

Tested by

no test coverage detected