mamba_conv1d(input: Tensor,
             conv_state_or_ptr: Tensor,
             conv_weight: Tensor,
             conv_bias: Tensor,
             host_request_types: Tensor,
             last_token_ids: Tensor,
             dim: int,
             dconv: int,
             dtype: str,
             pre_stride: int = 0,
             post_stride: int = 0,
             host_context_lengths: Optional[Tensor] = None,
             slot_mapping: Optional[Tensor] = None,
             apply_silu: bool = True)
def mamba_conv1d(input: Tensor,
                 conv_state_or_ptr: Tensor,
                 conv_weight: Tensor,
                 conv_bias: Tensor,
                 host_request_types: Tensor,
                 last_token_ids: Tensor,
                 dim: int,
                 dconv: int,
                 dtype: str,
                 pre_stride: int = 0,
                 post_stride: int = 0,
                 host_context_lengths: Optional[Tensor] = None,
                 slot_mapping: Optional[Tensor] = None,
                 apply_silu: bool = True):
    '''
    Parameters:
        input : Tensor (On GPU)
            The input tensor. Its shape is [batch_size, seq_len, dim] or [num_tokens, dim] for remove_input_padding

        conv_state_or_ptr : Tensor (On GPU or CPU)
            The conv state tensor. Its shape is [batch_size, dconv - 1, dim]
            Or the CPU tensor of shape [1] for the pointer of paged states.

        conv_weight : Tensor (On GPU)
            The weight tensor. Its shape is [1, dconv, dim]

        conv_bias : Tensor (On GPU)
            The bias tensor. Its shape is [dim]

        host_request_types : Tensor (On CPU)
            The tensor on the host that indicates if a request is in context or
            generation phase. Its shape is [batch_size]. See Inflight Batching
            in docs/source/advanced/gpt-attention.md,

        last_token_ids : Tensor (On GPU)
            The inclusive prefix-sum of the lengths or the lengths of the
            sequences in the batch.

        dim : int
            The hidden dimension of conv1d

        dconv : int
            The window size of conv1d

        dtype: str
            data type

        pre_stride : int = 0
            The (pre) stride size of the input tensor.
            The valid values of the input tensor are input[..., pre_stride: dim-post_stride]

        post_stride : int = 0
            The (post) stride size of the input tensor.
            The valid values of the input tensor are input[..., pre_stride: dim-post_stride]

        host_context_lengths: Tensor (On CPU) (Optional)
            A host tensor that contains the lengths of the different inputs,
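The shapes in the docstring (a conv state of `dconv - 1` rows, a weight of `dconv` taps per channel, a bias of `dim` values) are consistent with a per-channel causal convolution over a sliding window. The pure-Python sketch below illustrates that reading for a single sequence. It is not the library's kernel: the helper name `causal_conv1d_reference`, the tap ordering (oldest time step first), and the way the state is updated are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]  # rows are time steps, columns are channels


def silu(x: float) -> float:
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def causal_conv1d_reference(
    x: Matrix,           # [seq_len, dim]
    conv_state: Matrix,  # [dconv - 1, dim], the time steps that precede x
    weight: Matrix,      # [dconv, dim], one tap per window position and channel
    bias: List[float],   # [dim]
    apply_silu: bool = True,
) -> Tuple[Matrix, Matrix]:
    """Hypothetical reference: returns (output [seq_len, dim], new state)."""
    dconv = len(weight)
    dim = len(bias)
    assert len(conv_state) == dconv - 1, "state must hold dconv - 1 time steps"

    # Prepend the cached steps so every output sees a full window of dconv steps.
    padded = conv_state + x

    out: Matrix = []
    for t in range(len(x)):
        row = []
        for c in range(dim):
            acc = bias[c]
            # Assumed tap order: weight[0] multiplies the oldest step in the window.
            for k in range(dconv):
                acc += weight[k][c] * padded[t + k][c]
            row.append(silu(acc) if apply_silu else acc)
        out.append(row)

    # Keep the most recent dconv - 1 steps for the next call.
    new_state = padded[len(padded) - (dconv - 1):]
    return out, new_state
```

Under these assumptions, a context-phase call would pass the whole prompt as `x` with a zeroed state, and a generation-phase call would pass one new time step together with the state returned by the previous call.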
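Two of the parameter descriptions are easier to follow with concrete numbers: the slice `input[..., pre_stride: dim-post_stride]` and the "inclusive prefix-sum of the lengths" form of `last_token_ids`. The sketch below uses plain Python lists. The helper names are hypothetical, and reading `dim` in the slice expression as the width of the row's last axis is an assumption; the source does not say when `last_token_ids` holds prefix sums rather than raw lengths.

```python
from itertools import accumulate
from typing import List


def valid_channels(row: List[float], pre_stride: int, post_stride: int) -> List[float]:
    """Apply the documented slice to one row, taking `dim` to be len(row) (assumption)."""
    return row[pre_stride:len(row) - post_stride]


def inclusive_prefix_sum(lengths: List[int]) -> List[int]:
    """Running total of sequence lengths; entry i is the end offset of sequence i."""
    return list(accumulate(lengths))


# A row of six values with two leading and one trailing value skipped.
print(valid_channels([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], pre_stride=2, post_stride=1))

# Three sequences of lengths 3, 1 and 4 packed back to back.
print(inclusive_prefix_sum([3, 1, 4]))
```

With the default `pre_stride = 0` and `post_stride = 0`, the slice covers the whole row.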