@dataclass
class Sequence:
    """Store information of input sequence.

    Args:
        request_id (int): The ID of the input sequence.
        prompt (str): The prompt of the input sequence.
        input_token_id (List[int]): The token IDs of the input sequence.
        block_size (int): The block size of the input sequence.
        sample_params (SampleParams): The sampling parameters of the input sequence.
        block_table (torch.Tensor): The index of the input sequence in block_table.
        eos_token_id (int): The EOS token ID for this inference process.
        pad_token_id (int): The pad token ID for this inference process.
        max_output_len (int): Maximum output length.
        ignore_eos (bool): Whether to ignore the EOS token and continue generating tokens when encountering it.
        output (str): The output of the sequence.
    """

    request_id: int
    prompt: str
    input_token_id: List[int]
    block_size: int
    sample_params: Any  # SampleParams needs to be imported later.
    eos_token_id: int
    pad_token_id: int
    max_output_len: int = 256
    # NOTE(caidi) This is a temporary solution. It would be better to move the logic that turns this flag on or off into the sampling module in the future.
    ignore_eos: bool = False
    output: str = None

    def __post_init__(self):
        # Per-request state that is not passed to the constructor.
        self.output_token_id = []
        self.status = RequestStatus.WAITING

    @property
    def sentence_len(self) -> int:
        """
        Get the length of the current sentence (input plus generated tokens).
        """
        return len(self.input_token_id) + len(self.output_token_id)

    @property
    def input_len(self) -> int:
        """
        Get the length of the input sentence.
        """
        return len(self.input_token_id)

    @property
    def output_len(self) -> int:
        """
        Get the length of the output sentence.
        """
        return len(self.output_token_id)

    def check_finish(self) -> bool:
        """
        Check whether the inference is finished.
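The length properties above can be exercised with a small standalone sketch. It is not the repository's code: `SequenceSketch` is a hypothetical copy of the fields and properties shown in the excerpt, `RequestStatus` is a stand-in enum containing only the `WAITING` member that the excerpt references, and the token IDs are made-up values. `check_finish` is left out because its body is not visible in the excerpt.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any, List


class RequestStatus(Enum):
    # Stand-in enum (assumption): only WAITING appears in the excerpt.
    WAITING = auto()


@dataclass
class SequenceSketch:
    # Hypothetical mirror of the fields shown in the excerpt.
    request_id: int
    prompt: str
    input_token_id: List[int]
    block_size: int
    sample_params: Any
    eos_token_id: int
    pad_token_id: int
    max_output_len: int = 256
    ignore_eos: bool = False
    output: str = None

    def __post_init__(self):
        # Runs after the generated __init__; sets up per-request state.
        self.output_token_id = []
        self.status = RequestStatus.WAITING

    @property
    def input_len(self) -> int:
        return len(self.input_token_id)

    @property
    def output_len(self) -> int:
        return len(self.output_token_id)

    @property
    def sentence_len(self) -> int:
        # Prompt tokens plus tokens generated so far.
        return self.input_len + self.output_len


# Made-up token IDs, for illustration only.
seq = SequenceSketch(
    request_id=0,
    prompt="hello world",
    input_token_id=[101, 7592, 2088],
    block_size=16,
    sample_params=None,
    eos_token_id=2,
    pad_token_id=0,
)
seq.output_token_id.extend([11, 12])  # pretend two tokens were generated
print(seq.input_len, seq.output_len, seq.sentence_len)
```

Because `output_token_id` and `status` are assigned in `__post_init__` rather than declared as fields, they do not appear in the constructor signature, and every new sequence starts with an empty output list and the `WAITING` status.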