@dataclass
class DetokenizeState:
    """A collection of state for incremental detokenization.

    Args:
        ids_offset: offset into all input ids. In LMDeploy, output ids
            do not necessarily arrive one at a time; the number of new
            ids per step can vary.
        prev_tokens: tokens from previous rounds, used for incremental
            decoding. Defaults to None, which indicates the first round.
        prefix_offset: the start index of the tokens to be converted to
            a string (prev + new tokens). Defaults to 0 for the first round.
        read_offset: the end index of the tokens to be converted to
            a string (prev tokens). Defaults to 0 for the first round.
    """
    ids_offset: int = 0
    prev_tokens: list[str] | None = None
    prefix_offset: int = 0
    read_offset: int = 0

    def as_tuple(self) -> tuple:
        """Return the state fields as a tuple."""
        return (self.ids_offset, self.prev_tokens, self.prefix_offset, self.read_offset)


class HuggingFaceTokenizer:
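The four fields of `DetokenizeState` are easier to follow when watched across several decoding rounds. The following is a minimal sketch, not LMDeploy's actual implementation: `TOY_VOCAB` and `toy_detokenize_incrementally` are hypothetical names invented for illustration, tokens are joined by plain string concatenation, and the state class is re-declared locally so the block runs on its own. It only illustrates how the offsets could advance when a step delivers more than one new id.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DetokenizeState:
    """Local copy of the state class, so this sketch is self-contained."""
    ids_offset: int = 0
    prev_tokens: list[str] | None = None
    prefix_offset: int = 0
    read_offset: int = 0

    def as_tuple(self) -> tuple:
        return (self.ids_offset, self.prev_tokens, self.prefix_offset, self.read_offset)


# Hypothetical toy vocabulary (assumption): token id -> token string.
TOY_VOCAB = {0: "Hel", 1: "lo", 2: " wor", 3: "ld", 4: "!"}


def toy_detokenize_incrementally(all_ids: list[int], state: DetokenizeState) -> str:
    """Return the text for ids not yet consumed and advance ``state`` in place."""
    # Only ids past ids_offset are new in this round; there may be several.
    new_tokens = [TOY_VOCAB[i] for i in all_ids[state.ids_offset:]]
    tokens = (state.prev_tokens or []) + new_tokens

    # Text of the window [prefix_offset, read_offset), produced in an earlier round.
    prefix_text = "".join(tokens[state.prefix_offset:state.read_offset])
    # Text of the same window extended with the new tokens.
    full_text = "".join(tokens[state.prefix_offset:])
    new_text = full_text[len(prefix_text):]

    # Slide the window forward for the next round.
    state.ids_offset = len(all_ids)
    state.prev_tokens = tokens
    state.prefix_offset = state.read_offset
    state.read_offset = len(tokens)
    return new_text


state = DetokenizeState()
pieces = []
# Each round passes all ids generated so far; rounds add 1, 2, and 2 ids.
for ids_so_far in ([0], [0, 1, 2], [0, 1, 2, 3, 4]):
    pieces.append(toy_detokenize_incrementally(ids_so_far, state))
print("".join(pieces))  # Hello world!
```

A real tokenizer would convert the token window to a string with its own decoding routine instead of `"".join`, which is why a prefix window is kept at all: decoding a few preceding tokens together with the new ones lets spacing and multi-token characters resolve correctly before the already-emitted prefix is cut off.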