hub / github.com/InternLM/lmdeploy / DetokenizeState

Class DetokenizeState

lmdeploy/tokenizer.py:16–36

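A state collection for incremental detokenization. It holds four fields: ids_offset, the offset into all input ids (in LMDeploy the output ids do not necessarily arrive one at a time, so each round may add a variable number of ids); prev_tokens, the tokens kept for incremental decoding (None on the first round); prefix_offset, the start index of the tokens to be converted to a string (0 on the first round); and read_offset, the end index of the previously converted tokens (0 on the first round).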

Source from the content-addressed store, hash-verified

from dataclasses import dataclass


@dataclass
class DetokenizeState:
    """A state collection for incremental detokenization.

    Args:
        ids_offset: offset to all input ids. In LMDeploy, the output
            ids length is not one by one. It could be random by random.
        prev_tokens: for incrementally decoding.
            Default to None, which means the first round.
        prefix_offset: the start index of tokens to be converted to
            string (prev + new tokens). Default to 0 for the first round.
        read_offset: the end index of tokens to be converted to
            string (prev token). Default to 0 for the first round.
    """
    ids_offset: int = 0
    prev_tokens: list[str] | None = None
    prefix_offset: int = 0
    read_offset: int = 0

    def as_tuple(self) -> tuple:
        """Return a tuple of states."""
        return (self.ids_offset, self.prev_tokens, self.prefix_offset, self.read_offset)
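Usage sketch

The docstring describes how the four fields carry over between decoding rounds. The following is a minimal sketch of that idea, not LMDeploy's actual decoding code: `DetokenizeState` is re-declared locally so the example runs without installing lmdeploy, and `TOY_VOCAB` and `toy_detokenize_incrementally` are hypothetical names invented for illustration. A real tokenizer would also have to handle tokens that only form valid text once later tokens arrive, which this toy version ignores.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DetokenizeState:
    # Local stand-in mirroring the fields shown in the listing (assumption:
    # used here only so the sketch is self-contained).
    ids_offset: int = 0
    prev_tokens: list[str] | None = None
    prefix_offset: int = 0
    read_offset: int = 0

    def as_tuple(self) -> tuple:
        return (self.ids_offset, self.prev_tokens, self.prefix_offset, self.read_offset)


# Hypothetical toy vocabulary: token id -> token string.
TOY_VOCAB = {0: "Hel", 1: "lo", 2: ",", 3: " wor", 4: "ld"}


def toy_detokenize_incrementally(all_ids: list[int], state: DetokenizeState) -> str:
    """Return only the text for ids that arrived since the previous call."""
    ids_offset, prev_tokens, prefix_offset, read_offset = state.as_tuple()

    # Convert just the ids that are new in this round.
    new_tokens = [TOY_VOCAB[i] for i in all_ids[ids_offset:]]
    tokens = (prev_tokens or []) + new_tokens

    # Text already emitted for the current window vs. text including new tokens.
    prefix_text = "".join(tokens[prefix_offset:read_offset])
    full_text = "".join(tokens[prefix_offset:])

    # Advance the state so the next round starts where this one stopped.
    state.ids_offset = len(all_ids)
    state.prev_tokens = tokens
    state.prefix_offset = read_offset
    state.read_offset = len(tokens)

    return full_text[len(prefix_text):]


# The id list grows by a variable number of ids per round (1, then 2, then 2).
state = DetokenizeState()
chunks = []
for received in ([0], [0, 1, 2], [0, 1, 2, 3, 4]):
    chunks.append(toy_detokenize_incrementally(received, state))
text = "".join(chunks)
```

Because the state is a plain dataclass, `as_tuple()` gives a convenient way to unpack all four values at the start of a round, and assigning the fields back at the end keeps the caller's object up to date between calls.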

Callers 4

generate · Method · 0.90
_inference · Method · 0.90
test_tokenizer · Function · 0.90

Calls

no outgoing calls

Tested by 1

test_tokenizer · Function · 0.72