hub / github.com/InternLM/lmdeploy / DetokenizeState

Class DetokenizeState

lmdeploy/tokenizer.py:16–36

A state collection for incremental detokenization. It carries four fields between decoding rounds: ids_offset, prev_tokens, prefix_offset, and read_offset.

Source

from dataclasses import dataclass


@dataclass
class DetokenizeState:
    """A state collection for incremental detokenization.

    Args:
        ids_offset: offset into all input ids. In LMDeploy, output ids
            do not necessarily arrive one at a time; a round may
            deliver any number of new ids.
        prev_tokens: tokens from previous rounds, used for incremental
            decoding. Defaults to None, which means the first round.
        prefix_offset: the start index of tokens to be converted to
            string (prev + new tokens). Defaults to 0 for the first round.
        read_offset: the end index of tokens to be converted to
            string (prev token). Defaults to 0 for the first round.
    """
    ids_offset: int = 0
    prev_tokens: list[str] | None = None
    prefix_offset: int = 0
    read_offset: int = 0

    def as_tuple(self) -> tuple:
        """Return a tuple of states."""
        return (self.ids_offset, self.prev_tokens, self.prefix_offset, self.read_offset)
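To illustrate how the four fields might be threaded from one decoding round to the next, here is a minimal, self-contained sketch. It is not LMDeploy's implementation: the stand-in dataclass only mirrors the fields shown above, and TOY_VOCAB and toy_detokenize_incrementally are hypothetical names invented for this example. It also assumes tokens can be joined by plain concatenation, which real tokenizers (with byte-level pieces or special tokens) do not guarantee.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DetokenizeState:
    """Stand-in with the same four fields as the class shown above."""
    ids_offset: int = 0
    prev_tokens: Optional[list] = None
    prefix_offset: int = 0
    read_offset: int = 0

    def as_tuple(self) -> tuple:
        return (self.ids_offset, self.prev_tokens, self.prefix_offset, self.read_offset)


# Hypothetical toy vocabulary; a real tokenizer would supply this mapping.
TOY_VOCAB = {0: "Hel", 1: "lo", 2: ",", 3: " wor", 4: "ld", 5: "!"}


def toy_detokenize_incrementally(all_ids, state):
    """Return only the text produced by ids that arrived since the last call."""
    ids_offset, prev_tokens, prefix_offset, read_offset = state.as_tuple()
    # Convert just the unseen ids, then append them to the earlier tokens.
    new_tokens = [TOY_VOCAB[i] for i in all_ids[ids_offset:]]
    tokens = (prev_tokens or []) + new_tokens
    # Text of the previous window [prefix_offset, read_offset).
    prefix_text = "".join(tokens[prefix_offset:read_offset])
    # Text of that window extended with the new tokens.
    full_text = "".join(tokens[prefix_offset:])
    new_text = full_text[len(prefix_text):]
    next_state = DetokenizeState(
        ids_offset=len(all_ids),
        prev_tokens=tokens,
        prefix_offset=read_offset,
        read_offset=len(tokens),
    )
    return new_text, next_state


state = DetokenizeState()
all_ids = []
chunks = []
# Ids arrive in uneven batches, as the docstring notes.
for batch in ([0], [1, 2, 3], [4, 5]):
    all_ids.extend(batch)
    text, state = toy_detokenize_incrementally(all_ids, state)
    chunks.append(text)

print(chunks)           # ['Hel', 'lo, wor', 'ld!']
print("".join(chunks))  # Hello, world!
```

The caller always passes the full list of ids generated so far; ids_offset is what lets each round skip the ids it has already converted, so batches of any size can be handled the same way.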

Callers 4

generate · Method · 0.90

_inference · Method · 0.90

test_tokenizer · Function · 0.90

detokenize_incrementally · Method · 0.85

Calls

no outgoing calls

Tested by 1

test_tokenizer · Function · 0.72