hub / github.com/hpcaitech/ColossalAI / Sequence

Class Sequence

colossalai/inference/struct.py:62–187

Stores information about an input sequence.


@dataclass
class Sequence:
    """Store information of input sequence.

    Args:
        request_id (int): The ID of input sequence.
        prompt (str): The prompt of input sequence.
        input_token_id (List[int]): The tokens ID of input sequence.
        block_size (int): The block size of input sequence.
        sample_params (SampleParams): The sample_params of input sequence.
        block_table (torch.Tensor): The index of input sequence in block_table.
        eos_token_id (int): The eos token id for this inference process.
        pad_token_id (int): The pad token id for this inference process.
        max_output_len (int): Maximum output length.
        ignore_eos(bool): Whether to ignore the EOS token and continue generating tokens when encountering the EOS token.
        output(str): The output of sequence
    """

    request_id: int
    prompt: str
    input_token_id: List[int]
    block_size: int
    sample_params: Any  # SampleParams needs to be imported later.
    eos_token_id: int
    pad_token_id: int
    max_output_len: int = 256
    # NOTE(caidi) This is a temporary solution. It's better to move the logic to turn on or off the flag in sampling module in future.
    ignore_eos: bool = False
    output: str = None

    def __post_init__(self):
        self.output_token_id = []
        self.status = RequestStatus.WAITING

    @property
    def sentence_len(self) -> int:
        """
        Get length of current sentence.
        """
        return len(self.input_token_id) + len(self.output_token_id)

    @property
    def input_len(self) -> int:
        """
        Get length of input sentence.
        """
        return len(self.input_token_id)

    @property
    def output_len(self) -> int:
        """
        Get length of output sentence.
        """
        return len(self.output_token_id)

    def check_finish(self) -> bool:
        """
        Check whether the inference is finished.

Callers 6

add_request · Method · 0.90

test_bucket · Function · 0.90

check_config_and_inference · Function · 0.90

check_running_list · Function · 0.90

check_request_handler · Function · 0.90

test_request_tracer · Function · 0.90

Calls

no outgoing calls

Tested by 5

test_bucket · Function · 0.72

check_config_and_inference · Function · 0.72

check_running_list · Function · 0.72

check_request_handler · Function · 0.72

test_request_tracer · Function · 0.72
