hub / github.com/hpcaitech/ColossalAI / Sequence

Class Sequence

colossalai/inference/struct.py:62–187

Stores information about an input sequence: its request ID, prompt, input token IDs, block size, sampling parameters, EOS and pad token IDs, maximum output length, and the generated output.

Source

@dataclass
class Sequence:
    """Store information of input sequence.

    Args:
        request_id (int): The ID of input sequence.
        prompt (str): The prompt of input sequence.
        input_token_id (List[int]): The tokens ID of input sequence.
        block_size (int): The block size of input sequence.
        sample_params (SampleParams): The sample_params of input sequence.
        block_table (torch.Tensor): The index of input sequence in block_table.
        eos_token_id (int): The eos token id for this inference process.
        pad_token_id (int): The pad token id for this inference process.
        max_output_len (int): Maximum output length.
        ignore_eos(bool): Whether to ignore the EOS token and continue generating tokens when encountering the EOS token.
        output(str): The output of sequence
    """

    request_id: int
    prompt: str
    input_token_id: List[int]
    block_size: int
    sample_params: Any  # SampleParams needs to be imported later.
    eos_token_id: int
    pad_token_id: int
    max_output_len: int = 256
    # NOTE(caidi) This is a temporary solution. It's better to move the logic to turn on or off the flag in sampling module in future.
    ignore_eos: bool = False
    output: str = None

    def __post_init__(self):
        self.output_token_id = []
        self.status = RequestStatus.WAITING

    @property
    def sentence_len(self) -> int:
        """
        Get length of current sentence.
        """
        return len(self.input_token_id) + len(self.output_token_id)

    @property
    def input_len(self) -> int:
        """
        Get length of input sentence.
        """
        return len(self.input_token_id)

    @property
    def output_len(self) -> int:
        """
        Get length of output sentence.
        """
        return len(self.output_token_id)

    def check_finish(self) -> bool:
        """
        Check whether the inference is finished.
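
The length properties in the listing can be tried without installing ColossalAI. The following is a minimal sketch, not the library's class: `SequenceSketch` and the one-member `RequestStatus` enum are hypothetical stand-ins that copy only the fields and properties visible above, and the token IDs are made-up values. `check_finish` is left out because its body is not shown in the listing.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any, List, Optional


class RequestStatus(Enum):
    # Hypothetical stand-in: the listing only confirms that WAITING exists.
    WAITING = auto()


@dataclass
class SequenceSketch:
    # Assumption: mirrors the fields shown in the listing; not the real class.
    request_id: int
    prompt: str
    input_token_id: List[int]
    block_size: int
    sample_params: Any
    eos_token_id: int
    pad_token_id: int
    max_output_len: int = 256
    ignore_eos: bool = False
    output: Optional[str] = None

    def __post_init__(self):
        # Per-instance state set after the generated __init__ runs.
        self.output_token_id = []
        self.status = RequestStatus.WAITING

    @property
    def input_len(self) -> int:
        return len(self.input_token_id)

    @property
    def output_len(self) -> int:
        return len(self.output_token_id)

    @property
    def sentence_len(self) -> int:
        # Prompt tokens plus tokens generated so far.
        return self.input_len + self.output_len


seq = SequenceSketch(
    request_id=0,
    prompt="hello",
    input_token_id=[1, 2, 3],  # made-up token IDs
    block_size=16,
    sample_params=None,
    eos_token_id=2,
    pad_token_id=0,
)

# Simulate two decoding steps by appending generated token IDs.
seq.output_token_id.extend([7, 8])
print(seq.input_len, seq.output_len, seq.sentence_len)  # 3 2 5
```

Because `output_token_id` and `status` are assigned in `__post_init__` rather than declared as dataclass fields, they are not constructor arguments and every new sequence starts with an empty output list in the waiting state.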

Callers 6

add_request · Method · 0.90
test_bucket · Function · 0.90
check_running_list · Function · 0.90
check_request_handler · Function · 0.90
test_request_tracer · Function · 0.90

Calls

no outgoing calls

Tested by 5

test_bucket · Function · 0.72
check_running_list · Function · 0.72
check_request_handler · Function · 0.72
test_request_tracer · Function · 0.72
