hub / github.com/deepspeedai/DeepSpeedExamples / extend

Method extend

Megatron-LM/data_utils/tokenization.py:115–130
(self, other)

Source from the content-addressed store, hash-verified

        return self

    def extend(self, other):
        if isinstance(other, (CommandToken, TypeToken)):
            self.tokenization.append(other.Id)
            self.text += other.token
            self.original_text += other.token
        elif isinstance(other, list) and isinstance(other[0], (CommandToken, TypeToken)):
            self.tokenization.extend([o.Id for o in other])
            self.text += [o.token for o in other]
            self.original_text += [o.token for o in other]
        elif isinstance(other, Tokenization):
            self.tokenization.extend(other.tokenization)
            self.text += other.text
            self.original_text += other.original_text
        else:
            self.tokenization.extend(other)
        return self

"""define some default command tokens for the tokenizer to use"""
token_format = "<{0}>"
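To make the four branches of extend concrete, here is a minimal sketch of how the method dispatches on the type of `other`. The `CommandToken`, `TypeToken`, and `Tokenization` classes below are hypothetical stand-ins that carry only the attributes extend reads (`Id`, `token`, `tokenization`, `text`, `original_text`). Their constructors are assumptions made for illustration, not the definitions in tokenization.py, and `text` is assumed to be a plain string. The body of extend is copied from the listing above.

```python
class CommandToken:
    """Hypothetical stand-in: only the attributes extend() reads."""
    def __init__(self, name, token, Id):
        self.name = name
        self.token = token
        self.Id = Id


class TypeToken:
    """Hypothetical stand-in with the same shape as CommandToken."""
    def __init__(self, name, token, Id):
        self.name = name
        self.token = token
        self.Id = Id


class Tokenization:
    """Hypothetical stand-in; the constructor signature is an assumption."""
    def __init__(self, tokenization, text="", original_text=""):
        self.tokenization = list(tokenization)
        self.text = text
        self.original_text = original_text

    def extend(self, other):
        # Body copied from the listing above.
        if isinstance(other, (CommandToken, TypeToken)):
            self.tokenization.append(other.Id)
            self.text += other.token
            self.original_text += other.token
        elif isinstance(other, list) and isinstance(other[0], (CommandToken, TypeToken)):
            self.tokenization.extend([o.Id for o in other])
            self.text += [o.token for o in other]
            self.original_text += [o.token for o in other]
        elif isinstance(other, Tokenization):
            self.tokenization.extend(other.tokenization)
            self.text += other.text
            self.original_text += other.original_text
        else:
            self.tokenization.extend(other)
        return self


eos = CommandToken("eos", "<eos>", 2)
tok = Tokenization([10, 11], text="hello", original_text="hello")

# Branch 1: a single token appends its Id and concatenates its string.
tok.extend(eos)
# Branch 3: another Tokenization merges ids, text, and original_text.
tok.extend(Tokenization([12], text=" world", original_text=" world"))
# Branch 4: any other iterable of ids only grows the id list.
tok.extend([13, 14])

print(tok.tokenization)  # [10, 11, 2, 12, 13, 14]
print(tok.text)          # hello<eos> world
```

Because extend returns `self`, the three calls could also be chained. Two properties of the listed code show up under these assumptions: the list-of-tokens branch concatenates a list onto `text`, which only succeeds when `text` is itself a list, and an empty list reaches `other[0]` and raises `IndexError`.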

Callers 15

get_train_dataset · Function · 0.80
tokenize · Method · 0.80
tokenize · Method · 0.80
__init__ · Method · 0.80
__init__ · Method · 0.80
__init__ · Method · 0.80
__init__ · Method · 0.80
__getitem__ · Method · 0.80
tokenize · Method · 0.80
tokenize · Method · 0.80
generate_samples · Function · 0.80

Calls 1

append · Method · 0.80

Tested by

no test coverage detected