hub / github.com/zai-org/CogView / EncodeAsIds

Method EncodeAsIds

data_utils/unified_tokenizer.py:85–90
(self, text, process_fn=None)

Source from the content-addressed store, hash-verified

83         return self.EncodeAsIds(inputs, process_fn=process_fn)
84
85     def EncodeAsIds(self, text, process_fn=None):
86         processed_text = text
87         if process_fn is not None:
88             processed_text = process_fn(processed_text)
89         ids = self.txt_tokenizer.encode(processed_text)
90         return [x + self.img_tokenizer.num_tokens for x in ids]
91
92     def DecodeIds(self, ids):
93         ret, img_buffer, txt_buffer, ret_imgs = [], [], [], []
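
The listing shows EncodeAsIds optionally preprocessing the text, encoding it with the text tokenizer, and then shifting every ID by img_tokenizer.num_tokens, presumably so that text IDs and image IDs can share one ID space without colliding. The following is a minimal sketch of that behaviour, not the repository's code: StubTextTokenizer, StubImageTokenizer, UnifiedTokenizerSketch, and the value 8192 are hypothetical stand-ins chosen only for illustration.

```python
class StubTextTokenizer:
    """Hypothetical stand-in for txt_tokenizer: one ID per character (its code point)."""

    def encode(self, text):
        return [ord(ch) for ch in text]


class StubImageTokenizer:
    """Hypothetical stand-in for img_tokenizer; only num_tokens is used here."""

    num_tokens = 8192  # assumed value, for illustration only


class UnifiedTokenizerSketch:
    """Reproduces only the EncodeAsIds logic shown in the listing."""

    def __init__(self, txt_tokenizer, img_tokenizer):
        self.txt_tokenizer = txt_tokenizer
        self.img_tokenizer = img_tokenizer

    def EncodeAsIds(self, text, process_fn=None):
        processed_text = text
        # Optional preprocessing hook, applied before tokenization.
        if process_fn is not None:
            processed_text = process_fn(processed_text)
        ids = self.txt_tokenizer.encode(processed_text)
        # Shift every text ID by the number of image tokens.
        return [x + self.img_tokenizer.num_tokens for x in ids]


tokenizer = UnifiedTokenizerSketch(StubTextTokenizer(), StubImageTokenizer())
plain = tokenizer.EncodeAsIds("Ab")
lowered = tokenizer.EncodeAsIds("Ab", process_fn=str.lower)
print(plain)    # [8257, 8290]
print(lowered)  # [8289, 8290]
```

With these stubs, every returned ID is at least num_tokens, and passing process_fn=str.lower changes only the IDs of characters that the preprocessing step alters.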

Callers 2

__call__ · Method · 0.95
parse_query · Method · 0.95

Calls 2

process_fn · Function · 0.85
encode · Method · 0.45

Tested by

no test coverage detected