hub / github.com/NVIDIA/Megatron-LM / tokenize

Method tokenize

tools/preprocess_data.py:41–42
(self, *text)

Source from the content-addressed store, hash-verified

39
40  class IdentitySplitter(object):
41      def tokenize(self, *text):
42          return text
43
44
45  class Encoder(object):
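Because `tokenize` is declared with `*text`, Python collects the positional arguments into a tuple and the method returns that tuple unchanged. A single document therefore comes back as a one-element tuple rather than as a list of sentences. The sketch below illustrates this. `IdentitySplitter` is copied from the listing above, while `PeriodSplitter` and `split_document` are hypothetical helpers written only for this illustration and are not part of Megatron-LM.

```python
class IdentitySplitter(object):
    """Copy of the class from the listing: returns its arguments untouched."""

    def tokenize(self, *text):
        # *text gathers the positional arguments into a tuple.
        return text


class PeriodSplitter(object):
    """Hypothetical naive splitter, used here only for contrast."""

    def tokenize(self, text):
        # Split on periods and drop empty fragments.
        return [s.strip() + "." for s in text.split(".") if s.strip()]


def split_document(splitter, document):
    """Hypothetical caller: treats each item from tokenize() as one sentence."""
    return list(splitter.tokenize(document))


doc = "First sentence. Second sentence."

identity_result = split_document(IdentitySplitter(), doc)
period_result = split_document(PeriodSplitter(), doc)

print(identity_result)  # ['First sentence. Second sentence.']
print(period_result)    # ['First sentence.', 'Second sentence.']
```

Since both splitters return an iterable of strings, a caller written like `split_document` can use either one without branching. With `IdentitySplitter`, the whole document is handled as a single unit.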

Callers 6

encode · Method · 0.45
split · Method · 0.45
encode · Method · 0.45
__getitem__ · Method · 0.45
text_to_bert · Method · 0.45
build_partial_db · Function · 0.45

Calls

no outgoing calls

Tested by

no test coverage detected