github.com/NVIDIA/Megatron-LM
Method: tokenize(self, *text)
tools/preprocess_data.py:41–42
Source
from the content-addressed store, hash-verified
39
40  class IdentitySplitter(object):
41      def tokenize(self, *text):
42          return text
43
44
45  class Encoder(object):
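A minimal sketch of what the listing above implies: because `*text` collects positional arguments, `tokenize` hands them back unchanged as a tuple, with no sentence splitting. The class body is copied from the listing; the sample strings and variable names are illustrative assumptions, not taken from the repository.

```python
class IdentitySplitter(object):
    def tokenize(self, *text):
        # *text packs all positional arguments into a tuple,
        # which is returned as-is (no splitting is performed).
        return text


splitter = IdentitySplitter()

# A single document comes back as a one-element tuple.
single = splitter.tokenize("First sentence. Second sentence.")

# Several arguments come back in the same order.
multi = splitter.tokenize("a", "b")

# No arguments yields an empty tuple.
empty = splitter.tokenize()

print(single)
print(multi)
print(empty)
```

Callers that iterate over the result therefore see each input string once, as a whole, rather than a list of sentences.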
Callers (6)
encode · Method · 0.45
split · Method · 0.45
encode · Method · 0.45
__getitem__ · Method · 0.45
text_to_bert · Method · 0.45
build_partial_db · Function · 0.45
Calls
no outgoing calls
Tested by
no test coverage detected