Function load_vocab

LanguageNetwork/GPT2/scripts/tokenization.py:123–135

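Loads a vocabulary file into an ordered dictionary that maps each token (one per line, with surrounding whitespace stripped) to its zero-based line index.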

(vocab_file)

Source from the content-addressed store, hash-verified

def load_vocab(vocab_file):
    """Loads a vocabulary file into a dictionary."""
    vocab = collections.OrderedDict()
    index = 0
    with tf.io.gfile.GFile(vocab_file, "r") as reader:
        while True:
            token = convert_to_unicode(reader.readline())
            if not token:
                break
            token = token.strip()
            vocab[token] = index
            index += 1
    return vocab
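To try the token-to-index mapping without TensorFlow, the sketch below reproduces the same loop with the standard library only. It is an illustration, not the repository's code: `load_vocab_plain` and `demo_vocab` are hypothetical names, the built-in `open` with UTF-8 decoding stands in for `tf.io.gfile.GFile` plus `convert_to_unicode`, and the vocabulary contents are made up for the demo.

```python
import collections
import os
import tempfile


def load_vocab_plain(vocab_file):
    """Map each stripped line of vocab_file to its zero-based line index.

    Hypothetical stand-in for load_vocab: assumes a local UTF-8 text file,
    so the built-in open() replaces tf.io.gfile.GFile and convert_to_unicode.
    """
    vocab = collections.OrderedDict()
    with open(vocab_file, "r", encoding="utf-8") as reader:
        # enumerate() supplies the same running index as the manual counter.
        for index, line in enumerate(reader):
            vocab[line.strip()] = index
    return vocab


# Usage: write a small made-up vocabulary file and load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "vocab.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("[PAD]\n[UNK]\nhello\nworld\n")
    demo_vocab = load_vocab_plain(path)

print(demo_vocab["hello"])  # 2
```

As in the original loop, a token that appears on more than one line keeps the index of its last occurrence, and a blank line is stored as the empty-string token.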

__init__ · Method · 0.70

__init__ · Method · 0.50

convert_to_unicode · Function · 0.70

no test coverage detected