Function whitespace_tokenize

tokenizations/tokenization_bert_word_level.py:80–86 · view source on GitHub ↗

Runs basic whitespace cleaning and splitting on a piece of text.

(text)

Source from the content-addressed store, hash-verified

78
79
80	def whitespace_tokenize(text):
81	"""Runs basic whitespace cleaning and splitting on a piece of text."""
82	text = text.strip()
83	if not text:
84	return []
85	tokens = text.split()
86	return tokens
87
88
89	class BertTokenizer(PreTrainedTokenizer):

tokenizeMethod · 0.70

no outgoing calls

no test coverage detected