Method token_counter

llmware/models.py:6130–6140

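Returns a fast, approximate token count for text_sample using the default GPT2 tokenizer. The count is exact only when the model doing the decoding uses that same tokenizer.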

(self, text_sample)

Source from the content-addressed store, hash-verified

def token_counter(self, text_sample):

    """ Uses default GPT2 tokenizer for fast, approximate token count, if needed. """

    # note: this is an approximation for counting the input tokens using a default tokenizer
    # --to get 100% accurate, need to use the tokenizer being applied on the 'ollama' decoding

    tokenizer = Utilities().get_default_tokenizer()
    toks = tokenizer.encode(text_sample).ids

    return len(toks)
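The encode, read `.ids`, take the length pattern used by token_counter can be tried without llmware installed. The sketch below is illustrative only: `WhitespaceStubTokenizer`, `StubEncoding`, and `approximate_token_count` are hypothetical names that do not exist in llmware, and the stub splits on whitespace instead of applying GPT2 byte-pair encoding, so its counts will generally differ from what the real method returns.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StubEncoding:
    """Hypothetical stand-in for an encoding result; exposes only `.ids`."""
    ids: List[int]


class WhitespaceStubTokenizer:
    """Hypothetical stand-in tokenizer: one token per whitespace-separated word.

    Assumption: the real default tokenizer returns an object with an `.ids`
    list from `encode()`, as the source above relies on. A GPT2 tokenizer
    splits text into subword units, so real counts are usually higher.
    """

    def __init__(self) -> None:
        self._vocab: Dict[str, int] = {}

    def encode(self, text: str) -> StubEncoding:
        # Assign each previously unseen word the next free integer id.
        ids = [self._vocab.setdefault(word, len(self._vocab)) for word in text.split()]
        return StubEncoding(ids=ids)


def approximate_token_count(text_sample: str, tokenizer) -> int:
    """Same shape as token_counter: encode the text, then count the ids."""
    return len(tokenizer.encode(text_sample).ids)


if __name__ == "__main__":
    stub = WhitespaceStubTokenizer()
    print(approximate_token_count("the quick brown fox", stub))
```

Passing the tokenizer in as an argument, instead of constructing it inside the function as the source does, is a choice made here only so the sketch can run with a stub. Any object whose `encode()` result exposes `.ids` could be substituted.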

inference · Method · 0.95

Utilities · Class · 0.90

get_default_tokenizer · Method · 0.80

encode · Method · 0.80

no test coverage detected