hub / github.com/jaymody/picoGPT / encode

Method encode

encoder.py:101–106
(self, text)

Source from the content-addressed store, hash-verified

 99          return word
100
101      def encode(self, text):
102          bpe_tokens = []
103          for token in re.findall(self.pat, text):
104              token = "".join(self.byte_encoder[b] for b in token.encode("utf-8"))
105              bpe_tokens.extend(self.encoder[bpe_token] for bpe_token in self.bpe(token).split(" "))
106          return bpe_tokens
107
108      def decode(self, tokens):
109          text = "".join([self.decoder[token] for token in tokens])
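
The listing above runs four steps per chunk of text: split with a regex, map each UTF-8 byte to a printable character, merge symbols with `bpe`, and look up each resulting symbol in the vocabulary. The sketch below reproduces that flow in a self-contained form. It is not the repository's implementation: `ToyEncoder`, the simplified split pattern, the byte table, and the no-merge `bpe` are all hypothetical stand-ins, and the byte-decoding step in `decode` is an assumption, since the listing shows only the first line of that method.

```python
import re

# Hypothetical byte table: maps every byte value to a distinct printable
# character. The real table is built elsewhere in encoder.py and is not shown.
BYTE_ENCODER = {b: chr(0x100 + b) for b in range(256)}
BYTE_DECODER = {c: b for b, c in BYTE_ENCODER.items()}


class ToyEncoder:
    """Minimal stand-in for the class that owns encode(); names are illustrative."""

    def __init__(self):
        # Simplified split pattern (assumption): a word or a punctuation run,
        # each with an optional leading space, or a run of whitespace.
        self.pat = re.compile(r" ?\w+| ?[^\w\s]+|\s+")
        self.byte_encoder = BYTE_ENCODER
        # Toy vocabulary: one entry per mapped byte, ID equal to the byte value.
        self.encoder = {c: b for b, c in BYTE_ENCODER.items()}
        self.decoder = {i: s for s, i in self.encoder.items()}

    def bpe(self, token):
        # Toy version: performs no merges, so every symbol stays separate.
        # A real BPE step would repeatedly merge the best-ranked adjacent pair.
        return " ".join(token)

    def encode(self, text):
        # Same structure as the encode method in the listing.
        bpe_tokens = []
        for token in re.findall(self.pat, text):
            token = "".join(self.byte_encoder[b] for b in token.encode("utf-8"))
            bpe_tokens.extend(self.encoder[t] for t in self.bpe(token).split(" "))
        return bpe_tokens

    def decode(self, tokens):
        # First line mirrors the listing; the byte-decoding step is assumed.
        text = "".join(self.decoder[token] for token in tokens)
        return bytes(BYTE_DECODER[c] for c in text).decode("utf-8")


enc = ToyEncoder()
ids = enc.encode("hi é")
print(ids)              # [104, 105, 32, 195, 169]
print(enc.decode(ids))  # hi é
```

Because the toy `bpe` never merges, each ID here is just a UTF-8 byte value, which makes the byte-level nature of the scheme visible: the two-byte character `é` becomes two tokens. With real merge rules, frequent byte sequences would collapse into single vocabulary entries and the ID list would be shorter.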

Callers 2

main · Function · 0.80
main · Function · 0.80

Calls 1

bpe · Method · 0.95

Tested by

no test coverage detected