MCPcopy
hub / github.com/MemPalace/mempalace / _tokenize

Function _tokenize

mempalace/searcher.py:63–72  ·  view source on GitHub ↗

Lowercase + strip to alphanumeric tokens of length ≥ 2. Tolerates ``None`` documents — Chroma can return ``None`` in the ``documents`` field for drawers without text content, which would otherwise raise ``AttributeError`` mid-rerank.

(text: str)

Source from the content-addressed store, hash-verified

61
62
63def _tokenize(text: str) -> list:
64 """Lowercase + strip to alphanumeric tokens of length ≥ 2.
65
66 Tolerates ``None`` documents — Chroma can return ``None`` in the
67 ``documents`` field for drawers without text content, which would
68 otherwise raise ``AttributeError`` mid-rerank.
69 """
70 if not text:
71 return []
72 return _TOKEN_RE.findall(text.lower())
73
74
75def _bm25_scores(

Callers 5

_bm25_scoresFunction · 0.70
_bm25_only_via_sqliteFunction · 0.70
search_memoriesFunction · 0.70

Calls

no outgoing calls