hub / github.com/MaartenGr/BERTopic / FastEmbedBackend

Class FastEmbedBackend

bertopic/backend/_fastembed.py:8–54  ·  view source on GitHub ↗

FastEmbed embedding model, used for generating sentence embeddings. Arguments: embedding_model: the name of a supported FastEmbed model, given as a string. To create a model, pass a string pointing to a supported FastEmbed model.


import numpy as np
from typing import List
from fastembed import TextEmbedding

from bertopic.backend import BaseEmbedder


class FastEmbedBackend(BaseEmbedder):
    """FastEmbed embedding model.

    The FastEmbed embedding model used for generating sentence embeddings.

    Arguments:
        embedding_model: The name of a supported FastEmbed model, as a string

    Examples:
    To create a model, you can load in a string pointing to a supported
    FastEmbed model:

    ```python
    from bertopic.backend import FastEmbedBackend

    sentence_model = FastEmbedBackend("BAAI/bge-small-en-v1.5")
    ```
    """

    def __init__(self, embedding_model: str = "BAAI/bge-small-en-v1.5"):
        super().__init__()

        # Names of every model that FastEmbed's TextEmbedding can load
        supported_models = [m["model"] for m in TextEmbedding.list_supported_models()]

        # Only a string that names a supported model is accepted
        if isinstance(embedding_model, str) and embedding_model in supported_models:
            self.embedding_model = TextEmbedding(model_name=embedding_model)
        else:
            raise ValueError(
                "Please select a correct FastEmbed model: \n"
                "the model must be a string and must be supported. \n"
                "The supported TextEmbedding model list is here: https://qdrant.github.io/fastembed/examples/Supported_Models/"
            )

    def embed(self, documents: List[str], verbose: bool = False) -> np.ndarray:
        """Embed a list of n documents/words into an n-dimensional
        matrix of embeddings.

        Arguments:
            documents: A list of documents or words to be embedded
            verbose: Controls the verbosity of the process

        Returns:
            Document/words embeddings with shape (n, m) with `n` documents/words
            that each have an embeddings size of `m`
        """
        # The model's embed() output is collected into a list, then stacked into one array
        embeddings = np.array(list(self.embedding_model.embed(documents, show_progress_bar=verbose)))
        return embeddings
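The two steps in the class above (validate the model name against the supported list, then collect the per-document vectors into an `(n, m)` table) can be sketched without installing FastEmbed. This is a minimal illustration under stated assumptions, not the library's implementation: `StubTextEmbedding`, `load_model`, and `embed_documents` are hypothetical names, the 4-dimensional vectors are made up, and plain lists stand in for the `np.ndarray` that the real backend returns.

```python
from typing import Dict, Iterable, Iterator, List


class StubTextEmbedding:
    """Hypothetical stand-in for fastembed.TextEmbedding (not the real class)."""

    @staticmethod
    def list_supported_models() -> List[Dict[str, str]]:
        # The backend above only reads the "model" key of each entry.
        return [{"model": "BAAI/bge-small-en-v1.5"}]

    def __init__(self, model_name: str, dim: int = 4):
        self.model_name = model_name
        self.dim = dim  # tiny made-up dimension to keep the sketch readable

    def embed(self, documents: Iterable[str]) -> Iterator[List[float]]:
        # Yield one vector per document lazily; values are fake (document length).
        for doc in documents:
            yield [float(len(doc))] * self.dim


def load_model(name: object) -> StubTextEmbedding:
    """Mirror the __init__ check: accept only a string found in the supported list."""
    supported = [m["model"] for m in StubTextEmbedding.list_supported_models()]
    if isinstance(name, str) and name in supported:
        return StubTextEmbedding(model_name=name)
    raise ValueError(f"Unsupported model: {name!r}")


def embed_documents(model: StubTextEmbedding, documents: List[str]) -> List[List[float]]:
    # Materialize the generator into n rows of m values; the real backend
    # wraps this list in np.array(...) to obtain an array of shape (n, m).
    return list(model.embed(documents))


model = load_model("BAAI/bge-small-en-v1.5")
vectors = embed_documents(model, ["first document", "second"])
shape = (len(vectors), len(vectors[0]))

# An unknown name (or a non-string) is rejected with ValueError.
try:
    load_model("not-a-real-model")
    rejected = False
except ValueError:
    rejected = True
```

With the real backend, the instance is typically handed to BERTopic through its `embedding_model` argument, for example `BERTopic(embedding_model=sentence_model)`.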

Callers 1

select_backend · Function · 0.90

Calls

no outgoing calls

Tested by

no test coverage detected