hub / github.com/MaartenGr/BERTopic / save

Method save

bertopic/_bertopic.py:3416–3531

Saves the model to the specified path or folder. When saving the model, make sure to also keep track of the versions of dependencies and Python used. Loading and saving the model should be done using the same dependencies and Python. Moreover, models saved in one version of BERTopic should not be loaded in other versions.
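The advice to keep track of dependency and Python versions can be followed by writing a small version record next to the saved model. The sketch below is one possible way to do that using only the standard library; the helper names `collect_versions` and `write_versions`, the JSON layout, and the file name in the usage note are assumptions for illustration and are not part of BERTopic.

```python
import json
import platform
from importlib import metadata
from pathlib import Path


def collect_versions(packages):
    """Return the Python version plus the installed version of each package."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


def write_versions(target, packages):
    """Write the version record as JSON to `target` and return it."""
    record = collect_versions(packages)
    Path(target).write_text(json.dumps(record, indent=2, sort_keys=True))
    return record
```

A call such as `write_versions("my_model.versions.json", ["bertopic", "umap-learn", "hdbscan"])` made right after `topic_model.save("my_model")` would give the loading environment something to compare against. Which packages to list is a judgment call that depends on the model's configuration.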

(
        self,
        path,
        serialization: Literal["safetensors", "pickle", "pytorch"] = "pickle",
        save_embedding_model: Union[bool, str] = True,
        save_ctfidf: bool = False,
    )
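According to the docstring of `save`, `path` is a directory when `serialization` is `safetensors` or `pytorch`, and a file when it is `pickle`. A caller-side check of that rule could look like the sketch below. The helper `prepare_save_path` is hypothetical and does not reflect how BERTopic validates paths internally; it only inspects the filesystem and creates nothing.

```python
from pathlib import Path

# Formats that the docstring describes as writing to a directory.
DIRECTORY_FORMATS = {"safetensors", "pytorch"}


def prepare_save_path(path, serialization="pickle"):
    """Return (Path, kind), where kind is "directory" or "file".

    Hypothetical helper: raises ValueError when an existing filesystem
    entry at `path` conflicts with the chosen serialization.
    """
    target = Path(path)
    if serialization in DIRECTORY_FORMATS:
        if target.is_file():
            raise ValueError(f"{target} is a file, but {serialization} needs a directory")
        return target, "directory"
    if serialization == "pickle":
        if target.is_dir():
            raise ValueError(f"{target} is a directory, but pickle needs a file")
        return target, "file"
    raise ValueError(f"unknown serialization: {serialization!r}")
```

Running such a check before calling `topic_model.save(...)` turns a path mix-up into an explicit error message that names the conflicting path and format.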

Source from the content-addressed store, hash-verified

def save(
    self,
    path,
    serialization: Literal["safetensors", "pickle", "pytorch"] = "pickle",
    save_embedding_model: Union[bool, str] = True,
    save_ctfidf: bool = False,
):
    """Saves the model to the specified path or folder.

    When saving the model, make sure to also keep track of the versions
    of dependencies and Python used. Loading and saving the model should
    be done using the same dependencies and Python. Moreover, models
    saved in one version of BERTopic should not be loaded in other versions.

    Arguments:
        path: If `serialization` is `safetensors` or `pytorch`, this is a directory.
              If `serialization` is `pickle`, then this is a file.
        serialization: If `pickle`, the entire model will be pickled. If `safetensors`
                       or `pytorch` the model will be saved without the embedding,
                       dimensionality reduction, and clustering algorithms.
                       This is a very efficient format and typically advised.
        save_embedding_model: If serialization is `pickle`, then you can choose to skip
                              saving the embedding model. If serialization is `safetensors`
                              or `pytorch`, this variable can be used as a string pointing
                              towards a huggingface model.
        save_ctfidf: Whether to save c-TF-IDF information if serialization is `safetensors`
                     or `pytorch`

    Examples:
    To save the model in an efficient and safe format (safetensors) with c-TF-IDF information:

    ```python
    topic_model.save("model_dir", serialization="safetensors", save_ctfidf=True)
    ```

    If you wish to also add a pointer to the embedding model, which will be downloaded from
    HuggingFace upon loading:

    ```python
    embedding_model = "sentence-transformers/all-MiniLM-L6-v2"
    topic_model.save("model_dir", serialization="safetensors", save_embedding_model=embedding_model)
    ```

    or if you want to save the full model with pickle:

    ```python
    topic_model.save("my_model")
    ```

    NOTE: Pickle can run arbitrary code and is generally considered to be less safe than
    safetensors.
    """
    if serialization == "pickle":
        logger.warning(
            "When you use `pickle` to save/load a BERTopic model, "
            "please make sure that the environments in which you save "
            "and load the model are **exactly** the same. The version of BERTopic, "
            "its dependencies, and python need to remain the same."
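The docstring's note that pickle can run arbitrary code can be illustrated without BERTopic at all. In the minimal sketch below, the class `Payload` is a made-up stand-in for a tampered file: its `__reduce__` method tells pickle which callable to invoke at load time, so loading the bytes evaluates an expression instead of rebuilding the object. The expression here is deliberately harmless.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle form runs code when it is loaded."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it on load.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading calls eval("6 * 7"); the result is not a Payload instance.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

Anything reachable through a callable could take the place of the harmless expression, which is why pickled models should only be loaded from trusted sources.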

Callers 7

test_load_save_model · Function · 0.95
test_full_model · Function · 0.80
push_to_hf_hub · Function · 0.80
save_hf · Function · 0.80
save_ctfidf · Function · 0.80
save_images · Function · 0.80
merge_models · Method · 0.80

Calls 1

warning · Method · 0.80

Tested by 2

test_load_save_model · Function · 0.76
test_full_model · Function · 0.64