hub / github.com/MaartenGr/BERTopic / _save_representative_docs

Method _save_representative_docs

bertopic/_bertopic.py:4217–4233

Save the 3 most representative docs per topic.

Arguments:
    documents: Dataframe with documents and their corresponding IDs

Updates:
    self.representative_docs_: Populate each topic with 3 representative docs

(self, documents: pd.DataFrame)

Source from the content-addressed store, hash-verified

4215         logger.info("Representation - Completed \u2713")
4216
4217     def _save_representative_docs(self, documents: pd.DataFrame):
4218         """Save the 3 most representative docs per topic.
4219
4220         Arguments:
4221             documents: Dataframe with documents and their corresponding IDs
4222
4223         Updates:
4224             self.representative_docs_: Populate each topic with 3 representative docs
4225         """
4226         repr_docs, _, _, _ = self._extract_representative_docs(
4227             self.c_tf_idf_,
4228             documents,
4229             self.topic_representations_,
4230             nr_samples=500,
4231             nr_repr_docs=3,
4232         )
4233         self.representative_docs_ = repr_docs
4234
4235     def _extract_representative_docs(
4236         self,
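
The listing shows only that `_extract_representative_docs` receives `self.c_tf_idf_`, the documents, and the topic representations, with `nr_samples=500` and `nr_repr_docs=3`; its body is not shown here. The following is a minimal, standalone sketch of the general idea under stated assumptions, not BERTopic's implementation: it assumes candidates are capped per topic and then ranked by cosine similarity to a topic vector, and it uses raw term counts as a stand-in for c-TF-IDF weights. The names `cosine` and `pick_representative_docs` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_representative_docs(docs, topics, nr_samples=500, nr_repr_docs=3, seed=0):
    """Hypothetical helper: return {topic: [up to nr_repr_docs documents]}."""
    rng = random.Random(seed)
    by_topic = defaultdict(list)
    for doc, topic in zip(docs, topics):
        by_topic[topic].append(doc)

    result = {}
    for topic, members in by_topic.items():
        # Cap the number of candidates per topic (the assumed role of nr_samples).
        candidates = rng.sample(members, min(nr_samples, len(members)))
        # Stand-in for a topic's c-TF-IDF row: summed term counts over the topic.
        topic_vec = Counter(w for d in members for w in d.lower().split())
        # Rank candidates by similarity to the topic vector, most similar first.
        ranked = sorted(
            candidates,
            key=lambda d: cosine(Counter(d.lower().split()), topic_vec),
            reverse=True,
        )
        result[topic] = ranked[:nr_repr_docs]
    return result


docs = [
    "cats purr and cats nap",
    "cats chase mice",
    "dogs bark",
    "cats nap",
    "my cat saw a bird",
    "dogs bark and dogs fetch",
]
topics = [0, 0, 1, 0, 0, 1]
representative_docs = pick_representative_docs(docs, topics)
```

In this toy input, topic 0 has four documents, so only the three closest to the topic vector are kept; topic 1 has two documents, so both are returned. The resulting dictionary has the same shape as the mapping that `_save_representative_docs` stores in `self.representative_docs_`: one short list of documents per topic.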

Callers 3

fit_transform · Method · 0.95
merge_topics · Method · 0.95
reduce_topics · Method · 0.95

Calls 1

_extract_representative_docs · Method

Tested by

no test coverage detected