hub / github.com/MaartenGr/BERTopic / get_representative_docs

Method get_representative_docs

bertopic/_bertopic.py:1826–1869

Extract the best representing documents per topic. Note: this does not extract all documents per topic, because BERTopic does not save all documents. To get all documents, rebuild the document-to-topic mapping from your original inputs as described in the docstring in the source listing.

(self, topic: int | None = None)
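
Since only a few representative documents are kept per topic, the full grouping has to be rebuilt from the original inputs. The following is a minimal sketch of that idea, assuming `docs` is the list passed to the model and `topics` is the per-document topic list (returned by `.fit_transform` or stored in `topic_model.topics_`). The helper name `group_docs_by_topic` and the sample data are hypothetical, and plain Python stands in here for the pandas DataFrame that the docstring uses.

```python
from collections import defaultdict


def group_docs_by_topic(docs, topics):
    """Group every document under its assigned topic id (hypothetical helper)."""
    if len(docs) != len(topics):
        raise ValueError("docs and topics must have the same length")
    grouped = defaultdict(list)
    for doc, topic in zip(docs, topics):
        grouped[topic].append(doc)
    return dict(grouped)


# Stand-in data; in practice `topics` would come from the fitted model.
docs = ["cats purr", "dogs bark", "stocks fell", "kittens nap"]
topics = [0, 0, 1, 0]

all_docs_per_topic = group_docs_by_topic(docs, topics)
```

Unlike `get_representative_docs`, this returns every document for each topic, at the cost of keeping the original `docs` list around.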

Source from the content-addressed store, hash-verified

1824         return document_info
1825
1826     def get_representative_docs(self, topic: int | None = None) -> List[str]:
1827         """Extract the best representing documents per topic.
1828
1829         Note:
1830             This does not extract all documents per topic as all documents
1831             are not saved within BERTopic. To get all documents, please
1832             run the following:
1833
1834             ```python
1835             # When you used `.fit_transform`:
1836             df = pd.DataFrame({"Document": docs, "Topic": topic})
1837
1838             # When you used `.fit`:
1839             df = pd.DataFrame({"Document": docs, "Topic": topic_model.topics_})
1840             ```
1841
1842         Arguments:
1843             topic: A specific topic for which you want
1844                    the representative documents
1845
1846         Returns:
1847             Representative documents of the chosen topic
1848
1849         Examples:
1850         To extract the representative docs of all topics:
1851
1852         ```python
1853         representative_docs = topic_model.get_representative_docs()
1854         ```
1855
1856         To get the representative docs of a single topic:
1857
1858         ```python
1859         representative_docs = topic_model.get_representative_docs(12)
1860         ```
1861         """
1862         check_is_fitted(self)
1863         if isinstance(topic, int):
1864             if self.representative_docs_.get(topic):
1865                 return self.representative_docs_[topic]
1866             else:
1867                 return None
1868         else:
1869             return self.representative_docs_
1870
1871     @staticmethod
1872     def get_topic_tree(
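
Despite the `List[str]` annotation, the method body in the listing has three possible results: a list for a topic that has representative documents, `None` for an integer topic that has none, and the whole `representative_docs_` mapping when `topic` is not an `int`. The sketch below reproduces that branching on a plain dictionary so the cases can be checked without a fitted model. `lookup_representative_docs`, `docs_or_empty`, and the sample mapping are hypothetical stand-ins, not part of BERTopic.

```python
def lookup_representative_docs(representative_docs, topic=None):
    """Mirror the branching of the method body on a plain dict (stand-in)."""
    if isinstance(topic, int):
        if representative_docs.get(topic):
            return representative_docs[topic]
        return None
    return representative_docs


def docs_or_empty(representative_docs, topic):
    """Hypothetical wrapper that turns the None case into an empty list."""
    result = lookup_representative_docs(representative_docs, topic)
    return result if result is not None else []


# Made-up mapping in the shape of `representative_docs_`.
sample = {-1: ["misc text"], 0: ["cats purr", "kittens nap"], 1: []}

single = lookup_representative_docs(sample, 0)     # list of documents
missing = lookup_representative_docs(sample, 99)   # None: unknown topic id
empty = lookup_representative_docs(sample, 1)      # None: an empty list is falsy
everything = lookup_representative_docs(sample)    # the full mapping
safe = docs_or_empty(sample, 99)                   # [] instead of None
```

Callers that iterate over the result of a single-topic lookup may want a guard like `docs_or_empty`, since an unknown topic id yields `None` and not an empty list.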

Callers 1

Calls 1

check_is_fitted · Function · 0.90

Tested by 1