hub / github.com/MaartenGr/BERTopic / _update_topic_size

Method _update_topic_size

bertopic/_bertopic.py:4455–4462

Calculate the topic sizes.

Arguments:
    documents: Updated dataframe with documents and their corresponding IDs and newly added Topics

(self, documents: pd.DataFrame)

Source from the content-addressed store, hash-verified

4453          return c_tf_idf, words
4454
4455      def _update_topic_size(self, documents: pd.DataFrame):
4456          """Calculate the topic sizes.
4457
4458          Arguments:
4459              documents: Updated dataframe with documents and their corresponding IDs and newly added Topics
4460          """
4461          self.topic_sizes_ = collections.Counter(documents.Topic.values.tolist())
4462          self.topics_ = documents.Topic.astype(int).tolist()
4463
4464      def _extract_words_per_topic(
4465          self,
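
The two assignments in the method body can be tried outside the class. The following is a minimal sketch, not part of BERTopic: the toy dataframe, its `Document` and `ID` column values, and the local names `topic_sizes` and `topics` are assumptions made for illustration. Only the `Topic` column is read, mirroring lines 4461 and 4462.

```python
import collections

import pandas as pd

# Hypothetical toy input: one row per document with its assigned topic.
# -1 is used here as the outlier label.
documents = pd.DataFrame(
    {
        "Document": ["doc a", "doc b", "doc c", "doc d", "doc e"],
        "ID": [0, 1, 2, 3, 4],
        "Topic": [0, 1, 0, -1, 0],
    }
)

# Same expression as line 4461: count how many documents fall in each topic.
topic_sizes = collections.Counter(documents.Topic.values.tolist())

# Same expression as line 4462: per-document topic labels, cast to int.
topics = documents.Topic.astype(int).tolist()

print(topic_sizes.most_common())
print(topics)
```

Note that only the per-document list is cast with `astype(int)`; the counter keys keep whatever type the `Topic` column holds, so a float-typed column would likely produce float keys in the counter.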

Callers 12

partial_fit · Method · 0.95
update_topics · Method · 0.95
merge_topics · Method · 0.95
delete_topics · Method · 0.95
_cluster_embeddings · Method · 0.95
_reduce_to_n_topics · Method · 0.95
_auto_reduce_topics · Method · 0.95
test_extract_topics · Function · 0.80

Calls

no outgoing calls

Tested by 3

test_extract_topics · Function · 0.64