Visualize the ranks of all terms across all topics.

Each topic is represented by a set of words. These words, however, do not all equally represent the topic. This visualization shows how many words are needed to represent a topic and at which point the beneficial effect of adding words starts to decline.
visualize_term_rank(
    topic_model,
    topics: List[int] | None = None,
    log_scale: bool = False,
    custom_labels: Union[bool, str] = False,
    title: str = "<b>Term score decline per Topic</b>",
    width: int = 800,
    height: int = 500,
)
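The idea behind the plot can be illustrated without a fitted model. The following is a minimal sketch, not BERTopic's implementation: the `TOPICS` data, the helper names `term_rank_curve` and `words_needed`, and the `min_drop` threshold are all hypothetical and exist only for illustration. It orders each topic's term scores by rank and reports the rank after which adding another word changes the score very little.

```python
from typing import Dict, List, Tuple

# Hypothetical (word, score) pairs per topic; these are made-up numbers,
# not output from a real topic model.
TOPICS: Dict[int, List[Tuple[str, float]]] = {
    0: [("space", 0.08), ("nasa", 0.04), ("orbit", 0.02), ("launch", 0.015), ("moon", 0.014)],
    1: [("game", 0.06), ("team", 0.045), ("season", 0.03), ("player", 0.02), ("score", 0.018)],
}


def term_rank_curve(terms: List[Tuple[str, float]]) -> List[Tuple[int, float]]:
    """Return (rank, score) points with ranks starting at 1 and scores descending."""
    scores = sorted((score for _, score in terms), reverse=True)
    return list(enumerate(scores, start=1))


def words_needed(terms: List[Tuple[str, float]], min_drop: float = 0.008) -> int:
    """Return the first rank whose next term lowers the score by less than `min_drop`.

    `min_drop` is an arbitrary illustrative threshold, not a library parameter.
    """
    scores = [score for _, score in term_rank_curve(terms)]
    for rank in range(1, len(scores)):
        if scores[rank - 1] - scores[rank] < min_drop:
            return rank
    return len(scores)


if __name__ == "__main__":
    for topic_id, terms in TOPICS.items():
        print(topic_id, term_rank_curve(terms), words_needed(terms))
```

Each `(rank, score)` list corresponds to one line in a term-rank plot. A topic whose curve flattens early, like topic 0 in this made-up data, is described by fewer words than one whose scores decline gradually.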
def visualize_term_rank(
    topic_model,
    topics: List[int] | None = None,
    log_scale: bool = False,
    custom_labels: Union[bool, str] = False,
    title: str = "<b>Term score decline per Topic</b>",
    width: int = 800,
    height: int = 500,
) -> go.Figure:
    """Visualize the ranks of all terms across all topics.

    Each topic is represented by a set of words. These words, however,
    do not all equally represent the topic. This visualization shows
    how many words are needed to represent a topic and at which point
    the beneficial effect of adding words starts to decline.

    Arguments:
        topic_model: A fitted BERTopic instance.
        topics: A selection of topics to visualize. These will be colored
                red where all others will be colored black.
        log_scale: Whether to represent the ranking on a log scale.
        custom_labels: If bool, whether to use custom topic labels that were defined using
                       `topic_model.set_topic_labels`.
                       If `str`, it uses labels from other aspects, e.g., "Aspect1".
        title: Title of the plot.
        width: The width of the figure.
        height: The height of the figure.

    Returns:
        fig: A plotly figure

    Examples:
    To visualize the ranks of all words across
    all topics simply run:

    ```python
    topic_model.visualize_term_rank()
    ```

    Or if you want to save the resulting figure:

    ```python
    fig = topic_model.visualize_term_rank()
    fig.write_html("path/to/file.html")
    ```

    <iframe src="../../getting_started/visualization/term_rank.html"
    style="width:1000px; height: 530px; border: 0px;"></iframe>

    <iframe src="../../getting_started/visualization/term_rank_log.html"
    style="width:1000px; height: 530px; border: 0px;"></iframe>

    Reference:

    This visualization was heavily inspired by the
    "Term Probability Decline" visualization found in an
    analysis by the amazing [tmtoolkit](https://tmtoolkit.readthedocs.io/).
    Reference to that specific analysis can be found