def term_frequency(term: str, document: str) -> int:
    """
    Return the number of times a term occurs within
    a given document.
    @params: term, the term to search a document for, and document,
            the document to search within
    @returns: an integer representing the number of times a term is
            found within the document

    @examples:
    >>> term_frequency("to", "To be, or not to be")
    2
    """
    # strip all punctuation and newlines and replace it with ''
    document_without_punctuation = document.translate(
        str.maketrans("", "", string.punctuation)
    ).replace("\n", "")
    tokenize_document = document_without_punctuation.split(" ")  # word tokenization
    return len([word for word in tokenize_document if word.lower() == term.lower()])
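

# A minimal sketch, not part of the original module: term_frequency() above
# deletes newlines outright, so words on either side of a line break are
# glued together ("be\nor" becomes "beor"). The hypothetical helper below
# assumes any run of whitespace should separate words, and tokenizes with
# str.split() (no argument) instead.
def term_frequency_any_whitespace(term: str, document: str) -> int:
    # Local import keeps this sketch self-contained.
    import string

    # Remove punctuation only; whitespace (including newlines) is kept.
    cleaned = document.translate(str.maketrans("", "", string.punctuation))
    # str.split() with no argument splits on any run of whitespace.
    return sum(1 for word in cleaned.split() if word.lower() == term.lower())


# Usage (hypothetical):
#     term_frequency_any_whitespace("be", "To be,\nor not to be")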


def document_frequency(term: str, corpus: str) -> tuple[int, int]: