
Method iterate_pages

langroid/parsing/document_parser.py:911–924

Simulate iterating through pages. A DOCX file does not define pages explicitly, so each paragraph is treated as a separate 'page' for simplicity. The generator yields (index, [paragraph]) tuples, with the index starting at 1.

(self)

Source from the content-addressed store, hash-verified

    def iterate_pages(self) -> Generator[Tuple[int, Any], None, None]:
        """
        Simulate iterating through pages.
        In a DOCX file, pages are not explicitly defined,
        so we consider each paragraph as a separate 'page' for simplicity.
        """
        try:
            import docx
        except ImportError:
            raise LangroidImportError("python-docx", "docx")

        doc = docx.Document(self.doc_bytes)
        for i, para in enumerate(doc.paragraphs, start=1):
            yield i, [para]
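
The paragraph-as-page pattern used by iterate_pages can be sketched on its own, without python-docx installed. This is a minimal illustration, not the library's implementation: the function name iterate_paragraph_pages and the sample strings are hypothetical, and plain strings stand in for the paragraph objects that python-docx would return.

```python
from typing import Any, Generator, Iterable, List, Tuple


def iterate_paragraph_pages(
    paragraphs: Iterable[Any],
) -> Generator[Tuple[int, List[Any]], None, None]:
    """Yield (page_number, [paragraph]) pairs, numbering from 1.

    Hypothetical stand-alone helper mirroring the loop in iterate_pages:
    each paragraph is wrapped in a one-element list and treated as a 'page'.
    """
    for i, para in enumerate(paragraphs, start=1):
        yield i, [para]


# Assumption: plain strings stand in for python-docx paragraph objects.
sample_paragraphs = ["Title", "First paragraph.", "Second paragraph."]
pages = list(iterate_paragraph_pages(sample_paragraphs))

for number, items in pages:
    # Each 'page' is a list holding exactly one paragraph.
    print(number, items[0])
```

With a real file, the same helper could presumably be fed `docx.Document(path).paragraphs`, reading each paragraph's text through its `text` attribute. Wrapping every paragraph in a list keeps the 'page' value a collection even though it holds a single item.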

Callers

nothing calls this directly

Calls 1

LangroidImportError · Class · 0.90

Tested by

no test coverage detected