hub / github.com/langroid/langroid / iterate_pages

Method iterate_pages

langroid/parsing/document_parser.py:911–924

Simulate iterating through pages. In a DOCX file, pages are not explicitly defined, so we consider each paragraph as a separate 'page' for simplicity.

(self)

Source from the content-addressed store, hash-verified

909	    """
910
911	    def iterate_pages(self) -> Generator[Tuple[int, Any], None, None]:
912	        """
913	        Simulate iterating through pages.
914	        In a DOCX file, pages are not explicitly defined,
915	        so we consider each paragraph as a separate 'page' for simplicity.
916	        """
917	        try:
918	            import docx
919	        except ImportError:
920	            raise LangroidImportError("python-docx", "docx")
921
922	        doc = docx.Document(self.doc_bytes)
923	        for i, para in enumerate(doc.paragraphs, start=1):
924	            yield i, [para]
925
926	    def get_document_from_page(self, page: Any) -> Document:
927	        """
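
The paragraph-as-page convention in the listing above can be tried without python-docx installed. The following is a minimal sketch, not the library's code: iterate_paragraph_pages is a hypothetical helper, and plain strings stand in for the paragraph objects that doc.paragraphs would supply.

```python
from typing import Any, Generator, List, Sequence, Tuple


def iterate_paragraph_pages(
    paragraphs: Sequence[Any],
) -> Generator[Tuple[int, List[Any]], None, None]:
    """Yield (page_number, [paragraph]) pairs, numbering from 1.

    Hypothetical helper that mirrors the loop in iterate_pages.
    """
    for i, para in enumerate(paragraphs, start=1):
        # Each paragraph is wrapped in a one-element list, so a "page"
        # always has the shape of a list of paragraphs.
        yield i, [para]


# Assumption: plain strings stand in for python-docx paragraph objects.
sample = ["Title", "", "First body paragraph."]
pages = list(iterate_paragraph_pages(sample))
for number, page in pages:
    print(number, page)
```

In this sketch an empty paragraph still receives its own page number, so page numbers track paragraph position rather than visible content.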

Callers

nothing calls this directly

Calls 1

LangroidImportErrorClass · 0.90
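
The single call recorded above comes from the import guard around docx. A rough sketch of that lazy-import pattern follows. MissingExtraError and require are hypothetical names introduced for illustration, and reading the two arguments as a package name and an install extra is an assumption based only on the call LangroidImportError("python-docx", "docx").

```python
import importlib
from types import ModuleType


class MissingExtraError(ImportError):
    """Hypothetical stand-in for LangroidImportError."""

    def __init__(self, package: str, extra: str) -> None:
        # Assumption: the first argument names the package to install,
        # the second names an optional-dependency extra.
        super().__init__(
            f"'{package}' is required for this feature; "
            f"install it directly or through the '{extra}' extra."
        )
        self.package = package
        self.extra = extra


def require(module_name: str, package: str, extra: str) -> ModuleType:
    """Import module_name on demand, or raise MissingExtraError."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Chain the original error so the traceback keeps the root cause.
        raise MissingExtraError(package, extra) from exc


# A module that exists imports normally.
json_module = require("json", "json", "stdlib")

# A module that does not exist produces the descriptive error.
try:
    require("no_such_module_for_demo", "no-such-package", "demo")
except MissingExtraError as err:
    message = str(err)
    print(message)
```

Deferring the import to call time keeps the optional dependency out of module import, so only callers that actually parse DOCX files need it installed.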

Tested by

no test coverage detected