Function get_document

pageindex/retrieve.py:81–97

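Return a JSON string with document metadata: doc_id, doc_name, doc_description, type, status, and either page_count (when type is 'pdf') or line_count (any other type, e.g. Markdown). status is always 'completed'. If doc_id is not found in documents, the returned JSON contains only an error message.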

get_document(documents: dict, doc_id: str) -> str

Source from the content-addressed store, hash-verified

# ── Tool functions ────────────────────────────────────────────────────────────

def get_document(documents: dict, doc_id: str) -> str:
    """Return JSON with document metadata: doc_id, doc_name, doc_description, type, status, page_count (PDF) or line_count (Markdown)."""
    doc_info = documents.get(doc_id)
    # A missing (or empty) entry is reported as a JSON error payload, not an exception.
    if not doc_info:
        return json.dumps({'error': f'Document {doc_id} not found'})
    result = {
        'doc_id': doc_id,
        'doc_name': doc_info.get('doc_name', ''),
        'doc_description': doc_info.get('doc_description', ''),
        'type': doc_info.get('type', ''),
        'status': 'completed',
    }
    # PDFs report a page count; every other type reports a line count.
    if doc_info.get('type') == 'pdf':
        result['page_count'] = _count_pages(doc_info)
    else:
        result['line_count'] = doc_info.get('line_count', 0)
    return json.dumps(result)
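
The three paths through get_document (PDF, non-PDF, unknown id) can be sketched as follows. This is only an illustration under stated assumptions: the function body is copied from the listing so the snippet runs on its own, `_count_pages` is replaced by a hypothetical stand-in (its real implementation is not shown on this page), and the `documents` registry, its ids, and the `pages` key are invented for the example.

```python
import json


def _count_pages(doc_info: dict) -> int:
    # Hypothetical stand-in: the real helper is not shown on this page.
    # Here we assume a 'pages' list, which is an invented key.
    return len(doc_info.get('pages', []))


def get_document(documents: dict, doc_id: str) -> str:
    # Copied from the listing so the example is self-contained.
    doc_info = documents.get(doc_id)
    if not doc_info:
        return json.dumps({'error': f'Document {doc_id} not found'})
    result = {
        'doc_id': doc_id,
        'doc_name': doc_info.get('doc_name', ''),
        'doc_description': doc_info.get('doc_description', ''),
        'type': doc_info.get('type', ''),
        'status': 'completed',
    }
    if doc_info.get('type') == 'pdf':
        result['page_count'] = _count_pages(doc_info)
    else:
        result['line_count'] = doc_info.get('line_count', 0)
    return json.dumps(result)


# Invented registry for illustration; real entries may carry other fields.
documents = {
    'doc-pdf': {
        'doc_name': 'report.pdf',
        'doc_description': 'Quarterly report',
        'type': 'pdf',
        'pages': ['p1', 'p2', 'p3'],
    },
    'doc-md': {'doc_name': 'notes.md', 'type': 'md', 'line_count': 42},
}

# The function returns a JSON string, so callers decode it before use.
pdf_meta = json.loads(get_document(documents, 'doc-pdf'))
md_meta = json.loads(get_document(documents, 'doc-md'))
missing = json.loads(get_document(documents, 'nope'))

print(pdf_meta['page_count'])  # 3
print(md_meta['line_count'])   # 42
print(missing)                 # {'error': 'Document nope not found'}
```

Because an unknown id yields a JSON error payload and not an exception, callers need to check for the error key after decoding.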

get_document · Method · 0.70

_count_pages · Function · 0.85

no test coverage detected