hub / github.com/VectifyAI/PageIndex / generate_toc_continue

Function generate_toc_continue

pageindex/page_index.py:507–539
(toc_content, part, model=None)

Source from the content-addressed store, hash-verified

### add verify completeness
def generate_toc_continue(toc_content, part, model=None):
    print('start generate_toc_continue')
    prompt = """
    You are an expert in extracting hierarchical tree structure.
    You are given a tree structure of the previous part and the text of the current part.
    Your task is to continue the tree structure from the previous part to include the current part.

    The structure variable is the numeric system which represents the index of the hierarchy section in the table of contents. For example, the first section has structure index 1, the first subsection has structure index 1.1, the second subsection has structure index 1.2, etc.

    For the title, you need to extract the original title from the text, only fix the space inconsistency.

    The provided text contains tags like <physical_index_X> and <physical_index_X> to indicate the start and end of page X. \

    For the physical_index, you need to extract the physical index of the start of the section from the text. Keep the <physical_index_X> format.

    The response should be in the following format.
        [
            {
                "structure": <structure index, "x.x.x"> (string),
                "title": <title of the section, keep the original title>,
                "physical_index": "<physical_index_X> (keep the format)"
            },
            ...
        ]

    Directly return the additional part of the final JSON structure. Do not output anything else."""

    prompt = prompt + '\nGiven text\n:' + part + '\nPrevious tree structure\n:' + json.dumps(toc_content, indent=2)
    response, finish_reason = llm_completion(model=model, prompt=prompt, return_finish_reason=True)
    if finish_reason == 'finished':
        return extract_json(response)
    else:
        raise Exception(f'finish reason: {finish_reason}')

### add verify completeness
def generate_toc_init(part, model=None):
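
The prompt above asks the model for a list of items with a dotted "structure" string, a "title", and a "physical_index" tag. The following is a minimal sketch of how such a reply could be checked after parsing. It is not part of PageIndex: `validate_toc_items`, both regular expressions, and the sample items are hypothetical and exist only for illustration.

```python
import re

# Hypothetical patterns derived from the format described in the prompt.
STRUCTURE_RE = re.compile(r"\d+(?:\.\d+)*")               # "1", "1.2", "1.2.3"
PHYSICAL_INDEX_RE = re.compile(r"<physical_index_(\d+)>")  # "<physical_index_7>"


def validate_toc_items(items):
    """Return (structure, title, page_number) tuples; raise ValueError on a malformed item."""
    checked = []
    for item in items:
        structure = item["structure"]
        if not isinstance(structure, str) or not STRUCTURE_RE.fullmatch(structure):
            raise ValueError(f"bad structure index: {structure!r}")
        match = PHYSICAL_INDEX_RE.fullmatch(item["physical_index"])
        if match is None:
            raise ValueError(f"bad physical_index: {item['physical_index']!r}")
        checked.append((structure, item["title"], int(match.group(1))))
    return checked


# Invented sample shaped like the "additional part" the prompt requests.
sample_items = [
    {"structure": "2", "title": "Methods", "physical_index": "<physical_index_7>"},
    {"structure": "2.1", "title": "Data Collection", "physical_index": "<physical_index_8>"},
]
checked_items = validate_toc_items(sample_items)
```

A check like this would catch replies where the model drops the tag format or returns a numeric structure index instead of a string, both of which the prompt explicitly forbids.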

Callers 1

process_no_toc · Function · 0.85
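
The body of process_no_toc is not shown on this page. Because generate_toc_continue returns only the additional items, a caller presumably has to append them to the running structure before handling the next part. A minimal sketch of that pattern follows, assuming the document has already been split into text parts. `build_toc`, `stub_init`, and `stub_continue` are hypothetical names, and the stubs replace the real LLM-backed functions.

```python
def build_toc(parts, init_fn, continue_fn):
    """Start a TOC from the first part, then extend it once per remaining part."""
    toc = list(init_fn(parts[0]))
    for part in parts[1:]:
        # continue_fn sees everything collected so far and returns only new items.
        toc.extend(continue_fn(toc, part))
    return toc


def stub_init(part):
    # Stand-in for an initial TOC generator; returns one fixed item.
    return [{"structure": "1", "title": "Introduction",
             "physical_index": "<physical_index_1>"}]


def stub_continue(toc_so_far, part):
    # Stand-in for generate_toc_continue; emits one new top-level section.
    number = len(toc_so_far) + 1
    return [{"structure": str(number), "title": f"Section {number}",
             "physical_index": f"<physical_index_{number}>"}]


toc = build_toc(["part one", "part two", "part three"], stub_init, stub_continue)
structures = [item["structure"] for item in toc]
```

Passing the accumulated list on each call matters because the prompt relies on the previous tree structure to pick the next structure index.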

Calls 2

llm_completion · Function · 0.85
extract_json · Function · 0.85

Tested by

no test coverage detected
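
Both branches of the function can be exercised without a live model. The sketch below is one possible approach, not an existing PageIndex test: `toc_continue_with` is a condensed, hypothetical copy of the control flow that takes its two dependencies as arguments, `make_fake_llm` is a hypothetical helper, `json.loads` stands in for extract_json, and 'max_output_reached' is an assumed example of a finish reason other than 'finished'.

```python
import json


def toc_continue_with(llm, parse, toc_content, part, model=None):
    """Condensed stand-in for generate_toc_continue with injectable dependencies."""
    # The long instruction text is elided; the suffix follows the listing.
    prompt = ('(instructions elided)' + '\nGiven text\n:' + part
              + '\nPrevious tree structure\n:' + json.dumps(toc_content, indent=2))
    response, finish_reason = llm(model=model, prompt=prompt, return_finish_reason=True)
    if finish_reason == 'finished':
        return parse(response)
    raise Exception(f'finish reason: {finish_reason}')


def make_fake_llm(response, finish_reason, seen_prompts):
    """Build a fake completion function that records prompts and returns canned output."""
    def fake_llm(model=None, prompt='', return_finish_reason=False):
        seen_prompts.append(prompt)
        return response, finish_reason
    return fake_llm


def test_finished_response_is_parsed():
    seen = []
    reply = '[{"structure": "2", "title": "Methods", "physical_index": "<physical_index_7>"}]'
    llm = make_fake_llm(reply, 'finished', seen)
    previous = [{"structure": "1", "title": "Intro", "physical_index": "<physical_index_1>"}]
    result = toc_continue_with(llm, json.loads, previous, 'page text')
    assert result[0]["structure"] == "2"
    # The prompt must carry both the new text and the previous tree.
    assert 'page text' in seen[0] and '"title": "Intro"' in seen[0]


def test_unfinished_response_raises():
    llm = make_fake_llm('[', 'max_output_reached', [])
    try:
        toc_continue_with(llm, json.loads, [], 'page text')
    except Exception as exc:
        assert str(exc) == 'finish reason: max_output_reached'
    else:
        raise AssertionError('expected an exception')
```

Injecting the dependencies keeps the sketch self-contained. Against the real module, patching llm_completion and extract_json with `unittest.mock.patch` would serve the same purpose.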