hub / github.com/VectifyAI/PageIndex / check_toc

Function check_toc

pageindex/page_index.py:696–732
(page_list, opt=None)
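
Although the signature defaults `opt` to `None`, the body reads `opt.model` and `opt.toc_check_page_num` once a TOC page has been found, so a real options object is needed in practice. Below is a minimal sketch of such an object, assuming a plain `SimpleNamespace` is acceptable. Only the two attribute names come from the listing; the values are placeholders, `read_options` is a hypothetical helper written for this illustration, and `find_toc_pages` may read further attributes that are not visible here.

```python
from types import SimpleNamespace

# Hypothetical values: only the attribute names appear in check_toc.
opt = SimpleNamespace(
    model="example-model",     # placeholder, forwarded to toc_extractor
    toc_check_page_num=20,     # placeholder upper bound for the TOC scan
)


def read_options(opt):
    """Mimic the two attribute reads that check_toc performs on opt."""
    return opt.model, opt.toc_check_page_num


print(read_options(opt))

try:
    read_options(None)  # the default opt=None has neither attribute
except AttributeError as exc:
    print("opt=None fails:", exc)
```

Passing the options explicitly keeps the failure at the call site rather than deep inside the TOC scan.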

Source from the content-addressed store, hash-verified

def check_toc(page_list, opt=None):
    toc_page_list = find_toc_pages(start_page_index=0, page_list=page_list, opt=opt)
    if len(toc_page_list) == 0:
        print('no toc found')
        return {'toc_content': None, 'toc_page_list': [], 'page_index_given_in_toc': 'no'}
    else:
        print('toc found')
        toc_json = toc_extractor(page_list, toc_page_list, opt.model)

        if toc_json['page_index_given_in_toc'] == 'yes':
            print('index found')
            return {'toc_content': toc_json['toc_content'], 'toc_page_list': toc_page_list, 'page_index_given_in_toc': 'yes'}
        else:
            current_start_index = toc_page_list[-1] + 1

            while (toc_json['page_index_given_in_toc'] == 'no' and
                   current_start_index < len(page_list) and
                   current_start_index < opt.toc_check_page_num):

                additional_toc_pages = find_toc_pages(
                    start_page_index=current_start_index,
                    page_list=page_list,
                    opt=opt
                )

                if len(additional_toc_pages) == 0:
                    break

                additional_toc_json = toc_extractor(page_list, additional_toc_pages, opt.model)
                if additional_toc_json['page_index_given_in_toc'] == 'yes':
                    print('index found')
                    return {'toc_content': additional_toc_json['toc_content'], 'toc_page_list': additional_toc_pages, 'page_index_given_in_toc': 'yes'}

                else:
                    current_start_index = additional_toc_pages[-1] + 1
            print('index not found')
            return {'toc_content': toc_json['toc_content'], 'toc_page_list': toc_page_list, 'page_index_given_in_toc': 'no'}
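
The control flow above has three outcomes: no TOC pages at all, a TOC that carries page numbers (possibly found in a later run of TOC-like pages), and a fallback to the first TOC without page numbers. The following is a condensed paraphrase for illustration, not the repository's code. `check_toc_sketch` takes the two helpers as parameters so it can run without a model, and `fake_find_toc_pages` and `fake_toc_extractor` are hypothetical stand-ins whose detection rules (a page starting with "TOC", a digit in the text) are invented here. The sketch also assumes `page_index_given_in_toc` is always either 'yes' or 'no'.

```python
from types import SimpleNamespace


def check_toc_sketch(page_list, opt, find_toc_pages, toc_extractor):
    """Condensed paraphrase of check_toc with both helpers injected."""
    first_pages = find_toc_pages(start_page_index=0, page_list=page_list, opt=opt)
    if not first_pages:
        return {'toc_content': None, 'toc_page_list': [], 'page_index_given_in_toc': 'no'}

    first = toc_extractor(page_list, first_pages, opt.model)
    if first['page_index_given_in_toc'] == 'yes':
        return {'toc_content': first['toc_content'], 'toc_page_list': first_pages,
                'page_index_given_in_toc': 'yes'}

    # Keep scanning after the last TOC page until the document or the limit ends.
    start = first_pages[-1] + 1
    while start < len(page_list) and start < opt.toc_check_page_num:
        pages = find_toc_pages(start_page_index=start, page_list=page_list, opt=opt)
        if not pages:
            break
        extra = toc_extractor(page_list, pages, opt.model)
        if extra['page_index_given_in_toc'] == 'yes':
            return {'toc_content': extra['toc_content'], 'toc_page_list': pages,
                    'page_index_given_in_toc': 'yes'}
        start = pages[-1] + 1

    # Fallback: the first TOC, flagged as having no page numbers.
    return {'toc_content': first['toc_content'], 'toc_page_list': first_pages,
            'page_index_given_in_toc': 'no'}


def fake_find_toc_pages(start_page_index, page_list, opt):
    """Hypothetical stand-in: first run of consecutive pages starting with 'TOC'."""
    run = []
    for i in range(start_page_index, len(page_list)):
        if page_list[i].startswith('TOC'):
            run.append(i)
        elif run:
            break
    return run


def fake_toc_extractor(page_list, toc_page_list, model):
    """Hypothetical stand-in: page numbers count as given if the text has a digit."""
    text = ' '.join(page_list[i] for i in toc_page_list)
    given = 'yes' if any(ch.isdigit() for ch in text) else 'no'
    return {'toc_content': text, 'page_index_given_in_toc': given}


pages = ['cover', 'TOC Intro, Methods', 'preface', 'TOC Intro 5, Methods 9', 'body']
opt = SimpleNamespace(model='example-model', toc_check_page_num=20)  # placeholder values
result = check_toc_sketch(pages, opt, fake_find_toc_pages, fake_toc_extractor)
print(result['toc_page_list'], result['page_index_given_in_toc'])
```

In this toy document the first TOC page (index 1) has no page numbers, so the scan continues and returns the second one (index 3). Lowering `toc_check_page_num` to 2 stops the scan early and yields the fallback result instead.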

Callers 1

tree_parser · Function · 0.85

Calls 2

find_toc_pages · Function · 0.85
toc_extractor · Function · 0.85

Tested by

no test coverage detected