hub / github.com/llmware-ai/llmware / msa_processing

Function msa_processing

solutions/rag/example-6-rag-multi-step-query.py:26–103

In this example, we will use the 'AgreementsLarge' sample files, which consist of ~80 contracts. We need to quickly identify the 'master service agreements', as we only want to analyze those contracts.

(library_name, llm_model_name)
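The filtering step in this function relies on the phrase 'master services agreement' appearing on page 1 of every MSA. As a minimal sketch of that idea only, the plain-Python stand-in below filters a handful of made-up text blocks by page number and phrase, and returns parallel "doc_ID" and "file_source" lists like the ones the function reads from its search results. The `BLOCKS` data, the `first_page_phrase_filter` helper, and the `page_num`/`text` field names are hypothetical; this is not how llmware implements `text_search_by_page`.

```python
# Hypothetical parsed text blocks. The "doc_ID" and "file_source" keys mirror
# the result keys used in the listing; the data and other fields are made up.
BLOCKS = [
    {"doc_ID": 1, "file_source": "Example MSA.pdf", "page_num": 1,
     "text": "This Master Services Agreement is entered into by ..."},
    {"doc_ID": 1, "file_source": "Example MSA.pdf", "page_num": 4,
     "text": "Either party may terminate this agreement ..."},
    {"doc_ID": 2, "file_source": "Example NDA.pdf", "page_num": 1,
     "text": "This Non-Disclosure Agreement ..."},
    {"doc_ID": 3, "file_source": "Example SOW.pdf", "page_num": 3,
     "text": "... subject to the Master Services Agreement ..."},
]


def first_page_phrase_filter(blocks, phrase, page_num=1):
    """Return parallel lists of doc IDs and file names for documents
    whose given page contains the phrase (case-insensitive)."""
    matches = {}  # doc_ID -> file_source, kept in first-seen order
    needle = phrase.lower()
    for block in blocks:
        if block["page_num"] == page_num and needle in block["text"].lower():
            matches.setdefault(block["doc_ID"], block["file_source"])
    return {"doc_ID": list(matches.keys()),
            "file_source": list(matches.values())}


results = first_page_phrase_filter(BLOCKS, "master services agreement")

# The two lists line up by position, so they can be walked together.
for doc_id, file_source in zip(results["doc_ID"], results["file_source"]):
    print(doc_id, file_source)
```

In this toy data, document 3 mentions the phrase only on page 3, so it is excluded. That is the point of restricting the search to the first page.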


import os

# imports used by the excerpt below
from llmware.library import Library
from llmware.prompts import Prompt
from llmware.retrieval import Query
from llmware.setup import Setup


def msa_processing(library_name, llm_model_name):

    """ In this example, we will use the 'AgreementsLarge' sample files, which consist of ~80 contracts.  We
    need to quickly identify the 'master service agreements', as we only want to analyze those contracts. """

    local_path = Setup().load_sample_files()
    agreements_path = os.path.join(local_path, "AgreementsLarge")

    # create a library with all of the Agreements (~80 contracts)
    print("\nStarting:  Parsing 'AgreementsLarge' Folder")
    msa_lib = Library().create_new_library(library_name)
    msa_lib.add_files(agreements_path)

    # find the "master service agreements" (MSA) - we know that 'master services agreement' will always
    # be on the first page of the agreement, so we can use that as a good proxy for automatically filtering
    # to our target set of documents

    print("\nCompleted Parsing - now, let's look for the 'master service agreements', e.g., 'msa'")

    q = Query(msa_lib)
    query = '"master services agreement"'
    results = q.text_search_by_page(query, page_num=1, results_only=False)

    # results_only = False will return a dictionary with 4 keys:  {"query", "results", "doc_ID", "file_source"}
    msa_docs = results["file_source"]
    msa_doc_ids = results["doc_ID"]

    # load prompt/llm locally
    prompter = Prompt().load_model(llm_model_name)

    print("update: identified the following msa doc id: ", msa_doc_ids)

    # analyze each MSA - "query" & "llm prompt"
    for i, doc_id in enumerate(msa_doc_ids):

        print("\n")
        docs = msa_docs[i]
        if os.sep in docs:
            # handles difference in windows file formats vs. Mac/Linux
            docs = docs.split(os.sep)[-1]

        print(i + 1, "Reviewing MSA - ", doc_id, docs)

        # look for the termination provisions in each document
        doc_filter = {"doc_ID": [doc_id]}
        termination_provisions = q.text_query_with_document_filter("termination", doc_filter)

        # package the provisions as a source to a prompt
        sources = prompter.add_source_query_results(termination_provisions)

        # if you want to see more details about how the sources are packaged: uncomment this line-
        # print("update: sources - ", sources)

        # call the LLM and ask our question
        response = prompter.prompt_with_source("What is the notice for termination for convenience?")

        # post processing fact checking
        stats = prompter.evidence_comparison_stats(response)
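The loop above shortens each file source to its final path component by splitting on `os.sep`, which only recognizes the separator of the platform the script is running on. As an alternative sketch, assuming a stored path might use either separator style, the standard-library `ntpath.basename` treats both `\` and `/` as separators. The `short_name` helper and the sample paths are hypothetical and not part of the original example.

```python
import ntpath


def short_name(file_source):
    """Return the final path component of a Windows- or POSIX-style path."""
    # ntpath accepts both "\\" and "/" as separators, regardless of the
    # platform this code runs on. Caveat: a POSIX file name that contains a
    # literal backslash would be split as well.
    return ntpath.basename(file_source)


for path in (r"C:\contracts\Example MSA.pdf",
             "/home/user/contracts/Example MSA.pdf",
             "Example MSA.pdf"):
    print(short_name(path))
```

A string with no separator at all is returned unchanged, so the `if os.sep in docs` guard is not needed with this approach.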

Calls 15

text_search_by_page · Method · 0.95
Setup · Class · 0.90
Library · Class · 0.90
Query · Class · 0.90
Prompt · Class · 0.90
HumanInTheLoop · Class · 0.90
load_sample_files · Method · 0.80
create_new_library · Method · 0.80
add_files · Method · 0.80
prompt_with_source · Method · 0.80

Tested by

no test coverage detected