
Class RAGNode

scrapegraphai/nodes/rag_node.py:10–106

A node responsible for compressing the input tokens and storing the document in a vector database for retrieval. Relevant chunks are stored in the state. It allows scraping of big documents without exceeding the token limit of the language model.
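The idea in the summary above (keep only the chunks relevant to the prompt so the model input stays small) can be illustrated with a minimal, standard-library sketch. This is not the node's implementation: it swaps the vector database and the real embedder for a toy bag-of-words similarity, and the names `embed`, `cosine`, `select_relevant_chunks`, and the state keys `user_prompt` and `relevant_chunks` are assumptions made for illustration. Only the `docs` key and its `summary` field appear in the listing.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_relevant_chunks(state: dict, top_k: int = 2) -> dict:
    """Store the top_k summaries most similar to the prompt in the state."""
    query = embed(state["user_prompt"])  # assumed state key
    summaries = [elem.get("summary") for elem in state.get("docs")]
    ranked = sorted(summaries, key=lambda s: cosine(query, embed(s)), reverse=True)
    state["relevant_chunks"] = ranked[:top_k]  # assumed output key
    return state


state = {
    "user_prompt": "What is the price of the laptop?",
    "docs": [
        {"summary": "The laptop price is 999 dollars."},
        {"summary": "Our office is closed on public holidays."},
        {"summary": "Shipping for the laptop takes three days."},
    ],
}
result = select_relevant_chunks(state, top_k=2)
```

With this input the two laptop-related summaries are kept and the unrelated one is dropped, which is the effect a retrieval step is meant to have before the prompt reaches the language model.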


class RAGNode(BaseNode):
    """
    A node responsible for compressing the input tokens and storing the document
    in a vector database for retrieval. Relevant chunks are stored in the state.

    It allows scraping of big documents without exceeding the token limit of the language model.

    Attributes:
        llm_model: An instance of a language model client, configured for generating answers.
        embedder_model: Optional embedding model read from node_config (None if absent).
        verbose (bool): A flag indicating whether to show print statements during execution.

    Args:
        input (str): Boolean expression defining the input keys needed from the state.
        output (List[str]): List of output keys to be updated in the state.
        node_config (dict): Additional configuration for the node.
        node_name (str): The unique identifier name for the node, defaulting to "RAG".
    """

    def __init__(
        self,
        input: str,
        output: List[str],
        node_config: Optional[dict] = None,
        node_name: str = "RAG",
    ):
        super().__init__(node_name, "node", input, output, 2, node_config)

        # node_config is required in practice: "llm_model" is read from it
        # directly, so passing None fails here despite the Optional default.
        self.llm_model = node_config["llm_model"]
        self.embedder_model = node_config.get("embedder_model", None)
        # The None check below is redundant once node_config has been subscripted above.
        self.verbose = (
            False if node_config is None else node_config.get("verbose", False)
        )

    def execute(self, state: dict) -> dict:
        self.logger.info(f"--- Executing {self.node_name} Node ---")

        # qdrant_client is imported lazily so it is only needed when this node runs.
        try:
            from qdrant_client import QdrantClient
            from qdrant_client.models import Distance, PointStruct, VectorParams
        except ImportError:
            raise ImportError(
                "qdrant_client is not installed. Please install it using 'pip install qdrant-client'."
            )

        # Choose the Qdrant backend: in-memory (default), on-disk, or a server on localhost.
        if self.node_config.get("client_type") in ["memory", None]:
            client = QdrantClient(":memory:")
        elif self.node_config.get("client_type") == "local_db":
            client = QdrantClient(path="path/to/db")
        elif self.node_config.get("client_type") == "image":
            client = QdrantClient(url="http://localhost:6333")
        else:
            raise ValueError("client_type provided not correct")

        # One summary per document, with 1-based integer ids.
        docs = [elem.get("summary") for elem in state.get("docs")]
        ids = list(range(1, len(state.get("docs")) + 1))

        if state.get("embeddings"):
            import openai
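The backend selection and the id assignment in `execute` can be exercised without a running vector database. The sketch below is a hedged illustration rather than library code: `resolve_client_kwargs` and `summaries_with_ids` are hypothetical helpers that mirror the branches in the listing and return plain data, assuming that `location` is the keyword name of the first positional argument of `QdrantClient`. The `path/to/db` value is the placeholder from the listing, kept as-is.

```python
from typing import Any, Dict, List, Optional, Tuple


def resolve_client_kwargs(client_type: Optional[str]) -> Dict[str, Any]:
    """Map a client_type value to keyword arguments for QdrantClient.

    Hypothetical helper mirroring the branches in the listing.
    """
    if client_type in ("memory", None):
        return {"location": ":memory:"}  # in-memory store, the default
    if client_type == "local_db":
        return {"path": "path/to/db"}  # placeholder path from the listing
    if client_type == "image":
        return {"url": "http://localhost:6333"}  # server on localhost
    raise ValueError(f"Unsupported client_type: {client_type!r}")


def summaries_with_ids(state: Dict[str, Any]) -> Tuple[List[int], List[Optional[str]]]:
    """Return 1-based ids and the 'summary' field of each document in the state."""
    docs = state.get("docs") or []  # tolerate a missing "docs" key
    summaries = [elem.get("summary") for elem in docs]
    ids = list(range(1, len(docs) + 1))
    return ids, summaries


ids, summaries = summaries_with_ids(
    {"docs": [{"summary": "first page"}, {"summary": "second page"}]}
)
```

Returning keyword arguments instead of a client keeps the branch logic testable on its own; the result could then be passed along as `QdrantClient(**resolve_client_kwargs(client_type))`.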

Callers 2

custom_graph_openai.py · File · 0.90

_create_graph · Method · 0.85

Calls

no outgoing calls

Tested by

no test coverage detected