hub / github.com/unclecode/crawl4ai / cache_url

Function cache_url

crawl4ai/database.py:59–81 · view source on GitHub ↗

(url: str, html: str, cleaned_html: str, markdown: str, extracted_content: str, success: bool, media : str = "{}", links : str = "{}", metadata : str = "{}", screenshot: str = "")

Source from the content-addressed store, hash-verified

57	return None
58
59	def cache_url(url: str, html: str, cleaned_html: str, markdown: str, extracted_content: str, success: bool, media : str = "{}", links : str = "{}", metadata : str = "{}", screenshot: str = ""):
60	check_db_path()
61	try:
62	conn = sqlite3.connect(DB_PATH)
63	cursor = conn.cursor()
64	cursor.execute('''
65	INSERT INTO crawled_data (url, html, cleaned_html, markdown, extracted_content, success, media, links, metadata, screenshot)
66	VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
67	ON CONFLICT(url) DO UPDATE SET
68	html = excluded.html,
69	cleaned_html = excluded.cleaned_html,
70	markdown = excluded.markdown,
71	extracted_content = excluded.extracted_content,
72	success = excluded.success,
73	media = excluded.media,
74	links = excluded.links,
75	metadata = excluded.metadata,
76	screenshot = excluded.screenshot
77	''', (url, html, cleaned_html, markdown, extracted_content, success, media, links, metadata, screenshot))
78	conn.commit()
79	conn.close()
80	except Exception as e:
81	print(f"Error caching URL: {e}")
82
83	def get_total_count() -> int:
84	check_db_path()

Callers 3

run_oldMethod · 0.85

process_htmlMethod · 0.85

Calls 2

check_db_pathFunction · 0.85

closeMethod · 0.45

Tested by

no test coverage detected

Used in the wild real call sites across dependent graphs

searching dependent graphs…