hub / github.com/unclecode/crawl4ai / cache_url

Function cache_url

crawl4ai/database.py:59–81
(url: str, html: str, cleaned_html: str, markdown: str, extracted_content: str, success: bool, media : str = "{}", links : str = "{}", metadata : str = "{}", screenshot: str = "")

Source from the content-addressed store, hash-verified

def cache_url(url: str, html: str, cleaned_html: str, markdown: str, extracted_content: str, success: bool, media : str = "{}", links : str = "{}", metadata : str = "{}", screenshot: str = ""):
    check_db_path()
    try:
        conn = sqlite3.connect(DB_PATH)
        cursor = conn.cursor()
        cursor.execute('''
            INSERT INTO crawled_data (url, html, cleaned_html, markdown, extracted_content, success, media, links, metadata, screenshot)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            ON CONFLICT(url) DO UPDATE SET
                html = excluded.html,
                cleaned_html = excluded.cleaned_html,
                markdown = excluded.markdown,
                extracted_content = excluded.extracted_content,
                success = excluded.success,
                media = excluded.media,
                links = excluded.links,
                metadata = excluded.metadata,
                screenshot = excluded.screenshot
        ''', (url, html, cleaned_html, markdown, extracted_content, success, media, links, metadata, screenshot))
        conn.commit()
        conn.close()
    except Exception as e:
        print(f"Error caching URL: {e}")
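The statement in `cache_url` is an SQLite upsert: a new URL inserts a row, and a URL that is already cached has its columns overwritten through `excluded`. A minimal sketch of that behavior follows. It uses an in-memory database and a reduced, hypothetical four-column schema, since the real `crawled_data` table definition is not part of this listing. `ON CONFLICT(url)` only works when `url` carries a UNIQUE or PRIMARY KEY constraint, which is assumed here, and the upsert syntax needs SQLite 3.24 or newer.

```python
import sqlite3

# Hypothetical reduced schema; the real table definition is not shown in the
# listing. The PRIMARY KEY on url is an assumption needed by ON CONFLICT(url).
SCHEMA = """
CREATE TABLE crawled_data (
    url TEXT PRIMARY KEY,
    html TEXT,
    markdown TEXT,
    success BOOLEAN
)
"""

# Same upsert shape as cache_url, cut down to the columns in SCHEMA.
UPSERT = """
INSERT INTO crawled_data (url, html, markdown, success)
VALUES (?, ?, ?, ?)
ON CONFLICT(url) DO UPDATE SET
    html = excluded.html,
    markdown = excluded.markdown,
    success = excluded.success
"""


def demo_upsert():
    """Cache the same URL twice and return every row in the table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        # First call inserts a new row.
        conn.execute(UPSERT, ("https://example.com", "<p>v1</p>", "v1", False))
        # Second call hits the url conflict and updates the existing row.
        conn.execute(UPSERT, ("https://example.com", "<p>v2</p>", "v2", True))
        conn.commit()
        return conn.execute(
            "SELECT url, html, markdown, success FROM crawled_data"
        ).fetchall()
    finally:
        conn.close()


rows = demo_upsert()
print(rows)
```

After both calls the table holds a single row for the URL, carrying the values from the second call.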

Callers 3

run_old · Method · 0.85
process_html · Method · 0.85
process_html · Method · 0.85

Calls 2

check_db_path · Function · 0.85
close · Method · 0.45
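In the listed source, `conn.close()` sits inside the `try` body, so when `execute` or `commit` raises, the connection is not explicitly closed before the error is printed. One possible alternative, shown only as a sketch, wraps the connection in `contextlib.closing` so that it is closed on both paths. The function name `cache_row`, its `db_path` parameter, the two-column table, and the boolean return value are all hypothetical and not part of crawl4ai.

```python
import sqlite3
from contextlib import closing


def cache_row(db_path, url, html):
    """Hypothetical reduced variant of cache_url with two columns only."""
    try:
        # closing() calls conn.close() on exit, even if a statement raises.
        with closing(sqlite3.connect(db_path)) as conn:
            # The table is created here only to keep the sketch self-contained.
            conn.execute(
                "CREATE TABLE IF NOT EXISTS crawled_data "
                "(url TEXT PRIMARY KEY, html TEXT)"
            )
            conn.execute(
                "INSERT INTO crawled_data (url, html) VALUES (?, ?) "
                "ON CONFLICT(url) DO UPDATE SET html = excluded.html",
                (url, html),
            )
            conn.commit()
        return True
    except sqlite3.Error as e:
        # Same reporting style as the original; the return value is an addition.
        print(f"Error caching URL: {e}")
        return False
```

Returning a boolean lets a caller tell a failed write from a successful one, which the original signals only through printed output. Catching `sqlite3.Error` instead of `Exception` is a narrower choice that leaves unrelated errors visible.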

Tested by

no test coverage detected
