hub / github.com/unclecode/crawl4ai / sanitize_html

Function sanitize_html

crawl4ai/utils.py:132–140
sanitize_html(html)

Source from the content-addressed store, hash-verified

    return parsed_objects, unparsed_segments

def sanitize_html(html):
    # Replace all unwanted and special characters with an empty string
    sanitized_html = html
    # sanitized_html = re.sub(r'[^\w\s.,;:!?=\[\]{}()<>\/\\\-"]', '', html)

    # Escape all double and single quotes
    sanitized_html = sanitized_html.replace('"', '\\"').replace("'", "\\'")

    return sanitized_html

def sanitize_input_encode(text: str) -> str:
    """Sanitize input to handle potential encoding issues."""
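To make the behaviour concrete, here is a minimal sketch that runs a local copy of the function on a sample string. The sample markup and the variable names `sample` and `escaped` are illustrative assumptions, not taken from the repository. Since the character-stripping regex is commented out in the source, the only effect is that each double and single quote gains a backslash prefix; tags and all other characters pass through unchanged.

```python
def sanitize_html(html):
    # Local copy of the function shown above, so the example runs standalone.
    sanitized_html = html
    # Prefix every double and single quote with a backslash.
    sanitized_html = sanitized_html.replace('"', '\\"').replace("'", "\\'")
    return sanitized_html


# Hypothetical input: markup containing both kinds of quotes.
sample = '<a href="/docs" title=\'Docs\'>it\'s here</a>'
escaped = sanitize_html(sample)

print(escaped)
# prints: <a href=\"/docs\" title=\'Docs\'>it\'s here</a>
```

Note that existing backslashes in the input are not escaped, and no tags or attributes are removed, so this is quote escaping only and should not be read as protection against unsafe markup.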

Callers 5

get_content_of_website · Function · 0.85
extract_blocks · Function · 0.85
extract · Method · 0.85

Calls

no outgoing calls

Tested by

no test coverage detected
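
Since no tests are detected, a starting point might look like the sketch below. It is an assumption-laden example rather than anything from the repository: the class name `SanitizeHtmlTest` and the test cases are hypothetical, and the function is redefined locally as a stand-in so the file runs without the package installed. In a real suite the import would presumably come from `crawl4ai/utils.py` instead.

```python
import unittest


def sanitize_html(html):
    # Stand-in with the same behaviour as the listed source:
    # backslash-escape double and single quotes, nothing else.
    return html.replace('"', '\\"').replace("'", "\\'")


class SanitizeHtmlTest(unittest.TestCase):
    def test_escapes_double_quotes(self):
        self.assertEqual(sanitize_html('a "b" c'), 'a \\"b\\" c')

    def test_escapes_single_quotes(self):
        self.assertEqual(sanitize_html("it's"), "it\\'s")

    def test_leaves_markup_untouched(self):
        # Angle brackets and tag names are not modified.
        self.assertEqual(sanitize_html("<p>hi</p>"), "<p>hi</p>")

    def test_empty_string(self):
        self.assertEqual(sanitize_html(""), "")
```

Saved as a module, these cases can be run with `python -m unittest`.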
