hub / github.com/ArchiveBox/ArchiveBox / validate_links

Function validate_links

archivebox/index/__init__.py:124–133
(links: Iterable[Link])

Source from the content-addressed store, hash-verified

@enforce_types
def validate_links(links: Iterable[Link]) -> List[Link]:
    timer = TimedProgress(TIMEOUT * 4)
    try:
        links = archivable_links(links)     # remove chrome://, about:, mailto: etc.
        links = sorted_links(links)         # deterministically sort the links based on timestamp, url
        links = fix_duplicate_links(links)  # merge/dedupe duplicate timestamps & urls
    finally:
        timer.end()

    return list(links)

@enforce_types
def archivable_links(links: Iterable[Link]) -> Iterable[Link]:
    ...  # body not included in this excerpt
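
The filter, sort, dedupe sequence in validate_links can be illustrated with a small standalone sketch. Everything below is a simplified stand-in, not ArchiveBox's actual implementation: the `Link` tuple, the `ARCHIVABLE_SCHEMES` allow-list, and the `*_sketch` helpers are hypothetical names invented for this example, and the dedupe step simply keeps the first link per URL, whereas the source comment says the real helper also merges duplicates and handles duplicate timestamps.

```python
from typing import Iterable, Iterator, List, NamedTuple
from urllib.parse import urlparse


class Link(NamedTuple):
    # Hypothetical stand-in: the real ArchiveBox Link type is richer than this.
    url: str
    timestamp: str


# Assumed allow-list for this sketch only.
ARCHIVABLE_SCHEMES = {"http", "https", "ftp"}


def archivable_links_sketch(links: Iterable[Link]) -> Iterator[Link]:
    """Lazily drop links whose URL scheme is not in the allow-list."""
    for link in links:
        if urlparse(link.url).scheme in ARCHIVABLE_SCHEMES:
            yield link


def sorted_links_sketch(links: Iterable[Link]) -> List[Link]:
    """Sort by (timestamp, url) so the output order is deterministic."""
    return sorted(links, key=lambda link: (float(link.timestamp), link.url))


def fix_duplicate_links_sketch(links: Iterable[Link]) -> Iterator[Link]:
    """Keep only the first link seen for each URL (simplified dedupe)."""
    seen = set()
    for link in links:
        if link.url not in seen:
            seen.add(link.url)
            yield link


def validate_links_sketch(links: Iterable[Link]) -> List[Link]:
    """Chain the three steps in the same order as validate_links."""
    links = archivable_links_sketch(links)
    links = sorted_links_sketch(links)
    links = fix_duplicate_links_sketch(links)
    return list(links)
```

Because sorting runs before deduplication in this sketch, the link that survives for a repeated URL is the one with the earliest timestamp.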

Callers 1

parse_links_from_source · Function · 0.85

Calls 5

end · Method · 0.95
TimedProgress · Class · 0.85
archivable_links · Function · 0.85
sorted_links · Function · 0.85
fix_duplicate_links · Function · 0.85

Tested by

no test coverage detected