hub / github.com/sqlmapproject/sqlmap / parseSitemap

Function parseSitemap

lib/parse/sitemap.py:20–56
(url, retVal=None)

Source from the content-addressed store, hash-verified

abortedFlag = None

def parseSitemap(url, retVal=None):
    global abortedFlag

    if retVal is not None:
        logger.debug("parsing sitemap '%s'" % url)

    try:
        if retVal is None:
            abortedFlag = False
            retVal = OrderedSet()

        try:
            content = Request.getPage(url=url, raise404=True)[0] if not abortedFlag else ""
        except _http_client.InvalidURL:
            errMsg = "invalid URL given for sitemap ('%s')" % url
            raise SqlmapSyntaxException(errMsg)

        for match in re.finditer(r"<loc>\s*([^<]+)", content or ""):
            if abortedFlag:
                break
            url = match.group(1).strip()
            if url.endswith(".xml") and "sitemap" in url.lower():
                if kb.followSitemapRecursion is None:
                    message = "sitemap recursion detected. Do you want to follow? [y/N] "
                    kb.followSitemapRecursion = readInput(message, default='N', boolean=True)
                if kb.followSitemapRecursion:
                    parseSitemap(url, retVal)
            else:
                retVal.add(url)

    except KeyboardInterrupt:
        abortedFlag = True
        warnMsg = "user aborted during sitemap parsing. sqlmap "
        warnMsg += "will use partial list"
        logger.warning(warnMsg)

    return retVal
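
The extraction and recursion logic in the listing can be tried in isolation. The following is a minimal standalone sketch, not sqlmap code: it reuses the same `<loc>` regular expression and the same ".xml plus 'sitemap'" heuristic, but `collect_urls`, `looks_like_nested_sitemap`, the `fetch` callable, and the `PAGES` dictionary are hypothetical names introduced here for illustration. A plain list with a membership check stands in for `OrderedSet`, canned strings stand in for `Request.getPage`, and a `follow` argument stands in for the interactive `kb.followSitemapRecursion` prompt. The abort handling is left out.

```python
import re

# Same pattern as in the listing: capture the text that follows each <loc> tag.
LOC_PATTERN = re.compile(r"<loc>\s*([^<]+)")


def looks_like_nested_sitemap(url):
    """Mirror the listing's heuristic for spotting a nested sitemap."""
    return url.endswith(".xml") and "sitemap" in url.lower()


def collect_urls(url, fetch, follow=True, found=None):
    """Collect page URLs from a sitemap, following nested sitemaps if asked.

    `fetch` is a hypothetical callable (url -> str or None) that stands in
    for the HTTP request made in the original function.
    """
    if found is None:
        found = []  # assumption: an ordered list with a membership check replaces OrderedSet
    for match in LOC_PATTERN.finditer(fetch(url) or ""):
        loc = match.group(1).strip()
        if looks_like_nested_sitemap(loc):
            if follow:
                # Recurse with the shared accumulator, as the listing does with retVal.
                collect_urls(loc, fetch, follow, found)
        elif loc not in found:
            found.append(loc)
    return found


# Canned documents used instead of real network responses.
PAGES = {
    "http://example.com/sitemap.xml": """
        <sitemapindex>
          <sitemap><loc> http://example.com/sitemap-posts.xml </loc></sitemap>
          <url><loc>http://example.com/about</loc></url>
        </sitemapindex>""",
    "http://example.com/sitemap-posts.xml": """
        <urlset>
          <url><loc>http://example.com/post/1</loc></url>
          <url><loc>http://example.com/about</loc></url>
        </urlset>""",
}

urls = collect_urls("http://example.com/sitemap.xml", PAGES.get)
print(urls)
```

Passing the accumulator down through the recursive call is what lets nested sitemaps contribute to a single de-duplicated result. This sketch has no guard against a sitemap that references itself, so a real crawler would likely want a set of already visited sitemap URLs.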

Callers 1

crawlFunction · 0.90

Calls 6

addMethod · 0.95
OrderedSetClass · 0.90
readInputFunction · 0.90
debugMethod · 0.80
getPageMethod · 0.80

Tested by

no test coverage detected
