hub / github.com/wkentaro/gdown / parse_url

Function parse_url

gdown/parse_url.py:10–46

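Parse a URL, with special handling for Google Drive links, and return a (file_id, is_download_link) tuple.
file_id: ID of the file on Google Drive, or None if the URL is not a Google Drive URL or no ID can be extracted.
is_download_link: True if the URL path ends with "/uc", the form used by Google Drive download links.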

(url: str)
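
As a rough illustration of how the two return values come out of a query-style link, the sketch below uses only urllib.parse from the standard library. The helper name id_from_query and the sample URLs are hypothetical, FILE_ID is a placeholder, and the sketch leaves out the Google Drive host check that the real function performs first.

```python
import urllib.parse


def id_from_query(url):
    """Hypothetical helper: only the query-string branch of parse_url."""
    parsed = urllib.parse.urlparse(url)
    query = urllib.parse.parse_qs(parsed.query)  # maps each key to a list of values
    is_download_link = parsed.path.endswith("/uc")

    file_id = None
    # An ID is accepted only when "id" appears exactly once in the query.
    if "id" in query and len(query["id"]) == 1:
        file_id = query["id"][0]
    return file_id, is_download_link


print(id_from_query("https://drive.google.com/uc?id=FILE_ID"))    # ('FILE_ID', True)
print(id_from_query("https://drive.google.com/open?id=FILE_ID"))  # ('FILE_ID', False)
print(id_from_query("https://drive.google.com/uc?id=A&id=B"))     # (None, True)
```

Because parse_qs collects repeated keys into a list, a URL that repeats id yields no file ID rather than an arbitrary one.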

Source from the content-addressed store, hash-verified

def parse_url(url: str) -> tuple[str | None, bool]:
    """Parse URLs especially for Google Drive links.

    file_id: ID of file on Google Drive.
    is_download_link: Flag if it is download link of Google Drive.
    """
    parsed = urllib.parse.urlparse(url)
    query = urllib.parse.parse_qs(parsed.query)
    is_gdrive = is_google_drive_url(url=url)
    is_download_link = parsed.path.endswith("/uc")

    if not is_gdrive:
        return None, is_download_link

    file_id = None
    if "id" in query:
        file_ids = query["id"]
        if len(file_ids) == 1:
            file_id = file_ids[0]
    else:
        patterns = [
            r"^/file/d/(.*?)/(edit|view)$",
            r"^/file/u/[0-9]+/d/(.*?)/(edit|view)$",
            r"^/document/d/(.*?)/(edit|htmlview|view)$",
            r"^/document/u/[0-9]+/d/(.*?)/(edit|htmlview|view)$",
            r"^/presentation/d/(.*?)/(edit|htmlview|view)$",
            r"^/presentation/u/[0-9]+/d/(.*?)/(edit|htmlview|view)$",
            r"^/spreadsheets/d/(.*?)/(edit|htmlview|view)$",
            r"^/spreadsheets/u/[0-9]+/d/(.*?)/(edit|htmlview|view)$",
        ]
        for pattern in patterns:
            match = re.match(pattern, parsed.path)
            if match:
                file_id = match.groups()[0]
                break

    return file_id, is_download_link
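
The path-pattern branch can be tried in isolation. The sketch below reuses three of the patterns from the listing above; the helper name id_from_path and the sample URLs are hypothetical, and FILE_ID and DOC_ID are placeholders rather than real IDs.

```python
import re
import urllib.parse

# Three of the path patterns from the parse_url listing.
PATTERNS = [
    r"^/file/d/(.*?)/(edit|view)$",
    r"^/file/u/[0-9]+/d/(.*?)/(edit|view)$",
    r"^/document/d/(.*?)/(edit|htmlview|view)$",
]


def id_from_path(url):
    """Hypothetical helper: mirror the pattern loop in parse_url."""
    path = urllib.parse.urlparse(url).path  # the query string is not part of the path
    for pattern in PATTERNS:
        match = re.match(pattern, path)
        if match:
            return match.groups()[0]  # first capture group holds the ID
    return None


print(id_from_path("https://drive.google.com/file/d/FILE_ID/view?usp=sharing"))  # FILE_ID
print(id_from_path("https://drive.google.com/file/u/0/d/FILE_ID/edit"))          # FILE_ID
print(id_from_path("https://docs.google.com/document/d/DOC_ID/htmlview"))        # DOC_ID
print(id_from_path("https://drive.google.com/file/d/FILE_ID/preview"))           # None
```

Each pattern is anchored to a path that ends in /edit or /view (plus /htmlview for documents, presentations, and spreadsheets), so a link ending in /preview yields None in this sketch.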

Callers 3

download · Function · 0.90
test_parse_url · Function · 0.90

Calls 1

is_google_drive_url · Function · 0.85
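
The body of is_google_drive_url is not shown on this page. Purely as an assumption, the sketch below substitutes a hostname check to show how the early return in parse_url behaves: a non-Drive URL gets no file ID, while the download flag is still computed from the path alone. The stand-in name, the host list, the gate helper, and the sample URLs are all hypothetical.

```python
import urllib.parse

# ASSUMPTION: hosts treated as Google Drive; the real check may differ.
ASSUMED_DRIVE_HOSTS = ("drive.google.com", "docs.google.com")


def is_google_drive_url_stand_in(url):
    """Hypothetical stand-in for is_google_drive_url."""
    return urllib.parse.urlparse(url).hostname in ASSUMED_DRIVE_HOSTS


def gate(url):
    """Mirror the first lines of parse_url up to the early return."""
    parsed = urllib.parse.urlparse(url)
    is_download_link = parsed.path.endswith("/uc")
    if not is_google_drive_url_stand_in(url):
        return None, is_download_link  # early return: no ID for non-Drive URLs
    return "continue", is_download_link  # placeholder for the ID-extraction step


print(gate("https://example.com/uc?id=FILE_ID"))       # (None, True)
print(gate("https://drive.google.com/uc?id=FILE_ID"))  # ('continue', True)
```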

Tested by 2

test_parse_url · Function · 0.72
