Function parse_url

gdown/parse_url.py:10–46

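Parses a URL, with special handling for Google Drive links, and returns a pair (file_id, is_download_link). file_id is the ID of the file on Google Drive, or None when the URL is not a Google Drive URL or no ID can be extracted. is_download_link is a flag indicating whether the URL is a Google Drive download link (its path ends with "/uc").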

(url: str)
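To illustrate the two return values, here is a minimal sketch that mirrors only the query-string half of the logic, using the standard library. The helper name `id_and_flag_from_query` and the sample URLs are hypothetical, and the sketch deliberately skips the Google Drive host check and the path patterns, so it is not a substitute for `parse_url`.

```python
import urllib.parse
from typing import Optional, Tuple


def id_and_flag_from_query(url: str) -> Tuple[Optional[str], bool]:
    """Hypothetical helper: mirrors only the query-string branch of parse_url."""
    parsed = urllib.parse.urlparse(url)
    # parse_qs maps each parameter name to a list of values.
    query = urllib.parse.parse_qs(parsed.query)
    # The download flag depends only on the path ending in "/uc".
    is_download_link = parsed.path.endswith("/uc")

    file_ids = query.get("id", [])
    # Accept the ID only when exactly one `id` parameter is present.
    file_id = file_ids[0] if len(file_ids) == 1 else None
    return file_id, is_download_link


print(id_and_flag_from_query("https://drive.google.com/uc?id=FILE_ID"))    # ('FILE_ID', True)
print(id_and_flag_from_query("https://drive.google.com/open?id=FILE_ID"))  # ('FILE_ID', False)
print(id_and_flag_from_query("https://drive.google.com/uc?id=A&id=B"))     # (None, True)
```

Because `parse_qs` always returns lists, a URL that repeats the `id` parameter is ambiguous, and the ID is left as None rather than guessed.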


def parse_url(url: str) -> tuple[str | None, bool]:
    """Parse URLs especially for Google Drive links.

    file_id: ID of file on Google Drive.
    is_download_link: Flag if it is download link of Google Drive.
    """
    parsed = urllib.parse.urlparse(url)
    query = urllib.parse.parse_qs(parsed.query)
    is_gdrive = is_google_drive_url(url=url)
    is_download_link = parsed.path.endswith("/uc")

    if not is_gdrive:
        return None, is_download_link

    file_id = None
    if "id" in query:
        file_ids = query["id"]
        if len(file_ids) == 1:
            file_id = file_ids[0]
    else:
        patterns = [
            r"^/file/d/(.*?)/(edit|view)$",
            r"^/file/u/[0-9]+/d/(.*?)/(edit|view)$",
            r"^/document/d/(.*?)/(edit|htmlview|view)$",
            r"^/document/u/[0-9]+/d/(.*?)/(edit|htmlview|view)$",
            r"^/presentation/d/(.*?)/(edit|htmlview|view)$",
            r"^/presentation/u/[0-9]+/d/(.*?)/(edit|htmlview|view)$",
            r"^/spreadsheets/d/(.*?)/(edit|htmlview|view)$",
            r"^/spreadsheets/u/[0-9]+/d/(.*?)/(edit|htmlview|view)$",
        ]
        for pattern in patterns:
            match = re.match(pattern, parsed.path)
            if match:
                file_id = match.groups()[0]
                break

    return file_id, is_download_link
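When the URL has no `id` query parameter, the ID is taken from the first capture group of the first path pattern that matches. The sketch below shows that fallback in isolation, using two of the eight patterns from the listing. The helper name `id_from_path` and the sample paths are hypothetical.

```python
import re
from typing import Optional

# Two of the eight path patterns from the listing, copied verbatim.
PATTERNS = [
    r"^/file/d/(.*?)/(edit|view)$",
    r"^/document/u/[0-9]+/d/(.*?)/(edit|htmlview|view)$",
]


def id_from_path(path: str) -> Optional[str]:
    """Hypothetical helper: first capture group of the first matching pattern."""
    for pattern in PATTERNS:
        match = re.match(pattern, path)
        if match:
            return match.group(1)
    # No pattern matched, so no ID can be extracted from this path.
    return None


print(id_from_path("/file/d/FILE_ID/view"))         # FILE_ID
print(id_from_path("/document/u/0/d/DOC_ID/edit"))  # DOC_ID
print(id_from_path("/file/d/FILE_ID/preview"))      # None
```

The patterns are anchored with `^` and `$`, so a path with an unlisted final segment such as `/preview` yields no ID instead of a partial match.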

download · Function · 0.90

test_parse_url_non_gdrive · Function · 0.90

test_parse_url · Function · 0.90

is_google_drive_url · Function · 0.85

test_parse_url_non_gdrive · Function · 0.72

test_parse_url · Function · 0.72
