Function normalize_source_name

scripts/clean.py:193–228

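Cleans a book-source name. The goal is to strip noise such as decorations, author signatures, version numbers, and parenthetical notes without washing the name empty.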
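The "strip noise but never empty the name" goal can be sketched as a guarded substitution: apply a pattern, and keep the result only if it is still a usable name. The sketch below is an illustration under stated assumptions. The bodies of `is_usable_name` and `apply_if_usable` and the `TRAILING_VERSION` regex are hypothetical stand-ins, not the definitions in scripts/clean.py.

```python
import re

# Hypothetical stand-in: a trailing version tag such as "v2" or "V1.3".
TRAILING_VERSION = re.compile(r"\s*[vV]\d+(?:\.\d+)*$")


def is_usable_name(text: str) -> bool:
    """Assumed rule: usable means at least one letter, digit, or CJK character."""
    return any(ch.isalnum() for ch in text)


def apply_if_usable(text: str, pattern: re.Pattern) -> str:
    """Strip the pattern, but return the input unchanged if the result is unusable."""
    candidate = pattern.sub("", text).strip()
    return candidate if is_usable_name(candidate) else text


print(apply_if_usable("笔趣阁 v2", TRAILING_VERSION))  # 笔趣阁
print(apply_if_usable("v2", TRAILING_VERSION))         # v2 (stripping would leave nothing)
```

The second call shows why the guard matters: a name that consists only of "noise" is left alone rather than reduced to an empty string.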

(name: str)

Source from the content-addressed store, hash-verified

import re  # needed for re.split below


def normalize_source_name(name: str) -> str:
    """
    Clean a book-source name.

    The goal is to strip noise such as decorations, author signatures,
    version numbers, and parenthetical notes without washing the name empty.
    """
    original = clean_spaces(name)
    if not original:
        return ""

    # Keep a non-empty fallback in case the later steps strip too much.
    text = strip_decorations(original)
    fallback = text or original

    # Apply the noise patterns in order; apply_if_usable guards each step.
    for pattern in (
        LEADING_SINGLE_LETTER,
        TRAILING_AUTHOR_TAG,
        TRAILING_DOMAIN,
        TRAILING_VERSION,
        TRAILING_CN_NUMBER,
    ):
        text = apply_if_usable(text, pattern)

    # For "A/B" style names, keep the first alias if it is usable on its own.
    # The split also handles "｜" and "|", but it only runs when "/" is present.
    if '/' in text:
        aliases = [clean_spaces(part) for part in re.split(r'[/｜\|]', text) if clean_spaces(part)]
        if len(aliases) > 1 and is_usable_name(aliases[0]):
            text = aliases[0]

    # Strip each known suffix repeatedly, stopping before the name becomes unusable.
    for suffix in DIRECT_SUFFIXES:
        while text.endswith(suffix):
            candidate = clean_spaces(text[:-len(suffix)])
            if not is_usable_name(candidate):
                break
            text = candidate

    # Drop trailing symbols; fall back to the earlier value if nothing usable remains.
    text = clean_spaces(TRAILING_SYMBOLS.sub("", text))
    return text if is_usable_name(text) else fallback
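The alias-splitting and suffix-stripping steps of the function can be tried in isolation. This is a minimal sketch, assuming simple stand-ins for the helpers: the bodies of `clean_spaces` and `is_usable_name`, the contents of `DIRECT_SUFFIXES`, and the wrapper names `first_alias` and `strip_suffixes` are all hypothetical and may differ from what scripts/clean.py actually defines.

```python
import re

# Hypothetical stand-ins; the real definitions live elsewhere in scripts/clean.py.
DIRECT_SUFFIXES = ("书源", "精品")


def clean_spaces(text: str) -> str:
    """Collapse runs of whitespace and trim both ends (assumed behaviour)."""
    return re.sub(r"\s+", " ", text).strip()


def is_usable_name(text: str) -> bool:
    """Assumed rule: usable means at least two alphanumeric or CJK characters."""
    return sum(ch.isalnum() for ch in text) >= 2


def first_alias(text: str) -> str:
    """Keep the first alias of an 'A/B' style name when it is usable on its own."""
    if "/" in text:
        aliases = [clean_spaces(p) for p in re.split(r"[/｜\|]", text) if clean_spaces(p)]
        if len(aliases) > 1 and is_usable_name(aliases[0]):
            return aliases[0]
    return text


def strip_suffixes(text: str) -> str:
    """Strip each suffix repeatedly, never going below a usable name."""
    for suffix in DIRECT_SUFFIXES:
        while text.endswith(suffix):
            candidate = clean_spaces(text[: -len(suffix)])
            if not is_usable_name(candidate):
                break
            text = candidate
    return text


print(first_alias("笔趣阁/biquge"))    # 笔趣阁
print(first_alias("笔趣阁｜biquge"))   # unchanged: the split only runs when "/" is present
print(strip_suffixes("笔趣阁书源"))    # 笔趣阁
print(strip_suffixes("书源"))          # 书源 (stripping would leave nothing usable)
```

Because the suffix tuple is walked once in order, a name ending in an earlier suffix that is hidden behind a later one (for example "笔趣阁书源精品" with the tuple above) loses only the outermost suffix.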

__init__ · Method · 0.90

_strip_ascii_noise · Method · 0.90

canonicalize_name · Method · 0.90

test_normalize_source_name_keeps_core_name · Method · 0.90

test_normalize_source_name_never_returns_empty_for_noisy_but_valid_names · Method · 0.90

clean_source · Function · 0.85

clean_spaces · Function · 0.85

strip_decorations · Function · 0.85

apply_if_usable · Function · 0.85

is_usable_name · Function · 0.85

test_normalize_source_name_keeps_core_name · Method · 0.72

test_normalize_source_name_never_returns_empty_for_noisy_but_valid_names · Method · 0.72