hub / github.com/k2-fsa/OmniVoice / read_test_list

Function read_test_list

omnivoice/utils/data_utils.py:29–68

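Read a JSONL test list file. Each line should be a JSON object. Only ``id`` and ``text`` are required; all other fields (``ref_audio``, ``ref_text``, ``instruct``, ``language_id``, ``language_name``, ``duration``, ``speed``) are optional and default to ``None``. Note: ``language_name`` is only used by evaluation scripts (under ``omnivoice/eval/``) for grouping and reporting results; the model itself only consumes ``language_id``.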
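As an illustration of the file format described above, the following sketch builds a two-line JSONL string and parses it back line by line. The record values (IDs, texts, the audio path, and the ``"en"`` language code) are hypothetical and chosen only for the example; they do not come from the repository.

```python
import json

# Hypothetical records: the first fills in several optional fields,
# the second carries only the two required ones ("id" and "text").
records = [
    {
        "id": "utt_001",
        "text": "Hello world.",
        "ref_audio": "ref/utt_001.wav",  # hypothetical path
        "ref_text": "Reference prompt.",
        "language_id": "en",             # assumed code, for illustration only
        "language_name": "English",
    },
    {"id": "utt_002", "text": "Only the required fields."},
]

# JSONL: one JSON object per line, with no enclosing array.
jsonl_text = "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"

# Parsing mirrors the format: skip blank lines, decode each line on its own.
parsed = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
```

Fields left out of a line, such as ``speed`` in both records here, are the ones that end up as ``None`` in the dicts returned by ``read_test_list``.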

(path)

Source from the content-addressed store, hash-verified

# Imports this excerpt relies on.
import json
import logging
from pathlib import Path


def read_test_list(path):
    """Read a JSONL test list file.

    Each line should be a JSON object. Only ``id`` and ``text`` are required;
    all other fields are optional (default to ``None``):
        id, text, ref_audio, ref_text, instruct,
        language_id, language_name, duration, speed

    Note: ``language_name`` is only used by evaluation scripts (under
    ``omnivoice/eval/``) for grouping and reporting results. The model
    itself only consumes ``language_id``.

    Returns a list of dicts.
    """
    path = Path(path)
    samples = []
    with path.open("r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, 1):
            line = line.strip()
            # Skip blank lines.
            if not line:
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                # Malformed lines are logged and skipped, not raised.
                logging.warning(f"Skipping malformed JSON at line {line_no}: {line}")
                continue

            # Normalize every record to the same keys; absent fields become None.
            sample = {
                "id": obj.get("id"),
                "text": obj.get("text"),
                "ref_audio": obj.get("ref_audio"),
                "ref_text": obj.get("ref_text"),
                "language_id": obj.get("language_id"),
                "language_name": obj.get("language_name"),
                "duration": obj.get("duration"),
                "speed": obj.get("speed"),
                "instruct": obj.get("instruct"),
            }
            samples.append(sample)
    return samples
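The docstring calls ``id`` and ``text`` required, but the function reads them with ``obj.get(...)``, so a record missing either one is returned with ``None`` in that slot and no error is raised. A caller that wants to catch such records could add a check along the following lines. This is a minimal sketch: ``REQUIRED_FIELDS`` and ``find_incomplete_samples`` are hypothetical names introduced here, not part of the repository.

```python
# Hypothetical helper, not part of omnivoice: flags samples whose
# required fields came back as None from read_test_list.
REQUIRED_FIELDS = ("id", "text")


def find_incomplete_samples(samples):
    """Return (index, missing_fields) pairs for samples lacking a required field."""
    problems = []
    for index, sample in enumerate(samples):
        missing = [name for name in REQUIRED_FIELDS if sample.get(name) is None]
        if missing:
            problems.append((index, missing))
    return problems


# Dicts shaped like read_test_list output (trimmed to a few keys for brevity).
samples = [
    {"id": "utt_001", "text": "Hello world.", "speed": None},
    {"id": None, "text": "No identifier."},
    {"id": "utt_003", "text": None},
]
problems = find_incomplete_samples(samples)
```

Reporting the offending indices, instead of raising on the first one, matches the lenient style of the loader, which also logs and skips malformed lines.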

Callers 8

main · Function · 0.90
main · Function · 0.90
main · Function · 0.90
main · Function · 0.90
main · Function · 0.90
main · Function · 0.90
main · Function · 0.90
main · Function · 0.90

Calls

no outgoing calls

Tested by

no test coverage detected