MCPcopy
hub / github.com/travisvn/openai-edge-tts / prepare_tts_input_with_context

Function prepare_tts_input_with_context

app/handle_text.py:4–62  ·  view source on GitHub ↗

Prepares text for a TTS API by cleaning Markdown and adding minimal contextual hints for certain Markdown elements like headers. Preserves paragraph separation. Args: text (str): The raw text containing Markdown or other formatting. Returns: str: Cleaned text with

(text: str)

Source from the content-addressed store, hash-verified

2import emoji
3
4def prepare_tts_input_with_context(text: str) -> str:
5 """
6 Prepares text for a TTS API by cleaning Markdown and adding minimal contextual hints
7 for certain Markdown elements like headers. Preserves paragraph separation.
8
9 Args:
10 text (str): The raw text containing Markdown or other formatting.
11
12 Returns:
13 str: Cleaned text with contextual hints suitable for TTS input.
14 """
15
16 # Remove emojis
17 text = emoji.replace_emoji(text, replace='')
18
19 # Add context for headers
20 def header_replacer(match):
21 level = len(match.group(1)) # Number of '#' symbols
22 header_text = match.group(2).strip()
23 if level == 1:
24 return f"Title — {header_text}\n"
25 elif level == 2:
26 return f"Section — {header_text}\n"
27 else:
28 return f"Subsection — {header_text}\n"
29
30 text = re.sub(r"^(#{1,6})\s+(.*)", header_replacer, text, flags=re.MULTILINE)
31
32 # Announce links (currently commented out for potential future use)
33 # text = re.sub(r"\[([^\]]+)\]\((https?:\/\/[^\)]+)\)", r"\1 (link: \2)", text)
34
35 # Remove links while keeping the link text
36 text = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", text)
37
38 # Describe inline code
39 text = re.sub(r"`([^`]+)`", r"code snippet: \1", text)
40
41 # Remove bold/italic symbols but keep the content
42 text = re.sub(r"(\*\*|__|\*|_)", '', text)
43
44 # Remove code blocks (multi-line) with a description
45 text = re.sub(r"```([\s\S]+?)```", r"(code block omitted)", text)
46
47 # Remove image syntax but add alt text if available
48 text = re.sub(r"!\[([^\]]*)\]\([^\)]+\)", r"Image: \1", text)
49
50 # Remove HTML tags
51 text = re.sub(r"</?[^>]+(>|$)", '', text)
52
53 # Normalize line breaks
54 text = re.sub(r"\n{2,}", '\n\n', text) # Ensure consistent paragraph separation
55
56 # Replace multiple spaces within lines
57 text = re.sub(r" {2,}", ' ', text)
58
59 # Trim leading and trailing whitespace from the whole text
60 text = text.strip()
61

Callers 3

text_to_speechFunction · 0.90
elevenlabs_ttsFunction · 0.90
azure_ttsFunction · 0.90

Calls

no outgoing calls

Tested by

no test coverage detected