hub / github.com/chinesehuazhou/python-weekly / parse_md

Function parse_md

resources/weekly_workflow.py:213–235

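Parse the content of a markdown file, extracting the level-2 headings and the list items under each. :param file_content: markdown file content :return: dict whose keys are the level-2 headings and whose values are the list items under each heading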

(file_content)

Source from the content-addressed store, hash-verified

import re  # needed for the re.findall calls below


def parse_md(file_content):
    """
    Parse markdown file content, extracting the level-2 headings and the list items under each.
    :param file_content: markdown file content
    :return: dict whose keys are level-2 headings and whose values are the list items under them
    """
    # Extract all level-2 headings
    titles = re.findall(r'## (.*?)\n', file_content)
    # Extract level-2 headings and list items (the linked form, e.g. "1、[text](url)")
    sub_titles = re.findall(r'## (.*?)\n|\d+、\[(.*?)\]\(.*?\)', file_content)

    # Initialize the result dict
    parsed_content = {title: [] for title in titles}

    # Walk the matches, appending each list item under its heading
    current_title = None
    for title, sub_title in sub_titles:
        if title:  # found a new level-2 heading
            current_title = title
        elif current_title is not None and sub_title:  # found a list item
            parsed_content[current_title].append(sub_title.strip())

    return parsed_content
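A small usage sketch may make the return shape concrete. The function body here is a condensed copy of the listing above so that the example runs on its own; the sample markdown (headings, link texts, and `example.com` URLs) is invented for illustration and is not taken from the repository.

```python
import re


def parse_md(file_content):
    # Condensed copy of the listed function, same patterns and control flow.
    titles = re.findall(r'## (.*?)\n', file_content)
    matches = re.findall(r'## (.*?)\n|\d+、\[(.*?)\]\(.*?\)', file_content)
    parsed = {title: [] for title in titles}
    current = None
    for title, item in matches:
        if title:
            current = title
        elif current is not None and item:
            parsed[current].append(item.strip())
    return parsed


# Hypothetical input: two level-2 sections with numbered, linked items.
sample = (
    "# Weekly\n"
    "\n"
    "## Articles\n"
    "\n"
    "1、[First post](https://example.com/a)\n"
    "\n"
    "Some commentary that is not a list item.\n"
    "\n"
    "2、[Second post](https://example.com/b)\n"
    "\n"
    "## Projects\n"
    "\n"
    "1、[tool: a demo](https://example.com/c)\n"
)

result = parse_md(sample)
print(result)
# {'Articles': ['First post', 'Second post'], 'Projects': ['tool: a demo']}
```

Only the link text is kept. The URL and any prose between items are discarded, and the level-1 heading is ignored because it does not contain `## `.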

Callers 1

read_md · Function · 0.70

Calls

no outgoing calls

Tested by

no test coverage detected
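With no tests detected, a few edge cases of the two patterns seem worth pinning down. The sketch below exercises the patterns directly with `re.findall`. `STRICT_HEADING` is a hypothetical alternative that is not in the source, shown only to contrast with the unanchored pattern.

```python
import re

HEADING = r'## (.*?)\n'                        # pattern from the listing
COMBINED = r'## (.*?)\n|\d+、\[(.*?)\]\(.*?\)'   # pattern from the listing

# A level-3 heading contains the substring "## ", so it is captured as well.
deeper = re.findall(HEADING, "### Deeper\n")

# A heading on the final line without a trailing newline is not captured.
unterminated = re.findall(HEADING, "## Last")

# With two groups joined by alternation, findall yields 2-tuples in which
# the group that did not participate is an empty string.
pairs = re.findall(COMBINED, "## News\n1、[Item](https://example.com)\n")

# Hypothetical stricter variant (an assumption, not the author's code):
# anchor the heading to the start and end of a line.
STRICT_HEADING = re.compile(r'^## (.*)$', re.MULTILINE)
strict = STRICT_HEADING.findall("### Deeper\n## Last")

print(deeper)        # ['Deeper']
print(unterminated)  # []
print(pairs)         # [('News', ''), ('', 'Item')]
print(strict)        # ['Last']
```

The empty-string slots in `pairs` are what the `if title` / `elif ... and sub_title` branches in `parse_md` rely on to tell headings from list items.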