hub / github.com/IPADS-SAI/MobiAgent / get_word_list

Method get_word_list

utils/advanced_ocr.py:289–318

Get a list of words from an image.
Args: image_path: path to the image file.
Returns: List[str]: list of words.

(self, image_path: str)

Source from the content-addressed store, hash-verified

        return None

    def get_word_list(self, image_path: str) -> List[str]:
        """
        Get a list of words from an image.

        Args:
            image_path: Path to the image file.

        Returns:
            List[str]: List of words.
        """
        # extract_text_from_image returns a primary text and a backup text.
        text, backup_text = self.extract_text_from_image(image_path)
        words = []

        if text:
            processed = self.process_text(text)
            # Collect the individual words plus the cleaned and space-free variants.
            words.extend(processed.words)
            if processed.cleaned:
                words.append(processed.cleaned)
            if processed.no_spaces:
                words.append(processed.no_spaces)

        # Process the backup text only when it differs from the primary text.
        if backup_text and backup_text != text:
            processed_backup = self.process_text(backup_text)
            words.extend(processed_backup.words)
            if processed_backup.cleaned:
                words.append(processed_backup.cleaned)
            if processed_backup.no_spaces:
                words.append(processed_backup.no_spaces)

        # Deduplicate; set() does not preserve insertion order.
        return list(set(words))

    def extract_xml_text(self, xml_content: str) -> str:
        """Extract visible text from XML content."""
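The aggregation performed by get_word_list can be tried without any OCR backend. The following is a minimal sketch, not the project's implementation: `FakeOCR` and `ProcessedText` are hypothetical stand-ins, the `(text, backup_text)` return shape of `extract_text_from_image` is taken from how the method unpacks it, and the whitespace-only `process_text` is an assumption made purely for illustration. Only the attribute names `words`, `cleaned`, and `no_spaces` come from the source.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ProcessedText:
    # Field names mirror the attributes read by get_word_list;
    # the class itself is a hypothetical stand-in.
    words: List[str]
    cleaned: str
    no_spaces: str


class FakeOCR:
    """Hypothetical stand-in for the real OCR class; no image is read."""

    def __init__(self, text: Optional[str], backup_text: Optional[str]):
        self._text = text
        self._backup_text = backup_text

    def extract_text_from_image(self, image_path: str) -> Tuple[Optional[str], Optional[str]]:
        # Assumed return shape: (primary text, backup text).
        return self._text, self._backup_text

    def process_text(self, text: str) -> ProcessedText:
        # Assumed behaviour: plain whitespace tokenisation.
        words = text.split()
        return ProcessedText(words=words, cleaned=" ".join(words), no_spaces="".join(words))

    def get_word_list(self, image_path: str) -> List[str]:
        text, backup_text = self.extract_text_from_image(image_path)
        words: List[str] = []
        # The backup text is skipped when it equals the primary text.
        for candidate in (text, backup_text if backup_text != text else None):
            if not candidate:
                continue
            processed = self.process_text(candidate)
            words.extend(processed.words)
            if processed.cleaned:
                words.append(processed.cleaned)
            if processed.no_spaces:
                words.append(processed.no_spaces)
        return list(set(words))


ocr = FakeOCR("Sign  in", "Sign in now")
# Sort before printing because list(set(...)) has no guaranteed order.
result = sorted(ocr.get_word_list("screenshot.png"))
print(result)
# ['Sign', 'Sign in', 'Sign in now', 'Signin', 'Signinnow', 'in', 'now']
```

Because the result passes through set(), callers that need a stable order may want to sort the list, as above, or deduplicate with `list(dict.fromkeys(words))`, which keeps first-seen order.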

Callers

nothing calls this directly

Calls 2

process_text  Method · 0.95

Tested by

no test coverage detected