hub / github.com/nltk/nltk_data / get_id

Function get_id

tools/build_collections.py:34–42

Given a full path, extract only the filename (i.e. the nltk_data id).

:param xml_path: A full path, e.g. "./packages/corpora/abc.xml"
:type xml_path: str
:return: The filename, without the extension, e.g. "abc"
:rtype: str

get_id(xml_path: str) -> str

Source from the content-addressed store, hash-verified

32          f.write(ElementTree.tostring(et).decode("utf8"))
33
34  def get_id(xml_path: str) -> str:
35      """Given a full path, extract only the filename (i.e. the nltk_data id)
36
37      :param xml_path: A full path, e.g. "./packages/corpora/abc.xml"
38      :type xml_path: str
39      :return: The filename, without the extension, e.g. "abc"
40      :rtype: str
41      """
42      return os.path.splitext(os.path.basename(xml_path))[0]
43
44  # Write `collection/all-corpora.xml` based on all files under /packages/corpora
45  corpora_items = [get_id(xml_path) for xml_path in glob(f"{ROOT}/packages/corpora/*.xml")]
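
To make the behaviour of the one-line body concrete, here is a minimal sketch that repeats the function from the listing and applies it to a few paths. Only the first path comes from the docstring; the other two are hypothetical inputs chosen to show what `os.path.basename` and `os.path.splitext` do with multiple dots and with a bare name.

```python
import os


def get_id(xml_path: str) -> str:
    # Same expression as in the listing, repeated so this sketch runs on its own.
    return os.path.splitext(os.path.basename(xml_path))[0]


# The first path is the docstring example; the other two are hypothetical.
samples = [
    "./packages/corpora/abc.xml",  # directory and extension are both dropped
    "packages/corpora/a.b.xml",    # only the last extension is removed
    "abc",                         # no directory, no extension: unchanged
]

ids = [get_id(path) for path in samples]
print(ids)  # ['abc', 'a.b', 'abc']
```

Because `os.path.splitext` removes only the final suffix, a file name containing several dots keeps everything before the last one as part of its id.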

Callers 1

Calls

no outgoing calls

Tested by

no test coverage detected