hub / github.com/attardi/wikiextractor / define_template

Function define_template

wikiextractor/extract.py:1812–1859

Adds the template defined in the page parameter. See https://en.wikipedia.org/wiki/Help:Template#Noinclude.2C_includeonly.2C_and_onlyinclude

(title, page)
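The function body indexes page[0] and joins page with ''.join(page), so page appears to be a list of text lines, with a redirect (if any) on the first line. A minimal sketch of that redirect check in isolation follows; the helper name redirect_target and the empty-page guard are hypothetical additions for illustration, not part of wikiextractor.

```python
import re

# Same pattern that define_template applies to page[0], written as a raw string.
REDIRECT_RE = re.compile(r'#REDIRECT.*?\[\[([^\]]*)]]', re.IGNORECASE)


def redirect_target(page):
    """Return the redirect target of a page (a list of lines), or None.

    Hypothetical helper: the empty-page guard is an assumption added here,
    the original function indexes page[0] directly.
    """
    if not page:
        return None
    m = REDIRECT_RE.match(page[0])
    return m.group(1) if m else None


if __name__ == '__main__':
    print(redirect_target(['#redirect [[Template:Infobox person]]\n']))
    print(redirect_target(['{{{1}}} is a parameter\n']))
```

Because the pattern is anchored with re.match, only a redirect directive at the very start of the first line is recognized.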

Source from the content-addressed store, hash-verified

def define_template(title, page):
    """
    Adds a template defined in the :param page:.
    @see https://en.wikipedia.org/wiki/Help:Template#Noinclude.2C_includeonly.2C_and_onlyinclude
    """
    global templates
    global redirects

    # title = normalizeTitle(title)

    # check for redirects (raw string avoids invalid-escape warnings)
    m = re.match(r'#REDIRECT.*?\[\[([^\]]*)]]', page[0], re.IGNORECASE)
    if m:
        redirects[title] = m.group(1)  # normalizeTitle(m.group(1))
        return

    text = unescape(''.join(page))

    # We're storing template text for future inclusion, therefore,
    # remove all <noinclude> text and keep all <includeonly> text
    # (but eliminate <includeonly> tags per se).
    # However, if <onlyinclude> ... </onlyinclude> parts are present,
    # then only keep them and discard the rest of the template body.
    # This is because using <onlyinclude> on a text fragment is
    # equivalent to enclosing it in <includeonly> tags **AND**
    # enclosing all the rest of the template body in <noinclude> tags.

    # remove comments
    text = comment.sub('', text)

    # eliminate <noinclude> fragments
    text = reNoinclude.sub('', text)
    # eliminate unterminated <noinclude> elements
    text = re.sub(r'<noinclude\s*>.*$', '', text, flags=re.DOTALL)
    text = re.sub(r'<noinclude/>', '', text)

    onlyincludeAccumulator = ''
    for m in re.finditer('<onlyinclude>(.*?)</onlyinclude>', text, re.DOTALL):
        onlyincludeAccumulator += m.group(1)
    if onlyincludeAccumulator:
        text = onlyincludeAccumulator
    else:
        text = reIncludeonly.sub('', text)

    if text:
        if title in templates and templates[title] != text:
            # logging.warn is a deprecated alias of logging.warning
            logging.warning('Redefining: %s', title)
        templates[title] = text
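The inclusion-tag handling above can be tried in isolation. The sketch below mirrors the order of steps in define_template; the compiled patterns COMMENT, NOINCLUDE and INCLUDEONLY_TAGS are assumed stand-ins for the module-level comment, reNoinclude and reIncludeonly objects, whose definitions are not shown in this excerpt, and transcluded_text is a hypothetical name.

```python
import re

# Assumed stand-ins for module-level patterns not shown in the excerpt.
COMMENT = re.compile(r'<!--.*?-->', re.DOTALL)
NOINCLUDE = re.compile(r'<noinclude>.*?</noinclude>', re.DOTALL)
INCLUDEONLY_TAGS = re.compile(r'</?includeonly>')
ONLYINCLUDE = re.compile(r'<onlyinclude>(.*?)</onlyinclude>', re.DOTALL)


def transcluded_text(text):
    """Return the part of a template body that would be kept for inclusion."""
    # Drop HTML comments first, as define_template does.
    text = COMMENT.sub('', text)
    # Drop closed <noinclude> fragments, then unterminated and empty ones.
    text = NOINCLUDE.sub('', text)
    text = re.sub(r'<noinclude\s*>.*$', '', text, flags=re.DOTALL)
    text = text.replace('<noinclude/>', '')
    # If any <onlyinclude> sections exist, keep only their concatenation.
    only = ''.join(ONLYINCLUDE.findall(text))
    if only:
        return only
    # Otherwise keep everything and strip just the <includeonly> tags.
    return INCLUDEONLY_TAGS.sub('', text)


if __name__ == '__main__':
    print(transcluded_text('A<noinclude>doc</noinclude>B<includeonly>C</includeonly>'))
    print(transcluded_text('intro<onlyinclude>X</onlyinclude>mid<onlyinclude>Y</onlyinclude>'))
    print(transcluded_text('keep<!-- note --><noinclude>unterminated'))
```

Note that the <onlyinclude> check runs after <noinclude> removal, so an <onlyinclude> section nested inside a <noinclude> fragment is discarded along with it.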

Callers 1

load_templates · Function · 0.85

Calls 1

unescape · Function · 0.85

Tested by

no test coverage detected