
github.com/nltk/nltk_data @main sqlite

2 symbols · 18 edges · 2 files · 2 documented (100%)
README

Data Distribution for NLTK

This repository contains data packages (corpora, models, tokenizers, etc.) for use with NLTK.

Installation

To install data using the NLTK downloader, run:

import nltk
nltk.download()

For detailed instructions, please see the NLTK website.
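Calling nltk.download() with no arguments opens the interactive downloader. For scripted setups, a minimal sketch along the following lines may be more convenient. The helper name ensure_package is hypothetical, and "punkt" with the resource path "tokenizers/punkt" is used only as an example package; nltk.data.find and nltk.download are the NLTK calls it relies on.

```python
def ensure_package(resource_path: str, package_id: str) -> bool:
    """Return True if the resource is available, downloading it if needed.

    Hypothetical helper, not part of NLTK itself.
    """
    # Imported inside the function so this module can be loaded
    # even on machines where NLTK is not installed yet.
    import nltk

    try:
        # nltk.data.find raises LookupError when the resource is missing.
        nltk.data.find(resource_path)
        return True
    except LookupError:
        # quiet=True suppresses progress output; download() returns a bool.
        return bool(nltk.download(package_id, quiet=True))


if __name__ == "__main__":
    # Example package only; substitute the id of the package you need.
    ensure_package("tokenizers/punkt", "punkt")
```

Checking with nltk.data.find first avoids a network round trip when the package is already installed locally.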


Recent Enhancements

Note: You do not need to update index.xml when adding or modifying packages. It is automatically rebuilt after changes are merged.

Licensing Transparency (PR #242)

  • Added a top-level LICENSE (Apache License 2.0) for the repository.
  • Added LICENSE-OVERVIEW.md summarizing the licensing structure, with emphasis on the diversity of dataset licenses and the importance of reviewing individual terms.
  • Added DATASET-LICENSES.md — a comprehensive, grouped list of all data packages and their licenses, highlighting any ambiguous or unclarified licensing.
  • These changes improve transparency, support responsible use, and aid compliance for all users.
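As a rough illustration of how a grouped license list like the one described above could be derived, the sketch below groups packages by license and flags those with no license stated. It assumes an index-style XML in which each package element carries id and license attributes. The sample data, the UNCLARIFIED label, and the function name group_by_license are all invented for illustration and are not taken from this repository's tooling.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample; package ids and license strings are invented.
SAMPLE_INDEX = """\
<nltk_data>
  <packages>
    <package id="example_corpus" subdir="corpora" license="Apache-2.0"/>
    <package id="example_model" subdir="models" license="Apache-2.0"/>
    <package id="example_lexicon" subdir="corpora"/>
  </packages>
</nltk_data>
"""


def group_by_license(index_xml: str) -> dict[str, list[str]]:
    """Map each license string to the package ids that declare it."""
    groups = defaultdict(list)
    for package in ET.fromstring(index_xml).iter("package"):
        # Missing or blank license attributes are grouped under a
        # placeholder label so they stand out for manual review.
        license_name = (package.get("license") or "").strip() or "UNCLARIFIED"
        groups[license_name].append(package.get("id", ""))
    return dict(groups)


if __name__ == "__main__":
    for license_name, ids in sorted(group_by_license(SAMPLE_INDEX).items()):
        print(f"{license_name}: {', '.join(ids)}")
```

Collecting packages without a stated license under one label keeps the ambiguous cases visible instead of silently dropping them.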

Contribution Guidelines

  • Introduced a detailed CONTRIBUTING.md with step-by-step instructions for adding a new data package using Git and GitHub.
  • Please see CONTRIBUTING.md for instructions on adding datasets and making other contributions.
  • Contributors are encouraged to clarify dataset licenses and to consult the new licensing overview and dataset license table.

For licensing details, see LICENSE-OVERVIEW.md and DATASET-LICENSES.md.

Core symbols (most depended-on inside this repo)

  • write (tools/build_collections.py): called by 3
  • get_id (tools/build_collections.py): called by 2

Shape

Function: 2

Languages

Python: 100%

Modules by API surface

tools/build_collections.py: 2 symbols

For agents

$ claude mcp add nltk_data \
  -- python -m otcore.mcp_server <graph>
