github.com/microsoft/unilm @s2s-ft.v0.3 sqlite

636 symbols · 1,800 edges · 48 files · 191 documented · 30%
README

UniLM

We develop pre-trained models for natural language understanding (NLU) and generation (NLG) tasks.

The family of UniLM:

UniLM: unified pre-training for language understanding and generation

MiniLM (new): small pre-trained models for language understanding and generation

LayoutLM (new): multimodal (text + layout/format + image) pre-training for document understanding (e.g. scanned documents, PDF, etc.)

s2s-ft (new): sequence-to-sequence fine-tuning toolkit

News

Release

***** New February, 2020: UniLM v2 | MiniLM v1 | LayoutLM v1 | s2s-ft v1 release *****

***** October 1st, 2019: UniLM v1 release *****

License

This project is licensed under the license found in the LICENSE file in the root directory of this source tree. Portions of the source code are based on the transformers project.

Microsoft Open Source Code of Conduct

Contact Information

For help or issues using UniLM, please submit a GitHub issue.

For other communications related to UniLM, please contact Li Dong (lidong1@microsoft.com) or Furu Wei (fuwei@microsoft.com).

Core symbols most depended-on inside this repo

from_pretrained · called by 21 · s2s-ft/s2s_ft/modeling_decoding.py
add · called by 12 · unilm-v1/src/biunilm/loader_utils.py
convert_tokens_to_ids · called by 11 · unilm-v1/src/pytorch_pretrained_bert/tokenization.py
load · called by 9 · s2s-ft/s2s_ft/modeling_decoding.py
load · called by 8 · unilm-v1/src/pytorch_pretrained_bert/modeling.py
whitespace_tokenize · called by 7 · unilm-v1/src/pytorch_pretrained_bert/tokenization.py
convert_ids_to_tokens · called by 7 · unilm-v1/src/pytorch_pretrained_bert/tokenization.py
step · called by 7 · unilm-v1/src/pytorch_pretrained_bert/optimization.py

Shape

Method 382
Function 138
Class 116

Languages

Python 100%

Modules by API surface

unilm-v1/src/pytorch_pretrained_bert/modeling.py · 120 symbols
s2s-ft/s2s_ft/modeling_decoding.py · 111 symbols
s2s-ft/s2s_ft/modeling.py · 38 symbols
unilm-v1/src/gigaword/bs_pyrouge.py · 34 symbols
unilm-v1/src/cnndm/bs_pyrouge.py · 34 symbols
s2s-ft/evaluations/bs_pyrouge.py · 34 symbols
unilm-v1/src/pytorch_pretrained_bert/tokenization.py · 24 symbols
layoutlm/utils_classification.py · 23 symbols
unilm-v1/src/biunilm/loader_utils.py · 18 symbols
unilm-v1/src/pytorch_pretrained_bert/optimization.py · 15 symbols
layoutlm/modeling_layoutlm.py · 14 symbols
unilm-v1/src/biunilm/seq2seq_loader.py · 13 symbols

Dependencies from manifests, versioned

pillow 7.1.0 · 1×
seqeval 0.0.12 · 1×
transformers 2.2.1 · 1×
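
The pinned versions above can be compared against a local environment. The following is a minimal sketch, not part of the repository: it assumes the three pins are meant to be installed together, which this page does not state, and the helper names `requirement_lines` and `check_installed` are hypothetical. It uses only the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib import metadata

# Pins copied from the dependency list above; grouping them into one
# environment is an assumption made for illustration.
PINNED = {"pillow": "7.1.0", "seqeval": "0.0.12", "transformers": "2.2.1"}


def requirement_lines(pins):
    """Render pins as pip-style `name==version` lines, sorted by name."""
    return [f"{name}=={version}" for name, version in sorted(pins.items())]


def check_installed(pins):
    """Map each package to (pinned, installed); installed is None if absent."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (pinned, installed)
    return report


if __name__ == "__main__":
    print("\n".join(requirement_lines(PINNED)))
    for name, (pinned, installed) in check_installed(PINNED).items():
        status = "ok" if installed == pinned else f"found {installed}"
        print(f"{name}: pinned {pinned}, {status}")
```

The output of `requirement_lines` can be written to a requirements file and passed to `pip install -r`, while `check_installed` only reports mismatches and does not change the environment.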

For agents

$ claude mcp add unilm \
  -- python -m otcore.mcp_server <graph>
