github.com/ymcui/Chinese-LLaMA-Alpaca @v5.0 sqlite

109 symbols 420 edges 18 files 14 documented · 13%
README

🇨🇳 Chinese | 🌐 English | 📖 Docs | ❓ Issues | 💬 Discussions

<img src="https://github.com/ymcui/Chinese-LLaMA-Alpaca/raw/v5.0/pics/banner.png" width="700"/>









<img alt="GitHub" src="https://img.shields.io/github/license/ymcui/Chinese-LLaMA-Alpaca.svg?color=blue&style=flat-square">
<img alt="GitHub release (latest by date)" src="https://img.shields.io/github/v/release/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub top language" src="https://img.shields.io/github/languages/top/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/ymcui/Chinese-LLaMA-Alpaca">
<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.2-green"></a>

To promote open research on large models in the Chinese NLP community, this project open-sources the Chinese LLaMA model and the instruction-tuned Chinese Alpaca model. These models extend the original LLaMA vocabulary with Chinese tokens and undergo secondary pre-training on Chinese data, which improves basic Chinese semantic understanding. The Chinese Alpaca models are further fine-tuned from Chinese LLaMA on Chinese instruction data, significantly improving their ability to understand and follow instructions.

Technical Report (V2): [Cui, Yang, and Yao, 2023] Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca

Main contents of this project:

  • 🚀 Extended the original LLaMA vocabulary with Chinese tokens, significantly improving encoding/decoding efficiency for Chinese text
  • 🚀 Open-sourced the Chinese LLaMA (general-purpose) and Chinese Alpaca (instruction-tuned) models
  • 🚀 Open-sourced the pre-training and instruction fine-tuning (SFT) scripts for further tuning on users' own data
  • 🚀 Quick deployment of quantized models on the CPU/GPU of a laptop or personal PC
  • 🚀 Support for 🤗transformers, llama.cpp, text-generation-webui, LlamaChat, LangChain, privateGPT, etc.
  • Released versions: 7B (basic, Plus, Pro), 13B (basic, Plus, Pro), 33B (basic, Plus, Pro)

💡 The following image shows the 7B model in use after local deployment (animation not sped up, tested on Apple M1 Max).


Chinese-LLaMA-Alpaca-2 | Visual Chinese-LLaMA-Alpaca | Multi-modal VLE | Chinese MiniRBT | Chinese LERT | Chinese-English PERT | Chinese MacBERT | Chinese ELECTRA | Chinese XLNet | Chinese BERT | Knowledge distillation tool TextBrewer | Model pruning tool TextPruner

News

[July 19, 2023] Release v5.0: Alpaca-Pro models, which significantly improve generation quality, together with Plus-33B models.

[July 19, 2023] We are launching Chinese-LLaMA-Alpaca-2 project.

[July 10, 2023] Beta channel preview: learn about upcoming updates in advance. See Discussion.

[July 7, 2023] The Chinese-LLaMA-Alpaca family welcomes a new member: Visual Chinese-LLaMA-Alpaca model for visual question answering and chat. The 7B test version is available.

[June 30, 2023] 8K context support with llama.cpp. See Discussion. For 4K+ context support with transformers, see PR#705.

[June 16, 2023] Release v4.1: New technical report, add C-Eval inference script, add low-resource model merging script, etc.

[June 8, 2023] Release v4.0: LLaMA/Alpaca 33B versions are available. We also add privateGPT demo, C-Eval results, etc.

Content Navigation

| Chapter | Description |
| --- | --- |
| Download | Download links for Chinese LLaMA and Alpaca |
| Model Reconstruction | (Important) Explains how to merge downloaded LoRA models with the original LLaMA |
| Quick Deployment | Steps to quantize and deploy LLMs on personal computers |
| Example Results | Examples of the system output |
| Training Details | Introduces the training details of Chinese LLaMA and Alpaca |
| FAQ | Replies to some common questions |
| Limitations | Limitations of the models involved in this project |

Model Download

⚠️ User Notice (Must Read)

The official LLaMA models released by Facebook prohibit commercial use, and the official model weights have not been open-sourced (although there are many third-party download links available online). In order to comply with the relevant licenses, it is currently not possible to release the complete model weights. We appreciate your understanding. After Facebook fully opens up the model weights, this project will update its policies accordingly. What is released here are the LoRA weights, which can be seen as a "patch" for the original LLaMA model, and the complete weights can be obtained by merging the two.
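To make the "patch" idea concrete, here is a minimal sketch of how a LoRA update can be folded into a single base weight matrix. It assumes the standard LoRA formulation W' = W + (alpha / r) · B · A. The function names `matmul` and `merge_lora` and the toy matrices are illustrative only. The project's real merge script is `scripts/merge_llama_with_chinese_lora.py`, and this sketch does not attempt to reproduce it.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(base, lora_a, lora_b, alpha, rank):
    """Return base + (alpha / rank) * (lora_b @ lora_a).

    base:   d_out x d_in  (original weight, the thing being "patched")
    lora_a: rank  x d_in  (low-rank down projection)
    lora_b: d_out x rank  (low-rank up projection)
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # full-size update rebuilt from the two small factors
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Toy example: a 2x2 base weight patched by a rank-1 update.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # 1 x 2
lora_b = [[0.5], [0.25]]   # 2 x 1
merged = merge_lora(base, lora_a, lora_b, alpha=2, rank=1)
print(merged)
```

The point of the sketch is that the released LoRA files only need to store the small factors, which is why they are far smaller than the full model and are useless without the original LLaMA weights.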

Which model should I use?

The following table provides a basic comparison of the Chinese LLaMA and Alpaca models, as well as recommended usage scenarios (including, but not limited to).

💡 Plus versions are trained on more data and are highly recommended.

| Comparison Item | Chinese LLaMA | Chinese Alpaca |
| --- | --- | --- |
| Training Method | Traditional CLM (trained on general corpus) | Instruction Fine-tuning (trained on instruction data) |
| Model Type | Base model | Instruction-following model (like ChatGPT) |
| Training Data | unsupervised free text | supervised instruction data |
| Vocab size[3] | 49953 | 49954=49953+1 (pad token) |
| Input Template | Not required | Must meet template requirements[1] |
| Suitable Scenarios ✔️ | Text continuation: Given a context, let the model continue writing | 1. Instruction understanding (Q&A, writing, advice, etc.)<br/>2. Multi-turn context understanding (chat, etc.) |
| Unsuitable Scenarios ❌ | Instruction understanding, multi-turn chat, etc. | Unrestricted free text generation |
| llama.cpp | Use -p parameter to specify context | Use -ins parameter to enable instruction understanding + chat mode |
| text-generation-webui | Not suitable for chat mode | Use --cpu to run without a GPU; if not satisfied with generated content, consider modifying prompt |
| LlamaChat | Choose "LLaMA" when loading the model | Choose "Alpaca" when loading the model |
| inference_hf.py | No additional startup parameters required | Add --with_prompt parameter when launching |
| web-demo | Not applicable | Simply provide the Alpaca model location; support multi-turn conversations |
| LangChain-demo / privateGPT | Not applicable | Simply provide the Alpaca model location |
| Known Issues | If not controlled for termination, it will continue writing until reaching the output length limit.[2] | Please use Pro models to avoid short responses (in Plus series). |

[1] Templates are built in for llama.cpp, LlamaChat, inference_hf.py, web-demo, and LangChain-demo.

[2] If you encounter issues such as low-quality model responses, nonsensical answers, or failure to understand questions, please check whether you are using the correct model and startup parameters for the scenario.

[3] The Alpaca vocabulary has one more token (the pad token) than the LLaMA vocabulary. Please do not mix LLaMA and Alpaca tokenizers.
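Two rules from the comparison table tend to cause the problems described in note [2]: the tokenizer must match the model family, and Alpaca input must be wrapped in a template. The sketch below illustrates both. The vocabulary sizes come from the table. The template text is an assumption (a Stanford-Alpaca-style prompt), and its exact wording and whitespace in this project may differ. The names `family_from_vocab_size` and `build_model_input` are hypothetical helpers, not part of the repository.

```python
# Vocabulary sizes taken from the comparison table.
LLAMA_VOCAB_SIZE = 49953
ALPACA_VOCAB_SIZE = 49954  # 49953 + 1 pad token

# ASSUMPTION: a Stanford-Alpaca-style template; the project's exact wording
# and whitespace may differ.
ASSUMED_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n\n{instruction}\n\n### Response:\n\n"
)


def family_from_vocab_size(vocab_size):
    """Guess which model family a tokenizer belongs to from its vocabulary size."""
    if vocab_size == LLAMA_VOCAB_SIZE:
        return "llama"
    if vocab_size == ALPACA_VOCAB_SIZE:
        return "alpaca"
    raise ValueError(f"unexpected vocabulary size: {vocab_size}")


def build_model_input(text, family):
    """Wrap text in the instruction template for Alpaca; leave it as-is for LLaMA."""
    if family == "alpaca":
        return ASSUMED_TEMPLATE.format(instruction=text)
    if family == "llama":
        return text
    raise ValueError(f"unknown model family: {family}")
```

With the tools listed in note [1] the wrapping is done for you. A helper like this only matters when calling the model from your own code.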

Recommended Models

Below is a list of the models recommended by this project. They are generally trained on more data with better-tuned training methods and parameters, so prefer them over the others (for other models, please check Other Models). For ChatGPT-like interaction, use an Alpaca model rather than a LLaMA model. Among the Alpaca models, use the Pro versions for longer responses; if you prefer shorter responses, use the Plus series instead.

| Model | Type | Data | Required Original Model[1] | Size[2] | Download Links[3] |
| --- | --- | --- | --- | --- | --- |
| Chinese-LLaMA-Plus-7B | base model | general 120G | LLaMA-7B | 790M | [BaiduDisk] [Google Drive] |
| Chinese-LLaMA-Plus-13B | base model | general 120G | LLaMA-13B | 1.0G | [BaiduDisk] [Google Drive] |
| Chinese-LLaMA-Plus-33B 🆕 | base model | general 120G | LLaMA-33B | 1.3G[6] | [BaiduDisk] [Google Drive] |
| Chinese-Alpaca-Pro-7B 🆕 | instruction-following model | instruction 4.3M | *LLaMA-7B & LLaMA-Plus-7B* [4] | 1.1G | [BaiduDisk] [Google Drive] |
| Chinese-Alpaca-Pro-13B 🆕 | instruction-following model | instruction 4.3M | *LLaMA-13B & LLaMA-Plus-13B* [4] | 1.3G | [BaiduDisk] [Google Drive] |
| Chinese-Alpaca-Pro-33B 🆕 | instruction-following model | instruction 4.3M | *LLaMA-33B & LLaMA-Plus-33B* [4] | 2.1G | [BaiduDisk] [Google Drive] |

[1] Access to the original LLaMA model must be requested through Facebook-LLaMA, or see this PR. Due to copyright issues, this project cannot provide downloads, and we ask for your understanding.

[2] The reconstructed model is slightly larger than the original LLaMA (due to the expanded vocabulary); the 7B model is about 13G+.

[3] After downloading, be sure to check whether the SHA256 of the ZIP file is consistent; for the full valu
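The SHA256 check mentioned in note [3] can be done with Python's standard library. The following is a minimal sketch: `sha256_of_file` and `matches_expected` are illustrative helper names, and the expected digest must be taken from the values the project publishes, which are not reproduced here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large ZIP files never need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A call would look like `matches_expected("downloaded_lora.zip", "<published SHA256>")`, where the file name is a placeholder. If it returns `False`, download the file again before attempting to merge.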

Core symbols most depended-on inside this repo

clear_torch_cache · called by 4 · scripts/inference/gradio_demo.py
translate_state_dict_key · called by 2 · scripts/merge_llama_with_chinese_lora.py
unpermute · called by 2 · scripts/merge_llama_with_chinese_lora.py
translate_state_dict_key · called by 2 · scripts/merge_llama_with_chinese_lora_low_mem.py
unpermute · called by 2 · scripts/merge_llama_with_chinese_lora_low_mem.py
format_example · called by 2 · scripts/ceval/llama_evaluator.py
normalize_answer · called by 2 · scripts/ceval/evaluator.py
generate_prompt · called by 2 · scripts/inference/inference_hf.py

Shape

Function 50
Method 34
Class 22
Route 3

Languages

Python 100%

Modules by API surface

scripts/inference/gradio_demo.py · 21 symbols
scripts/training/run_clm_pt_with_peft.py · 16 symbols
scripts/openai_server_demo/openai_api_server.py · 10 symbols
scripts/ceval/evaluator.py · 10 symbols
scripts/training/run_clm_sft_with_peft.py · 9 symbols
scripts/openai_server_demo/openai_api_protocol.py · 9 symbols
scripts/openai_server_demo/patches.py · 6 symbols
scripts/inference/patches.py · 6 symbols
scripts/ceval/llama_evaluator.py · 6 symbols
scripts/merge_llama_with_chinese_lora_low_mem.py · 5 symbols
scripts/training/build_dataset.py · 4 symbols
scripts/merge_llama_with_chinese_lora.py · 4 symbols

Dependencies from manifests, versioned

sentencepiece 0.1.97 · 1×
torch 1.13.1 · 1×
transformers 4.30.0 · 1×
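A quick way to see whether a local environment matches these versions is to compare them with what is installed. The sketch below treats the listed versions as exact pins, which is an assumption, since the manifest itself is not shown here. `find_mismatches` is an illustrative helper name.

```python
from importlib import metadata

# Versions as listed above; treating them as exact pins is an assumption.
PINNED = {
    "sentencepiece": "0.1.97",
    "torch": "1.13.1",
    "transformers": "4.30.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (wanted, found)} for packages that are missing or differ."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            # Drop local build suffixes such as "+cu117" before comparing.
            found = get_version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {wanted}, found {found}")
```

An empty result means all three packages are installed at the listed versions.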

For agents

$ claude mcp add Chinese-LLaMA-Alpaca \
  -- python -m otcore.mcp_server <graph>
