<img src="https://github.com/ymcui/Chinese-LLaMA-Alpaca/raw/v5.0/pics/banner.png" width="700"/>
<img alt="GitHub" src="https://img.shields.io/github/license/ymcui/Chinese-LLaMA-Alpaca.svg?color=blue&style=flat-square">
<img alt="GitHub release (latest by date)" src="https://img.shields.io/github/v/release/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub top language" src="https://img.shields.io/github/languages/top/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/ymcui/Chinese-LLaMA-Alpaca">
<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.2-green"></a>
To promote open research on large language models in the Chinese NLP community, this project open-sources the Chinese LLaMA model and the instruction-tuned Chinese Alpaca model. These models extend the original LLaMA vocabulary with Chinese tokens and are further pre-trained on Chinese data, which improves basic Chinese semantic understanding. The Chinese Alpaca models are additionally fine-tuned on Chinese instruction data, which significantly improves their ability to understand and follow instructions.
Technical Report (V2): [Cui, Yang, and Yao, 2023] Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
Main contents of this project:
💡 The following image shows the 7B model in use after local deployment (animation not sped up, tested on Apple M1 Max).

Chinese-LLaMA-Alpaca-2 | Visual Chinese-LLaMA-Alpaca | Multi-modal VLE | Chinese MiniRBT | Chinese LERT | Chinese-English PERT | Chinese MacBERT | Chinese ELECTRA | Chinese XLNet | Chinese BERT | Knowledge distillation tool TextBrewer | Model pruning tool TextPruner
[July 19, 2023] Release v5.0: Alpaca-Pro models are released, with significantly improved generation quality, along with Plus-33B models.
[July 19, 2023] We are launching the Chinese-LLaMA-Alpaca-2 project.
[July 10, 2023] Beta channel preview: learn about upcoming updates in advance. See Discussion.
[July 7, 2023] The Chinese-LLaMA-Alpaca family welcomes a new member: Visual Chinese-LLaMA-Alpaca model for visual question answering and chat. The 7B test version is available.
[June 30, 2023] 8K context support with llama.cpp. See Discussion. For 4K+ context support with transformers, see PR#705.
[June 16, 2023] Release v4.1: New technical report, C-Eval inference script, low-resource model merging script, etc.
[June 8, 2023] Release v4.0: LLaMA/Alpaca 33B versions are available. We also added a privateGPT demo, C-Eval results, etc.
| Chapter | Description |
|---|---|
| Download | Download links for Chinese LLaMA and Alpaca |
| Model Reconstruction | (Important) Explains how to merge downloaded LoRA models with the original LLaMA |
| Quick Deployment | Steps for quantizing and deploying LLMs on personal computers |
| Example Results | Examples of the system output |
| Training Details | Introduces the training details of Chinese LLaMA and Alpaca |
| FAQ | Answers to some common questions |
| Limitations | Limitations of the models involved in this project |
The official LLaMA models released by Facebook prohibit commercial use, and the official model weights have not been open-sourced (although there are many third-party download links available online). In order to comply with the relevant licenses, it is currently not possible to release the complete model weights. We appreciate your understanding. After Facebook fully opens up the model weights, this project will update its policies accordingly. What is released here are the LoRA weights, which can be seen as a "patch" for the original LLaMA model, and the complete weights can be obtained by merging the two.
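The "patch" idea can be illustrated with a toy example. The sketch below assumes the standard LoRA formulation, in which a frozen weight matrix W is updated to W + (alpha / r) * B * A. The tiny matrices, the `merge_lora` name, and the scaling values are hypothetical and are not taken from this project's merge scripts; see the Model Reconstruction chapter for the actual procedure.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w: Matrix, lora_b: Matrix, lora_a: Matrix, alpha: float, r: int) -> Matrix:
    """Return W + (alpha / r) * (B @ A), the standard LoRA merge (assumed here)."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Toy 2x2 base weight with a rank-1 "patch" (all values are hypothetical).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]  # shape (2, r) with r = 1
lora_a = [[0.5, 0.5]]    # shape (r, 2)
merged = merge_lora(base, lora_b, lora_a, alpha=2.0, r=1)
print(merged)
```

The low-rank factors are much smaller than the full matrix they modify, which is why the LoRA downloads are far smaller than the reconstructed models.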
The following table provides a basic comparison of the Chinese LLaMA and Alpaca models, along with recommended usage scenarios (the list is not exhaustive).
💡 Plus versions are trained on more data and are highly recommended.
| Comparison Item | Chinese LLaMA | Chinese Alpaca |
|---|---|---|
| Training Method | Traditional CLM (trained on general corpus) | Instruction Fine-tuning (trained on instruction data) |
| Model Type | Base model | Instruction-following model (like ChatGPT) |
| Training Data | unsupervised free text | supervised instruction data |
| Vocab size[3] | 49953 | 49954=49953+1 (pad token) |
| Input Template | Not required | Must meet template requirements[1] |
| Suitable Scenarios ✔️ | Text continuation: Given a context, let the model continue writing | 1. Instruction understanding (Q&A, writing, advice, etc.) |
| llama.cpp | Use -p parameter to specify context | Use -ins parameter to enable instruction understanding + chat mode |
| text-generation-webui | Not suitable for chat mode | Use --cpu to run without a GPU; if not satisfied with generated content, consider modifying prompt |
| LlamaChat | Choose "LLaMA" when loading the model | Choose "Alpaca" when loading the model |
| inference_hf.py | No additional startup parameters required | Add --with_prompt parameter when launching |
| web-demo | Not applicable | Simply provide the Alpaca model location; support multi-turn conversations |
| LangChain-demo / privateGPT | Not applicable | Simply provide the Alpaca model location |
| Known Issues | Without explicit termination control, the model keeps writing until it reaches the output length limit.[2] | Please use Pro models to avoid the short responses seen in the Plus series. |

[1] Templates are built in for llama.cpp/LlamaChat/inference_hf.py/web-demo/LangChain-demo.
[2] If you encounter issues such as low-quality model responses, nonsensical answers, or failure to understand questions, please check whether you are using the correct model and startup parameters for the scenario.
[3] The Alpaca vocabulary has one more token (the pad token) than the LLaMA vocabulary. Please do not mix LLaMA and Alpaca tokenizers.
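As a minimal sketch of how to guard against mixing tokenizers, the check below compares a tokenizer's vocabulary size with the sizes listed in the table above. The `check_tokenizer` function and the `EXPECTED_VOCAB_SIZE` mapping are hypothetical names introduced for illustration; in practice the size would come from the loaded tokenizer (for example `len(tokenizer)` with Hugging Face transformers).

```python
EXPECTED_VOCAB_SIZE = {
    "llama": 49953,   # Chinese LLaMA
    "alpaca": 49954,  # Chinese Alpaca = 49953 + 1 pad token
}


def check_tokenizer(model_type: str, tokenizer_vocab_size: int) -> None:
    """Raise ValueError if the tokenizer size does not match the model type."""
    expected = EXPECTED_VOCAB_SIZE[model_type]
    if tokenizer_vocab_size != expected:
        raise ValueError(
            f"{model_type} expects a vocabulary of {expected} tokens, "
            f"got {tokenizer_vocab_size}; LLaMA and Alpaca tokenizers may be mixed up"
        )


check_tokenizer("alpaca", 49954)  # matching pair, passes silently
try:
    check_tokenizer("llama", 49954)  # Alpaca tokenizer paired with a LLaMA model
except ValueError as err:
    print(err)
```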
Below is a list of models recommended for this project. These models typically use more training data and better-tuned training methods and parameters, so they should be preferred (for other models, please check Other Models). If you want ChatGPT-like interaction, please use an Alpaca model instead of a LLaMA model. For Alpaca models, please use the Pro versions for longer responses. If you prefer shorter responses, please use the Plus series instead.
| Model | Type | Data | Required Original Model[1] | Size[2] | Download Links[3] |
|---|---|---|---|---|---|
| Chinese-LLaMA-Plus-7B | base model | general 120G | LLaMA-7B | 790M | [BaiduDisk] [Google Drive] |
| Chinese-LLaMA-Plus-13B | base model | general 120G | LLaMA-13B | 1.0G | [BaiduDisk] [Google Drive] |
| Chinese-LLaMA-Plus-33B 🆕 | base model | general 120G | LLaMA-33B | 1.3G[6] | [BaiduDisk] [Google Drive] |
| Chinese-Alpaca-Pro-7B 🆕 | instruction-following model | instruction 4.3M | *LLaMA-7B & LLaMA-Plus-7B*[4] | 1.1G | [BaiduDisk] [Google Drive] |
| Chinese-Alpaca-Pro-13B 🆕 | instruction-following model | instruction 4.3M | *LLaMA-13B & LLaMA-Plus-13B*[4] | 1.3G | [BaiduDisk] [Google Drive] |
| Chinese-Alpaca-Pro-33B 🆕 | instruction-following model | instruction 4.3M | *LLaMA-33B & LLaMA-Plus-33B*[4] | 2.1G | [BaiduDisk] |

[1] Access to the original LLaMA model must be requested through Facebook-LLaMA; alternatively, refer to this PR. Due to copyright restrictions, this project cannot provide downloads, and we ask for your understanding.
[2] The reconstructed model is slightly larger than the original LLaMA (due to the expanded vocabulary); the 7B model is about 13G+.
[3] After downloading, be sure to check whether the SHA256 of the ZIP file is consistent; for the full valu
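The SHA256 check requested in footnote [3] can be sketched with Python's standard `hashlib` module. The function names below are hypothetical, and the expected digest must be taken from the values published by the project; none are reproduced here.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large ZIP files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_sha256: str) -> bool:
    """Return True if the file's SHA256 matches the published value."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

A call such as `verify_download("model_lora.zip", "<published SHA256>")`, where the file name is a placeholder, returns `False` if the download is corrupted or incomplete.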