
github.com/ymcui/Chinese-LLaMA-Alpaca @v5.0 sqlite


109 symbols · 420 edges · 18 files · 14 documented (13%)

README


<img src="https://github.com/ymcui/Chinese-LLaMA-Alpaca/raw/v5.0/pics/banner.png" width="700"/>









<img alt="GitHub" src="https://img.shields.io/github/license/ymcui/Chinese-LLaMA-Alpaca.svg?color=blue&style=flat-square">
<img alt="GitHub release (latest by date)" src="https://img.shields.io/github/v/release/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub top language" src="https://img.shields.io/github/languages/top/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/ymcui/Chinese-LLaMA-Alpaca">
<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.2-green"></a>

To promote open research on large language models in the Chinese NLP community, this project open-sources the Chinese LLaMA model and the instruction-tuned Chinese Alpaca model. These models extend the original LLaMA vocabulary with Chinese tokens and continue pre-training on Chinese data, which improves basic Chinese semantic understanding. The Chinese Alpaca models are further fine-tuned from Chinese LLaMA on Chinese instruction data, which significantly improves the model's understanding and execution of instructions.

Technical Report (V2): [Cui, Yang, and Yao, 2023] Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca

Main contents of this project:

🚀 Extended the original LLaMA vocabulary with Chinese tokens, significantly improving encoding/decoding efficiency
🚀 Open-sourced the Chinese LLaMA (general purpose) and Alpaca (instruction-tuned) models
🚀 Open-sourced the pre-training and instruction fine-tuning (SFT) scripts for further tuning on users' own data
🚀 Quickly deploy and try the quantized models on the CPU/GPU of a laptop or personal PC
🚀 Support for 🤗transformers, llama.cpp, text-generation-webui, LlamaChat, LangChain, privateGPT, etc.
Released versions: 7B (basic, Plus, Pro), 13B (basic, Plus, Pro), 33B (basic, Plus, Pro)
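
Why an extended vocabulary helps encoding efficiency can be illustrated with a toy count. This is only a sketch, assuming a tokenizer that falls back to UTF-8 bytes for characters missing from its vocabulary; the function name and the character-level vocabularies are hypothetical, and real tokenizers use subword units rather than single characters.

```python
def tokens_with_byte_fallback(text, vocab):
    """Count tokens when out-of-vocabulary characters become UTF-8 byte tokens.

    Toy model: a character in `vocab` costs one token; any other
    character costs one token per UTF-8 byte.
    """
    count = 0
    for ch in text:
        count += 1 if ch in vocab else len(ch.encode("utf-8"))
    return count


text = "你好世界"
original_vocab = set()       # toy assumption: no Chinese characters in the vocabulary
extended_vocab = set(text)   # toy assumption: every character has its own token

print(tokens_with_byte_fallback(text, original_vocab))  # 12 (3 bytes per character)
print(tokens_with_byte_fallback(text, extended_vocab))  # 4
```

Under these assumptions the same text needs a third as many tokens, so more Chinese text fits into a fixed context length and fewer decoding steps are needed.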

💡 The following image shows the 7B model in use after local deployment (animation not sped up; tested on Apple M1 Max).

News

[July 19, 2023] Release v5.0: Alpaca-Pro models are released with significantly improved generation quality, along with Plus-33B models.

[July 19, 2023] We are launching Chinese-LLaMA-Alpaca-2 project.

[July 10, 2023] Beta channel preview: learn about upcoming updates in advance. See Discussion.

[July 7, 2023] The Chinese-LLaMA-Alpaca family welcomes a new member: Visual Chinese-LLaMA-Alpaca model for visual question answering and chat. The 7B test version is available.

[June 30, 2023] 8K context support with llama.cpp. See Discussion. For 4K+ context support with transformers, see PR#705.

[June 16, 2023] Release v4.1: New technical report, add C-Eval inference script, add low-resource model merging script, etc.

[June 8, 2023] Release v4.0: LLaMA/Alpaca 33B versions are available. We also add privateGPT demo, C-Eval results, etc.

Content Navigation

| Chapter | Description |
| --- | --- |
| Download | Download links for Chinese LLaMA and Alpaca |
| Model Reconstruction | (Important) Explains how to merge downloaded LoRA models with the original LLaMA |
| Quick Deployment | Steps for quantizing and deploying LLMs on personal computers |
| Example Results | Examples of the system output |
| Training Details | Introduces the training details of Chinese LLaMA and Alpaca |
| FAQ | Replies to some common questions |
| Limitations | Limitations of the models involved in this project |

Model Download

⚠️ User Notice (Must Read)

The official LLaMA models released by Facebook prohibit commercial use, and the official model weights have not been open-sourced (although there are many third-party download links available online). In order to comply with the relevant licenses, it is currently not possible to release the complete model weights. We appreciate your understanding. After Facebook fully opens up the model weights, this project will update its policies accordingly. What is released here are the LoRA weights, which can be seen as a "patch" for the original LLaMA model, and the complete weights can be obtained by merging the two.
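
The "patch" idea can be sketched with the standard LoRA formulation, in which a low-rank product is scaled and added to a base weight matrix. This is a toy illustration in pure Python, not the project's merge script: the function names, matrix values, `alpha`, and `rank` are all hypothetical, and the sketch ignores the expanded vocabulary described elsewhere in this README.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(base, lora_a, lora_b, alpha, rank):
    """Return base + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Toy 2x2 base weight with a rank-1 update (all values hypothetical).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # shape (2, rank)
lora_a = [[0.5, 0.5]]     # shape (rank, 2)

merged = merge_lora(base, lora_a, lora_b, alpha=2.0, rank=1)
print(merged)  # [[2.0, 1.0], [2.0, 3.0]]
```

The low-rank factors are much smaller than the full matrix, which is why a LoRA release can be far smaller than the complete model weights.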

Which model should I use?

The following table compares the Chinese LLaMA and Alpaca models and lists recommended (non-exhaustive) usage scenarios.

💡 Plus versions are trained on more data and are highly recommended.

| Comparison Item | Chinese LLaMA | Chinese Alpaca |
| --- | --- | --- |
| Training Method | Traditional CLM (trained on general corpus) | Instruction Fine-tuning (trained on instruction data) |
| Model Type | Base model | Instruction-following model (like ChatGPT) |
| Training Data | unsupervised free text | supervised instruction data |
| Vocab size^[3] | 49953 | 49954=49953+1 (pad token) |
| Input Template | Not required | Must meet template requirements^[1] |
| Suitable Scenarios ✔️ | Text continuation: Given a context, let the model continue writing | 1. Instruction understanding (Q&A, writing, advice, etc.) 2. Multi-turn context understanding (chat, etc.) |
| Unsuitable Scenarios ❌ | Instruction understanding, multi-turn chat, etc. | Unrestricted free text generation |
| llama.cpp | Use -p parameter to specify context | Use -ins parameter to enable instruction understanding + chat mode |
| text-generation-webui | Not suitable for chat mode | Use --cpu to run without a GPU; if not satisfied with generated content, consider modifying prompt |
| LlamaChat | Choose "LLaMA" when loading the model | Choose "Alpaca" when loading the model |
| inference_hf.py | No additional startup parameters required | Add --with_prompt parameter when launching |
| web-demo | Not applicable | Simply provide the Alpaca model location; support multi-turn conversations |
| LangChain-demo / privateGPT | Not applicable | Simply provide the Alpaca model location |
| Known Issues | If not controlled for termination, it will continue writing until reaching the output length limit.^[2] | Please use Pro models to avoid short responses (in Plus series). |

[1] Templates are built in for llama.cpp/LlamaChat/inference_hf.py/web-demo/LangChain-demo.

[2] If you encounter issues such as low-quality model responses, nonsensical answers, or failure to understand questions, please check whether you are using the correct model and startup parameters for the scenario.

[3] The Alpaca model has one more token (the pad token) in its vocabulary than LLaMA. Please do not mix LLaMA and Alpaca tokenizers.
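
One way to guard against mixing tokenizers is to compare the loaded tokenizer's vocabulary size with the sizes listed in the table. The sketch below is only an illustration: `check_tokenizer` and the `"llama"`/`"alpaca"` keys are hypothetical names, and how the vocabulary size is obtained from a real tokenizer is left to the caller.

```python
# Vocabulary sizes taken from the comparison table above.
EXPECTED_VOCAB_SIZE = {"llama": 49953, "alpaca": 49954}


def check_tokenizer(model_type, tokenizer_vocab_size):
    """Raise ValueError if the tokenizer size does not match the model type."""
    expected = EXPECTED_VOCAB_SIZE[model_type]
    if tokenizer_vocab_size != expected:
        raise ValueError(
            f"{model_type} expects a vocabulary of {expected} tokens, "
            f"but the tokenizer has {tokenizer_vocab_size}; "
            "LLaMA and Alpaca tokenizers may have been mixed."
        )
    return True


print(check_tokenizer("alpaca", 49954))  # True
```

Running the check before inference turns a silent mismatch into an immediate, readable error.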

Recommended Models

Below is a list of models recommended for this project. These models typically use more training data and better-tuned training methods and parameters, so they should be preferred (for other models, please check Other Models). For ChatGPT-like interaction, use the Alpaca model rather than the LLaMA model. For Alpaca models, use the Pro versions for longer responses; if you prefer shorter responses, use the Plus series instead.

| Model | Type | Data | Required Original Model^[1] | Size^[2] | Download Links^[3] |
| --- | --- | --- | --- | --- | --- |
| Chinese-LLaMA-Plus-7B | base model | general 120G | LLaMA-7B | 790M | [BaiduDisk] |
| … | … | … | … LLaMA-Plus-7B*^[4] | 1.1G | [BaiduDisk] |
| … | … | … | … LLaMA-Plus-13B*^[4] | 1.3G | [BaiduDisk] |
| … | … | … | … LLaMA-Plus-33B*^[4] | 2.1G | [BaiduDisk] [Google Drive] |

[1] Access to the original LLaMA model must be requested via Facebook-LLaMA; alternatively, refer to this PR. Due to copyright issues, this project cannot provide downloads, and we ask for your understanding.

[2] The reconstructed model is slightly larger than the original LLaMA (due to the expanded vocabulary); the 7B model is about 13G+.

[3] After downloading, be sure to check whether the SHA256 of the ZIP file is consistent; for the full valu

Core symbols (most depended-on inside this repo)

clear_torch_cache · called by 4 · scripts/inference/gradio_demo.py

translate_state_dict_key · called by 2 · scripts/merge_llama_with_chinese_lora.py

unpermute · called by 2 · scripts/merge_llama_with_chinese_lora.py

translate_state_dict_key · called by 2 · scripts/merge_llama_with_chinese_lora_low_mem.py

unpermute · called by 2 · scripts/merge_llama_with_chinese_lora_low_mem.py

format_example · called by 2 · scripts/ceval/llama_evaluator.py

normalize_answer · called by 2 · scripts/ceval/evaluator.py

generate_prompt · called by 2 · scripts/inference/inference_hf.py

Shape

Function 50

Method 34

Class 22

Route 3

Languages

Python 100%

Modules by API surface

scripts/inference/gradio_demo.py · 21 symbols

scripts/training/run_clm_pt_with_peft.py · 16 symbols

scripts/openai_server_demo/openai_api_server.py · 10 symbols

scripts/ceval/evaluator.py · 10 symbols

scripts/training/run_clm_sft_with_peft.py · 9 symbols

scripts/openai_server_demo/openai_api_protocol.py · 9 symbols

scripts/openai_server_demo/patches.py · 6 symbols

scripts/inference/patches.py · 6 symbols

scripts/ceval/llama_evaluator.py · 6 symbols

scripts/merge_llama_with_chinese_lora_low_mem.py · 5 symbols

scripts/training/build_dataset.py · 4 symbols

scripts/merge_llama_with_chinese_lora.py · 4 symbols

Dependencies from manifests, versioned

sentencepiece 0.1.97 · 1×

torch 1.13.1 · 1×

transformers 4.30.0 · 1×
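
Whether a local environment matches these versions can be checked with the standard library. The sketch below treats the listed versions as exact pins, which is an assumption (the manifest may allow ranges); `find_mismatches` is a hypothetical helper, and local build suffixes such as `+cu117` are ignored.

```python
from importlib import metadata

# Versions as listed above; treating them as exact pins is an assumption.
PINNED = {"sentencepiece": "0.1.97", "torch": "1.13.1", "transformers": "4.30.0"}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (wanted, installed)} for missing or differing packages."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            # Drop any local build suffix, e.g. "1.13.1+cu117" -> "1.13.1".
            installed = get_version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(PINNED).items():
        print(f"{name}: wanted {wanted}, found {installed}")
```

The `get_version` parameter exists only so the lookup can be replaced in tests; by default it queries the installed distributions.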

For agents

$ claude mcp add Chinese-LLaMA-Alpaca \
  -- python -m otcore.mcp_server <graph>
