github.com/CrazyBoyM/llama3-Chinese-chat @main sqlite

48 symbols 240 edges 17 files 0 documented · 0%

README

llama3-Chinese-chat

1st version of Chinese-llama3

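The first Chinese version of llama3.
This repository is a place to share learning materials on adapting llama3 to Chinese. Anyone interested is welcome to join and help build it.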

Training & inference workflow

[Figure: training flowchart (training and inference)]

Learn quickly from diagrams: https://deepwiki.com/CrazyBoyM/llama3-Chinese-chat

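Notice

🔥 New: the LLM-Chinese repository is now available and worth following. It is tutorial-oriented and uses "adapting a model to Chinese" as a representative model-training problem to guide readers through hands-on secondary fine-tuning of LLMs: https://github.com/CrazyBoyM/LLM-Chinese (includes Chinese versions of gemma2 in 2b and 9b sizes)

If you have your own fine-tuned version, or find an interesting specialized version online, please comment in the issues section so it can be listed.
If there is a content section you would like to build, fork the repository and submit a PR to become a core author.
(Note: typo-only PRs that change a single word or sentence are no longer accepted; please avoid submitting them frequently.)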

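News (changelog)

2024-07-25: llama3.1 Chinese DPO training weights released.
2024-07-24: llama3.1 Chinese training plan launched.
2024-05-17 🎉 The curated llama3 Chinese-adaptation dataset collection reached 2.9k downloads on modelscope and stayed on the modelscope homepage for three consecutive weeks: dataset download link
2024-05-17 💪 Added a hand-written API deployment tutorial and command-line invocation: documentation link
2024-05-13 💪 Added an LMStudio local desktop deployment tutorial: documentation tutorial, step-by-step video tutorial
2024-05-04, during the May Day holiday: 🚀 Added a language-preference reinforcement-aligned version (DPO applied directly to the English instruct version). It keeps the original tone of the replies (a fondness for playful language and emoji). Model download, gguf quantized version download; the language-preference reinforcement dataset work has been open-sourced.
2024-04-21, 2 a.m.: Added and organized the training tutorial, inference tutorial, web deployment, and other docs.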
2024-04-20, 11 p.m.: Training of the instruct Chinese version completed.
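2024-04-20, 7 a.m.: Training of the v2 version completed.
2024-04-19, 1 p.m.: 🍺 Training of the world's first Chinese version of llama3 completed (no sleep that night, haha). It was trained overnight on 170k+ high-quality multi-turn Chinese dialogue samples.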

Demo

llama3-base-8b Chinese SFT version

llama3-instruct-8b Chinese DPO version

llama3.1-instruct-8b Chinese DPO version

Collection of usable llama3 chat models

llama3.1
- shareAI-DPO Chinese 8B version (RLHF, Chinese)
  - Training data (open source): https://huggingface.co/datasets/shareAI/DPO-zh-en-emoji
  - Training details: DPO (beta 0.5) + LoRA rank 128, alpha 256 + the "lm_head", "input_layernorm", "post_attention_layernorm", and "norm" layers opened for training.
  - Compute: 8 * A100, 5 minutes. Thanks to the opencsg community for sponsoring.
  - Model download - OpenCSG: https://opencsg.com/models/shareAI/llama3.1-8b-instruct-dpo-zh
  - Model download - modelscope: https://modelscope.cn/models/shareAI/llama3.1-8b-instruct-dpo-zh
  - Model download - Huggingface: https://huggingface.co/shareAI/llama3.1-8b-instruct-dpo-zh
  - GGUF download (works with ollama and lmstudio): https://huggingface.co/shareAI/llama3.1-8b-instruct-dpo-zh/blob/main/llama3.1_8b_chinese_chat_q4_k_m-shareAI.gguf
  - GGUF download within China (hf-mirror accelerated mirror): https://hf-mirror.com/shareAI/llama3.1-8b-instruct-dpo-zh
  - Run directly with ollama: ollama run shareai/llama3.1-dpo-zh
- openCSG wukong Chinese 405B version (SFT, Chinese)
  - Jointly released by shareAI & openCSG
  - Introductory article: https://mp.weixin.qq.com/s/7_lDZ6Zslq_WUckfuTToyQ
  - Open-source model: https://opencsg.com/models/OpenCSG/CSG-Wukong-Chinese-Llama3.1-405B
- openbuddy
  - openbuddy-llama3.1-8b (SFT, Chinese): https://modelscope.cn/models/OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k
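
The training recipe above names DPO with beta 0.5 and a LoRA of rank 128 with alpha 256. As a rough illustration of what those numbers control, the sketch below computes the standard DPO loss for a single preference pair and the conventional LoRA scaling factor alpha / rank. The log-probability values are made-up inputs and the function names are hypothetical; this is not the repository's training code.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.5):
    """Standard DPO loss for a single (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under the frozen reference model.
    """
    # How much more the policy favours each response than the reference does
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)) == log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))


def lora_scaling(rank, alpha):
    """Conventional LoRA scaling applied to the low-rank update: alpha / rank."""
    return alpha / rank


# Made-up log-probabilities, for illustration only
loss_equal = dpo_loss(-10.0, -10.0, -10.0, -10.0)   # policy identical to reference
loss_better = dpo_loss(-8.0, -12.0, -10.0, -10.0)   # policy favours the chosen reply
scale = lora_scaling(rank=128, alpha=256)
```

A larger beta makes the loss react more sharply to the same log-probability margin, and with rank 128 and alpha 256 the low-rank update is scaled by a factor of 2 under the conventional formula.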

Curated high-quality llama3 chat weights (additions via issues are welcome):
- shareAI series:
  - Base pretrained + direct Chinese SFT version:
    - Training data: https://modelscope.cn/datasets/baicai003/Llama3-Chinese-dataset/summary
    - V1
      - OpenCSG full-speed download: https://opencsg.com/models/shareAI/llama3-Chinese-chat-8b
      - WiseModel full-speed download: https://wisemodel.cn/models/shareAI/llama3-Chinese-chat-8b
    - V2
      - modelscope: https://modelscope.cn/models/baicai003/Llama3-Chinese_v2/summary
      - LoRA for enhanced mind-map generation: https://modelscope.cn/models/shareAI/llama3-instruct-8b-cn-doc2markmap-lora
  - Instruct + continued Chinese SFT version:
    - modelscope model download: https://modelscope.cn/models/baicai003/llama-3-8b-Instruct-chinese_v2/summary
    - Cloud server image for an online trial (one click to use, free for 4 hours): https://www.suanyun.cn/console/share?uuid=b1ba51908f8a4bd1af37148765c293ee
  - Instruct + reinforcement-learning Chinese version:
    - llama3 instruct DPO version (trains in about 10 minutes, with minimal performance loss relative to the original multilingual instruct model; in practice it outperforms most Chinese versions trained on large amounts of data)
      - modelscope download: https://modelscope.cn/models/baicai003/Llama3-Chinese-instruct-DPO-beta0.5/summary
      - Preference-learning dataset: DPO-zh-en-emoji

Base pretrained + incremental pretraining on large-scale, high-quality Chinese data: in progress
70b Chinese version: planned
By zhuangxialie (the chat template was configured incorrectly, so these need to be tried through fastchat):
- Base + Chinese SFT: https://modelscope.cn/models/zhuangxialie/Llama3_Chinese_Sft/files
- Base + ORPO: https://modelscope.cn/models/zhuangxialie/Llama3-Chinese-ORPO/summary
- Instruct + DPO: https://www.modelscope.cn/models/zhuangxialie/Llama3-Chinese-DPO/summary
llama3 Pro (added-block versions; everyone is encouraged to experiment and explore further with this approach):
- By linjh1118 (the first ORPO preference alignment + 2x expanded blocks): https://github.com/linjh1118/Llama3-Chinese-ORPO
llama3 MoE enhanced versions:
- cooper12121-llama3-8x8b-MoE: https://github.com/cooper12121/llama3-8x8b-MoE
Long-context versions:
- Unicom fine-tuned version v2 (Chinese, 28k context): https://huggingface.co/UnicomLLM/Unichat-llama3-Chinese-8B-28K
- 262k context (English): https://huggingface.co/gradientai/Llama-3-8B-Instruct-262k
- 262k context (Chinese): planned
- Infinite-context version: planned; reference: https://medium.com/neoxia/llm-infini-attention-with-linear-complexity-3209b87a77c3
Other general Chinese fine-tuned versions:
- ZTE fine-tuned version (DPO), 70B: https://www.modelscope.cn/models/ZTEAIM2024/Llama3_70B_instruct_chinese/summary
- Unicom fine-tuned version (SFT): https://www.modelscope.cn/models/UnicomAI/Unichat-llama3-Chinese/summary
- Openbuddy fine-tuned version (SFT, reportedly good): https://www.modelscope.cn/models/OpenBuddy/openbuddy-llama3-8b-v21.1-8k/summary
- zhichen fine-tuned version (ORPO method, probably the first ORPO one): https://github.com/seanzhang-zhichen/llama3-chinese
- shenzhi-wang fine-tuned version (ORPO method, also claimed to be the first ORPO one): https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat
- Rookie fine-tuned version (SFT): https://github.com/Rookie1019/Llama-3-8B-Instruct-Chinese
- hit-sz klc-lab fine-tuned version: https://github.com/zyg18181818/Llama-3-Chinese
Versions with safety restrictions removed (nsfw):
- Unholy: https://huggingface.co/Undi95/Llama-3-Unholy-8B
- neural-chat: https://hf-mirror.com/Locutusque/llama-3-neural-chat-v1-8b
- dolphin: https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b
- Orion: https://huggingface.co/Orion-zhen/Llama3-70B-Orion-Chinese (uncensored + Chinese, and keeps the original llama3's fondness for emoji)
v-llama3 multimodal versions (support input and output beyond text):
- Image question answering:
  - Bunny-Llama-3-8B-V: https://wisemodel.cn/models/BAAI/Bunny-Llama-3-8B-V
  - llava-llama-3-8b: https://huggingface.co/xtuner/llava-llama-3-8b-v1_1
- Video understanding (supports Q&A on videos up to 1 minute long): https://github.com/THUDM/CogVLM2
Agent tool-use enhanced versions:
- ModelScope Chinese Agent V1 (can choose tools for you on request; Chinese dialogue): https://modelscope.cn/models/swift/Llama3-Chinese-8B-Instruct-Agent-v1/summary
EmoLLM, fine-tuned on psychology-domain data:
- Online demo: https://st-app-center-006861-9746-jlroxvg.openxlab.space/
- Or go to OpenXLab and launch EmoLLM3.0-Llama3
- Model download:
  - OpenXLab: https://openxlab.org.cn/models/detail/chg0901/EmoLLM-Llama3-8B-Instruct3.0
  - ModelScope: https://modelscope.cn/models/chg0901/EmoLLM-Llama3-8B-Instruct3.0/summary
Enhanced version for novel, web-fiction, and story writing: planned
Music generation version: planned
Catgirl role-play version: planned
Adult-content (NSFW) version: planned

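Note: because training covered only common dialogue, the Base + SFT version may produce replies that do not match expectations (especially where an uncommon answer is called for). This tutorial is mainly a curated collection of quality resources (covering how to fine-tune llama3 for Chinese, how to build Chinese dialogue datasets, role-play and agent capability enhancement, context-length extension, web deployment and quantization, and CPU inference on phones and computers), and the material will be added gradually.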

Model usage

Cloud service deployment

Simple API

Documentation tutorial: https://github.com/CrazyBoyM/llama3-Chinese-chat/tree/main/deploy/API

vLLM (recommended, compatible with the OpenAI format)

Documentation tutorial: https://github.com/CrazyBoyM/llama3-Chinese-chat/tree/main/deploy/vLLM
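
Because the vLLM route is OpenAI-format compatible, a client only needs to POST a chat-completions payload. The following is a minimal sketch using only the standard library. It assumes a server is already listening on http://localhost:8000 (vLLM's default port) and uses a placeholder model name, so adjust both to your deployment; the helper names are hypothetical, and the linked tutorial remains the reference for starting the server itself.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message, temperature=0.6):
    """Build (but do not send) an OpenAI-style chat-completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Pull the assistant text out of an OpenAI-style response dict."""
    return response_body["choices"][0]["message"]["content"]


# Assumed values: a local server address and a placeholder model name
request = build_chat_request("http://localhost:8000", "/path/to/model", "你好")

# To actually send it (requires a running server):
# with urllib.request.urlopen(request) as resp:
#     print(extract_reply(json.load(resp)))
```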

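Local computer deployment

LMStudio local desktop deployment (with a UI)

Documentation tutorial: https://github.com/CrazyBoyM/llama3-Chinese-chat/blob/main/deploy/LMStudio/README.md
Video tutorial: https://www.bilibili.com/video/BV1nt421g79T

ollama command-line tool (recommended, simple and easy to use)

First, download and install ollama from the official site: https://ollama.com/
Then open a terminal and run the following command to start chatting with the AI: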

ollama run shareai/llama3.1-dpo-zh
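
Besides the interactive prompt, a running ollama instance also serves a local REST API, which is convenient for scripting. The sketch below builds the JSON body for ollama's /api/chat endpoint from a short conversation history. It assumes ollama is listening on its default address http://localhost:11434 and that the model above has already been pulled; the helper name and the sample history are hypothetical.

```python
import json

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # assumed default address


def build_ollama_chat_body(model, history, user_message):
    """Return the JSON body for a non-streaming /api/chat call.

    `history` is a list of {"role": ..., "content": ...} dicts from earlier turns.
    """
    # Copy the history so the caller's list is not modified
    messages = list(history) + [{"role": "user", "content": user_message}]
    return json.dumps(
        {"model": model, "messages": messages, "stream": False},
        ensure_ascii=False,
    )


# Made-up earlier turns, for illustration only
history = [
    {"role": "user", "content": "你好"},
    {"role": "assistant", "content": "你好！有什么可以帮你的吗？😊"},
]
body = build_ollama_chat_body("shareai/llama3.1-dpo-zh", history, "介绍一下你自己")
```

POSTing `body` to `OLLAMA_CHAT_URL` returns a JSON object whose `message.content` field holds the reply text.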

Streamlit web inference (suited to debugging and testing a model after training)

pip install -U streamlit transformers==4.40.1

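First install streamlit with the command above, then start the web page with one of the commands below. Replace '/path/to/model' with the path to your downloaded weights.
V1 version: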

streamlit run deploy/web_streamlit_for_v1.py /path/to/model --theme.base="dark"

Instruct version (supports a custom system prompt):

streamlit run deploy/web_streamlit_for_instruct.py /path/to/model --theme.base="dark"

Instruct DPO version (supports a custom system prompt; tends to reply in a playful style with emoji):

streamlit run deploy/web_streamlit_for_instruct_v2.py /path/to/model --theme.base="dark"

Python code inference

By default, running the code below directly lets you try llama3 Chinese chat. Change model_name_or_path to the path of the model you downloaded.

```
from dataclasses import dataclass
from typing import Dict, Optional

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig


# Define the chat template
@dataclass
class Template:
    template_name: str
    system_format: str
    user_format: str
    assistant_format: str
    system: str
    stop_word: Optional[str]


template_dict: Dict[str, Template] = dict()


def register_template(template_name, system_format, user_format, assistant_format, system, stop_word=None):
    template_dict[template_name] = Template(
        template_name=template_name,
        system_format=system_format,
        user_format=user_format,
        assistant_format=assistant_format,
        system=system,
        stop_word=stop_word,
    )


# This system prompt is the one used during training; feel free to change it at inference time.
# NOTE: the <<SYS>> / <</SYS>> markers are restored on the assumption that tag
# stripping reduced them to "<>" in this copy of the listing.
register_template(
    template_name='llama3',
    system_format='<|begin_of_text|><<SYS>>\n{content}\n<</SYS>>\n\n',
    user_format='<|start_header_id|>user<|end_header_id|>\n\n{content}<|eot_id|>',
    assistant_format='<|start_header_id|>assistant<|end_header_id|>\n\n{content}<|end_of_text|>\n',
    system="You are a helpful, excellent and smart assistant. "
           "Please respond to the user using the language they input, ensuring the language is elegant and fluent."
           "If you don't know the answer to a question, please don't share false information.",
    stop_word='<|end_of_text|>'
)


# Load the model
def load_model(model_name_or_path, load_in_4bit=False, adapter_name_or_path=None):
    if load_in_4bit:
        quantization_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_quant_type="nf4",
            llm_int8_threshold=6.0,
            llm_int8_has_fp16_weight=False,
        )
    else:
        quantization_config = None

    # Load the base model. 4-bit loading is requested only through
    # quantization_config: passing load_in_4bit as well makes transformers
    # raise a ValueError.
    model = AutoModelForCausalLM.from_pretrained(
        model_name_or_path,
        trust_remote_code=True,
        low_cpu_mem_usage=True,
        torch_dtype=torch.float16,
        device_map='auto',
        quantization_config=quantization_config
    )

    # Load the adapter
    if adapter_name_or_path is not None:
        model = PeftModel.from_pretrained(model, adapter_name_or_path)

    return model


# Load the tokenizer
def load_tokenizer(model_name_or_path):
    tokenizer = AutoTokenizer.from_pretrained(
        model_name_or_path,
        trust_remote_code=True,
        use_fast=False
    )

    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    return tokenizer


# Build the prompt
def build_prompt(tokenizer, template, query, history, system=None):
    system_format = template.system_format
    user_format = template.user_format
    assistant_format = template.assistant_format
    system = system if system is not None else template.system

    # The new user turn is appended to the shared history in place
    history.append({"role": 'user', 'message': query})
    input_ids = []

    # Add the system message
    if system_format is not None:
        if system is not None:
            system_text = system_format.format(content=system)
            input_ids = tokenizer.encode(system_text, add_special_tokens=False)
    # Concatenate the conversation history
    for item in history:
        role, message = item['role'], item['message']
        if role == 'user':
            message = user_format.format(content=message, stop_token=tokenizer.eos_token)
        else:
            message = assistant_format.format(content=message, stop_token=tokenizer.eos_token)
        tokens = tokenizer.encode(message, add_special_tokens=False)
        input_ids += tokens
    input_ids = torch.tensor([input_ids], dtype=torch.long)

    return input_ids


def main():
    model_name_or_path = 'shareAI/llama3-Chinese-chat-8b'  # model name or path; change this
    template_name = 'llama3'
    adapter_name_or_path = None

    template = template_dict[template_name]
    # 4-bit inference saves a lot of GPU memory, but quality may drop
    load_in_4bit = False

    # Generation hyperparameters; adjust for better results
    max_new_tokens = 500  # maximum length of the text generated per reply
    top_p = 0.9
    temperature = 0.6  # higher is more creative, lower is more conservative
    repetition_penalty = 1.1  # higher values discourage repeated text more strongly

    # Load the model
    print(f'Loading model from: {model_name_or_path}')
    print(f'adapter_name_or_path: {adapter_name_or_path}')
    model = load_model(
        model_name_or_path,
        load_in_4bit=load_in_4bit,
        adapter_name_or_path=adapter_name_or_path
    ).eval()
    tokenizer = load_tokenizer(model_name_or_path if adapter_name_or_path is None else adapter_name_or_path)
    if template.stop_word is None:
        template.stop_word = tokenizer.eos_token
    # add_special_tokens=False so that no BOS token is prepended to the stop word
    stop_token_id = tokenizer.encode(template.stop_word, add_special_tokens=False)
    assert len(stop_token_id) == 1
    stop_token_id = stop_token_id[0]

    history = []

    query = input('# User：')
    while True:
        query = query.strip()
        # NOTE: the source listing is cut off after `input_ids =`. The rest of
        # this loop is a reconstruction that assumes the usual
        # build_prompt -> generate -> decode flow.
        input_ids = build_prompt(tokenizer, template, query, history, system=None).to(model.device)
        with torch.no_grad():
            outputs = model.generate(
                input_ids=input_ids,
                max_new_tokens=max_new_tokens,
                do_sample=True,
                top_p=top_p,
                temperature=temperature,
                repetition_penalty=repetition_penalty,
                eos_token_id=stop_token_id,
            )
        # Keep only the newly generated tokens
        outputs = outputs.tolist()[0][len(input_ids[0]):]
        response = tokenizer.decode(outputs)
        response = response.strip().replace(template.stop_word, "").strip()
        history.append({"role": 'assistant', 'message': response})
        print("# Llama3-Chinese：{}".format(response))
        query = input('# User：')


if __name__ == '__main__':
    main()
```
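
To make the chat-template mechanics in the listing above concrete without loading a model, the following sketch renders a conversation into the raw prompt string using plain string formatting. The format strings mirror the 'llama3' template; the `<<SYS>>` system markers are an assumption about the original template, and `render_prompt` is a hypothetical helper, not part of the repository.

```python
SYSTEM_FORMAT = '<|begin_of_text|><<SYS>>\n{content}\n<</SYS>>\n\n'
USER_FORMAT = '<|start_header_id|>user<|end_header_id|>\n\n{content}<|eot_id|>'
ASSISTANT_FORMAT = '<|start_header_id|>assistant<|end_header_id|>\n\n{content}<|end_of_text|>\n'


def render_prompt(system, history):
    """Concatenate the system prompt and all turns into one prompt string.

    `history` is a list of {"role": "user" | "assistant", "message": str}.
    """
    parts = [SYSTEM_FORMAT.format(content=system)]
    for item in history:
        fmt = USER_FORMAT if item["role"] == "user" else ASSISTANT_FORMAT
        parts.append(fmt.format(content=item["message"]))
    return "".join(parts)


prompt = render_prompt(
    "You are a helpful assistant.",
    [
        {"role": "user", "message": "你好"},
        {"role": "assistant", "message": "你好！"},
        {"role": "user", "message": "讲个笑话"},
    ],
)
print(prompt)
```

The listing tokenizes each piece separately and concatenates the token ids, so this string view is an approximation meant for inspecting the layout of a prompt, not a drop-in replacement for `build_prompt`.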

Core symbols most depended-on inside this repo

- convert_jsonl (tools/convert_firefly_data_to_sharegpt.py): called by 1
- convert_entry (tools/convert_raw_data_for_firefly.py): called by 1
- convert_jsonl (tools/convert_raw_data_for_firefly.py): called by 1
- replace_keywords_and_remove_lines (tools/change_info.py): called by 1
- init_embeddings_average (tools/expand_embedding_and_lmhead.py): called by 1
- draw (tools/expand_embedding_and_lmhead.py): called by 1

Shape

- Function: 41
- Class: 6
- Route: 1

Languages

Python 100%

Modules by API surface

- deploy/web_streamlit_for_v1.py: 7 symbols
- deploy/web_streamlit_for_instruct_v2.py: 7 symbols
- deploy/web_streamlit_for_instruct.py: 7 symbols
- deploy/python/chat_demo.py: 6 symbols
- deploy/streamlit/web_llama3_chat.py: 5 symbols
- deploy/streamlit/web_gemma2_chat.py: 5 symbols
- tools/expand_embedding_and_lmhead.py: 3 symbols
- tools/convert_raw_data_for_firefly.py: 2 symbols
- deploy/API/easy_server_demo.py: 2 symbols
- tools/sample_data.py: 1 symbol
- tools/count_data.py: 1 symbol
- tools/convert_firefly_data_to_sharegpt.py: 1 symbol

For agents

$ claude mcp add llama3-Chinese-chat \
  -- python -m otcore.mcp_server <graph>
