hub / github.com/OptimalScale/LMFlow

github.com/OptimalScale/LMFlow @v1.0.0 sqlite

repository ↗ · DeepWiki ↗ · release v1.0.0 ↗

863 symbols 4,003 edges 153 files 221 documented · 26%

README

LMFlow

English | 简体中文 | Español | 日本語 | 한국어 | हिंदी

An extensible, convenient, and efficient toolbox for finetuning large machine learning models, designed to be user-friendly, speedy and reliable, and accessible to the entire community.

LMFlow-features

Latest News

[2025-03-18] With full support for Accelerate and lots of streamlining, LMFlow-nightly is now available! Feel free to try out the latest features and improvements by git checkout lmflow-nightly.
[2024-12-02] Support Hymba, a new family of small language models featuring a hybrid-head parallel architecture. Check out Post-training Hymba for more details.
[2024-07-01] 🏆 LMFlow receives the Best Demo Paper Award at NAACL 2024! 🎉
[2024-06-30] Expanding Optimization Options! We now support custom optimizer training with a variety of optimizers. Dive into the details and try out the new features with our updated script at custom_optimizers.
[2024-04-25] :rocket: Support conversation template! We've preset the latest Llama-3 and Phi-3 conversation templates as well as some frequently used templates such as chatml (see all templates here), and we are working on adding more preset templates. Adding corresponding --conversation_template in the shell script and you are all set! :rocket:
[2024-03-27] Support LISA, enabling 7B training in 24G memory without offloading!
[2023-09-11] Support speculative decoding. Check out speculative_decoding for the usage and acceleration details.
[2023-08-14] Support long context inference with position interpolation (Linear & NTK scaling ) for LLaMA models. Check out postion_interpolation for more details.

More news...

[2023-08-07] Support Flash Attention-2. Check out flash_attention for more details.
[2023-08-02] Support Llama2, ChatGLM2, and Baichuan models.
[2023-07-23] LMFlow multimodal chatbot is now available! Support multimodal inputs of images and texts. Online Demo is also provided (We hold the service on a single GPU, hence one may experience "queuing" or "application busy" sometimes when multiple users are accessing at the same time, please wait and attempt again later when such event happens)
[2023-06-22] LMFlow paper is out! Check out our implementation details at https://arxiv.org/abs/2306.12420
[2023-06-16] Our finetuned Robin-33B-V2 scored an impressive 64.1 on the Huggingface LLM leaderboard in our offline evaluation, outperforming major open-source LLMs! All checkpoints (7B, 13B, 33B, and 65B) are released! Checkout the performance here.
[2023-06-07] LMFlow is now officially available on PyPI! Install it with pip install lmflow-finetune!
[2023-05-30] Release Robin-13B-v2 and Robin-33B-v2!
[2023-05-15] Release LMFlow-data, the training dataset of Robin-7B-v2. A new test data is also released.
[2023-05-09] Release Robin-7B-v2, achieving competitive performance on chitchat, commonsense reasoning and instruction-following tasks. Refer to our comprehensive study.
[2023-05-08] Release LMFlow Benchmark, an automatic evaluation framework for open-source chat-style LLMs. Benchmark results on 31 popular models are reported. Participate in LMFlow Benchmark.
[2023-04-21] Release Robin-7B (based on LLaMA-7B), and two models for commercial use: Parakeets-2.7B (based on GPT-NEO-2.7B) and Cokatoo-7B (based on StableLM-7B) Download here
[2023-04-15] Inference: Support streaming output and ChatGLM.
[2023-04-10] We propose a new alignment algorithm: Reward rAnked FineTuning (RAFT), which is more efficient than conventional (PPO-based) RLHF. [Paper]
[2023-04-02] Web service is online!
[2023-04-01] Release three instruction-tuned checkpoints and three medical checkpoints in model zoo: LLaMA-7B-tuned, LLaMA-13B-tuned, LLaMA-33B-tuned, LLaMA-7B-medical, LLaMA-13B-medical, and LLaMA-33B-medical.
[2023-03-27] Support full tuning and lora tuning for all decoder models.
[2023-03-27] Tasked tuned model beats ChatGPT on medical domain.
[2023-03-27] Release code and checkpoints - version 0.0.1! Our tasked-tuned model beats ChatGPT on medical domain.

LMFlow
Latest News
Table of Contents
Quick Start
Supported Features
Support
License
Citation

Quick Start

Setup

Our package has been tested on Linux OS (Ubuntu 20.04). Other OS platforms (MacOS, Windows) are not fully tested, where you may encounter unexpected errors. If you are using LMFlow for the first time, we recommend you to try on a Linux machine or Google Colab.

git clone -b v0.0.9 https://github.com/OptimalScale/LMFlow.git
cd LMFlow
conda create -n lmflow python=3.9 -y
conda activate lmflow
conda install mpi4py
pip install -e .

for CUDA versions 10.3-11.7

git clone -b v0.0.5 https://github.com/OptimalScale/LMFlow.git
cd LMFlow
conda create -n lmflow python=3.9 -y
conda activate lmflow
conda install mpi4py
pip install -e .

[!TIP] We use WandB to track and visualize the training process by default. Before running the training scripts, users may need to log in to WandB using the command:

bash wandb login

For detailed instructions, refer to the WandB Quickstart Guide. Step 1 (registration) and Step 2 (login using your WandB API key) should be sufficient to set up your environment.

Disabling wandb

One can disable wandb by either:

Adding environment variable before running the training command.

bash export WANDB_MODE=disabled

OR, specifying the integrations to report the results and logs to. In the training script, add:

bash --report_to none \

Prepare Dataset

Please refer to our doc.

Finetuning

Estimated Hardware Requirement

Method	0.5B	3B	7B	14B	30B	70B	`x`B
Full `bf16`/`fp16`	9GB	55GB	120GB	240GB	600GB	1200GB	`18x`GB
LoRA	1GB	6GB	16GB	32GB	64GB	160GB	`2x`GB
QLoRA `quant_bit=8`	0.7GB	3GB	10GB	20GB	40GB	80GB	`x`GB
QLoRA `quant_bit=4`	0.4GB	1.5GB	6GB	12GB	24GB	48GB	`x/2`GB

Full Finetuning

Full training updates all the parameters to finetune a language model. Here is an example to finetune a GPT-2 base model.

cd data && ./download.sh alpaca && cd -

bash ./scripts/run_finetune.sh \
  --model_name_or_path gpt2 \
  --dataset_path data/alpaca/train_conversation \
  --output_model_path output_models/finetuned_gpt2

[!TIP] For conversation dataset, specify a conversation template for better performance by adding --conversation_template to the command.

Llama-3-8B conversation dataset example

```bash cd data && ./download.sh alpaca && cd -

bash ./scripts/run_finetune.sh \ --model_name_or_path meta-llama/Meta-Llama-3-8B \ --dataset_path data/alpaca/train_conversation \ --conversation_template llama3 \ --output_model_path output_models/finetuned_llama3_8b ```

LISA

LISA is a memory-efficient finetuning algorithm that allows tradeoff between memory and the number of randomly unfreezed layers. This script currently is only tested in single gpus. Please stay tuned for our latest updates :smile:

cd data && ./download.sh alpaca && cd -

bash ./scripts/run_finetune_with_lisa.sh \
  --model_name_or_path meta-llama/Llama-2-7b-hf \
  --dataset_path data/alpaca/train_conversation \
  --output_model_path output_models/finetuned_llama2_7b \
  --lisa_activated_layers 1 \
  --lisa_interval_steps 20

[!TIP]

Llama-2-7B conversation dataset example

```bash cd data && ./download.sh alpaca && cd -

bash ./scripts/run_finetune_with_lisa.sh \ --model_name_or_path meta-llama/Llama-2-7b-hf \ --dataset_path data/alpaca/train_conversation \ --conversation_template llama2 \ --output_model_path output_models/finetuned_llama2_7b_lisa \ --lisa_activated_layers 1 \ --lisa_interval_steps 20 ```

LoRA

LoRA is a parameter-efficient finetuning algorithm and is more efficient than full finetuning.

cd data && ./download.sh alpaca && cd -

bash ./scripts/run_finetune_with_lora.sh \
  --model_name_or_path facebook/galactica-1.3b \
  --dataset_path data/alpaca/train_conversation \
  --output_lora_path output_models/finetuned_galactica_lora

[!TIP]

Llama-2-7B conversation dataset example

```bash cd data && ./download.sh alpaca && cd -

bash ./scripts/run_finetune_with_lora.sh \ --model_name_or_path meta-llama/Llama-2-7b-hf \ --dataset_path data/alpaca/train_conversation \ --conversation_template llama2 \ --output_model_path output_models/finetuned_llama2_7b_lora \ ```

Merge LoRA Weight

Merge LoRA weight and the base model into one using:

```sh bash

Core symbols most depended-on inside this repo

called by 138

experimental/LISA-diffusion/latent_consistency_model/train_lcm_distill_sd_wds_lisa.py

format

called by 114

src/lmflow/utils/conversation_template/base.py

save

called by 34

src/lmflow/datasets/dataset.py

encode

called by 29

src/lmflow/models/hf_decoder_model.py

get_type

called by 28

src/lmflow/datasets/dataset.py

log

called by 28

src/lmflow/pipeline/utils/raft_trainer.py

step

called by 24

src/lmflow/optim/adan.py

get_backend_dataset

called by 24

src/lmflow/datasets/dataset.py

Shape

Method 512

Function 207

Class 144

Languages

Python100%

Modules by API surface

src/lmflow/pipeline/utils/raft_trainer.py70 symbols

experimental/LISA-diffusion/latent_consistency_model/train_lcm_distill_sd_wds_lisa.py35 symbols

experimental/LISA-diffusion/latent_consistency_model/train_lcm_distill_sd_wds_lora.py32 symbols

src/lmflow/utils/conversation_template/base.py28 symbols

tests/pipeline/test_finetuner_distributed_loss.py23 symbols

src/lmflow/args.py22 symbols

src/lmflow/datasets/dataset.py21 symbols

src/lmflow/pipeline/vllm_inferencer.py20 symbols

src/lmflow/pipeline/inferencer.py19 symbols

src/lmflow/models/hf_model_mixin.py17 symbols

src/lmflow/models/hf_decoder_model.py15 symbols

experimental/LISA-diffusion/instruct_pix2pix/train_instruct_pix2pix_lisa.py15 symbols

Dependencies from manifests, versioned

Jinja23.1.2 · 1×

MarkupSafe2.1.2 · 1×

Pillow9.5.0 · 1×

PyYAML6.0 · 1×

Pygments2.14.0 · 1×

accelerate0.27.2 · 1×

asttokens2.2.1 · 1×

backcall0.2.0 · 1×

bitsandbytes0.40.0 · 1×

certifi2022.12.7 · 1×

charset-normalizer3.1.0 · 1×

clip1.0 · 1×

For agents

$ claude mcp add LMFlow \
  -- python -m otcore.mcp_server <graph>

⬇ download graph artifact

github.com/OptimalScale/LMFlow @v1.0.0 sqlite

LMFlow

English | 简体中文 | Español | 日本語 | 한국어 | हिंदी

Latest News

Table of Contents

Quick Start

Setup

Prepare Dataset

Finetuning

Estimated Hardware Requirement

Full Finetuning

LISA

LoRA

Core symbols most depended-on inside this repo

Shape

Languages

Modules by API surface

Dependencies from manifests, versioned

For agents