
github.com/hpcaitech/ColossalAI @v0.5.1 sqlite


12,628 symbols · 62,128 edges · 1,670 files · 3,577 documented (28%)

README

Colossal-AI

Colossal-AI: Making large AI models cheaper, faster, and more accessible


Get Started with Colossal-AI Without Setup

Access high-end, on-demand compute for your research instantly—no setup needed.

Limited Academic Bonuses:

Top up $1,000 and receive 300 credits
Top up $500 and receive 100 credits

Latest News

Table of Contents

- Why Colossal-AI
- Features
- Colossal-AI for Real World Applications
- Parallel Training Demo
  - LLaMA 1/2/3
  - MoE
  - GPT-3
  - GPT-2
  - BERT
  - PaLM
  - OPT
  - ViT
  - Recommendation System Models
- Single GPU Training Demo
  - GPT-2
  - PaLM
- Inference
- Installation
  - PyPI
  - Install From Source
- Use Docker
- Community
- Contributing
- Cite Us

Why Colossal-AI

Prof. James Demmel (UC Berkeley): Colossal-AI makes training AI models efficient, easy, and scalable.


Features

Colossal-AI provides a collection of parallel components. The goal is to let you write distributed deep learning models the same way you write a model on your laptop, with user-friendly tools that kickstart distributed training and inference in a few lines.

- Parallelism strategies
  - Data Parallelism
  - Pipeline Parallelism
  - 1D, 2D, 2.5D, 3D Tensor Parallelism
  - Sequence Parallelism
  - Zero Redundancy Optimizer (ZeRO)
  - Auto-Parallelism
- Heterogeneous Memory Management
  - PatrickStar
- Friendly Usage
  - Parallelism based on the configuration file
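To make one of the listed strategies concrete, here is a toy sketch of 1D (column-wise) tensor parallelism in pure Python. It does not use Colossal-AI at all: the "devices" are simulated as list entries, and the helper names (`matmul`, `split_columns`, `column_parallel_linear`) are made up for illustration. The idea it shows is that a linear layer's weight can be split by output columns, each shard can compute its slice of the output independently, and concatenating the slices reproduces the unsharded result.

```python
# Toy illustration of 1D (column-wise) tensor parallelism.
# Devices are simulated; nothing here calls the Colossal-AI library.

def matmul(x, w):
    """Multiply a (batch x in) matrix by an (in x out) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, num_shards):
    """Split the weight into equal column blocks, one per simulated device."""
    out_features = len(w[0])
    assert out_features % num_shards == 0, "columns must divide evenly"
    width = out_features // num_shards
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(num_shards)]

def column_parallel_linear(x, w, num_shards):
    """Each shard computes a slice of the output; slices are then concatenated."""
    partial_outputs = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    # Concatenating per-row slices plays the role of an all-gather.
    return [sum((p[r] for p in partial_outputs), []) for r in range(len(x))]

x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 3.0]]

# The sharded computation matches the single-device result.
assert column_parallel_linear(x, w, num_shards=2) == matmul(x, w)
```

In a real setup each shard would live on a different accelerator and the concatenation would be a collective communication step; the sketch only checks that the arithmetic is equivalent.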


Colossal-AI in the Real World

Open-Sora

Open-Sora: Revealing Complete Model Parameters, Training Details, and Everything for Sora-like Video Generation Models [code] [blog] [Model weights] [Demo] [GPU Cloud Playground] [OpenSora Image]


Colossal-LLaMA-2

[GPU Cloud Playground] [LLaMA3 Image]

7B: Half a day of training for a few hundred dollars yields results comparable to mainstream large models; an open-source, commercial-free, domain-specific LLM solution. [code] [blog] [HuggingFace model weights] [Modelscope model weights]
13B: Build a refined private 13B model for just $5,000 USD. [code] [blog] [HuggingFace model weights] [Modelscope model weights]

| Model | Backbone | Tokens Consumed | MMLU (5-shot) | CMMLU (5-shot) | AGIEval (5-shot) | GAOKAO (0-shot) | CEval (5-shot) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Baichuan-7B | - | 1.2T | 42.32 (42.30) | 44.53 (44.02) | 38.72 | 36.74 | 42.80 |
| Baichuan-13B-Base | - | 1.4T | 50.51 (51.60) | 55.73 (55.30) | 47.20 | 51.41 | 53.60 |
| Baichuan2-7B-Base | - | 2.6T | 46.97 (54.16) | 57.67 (57.07) | 45.76 | 52.60 | 54.00 |
| Baichuan2-13B-Base | - | 2.6T | 54.84 (59.17) | 62.62 (61.97) | 52.08 | 58.25 | 58.10 |
| ChatGLM-6B | - | 1.0T | 39.67 (40.63) | 41.17 (-) | 40.10 | 36.53 | 38.90 |
| ChatGLM2-6B | - | 1.4T | 44.74 (45.46) | 49.40 (-) | 46.36 | 45.49 | 51.70 |
| InternLM-7B | - | 1.6T | 46.70 (51.00) | 52.00 (-) | 44.77 | 61.64 | 52.80 |
| Qwen-7B | - | 2.2T | 54.29 (56.70) | 56.03 (58.80) | 52.47 | 56.42 | 59.60 |
| Llama-2-7B | - | 2.0T | 44.47 (45.30) | 32.97 (-) | 32.60 | 25.46 | - |
| Linly-AI/Chinese-LLaMA-2-7B-hf | Llama-2-7B | 1.0T | 37.43 | 29.92 | 32.00 | 27.57 | - |
| wenge-research/yayi-7b-llama2 | Llama-2-7B | - | 38.56 | 31.52 | 30.99 | 25.95 | - |
| ziqingyang/chinese-llama-2-7b | Llama-2-7B | - | 33.86 | 34.69 | 34.52 | 25.18 | 34.2 |
| TigerResearch/tigerbot-7b-base | Llama-2-7B | 0.3T | 43.73 | 42.04 | 37.64 | 30.61 | - |
| LinkSoul/Chinese-Llama-2-7b | Llama-2-7B | - | 48.41 | 38.31 | 38.45 | 27.72 | - |
| FlagAlpha/Atom-7B | Llama-2-7B | 0.1T | 49.96 | 41.10 | 39.83 | 33.00 | - |
| IDEA-CCNL/Ziya-LLaMA-13B-v1.1 | Llama-13B | 0.11T | 50.25 | 40.99 | 40.04 | 30.54 | - |
| Colossal-LLaMA-2-7b-base | Llama-2-7B | 0.0085T | 53.06 | 49.89 | 51.48 | 58.82 | 50.2 |
| Colossal-LLaMA-2-13b-base | Llama-2-13B | 0.025T | 56.42 | 61.80 | 54.69 | 69.53 | |

Core symbols most depended-on inside this repo

append · called by 1433 · colossalai/legacy/nn/parallel/reducer.py

called by 1012 · colossalai/lazy/lazy_init.py

cuda · called by 531 · colossalai/lazy/lazy_init.py

size · called by 513 · colossalai/cluster/process_group_mesh.py

get_accelerator · called by 444 · colossalai/accelerator/api.py

clone · called by 438 · colossalai/lazy/lazy_init.py

get_world_size · called by 424 · colossalai/legacy/context/parallel_context.py

parameters · called by 377 · colossalai/zero/gemini/gemini_ddp.py
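The "called by" figures read like in-degree counts on a call graph. The following is a minimal sketch of such a ranking, assuming the graph is available as (caller, callee) pairs and that each distinct caller counts once. How the hub actually computes its numbers is not stated on this page, and the function name `rank_by_callers`, the caller names, and the edge list are all invented for illustration.

```python
from collections import Counter

def rank_by_callers(edges, top_n=3):
    """Rank callees by the number of distinct callers (in-degree)."""
    distinct_edges = set(edges)  # assumption: repeated calls from one caller count once
    counts = Counter(callee for _caller, callee in distinct_edges)
    # Sort by count (descending), then by name, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

# Hypothetical edges; the caller names are made up.
edges = [
    ("train_step", "get_world_size"),
    ("train_step", "get_accelerator"),
    ("eval_step", "get_world_size"),
    ("eval_step", "get_world_size"),  # duplicate edge, counted once
    ("init_model", "get_accelerator"),
    ("init_model", "clone"),
]
print(rank_by_callers(edges))
```

Generic method names such as `append` or `size` tend to dominate a ranking like this simply because many unrelated call sites resolve to the same name.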

Shape

Method 7,093

Function 3,781

Class 1,654

Route 100

Languages

Python 100%

Modules by API surface

examples/community/roberta/pretraining/model/deberta_v2.py · 99 symbols

examples/community/roberta/pretraining/model/bert.py · 92 symbols

examples/images/diffusion/ldm/models/diffusion/ddpm.py · 88 symbols

colossalai/legacy/pipeline/rpc/_pipeline_base.py · 85 symbols

colossalai/shardformer/layer/_operation.py · 74 symbols

examples/tutorial/sequence_parallel/data/datasets/indexed_dataset.py · 71 symbols

applications/ColossalChat/examples/community/ray/train_prompts_on_ray.py · 71 symbols

colossalai/shardformer/modeling/chatglm2_6b/modeling_chatglm.py · 62 symbols

colossalai/legacy/trainer/hooks/_metric_hook.py · 59 symbols

colossalai/legacy/nn/layer/parallel_1d/layers.py · 58 symbols

colossalai/fx/tracer/experimental.py · 56 symbols

colossalai/booster/plugin/hybrid_parallel_plugin.py · 56 symbols

Used by 1 indexed graph (manifest dependencies, hub-wide)

github.com/hpcaitech/Open-Sora

Dependencies from manifests, versioned

PuLP 2.7.0 · 1×

Requests 2.31.0 · 1×

SentencePiece 0.1.99 · 1×

accelerate 0.20.3 · 1×

albumentations 1.3.0 · 1×

autoflake 2.2.1 · 1×

bitsandbytes 0.39.0 · 1×

black 23.9.1 · 1×

chromadb 0.4.9 · 1×

colossalai 0.4.7 · 1×

coverage 7.2.3 · 1×

datasets 2.14.7 · 1×

For agents

$ claude mcp add ColossalAI \
  -- python -m otcore.mcp_server <graph>
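Since the graph artifact is labeled as SQLite, it can also be inspected directly before wiring it into an agent. This is a minimal sketch using only Python's standard `sqlite3` module. It lists the table names and assumes nothing about the schema, which is not documented on this page. The helper name `list_tables` is made up, and any file name you pass in stands in for wherever you saved the artifact.

```python
import sqlite3

def list_tables(db_path):
    """Return the names of all tables in a SQLite file, sorted alphabetically."""
    # Note: sqlite3.connect creates an empty database if the path does not exist.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]

# A fresh in-memory database has no tables; replace ":memory:" with the
# path of the downloaded artifact to see its actual tables.
print(list_tables(":memory:"))
```

From there, `PRAGMA table_info(<table>)` on any listed table shows its columns, which is a reasonable first step toward understanding what the graph stores.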
