github.com/RUCAIBox/LLMSurvey @main sqlite

1,240 symbols · 3,455 edges · 143 files · 104 documented (8%)

README

LLMSurvey

A collection of papers and resources related to Large Language Models.

The organization of papers refers to our survey "A Survey of Large Language Models".

If you find a mistake or have any suggestions, please let us know by e-mail: batmanfly@gmail.com

(We suggest also cc'ing francis_kun_zhou@163.com in case the first address fails to deliver.)

If you find our survey useful for your research, please cite the following paper:

@article{LLMSurvey,
    title={A Survey of Large Language Models},
    author={Zhao, Wayne Xin and Zhou, Kun and Li, Junyi and Tang, Tianyi and Wang, Xiaolei and Hou, Yupeng and Min, Yingqian and Zhang, Beichen and Zhang, Junjie and Dong, Zican and Du, Yifan and Yang, Chen and Chen, Yushuo and Chen, Zhipeng and Jiang, Jinhao and Ren, Ruiyang and Li, Yifan and Tang, Xinyu and Liu, Zikang and Liu, Peiyu and Nie, Jian-Yun and Wen, Ji-Rong},
    year={2023},
    journal={arXiv preprint arXiv:2303.18223},
    url={http://arxiv.org/abs/2303.18223}
}

🚀(New) We have released the Chinese book of our survey!

The Chinese book focuses on providing explanations for beginners in the field of LLMs, aiming to present a comprehensive framework and roadmap for LLMs. This book is suitable for senior undergraduate students and junior graduate students with a foundation in deep learning and can serve as an introductory technical book. You can download the Chinese book at https://llmbook-zh.github.io/.

Here is our Chinese book sales page.

chinese_version

🚀(New) The content about long CoT reasoning

In our latest version, we add new content on the recently popular reasoning paradigm of allocating more time to thinking before responding to a problem. We focus on long CoT reasoning, which is the mainstream approach taken by recent LLMs such as DeepSeek-R1 and OpenAI's o-series models. We first discuss the reasoning patterns and advantages of the long CoT paradigm. Then we present approaches for constructing long CoT data, including data distillation, search-based data synthesis, and multi-agent collaboration. Moreover, we introduce the two commonly used training methods: long CoT instruction tuning and scaling reinforcement learning training. Finally, we conduct an in-depth discussion of recent test-time scaling efforts for LLMs.

Cover

The trends of the number of papers related to LLMs on arXiv

Here are the trends of the cumulative numbers of arXiv papers that contain the keyphrases “language model” (since June 2018) and “large language model” (since October 2019), respectively.

arxiv_llms

The statistics are calculated by exact-match queries for the keyphrases in the title or abstract, aggregated by month. We set different x-axis ranges for the two keyphrases because “language model” was explored earlier. We label the points corresponding to important landmarks in the research progress of LLMs. A sharp increase occurs after the release of ChatGPT: the average number of published arXiv papers that contain “large language model” in the title or abstract goes from 0.40 per day to 8.58 per day.
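
The counting procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' script: the record layout (`date`, `title`, `abstract`), the function names, and the sample papers are all hypothetical, and "exact match" is assumed here to mean a case-insensitive substring match.

```python
from collections import Counter
from itertools import accumulate


def monthly_keyphrase_counts(papers, keyphrase):
    """Count papers per month whose title or abstract contains the keyphrase.

    Assumption: "exact match" is a case-insensitive substring test.
    """
    phrase = keyphrase.lower()
    counts = Counter()
    for paper in papers:
        in_title = phrase in paper["title"].lower()
        in_abstract = phrase in paper["abstract"].lower()
        if in_title or in_abstract:
            counts[paper["date"][:7]] += 1  # "YYYY-MM-DD" -> "YYYY-MM"
    return dict(sorted(counts.items()))


def cumulative_counts(monthly):
    """Turn per-month counts into a running total, in month order."""
    months = list(monthly)
    return dict(zip(months, accumulate(monthly[m] for m in months)))


# Hypothetical sample records, for illustration only.
papers = [
    {"date": "2022-11-03", "title": "A Large Language Model for X", "abstract": "..."},
    {"date": "2022-12-10", "title": "Vision study", "abstract": "We probe large language models."},
    {"date": "2022-12-20", "title": "Graph methods", "abstract": "No match here."},
    {"date": "2023-01-05", "title": "Large Language Model agents", "abstract": ""},
]

monthly = monthly_keyphrase_counts(papers, "large language model")
print(cumulative_counts(monthly))
```

Under this substring assumption, a query for “language model” would also count papers that mention “large language model”, so the first curve would include the papers counted by the second.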

Technical Evolution of GPT-series Models

A brief illustration of the technical evolution of GPT-series models. We plot this figure mainly based on the papers, blog articles, and official APIs from OpenAI. Here, solid lines denote that there is explicit evidence (e.g., an official statement that a new model is developed based on a base model) for the evolution path between two models, while dashed lines denote a relatively weaker evolution relation.

gpt-series

Evolutionary Graph of LLaMA Family

An evolutionary graph of the research work conducted on LLaMA. Due to the huge number of LLaMA variants, we cannot include all of them in this figure, and much excellent work is necessarily left out.

LLaMA_family

To support incremental updates, we share the source file of this figure and welcome readers to add the models they want by submitting pull requests on our GitHub page. If you're interested, please request by application.

Prompts

We collect some useful tips for designing prompts, drawn from online notes and the experiences of our authors, and we also show the related ingredients and principles (introduced in Section 8.1).

prompt examples

Please click here to view more detailed information.

We welcome everyone to contribute more relevant tips in the form of issues. After selection, we will regularly update them on GitHub and credit the source.

Experiments

Instruction Tuning Experiments

We explore the effect of different types of instructions in fine-tuning LLMs (i.e., 7B LLaMA), as well as examine the usefulness of several instruction improvement strategies.

instruction_tuning_table

Please click here to view more detailed information.

Ability Evaluation Experiments

We conduct a fine-grained evaluation on the abilities discussed in Section 7.1 and Section 7.2. For each kind of ability, we select representative tasks and datasets for conducting evaluation experiments to examine the corresponding performance of LLMs.

ability_main

Please click here to view more detailed information.

We also call for support of computing power for conducting more comprehensive experiments.

LLMSurvey
Chinese Version
🚀(New) The trends of the number of papers related to LLMs on arXiv
🚀(New) Technical Evolution of GPT-series Models
🚀(New) Evolutionary Graph of LLaMA Family
🚀(New) Prompts
🚀(New) Experiments
- Instruction Tuning Experiments
- Ability Evaluation Experiments
Table of Contents
Timeline of LLMs
List of LLMs
Paper List
Acknowledgments
Update Log

Timeline of LLMs

LLMs_timeline

List of LLMs

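Category	Model	Release Time	Size (B)	Link
Publicly Accessible	T5	2019/10	11	Paper
Publicly Accessible	mT5	2021/03	13	Paper
Publicly Accessible	PanGu-α	2021/05	13	Paper
Publicly Accessible	CPM-2	2021/05	198	Paper
Publicly Accessible	T0	2021/10	11	Paper
Publicly Accessible	GPT-NeoX-20B	2022/02	20	Paper
Publicly Accessible	CodeGen	2022/03	16	Paper
Publicly Accessible	Tk-Instruct	2022/04	11	Paper
Publicly Accessible	UL2	2022/02	20	Paper
Publicly Accessible	OPT	2022/05	175	Paper
Publicly Accessible	YaLM	2022/06	100	GitHub
Publicly Accessible	NLLB	2022/07	55	Paper
Publicly Accessible	BLOOM	2022/07	176	Paper
Publicly Accessible	GLM	2022/08	130	Paper
Publicly Accessible	Flan-T5	2022/10	11	Paper
Publicly Accessible	mT0	2022/11	13	Paper
Publicly Accessible	Galactica	2022/11	120	Paper
Publicly Accessible	BLOOMZ	2022/11	176	Paper
Publicly Accessible	OPT-IML	2022/12	175	Paper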
Pyt

Core symbols most depended-on inside this repo

load

called by 74

Experiments/InstructTuning/mmlu/modeling.py

write

called by 68

Experiments/ToolManipulation/HotPotQA/wrappers.py

tree_to_variable_index

called by 50

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/parser/utils.py

commaSep1

called by 18

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/grammar.js

called by 17

Experiments/ToolManipulation/HotPotQA/wrappers.py

normalize_answer

called by 8

Experiments/ToolManipulation/HotPotQA/wrappers.py

process_single_data

called by 6

Experiments/SymbolicReasoning/data_process.py

ngrams

called by 6

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/utils.py
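
A ranking like the one above can be derived from a call graph by counting, for each symbol, how many other symbols call it. The sketch below is illustrative only: the edge list is hypothetical, and counting distinct callers (ignoring self-calls) is an assumption, since the index does not state how "called by" is defined.

```python
from collections import defaultdict


def rank_by_callers(call_edges, top_n=3):
    """Rank callees by their number of distinct callers (in-degree).

    call_edges: iterable of (caller, callee) pairs.
    Assumption: self-calls and repeated edges are not counted.
    """
    callers = defaultdict(set)
    for caller, callee in call_edges:
        if caller != callee:
            callers[callee].add(caller)
    # Sort by caller count (descending), then by name for stable ties.
    ranked = sorted(callers.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(symbol, len(who)) for symbol, who in ranked[:top_n]]


# Hypothetical call edges, for illustration only.
edges = [
    ("main", "load"),
    ("eval", "load"),
    ("train", "load"),
    ("eval", "load"),  # duplicate edge, counted once
    ("load", "load"),  # self-call, ignored
    ("eval", "write"),
    ("main", "write"),
    ("eval", "normalize_answer"),
]

for symbol, count in rank_by_callers(edges):
    print(f"{symbol}: called by {count}")
```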

Shape

Method 696

Function 392

Class 151

Route 1

Languages

Python 100%

TypeScript 1%

Modules by API surface

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python3.8_grammar.py · 155 symbols

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python2-grammar.py · 101 symbols

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python2-grammar-crlf.py · 101 symbols

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python3-grammar.py · 98 symbols

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/parser/tree-sitter-python/examples/python3-grammar-crlf.py · 98 symbols

Experiments/ToolManipulation/HotPotQA/wrappers.py · 32 symbols

Experiments/InstructTuning/mmlu/modeling.py · 29 symbols

Experiments/SymbolicReasoning/data_process.py · 26 symbols

Experiments/MathematicalReasoning/data_process.py · 26 symbols

Experiments/LanguageGeneration/HumanEval/model.py · 22 symbols

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/weighted_ngram_match.py · 16 symbols

Experiments/ToolManipulation/Gorilla/eval/eval-scripts/codebleu/bleu.py · 15 symbols

Dependencies from manifests, versioned

nan 2.15.0 · 1×

tree-sitter-cli 0.20.1 · 1×

Flask 2.3.2 · 1×

Flask-Slack 0.1.5 · 1×

GitPython 3.1.31 · 1×

Levenshtein 0.21.0 · 1×

Pillow 9.4.0 · 1×

PyYAML 6.0 · 1×

Werkzeug 2.3.4 · 1×

accelerate 0.17.1 · 1×

aiohttp 3.8.4 · 1×

aiosignal 1.3.1 · 1×

For agents

$ claude mcp add LLMSurvey \
  -- python -m otcore.mcp_server <graph>
