
github.com/datawhalechina/fun-rec @main sqlite

1,233 symbols · 4,216 edges · 185 files · 461 documented · 37%

README

Deep Recommendation Algorithms in Practice (wheat-book)

From Cascade Architecture to Generative Paradigm


Note: This project is under active development and is updated frequently. Pull requests are not being accepted at this time. If you have suggestions or run into problems, please open an Issue.

This book systematically covers the full technical evolution of recommendation systems, from classical cascade architectures to the generative paradigm. It is organized into two parts. Part I covers candidate retrieval techniques, including collaborative filtering, embedding-based retrieval, and sequential retrieval, along with ranking and re-ranking methods such as feature crossing, multi-objective modeling, and multi-scenario modeling. Part II focuses on frontier generative recommendation, encompassing LLM foundations, Scaling Law architecture exploration, end-to-end generative modeling, chain-of-thought reasoning, and diffusion-based recommendation, and culminates in a hands-on production-grade system project. The book is aimed at readers with a machine learning background who want to systematically master both the theory and the engineering practice of recommendation algorithms.
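To make the term "cascade architecture" concrete, here is a minimal sketch of the three stages named above (retrieval, ranking, re-ranking) on toy data. It is not taken from the book or from the `funrec` package: the interaction log, the function names (`item_similarity`, `retrieve`, `rank`, `rerank_mmr`, `recommend`), and the parameter values are all hypothetical, and the ranking stage is a placeholder that reuses the retrieval score where a real system would call a trained model.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy interaction log: user -> set of items the user interacted with.
INTERACTIONS = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b", "d"},
    "u3": {"b", "c", "e"},
    "u4": {"d", "e"},
}


def item_similarity(interactions):
    """Cosine similarity between items, computed from the users they share."""
    users_of = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            users_of[item].add(user)
    sim = defaultdict(dict)
    for i in users_of:
        for j in users_of:
            if i == j:
                continue
            overlap = len(users_of[i] & users_of[j])
            if overlap:
                sim[i][j] = overlap / sqrt(len(users_of[i]) * len(users_of[j]))
    return sim


def retrieve(user, interactions, sim, k=10):
    """Stage 1 (retrieval): item-based CF over items the user has not seen."""
    seen = interactions[user]
    scores = defaultdict(float)
    for i in seen:
        for j, s in sim[i].items():
            if j not in seen:
                scores[j] += s
    top = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return dict(top)


def rank(candidates):
    """Stage 2 (ranking): placeholder that sorts by the retrieval score.

    A real ranker would score each candidate with a trained model.
    """
    return sorted(candidates, key=lambda item: (-candidates[item], item))


def rerank_mmr(ranked, scores, sim, lam=0.7, n=5):
    """Stage 3 (re-ranking): greedy Maximum Marginal Relevance.

    Each step picks the item maximizing
    lam * relevance - (1 - lam) * (max similarity to items already selected).
    """
    selected = []
    remaining = list(ranked)
    while remaining and len(selected) < n:
        def mmr(item):
            redundancy = max(
                (sim[item].get(s, 0.0) for s in selected), default=0.0
            )
            return lam * scores[item] - (1 - lam) * redundancy

        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected


def recommend(user, n=5):
    """Run the three stages in sequence for one user."""
    sim = item_similarity(INTERACTIONS)
    candidates = retrieve(user, INTERACTIONS, sim)
    ranked = rank(candidates)
    return rerank_mmr(ranked, candidates, sim, n=n)
```

Calling `recommend("u4")` returns only items that `u4` has not interacted with. Each stage narrows or reorders the output of the previous one, which is the defining property of a cascade; the generative paradigm covered in Part II replaces this staged pipeline with a single model.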

📖 Table of Contents

Part I: Cascade Architecture

1. Introduction to Recommendation Systems
What is a Recommendation System? / Book Overview
2. Fast Candidate Retrieval
Collaborative Filtering: Item-based CF / User-based CF / Matrix Factorization
Embedding-based Retrieval: I2I / U2I
Sequential Retrieval: User Interest Representation / Full-history Modeling & Streaming Index
3. Precise Preference Prediction
Memorization & Generalization
Feature Crossing: 2nd-order / Higher-order
Sequential Modeling: Local Activation Attention / Interest Evolution Modeling / Behavior-to-Session Modeling
Multi-objective Modeling: Architecture Evolution / Task Dependency Modeling / Multi-loss Optimization
Multi-scenario Modeling: Multi-tower Architecture / Dynamic Weight Modeling
4. Re-ranking & Diversity Modeling
Greedy-based Re-ranking: Maximum Marginal Relevance / Determinantal Point Process
Personalized Re-ranking: Transformer Re-ranking Model / Permutation-based Re-ranking Model

Part II: Generative Paradigm

5. Foundations of Generative Recommendation
Evolution of Recommendation Paradigms: Discriminative Modeling / Generative Core Ideas / Essential Differences
Building Blocks of Generative Architectures: Transformer / Diffusion Models
Fundamentals of LLM Modeling: Three-stage Paradigm / From LLM to Generative Recommendation
Tokenizer Techniques for Recommendation: Paradigm Evolution / End-to-end Discretization / Industrial Solutions / Key Challenges
6. Scaling Law Architecture Exploration
HSTU Architecture Evolution: First Scaling Law Exploration / Engineering Breakthrough / Hybrid Paradigm Breakthrough
Hardware-aware Architecture Design: HW-Aware Unified Architecture / Unified Sequence & Feature Interaction Modeling
7. End-to-End Generative Modeling
OneRec Architecture Evolution: OneRec-V1 Pioneering Exploration / OneRec-V2 Efficiency & Performance Breakthrough
Query Completion & Product Retrieval: OneSug Query Completion / OneSearch Product Retrieval
Auction Mechanisms & Multi-scenario Advertising: EGA Unified Auction & Generation / GPR Pretrained Ad Generation
8. Reasoning-enhanced Recommendation
Unifying Collaborative and Linguistic Semantics: Item Index Learning / Semantic Alignment Training / PLUM Framework
OneRec-Think Reasoning Framework: Item Alignment / Reasoning Activation / Reasoning Enhancement / Think-Ahead Architecture
Exploration of Autonomous Reasoning: RecZero / RecOne / Future Directions
9. Diffusion-based Recommendation
Fundamentals of Diffusion Models: Diffusion Taxonomy / Forward Noising & Reverse Denoising / Training & Sampling / Conditional Generation
Diffusion-based Data Augmentation: DiffuASR Sequential Augmentation / Diff-MSR Cross-scenario Augmentation
Feature Enhancement & Diversity Optimization: AsymDiffRec Feature Enhancement / DMSG Diversity Optimization
10. Production-grade Recommendation System
Project Background & Goals / System Architecture Design / Offline Pipeline / Online Pipeline / Frontend & Interaction / Deployment & Operations

We have also set up a FunRec learning community consisting of a WeChat group and a Knowledge Planet group. The WeChat group suits day-to-day discussion, while the Knowledge Planet is better suited to archiving content. Recordings of earlier technical talks, along with all other technical sharing content, are available on Bilibili. Because a WeChat group QR code is valid for only 7 days, please add the WeChat contact below with the remark "Fun-Rec", and you will be added to a Fun-Rec discussion group. If you find the WeChat group too noisy, we recommend joining the Knowledge Planet directly.

Thanks

Core Contributors

Ruyi Luo · MSc, Xidian University · Senior Recommendation Algorithm Engineer

Bo Kang · Visiting Professor, Ghent University · Co-founder of nobl.ai

Special thanks to kenken-xr, swallown1, Lyons-T, zhongqiangwu960812, @wangych6, @morningsky, @hilbert-yaa, @maxxbaba, @pearfl, @ChungKingExpress, @storyandwine, @SYC1123, @luzixiao, @Evan-wyl, @Sm1les, and @LSGOMYP for their early help and support of this project.

Scan the QR code below to follow the Datawhale Official Account

Datawhale is a learning community focused on the field of AI. Our mission is to serve learners and to grow together with them. Thousands of people have joined the community so far, and we have organized study programs in fields such as machine learning, deep learning, data analysis, data mining, web crawling, programming, statistics, MySQL, and data competitions. You can join us by searching for the Datawhale Official Account on WeChat.

LICENSE

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Core symbols most depended-on inside this repo

call · called by 342 · src/funrec/models/layers.py

called by 230 · docs/_static/sphinx_materialdesign_theme.js

called by 188 · docs/_static/sphinx_materialdesign_theme.js

called by 150 · docs/_static/sphinx_materialdesign_theme.js

range · called by 121 · docs/_static/underscore-1.13.1.js

called by 83 · docs/_static/sphinx_materialdesign_theme.js

called by 63 · docs/_static/sphinx_materialdesign_theme.js

called by 57 · docs/_static/sphinx_materialdesign_theme.js

Shape

Function 767

Method 317

Class 127

Route 22

Languages

Python 55%

TypeScript 45%

Modules by API surface

src/funrec/models/layers.py · 150 symbols

docs/_static/underscore-1.13.1.js · 114 symbols

docs/_static/jquery-3.6.0.js · 112 symbols

docs/_static/jquery.js · 81 symbols

docs/_static/underscore.js · 72 symbols

docs/_static/sphinx_materialdesign_theme.js · 63 symbols

docs/_static/material-design-lite-1.3.0/material.js · 32 symbols

web_project/backend/app/schemas.py · 26 symbols

src/funrec/models/fm_recall.py · 23 symbols

docs/_static/material-design-lite-1.3.0/material.min.js · 23 symbols

web_project/backend/online/pipeline.py · 22 symbols

src/funrec/models/dien.py · 19 symbols

Dependencies from manifests, versioned

@vitejs/plugin-vue 6.0.1 · 1×

autoprefixer 10.4.17 · 1×

axios 1.6.5 · 1×

pinia 3.0.4 · 1×

postcss 8.4.33 · 1×

tailwindcss 3.4.1 · 1×

vite 7.2.1 · 1×

vue 3.4.15 · 1×

vue-router 4.2.5 · 1×

elasticsearch 9.2.0 · 1×

faiss-cpu 1.7.4 · 1×

fastapi 0.103.2 · 1×

Datastores touched

funrec_db · Database · 1 repo

For agents

$ claude mcp add fun-rec \
  -- python -m otcore.mcp_server <graph>
