
github.com/datawhalechina/fun-rec @main sqlite

1,233 symbols · 4,216 edges · 185 files · 461 documented (37%)
README


Deep Recommendation Algorithms in Practice (wheat-book)

From Cascade Architecture to Generative Paradigm


Note: This project is still under development and is updated frequently. Pull Requests are not accepted at this time. If you have suggestions or run into problems, please give feedback by opening an Issue.

This book systematically covers the full technical evolution of recommendation systems, from classical cascade architectures to the generative paradigm. It is organized into two parts. The first covers candidate retrieval techniques (collaborative filtering, embedding-based retrieval, and sequential retrieval) along with ranking and re-ranking methods such as feature crossing, multi-objective modeling, and multi-scenario modeling. The second focuses on frontier generative recommendation, encompassing LLM foundations, Scaling Law architecture exploration, end-to-end generative modeling, chain-of-thought reasoning, and diffusion-based recommendation, and culminates in a hands-on production-grade system project. The book is ideal for readers with a machine learning background who want to systematically master both the theory and engineering practice of recommendation algorithms.

📖 Table of Contents

Part I: Cascade Architecture

  • 1. Introduction to Recommendation Systems
      • What is a Recommendation System? / Book Overview
  • 2. Fast Candidate Retrieval
      • Collaborative Filtering: Item-based CF / User-based CF / Matrix Factorization
      • Embedding-based Retrieval: I2I / U2I
      • Sequential Retrieval: User Interest Representation / Full-history Modeling & Streaming Index
  • 3. Precise Preference Prediction
      • Memorization & Generalization
      • Feature Crossing: 2nd-order / Higher-order
      • Sequential Modeling: Local Activation Attention / Interest Evolution Modeling / Behavior-to-Session Modeling
      • Multi-objective Modeling: Architecture Evolution / Task Dependency Modeling / Multi-loss Optimization
      • Multi-scenario Modeling: Multi-tower Architecture / Dynamic Weight Modeling
  • 4. Re-ranking & Diversity Modeling
      • Greedy-based Re-ranking: Maximum Marginal Relevance / Determinantal Point Process
      • Personalized Re-ranking: Transformer Re-ranking Model / Permutation-based Re-ranking Model
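
To make the greedy re-ranking entry in Chapter 4 more concrete, here is a minimal sketch of Maximum Marginal Relevance in plain Python. It is not taken from the book or from `src/funrec`; the function names `cosine` and `mmr_rerank`, the `lam` trade-off parameter, and the toy data are all assumptions made for this illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_rerank(relevance, embeddings, k, lam=0.7):
    """Greedily pick up to k items, trading relevance against redundancy.

    relevance:  dict item_id -> relevance score from the ranking stage
    embeddings: dict item_id -> item vector used to measure similarity
    lam:        1.0 means pure relevance, 0.0 means pure diversity
    (hypothetical signature, chosen for this sketch only)
    """
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def mmr_score(item):
            # Redundancy = highest similarity to anything already selected.
            max_sim = max(
                (cosine(embeddings[item], embeddings[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[item] - (1 - lam) * max_sim

        # Sorting the candidates makes tie-breaking deterministic.
        best = max(sorted(candidates), key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: "a" and "b" are near-duplicates, "c" is different but less relevant.
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
embeddings = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
print(mmr_rerank(relevance, embeddings, k=2, lam=0.5))
```

With `lam=0.5` the near-duplicate item `b` is skipped in favor of `c`, so the result is `['a', 'c']`; setting `lam=1.0` reduces the procedure to sorting by relevance.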

Part II: Generative Paradigm

  • 5. Foundations of Generative Recommendation
      • Evolution of Recommendation Paradigms: Discriminative Modeling / Generative Core Ideas / Essential Differences
      • Building Blocks of Generative Architectures: Transformer / Diffusion Models
      • Fundamentals of LLM Modeling: Three-stage Paradigm / From LLM to Generative Recommendation
      • Tokenizer Techniques for Recommendation: Paradigm Evolution / End-to-end Discretization / Industrial Solutions / Key Challenges
  • 6. Scaling Law Architecture Exploration
      • HSTU Architecture Evolution: First Scaling Law Exploration / Engineering Breakthrough / Hybrid Paradigm Breakthrough
      • Hardware-aware Architecture Design: HW-Aware Unified Architecture / Unified Sequence & Feature Interaction Modeling
  • 7. End-to-End Generative Modeling
      • OneRec Architecture Evolution: OneRec-V1 Pioneering Exploration / OneRec-V2 Efficiency & Performance Breakthrough
      • Query Completion & Product Retrieval: OneSug Query Completion / OneSearch Product Retrieval
      • Auction Mechanisms & Multi-scenario Advertising: EGA Unified Auction & Generation / GPR Pretrained Ad Generation
  • 8. Reasoning-enhanced Recommendation
      • Unifying Collaborative and Linguistic Semantics: Item Index Learning / Semantic Alignment Training / PLUM Framework
      • OneRec-Think Reasoning Framework: Item Alignment / Reasoning Activation / Reasoning Enhancement / Think-Ahead Architecture
      • Exploration of Autonomous Reasoning: RecZero / RecOne / Future Directions
  • 9. Diffusion-based Recommendation
      • Fundamentals of Diffusion Models: Diffusion Taxonomy / Forward Noising & Reverse Denoising / Training & Sampling / Conditional Generation
      • Diffusion-based Data Augmentation: DiffuASR Sequential Augmentation / Diff-MSR Cross-scenario Augmentation
      • Feature Enhancement & Diversity Optimization: AsymDiffRec Feature Enhancement / DMSG Diversity Optimization
  • 10. Production-grade Recommendation System
      • Project Background & Goals / System Architecture Design / Offline Pipeline / Online Pipeline / Frontend & Interaction / Deployment & Operations

We have also set up a FunRec learning community (a WeChat group plus a Knowledge Planet). The WeChat group is convenient for day-to-day discussion, while the Knowledge Planet is better suited to preserving content. Recordings of earlier technical talks are available on Bilibili. Because the WeChat group's QR code is valid for only 7 days, please add the WeChat account below with the remark "Fun-Rec", and you will be invited to a Fun-Rec discussion group. If you find the WeChat group too noisy, we recommend joining the Knowledge Planet directly.

image-20220408193745249

Thanks

Core Contributors

  • Ruyi Luo: MSc, Xidian University; Senior Recommendation Algorithm Engineer
  • Bo Kang: Visiting Professor, Ghent University; Co-founder of nobl.ai

Special thanks to @kenken-xr, @swallown1, @Lyons-T, @zhongqiangwu960812, @wangych6, @morningsky, @hilbert-yaa, @maxxbaba, @pearfl, @ChungKingExpress, @storyandwine, @SYC1123, @luzixiao, @Evan-wyl, @Sm1les, and @LSGOMYP for their early help and support of this project.

Follow Us

Scan the QR code below to follow the Datawhale Official Account

Datawhale is a learning community focused on AI. Our mission is "for the learner, and grow together with learners". Thousands of people have joined the community so far, and we have organized study programs in fields such as machine learning, deep learning, data analysis, data mining, web crawling, programming, statistics, MySQL, and data competitions. You can join us by searching for the Datawhale Official Account on WeChat.

LICENSE

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Core symbols most depended-on inside this repo

call · called by 342 · src/funrec/models/layers.py
r · called by 230 · docs/_static/sphinx_materialdesign_theme.js
e · called by 188 · docs/_static/sphinx_materialdesign_theme.js
t · called by 150 · docs/_static/sphinx_materialdesign_theme.js
range · called by 121 · docs/_static/underscore-1.13.1.js
i · called by 83 · docs/_static/sphinx_materialdesign_theme.js
n · called by 63 · docs/_static/sphinx_materialdesign_theme.js
a · called by 57 · docs/_static/sphinx_materialdesign_theme.js

Shape

Function 767
Method 317
Class 127
Route 22

Languages

Python 55%
TypeScript 45%

Modules by API surface

src/funrec/models/layers.py · 150 symbols
docs/_static/underscore-1.13.1.js · 114 symbols
docs/_static/jquery-3.6.0.js · 112 symbols
docs/_static/jquery.js · 81 symbols
docs/_static/underscore.js · 72 symbols
docs/_static/sphinx_materialdesign_theme.js · 63 symbols
docs/_static/material-design-lite-1.3.0/material.js · 32 symbols
web_project/backend/app/schemas.py · 26 symbols
src/funrec/models/fm_recall.py · 23 symbols
docs/_static/material-design-lite-1.3.0/material.min.js · 23 symbols
web_project/backend/online/pipeline.py · 22 symbols
src/funrec/models/dien.py · 19 symbols

Dependencies from manifests, versioned

@vitejs/plugin-vue 6.0.1 · 1×
autoprefixer 10.4.17 · 1×
axios 1.6.5 · 1×
pinia 3.0.4 · 1×
postcss 8.4.33 · 1×
tailwindcss 3.4.1 · 1×
vite 7.2.1 · 1×
vue 3.4.15 · 1×
vue-router 4.2.5 · 1×
elasticsearch 9.2.0 · 1×
faiss-cpu 1.7.4 · 1×
fastapi 0.103.2 · 1×

Datastores touched

funrec_db · Database · 1 repo

For agents

$ claude mcp add fun-rec \
  -- python -m otcore.mcp_server <graph>
