github.com/microsoft/RD-Agent @v0.8.0 sqlite

2,426 symbols 10,027 edges 446 files 873 documented · 36%

README

🖥️ Live Demo | 🎥 Demo Video ▶️YouTube | 📖 Documentation | 📄 Tech Report | 📃 Papers

[![CI](https://github.com/microsoft/RD-Agent/actions/workflows/ci.yml/badge.svg)](https://github.com/microsoft/RD-Agent/actions/workflows/ci.yml)
[![CodeQL](https://github.com/microsoft/RD-Agent/actions/workflows/github-code-scanning/codeql/badge.svg)](https://github.com/microsoft/RD-Agent/actions/workflows/github-code-scanning/codeql)
[![Dependabot Updates](https://github.com/microsoft/RD-Agent/actions/workflows/dependabot/dependabot-updates/badge.svg)](https://github.com/microsoft/RD-Agent/actions/workflows/dependabot/dependabot-updates)
[![Lint PR Title](https://github.com/microsoft/RD-Agent/actions/workflows/pr.yml/badge.svg)](https://github.com/microsoft/RD-Agent/actions/workflows/pr.yml)
[![Release.yml](https://github.com/microsoft/RD-Agent/actions/workflows/release.yml/badge.svg)](https://github.com/microsoft/RD-Agent/actions/workflows/release.yml)
[![Platform](https://img.shields.io/badge/platform-Linux-blue)](https://pypi.org/project/rdagent/#files)
[![PyPI](https://img.shields.io/pypi/v/rdagent)](https://pypi.org/project/rdagent/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/rdagent)](https://pypi.org/project/rdagent/)
[![Release](https://img.shields.io/github/v/release/microsoft/RD-Agent)](https://github.com/microsoft/RD-Agent/releases)
[![GitHub](https://img.shields.io/github/license/microsoft/RD-Agent)](https://github.com/microsoft/RD-Agent/blob/main/LICENSE)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)
[![Checked with mypy](https://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Chat](https://img.shields.io/badge/chat-discord-blue)](https://discord.gg/ybQ97B6Jjy)
[![Documentation Status](https://readthedocs.org/projects/rdagent/badge/?version=latest)](https://rdagent.readthedocs.io/en/latest/?badge=latest)
[![Readthedocs Preview](https://github.com/microsoft/RD-Agent/actions/workflows/readthedocs-preview.yml/badge.svg)](https://github.com/microsoft/RD-Agent/actions/workflows/readthedocs-preview.yml)
[![arXiv](https://img.shields.io/badge/arXiv-2505.14738-00ff00.svg)](https://arxiv.org/abs/2505.14738)

# 📰 News

| 🗞️ News | 📝 Description |
| -- | ------ |
| NeurIPS 2025 Acceptance | We are thrilled to announce that our paper [R&D-Agent-Quant](https://arxiv.org/abs/2505.15155) has been accepted to NeurIPS 2025 |
| [Technical Report Release](#overall-technical-report) | Overall framework description and results on MLE-bench |
| [R&D-Agent-Quant Release](#deep-application-in-diverse-scenarios) | Apply R&D-Agent to quant trading |
| MLE-Bench Results Released | R&D-Agent currently leads as the [top-performing machine learning engineering agent](#-the-best-machine-learning-engineering-agent) on MLE-bench |
| Support LiteLLM Backend | We now fully support **[LiteLLM](https://github.com/BerriAI/litellm)** as our default backend for integration with multiple LLM providers. |
| General Data Science Agent | [Data Science Agent](https://rdagent.readthedocs.io/en/latest/scens/data_science.html) |
| Kaggle Scenario release | We release **[Kaggle Agent](https://rdagent.readthedocs.io/en/latest/scens/data_science.html)**, try the new features! |
| Official WeChat group release | We created a WeChat group, welcome to join! (🗪[QR Code](https://github.com/microsoft/RD-Agent/issues/880)) |
| Official Discord release | We launch our first chatting channel in Discord (🗪[![Chat](https://img.shields.io/badge/chat-discord-blue)](https://discord.gg/ybQ97B6Jjy)) |
| First release | **R&D-Agent** is released on GitHub |

# 🏆 The Best Machine Learning Engineering Agent!
[MLE-bench](https://github.com/openai/mle-bench) is a comprehensive benchmark evaluating the performance of AI agents on machine learning engineering tasks. Utilizing datasets from 75 Kaggle competitions, MLE-bench provides robust assessments of AI systems' capabilities in real-world ML engineering scenarios.

R&D-Agent currently leads as the top-performing machine learning engineering agent on MLE-bench:

| Agent | Low == Lite (%) | Medium (%) | High (%) | All (%) |
|---------|--------|-----------|---------|----------|
| R&D-Agent o3(R)+GPT-4.1(D) | 51.52 ± 6.9 | 19.3 ± 5.5 | 26.67 ± 0 | 30.22 ± 1.5 |
| R&D-Agent o1-preview | 48.18 ± 2.49 | 8.95 ± 2.36 | 18.67 ± 2.98 | 22.4 ± 1.1 |
| AIDE o1-preview | 34.3 ± 2.4 | 8.8 ± 1.1 | 10.0 ± 1.9 | 16.9 ± 1.1 |

**Notes:**
- **O3(R)+GPT-4.1(D)**: This version is designed to both reduce average time per loop and leverage a cost-effective combination of backend LLMs by seamlessly integrating Research Agent (o3) with Development Agent (GPT-4.1).
- **AIDE o1-preview**: Represents the previously best public result on MLE-bench as reported in the original MLE-bench paper.
- The mean and standard deviation for R&D-Agent o1-preview are computed over 5 independent seeds, and those for R&D-Agent o3(R)+GPT-4.1(D) over 6 seeds.
- According to MLE-Bench, the 75 competitions are categorized into three levels of complexity: **Low==Lite** if we estimate that an experienced ML engineer can produce a sensible solution in under 2 hours, excluding the time taken to train any models; **Medium** if it takes between 2 and 10 hours; and **High** if it takes more than 10 hours.

You can inspect the detailed runs of the above results online:
- [R&D-Agent o1-preview detailed runs](https://aka.ms/RD-Agent_MLE-Bench_O1-preview)
- [R&D-Agent o3(R)+GPT-4.1(D) detailed runs](https://aka.ms/RD-Agent_MLE-Bench_O3_GPT41)

For running R&D-Agent on MLE-bench, refer to **[MLE-bench Guide: Running ML Engineering via MLE-bench](https://rdagent.readthedocs.io/en/latest/scens/data_science.html)**

# 🥇 The First Data-Centric Quant Multi-Agent Framework!

R&D-Agent for Quantitative Finance, in short **RD-Agent(Q)**, is the first data-centric, multi-agent framework designed to automate the full-stack research and development of quantitative strategies via coordinated factor-model co-optimization.

![image](https://github.com/user-attachments/assets/3198bc10-47ba-4ee0-8a8e-46d5ce44f45d)

Extensive experiments in real stock markets show that, at a cost under $10, RD-Agent(Q) achieves approximately 2× higher ARR than benchmark factor libraries while using over 70% fewer factors. It also surpasses state-of-the-art deep time-series models under smaller resource budgets. Its alternating factor–model optimization further delivers an excellent trade-off between predictive accuracy and strategy robustness.

You can learn more details about **RD-Agent(Q)** through the [paper](https://arxiv.org/abs/2505.15155) and reproduce it through the [documentation](https://rdagent.readthedocs.io/en/latest/scens/quant_agent_fin.html).

# Data Science Agent Preview

Check out our demo video showcasing the current progress of our Data Science Agent under development:

https://github.com/user-attachments/assets/3eccbecb-34a4-4c81-bce4-d3f8862f7305

# 🌟 Introduction

Our focused scenario

R&D-Agent aims to automate the most critical and valuable aspects of the industrial R&D process, and we begin by focusing on data-driven scenarios to streamline the development of models and data. Methodologically, we have identified a framework with two key components: 'R' for proposing new ideas and 'D' for implementing them. We believe that the automatic evolution of R&D will lead to solutions of significant industrial value.

R&D is a very general scenario. R&D-Agent can be your

- 💰 **Automatic Quant Factory** ([🎥Demo Video](https://rdagent.azurewebsites.net/factor_loop)|[▶️YouTube](https://www.youtube.com/watch?v=X4DK2QZKaKY&t=6s))
- 🤖 **Data Mining Agent:** Iteratively proposing data & models ([🎥Demo Video 1](https://rdagent.azurewebsites.net/model_loop)|[▶️YouTube](https://www.youtube.com/watch?v=dm0dWL49Bc0&t=104s)) ([🎥Demo Video 2](https://rdagent.azurewebsites.net/dmm)|[▶️YouTube](https://www.youtube.com/watch?v=VIaSTZuoZg4)) and implementing them by gaining knowledge from data.
- 🦾 **Research Copilot:** Auto read research papers ([🎥Demo Video](https://rdagent.azurewebsites.net/report_model)|[▶️YouTube](https://www.youtube.com/watch?v=BiA2SfdKQ7o)) / financial reports ([🎥Demo Video](https://rdagent.azurewebsites.net/report_factor)|[▶️YouTube](https://www.youtube.com/watch?v=ECLTXVcSx-c)) and implement model structures or build datasets.
- 🤖 **Kaggle Agent:** Auto Model Tuning and Feature Engineering ([🎥Demo Video Coming Soon...]()) and implementing them to achieve more in competitions.
- ...

You can click the links above to view the demo. We're continuously adding more methods and scenarios to the project to enhance your R&D processes and boost productivity.

Additionally, you can take a closer look at the examples in our **[🖥️ Live Demo](https://rdagent.azurewebsites.net/)**.

# ⚡ Quick start

### RD-Agent currently only supports Linux.

You can try the demos above by running the following commands:

### 🐳 Docker installation.

Users must ensure Docker is installed before attempting most scenarios. Please refer to the [official 🐳Docker page](https://docs.docker.com/engine/install/) for installation instructions.
Ensure the current user can run Docker commands **without using sudo**. You can verify this by executing `docker run hello-world`.

### 🐍 Create a Conda Environment

- Create a new conda environment with Python (3.10 and 3.11 are well-tested in our CI):
  ```sh
  conda create -n rdagent python=3.10
  ```
- Activate the environment:
  ```sh
  conda activate rdagent
  ```

### 🛠️ Install the R&D-Agent

#### For Users

- You can directly install the R&D-Agent package from PyPI:
  ```sh
  pip install rdagent
  ```

#### For Developers

- If you want to try the latest version or contribute to RD-Agent, you can install it from the source and follow the development setup:
  ```sh
  git clone https://github.com/microsoft/RD-Agent
  cd RD-Agent
  make dev
  ```

More details can be found in the [development setup](https://rdagent.readthedocs.io/en/latest/development.html).

### 💊 Health check

- rdagent provides a health check that currently checks two things:
  - whether the Docker installation was successful.
  - whether the default port used by the [rdagent ui](https://github.com/microsoft/RD-Agent?tab=readme-ov-file#%EF%B8%8F-monitor-the-application-results) is occupied.
  ```sh
  rdagent health_check --no-check-env
  ```

### ⚙️ Configuration

- The demos require the following capabilities:
  - ChatCompletion
  - json_mode
  - embedding query

You can set your Chat Model and Embedding Model in the following ways:

> **🔥 Attention**: We now provide experimental support for **DeepSeek** models! You can use DeepSeek's official API for cost-effective and high-performance inference. See the configuration example below for DeepSeek setup.
- **Using LiteLLM (Default)**: We now support LiteLLM as a backend for integration with multiple LLM providers. You can configure it in multiple ways:

  **Option 1: Unified API base for both models**

  *Configuration Example: `OpenAI` Setup:*

  ```bash
  cat << EOF > .env
  # Set to any model supported by LiteLLM.
  CHAT_MODEL=gpt-4o
  EMBEDDING_MODEL=text-embedding-3-small
  # Configure unified API base
  OPENAI_API_BASE=
  OPENAI_API_KEY=
  EOF
  ```

  *Configuration Example: `Azure OpenAI` Setup:*

  > Before using this configuration, please confirm in advance that your `Azure OpenAI API key` supports `embedding models`.

  ```bash
  cat << EOF > .env
  EMBEDDING_MODEL=azure/
  CHAT_MODEL=azure/
  AZURE_API_KEY=
  AZURE_API_BASE=
  AZURE_API_VERSION=
  EOF
  ```

  **Option 2: Separate API bases for Chat and Embedding models**

  ```bash
  cat << EOF > .env
  # Set to any model supported by LiteLLM.
  # Configure separate API bases for chat and embedding

  # CHAT MODEL:
  CHAT_MODEL=gpt-4o
  OPENAI_API_BASE=
  OPENAI_API_KEY=

  # EMBEDDING MODEL:
  # TAKE siliconflow as an example, you can use other providers.
  # Note: embedding requires litellm_proxy prefix
  EMBEDDING_MODEL=litellm_proxy/BAAI/bge-large-en-v1.5
  LITELLM_PROXY_API_KEY=
  LITELLM_PROXY_API_BASE=https://api.siliconflow.cn/v1
  EOF
  ```

  *Configuration Example: `DeepSeek` Setup:*

  > Since many users encounter configuration errors when setting up DeepSeek, here's a complete working example for DeepSeek setup:

  ```bash
  cat << EOF > .env
  # CHAT MODEL: Using DeepSeek Official API
  CHAT_MODEL=deepseek/deepseek-chat
  DEEPSEEK_API_KEY=
  # EM
  ```
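Before launching a scenario, it can help to confirm that a `.env` file actually sets the model keys used in the examples above. The following is a minimal sketch of such a pre-flight check. It is not how rdagent itself loads configuration, and `parse_env`, `missing_keys`, and `REQUIRED_KEYS` are hypothetical names introduced only for illustration.

```python
# Hypothetical pre-flight check for a .env file; not part of rdagent.
# Assumption: CHAT_MODEL and EMBEDDING_MODEL are the keys every setup above defines.
REQUIRED_KEYS = ("CHAT_MODEL", "EMBEDDING_MODEL")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not values.get(key)]


sample = """
# CHAT MODEL:
CHAT_MODEL=gpt-4o
OPENAI_API_KEY=
# EMBEDDING MODEL:
EMBEDDING_MODEL=litellm_proxy/BAAI/bge-large-en-v1.5
"""

parsed = parse_env(sample)
print(missing_keys(parsed))  # an empty list means both model keys are set
```

To check a real file, pass the result of `pathlib.Path(".env").read_text()` to `parse_env`. Provider credentials such as `OPENAI_API_KEY` are deliberately left out of `REQUIRED_KEYS`, since which ones are needed depends on the provider you choose.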

Core symbols most depended-on inside this repo

append · called by 453 · rdagent/oai/backend/deprec.py

get · called by 448 · rdagent/app/CI/run.py

called by 321 · rdagent/utils/agent/tpl.py

info · called by 208 · rdagent/log/logger.py

copy · called by 169 · rdagent/core/experiment.py

select · called by 85 · rdagent/scenarios/data_science/proposal/exp_gen/trace_scheduler.py

warning · called by 79 · rdagent/log/logger.py

log_object · called by 76 · rdagent/log/logger.py
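
A "called by" figure like the ones above can be read as the in-degree of a symbol in a call graph. The sketch below shows one way such a ranking might be computed. It assumes the graph is available as `(caller, callee)` pairs and that each distinct caller is counted once; whether this index counts distinct callers or individual call sites is not stated, and the edge data here is made up for illustration.

```python
from collections import Counter


def called_by_counts(edges):
    """Count distinct callers per callee from (caller, callee) pairs."""
    callers_of = {}
    for caller, callee in edges:
        callers_of.setdefault(callee, set()).add(caller)
    return Counter({callee: len(callers) for callee, callers in callers_of.items()})


# Hypothetical edges; the repeated pair is counted once.
edges = [
    ("a.run", "logger.info"),
    ("b.run", "logger.info"),
    ("a.run", "logger.info"),
    ("a.run", "experiment.copy"),
]

counts = called_by_counts(edges)
ranking = counts.most_common()  # most depended-on symbols first
print(ranking)
```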

Shape

Method 1,264

Function 636

Class 506

Route 20
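
As a quick consistency check, the four kinds listed above sum to the 2,426 symbols reported at the top of the page, and 873 documented symbols out of 2,426 rounds to the stated 36%. The arithmetic is shown below; only the variable names are introduced here.

```python
# Counts copied from the page; variable names are illustrative only.
shape = {"Method": 1264, "Function": 636, "Class": 506, "Route": 20}
total_symbols = sum(shape.values())

documented = 873
documented_share = documented / total_symbols

print(total_symbols)              # total across all four kinds
print(f"{documented_share:.1%}")  # documented fraction of all symbols
```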

Languages

Python 100%

Modules by API surface

test/notebook/test_util.py · 68 symbols

rdagent/log/ui/web.py · 63 symbols

rdagent/components/coder/model_coder/benchmark/gt_code/visnet.py · 61 symbols

rdagent/utils/env.py · 49 symbols

rdagent/core/experiment.py · 47 symbols

rdagent/oai/backend/base.py · 45 symbols

rdagent/core/proposal.py · 42 symbols

rdagent/scenarios/data_science/proposal/exp_gen/proposal.py · 41 symbols

rdagent/scenarios/data_science/debug/data.py · 41 symbols

rdagent/components/coder/CoSTEER/knowledge_management.py · 39 symbols

rdagent/components/knowledge_management/graph.py · 38 symbols

rdagent/app/CI/run.py · 36 symbols

Dependencies from manifests, versioned

litellm 1.73 · 1×

streamlit 1.47 · 1×
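
To compare a local environment against the versions listed above, the standard library's `importlib.metadata` can report what is installed. This is a minimal sketch: `LISTED`, `major_minor`, and `compare_installed` are hypothetical names, and it assumes the listed numbers are major.minor versions. The page does not say whether the manifest pins them exactly or uses them as lower bounds, so a mismatch is reported rather than treated as an error.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed on this page (assumed to be major.minor).
LISTED = {"litellm": "1.73", "streamlit": "1.47"}


def major_minor(full_version: str) -> str:
    """Reduce a version string such as '1.73.6' to '1.73'."""
    return ".".join(full_version.split(".")[:2])


def compare_installed(listed: dict[str, str]) -> dict[str, str]:
    """Report, per package, whether the installed version matches the listed one."""
    report = {}
    for name, wanted in listed.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
            continue
        if major_minor(found) == wanted:
            report[name] = f"matches listed {wanted}"
        else:
            report[name] = f"found {found}, page lists {wanted}"
    return report


for package, status in compare_installed(LISTED).items():
    print(f"{package}: {status}")
```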

For agents

$ claude mcp add RD-Agent \
  -- python -m otcore.mcp_server <graph>
