github.com/microsoft/MarS @main sqlite

335 symbols 1,268 edges 55 files 260 documented · 78%

README

📄 Paper + 🏠️ Project Website

<img src="https://img.shields.io/badge/build-pass-green" alt="build">
<img src="https://img.shields.io/badge/license-MIT-blue" alt="MIT">
<img src="https://img.shields.io/badge/version-1.0.0-blue" alt="version">
<img src="https://img.shields.io/badge/python-3.11%20|%203.12-blue" alt="python">
<img src="https://img.shields.io/badge/platform-linux%20-lightgrey" alt="platform">
<img src="https://img.shields.io/badge/PRs-welcome-brightgreen" alt="PRs welcome">
<img src="https://img.shields.io/badge/docs-latest-brightgreen" alt="documentation">

📢 Announcements

Event	Description
📦 Code & Tools Release	We've released the core code and tools for our order agent, including several examples for downstream applications. The associated model will be made public following its final review. Please see the Release Overview section below for more details.
🎈 ICLR 2025 Acceptance	We are thrilled to announce that our paper has been accepted to ICLR 2025!
🌐 Join Our Community	Connect with us on 💬 WeChat Group and 👾 Discord to share your feedback and insights!
🌟 First Release	We are excited to announce our first release! Check out the repo and enjoy your journey.

📦 Release Overview

Welcome to our project! We are excited to release the foundational code and tools designed for market simulation and analysis.

Important Note: While our associated Hugging Face model is fully prepared, it is currently set to private awaiting final review approval. We appreciate your patience regarding its public availability.

In the meantime, you can gain significant value and understanding from the core functionalities through the following examples and code explorations:

Explore Key Examples:

📊 Stylized Facts Report: Evaluate 11 key market characteristics with our Stylized Fact Analysis.
📈 Predict and Simulate: Explore forecasting with our Simulation as Forecasting example.
💹 Market Impact Analysis: Study market impacts with our Market Impact Analysis.
✨ Interactive Exploration: Use our Interactive Demo to run these analyses easily.

Delve into the Underlying Architecture:

OrderModel: Understand how orders are generated.
OrderState: See how market states are represented and updated.
OrderAgent: Examine the agent responsible for order generation.

Please note that the full functionality of the examples and demo depends on the public release of our Hugging Face model, which will happen once the final review is complete. We apologize for this temporary limitation and appreciate your patience.

🖥️ Usage & Notes

Installation Options

Option 1: Using VS Code Dev Containers (Recommended)

We provide a fully configured development environment using VS Code Dev Containers:

git clone https://github.com/microsoft/MarS.git
cd MarS

Then, with VS Code and the Dev Containers extension installed:

1. Open the project folder in VS Code.
2. Important: Before reopening in the container, edit .devcontainer/devcontainer.json and change "source=/data/" to point to <your/data/path>, a directory that exists on your host machine.
3. When prompted, click "Reopen in Container", or open the command palette (F1) and select "Dev Containers: Reopen in Container".
4. The container will build with all dependencies and extensions configured.
5. Once inside the container, install the project dependencies:

pip install -e .[dev]

Option 2: Using Docker Directly

git clone https://github.com/microsoft/MarS.git
cd MarS
docker build -t mars-env -f .devcontainer/Dockerfile .
# Modify this path to match your data directory
docker run -it --cap-add=SYS_ADMIN --device=/dev/fuse --security-opt=apparmor:unconfined --shm-size=20gb --gpus=all --privileged -v <your/data/path>:/data -v $(pwd):/workspaces/MarS -w /workspaces/MarS mars-env
# Inside the container
pip install -e .[dev]

Important: We strongly recommend using Docker to run MarS. Direct installation without Docker is not supported due to specific system dependencies and CUDA requirements.

Download Model and Pre-requisites

We've simplified downloading all necessary components (model, converters, validation samples, and stylized facts data) using a single script:

python download.py

Important Note: Because the Hugging Face repository associated with our model is currently under review and not yet public, we have temporarily made the prerequisites (converters, validation samples, and stylized facts data, but not the model) available through OneDrive. Please download them from OneDrive and place them under the input_root_dir configured in market_simulation/conf.py instead of running the download.py script.

Note: The download requires sufficient disk space and may take some time depending on your internet connection.

Starting the Order Model Ray Server

MarS uses Ray Serve to deploy the order model as a scalable, production-ready service. To start the order model Ray server:

bash scripts/start-order-model.sh

Prerequisites:
- The Ray server must be running and accessible at the configured IP and port
- Sufficient computational resources are required to run the model

To explore all of our demos in a user-friendly interface:

streamlit run market_simulation/examples/demo/home_app.py

The demo applications are designed to provide a quick and visual understanding of each tool's capabilities. However, there are some important considerations:

Using Demos vs Scripts:
- If you want to quickly understand what these tools can do, run the Streamlit demos for an interactive experience.
- If you need to use these tools with your own data or in production, you'll need to modify the corresponding scripts (report_stylized_facts.py, forecast.py, market_impact.py) directly.

Direct Model Interaction

If you want to interact with the model directly after starting the server, you can use the ModelClient.

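from market_simulation.rollout.model_client import ModelClient
from market_simulation.conf import C

# Connect to the served order model using the settings from market_simulation/conf.py.
client = ModelClient(
    model_name=C.model_serving.model_name,
    ip=C.model_serving.ip,
    port=C.model_serving.port,
)

# your_input_data is a placeholder for the model input you have prepared.
predictions = client.get_prediction(your_input_data)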

🔧 Production Deployment Prerequisites

Real Order-Level Data: While our demos use noise agents to generate initial states, production-grade applications require complete order-level historical data to accurately simulate market behavior.
Sufficient Computational Resources: Our research simulations typically run 128 trajectories per state to generate robust signals. In our experiments, we utilized 128 GPUs running parallel simulations across different instruments and starting states.
Optimized Inference Pipeline: The current implementation prioritizes validating the model's scalability and its realistic, interactive, and controllable order generation. For production deployment, significant optimizations are necessary.
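
To illustrate why many trajectories per state help, here is a minimal sketch of how per-trajectory results could be averaged into a signal with a standard error. It assumes each trajectory is summarized by a single horizon return; `aggregate_signal` and `fake_trajectory_return` are hypothetical names, and the Gaussian draws stand in for real simulation output.

```python
import math
import random


def aggregate_signal(trajectory_returns):
    """Return (mean, standard_error) of per-trajectory simulated returns."""
    n = len(trajectory_returns)
    mean = sum(trajectory_returns) / n
    var = sum((r - mean) ** 2 for r in trajectory_returns) / (n - 1)
    return mean, math.sqrt(var / n)


def fake_trajectory_return(rng):
    """Hypothetical stand-in for the horizon return of one simulated trajectory."""
    return rng.gauss(0.001, 0.01)


rng = random.Random(42)
for n in (8, 32, 128):
    sample = [fake_trajectory_return(rng) for _ in range(n)]
    mean, se = aggregate_signal(sample)
    print(f"{n:>3} trajectories: mean={mean:+.5f}  standard error={se:.5f}")
```

The standard error of the mean shrinks roughly as 1/sqrt(N), which is one reason a larger trajectory count yields a steadier signal.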

⚡ Performance Optimization Strategies

Several strategies can substantially improve inference performance for production deployment:

Advanced Serving System: Replace the current Ray-based batch inference with more optimized systems like vLLM to achieve higher throughput and lower latency.
Efficient Model Architectures: While we currently use LLaMA for its reliability during testing, exploring more efficient architectures such as linear attention models (RetNet, RWKV), state space models (Mamba), Mixture of Experts (MoE), or Multi-head Latent Attention (MLA) could significantly improve performance.
Model Compression: Implement quantization, distillation, and pruning to reduce model size and computational requirements while maintaining accuracy.
KV-Cache Optimization: Our current implementation uses fixed-length sequences with sliding windows, which requires a KV-cache design tailored to that scheme.
Multi-Token Prediction: Generating multiple tokens per step, instead of generating orders one by one, could substantially reduce inference time.

The demos provide a user-friendly interface to experiment with different parameters and visualize results, while the scripts offer more flexibility for integration into your own workflows and data pipelines.

📊 Stylized Facts Report

The Stylized Facts Report evaluates 11 key market characteristics identified by Cont (2001) to assess the realism of market simulations. These characteristics, known as "stylized facts," are empirical patterns consistently observed across different financial markets, instruments, and time periods.

Usage:

To run the stylized facts analysis:

# Ensure you've run the download.py script first to get the required data
python market_simulation/examples/report_stylized_facts.py

Stylized Facts Table (× marks a fact that is observed; a blank cell marks a fact that is not observed):

Fact #	Fact Name	Historical	Simulated
1	Absence of autocorrelations	×	×
2	Heavy tails	×	×
3	Gain/loss asymmetry
4	Aggregational Gaussianity	×	×
5	Intermittency	×	×
6	Volatility clustering	×	×
7	Conditional heavy tails	×	×
8	Slow decay of autocorrelation in absolute returns	×	×
9	Leverage effect
10	Volume/volatility correlation	×	×
11	Asymmetry in timescales	×	×

Key Results:

9 out of 11 stylized facts are observed in both the historical and the simulated data
The two absent facts (Gain/loss asymmetry and Leverage effect) have also been reported as missing in modern US markets (Dow 30 stocks)
The simulated data shows patterns similar to the historical data across all 11 facts

Key Stylized Facts Examined:

Absence of autocorrelations: Linear autocorrelations of asset returns quickly decay after short time intervals
  $\text{corr}(r(t, \Delta t), r(t+\tau, \Delta t))$
Heavy tails: Return distributions display power-law or Pareto-like tails
  Measured through kurtosis of returns
Aggregational Gaussianity: Return distributions become more normal as the time scale increases
  Kurtosis of returns approaches Gaussian levels at longer time scales
Intermittency: High degree of variability in returns with irregular bursts
  Measured using Fano factor (variance-to-mean ratio) of extreme returns
Volatility clustering: Positive autocorrelation in volatility measures, showing high-volatility events tend to cluster
  $\text{corr}(|r(t, \Delta t)|, |r(t+\tau, \Delta t)|)$
Conditional heavy tails: Return distributions still exhibit heavy tails even after accounting for volatility clustering
  Kurtosis of normalized returns (divided by local volatility)
Slow decay of absolute return autocorrelation: Absolute returns' autocorrelation decays slowly as a power law
  Similar to volatility clustering, measured across different lag periods
Volume/volatility correlation: Trading volume is correlated with volatility measures
  $\text{corr}(v(t, \Delta t), |r(t, \Delta t)|)$
Asymmetry in timescales: Coarse-grained volatility predicts fine-scale volatility better than the reverse
  Correlation between coarse returns (absolute sum) and fine returns (sum of absolutes)
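
Several of the measures above reduce to a correlation or a kurtosis. The sketch below computes them on a synthetic return series; it is not the code in report_stylized_facts.py. The helper names (`pearson`, `autocorr`, `excess_kurtosis`, `aggregate`) and the toy stochastic-volatility generator are assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def autocorr(series, lag):
    """corr(series[t], series[t + lag]) for a positive integer lag."""
    return pearson(series[:-lag], series[lag:])


def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (0 for a Gaussian)."""
    n = len(x)
    m = sum(x) / n
    var = sum((a - m) ** 2 for a in x) / n
    return sum((a - m) ** 4 for a in x) / (n * var ** 2) - 3.0


def aggregate(returns, k):
    """Sum non-overlapping blocks of k returns (a coarser time scale)."""
    return [sum(returns[i:i + k]) for i in range(0, len(returns) - k + 1, k)]


# Synthetic returns with persistent volatility (illustrative only, not MarS output).
rng = random.Random(0)
returns, log_vol = [], 0.0
for _ in range(20000):
    log_vol = 0.9 * log_vol + 0.2 * rng.gauss(0.0, 1.0)  # slowly varying log-volatility
    returns.append(math.exp(log_vol) * rng.gauss(0.0, 0.01))

abs_returns = [abs(r) for r in returns]
ac_ret = autocorr(returns, 1)                        # absence of autocorrelations
ac_abs = autocorr(abs_returns, 1)                    # volatility clustering
kurt_fine = excess_kurtosis(returns)                 # heavy tails
kurt_coarse = excess_kurtosis(aggregate(returns, 50))  # aggregational Gaussianity

print("autocorr of returns, lag 1:   ", round(ac_ret, 3))
print("autocorr of |returns|, lag 1: ", round(ac_abs, 3))
print("excess kurtosis, fine scale:  ", round(kurt_fine, 2))
print("excess kurtosis, 50-step sums:", round(kurt_coarse, 2))
```

On a series like this one, the raw-return autocorrelation should sit near zero while the absolute-return autocorrelation is clearly positive. The volume/volatility measure would use the same `pearson` helper on a volume series paired with absolute returns.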

Our methodology rigorously tests these facts using 11,591 simulated trajectories for the 500 most liquid stocks in the Chinese market, comparing simulation outputs against historical data.
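
The Fano factor named under Intermittency can be sketched as follows. Two assumptions are made here that the text does not specify: an "extreme" return is one whose absolute value exceeds a fixed threshold, and extremes are counted in non-overlapping windows. `fano_factor` is a hypothetical helper, not a function from this repository.

```python
def fano_factor(returns, threshold, window):
    """Variance-to-mean ratio of extreme-return counts per non-overlapping window."""
    counts = [
        sum(1 for r in returns[i:i + window] if abs(r) > threshold)
        for i in range(0, len(returns) - window + 1, window)
    ]
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    return var / mean if mean > 0 else float("nan")


evenly_spread = [0.0, 0.0, 0.0, 5.0] * 4   # one extreme return in every window
clustered = [5.0] * 4 + [0.0] * 12         # all extremes packed into the first window

print(fano_factor(evenly_spread, threshold=1.0, window=4))  # 0.0
print(fano_factor(clustered, threshold=1.0, window=4))      # 3.0
```

Both series contain four extreme returns, but only the clustered one has a Fano factor above zero, which is the burstiness the intermittency fact describes.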

📈 Market Forecast

The Market Forecast tool demonstrates the predictive capabilities of the MarS model by simulating future market prices and trends through order-level simulation rather than direct price prediction.
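
As a rough illustration of prices emerging from simulated orders rather than being predicted directly, the sketch below feeds random limit orders into a toy order book and averages the final traded price over several trajectories. Everything in it is hypothetical: `simulate_price_path`, the random order generator, and the price-priority-only matching rule are stand-ins for illustration and are not MarS's order model or the mlib engine.

```python
import heapq
import random


def simulate_price_path(n_orders, seed, start_price=100.0):
    """Feed random limit orders into a toy book; return the last-traded-price path."""
    rng = random.Random(seed)
    bids, asks = [], []  # bids: max-heap via negated price; asks: min-heap
    last_price, path = start_price, []
    for _ in range(n_orders):
        side = rng.choice(("buy", "sell"))
        price = round(last_price + rng.gauss(0.0, 0.5), 2)
        volume = rng.randint(1, 10)
        if side == "buy":
            # Trade against the cheapest asks while the buy order is marketable.
            while volume > 0 and asks and asks[0][0] <= price:
                ask_price, ask_vol = heapq.heappop(asks)
                traded = min(volume, ask_vol)
                volume -= traded
                last_price = ask_price
                if ask_vol > traded:
                    heapq.heappush(asks, (ask_price, ask_vol - traded))
            if volume > 0:  # rest the unfilled remainder in the book
                heapq.heappush(bids, (-price, volume))
        else:
            # Trade against the highest bids while the sell order is marketable.
            while volume > 0 and bids and -bids[0][0] >= price:
                neg_bid, bid_vol = heapq.heappop(bids)
                traded = min(volume, bid_vol)
                volume -= traded
                last_price = -neg_bid
                if bid_vol > traded:
                    heapq.heappush(bids, (neg_bid, bid_vol - traded))
            if volume > 0:
                heapq.heappush(asks, (price, volume))
        path.append(last_price)
    return path


# Several trajectories from the same starting state, one per random seed.
finals = [simulate_price_path(500, seed)[-1] for seed in range(20)]
forecast = sum(finals) / len(finals)
print(f"mean final price over {len(finals)} trajectories: {forecast:.2f}")
```

No price is ever predicted here; each path is a by-product of which orders happen to cross. Replacing the random order generator with a learned one is the step this toy leaves out.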

Order-Level Trajectory Generation: A Paradigm Shift

Traditional forecasting approaches attempt to directly model price movements based on historical data. Our approach is fundamentally different:

Order-Level Simulation: Instead of predicting prices directly, MarS generates individual order events with their full properties (price, volume, direction)
Emergent Market Behavior: Prices and trends emerge naturally from the interactions of these simulated orders
Multiple Possible Futures: B

Core symbols most depended-on inside this repo

Symbol	Called by	Files
save_and_close_fig	13	market_simulation/examples/report_stylized_facts.py, mlib/core/time_utils.py
_get_limit_order	10	market_simulation/agents/trading_agent.py, mlib/core/limit_order.py
get_bin_index	9	market_simulation/utils/bin_converter.py
get_return_info	9	market_simulation/examples/report_stylized_facts.py

Shape

Method 190

Function 93

Class 52

Languages

Python 100%

Modules by API surface

mlib/core/engine.py: 26 symbols
market_simulation/states/order_state.py: 22 symbols
market_simulation/examples/report_stylized_facts.py: 21 symbols
mlib/core/orderbook.py: 19 symbols
mlib/core/base_agent.py: 19 symbols
mlib/core/limit_order.py: 17 symbols
mlib/core/exchange.py: 17 symbols
mlib/core/event.py: 16 symbols
market_simulation/agents/trading_agent.py: 16 symbols
market_simulation/examples/demo/market_impact_app.py: 12 symbols
mlib/core/level.py: 10 symbols
market_simulation/models/order_model.py: 10 symbols

For agents

$ claude mcp add MarS \
  -- python -m otcore.mcp_server <graph>
