MCPcopy
hub / github.com/zai-org/Open-AutoGLM

github.com/zai-org/Open-AutoGLM @main sqlite

repository ↗ · DeepWiki ↗
255 symbols 936 edges 40 files 247 documented · 97%
README

Open-AutoGLM

中文阅读.

👋 Join our<a href="https://github.com/zai-org/Open-AutoGLM/raw/main/resources/WECHAT.md" target="_blank"> Wechat</a> or <a href="https://discord.gg/HvT5BaPg3H" target="_blank">Discord</a> community.






👋 Follow AutoGLM Autotyper <a href="https://x.com/Autotyper_Agent?s=20" target="_blank">X</a> account

Quick Start

You can use Claude Code with GLM Coding Plan and enter the following prompt to quickly deploy this project:

Access the documentation and install AutoGLM for me
https://raw.githubusercontent.com/zai-org/Open-AutoGLM/refs/heads/main/README_en.md

Project Introduction

Phone Agent is a mobile intelligent assistant framework built on AutoGLM. It understands phone screen content in a multimodal manner and helps users complete tasks through automated operations. The system controls devices via ADB (Android Debug Bridge), perceives screens using vision-language models, and generates and executes operation workflows through intelligent planning. Users simply describe their needs in natural language, such as "Open eBay and search for wireless earphones." and Phone Agent will automatically parse the intent, understand the current interface, plan the next action, and complete the entire workflow. The system also includes a sensitive operation confirmation mechanism and supports manual takeover during login or verification code scenarios. Additionally, it provides remote ADB debugging capabilities, allowing device connection via WiFi or network for flexible remote control and development.

⚠️ This project is for research and learning purposes only. It is strictly prohibited to use for illegal information acquisition, system interference, or any illegal activities. Please carefully review the Terms of Use.

Integration with Other Automation Tools

Midscene.js

Midscene.js is an open-source, vision-model-driven UI automation SDK that supports JavaScript or YAML flow syntax for cross-platform automation.

Midscene.js already supports AutoGLM; see the Midscene.js integration guide to quickly try AutoGLM automation on both iOS and Android devices.

Model Download Links

Model Download Links
AutoGLM-Phone-9B 🤗 Hugging Face

🤖 ModelScope | | AutoGLM-Phone-9B-Multilingual | 🤗 Hugging Face

🤖 ModelScope |

AutoGLM-Phone-9B is optimized for Chinese mobile applications, while AutoGLM-Phone-9B-Multilingual supports English scenarios and is suitable for applications containing English or other language content.

Environment Setup

1. Python Environment

Python 3.10 or higher is recommended.

2. Device Debug Tools

Choose the appropriate tool based on your device type:

For Android Devices - Using ADB

  1. Download the official ADB installation package and extract it to a custom path
  2. Configure environment variables

  3. MacOS configuration: In Terminal or any command line tool

bash # Assuming the extracted directory is ~/Downloads/platform-tools. Adjust the command if different. export PATH=${PATH}:~/Downloads/platform-tools

For HarmonyOS Devices - Using HDC

  1. Download HDC tool:
  2. From HarmonyOS SDK
  3. Configure environment variables

  4. MacOS/Linux configuration:

bash # Assuming the extracted directory is ~/Downloads/harmonyos-sdk/toolchains. Adjust according to actual path. export PATH=${PATH}:~/Downloads/harmonyos-sdk/toolchains

  • Windows configuration: Add the HDC tool directory to the system PATH environment variable

3. Android 7.0+ or HarmonyOS Device with Developer Mode and USB Debugging Enabled

  1. Enable Developer Mode: The typical method is to find Settings > About Phone > Build Number and tap it rapidly about 10 times until a popup shows "Developer mode has been enabled." This may vary slightly between phones; search online for tutorials if you can't find it.
  2. Enable USB Debugging: After enabling Developer Mode, go to Settings > Developer Options > USB Debugging and enable it
  3. Some devices may require a restart after setting developer options for them to take effect. You can test by connecting your phone to your computer via USB cable and running adb devices to see if device information appears. If not, the connection has failed.

Please carefully check the relevant permissions

Permissions

4. Install ADB Keyboard (Required for Android Devices Only, for Text Input)

Note: HarmonyOS devices use native input methods and do not require ADB Keyboard.

If you are using an Android device:

Download the installation package and install it on the corresponding Android device. Note: After installation, you need to enable ADB Keyboard in Settings > Input Method or Settings > Keyboard List for it to work.(or use command adb shell ime enable com.android.adbkeyboard/.AdbIMEHow-to-use)

Deployment Preparation

1. Install Dependencies

pip install -r requirements.txt 
pip install -e .

2. Configure ADB or HDC

For Android Devices

Make sure your USB cable supports data transfer, not just charging.

Ensure ADB is installed and connect the device via USB cable:

# Check connected devices
adb devices

# Output should show your device, e.g.:
# List of devices attached
# emulator-5554   device

For HarmonyOS Devices

Make sure your USB cable supports data transfer, not just charging.

Ensure HDC is installed and connect the device via USB cable:

# Check connected devices
hdc list targets

# Output should show your device, e.g.:
# 7001005458323933328a01bce01c2500

3. Start Model Service

You can choose to deploy the model service yourself or use a third-party model service provider.

Option A: Use Third-Party Model Services

If you don't want to deploy the model yourself, you can use the following third-party services that have already deployed our model:

1. z.ai

  • Documentation: https://docs.z.ai/api-reference/introduction
  • --base-url: https://api.z.ai/api/paas/v4
  • --model: autoglm-phone-multilingual
  • --apikey: Apply for your own API key on the z.ai platform

2. Novita AI

  • Documentation: https://novita.ai/models/model-detail/zai-org-autoglm-phone-9b-multilingual
  • --base-url: https://api.novita.ai/openai
  • --model: zai-org/autoglm-phone-9b-multilingual
  • --apikey: Apply for your own API key on the Novita AI platform

3. Parasail

  • Documentation: https://www.saas.parasail.io/serverless?name=auto-glm-9b-multilingual
  • --base-url: https://api.parasail.io/v1
  • --model: parasail-auto-glm-9b-multilingual
  • --apikey: Apply for your own API key on the Parasail platform

Example usage with third-party services:

# Using z.ai
python main.py --base-url https://api.z.ai/api/paas/v4 --model "autoglm-phone-multilingual" --apikey "your-z-ai-api-key" "Open Chrome browser"

# Using Novita AI
python main.py --base-url https://api.novita.ai/openai --model "zai-org/autoglm-phone-9b-multilingual" --apikey "your-novita-api-key" "Open Chrome browser"

# Using Parasail
python main.py --base-url https://api.parasail.io/v1 --model "parasail-auto-glm-9b-multilingual" --apikey "your-parasail-api-key" "Open Chrome browser"

Option B: Deploy Model Yourself

If you prefer to deploy the model locally or on your own server:

  1. Download the model and install the inference engine framework according to the For Model Deployment section in requirements.txt.
  2. Start via SGlang / vLLM to get an OpenAI-format service. Here's a vLLM deployment solution; please strictly follow the startup parameters we provide:

  3. vLLM:

python3 -m vllm.entrypoints.openai.api_server \
 --served-model-name autoglm-phone-9b-multilingual \
 --allowed-local-media-path /   \
 --mm-encoder-tp-mode data \
 --mm_processor_cache_type shm \
 --mm_processor_kwargs "{\"max_pixels\":5000000}" \
 --max-model-len 25480  \
 --chat-template-content-format string \
 --limit-mm-per-prompt "{\"image\":10}" \
 --model zai-org/AutoGLM-Phone-9B-Multilingual \
 --port 8000
  • This model has the same architecture as GLM-4.1V-9B-Thinking. For detailed information about model deployment, you can also check GLM-V for model deployment and usage guides.

  • After successful startup, the model service will be accessible at http://localhost:8000/v1. If you deploy the model on a remote server, access it using that server's IP address.

4. Check Model Deployment

After starting the model service, you can use the following command to verify the deployment:

python scripts/check_deployment_en.py --base-url http://localhost:8000/v1 --model autoglm-phone-9b-multilingual

If using a third-party model service:

# Novita AI
python scripts/check_deployment_en.py --base-url https://api.novita.ai/openai --model zai-org/autoglm-phone-9b-multilingual --apikey your-novita-api-key

# Parasail
python scripts/check_deployment_en.py --base-url https://api.parasail.io/v1 --model parasail-auto-glm-9b-multilingual --apikey your-parasail-api-key

Upon successful execution, the script will display the model's inference result and token statistics, helping you confirm whether the model deployment is working correctly.

Using AutoGLM

Command Line

Set the --base-url and --model parameters according to your deployed model. For example:

# Android device - Interactive mode
python main.py --base-url http://localhost:8000/v1 --model "autoglm-phone-9b-multilingual"

# Android device - Specify task
python main.py --base-url http://localhost:8000/v1 "Open Maps and search for nearby coffee shops"

# HarmonyOS device - Interactive mode
python main.py --device-type hdc --base-url http://localhost:8000/v1 --model "autoglm-phone-9b-multilingual"

# HarmonyOS device - Specify task
python main.py --device-type hdc --base-url http://localhost:8000/v1 "Open Maps and search for nearby coffee shops"

# Use API key for authentication
python main.py --apikey sk-xxxxx

# Use English system prompt
python main.py --lang en --base-url http://localhost:8000/v1 "Open Chrome browser"

# List supported apps (Android)
python main.py --list-apps

# List supported apps (HarmonyOS)
python main.py --device-type hdc --list-apps

Python API

from phone_agent import PhoneAgent
from phone_agent.model import ModelConfig

# Configure model
model_config = ModelConfig(
    base_url="http://localhost:8000/v1",
    model_name="autoglm-phone-9b-multilingual",
)

# Create Agent
agent = PhoneAgent(model_config=model_config)

# Execute task
result = agent.run("Open eBay and search for wireless earphones")
print(result)

Remote Debugging

Phone Agent supports remote ADB/HDC debugging via WiFi/network, allowing device control without a USB connection.

Configure Remote Debugging

Enable Wireless Debugging on Phone

Android Devices

Ensure the phone and computer are on the same WiFi network, as shown below:

Enable Wireless Debugging

HarmonyOS Devices

Ensure the phone and computer are on the same WiFi network: 1. Go to Settings > System & Updates > Developer Options 2. Enable USB Debugging and Wireless Debugging 3. Note the displayed IP address and port number

Use Standard ADB/HDC Commands on Computer

# Android device - Connect via WiFi, replace with the IP address and port shown on your phone
adb connect 192.168.1.100:5555

# Verify connection
adb devices
# Should show: 192.168.1.100:5555    device

# HarmonyOS device - Connect via WiFi
hdc tconn 192.168.1.100:5555

# Verify connection
hdc list targets
# Should show: 192.168.1.100:5555

Device Management Commands

Android Devices (ADB)

# List all connected devices
adb devices

# Connect to remote device
adb connect 192.168.1.100:5555

# Disconnect specific device
adb disconnect 192.168.1.100:5555

# Execute task on specific device
python main.py --device-id 192.168.1.100:5555 --base-url http://localhost:8000/v1 --model "autoglm-phone-9b-multilingual" "Open TikTok and browse videos"

HarmonyOS Devices (HDC)

# List all connected devices
hdc list targets

# Connect to remote device
hdc tconn 192.168.1.100:5555

# Disconnect specific device
hdc tdisconn 192.168.1.100:5555

# Execute task on specific device
python main.py --device-type hdc --device-id 192.168.1.100:5555 --base-url http://localhost:8000/v1 --model "autoglm-phone-9b-multilingual" "Open TikTok and browse videos"

Python API Remote Connection

Android Devices (ADB)

```python from phone_agent.adb import ADBConnection, list_devices

Create connection manager

conn = ADBC

Core symbols most depended-on inside this repo

run
called by 45
phone_agent/agent.py
_run_hdc_command
called by 29
phone_agent/hdc/connection.py
get_device_factory
called by 12
phone_agent/device_factory.py
get_messages
called by 12
phone_agent/config/i18n.py
_get_hdc_prefix
called by 8
phone_agent/hdc/device.py
_get_adb_prefix
called by 8
phone_agent/adb/device.py
_get_wda_session_url
called by 7
phone_agent/xctest/device.py
_get_wda_session_url
called by 6
phone_agent/xctest/input.py

Shape

Method 113
Function 110
Class 32

Languages

Python100%

Modules by API surface

phone_agent/actions/handler.py25 symbols
phone_agent/device_factory.py21 symbols
phone_agent/actions/handler_ios.py21 symbols
phone_agent/xctest/connection.py16 symbols
phone_agent/hdc/connection.py16 symbols
phone_agent/adb/connection.py14 symbols
phone_agent/model/client.py12 symbols
phone_agent/xctest/device.py11 symbols
phone_agent/agent_ios.py11 symbols
phone_agent/agent.py11 symbols
phone_agent/xctest/input.py10 symbols
phone_agent/config/timing.py10 symbols

Dependencies from manifests, versioned

Pillow12.0.0 · 1×
openai2.9.0 · 1×
requests2.31.0 · 1×

For agents

$ claude mcp add Open-AutoGLM \
  -- python -m otcore.mcp_server <graph>

⬇ download graph artifact