github.com/bytedance/UI-TARS-desktop @mcp-http-server@1.1.5 sqlite

1,832 symbols 5,076 edges 569 files 204 documented · 11%
README

[!IMPORTANT]

[2025-03-18] We released a technical preview version of a new desktop app - Agent TARS, a multimodal AI agent that leverages browser operations by visually interpreting web pages and seamlessly integrating with command lines and file systems.

UI-TARS Desktop

UI-TARS Desktop is a GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.

  • 📑 Paper: https://arxiv.org/abs/2501.12326
  • 🤗 Hugging Face Models: https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B
  • 🫨 Discord: https://discord.gg/pTXwYVjfcs
  • 🤖 ModelScope: https://www.modelscope.cn/collections/UI-TARS-bccb56fa1ef640

🖥️ Desktop Application | 👓 Midscene (use in browser) | Ask DeepWiki.com

Showcases

Example instructions (from the demo videos):

  • "Please help me open the autosave feature of VS Code and delay AutoSave operations for 500 milliseconds in the VS Code setting."
  • "Could you help me check the latest open issue of the UI-TARS-Desktop project on GitHub?"

News

  • [2025-04-17] - 🎉 We're thrilled to announce the release of the new UI-TARS Desktop application v0.1.0, featuring a redesigned Agent UI. The application improves the computer-use experience, introduces new browser operation features, and supports the advanced UI-TARS-1.5 model for improved performance and precise control.
  • [2025-02-20] - 📦 Introduced the UI-TARS SDK, a powerful cross-platform toolkit for building GUI automation agents.
  • [2025-01-23] - 🚀 We updated the Cloud Deployment section of the Chinese-language GUI Model Deployment Tutorial (中文版: GUI模型部署教程) with new information about the ModelScope platform. You can now use the ModelScope platform for deployment.

Features

  • 🤖 Natural language control powered by Vision-Language Model
  • 🖥️ Screenshot and visual recognition support
  • 🎯 Precise mouse and keyboard control
  • 💻 Cross-platform support (Windows/macOS/Browser)
  • 🔄 Real-time feedback and status display
  • 🔐 Private and secure - fully local processing

Quick Start

See Quick Start.

Deployment

See Deployment.

Contributing

See CONTRIBUTING.md.

SDK (Experimental)

See @ui-tars/sdk

License

UI-TARS Desktop is licensed under the Apache License 2.0.

Citation

If you find our paper and code useful in your research, please consider giving the project a star ⭐ and citing the paper 📝:

@article{qin2025ui,
  title={UI-TARS: Pioneering Automated GUI Interaction with Native Agents},
  author={Qin, Yujia and Ye, Yining and Fang, Junjie and Wang, Haoming and Liang, Shihao and Tian, Shizuo and Zhang, Junda and Li, Jiahao and Li, Yunxin and Huang, Shijue and others},
  journal={arXiv preprint arXiv:2501.12326},
  year={2025}
}

Extension points: exported contracts (how you extend this code)

SearchEngineAdapter (Interface)
(no doc) [8 implementers]
packages/agent-infra/search/browser-search/src/types.ts
LLMProvider (Interface)
(no doc) [1 implementer]
apps/agent-tars/src/main/llmProvider/interfaces/LLMProvider.ts
BrowserOperatorOptions (Interface)
(no doc)
packages/ui-tars/operators/browser-operator/src/types.ts
ThoughtStepCardProps (Interface)
(no doc)
apps/ui-tars/src/renderer/src/components/ThoughtChain/index.tsx
YamlData (Interface)
(no doc)
scripts/merge-yml/merge-yml.ts
AnimatedButtonProps (Interface)
(no doc)
examples/operator-browserbase/app/components/AnimatedButton.tsx
Logger (Interface)
(no doc) [2 implementers]
packages/agent-infra/logger/src/types.ts
Props (Interface)
(no doc)
apps/agent-tars/src/renderer/src/components/ErrorBoundary.tsx

Core symbols: most depended-on inside this repo

error
called by 250
packages/agent-infra/logger/src/types.ts
info
called by 243
packages/agent-infra/logger/src/types.ts
log
called by 172
packages/agent-infra/logger/src/types.ts
cn
called by 107
apps/ui-tars/src/renderer/src/utils/index.ts
fn
called by 64
packages/ui-tars/sdk/tests/fixtures/async-hooks-test/file1.ts
close
called by 54
packages/agent-infra/browser/src/types.ts
emitEvent
called by 46
packages/agent-infra/browser-use/src/agent/types.ts
warn
called by 45
packages/agent-infra/logger/src/types.ts
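
The four most-called symbols from packages/agent-infra/logger/src/types.ts (error, info, log, warn) suggest what an implementer of the Logger interface listed under Extension points has to provide. The sketch below is only an illustration: the method names come from the list above, but the signatures, the local `Logger` declaration, and the `MemoryLogger` class are assumptions, not the repository's actual definitions.

```typescript
// Hypothetical stand-in for the Logger interface. Method names are taken from
// the "Core symbols" list; the variadic signatures are an assumption.
interface Logger {
  log(...args: unknown[]): void;
  info(...args: unknown[]): void;
  warn(...args: unknown[]): void;
  error(...args: unknown[]): void;
}

type Level = 'log' | 'info' | 'warn' | 'error';

// Illustrative implementer (not from the repo): buffers formatted lines in
// memory so they can be inspected, e.g. in a unit test.
class MemoryLogger implements Logger {
  readonly lines: string[] = [];
  private readonly prefix: string;

  constructor(prefix: string) {
    this.prefix = prefix;
  }

  // Strings pass through unchanged; everything else is JSON-encoded.
  private write(level: Level, args: unknown[]): void {
    const text = args
      .map((a) => (typeof a === 'string' ? a : JSON.stringify(a)))
      .join(' ');
    this.lines.push(`[${this.prefix}] ${level.toUpperCase()} ${text}`);
  }

  log(...args: unknown[]): void {
    this.write('log', args);
  }
  info(...args: unknown[]): void {
    this.write('info', args);
  }
  warn(...args: unknown[]): void {
    this.write('warn', args);
  }
  error(...args: unknown[]): void {
    this.write('error', args);
  }
}

const logger = new MemoryLogger('demo');
logger.info('browser launched', { headless: true });
logger.error('navigation failed');
```

Code that accepts a `Logger` can be handed any such implementer, which is presumably why the index reports more than one implementer for the interface.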

Shape

Function 849
Method 544
Interface 216
Class 191
Enum 32
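
These counts add up to the 1,832 symbols reported at the top of the page, and 204 documented symbols out of 1,832 is roughly 11%. A quick check of that arithmetic, with variable names chosen purely for illustration:

```typescript
// Symbol counts by kind, copied from the "Shape" section.
const shape: Record<string, number> = {
  Function: 849,
  Method: 544,
  Interface: 216,
  Class: 191,
  Enum: 32,
};

// Sum of all kinds; should match the symbol total in the page header.
const totalSymbols = Object.values(shape).reduce((sum, n) => sum + n, 0);

// Share of documented symbols (204, from the header), rounded to a whole percent.
const documentedPct = Math.round((204 / totalSymbols) * 100);

console.log(totalSymbols, documentedPct);
```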

Languages

TypeScript 100%

Modules by API surface

packages/agent-infra/browser-use/src/browser/page.ts · 44 symbols
apps/agent-tars/src/renderer/src/agent/EventManager.ts · 30 symbols
packages/ui-tars/operators/browser-operator/src/browser-operator.ts · 27 symbols
packages/ui-tars/visualizer/src/component/player.tsx · 25 symbols
packages/agent-infra/logger/src/types.ts · 24 symbols
apps/ui-tars/src/renderer/src/components/ui/sidebar.tsx · 23 symbols
packages/agent-infra/mcp-client/src/index.ts · 22 symbols
packages/agent-infra/browser-use/src/agent/messages/service.ts · 21 symbols
apps/agent-tars/src/main/utils/logger.ts · 21 symbols
packages/agent-infra/browser-use/src/dom/views.ts · 20 symbols
packages/ui-tars/visualizer/src/component/timeline.tsx · 19 symbols
packages/agent-infra/browser-use/src/browser/context.ts · 19 symbols

Dependencies: from manifests, versioned

@agent-infra/bing-search · workspace:* · 1×
@agent-infra/browser · workspace:* · 1×
@agent-infra/browser-search · workspace:* · 1×
@agent-infra/browser-use · workspace:* · 1×
@agent-infra/duckduckgo-search · workspace:* · 1×
@agent-infra/logger · workspace:* · 1×
@agent-infra/mcp-client · workspace:* · 1×
@agent-infra/mcp-server-browser · workspace:* · 1×
@agent-infra/mcp-server-commands · workspace:* · 1×
@agent-infra/mcp-server-filesystem · workspace:* · 1×
@agent-infra/mcp-shared · workspace:* · 1×
@agent-infra/search · workspace:* · 1×

For agents

$ claude mcp add UI-TARS-desktop \
  -- python -m otcore.mcp_server <graph>
