class PhoneAgent:
    """
    AI-powered agent for automating Android phone interactions.

    The agent uses a vision-language model to understand screen content
    and decide on actions to complete user tasks.

    Args:
        model_config: Configuration for the AI model.
        agent_config: Configuration for the agent behavior.
        confirmation_callback: Optional callback for sensitive action confirmation.
        takeover_callback: Optional callback for takeover requests.

    Example:
        >>> from phone_agent import PhoneAgent
        >>> from phone_agent.model import ModelConfig
        >>>
        >>> model_config = ModelConfig(base_url="http://localhost:8000/v1")
        >>> agent = PhoneAgent(model_config)
        >>> agent.run("Open WeChat and send a message to John")
    """

    def __init__(
        self,
        model_config: ModelConfig | None = None,
        agent_config: AgentConfig | None = None,
        confirmation_callback: Callable[[str], bool] | None = None,
        takeover_callback: Callable[[str], None] | None = None,
    ):
        self.model_config = model_config or ModelConfig()
        self.agent_config = agent_config or AgentConfig()

        self.model_client = ModelClient(self.model_config)
        self.action_handler = ActionHandler(
            device_id=self.agent_config.device_id,
            confirmation_callback=confirmation_callback,
            takeover_callback=takeover_callback,
        )

        self._context: list[dict[str, Any]] = []
        self._step_count = 0

    def run(self, task: str) -> str:
        """
        Run the agent to complete a task.

        Args:
            task: Natural language description of the task.

        Returns:
            Final message from the agent.
        """
        self._context = []
        self._step_count = 0

        # First step with user prompt
        result = self._execute_step(task, is_first=True)
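The constructor accepts two optional callbacks whose signatures are `Callable[[str], bool]` and `Callable[[str], None]`. The listing shows only those types, so the following is a minimal sketch of what such callbacks might look like, not the project's own implementation. The factory names `make_confirmation_callback` and `make_takeover_callback` are hypothetical, and the sketch assumes that the string argument is a human-readable description of the pending action and that returning `True` means "proceed".

```python
from typing import Callable


def make_confirmation_callback(allowed_keywords: list[str]) -> Callable[[str], bool]:
    """Hypothetical helper: approve a sensitive action only if its
    description contains one of the allowed keywords (assumed semantics)."""

    def confirm(message: str) -> bool:
        lowered = message.lower()
        return any(keyword.lower() in lowered for keyword in allowed_keywords)

    return confirm


def make_takeover_callback(log: list[str]) -> Callable[[str], None]:
    """Hypothetical helper: record takeover requests in a list so a
    human operator (or a test) can inspect them later."""

    def takeover(message: str) -> None:
        log.append(message)

    return takeover


confirm = make_confirmation_callback(["send message"])
takeover_log: list[str] = []
takeover = make_takeover_callback(takeover_log)

# With the package from the listing installed, the callbacks would be
# passed by keyword, matching the __init__ signature shown above:
#   agent = PhoneAgent(
#       model_config,
#       confirmation_callback=confirm,
#       takeover_callback=takeover,
#   )
```

Building the callbacks from small factories keeps the approval policy and the takeover log outside the agent, so they can be exercised without a device or a model server.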