
Android Mobile
Control Android devices remotely through AI agents with full UI interaction capabilities including touch, swipe, typing, and app management.
What it does
- Click and swipe on Android screens
- Type text into input fields
- Take screenshots of the device screen
- Launch and list installed apps
- Extract UI elements as structured JSON
- Press system buttons (back, home, recent)
Tools (9)
Initialize the Android device connection. Must be called before using any other mobile tools.
Get the UI elements on the Android screen as hierarchical JSON. Each element contains its child elements, so parent-child relationships are preserved. Only focusable elements and elements with a text, content_desc, or hint attribute are included.
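Because the result is nested, a caller usually has to walk the tree to find the element it wants. The sketch below shows one way to do that in Python. The exact JSON schema is not given here, so the `children` key and the sample data are assumptions made for illustration; only the `text`, `content_desc`, and `hint` attribute names come from the description above.

```python
from typing import Any, Dict, Iterator, List

Element = Dict[str, Any]


def iter_elements(node: Element) -> Iterator[Element]:
    """Yield a node and all of its descendants, depth-first."""
    yield node
    # "children" is an assumed key name for the nested child elements.
    for child in node.get("children", []):
        yield from iter_elements(child)


def find_by_text(root: Element, needle: str) -> List[Element]:
    """Return elements whose text, content_desc, or hint contains needle."""
    needle = needle.lower()
    matches = []
    for el in iter_elements(root):
        labels = (el.get("text"), el.get("content_desc"), el.get("hint"))
        # Skip missing or empty labels, then do a case-insensitive match.
        if any(needle in label.lower() for label in labels if label):
            matches.append(el)
    return matches


# Hypothetical sample shaped like the hierarchy described above.
sample = {
    "text": "",
    "children": [
        {"text": "Settings", "children": []},
        {
            "hint": "Search settings",
            "children": [{"content_desc": "Clear search"}],
        },
    ],
}

hits = find_by_text(sample, "search")
print([h.get("hint") or h.get("content_desc") for h in hits])
```

If the real output uses a different key for child elements, only the `node.get("children", [])` line needs to change.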
Click on a specific coordinate on the Android screen.
Args:
- x: X coordinate to click
- y: Y coordinate to click
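A common way to pick the click coordinate is to take the center of a UI element's bounding box. The following is a minimal sketch of that arithmetic. The `bounds` layout with `left`, `top`, `right`, and `bottom` keys is hypothetical, since the element schema is not specified here; only the `x` and `y` argument names come from the tool description.

```python
from typing import Dict, Tuple


def center_of(bounds: Dict[str, int]) -> Tuple[int, int]:
    """Return the integer center point of a bounding box."""
    x = (bounds["left"] + bounds["right"]) // 2
    y = (bounds["top"] + bounds["bottom"]) // 2
    return x, y


def click_args(bounds: Dict[str, int]) -> Dict[str, int]:
    """Build the x/y arguments for a click at the center of bounds."""
    x, y = center_of(bounds)
    return {"x": x, "y": y}


# Hypothetical bounding box for a button, in screen pixels.
button_bounds = {"left": 100, "top": 400, "right": 300, "bottom": 480}
print(click_args(button_bounds))  # {'x': 200, 'y': 440}
```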
Input text into the currently focused text field on Android.
Args:
- text: The text to input
- submit: Whether to submit text (press Enter key) after typing
Press a physical or virtual button on the Android device.
Args:
- button: Button name (BACK, HOME, RECENT, ENTER)
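Since only four button names are listed, it can help to validate the name on the caller's side before sending a request. The helper below is an illustrative sketch, not part of the server. `button_args` is a hypothetical name, and whether the server itself accepts lowercase names is unknown, so the helper normalizes input to the documented uppercase form.

```python
from typing import Dict

# Button names taken from the tool description above.
ALLOWED_BUTTONS = ("BACK", "HOME", "RECENT", "ENTER")


def button_args(name: str) -> Dict[str, str]:
    """Normalize a button name and build the argument payload."""
    button = name.strip().upper()
    if button not in ALLOWED_BUTTONS:
        allowed = ", ".join(ALLOWED_BUTTONS)
        raise ValueError(f"Unknown button {name!r}; expected one of: {allowed}")
    return {"button": button}


print(button_args("home"))  # {'button': 'HOME'}
```

Rejecting unknown names early gives a clearer error than whatever the device connection might report for an unsupported button.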