Skip to content

k-l-lambda/anthroid

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

285 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Anthroid

Anthroid logo

Anthroid is a native Android app for mobile agentic workflows: a Claude Code-style agent that can work with your phone’s native capabilities (camera, voice, clipboard, notifications, app/URL launching, location/calendar, screenshots, and optional UI automation).

Demos

Cross-App Automation

Launch other apps and automate multi-step tasks. Here the agent opens a shopping app, searches for products, and navigates the interface.

Cross-app automation demo

Remote Server Monitoring

Use voice commands to check server status. The agent connects via SSH, runs diagnostics, and reports back.

Server monitoring demo

Document Analysis & Local File Management

Take a photo or pick from gallery, then ask the agent to analyze. Here it reads insurance documents and creates a comparison report. The agent can also organize, rename, and move files on your device — tasks that are typically cumbersome on mobile due to limited input methods.

Document analysis demo

When it helps

  • On-call / incident triage away from your desk: run quick diagnostics in a terminal, keep a chat thread with context, and resume the conversation later.
  • Field work & device setup: scan QR codes to avoid retyping tokens/URLs, open links on-device, and copy/paste between apps.
  • Hands-busy workflows: dictate messages with offline speech recognition so you can keep moving even with poor network.
  • "What am I looking at?" troubleshooting: attach photos/screenshots for visual context, then follow a short step-by-step checklist.
  • Phone-use automation: copy key info to clipboard, open deep links, launch apps, set reminders via notifications, and combine device context (location/calendar) with instructions.

Overview

Anthroid (Android + Anthropic) is a mobile implementation of Claude Code, designed around mobile-native input methods and device capabilities.

Architecture Comparison

Unlike typical mobile AI apps that merely serve as a frontend to remote agent systems, Anthroid runs the full agent logic locally on your device:

Typical Mobile AI Apps Anthroid
flowchart TB
    subgraph phone["📱 Mobile Device"]
        ui["UI Only"]
    end
    subgraph server["☁️ Remote Server"]
        agent["Agent Logic"]
        tools["Tool Executor"]
    end
    subgraph cloud["☁️ LLM Provider"]
        api["LLM API"]
    end
    ui <-->|"Every action"| agent
    agent --> tools
    agent <--> api
Loading

❌ Tools run on remote server
❌ Limited device access
❌ Network latency for every action

flowchart TB
    subgraph phone["📱 Mobile Device"]
        subgraph runtime["Agent Runtime"]
            tools["Tool Executor<br/>Bash / Files / Camera / Voice"]
        end
    end
    subgraph cloud["☁️ Cloud"]
        api["LLM API<br/>(Claude / OpenAI compatible)"]
    end
    runtime <-->|"Inference requests"| api
Loading

✅ Tools execute locally
✅ Full device access
✅ Only LLM inference calls remote API
✅ Supports any OpenAI-compatible endpoint

This means Anthroid can directly access your camera, files, clipboard, and other device capabilities without round-tripping through a remote server.

Design goals

  • Device tools: access to location, calendar, clipboard, notifications, URL/app launching, and more.
  • Mobile input: voice dictation (offline sherpa-onnx) and camera capture/QR scanning.
  • Agent-style automation: optional overlay shows what the agent is doing when it operates outside the app.
  • Terminal at hand: a full bash environment for advanced workflows.

Features

Chat Interface

  • Native Android chat UI with streaming responses
  • Markdown rendering - Tables, bold, italic, code blocks, links
  • Conversation history - Resume past conversations
  • Light blue user bubbles, gray assistant bubbles

Voice Input

  • Offline speech recognition using sherpa-onnx
  • Supports Chinese, English, Japanese, Korean, Cantonese
  • Press-and-hold microphone button to speak
  • Real-time transcription display

Camera & Vision

  • Take photos to add visual context to messages
  • Gallery picker for existing images
  • Multiple images per message
  • QR code scanning with instant text insertion and clipboard copy

AI Agent Tools

Claude can execute tools on your device:

Tool Description
Bash Run terminal commands
Read/Write/Edit File operations
Glob/Grep Search files and content
Notification Show Android notifications
Clipboard Read/write clipboard
Open URL Launch browser
Launch App Open installed apps
Location Get GPS coordinates
Calendar Query calendar events
Screenshot Capture device screen
Screen Tap/Swipe UI automation

Screen Automation Overlay

When Claude launches other apps or performs actions outside Anthroid:

  • Floating banner appears at the top of screen showing agent status
  • Streaming text displays what Claude is currently doing
  • Stop button to cancel the operation at any time
  • Auto-hides after task completion (tap to return to Anthroid)
  • Requires overlay permission (Draw over other apps)

OpenClaw Agent Runtime (v1.0)

Local OpenClaw agent with full tool-use, model selection, and context management:

  • pi-embedded-runner bundled in APK — no external agent server needed
  • 60+ Android tools via MCP bridge (accessibility, camera, clipboard, notifications, etc.)
  • File-based memoryMEMORY.md + memory/*.md injected into system prompt, persists across sessions
  • Any LLM provider — Anthropic, OpenAI-compatible APIs, or proxy endpoints (e.g. PPIO)

Gateway Integration (Optional)

Connect to an OpenClaw gateway server for distributed agent features:

  • Memory sync — automatic pull/push of memory/ files on session start/end (git-based incremental)
  • Profile sync/sync-profile command pulls agent identity, skills, and config files
  • Remote agent view — view and interact with OpenClaw sessions and SSH+tmux sessions on remote servers
  • Pending message delivery — 60s polling for timed messages from gateway agents
  • System notifications — IM-like notifications for gateway messages when app is backgrounded
  • Gateway Settings UI — configure host/port/token in Settings, or use /set-gateway command / QR code

Quick Send Candidates

  • Frequently used short messages appear as chips above the input bar
  • Tap to send immediately, long-press to insert into input for editing
  • Automatically tracked by frequency (threshold: 5 uses)

Terminal Environment

Built-in Linux terminal for advanced users:

  • Full bash shell environment
  • Package manager (apt/pkg)
  • Node.js, Python, and more available

Installation

Download APK

Get the latest release from GitHub Releases.

Build from Source

# Clone repository
git clone https://github.com/k-l-lambda/anthroid.git
cd anthroid

# Build debug APK
./gradlew assembleDebug

# Install on device
adb install app/build/outputs/apk/debug/anthroid-app_apt-android-7-debug_arm64-v8a.apk

Requirements

  • Android 7.0+ (API 24)
  • ARM64 device recommended
  • ~200MB storage (+ optional 239MB for voice model)

Setup

API Configuration

  1. Get your Claude API key from Anthropic Console
  2. In Anthroid, go to Settings > API Configuration
  3. Enter your API key and base URL

Or use QR code for quick setup:

  1. Generate QR code with API credentials
  2. Open Camera > QR scan mode
  3. Scan the QR code

Voice Input Setup

  1. Go to Settings > Components
  2. Download ASR Model (239MB, one-time)
  3. Wait for model initialization
  4. Microphone button appears in chat

Gateway Setup (Optional)

Connect to an OpenClaw gateway for memory sync and remote agent features:

  1. Go to Settings > Gateway Connection
  2. Enter host, port, and token
  3. Enable the toggle

Or use the chat command: /set-gateway <host:port> [token]

Or scan a QR code generated from tools/qr-generator.html.

Architecture

┌──────────────────────────────────────────────────┐
│  Anthroid App (Android)                          │
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │  Kotlin Layer                              │  │
│  │  ├─ Chat UI (Markdown, voice, camera)      │  │
│  │  ├─ ClaudeViewModel (CLI/API/OpenClaw)     │  │
│  │  ├─ MCP Server (NanoHTTPD :8765)           │  │
│  │  ├─ Gateway Manager (WebSocket + sync)     │  │
│  │  └─ AndroidTools (60+ device tools)        │  │
│  └──────────────┬─────────────────────────────┘  │
│                 │ HTTP localhost:8765              │
│  ┌──────────────┴─────────────────────────────┐  │
│  │  Node.js Layer (OpenClaw Agent)            │  │
│  │  ├─ pi-embedded-runner (agent runtime)     │  │
│  │  ├─ android-tools-bridge.mjs (MCP bridge)  │  │
│  │  └─ workspace/ (memory, skills, data)      │  │
│  └────────────────────────────────────────────┘  │
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │  Terminal (Termux fork)                    │  │
│  │  bash, Node.js, Python, apt/pkg            │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────┘
         │                           │
         ▼                           ▼
   ☁️ LLM API              ☁️ OpenClaw Gateway
   (Claude/OpenAI)          (optional, WebSocket)

Key Technologies

  • Kotlin - Primary language
  • OpenClaw pi-embedded-runner - Local agent runtime with tool-use
  • CameraX - Camera capture
  • ML Kit - QR code scanning
  • sherpa-onnx - Offline speech recognition
  • Markwon - Markdown rendering
  • OkHttp - WebSocket for gateway connection
  • BouncyCastle - Ed25519 device authentication

License

GPLv3 - Same license as Termux. See LICENSE.md.

Credits

About

An agentic android app

Resources

License

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors