Hero service managing Claude Code-style autonomous agents with web UI and tracking.

hero_orchestrator

Hero service that manages Claude Code-style autonomous agents. Submit a prompt + working directory + model + thinking effort via the admin UI or RPC; the service spawns the claude CLI as a subprocess, parses its stream-json output, and tracks agents through running → awaiting_review → completed.
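The lifecycle above can be pictured as a small state machine. The following is a minimal sketch, not the service's actual model: the type name `AgentStatus`, the method names, and the transition rules are assumptions for illustration. Only the status strings (and `failed`, from the crash-recovery note under Implemented) come from this README.

```rust
/// Agent lifecycle states named in this README.
/// The type and its methods are hypothetical, not the service's real model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentStatus {
    Running,
    AwaitingReview,
    Completed,
    Failed,
}

impl AgentStatus {
    /// Status string as it appears in the dashboard buckets.
    pub fn as_str(self) -> &'static str {
        match self {
            AgentStatus::Running => "running",
            AgentStatus::AwaitingReview => "awaiting_review",
            AgentStatus::Completed => "completed",
            AgentStatus::Failed => "failed",
        }
    }

    /// Assumed transition rules: forward moves only, and only a running
    /// agent can fail. A real service may allow more (for example a
    /// follow-up turn moving awaiting_review back to running).
    pub fn can_transition_to(self, next: AgentStatus) -> bool {
        matches!(
            (self, next),
            (AgentStatus::Running, AgentStatus::AwaitingReview)
                | (AgentStatus::AwaitingReview, AgentStatus::Completed)
                | (AgentStatus::Running, AgentStatus::Failed)
        )
    }
}
```

Encoding the transitions in one place keeps an accidental jump such as completed → running from slipping in through a stray update.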

Service management

Lifecycle is owned by lab, which reads the per-binary service.toml manifests and registers one hero_proc service per binary (_server, _admin) — exactly like every other Hero service. The hero_orchestrator CLI does not register or start anything itself.

lab build --install --start   # build, install, and start the services
hero_orchestrator doctor      # diagnostics (claude binary, login, db, sockets)
hero_orchestrator login       # shell out to `claude login`

Do not run hero_orchestrator --start; the flag no longer exists. The old composite registration created a hero_orchestrator service that bound the same sockets as the lab-managed services, causing a split-brain (duplicate processes, 403 unknown proxy token errors, vanishing agents).

The admin UI is reachable through hero_router once started. It binds a Unix socket at ~/hero/var/sockets/hero_orchestrator/admin.sock.

Database: ~/hero/var/hero_orchestrator/db.sqlite (created on first start).

Agent widget embed

Embed the navbar agent composer with an explicit home MCP service:

<hero-agent-widget home-mcp="hero_books"></hero-agent-widget>

home-mcp is required, must be the canonical service name, and must match hero_[a-z0-9_]+. The widget does not infer service identity from URLs and does not default to hero_orchestrator. Widget state is scoped to the current browser tab and configured home-mcp via sessionStorage.
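The naming rule can be checked without a regex engine. The widget itself is JavaScript, so this Rust sketch only illustrates the rule, for example for a server-side check. The function name `is_valid_home_mcp` is hypothetical, and it assumes the pattern must match the whole string.

```rust
/// Returns true when `name` matches `hero_[a-z0-9_]+` in full.
/// Hypothetical helper; it assumes the pattern is anchored at both ends.
pub fn is_valid_home_mcp(name: &str) -> bool {
    match name.strip_prefix("hero_") {
        // At least one character must follow the prefix, and every
        // character must be a lowercase ASCII letter, digit, or underscore.
        Some(rest) => {
            !rest.is_empty()
                && rest
                    .bytes()
                    .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'_')
        }
        None => false,
    }
}
```

Under this check `hero_books` passes, while `hero_`, `Hero_books`, and anything containing a slash or other URL fragment are rejected.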

Voice mode (spoken conversation)

The widget has a conversation control — a speech-bubble glyph (deliberately not a microphone, to set it apart from the hero_voice mic bar) — in the composer, the working header, and the follow-up row. Tapping it opens a half-duplex spoken conversation: you talk, the mic stops at a natural pause, the transcript fires a turn (voice: true), and the agent's reply is read back sentence-by-sentence. Tap again to interrupt playback or stop the loop. On reload the loop always starts idle (no silent hot mic).

Runtime dependency. Voice mode requires hero_voice to be running and routed through hero_router on the host origin. The widget reaches three /hero_voice/... mounts:

  • /hero_voice/admin/voice-widget/components.js — the streaming recorder + TTS helpers, loaded lazily on first activation (no host-template edit needed).
  • the transcribe WebSocket — streaming speech-to-text with server-side endpointing.
  • voiceservice.synthesize_speech over /hero_voice/rpc/rpc — text-to-speech.

Loading is automatic and idempotent: if the hero_voice mic bar (<hero-voice-bar>) is already on the page, the widget reuses the already-loaded components.js instead of injecting a second copy; if it isn't, the widget injects it once (shared across multiple widget instances). When hero_voice is absent or unrouted, the conversation control degrades gracefully — it disables with an explanatory tooltip and text chat is unaffected.

See the hero_voice_widget skill for wiring the mic bar itself; both widgets share the same components.js bundle and coexist without double-loading.

Requirements

  • Linux x86_64
  • claude CLI on PATH — install with npm i -g @anthropic-ai/claude-code, then claude login once.

Crates

| Crate | Role |
| --- | --- |
| hero_orchestrator | Operator CLI binary (login / doctor) |
| hero_orchestrator_server | RPC daemon hero_orchestrator_server (binds rpc.sock) |
| hero_orchestrator_lib | Models, SQLite store, supervisor, claude CLI driver |
| hero_orchestrator_sdk | Generated OpenRPC client from openrpc.json |
| hero_orchestrator_admin | Askama + Bootstrap + Unpoly admin dashboard (admin.sock) |
| hero_orchestrator_examples | Integration / smoke tests |

Service introspection

Every binary supports --info and --info --json:

hero_orchestrator --info               # CLI manifest (TOML)
hero_orchestrator_server --info        # server manifest (TOML)
hero_orchestrator_admin --info         # admin manifest (TOML)
# add --json to any of the above for JSON output

Quota proxy scope

By default (HERO_ORCHESTRATOR_PROXY_SCOPE=shadows_only), only short-lived naming/commit shadow agents route their Anthropic traffic through the in-process quota proxy. Long-running standard/planner agents connect directly to api.anthropic.com, so a proxy or orchestrator restart never interrupts their in-flight API calls, and a stale split-brain instance can only ever 403 a short, retried shadow rather than a user-facing agent. Account-level usage stays visible via the hourly oauth_usage_poll, and the shadows keep a fresh trickle of per-request anthropic-ratelimit-* snapshots flowing in between polls.

Set HERO_ORCHESTRATOR_PROXY_SCOPE=all to route every agent through the proxy and recover full per-request usage telemetry — appropriate once the split-brain issue is resolved.
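The routing decision described in this section could look roughly like the sketch below. The environment variable name and its two values come from this README; the types `ProxyScope` and `AgentKind`, the method names, and the fallback to the default on an unrecognized value are assumptions.

```rust
/// Which agents route through the quota proxy (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProxyScope {
    ShadowsOnly,
    All,
}

/// Agent categories mentioned in this README (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentKind {
    Shadow,
    Standard,
    Planner,
}

impl ProxyScope {
    /// Parses the variable's value. Unset or unrecognized values fall back
    /// to `shadows_only`; that fallback is an assumption of this sketch.
    pub fn parse(value: Option<&str>) -> ProxyScope {
        match value.map(str::trim) {
            Some("all") => ProxyScope::All,
            _ => ProxyScope::ShadowsOnly,
        }
    }

    /// Reads HERO_ORCHESTRATOR_PROXY_SCOPE from the process environment.
    pub fn from_env() -> ProxyScope {
        ProxyScope::parse(std::env::var("HERO_ORCHESTRATOR_PROXY_SCOPE").ok().as_deref())
    }

    /// True when an agent of this kind should send traffic via the proxy.
    pub fn routes_through_proxy(self, kind: AgentKind) -> bool {
        match self {
            ProxyScope::All => true,
            ProxyScope::ShadowsOnly => kind == AgentKind::Shadow,
        }
    }
}
```

Keeping parsing separate from the environment lookup makes the rule easy to test without mutating process state.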

Shadow naming/commit turns are retried up to 3× on transient failure. API and auth errors are detected via the CLI result line's is_error flag (not by string-matching the text), so an error like API Error: 403 unknown proxy token is treated as a failed turn and retried — never written out as a session name or commit message.
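A minimal sketch of that retry rule follows. `TurnResult`, `MAX_ATTEMPTS`, and `run_shadow_turn` are hypothetical names, and the sketch reads "up to 3×" as three attempts in total, which is an assumption. The key point is that success is decided by the `is_error` flag alone, never by inspecting the text.

```rust
/// Hypothetical view of the terminal `result` line of a shadow turn.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TurnResult {
    pub is_error: bool,
    pub text: String,
}

/// Assumed total number of attempts per shadow turn.
pub const MAX_ATTEMPTS: u32 = 3;

/// Runs `turn` until it reports success or the attempts run out.
/// Returns the text of the first non-error result, or `None` if every
/// attempt failed, so error text is never used as a name or message.
pub fn run_shadow_turn<F>(mut turn: F) -> Option<String>
where
    F: FnMut(u32) -> TurnResult,
{
    for attempt in 1..=MAX_ATTEMPTS {
        let result = turn(attempt);
        // Decide on the flag, not on the content of `text`.
        if !result.is_error {
            return Some(result.text);
        }
    }
    None
}
```

A production version would likely add a delay between attempts; the sketch omits it to keep the control flow visible.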

Quota proxy diagnostics

The quota proxy logs structured diagnostics to help track down 403 unknown proxy token and intermittent upstream errors:

  • On bind it logs its unix socket path and the orchestrator PID. A token miss is logged at warn with the same socket/PID, so a frequent miss can be pinned to a split-brain duplicate (a request hitting an instance that never minted the token) or a restart re-hydration gap.
  • Any non-2xx upstream response is logged at warn with the method, path, status, upstream request-id, and whether the request carried authorization / anthropic-beta headers.
  • Set HERO_ORCHESTRATOR_PROXY_LOG_ERRORS=1 to also log the first ~512 bytes of non-2xx upstream response bodies. Off by default so normal operation stays quiet and SSE streams are never buffered.
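Capping the logged body at roughly 512 bytes could be done as below. This is a sketch under assumptions: `BODY_LOG_LIMIT` and `body_preview` are hypothetical names, and the real proxy may truncate differently.

```rust
/// Assumed cap on how much of an error body is logged.
pub const BODY_LOG_LIMIT: usize = 512;

/// Returns at most `limit` bytes of `body` as text suitable for a log line.
/// Invalid UTF-8, including a multi-byte character cut at the limit, is
/// replaced with U+FFFD, so the result can be slightly longer than `limit`.
pub fn body_preview(body: &[u8], limit: usize) -> String {
    let end = body.len().min(limit);
    String::from_utf8_lossy(&body[..end]).into_owned()
}
```

Because only a prefix is copied, the caller never needs to buffer a whole response just to log it.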

Relationship to claude-agent-sdk-python

This is, in spirit, a Rust port of the slice of claude-agent-sdk-python that we actually use. Both spawn the claude CLI as a subprocess in --print --output-format stream-json mode, drain it line-by-line, and turn the events into something application code can consume. The Python SDK gives you a library you embed; hero_orchestrator wraps the same idea behind a service (RPC + SQLite + admin UI) so multiple agents can run side-by-side and be observed.

Implemented

  • One-shot prompts via claude -p with stream-json output
  • Per-agent subprocess supervisor (parallel, no concurrency cap in v1)
  • Model selection (claude-haiku-4-5, claude-sonnet-4-6, claude-opus-4-7)
  • Effort levels (none, low, medium, high) → --effort flag
  • Permission mode (hard-wired to bypassPermissions for v1)
  • Cancellation (a oneshot signal plus kill_on_drop on the child process; note that tokio's Command::kill_on_drop delivers SIGKILL on Unix, not SIGTERM)
  • Token / cost accumulation from the terminal result line
  • Stream message persistence (every line stored verbatim, classified by type)
  • Crash recovery — running agents on disk are reconciled to failed on server restart so they don't appear stuck
  • Working-directory filter and three-bucket dashboard (running / awaiting_review / completed)
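The one-shot invocation described in the list above could be assembled roughly as follows. This is a sketch, not the contents of claude_cli.rs: `Effort`, `claude_args`, and `claude_command` are hypothetical names, the flags are the ones this README mentions plus `--permission-mode`, and the real driver may pass additional flags (some CLI versions also want `--verbose` alongside stream-json output).

```rust
use std::path::Path;
use std::process::{Command, Stdio};

/// Effort levels listed in this README (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Effort {
    Disabled,
    Low,
    Medium,
    High,
}

impl Effort {
    /// Value passed to the `--effort` flag.
    pub fn as_flag_value(self) -> &'static str {
        match self {
            Effort::Disabled => "none",
            Effort::Low => "low",
            Effort::Medium => "medium",
            Effort::High => "high",
        }
    }
}

/// Builds the argument list for a one-shot, stream-json run.
pub fn claude_args(prompt: &str, model: &str, effort: Effort) -> Vec<String> {
    vec![
        "-p".to_string(),
        prompt.to_string(),
        "--output-format".to_string(),
        "stream-json".to_string(),
        "--model".to_string(),
        model.to_string(),
        "--effort".to_string(),
        effort.as_flag_value().to_string(),
        // v1 hard-wires the permission mode, per the list above.
        "--permission-mode".to_string(),
        "bypassPermissions".to_string(),
    ]
}

/// Prepares (but does not spawn) the subprocess with stdout piped so the
/// caller can drain it line by line.
pub fn claude_command(prompt: &str, model: &str, effort: Effort, cwd: &Path) -> Command {
    let mut cmd = Command::new("claude");
    cmd.args(claude_args(prompt, model, effort))
        .current_dir(cwd)
        .stdout(Stdio::piped());
    cmd
}
```

Separating argument construction from process setup keeps the flag mapping testable without a claude binary on PATH.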

Not implemented (yet)

  • Bidirectional control protocol. The Python SDK speaks a JSON-RPC dialect to a long-lived claude process for streaming user input, mid-session tool approval, and interrupt/set_permission_mode calls. We do one-shot only.
  • Hooks. No PreToolUse / PostToolUse / Stop / UserPromptSubmit callbacks.
  • Resumable sessions (--resume, --continue) and partial-message streaming (--include-partial-messages).

If you need any of these, the claude CLI flags exist — claude_cli.rs is the file to extend.