AI Broker
A lightweight LLM request broker with an OpenAI-compatible REST API that intelligently routes requests to multiple LLM providers with cost-aware strategies. All communication is via Unix Domain Sockets — no TCP ports.
Features
- OpenAI-Compatible API — Drop-in replacement for OpenAI clients (via Unix socket)
- Multi-Provider Support — OpenAI, OpenRouter, Groq, SambaNova
- Smart Routing — Automatic model selection based on cost or quality
- Cost Tracking — Per-request cost calculation and tracking
- Request Tracking — Detailed per-IP request tracking with timestamps and durations
- Streaming Support — Real-time streaming responses via SSE
- MCP Broker — Aggregate tools from multiple MCP (Model Context Protocol) servers
- Rate Limiting — Per-IP rate limiting with configurable limits
- Audio APIs — Text-to-speech and speech-to-text support (Groq, SambaNova, OpenAI)
- Config-Based Audio Models — STT/TTS models defined in modelsconfig.yml with automatic fallback
- Embeddings — Vector embedding generation
- Many Chat Models — Latest Claude 4.x, Gemini 3, GPT-5.2, o3-mini, Grok 4.1, Kimi K2.5 and more
- Persistent Billing — SQLite-based request logging for billing and analytics
- API Key Support — Optional API key authentication system
- Unix Socket Architecture — All services communicate over Unix Domain Sockets; no open TCP ports
Project Structure
hero_aibroker/
├── crates/
│ ├── hero_aibroker/ # CLI + service manager (--start / --stop)
│ ├── hero_aibroker_lib/ # Core business logic (shared library)
│ ├── hero_aibroker_sdk/ # Generated OpenRPC client + types
│ ├── hero_aibroker_server/ # Server binary (two Unix sockets)
│ ├── hero_aibroker_ui/ # Admin dashboard binary (Unix socket)
│ ├── hero_aibroker_examples/ # SDK examples and integration tests
│ ├── hero_broker_server/ # Multi-MCP broker binary
│ └── mcp/
│ ├── mcp_common/ # Shared MCP utilities
│ ├── mcp_exa/ # Exa semantic search
│ ├── mcp_hero/ # Hero MCP integration
│ ├── mcp_ping/ # Ping test server
│ ├── mcp_scraperapi/ # ScraperAPI web scraping
│ ├── mcp_scrapfly/ # Scrapfly web scraping
│ ├── mcp_serpapi/ # SerpAPI web search
│ └── mcp_serper/ # Serper web search
├── modelsconfig.yml # Model definitions and pricing
└── mcp_servers.example.json # MCP server configuration template
Dependency Graph
hero_aibroker_lib (core logic)
↑
hero_aibroker_sdk (types, protocol, RPC client)
↑ ↑ ↑
| | |
server CLI UI
Architecture
All services bind Unix Domain Sockets under ~/hero/var/sockets/. There are no TCP listeners.
┌──────────────────────────────────────────────────────────────────┐
│ CLI (hero_aibroker chat/models/tools/health) │
│ connects via: ~/hero/var/sockets/hero_aibroker_server.sock │
└──────────────────────────────────┬───────────────────────────────┘
│ JSON-RPC
┌──────────────────────────────────▼───────────────────────────────┐
│ hero_aibroker_server │
│ ├── JSON-RPC admin API → ~/hero/var/sockets/ │
│ │ hero_aibroker_server.sock │
│ └── REST (OpenAI-compat) → ~/hero/var/sockets/ │
│ hero_aibroker_server_rest.sock │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Service Layer │ │
│ │ (Routing logic, model selection, cost calculation) │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Provider Layer │ │
│ │ (OpenAI, Groq, SambaNova, OpenRouter adapters) │ │
│ └──────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ hero_aibroker_ui (admin dashboard) │
│ binds: ~/hero/var/sockets/hero_aibroker/ui.sock │
│ proxies JSON-RPC requests to hero_aibroker_server │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ hero_broker_server (MCP broker) │
│ binds multiple sockets for aggregated MCP services │
└──────────────────────────────────────────────────────────────────┘
All services are registered with and managed by hero_proc. The hero_aibroker --start command handles all registration and startup.
Quick Start
Prerequisites
- Rust 1.70 or later
- hero_proc installed and running
- At least one LLM provider API key
Environment Variables
Source your env file before running:
source ~/.config/env.sh # or wherever you keep your secrets
LLM provider keys (at least one required):
| Variable | Description |
|---|---|
| GROQ_API_KEY / GROQ_API_KEYS | Groq API key(s) |
| OPENROUTER_API_KEY / OPENROUTER_API_KEYS | OpenRouter API key(s) |
| SAMBANOVA_API_KEY / SAMBANOVA_API_KEYS | SambaNova API key(s) |
| OPENAI_API_KEY / OPENAI_API_KEYS | OpenAI API key(s) |
Both singular and plural variants are accepted. Use comma-separated values for multiple keys per provider — the broker creates separate provider instances and distributes requests across them for higher throughput, load distribution, and automatic failover.
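The comma-separated format can be sanity-checked locally before starting the broker. A small bash sketch (the key values are placeholders, not real keys):

```shell
# Placeholder keys — the broker splits on commas into separate provider instances
export GROQ_API_KEYS="gsk_key_one,gsk_key_two"
IFS=',' read -ra KEYS <<< "$GROQ_API_KEYS"
echo "${#KEYS[@]} Groq keys configured"
```

This mirrors the splitting behavior described above; each element becomes its own provider instance for round-robin distribution and failover.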
Web/search tool keys (optional, used by MCP servers):
| Variable | Description |
|---|---|
| SERPAPI_API_KEYS | SerpAPI web search |
| SERPER_API_KEYS | Serper web search |
| EXA_API_KEYS | Exa semantic search |
| SCRAPERAPI_API_KEYS | ScraperAPI web scraping |
| SCRAPFLY_API_KEYS | Scrapfly web scraping |
Service configuration:
| Variable | Default | Description |
|---|---|---|
| ROUTING_STRATEGY | cheapest | cheapest or best |
| MCP_CONFIG_PATH | — | Path to MCP server config JSON |
| MODELS_CONFIG_PATH | — | Path to model config YAML |
| ADMIN_TOKEN | — | Simple admin auth token |
| HERO_SECRET | — | Hero Auth JWT secret |
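Putting these together, a minimal environment for a cheapest-routing setup might look like the following (the key value and config path are placeholders; substitute your own):

```shell
# Hypothetical values — replace with your real key and paths
export OPENROUTER_API_KEY="sk-or-placeholder"
export ROUTING_STRATEGY="cheapest"
export MODELS_CONFIG_PATH="$PWD/modelsconfig.yml"
echo "routing strategy: $ROUTING_STRATEGY"
```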
Run
source ~/.config/env.sh
make run # build, install, and start all services via hero_proc
Stop
make stop # stop all hero_proc-managed services
Development Mode
make rundev # run server directly in debug mode (no hero_proc, logs to stdout)
make cli # interactive CLI session (debug build)
API Reference
All REST endpoints are served on the Unix socket at ~/hero/var/sockets/hero_aibroker_server_rest.sock. Use curl --unix-socket to reach them.
List Models
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
http://localhost/v1/models
Chat Completions
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
http://localhost/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello!"}],
"stream": true
}'
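With "stream": true, the REST socket returns Server-Sent Events, where each chunk arrives on a data: line. A sketch of extracting the JSON payloads from a captured stream (the sample lines below are illustrative, not real server output):

```shell
# Sample lines in the OpenAI-compatible SSE streaming format;
# strip the "data: " prefix and drop the [DONE] terminator
printf 'data: {"choices":[{"delta":{"content":"Hel"}}]}\ndata: {"choices":[{"delta":{"content":"lo"}}]}\ndata: [DONE]\n' \
  | sed -n 's/^data: //p' \
  | grep -v '^\[DONE\]'
```

Each surviving line is one JSON chunk whose choices[0].delta.content holds the next text fragment.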
Text-to-Speech
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
http://localhost/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model": "tts-1", "input": "Hello, world!", "voice": "alloy"}' \
--output speech.mp3
Available TTS models: tts-1, tts-1-hd (requires OPENAI_API_KEY)
Speech-to-Text
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
http://localhost/v1/audio/transcriptions \
-F "file=@audio.mp3" \
-F "model=whisper-1"
Available STT models:
- whisper-1 — multi-provider (Groq → SambaNova → OpenAI fallback chain)
- whisper-large-v3 — direct Groq/SambaNova access
Embeddings
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
http://localhost/v1/embeddings \
-H "Content-Type: application/json" \
-d '{"model": "text-embedding-3-small", "input": "Hello, world!"}'
JSON-RPC Admin API
The admin API is served on ~/hero/var/sockets/hero_aibroker_server.sock:
# Health check
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server.sock \
http://localhost/rpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"health","params":{},"id":1}'
# List models
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server.sock \
http://localhost/rpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"models.list","params":{},"id":2}'
# List MCP tools
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server.sock \
http://localhost/rpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"mcp.list_tools","params":{},"id":3}'
Billing & Usage
# View all IP usage and costs
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
http://localhost/billing/usage
# View specific IP usage
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
http://localhost/billing/usage/127.0.0.1
All requests are persisted to SQLite with IP address, model, token usage, cost in USD, timestamps, and success/error status.
# Export to CSV
sqlite3 -header -csv requests.db "SELECT * FROM request_logs;" > billing.csv
CLI Usage
The hero_aibroker binary is both the service manager and the interactive CLI. It connects via ~/hero/var/sockets/hero_aibroker_server.sock.
# Interactive chat
hero_aibroker chat --model gpt-4o
# Chat with the default auto-routing model
hero_aibroker chat
# List available models
hero_aibroker models
# List MCP tools
hero_aibroker tools
# Check server health
hero_aibroker health
CLI Options
Global options:
- -m, --model <MODEL> — model to use for chat (default: auto)
- --socket <PATH> — custom socket path (default: ~/hero/var/sockets/hero_aibroker_server.sock)
Chat sub-command options:
- -m, --model <MODEL> — model to use (overrides global --model)
Service Management
hero_aibroker --start # register all services with hero_proc and start them
hero_aibroker --stop # stop all services via hero_proc
Model Configuration
Models are defined in modelsconfig.yml. The file controls display names, tiers, capabilities, context windows, and per-provider backends with pricing:
models:
gpt-4o:
display_name: "GPT-4o"
tier: premium
capabilities:
- tool_calling
- vision
context_window: 128000
backends:
- provider: openrouter
model_id: openai/gpt-4o
priority: 1
input_cost: 2.5 # USD per million tokens
output_cost: 10.0
Set MODELS_CONFIG_PATH to point to your config file, or place modelsconfig.yml in the working directory.
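Given the per-million-token rates in the backend entry above, a request's cost can be reproduced by hand. An awk sketch using made-up token counts (the rates mirror the gpt-4o example; the token counts are hypothetical):

```shell
# cost = input_tokens/1e6 * input_cost + output_tokens/1e6 * output_cost
awk 'BEGIN {
  input_tokens = 1200; output_tokens = 300     # hypothetical usage
  input_cost = 2.5; output_cost = 10.0          # USD per million tokens, from modelsconfig.yml
  printf "cost: $%.6f\n", input_tokens/1e6*input_cost + output_tokens/1e6*output_cost
}'
# → cost: $0.006000
```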
Auto Model Selection
Use special model names for automatic selection:
| Model Name | Description |
|---|---|
| auto | Use the configured ROUTING_STRATEGY |
| autocheapest | Select the cheapest available model |
| autobest | Select the best premium model |
MCP Integration
The broker aggregates tools from multiple MCP (Model Context Protocol) servers managed by hero_broker_server. Configure servers in a JSON file pointed to by MCP_CONFIG_PATH (see mcp_servers.example.json):
{
"mcpServers": [
{
"name": "serper",
"command": "/path/to/mcp_serper",
"args": [],
"env": {}
},
{
"name": "exa",
"command": "/path/to/mcp_exa",
"args": [],
"env": {}
}
]
}
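Before pointing MCP_CONFIG_PATH at a config file, a quick syntax check can save a failed startup. A sketch using python3's built-in JSON parser on an inline fragment (substitute your real file path in practice):

```shell
# Validate an inline config fragment; for a file, use: python3 -m json.tool mcp_servers.json
echo '{"mcpServers":[{"name":"serper","command":"/path/to/mcp_serper","args":[],"env":{}}]}' \
  | python3 -m json.tool > /dev/null && echo "valid JSON"
```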
Included MCP Servers
All MCP binaries are built as part of the workspace and managed by hero_broker_server:
| Binary | Description | Required Key |
|---|---|---|
| mcp_serper | Web search via Serper | SERPER_API_KEYS |
| mcp_serpapi | Web search via SerpAPI | SERPAPI_API_KEYS |
| mcp_exa | Semantic search via Exa | EXA_API_KEYS |
| mcp_scraperapi | Web scraping via ScraperAPI | SCRAPERAPI_API_KEYS |
| mcp_scrapfly | Web scraping via Scrapfly | SCRAPFLY_API_KEYS |
| mcp_ping | Ping/test server | — |
| mcp_hero | Hero OS service discovery + LLM-driven Python code generation and execution | HERO_SECRET |
MCP REST Endpoints
| Endpoint | Description |
|---|---|
| GET /mcp/tools | List all aggregated tools |
| POST /mcp/tools/:name | Call a specific tool |
| GET /mcp/sse | SSE endpoint for MCP clients |
Development
Building
# Release build (all workspace crates)
cargo build --release
# Debug build
cargo build
# Build a specific crate
cargo build -p hero_aibroker_server
cargo build -p hero_aibroker
# Fast check (no codegen)
make check
Running Tests
# Run all tests
cargo test --all
# Run tests for a specific crate
cargo test -p hero_aibroker_lib
# Full build + test cycle
make all
Code Quality
make fmt # format code
make fmt-check # check formatting without modifying
make lint # run clippy (warnings as errors)
make lint-fix # run clippy and auto-fix
Logs
make logs # tail hero_aibroker_server logs via hero_proc
make logs-ui # tail hero_aibroker_ui logs via hero_proc
Status
make status # show service status and installed binaries
Deployment
Install Binaries
make install # build release and install to ~/hero/bin
Ship to Registry
make ship-binary # tag + push to trigger CI build
make ship-binary TAG=1.2.3 # override version tag
Docker
make docker-build # build Docker image
make docker-run # run Docker container (source env vars first)
License
MIT License