HeroEmbedder
A fast, local embedding server for RAG applications. Provides dense vector embeddings, similarity search, and reranking via a JSON-RPC 2.0 API with namespace support for isolated document collections.
Architecture
hero_embedder/
├── crates/
│   ├── hero_embedder_lib/       # Library: server internals (ML, storage, retrieval)
│   ├── hero_embedderd/          # Binary: ONNX daemon (TCP, loads all models once)
│   ├── hero_embedder_server/    # Binary: JSON-RPC daemon (Unix socket)
│   ├── hero_embedder_web/       # Binary: Axum web dashboard (Unix socket)
│   ├── hero_embedder_sdk/       # Library: JSON-RPC client and types
│   ├── hero_embedder/           # Binary: CLI using the SDK
│   └── hero_embedder_examples/  # Examples: SDK usage demonstrations
└── Cargo.toml                   # Workspace root
Dependency Graph
hero_embedderd (ONNX models, TCP)
        ↑
hero_embedder_server (JSON-RPC Unix socket, delegates embed/rerank to daemon)
        ↑
hero_embedder_sdk (JSON-RPC client)
    ↑               ↑
    │               │
hero_embedder   hero_embedder_web
    (CLI)           (admin UI)
Lifecycle
lab drives the full build/install/start/stop pipeline on top of hero_proc.
lab service embedder --install # build + install all binaries
lab service embedder --start # register with hero_proc and start
lab service embedder --stop # stop all binaries
lab service embedder --status # status of all binaries
Sockets
| Service | Socket Path | Type |
|---|---|---|
| Server | $HERO_SOCKET_DIR/hero_embedder/rpc.sock | Unix Socket (OpenRPC / JSON-RPC 2.0) |
| Web UI | $HERO_SOCKET_DIR/hero_embedder/web.sock | Unix Socket (HTTP admin dashboard) |
| Proxy | $HERO_SOCKET_DIR/hero_embedder_proxy/rpc.sock | Unix Socket (namespace-isolating proxy) |
| Daemon | TCP 127.0.0.1:8092 (configurable) | HTTP JSON-RPC + /health |
All server/web sockets are Unix sockets only. External access is provided by hero_router.
The daemon TCP port is intended for loopback use; cross-node access goes through hero_router.
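The socket paths in the table can be derived from the base directory. The following is a minimal sketch using only the standard library; `socket_path` is a hypothetical helper, not part of the SDK, and it assumes the `~/hero/var/sockets` default listed under Environment Variables.

```rust
use std::path::PathBuf;

/// Resolve a service socket path under the socket base directory.
/// `socket_dir` is the value of HERO_SOCKET_DIR if it is set; `home` is the
/// caller's home directory, used only to build the documented default.
/// (Hypothetical helper for illustration; not part of hero_embedder_sdk.)
fn socket_path(socket_dir: Option<&str>, home: &str, service: &str, name: &str) -> PathBuf {
    let base = match socket_dir {
        // An explicit, non-empty HERO_SOCKET_DIR wins.
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        // Otherwise fall back to ~/hero/var/sockets.
        _ => PathBuf::from(home).join("hero/var/sockets"),
    };
    base.join(service).join(name)
}
```

In a real program the first argument would typically come from `std::env::var("HERO_SOCKET_DIR").ok()`, for example `socket_path(dir.as_deref(), &home, "hero_embedder", "rpc.sock")`.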
Features
- Embedding Generation: BGE models (small/base) with INT8/FP32 options
- Semantic Search: Fast cosine similarity search
- Reranking: Cross-encoder model for improved accuracy
- Namespaces: Isolated document collections for multi-tenant use
- Persistence: Documents stored in redb databases
- Web UI: Bootstrap-based admin dashboard with live updates
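To make the semantic search feature concrete, here is a minimal sketch of cosine similarity ranking over in-memory vectors. It illustrates the metric only: the brute-force scan and the names `cosine_similarity` and `top_k` are assumptions for this example, and the server's actual index structure is not described here.

```rust
/// Cosine similarity of two vectors.
/// Returns None if the lengths differ, the input is empty, or a norm is zero.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Score every document against the query and return the `k` best
/// (document index, similarity) pairs, highest similarity first.
fn top_k(query: &[f32], docs: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .filter_map(|(i, d)| cosine_similarity(query, d).map(|s| (i, s)))
        .collect();
    // Sort descending by score; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

Documents whose vectors cannot be compared with the query (wrong dimension or zero norm) are skipped rather than treated as errors, which is one possible design choice among several.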
Quality Levels
Quality is set per namespace at creation time. All four embedding model variants listed below are loaded at startup.
| Level | Name | Model | Weights | Dimensions | Use Case |
|---|---|---|---|---|---|
| 1 | Fast | bge-small | INT8 | 384 | Real-time, low latency |
| 2 | Balanced | bge-small | FP32 | 384 | Default, good balance |
| 3 | Quality | bge-base | INT8 | 768 | Better accuracy |
| 4 | Best | bge-base | FP32 | 768 | Maximum quality |
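The table above can be expressed as a small lookup type. This is an illustrative sketch only: `QualityLevel` and its methods are hypothetical names, not the types used by `hero_embedder_lib` or the SDK.

```rust
/// Quality levels from the table above (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QualityLevel {
    Fast = 1,
    Balanced = 2,
    Quality = 3,
    Best = 4,
}

impl QualityLevel {
    /// Map the numeric level used in the API (1 to 4) to a variant.
    fn from_level(level: u8) -> Option<Self> {
        match level {
            1 => Some(Self::Fast),
            2 => Some(Self::Balanced),
            3 => Some(Self::Quality),
            4 => Some(Self::Best),
            _ => None,
        }
    }

    /// Embedding model used at this level.
    fn model(self) -> &'static str {
        match self {
            Self::Fast | Self::Balanced => "bge-small",
            Self::Quality | Self::Best => "bge-base",
        }
    }

    /// Weight precision used at this level.
    fn weights(self) -> &'static str {
        match self {
            Self::Fast | Self::Quality => "INT8",
            Self::Balanced | Self::Best => "FP32",
        }
    }

    /// Dimensionality of the vectors produced at this level.
    fn dimensions(self) -> usize {
        match self {
            Self::Fast | Self::Balanced => 384,
            Self::Quality | Self::Best => 768,
        }
    }
}
```

Because levels 1 and 2 produce 384-dimensional vectors and levels 3 and 4 produce 768-dimensional ones, vectors from different levels are not generally comparable, which is consistent with quality being fixed per namespace.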
API
JSON-RPC 2.0 endpoint at POST /rpc
Server Info
{"jsonrpc": "2.0", "id": 1, "method": "info", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}
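Every request in this section shares the same envelope with positional `params`. As a rough sketch, the envelope can be assembled with the standard library alone; `json_string` and `rpc_request` are hypothetical helpers written for this example (real code would more likely use the SDK or a JSON library).

```rust
/// Encode a Rust string as a JSON string literal, escaping quotes,
/// backslashes, and control characters.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build a JSON-RPC 2.0 request with positional parameters.
/// Each entry in `params` must already be valid JSON text
/// (for example the output of `json_string`, or a number like "10").
fn rpc_request(id: u64, method: &str, params: &[String]) -> String {
    format!(
        "{{\"jsonrpc\": \"2.0\", \"id\": {}, \"method\": {}, \"params\": [{}]}}",
        id,
        json_string(method),
        params.join(", ")
    )
}
```

For instance, `rpc_request(1, "health", &[])` yields the `health` request shown above.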
Embedding
{"jsonrpc": "2.0", "id": 1, "method": "embed", "params": [["hello world", "another text"]]}
Index Management
{"jsonrpc": "2.0", "id": 1, "method": "index.add", "params": [[{"id": "doc1", "text": "hello"}], "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.get", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.delete", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.count", "params": ["namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.clear", "params": ["namespace"]}
Search
{"jsonrpc": "2.0", "id": 1, "method": "search", "params": ["query text", 10, "namespace", true]}
Rerank
{"jsonrpc": "2.0", "id": 1, "method": "rerank", "params": ["query", [{"id": "1", "text": "..."}], 5]}
Namespaces
{"jsonrpc": "2.0", "id": 1, "method": "namespace.list", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.create", "params": ["my-docs", 2]}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.delete", "params": ["my-docs"]}
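The requests above can be sent without the SDK. The sketch below assumes the server's Unix socket speaks plain HTTP/1.1 (suggested by the `POST /rpc` endpoint, but not confirmed here) and that the response is not chunked; `http_post_rpc` and `call` are hypothetical helpers for illustration.

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;

/// Frame a JSON body as an HTTP/1.1 POST to /rpc.
/// Content-Length is the body length in bytes, not characters.
fn http_post_rpc(body: &str) -> String {
    format!(
        "POST /rpc HTTP/1.1\r\nHost: localhost\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(),
        body
    )
}

/// Send one request over the Unix socket and return the raw HTTP response
/// (status line, headers, and body) as text. Assumes the server closes the
/// connection after responding, as requested by `Connection: close`.
fn call(socket_path: &str, body: &str) -> std::io::Result<String> {
    let mut stream = UnixStream::connect(socket_path)?;
    stream.write_all(http_post_rpc(body).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

The SDK client shown under SDK Usage remains the simpler route; this form is mainly useful for debugging the wire format.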
CLI Client
hero_embedder health
hero_embedder stats
hero_embedder embed "hello world"
hero_embedder search "query" -k 10
hero_embedder add doc1 "document text"
hero_embedder ns-list
hero_embedder ns-create my-docs
SDK Usage (Rust)
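use hero_embedder_sdk::HeroEmbedderClient;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Honor HERO_SOCKET_DIR, falling back to the documented default.
    let socket_dir = match std::env::var("HERO_SOCKET_DIR") {
        Ok(dir) => dir,
        Err(_) => format!("{}/hero/var/sockets", std::env::var("HOME")?),
    };
    let socket = format!("{socket_dir}/hero_embedder/rpc.sock");
    let client = HeroEmbedderClient::connect_socket(&socket).await?;
    // Top 10 matches for "hello"; the two optional arguments are left unset.
    let results = client.search("hello", 10, None, None).await?;
    Ok(())
}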
Environment Variables
| Variable | Default | Description |
|---|---|---|
| EMBEDDER_MODELS | ~/hero/var/embedder/models | Models directory |
| EMBEDDER_DATA | ~/hero/var/embedder/data | Data directory |
| HERO_EMBEDDERD_PORT | 8092 | TCP port hero_embedderd listens on |
| HERO_EMBEDDERD_URL | http://127.0.0.1:8092 | URL hero_embedder_server uses to reach the daemon |
| HERO_SOCKET_DIR | ~/hero/var/sockets | Base directory for Unix sockets |
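As an illustration of how these variables and defaults fit together, the sketch below loads them into a struct. `EmbedderConfig` and `from_lookup` are hypothetical names; the binaries' actual configuration code may differ. The lookup is passed in as a closure so the defaults can be exercised without touching the process environment.

```rust
/// Settings from the table above (hypothetical struct for illustration).
#[derive(Debug, PartialEq)]
struct EmbedderConfig {
    models_dir: String,
    data_dir: String,
    daemon_port: u16,
    daemon_url: String,
    socket_dir: String,
}

impl EmbedderConfig {
    /// Build the configuration from a variable lookup, applying the
    /// documented defaults. `home` replaces the leading `~` in the defaults.
    fn from_lookup(
        home: &str,
        lookup: impl Fn(&str) -> Option<String>,
    ) -> Result<Self, String> {
        let daemon_port = match lookup("HERO_EMBEDDERD_PORT") {
            Some(raw) => raw
                .parse::<u16>()
                .map_err(|e| format!("invalid HERO_EMBEDDERD_PORT {raw:?}: {e}"))?,
            None => 8092,
        };
        Ok(Self {
            models_dir: lookup("EMBEDDER_MODELS")
                .unwrap_or_else(|| format!("{home}/hero/var/embedder/models")),
            data_dir: lookup("EMBEDDER_DATA")
                .unwrap_or_else(|| format!("{home}/hero/var/embedder/data")),
            daemon_port,
            // The default URL is taken literally from the table; whether it
            // should follow a custom port is not specified, so it does not here.
            daemon_url: lookup("HERO_EMBEDDERD_URL")
                .unwrap_or_else(|| "http://127.0.0.1:8092".to_string()),
            socket_dir: lookup("HERO_SOCKET_DIR")
                .unwrap_or_else(|| format!("{home}/hero/var/sockets")),
        })
    }
}
```

Against the real environment, the lookup would be `|key| std::env::var(key).ok()`.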
Data Storage
~/hero/var/embedder/
├── models/
│   ├── bge-small/
│   ├── bge-base/
│   └── bge-reranker-base/
└── data/
    ├── default/
    │   └── q2/
    │       └── rag.redb
    └── corpus.redb
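The tree suggests, though it does not state, that each namespace keeps its database at `data/<namespace>/q<level>/rag.redb`, with `default/q2` being the default namespace at quality level 2. Under that assumption, the path could be computed as follows; `namespace_db_path` is a hypothetical helper, not a function from the library.

```rust
use std::path::{Path, PathBuf};

/// Path of a namespace's redb file, ASSUMING the layout
/// <data_dir>/<namespace>/q<quality>/rag.redb inferred from the tree above.
fn namespace_db_path(data_dir: &Path, namespace: &str, quality: u8) -> PathBuf {
    data_dir
        .join(namespace)
        .join(format!("q{quality}"))
        .join("rag.redb")
}
```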
License
MIT