despiegk 5862e80f85
refactor: replace hardcoded HOME paths with herolib_core::base APIs; drop local BUILD_NR consts
- Remove duplicated BUILD_NR const blocks from all binaries (now supplied by service_base!())
- Replace ~/hero/var/sockets/… strings with herolib_core::base::resolve_socket_dir() / hero_socket_dir() / path_root()
- Add herolib_core dependency to hero_embedder_examples
- Bump hero_proc_sdk and herolib_core/herolib_derive git revisions in Cargo.lock

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-17 09:17:38 +02:00

HeroEmbedder

A fast, local embedding server for RAG applications. Provides dense vector embeddings, similarity search, and reranking via a JSON-RPC 2.0 API with namespace support for isolated document collections.

Architecture

hero_embedder/
├── crates/
│   ├── hero_embedder_lib/         # Library: server internals (ML, storage, retrieval)
│   ├── hero_embedderd/            # Binary: ONNX daemon (TCP, loads all models once)
│   ├── hero_embedder_server/      # Binary: JSON-RPC daemon (Unix socket)
│   ├── hero_embedder_web/         # Binary: Axum web dashboard (Unix socket)
│   ├── hero_embedder_sdk/         # Library: JSON-RPC client and types
│   ├── hero_embedder/             # Binary: CLI using the SDK
│   └── hero_embedder_examples/    # Examples: SDK usage demonstrations
└── Cargo.toml                     # Workspace root

Dependency Graph

hero_embedderd  (ONNX models, TCP)
      ↑
hero_embedder_server  (JSON-RPC Unix socket, delegates embed/rerank to daemon)
      ↑
hero_embedder_sdk  (JSON-RPC client)
      ↑        ↑
      │        │
hero_embedder   hero_embedder_web
(CLI)          (admin UI)

Lifecycle

lab drives the full build/install/start/stop pipeline on top of hero_proc.

lab service embedder --install   # build + install all binaries
lab service embedder --start     # register with hero_proc and start
lab service embedder --stop      # stop all binaries
lab service embedder --status    # status of all binaries

Sockets

| Service | Socket Path | Type |
|---------|-------------|------|
| Server | $HERO_SOCKET_DIR/hero_embedder/rpc.sock | Unix Socket (OpenRPC / JSON-RPC 2.0) |
| Web UI | $HERO_SOCKET_DIR/hero_embedder/web.sock | Unix Socket (HTTP admin dashboard) |
| Proxy | $HERO_SOCKET_DIR/hero_embedder_proxy/rpc.sock | Unix Socket (namespace-isolating proxy) |
| Daemon | TCP 127.0.0.1:8092 (configurable) | HTTP JSON-RPC + /health |

All server/web sockets are Unix sockets only. External access is provided by hero_router. The daemon TCP port is intended for loopback use; cross-node access goes through hero_router.
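
As a rough illustration of how the socket paths in the table are composed, the following std-only sketch joins the base directory, the service name, and the socket file name. The function names `socket_base`, `socket_path`, and `rpc_socket_from_env` are hypothetical and exist only for this example; the repository itself appears to resolve these paths through helpers in herolib_core::base, whose signatures are not shown in this README.

```rust
use std::env;
use std::path::PathBuf;

/// Pick the socket base directory (hypothetical helper).
/// `socket_dir` is the value of HERO_SOCKET_DIR if it is set;
/// `home` is the user's home directory, used for the documented default.
fn socket_base(socket_dir: Option<&str>, home: &str) -> PathBuf {
    match socket_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        // Documented default: ~/hero/var/sockets
        _ => PathBuf::from(home).join("hero/var/sockets"),
    }
}

/// Build `<base>/<service>/<file>`, e.g. `<base>/hero_embedder/rpc.sock`.
fn socket_path(socket_dir: Option<&str>, home: &str, service: &str, file: &str) -> PathBuf {
    socket_base(socket_dir, home).join(service).join(file)
}

/// Convenience wrapper that reads the real environment.
fn rpc_socket_from_env() -> PathBuf {
    let dir = env::var("HERO_SOCKET_DIR").ok();
    let home = env::var("HOME").unwrap_or_default();
    socket_path(dir.as_deref(), &home, "hero_embedder", "rpc.sock")
}
```

Keeping the path logic in a pure function that takes the environment values as arguments makes it easy to test without touching the process environment.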

Features

  • Embedding Generation: BGE models (small/base) with INT8/FP32 options
  • Semantic Search: Fast cosine similarity search
  • Reranking: Cross-encoder model for improved accuracy
  • Namespaces: Isolated document collections for multi-tenant use
  • Persistence: Documents stored in redb databases
  • Web UI: Bootstrap-based admin dashboard with live updates
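
For readers unfamiliar with the similarity measure named under Semantic Search, here is a minimal brute-force sketch of cosine similarity and top-k ranking. It is an illustration of the general technique only; how the server actually stores and scans vectors is not described in this README, and the function names `cosine_similarity` and `top_k` are made up for the example.

```rust
/// Cosine similarity of two vectors: dot(a, b) / (|a| * |b|).
/// Returns None if the lengths differ, the input is empty, or a vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Score every document against the query and return the `k` best
/// as (document index, similarity), highest similarity first.
fn top_k(query: &[f32], docs: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .filter_map(|(i, d)| cosine_similarity(query, d).map(|s| (i, s)))
        .collect();
    // Sort descending by similarity; total_cmp gives a total order on floats.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

Documents whose dimension does not match the query are skipped rather than treated as errors, which is one possible design choice among several.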

Quality Levels

Quality is set per namespace at creation time. All four model variants (bge-small and bge-base, each with INT8 and FP32 weights) are loaded at startup.

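| Level | Name | Model | Weights | Dimensions | Use Case |
|-------|------|-------|---------|------------|----------|
| 1 | Fast | bge-small | INT8 | 384 | Real-time, low latency |
| 2 | Balanced | bge-small | FP32 | 384 | Default, good balance |
| 3 | Quality | bge-base | INT8 | 768 | Better accuracy |
| 4 | Best | bge-base | FP32 | 768 | Maximum quality |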
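
The table can be mirrored in client code as a small enum, which is handy when sizing vector buffers or validating a level before calling namespace.create. This is a sketch only: the type `QualityLevel` and its methods are hypothetical and are not taken from the SDK.

```rust
/// The four quality levels from the table (hypothetical type, not the SDK's).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QualityLevel {
    Fast = 1,
    Balanced = 2,
    Quality = 3,
    Best = 4,
}

impl QualityLevel {
    /// Convert the numeric level used on the wire; None if out of range.
    fn from_level(level: u8) -> Option<Self> {
        match level {
            1 => Some(Self::Fast),
            2 => Some(Self::Balanced),
            3 => Some(Self::Quality),
            4 => Some(Self::Best),
            _ => None,
        }
    }

    /// Model family used at this level.
    fn model(self) -> &'static str {
        match self {
            Self::Fast | Self::Balanced => "bge-small",
            Self::Quality | Self::Best => "bge-base",
        }
    }

    /// Weight precision used at this level.
    fn weights(self) -> &'static str {
        match self {
            Self::Fast | Self::Quality => "INT8",
            Self::Balanced | Self::Best => "FP32",
        }
    }

    /// Embedding dimension produced at this level.
    fn dimensions(self) -> usize {
        match self {
            Self::Fast | Self::Balanced => 384,
            Self::Quality | Self::Best => 768,
        }
    }
}
```

Note that vectors from levels 1 and 2 are not comparable with vectors from levels 3 and 4, since the dimensions differ (384 versus 768).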

API

The JSON-RPC 2.0 endpoint is served at POST /rpc. Parameters are passed as positional arrays, as in the examples below.

Server Info

{"jsonrpc": "2.0", "id": 1, "method": "info", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}

Embedding

{"jsonrpc": "2.0", "id": 1, "method": "embed", "params": [["hello world", "another text"]]}

Index Management

{"jsonrpc": "2.0", "id": 1, "method": "index.add", "params": [[{"id": "doc1", "text": "hello"}], "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.get", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.delete", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.count", "params": ["namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.clear", "params": ["namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "search", "params": ["query text", 10, "namespace", true]}

Rerank

{"jsonrpc": "2.0", "id": 1, "method": "rerank", "params": ["query", [{"id": "1", "text": "..."}], 5]}

Namespaces

{"jsonrpc": "2.0", "id": 1, "method": "namespace.list", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.create", "params": ["my-docs", 2]}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.delete", "params": ["my-docs"]}
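
The request bodies in this API section all share one envelope with positional params. The std-only sketch below builds such bodies by hand; in practice the SDK or a JSON library would normally do this. All function names here are hypothetical, and the meaning of the fourth `search` parameter is not documented in this README, so it is passed through as an unnamed boolean.

```rust
/// Serialize a string as a JSON string literal (quotes, backslashes, control chars).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build a JSON-RPC 2.0 request. `params` must already be serialized JSON values.
fn jsonrpc_request(id: u64, method: &str, params: &[String]) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":{},\"params\":[{}]}}",
        id,
        json_string(method),
        params.join(",")
    )
}

/// namespace.create takes [name, quality_level].
fn namespace_create(id: u64, name: &str, level: u8) -> String {
    jsonrpc_request(id, "namespace.create", &[json_string(name), level.to_string()])
}

/// search takes [query, k, namespace, flag]; the flag's meaning is an open
/// question here, it is simply forwarded as shown in the example above.
fn search(id: u64, query: &str, k: u32, namespace: &str, flag: bool) -> String {
    jsonrpc_request(
        id,
        "search",
        &[json_string(query), k.to_string(), json_string(namespace), flag.to_string()],
    )
}
```

Sending the body is left out on purpose: the transport (HTTP over the Unix socket, or the SDK's own client) is outside what this sketch tries to show.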

CLI Client

hero_embedder health
hero_embedder stats
hero_embedder embed "hello world"
hero_embedder search "query" -k 10
hero_embedder add doc1 "document text"
hero_embedder ns-list
hero_embedder ns-create my-docs

SDK Usage (Rust)

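use hero_embedder_sdk::HeroEmbedderClient;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Honor HERO_SOCKET_DIR; fall back to the documented default ~/hero/var/sockets.
    let socket_dir = match std::env::var("HERO_SOCKET_DIR") {
        Ok(dir) => dir,
        Err(_) => format!("{}/hero/var/sockets", std::env::var("HOME")?),
    };
    let socket = format!("{socket_dir}/hero_embedder/rpc.sock");
    let client = HeroEmbedderClient::connect_socket(&socket).await?;

    // Ask for the top 10 matches for "hello"; the optional arguments are left unset.
    let results = client.search("hello", 10, None, None).await?;
    Ok(())
}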

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| EMBEDDER_MODELS | ~/hero/var/embedder/models | Models directory |
| EMBEDDER_DATA | ~/hero/var/embedder/data | Data directory |
| HERO_EMBEDDERD_PORT | 8092 | TCP port hero_embedderd listens on |
| HERO_EMBEDDERD_URL | http://127.0.0.1:8092 | URL hero_embedder_server uses to reach the daemon |
| HERO_SOCKET_DIR | ~/hero/var/sockets | Base directory for Unix sockets |
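
One way to read these variables with their documented defaults is sketched below. The struct `EmbedderConfig` and the function `resolve_config` are hypothetical and are not the project's actual configuration code. The lookup is passed in as a closure so the logic can be exercised without modifying the process environment. Whether the real server derives the daemon URL from HERO_EMBEDDERD_PORT is not stated in this README, so the sketch treats the two variables as independent.

```rust
use std::path::PathBuf;

/// Resolved settings (hypothetical struct for illustration).
#[derive(Debug, PartialEq)]
struct EmbedderConfig {
    models_dir: PathBuf,
    data_dir: PathBuf,
    daemon_port: u16,
    daemon_url: String,
}

/// Resolve settings from an environment lookup, applying the documented defaults.
/// `get` returns the value of a variable if it is set; `home` is the home directory.
fn resolve_config(get: impl Fn(&str) -> Option<String>, home: &str) -> EmbedderConfig {
    let base = PathBuf::from(home).join("hero/var/embedder");
    EmbedderConfig {
        models_dir: get("EMBEDDER_MODELS")
            .map(PathBuf::from)
            .unwrap_or_else(|| base.join("models")),
        data_dir: get("EMBEDDER_DATA")
            .map(PathBuf::from)
            .unwrap_or_else(|| base.join("data")),
        // Fall back to 8092 when the variable is unset or not a valid port number.
        daemon_port: get("HERO_EMBEDDERD_PORT")
            .and_then(|p| p.parse::<u16>().ok())
            .unwrap_or(8092),
        daemon_url: get("HERO_EMBEDDERD_URL")
            .unwrap_or_else(|| "http://127.0.0.1:8092".to_string()),
    }
}
```

Against the real environment, the lookup would be `|k| std::env::var(k).ok()`.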

Data Storage

~/hero/var/embedder/
├── models/
│   ├── bge-small/
│   ├── bge-base/
│   └── bge-reranker-base/
└── data/
    ├── default/
    │   └── q2/
    │       └── rag.redb
    └── corpus.redb
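
The tree suggests that each namespace database lives at data/<namespace>/q<level>/rag.redb, with `default/q2` being the default namespace at quality level 2. That reading is an inference from the listing above, not something the README states, and the role of corpus.redb is not covered by it. Under that assumption, a path helper might look like this (the function name `namespace_db_path` is hypothetical):

```rust
use std::path::{Path, PathBuf};

/// Build `<data_dir>/<namespace>/q<level>/rag.redb`.
/// The layout is assumed from the directory tree, not confirmed by the source.
fn namespace_db_path(data_dir: &Path, namespace: &str, quality_level: u8) -> PathBuf {
    data_dir
        .join(namespace)
        .join(format!("q{}", quality_level))
        .join("rag.redb")
}
```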

License

MIT