despiegk f20aae1f9e
fix: rewrite KVS tests to test storage layer directly
Replace server-dependent HTTP integration tests with direct KvsStorage
tests using tempfile. All 8 tests now run without a server binary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-23 21:11:15 +03:00

HeroEmbedder

A fast, local embedding server for RAG applications. Provides dense vector embeddings, similarity search, and reranking via a JSON-RPC 2.0 API with namespace support for isolated document collections.

Architecture (v2)

hero_embedder/
├── crates/
│   ├── hero_embedder_lib/         # Library: server internals (ML, storage, retrieval)
│   ├── hero_embedder_server/      # Binary: JSON-RPC daemon (Unix socket)
│   ├── hero_embedder_client/      # Library: generated client, types, Rhai bindings
│   ├── hero_embedder/             # Binary: CLI using the client
│   └── hero_embedder_ui/          # Binary: Axum web dashboard using the client
├── scripts/                       # Build and deployment scripts
├── Cargo.toml                     # Workspace root
├── Makefile                       # Build orchestration
└── buildenv.sh                    # Environment configuration

Dependency Graph

hero_embedder_server
        ↑
hero_embedder_lib (server internals)

hero_embedder_client (generated client)
        ↑        ↑
        │        │
hero_embedder   hero_embedder_ui

Ports and Sockets

Service  Bind                                          Type         Default
Server   ~/hero/var/sockets/hero_embedder_server.sock  Unix Socket  Primary
UI       127.0.0.1:3753                                TCP          HTTP dashboard

The server binds to a Unix socket only by default. The UI serves the web dashboard on TCP port 3753.
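A minimal sketch of talking to the server socket using only the Rust standard library. It assumes the server speaks plain HTTP/1.1 over the Unix socket and serves the `/rpc` path; the helper names `build_rpc_request` and `call` are hypothetical and not part of the project.

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;

/// Build a minimal HTTP/1.1 POST request carrying a JSON-RPC body.
/// Assumption: the server accepts HTTP over its Unix socket at `/rpc`.
fn build_rpc_request(body: &str) -> String {
    format!(
        "POST /rpc HTTP/1.1\r\nHost: localhost\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(), // byte length, which is what Content-Length expects
        body
    )
}

/// Send one request over the Unix socket and return the raw HTTP response text.
fn call(socket_path: &str, body: &str) -> std::io::Result<String> {
    let mut stream = UnixStream::connect(socket_path)?;
    stream.write_all(build_rpc_request(body).as_bytes())?;
    let mut response = String::new();
    // `Connection: close` lets us read until the server closes the stream.
    stream.read_to_string(&mut response)?;
    Ok(response)
}

fn main() {
    let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
    let socket = format!("{home}/hero/var/sockets/hero_embedder_server.sock");
    let body = r#"{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}"#;
    match call(&socket, body) {
        Ok(response) => println!("{response}"),
        Err(err) => eprintln!("could not reach {socket}: {err}"),
    }
}
```

The response is printed as raw HTTP text (status line, headers, body); a real client would parse it, which is what hero_embedder_client is for.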

Features

  • Embedding Generation: BGE models (small/base) with INT8/FP32 options
  • Semantic Search: Fast cosine similarity search
  • Reranking: Cross-encoder model for improved accuracy
  • Namespaces: Isolated document collections for multi-tenant use
  • Persistence: Documents stored in redb databases
  • Web UI: Bootstrap-based admin dashboard with live updates
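To illustrate what cosine similarity search means, here is a brute-force sketch over in-memory vectors. It is not the project's implementation; the function names `cosine_similarity` and `top_k` and the tiny two-dimensional vectors are made up for the example.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Brute-force top-k: score every document, sort by descending score, keep k.
fn top_k<'a>(query: &[f32], docs: &'a [(&'a str, Vec<f32>)], k: usize) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = docs
        .iter()
        .map(|(id, vector)| (*id, cosine_similarity(query, vector)))
        .collect();
    scored.sort_by(|x, y| y.1.total_cmp(&x.1));
    scored.truncate(k);
    scored
}

fn main() {
    // Toy 2-dimensional "embeddings"; real ones have 384 or 768 dimensions.
    let docs = vec![
        ("doc1", vec![1.0, 0.0]),
        ("doc2", vec![0.0, 1.0]),
        ("doc3", vec![1.0, 1.0]),
    ];
    for (id, score) in top_k(&[1.0, 0.1], &docs, 2) {
        println!("{id}: {score:.3}");
    }
}
```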

Quick Start

# Full setup: install deps, download models, build, install
make setup

# Run server + UI
make run

# CLI health check
hero_embedder -s "unix://$HOME/hero/var/sockets/hero_embedder_server.sock" health

Quality Levels

Quality is set per namespace at creation time. All four model variants listed below are loaded at startup.

Level  Name      Model      Weights  Embeddings  Dimensions  Use Case
1      Fast      bge-small  INT8     INT8        384         Real-time, low latency
2      Balanced  bge-small  FP32     FP16        384         Default, good balance
3      Quality   bge-base   INT8     INT8        768         Better accuracy
4      Best      bge-base   FP32     FP16        768         Maximum quality
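The table above can be read as a small lookup, sketched below. The `QualityLevel` type and its methods are hypothetical; the byte counts cover the raw vector payload only (1 byte per INT8 component, 2 bytes per FP16 component) and ignore any storage overhead.

```rust
/// Hypothetical mirror of the quality table; not the project's own type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum QualityLevel {
    Fast,
    Balanced,
    Quality,
    Best,
}

impl QualityLevel {
    /// Map the numeric level (1-4) used by the API to a variant.
    fn from_level(level: u8) -> Option<Self> {
        match level {
            1 => Some(Self::Fast),
            2 => Some(Self::Balanced),
            3 => Some(Self::Quality),
            4 => Some(Self::Best),
            _ => None,
        }
    }

    fn model(self) -> &'static str {
        match self {
            Self::Fast | Self::Balanced => "bge-small",
            Self::Quality | Self::Best => "bge-base",
        }
    }

    fn dimensions(self) -> usize {
        match self {
            Self::Fast | Self::Balanced => 384,
            Self::Quality | Self::Best => 768,
        }
    }

    /// Raw size of one stored embedding: INT8 = 1 byte, FP16 = 2 bytes per component.
    fn bytes_per_embedding(self) -> usize {
        let bytes_per_component = match self {
            Self::Fast | Self::Quality => 1,  // INT8 embeddings
            Self::Balanced | Self::Best => 2, // FP16 embeddings
        };
        self.dimensions() * bytes_per_component
    }
}

fn main() {
    for level in 1..=4u8 {
        let q = QualityLevel::from_level(level).expect("levels 1-4 are valid");
        println!(
            "level {level}: {} ({} dims, {} bytes per vector)",
            q.model(),
            q.dimensions(),
            q.bytes_per_embedding()
        );
    }
}
```

Under these assumptions, levels 2 and 3 cost the same 768 bytes per vector, so the choice between them is about model accuracy versus numeric precision rather than storage.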

API

The server exposes a JSON-RPC 2.0 endpoint at POST /rpc. Parameters are passed positionally, as in the examples below.

Server Info

{"jsonrpc": "2.0", "id": 1, "method": "info", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}

Embedding

{"jsonrpc": "2.0", "id": 1, "method": "embed", "params": [["hello world", "another text"]]}

Index Management

{"jsonrpc": "2.0", "id": 1, "method": "index.add", "params": [[{"id": "doc1", "text": "hello"}], "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.get", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.delete", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.count", "params": ["namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.clear", "params": ["namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "search", "params": ["query text", 10, "namespace", true]}

Rerank

{"jsonrpc": "2.0", "id": 1, "method": "rerank", "params": ["query", [{"id": "1", "text": "..."}], 5]}

Namespaces

{"jsonrpc": "2.0", "id": 1, "method": "namespace.list", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.create", "params": ["my-docs", 2]}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.delete", "params": ["my-docs"]}
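All of the requests above share one envelope with positional `params`. The sketch below builds that envelope with the standard library only; `json_string`, `json_array`, and `rpc_request` are hypothetical helpers, and a real client would more likely use a JSON library such as serde_json.

```rust
/// Quote and escape a string for embedding in JSON.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Join already-encoded JSON values into a JSON array.
fn json_array(items: &[String]) -> String {
    format!("[{}]", items.join(", "))
}

/// Build a JSON-RPC 2.0 request; `params` holds already-encoded JSON values.
fn rpc_request(id: u64, method: &str, params: &[String]) -> String {
    format!(
        r#"{{"jsonrpc": "2.0", "id": {}, "method": {}, "params": {}}}"#,
        id,
        json_string(method),
        json_array(params)
    )
}

fn main() {
    // No parameters.
    println!("{}", rpc_request(1, "health", &[]));

    // One positional parameter: an array of texts to embed.
    let texts = json_array(&[json_string("hello world"), json_string("another text")]);
    println!("{}", rpc_request(1, "embed", &[texts]));

    // Two positional parameters: namespace name and quality level.
    println!(
        "{}",
        rpc_request(1, "namespace.create", &[json_string("my-docs"), "2".to_string()])
    );
}
```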

CLI Client

hero_embedder health
hero_embedder stats
hero_embedder embed "hello world"
hero_embedder search "query" -k 10
hero_embedder add doc1 "document text"
hero_embedder ns-list
hero_embedder ns-create my-docs

Client Usage (Rust)

use hero_embedder_client::HeroEmbedderClient;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Connect via Unix socket
    let client = HeroEmbedderClient::new("unix:///home/user/hero/var/sockets/hero_embedder_server.sock");

    // Or, alternatively, via HTTP:
    // let client = HeroEmbedderClient::new("http://localhost:3752");

    // Top 10 results for the query; the two `None` arguments leave the optional parameters unset.
    let _results = client.search("hello", 10, None, None).await?;
    Ok(())
}

Environment Variables

Variable         Default                     Description
EMBEDDER_MODELS  ~/hero/var/embedder/models  Models directory
EMBEDDER_DATA    ~/hero/var/embedder/data    Data directory
HERO_SECRET      (unset)                     JWT secret (enables auth when set)
HERO_AUTH_URL    (unset)                     URL to hero_auth OAuth2 server
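A sketch of how these variables and their defaults might be resolved. The `resolve_dir` helper is hypothetical, and treating an empty value the same as an unset one is an assumption rather than documented behaviour.

```rust
use std::env;
use std::path::PathBuf;

/// Return the override if present and non-empty, else `<home>/<default_suffix>`.
/// Assumption: an empty variable is treated the same as an unset one.
fn resolve_dir(value: Option<String>, home: &str, default_suffix: &str) -> PathBuf {
    match value {
        Some(v) if !v.is_empty() => PathBuf::from(v),
        _ => PathBuf::from(home).join(default_suffix),
    }
}

fn main() {
    let home = env::var("HOME").unwrap_or_else(|_| ".".to_string());
    let models = resolve_dir(env::var("EMBEDDER_MODELS").ok(), &home, "hero/var/embedder/models");
    let data = resolve_dir(env::var("EMBEDDER_DATA").ok(), &home, "hero/var/embedder/data");
    // Auth is described as enabled when HERO_SECRET is set.
    let auth_enabled = env::var("HERO_SECRET").map(|s| !s.is_empty()).unwrap_or(false);

    println!("models dir:   {}", models.display());
    println!("data dir:     {}", data.display());
    println!("auth enabled: {auth_enabled}");
}
```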

Data Storage

~/hero/var/embedder/
├── models/
│   ├── bge-small/
│   ├── bge-base/
│   └── bge-reranker-base/
└── data/
    ├── default/
    │   └── q2/
    │       └── rag.redb
    └── corpus.redb
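Reading the tree above, each namespace appears to get its own database under a directory named after its quality level. The sketch below encodes that reading; `namespace_db_path` is a hypothetical helper, and the assumption that `q2` stands for quality level 2 is inferred from the layout, not stated by the project.

```rust
use std::path::{Path, PathBuf};

/// Build `<data_dir>/<namespace>/q<quality>/rag.redb`.
/// Assumption: the `q2` directory in the layout encodes the namespace's quality level.
fn namespace_db_path(data_dir: &Path, namespace: &str, quality: u8) -> PathBuf {
    data_dir
        .join(namespace)
        .join(format!("q{quality}"))
        .join("rag.redb")
}

fn main() {
    let data_dir = Path::new("/home/user/hero/var/embedder/data");
    let path = namespace_db_path(data_dir, "default", 2);
    println!("{}", path.display());
}
```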

Building

make build          # Release build
make check          # Fast code check
make test           # Unit tests
make lint           # Clippy linter
make run            # Full stack (server + UI)
make run-server     # Server only
make run-ui         # UI only
make stop           # Stop all services

License

MIT