# HeroEmbedder

A fast, local embedding server for RAG applications. Provides dense vector embeddings, similarity search, and reranking via a JSON-RPC 2.0 API, with namespace support for isolated document collections.
## Architecture

```
hero_embedder/
├── crates/
│   ├── hero_embedder_lib/       # Library: server internals (ML, storage, retrieval)
│   ├── hero_embedder_server/    # Binary: JSON-RPC daemon (Unix socket)
│   ├── hero_embedder_sdk/       # Library: JSON-RPC client and types
│   ├── hero_embedder/           # Binary: CLI using the SDK
│   ├── hero_embedder_ui/        # Binary: Axum web dashboard using the SDK
│   └── hero_embedder_examples/  # Examples: SDK usage demonstrations
├── scripts/                     # Build and deployment scripts
├── Cargo.toml                   # Workspace root
├── Makefile                     # Build orchestration
└── buildenv.sh                  # Environment configuration
```
### Dependency Graph

```
hero_embedder_server
         ↑
hero_embedder_lib (server internals)

hero_embedder_sdk (JSON-RPC client)
      ↑                 ↑                    ↑
      │                 │                    │
hero_embedder    hero_embedder_ui    hero_embedder_examples
```
## Deployment Modes

hero_embedder can be deployed in three modes, controlled by flags on the
`service_embedder` Nu script. Choose the mode that fits your setup.
### All-in-one (default)

All three processes run under the same user in a single hero_proc service
(`hero_embedder`). Good for a standalone development machine or single-user
node where memory is not a concern.

```sh
service_embedder start     # register + start all three
service_embedder status    # query "hero_embedder" service
service_embedder stop      # stop + unregister
```
### `--embedderd` — ONNX daemon only

Starts only `hero_embedderd` (TCP, loads all ONNX models) under the service
`hero_embedderd`. Run it as root to share the loaded models across every
tenant process on the host and to minimise total RAM usage.

```sh
service_embedder start --embedderd --root    # daemon only, root's hero_proc
service_embedder status --embedderd --root
service_embedder stop --embedderd --root
```
After the daemon comes up, the script:

- Polls `http://127.0.0.1:<port>/health` (up to 60 s) to confirm the models finished loading.
- Reads the node's mycelium IPv6 address via `mycelium address --root` and prints the external URL other mycelium peers can use through `hero_router`.

> **Note:** `hero_embedderd` currently binds `127.0.0.1` only. For cross-machine mycelium access, configure `hero_router` to proxy the TCP port and use the printed URL as `--embedderd-url` on the client node.
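The polling step above can be sketched in Rust. This is an illustration, not the actual script: for brevity it only checks TCP reachability, whereas the real script performs an HTTP GET on `/health` and inspects the response.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

// Retry until the daemon accepts a connection or the attempt budget runs out.
fn wait_for_daemon(addr: &str, attempts: u32, interval: Duration) -> bool {
    let addr: SocketAddr = match addr.parse() {
        Ok(a) => a,
        Err(_) => return false,
    };
    for _ in 0..attempts {
        if TcpStream::connect_timeout(&addr, interval).is_ok() {
            return true;
        }
        std::thread::sleep(interval);
    }
    false
}

fn main() {
    // 60 attempts x 1 s matches the script's 60 s budget.
    let up = wait_for_daemon("127.0.0.1:8092", 60, Duration::from_secs(1));
    println!("daemon reachable: {up}");
}
```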
Environment variable set by the action:

| Variable | Value |
|---|---|
| `HERO_EMBEDDERD_PORT` | TCP port (default 8092) |
### `--userspace` — server + UI only

Starts `hero_embedder_server` + `hero_embedder_ui` under the service
`hero_embedder_userspace`, delegating all embed/rerank work to an already-
running `hero_embedderd`. No ONNX models are loaded in this process, so the
memory footprint is a fraction of the full stack.

```sh
# Same machine — delegates to root's embedderd on 127.0.0.1:8092
service_embedder start --userspace

# Cross-machine — embedderd reachable via mycelium through hero_router
service_embedder start --userspace \
    --embedderd-url http://[<mycelium_ipv6>]:8092

service_embedder status --userspace
service_embedder stop --userspace
```
Environment variable set by the action:

| Variable | Value |
|---|---|
| `HERO_EMBEDDERD_URL` | URL of the running hero_embedderd (default `http://127.0.0.1:8092`) |
### Typical split-mode deployment on one host

```
Root layer (heavy, shared):
  service_embedder start --embedderd --root
  └─ hero_embedderd          binds 127.0.0.1:8092, loads all ONNX models

Userspace layer (lightweight, per-tenant):
  service_embedder start --userspace
  ├─ hero_embedder_server    Unix socket, delegates embed/rerank to root
  └─ hero_embedder_ui        Unix socket, admin dashboard
```

This pattern lets you run many tenant instances while paying the model-load cost only once.
## Sockets

| Service | Socket Path | Type |
|---|---|---|
| Server | `$HERO_SOCKET_DIR/hero_embedder/rpc.sock` | Unix socket (OpenRPC / JSON-RPC 2.0) |
| UI | `$HERO_SOCKET_DIR/hero_embedder/ui.sock` | Unix socket (HTTP admin dashboard) |
| Daemon | TCP `127.0.0.1:8092` (configurable) | HTTP JSON-RPC + `/health` |

All server/UI sockets are Unix sockets only. External access is provided by hero_proxy.
The daemon TCP port is intended for loopback use; cross-node access goes through hero_router.
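Resolving the server's RPC socket path from `HERO_SOCKET_DIR` could look like the following sketch. `rpc_socket` is a hypothetical helper (the real resolution lives in the server); the fallback default is the documented `~/hero/var/sockets`.

```rust
use std::path::PathBuf;

// Build $HERO_SOCKET_DIR/hero_embedder/rpc.sock, falling back to the
// documented default under $HOME when HERO_SOCKET_DIR is unset.
fn rpc_socket(socket_dir: Option<&str>, home: &str) -> PathBuf {
    let base = match socket_dir {
        Some(d) => d.to_string(),
        None => format!("{home}/hero/var/sockets"),
    };
    PathBuf::from(base).join("hero_embedder").join("rpc.sock")
}

fn main() {
    let home = std::env::var("HOME").unwrap_or_default();
    let dir = std::env::var("HERO_SOCKET_DIR").ok();
    println!("{}", rpc_socket(dir.as_deref(), &home).display());
}
```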
## Features

- **Embedding Generation**: BGE models (small/base) with INT8/FP32 options
- **Semantic Search**: Fast cosine similarity search
- **Reranking**: Cross-encoder model for improved accuracy
- **Namespaces**: Isolated document collections for multi-tenant use
- **Persistence**: Documents stored in redb databases
- **Web UI**: Bootstrap-based admin dashboard with live updates
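Semantic search scores each stored document by the cosine similarity between its embedding and the query embedding. A minimal, illustrative sketch of that scoring (the real implementation lives in hero_embedder_lib and operates on quantized vectors):

```rust
// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for real vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    let query = [1.0, 0.0];
    // An identical vector scores 1.0; an orthogonal one scores 0.0.
    println!("{}", cosine_similarity(&query, &[1.0, 0.0]));
    println!("{}", cosine_similarity(&query, &[0.0, 1.0]));
}
```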
## Quick Start

```sh
# Full setup: install deps, download models, build, install
make setup

# Run server + UI
make run

# CLI health check
hero_embedder health
```
## Quality Levels

Quality is set per namespace at creation time. All four models are loaded at startup.
| Level | Name | Model | Weights | Embeddings | Dimensions | Use Case |
|---|---|---|---|---|---|---|
| 1 | Fast | bge-small | INT8 | INT8 | 384 | Real-time, low latency |
| 2 | Balanced | bge-small | FP32 | FP16 | 384 | Default, good balance |
| 3 | Quality | bge-base | INT8 | INT8 | 768 | Better accuracy |
| 4 | Best | bge-base | FP32 | FP16 | 768 | Maximum quality |
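As a rough guide to what the quantization choice means for storage, each stored embedding costs dimensions × bytes per element (1 byte for INT8, 2 for FP16). A back-of-the-envelope sketch; index and redb overhead are not included:

```rust
// Raw size of one stored embedding vector, per the Quality Levels table.
fn embedding_bytes(dims: usize, bytes_per_elem: usize) -> usize {
    dims * bytes_per_elem
}

fn main() {
    println!("L1 Fast:     {} B", embedding_bytes(384, 1)); // INT8
    println!("L2 Balanced: {} B", embedding_bytes(384, 2)); // FP16
    println!("L3 Quality:  {} B", embedding_bytes(768, 1)); // INT8
    println!("L4 Best:     {} B", embedding_bytes(768, 2)); // FP16
}
```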
## API

JSON-RPC 2.0 endpoint at `POST /rpc`.

### Server Info

```json
{"jsonrpc": "2.0", "id": 1, "method": "info", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}
```
### Embedding

```json
{"jsonrpc": "2.0", "id": 1, "method": "embed", "params": [["hello world", "another text"]]}
```
### Index Management

```json
{"jsonrpc": "2.0", "id": 1, "method": "index.add", "params": [[{"id": "doc1", "text": "hello"}], "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.get", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.delete", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.count", "params": ["namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.clear", "params": ["namespace"]}
```
### Search

```json
{"jsonrpc": "2.0", "id": 1, "method": "search", "params": ["query text", 10, "namespace", true]}
```
### Rerank

```json
{"jsonrpc": "2.0", "id": 1, "method": "rerank", "params": ["query", [{"id": "1", "text": "..."}], 5]}
```
### Namespaces

```json
{"jsonrpc": "2.0", "id": 1, "method": "namespace.list", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.create", "params": ["my-docs", 2]}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.delete", "params": ["my-docs"]}
```
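All calls share the same JSON-RPC 2.0 envelope. A minimal sketch of building it by hand; `rpc_request` is a hypothetical helper, and in practice a client should use the SDK or a JSON library for proper parameter escaping:

```rust
// Assemble a JSON-RPC 2.0 request string. `params_json` must already be
// valid JSON (e.g. an array literal); no escaping is performed here.
fn rpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc": "2.0", "id": {id}, "method": "{method}", "params": {params_json}}}"#
    )
}

fn main() {
    println!("{}", rpc_request(1, "namespace.create", r#"["my-docs", 2]"#));
}
```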
## CLI Client

```sh
hero_embedder health
hero_embedder stats
hero_embedder embed "hello world"
hero_embedder search "query" -k 10
hero_embedder add doc1 "document text"
hero_embedder ns-list
hero_embedder ns-create my-docs
```
## SDK Usage (Rust)

```rust
use hero_embedder_sdk::HeroEmbedderClient;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let socket = format!(
        "{}/hero/var/sockets/hero_embedder/rpc.sock",
        std::env::var("HOME")?
    );
    let client = HeroEmbedderClient::new(format!("unix://{socket}"));
    let results = client.search("hello", 10, None, None).await?;
    Ok(())
}
```
## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `EMBEDDER_MODELS` | `~/hero/var/embedder/models` | Models directory |
| `EMBEDDER_DATA` | `~/hero/var/embedder/data` | Data directory |
| `HERO_EMBEDDERD_PORT` | `8092` | TCP port hero_embedderd listens on |
| `HERO_EMBEDDERD_URL` | `http://127.0.0.1:8092` | URL hero_embedder_server uses to reach the daemon |
| `HERO_SOCKET_DIR` | `~/hero/var/sockets` | Base directory for Unix sockets |
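A consumer might resolve `HERO_EMBEDDERD_PORT` against its documented default along these lines. This is a sketch: `embedderd_port` is a hypothetical helper, and the real parsing lives in the server and scripts.

```rust
// Parse the port from an optional env-var value, falling back to 8092
// when the variable is unset or not a valid u16.
fn embedderd_port(raw: Option<&str>) -> u16 {
    raw.and_then(|s| s.parse().ok()).unwrap_or(8092)
}

fn main() {
    let port = embedderd_port(std::env::var("HERO_EMBEDDERD_PORT").ok().as_deref());
    println!("hero_embedderd port: {port}");
}
```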
## Data Storage

```
~/hero/var/embedder/
├── models/
│   ├── bge-small/
│   ├── bge-base/
│   └── bge-reranker-base/
└── data/
    ├── default/
    │   └── q2/
    │       └── rag.redb
    └── corpus.redb
```
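The per-namespace layout above (`data/<namespace>/q<level>/rag.redb`) suggests path construction like the following sketch. `namespace_db` is a hypothetical helper; the actual path logic lives in hero_embedder_lib.

```rust
use std::path::PathBuf;

// Build data/<namespace>/q<level>/rag.redb under the data directory,
// as suggested by the Data Storage layout.
fn namespace_db(data_dir: &str, namespace: &str, quality: u8) -> PathBuf {
    PathBuf::from(data_dir)
        .join(namespace)
        .join(format!("q{quality}"))
        .join("rag.redb")
}

fn main() {
    let p = namespace_db("/home/user/hero/var/embedder/data", "default", 2);
    println!("{}", p.display());
}
```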
## Building

```sh
make build       # Release build
make check       # Fast code check
make test        # Unit tests
make lint        # Clippy linter
make run         # Full stack (server + UI)
make run-server  # Server only
make run-ui      # UI only
make stop        # Stop all services
```
## License
MIT