HeroEmbedder
A fast, local embedding server for RAG applications. Provides dense vector embeddings, similarity search, and reranking via a JSON-RPC 2.0 API with namespace support for isolated document collections.
Architecture
hero_embedder/
├── crates/
│   ├── hero_embedder_lib/       # Library: server internals (ML, storage, retrieval)
│   ├── hero_embedder_server/    # Binary: JSON-RPC daemon (Unix socket)
│   ├── hero_embedder_sdk/       # Library: JSON-RPC client and types
│   ├── hero_embedder/           # Binary: CLI using the SDK
│   ├── hero_embedder_ui/        # Binary: Axum web dashboard using the SDK
│   └── hero_embedder_examples/  # Examples: SDK usage demonstrations
├── scripts/                     # Build and deployment scripts
├── Cargo.toml                   # Workspace root
├── Makefile                     # Build orchestration
└── buildenv.sh                  # Environment configuration
Dependency Graph
hero_embedder_server
        ↑
hero_embedder_lib (server internals)

hero_embedder_sdk (JSON-RPC client)
      ↑               ↑                    ↑
      │               │                    │
hero_embedder  hero_embedder_ui  hero_embedder_examples
Sockets
| Service | Socket Path | Type |
|---|---|---|
| Server | ~/hero/var/sockets/hero_embedder_server.sock | Unix Socket (OpenRPC) |
| UI | ~/hero/var/sockets/hero_embedder_ui.sock | Unix Socket (HTTP + /rpc proxy) |
All services bind to Unix sockets only. External access is provided by hero_proxy.
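As a small illustration of the socket layout in the table, the helper below builds the two socket paths from a home directory. The `Service` enum and `socket_path` function are hypothetical names introduced here and are not part of the SDK; taking `home` as a parameter instead of reading `$HOME` is only a convenience for testing.

```rust
use std::path::PathBuf;

/// Which service socket to resolve. Hypothetical type for this sketch;
/// the variants mirror the rows of the Sockets table.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Service {
    Server,
    Ui,
}

/// Build the Unix socket path for a service under `home`.
/// The `hero/var/sockets/<name>.sock` layout is taken from the table above.
pub fn socket_path(home: &str, service: Service) -> PathBuf {
    let file = match service {
        Service::Server => "hero_embedder_server.sock",
        Service::Ui => "hero_embedder_ui.sock",
    };
    PathBuf::from(home)
        .join("hero")
        .join("var")
        .join("sockets")
        .join(file)
}
```

The resulting path can be handed to `std::os::unix::net::UnixStream::connect` or to the SDK client shown under SDK Usage. The wire format spoken over the socket is not specified in this section, so the sketch stops at path resolution.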
Features
- Embedding Generation: BGE models (small/base) with INT8/FP32 options
- Semantic Search: Fast cosine similarity search
- Reranking: Cross-encoder model for improved accuracy
- Namespaces: Isolated document collections for multi-tenant use
- Persistence: Documents stored in redb databases
- Web UI: Bootstrap-based admin dashboard with live updates
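To make the "cosine similarity search" feature concrete, here is a minimal brute-force sketch in plain Rust. It is not the server's implementation (which may use quantized vectors or an index structure); the function names `cosine_similarity` and `top_k` are illustrative assumptions.

```rust
/// Cosine similarity between two vectors.
/// Returns None when the lengths differ, the input is empty,
/// or either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let mut dot = 0.0f32;
    let mut norm_a = 0.0f32;
    let mut norm_b = 0.0f32;
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a.sqrt() * norm_b.sqrt()))
}

/// Score every document against the query and return the `k` best
/// matches as (document index, similarity), highest similarity first.
/// Documents that cannot be scored are skipped.
pub fn top_k(query: &[f32], docs: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .filter_map(|(i, d)| cosine_similarity(query, d).map(|s| (i, s)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A linear scan like this is O(n · d) per query, which is usually fine for small namespaces. A reranking pass would then reorder only the returned top-k candidates.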
Quick Start
# Full setup: install deps, download models, build, install
make setup
# Run server + UI
make run
# CLI health check
hero_embedder health
Quality Levels
Quality is set per namespace when the namespace is created. All four embedding models listed below are loaded at startup.
| Level | Name | Model | Weights | Embeddings | Dimensions | Use Case |
|---|---|---|---|---|---|---|
| 1 | Fast | bge-small | INT8 | INT8 | 384 | Real-time, low latency |
| 2 | Balanced | bge-small | FP32 | FP16 | 384 | Default, good balance |
| 3 | Quality | bge-base | INT8 | INT8 | 768 | Better accuracy |
| 4 | Best | bge-base | FP32 | FP16 | 768 | Maximum quality |
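The table can be read as a lookup from level number to model, dimensions, and stored precision. The sketch below encodes it as a Rust enum; `QualityLevel` and its methods are hypothetical names for illustration, not types exported by the crates. The per-vector size is raw payload only (1 byte per INT8 value, 2 bytes per FP16 value) and ignores any storage overhead.

```rust
/// The four quality levels from the table. Hypothetical type for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum QualityLevel {
    Fast,
    Balanced,
    Quality,
    Best,
}

impl QualityLevel {
    /// Map the numeric level (1 to 4) used in the table to a variant.
    pub fn from_level(level: u8) -> Option<Self> {
        match level {
            1 => Some(Self::Fast),
            2 => Some(Self::Balanced),
            3 => Some(Self::Quality),
            4 => Some(Self::Best),
            _ => None,
        }
    }

    /// Model name as listed in the "Model" column.
    pub fn model(self) -> &'static str {
        match self {
            Self::Fast | Self::Balanced => "bge-small",
            Self::Quality | Self::Best => "bge-base",
        }
    }

    /// Embedding dimensions as listed in the "Dimensions" column.
    pub fn dimensions(self) -> usize {
        match self {
            Self::Fast | Self::Balanced => 384,
            Self::Quality | Self::Best => 768,
        }
    }

    /// True when embeddings are stored as FP16, false when stored as INT8.
    pub fn stores_fp16(self) -> bool {
        matches!(self, Self::Balanced | Self::Best)
    }

    /// Raw bytes needed for one stored embedding vector.
    pub fn bytes_per_vector(self) -> usize {
        let bytes_per_value = if self.stores_fp16() { 2 } else { 1 };
        self.dimensions() * bytes_per_value
    }
}
```

Under these assumptions a level 1 vector takes 384 bytes and a level 4 vector takes 1536 bytes, so the choice of level affects disk usage by up to a factor of four in addition to latency and accuracy.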
API
JSON-RPC 2.0 endpoint at POST /rpc
Server Info
{"jsonrpc": "2.0", "id": 1, "method": "info", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}
Embedding
{"jsonrpc": "2.0", "id": 1, "method": "embed", "params": [["hello world", "another text"]]}
Index Management
{"jsonrpc": "2.0", "id": 1, "method": "index.add", "params": [[{"id": "doc1", "text": "hello"}], "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.get", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.delete", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.count", "params": ["namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.clear", "params": ["namespace"]}
Search
{"jsonrpc": "2.0", "id": 1, "method": "search", "params": ["query text", 10, "namespace", true]}
Rerank
{"jsonrpc": "2.0", "id": 1, "method": "rerank", "params": ["query", [{"id": "1", "text": "..."}], 5]}
Namespaces
{"jsonrpc": "2.0", "id": 1, "method": "namespace.list", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.create", "params": ["my-docs", 2]}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.delete", "params": ["my-docs"]}
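All of the requests above share one envelope: `jsonrpc`, `id`, `method`, and a positional `params` array. For callers that do not use the SDK, the following std-only sketch shows one way to assemble such a request. The helper names (`json_string`, `rpc_request`, `index_count_request`) are assumptions made for this example; in real code a JSON library such as `serde_json` would normally handle the escaping.

```rust
/// Encode a Rust string as a JSON string literal, escaping quotes,
/// backslashes, and control characters.
pub fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build a JSON-RPC 2.0 request envelope.
/// `params_json` must already be valid JSON (a positional array, as in
/// the examples above). `method` is assumed to be a plain identifier
/// such as "index.count" and is not escaped.
pub fn rpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc": "2.0", "id": {id}, "method": "{method}", "params": {params_json}}}"#
    )
}

/// Request for `index.count`, whose single parameter is the namespace name.
pub fn index_count_request(id: u64, namespace: &str) -> String {
    let params = format!("[{}]", json_string(namespace));
    rpc_request(id, "index.count", &params)
}
```

Calling `index_count_request(1, "namespace")` reproduces the `index.count` example line shown under Index Management.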
CLI Client
hero_embedder health
hero_embedder stats
hero_embedder embed "hello world"
hero_embedder search "query" -k 10
hero_embedder add doc1 "document text"
hero_embedder ns-list
hero_embedder ns-create my-docs
SDK Usage (Rust)
use hero_embedder_sdk::HeroEmbedderClient;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // The server listens on a Unix socket under $HOME (see Sockets).
    let socket = format!(
        "{}/hero/var/sockets/hero_embedder_server.sock",
        std::env::var("HOME")?
    );
    let client = HeroEmbedderClient::new(format!("unix://{socket}"));

    // Search for "hello" with a limit of 10; the optional arguments are left unset.
    let results = client.search("hello", 10, None, None).await?;
    Ok(())
}
Environment Variables
| Variable | Default | Description |
|---|---|---|
| EMBEDDER_MODELS | ~/hero/var/embedder/models | Models directory |
| EMBEDDER_DATA | ~/hero/var/embedder/data | Data directory |
| HERO_SECRET | (unset) | JWT secret (enables auth when set) |
| HERO_AUTH_URL | (unset) | URL to hero_auth OAuth2 server |
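The defaults in the table could be resolved along the following lines. This is a sketch, not the server's configuration code: `EmbedderConfig`, `from_lookup`, and `expand_home` are hypothetical names, and treating any present value of `HERO_SECRET` (including an empty string) as "set" is an assumption.

```rust
use std::env;
use std::path::PathBuf;

/// Expand a leading "~/" using the given home directory.
fn expand_home(path: &str, home: &str) -> PathBuf {
    match path.strip_prefix("~/") {
        Some(rest) => PathBuf::from(home).join(rest),
        None => PathBuf::from(path),
    }
}

/// Resolved settings. Hypothetical struct for this sketch.
#[derive(Debug, PartialEq)]
pub struct EmbedderConfig {
    pub models_dir: PathBuf,
    pub data_dir: PathBuf,
    pub auth_enabled: bool,
    pub auth_url: Option<String>,
}

impl EmbedderConfig {
    /// Resolve settings through `lookup`, which returns a variable's value
    /// if it is set. Passing the lookup in makes the logic testable
    /// without touching the process environment.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>, home: &str) -> Self {
        let dir = |var: &str, default: &str| {
            let raw = lookup(var).unwrap_or_else(|| default.to_string());
            expand_home(&raw, home)
        };
        Self {
            models_dir: dir("EMBEDDER_MODELS", "~/hero/var/embedder/models"),
            data_dir: dir("EMBEDDER_DATA", "~/hero/var/embedder/data"),
            // Assumption: any present value counts as "set".
            auth_enabled: lookup("HERO_SECRET").is_some(),
            auth_url: lookup("HERO_AUTH_URL"),
        }
    }

    /// Resolve settings from the real process environment.
    pub fn from_env() -> Self {
        let home = env::var("HOME").unwrap_or_default();
        Self::from_lookup(|key| env::var(key).ok(), &home)
    }
}
```

With no variables set, both directories fall back to the defaults under `~/hero/var/embedder` and authentication stays disabled.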
Data Storage
~/hero/var/embedder/
├── models/
│   ├── bge-small/
│   ├── bge-base/
│   └── bge-reranker-base/
└── data/
    ├── default/
    │   └── q2/
    │       └── rag.redb
    └── corpus.redb
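The tree suggests that each namespace gets its own directory with a per-quality subdirectory holding a `rag.redb` file. The sketch below derives such a path. It rests on two assumptions that this section does not confirm: that `q2` encodes the namespace's quality level, and that the pattern `<data>/<namespace>/q<level>/rag.redb` holds for namespaces other than `default`. The function name and the name validation rule are also hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Derive the database path for a namespace, following the layout
/// `<data_dir>/<namespace>/q<level>/rag.redb` seen in the tree above.
/// ASSUMPTION: `q<level>` is the namespace's quality level (1 to 4).
/// Returns None for an out-of-range level or a namespace name that
/// contains anything other than ASCII letters, digits, '-' or '_'
/// (a conservative rule chosen here so names cannot escape `data_dir`).
pub fn namespace_db_path(data_dir: &Path, namespace: &str, quality_level: u8) -> Option<PathBuf> {
    let name_ok = !namespace.is_empty()
        && namespace
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
    if !name_ok || !(1..=4).contains(&quality_level) {
        return None;
    }
    Some(
        data_dir
            .join(namespace)
            .join(format!("q{quality_level}"))
            .join("rag.redb"),
    )
}
```

Keeping one database file per namespace means a namespace can be backed up or removed by handling a single directory, which fits the isolation described under Features.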
Building
make build # Release build
make check # Fast code check
make test # Unit tests
make lint # Clippy linter
make run # Full stack (server + UI)
make run-server # Server only
make run-ui # UI only
make stop # Stop all services
License
MIT