HeroEmbedder
A fast, local embedding server for RAG applications. Provides dense vector embeddings, similarity search, and reranking via a JSON-RPC 2.0 API with namespace support for isolated document collections.
Features
- Embedding Generation: BGE models (small/base) with INT8/FP32 options
- Semantic Search: Fast cosine similarity search
- Reranking: Cross-encoder model for improved accuracy
- Namespaces: Isolated document collections for multi-tenant use
- Persistence: Documents stored in redb databases
- Web UI: Bootstrap-based interface with live updates
Installation
Quick Install (Recommended)
Download and run the installer script:
curl -sSL https://forge.ourworld.tf/geomind_code/hero_embedder/raw/branch/main/scripts/install_download_run.sh | bash
This will:
- Detect your platform (macOS arm64 or Linux amd64)
- Download embedderd (server) and embedder (CLI client)
- Install to ~/hero/bin
- Configure your PATH
Manual Download
# macOS (Apple Silicon)
curl -OJ https://forge.ourworld.tf/api/packages/geomind_code/generic/hero_embedder/dev/embedderd-darwin-arm64
curl -OJ https://forge.ourworld.tf/api/packages/geomind_code/generic/hero_embedder/dev/embedder-darwin-arm64
# Linux (x86_64)
curl -OJ https://forge.ourworld.tf/api/packages/geomind_code/generic/hero_embedder/dev/embedderd-linux-amd64
curl -OJ https://forge.ourworld.tf/api/packages/geomind_code/generic/hero_embedder/dev/embedder-linux-amd64
chmod +x embedderd* embedder*
Build from Source
# 1. Install ONNX Runtime
brew install onnxruntime # macOS
# or see https://onnxruntime.ai for Linux
# 2. Download models
./download_models.sh
# 3. Build and run
./run.sh
Quick Start
# Start the server
embedderd
# In another terminal, use the CLI
embedder embed "hello world"
embedder search "query"
embedder --help
The server listens on http://localhost:3752
Quality Levels
Quality is set per namespace at creation time. All four model variants (bge-small and bge-base, each with INT8 and FP32 weights) are loaded at startup, so every level is available immediately.
| Level | Name | Model | Weights | Embeddings | Dimensions | Use Case |
|---|---|---|---|---|---|---|
| 1 | Fast | bge-small | INT8 | INT8 | 384 | Real-time, low latency |
| 2 | Balanced | bge-small | FP32 | FP16 | 384 | Default, good balance |
| 3 | Quality | bge-base | INT8 | INT8 | 768 | Better accuracy |
| 4 | Best | bge-base | FP32 | FP16 | 768 | Maximum quality |
The reranker is available at every quality level via the use_reranker search parameter.
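To make the table concrete, here is a minimal sketch that encodes the four levels as data and derives the raw size of one stored embedding. The type and function names (`Precision`, `QualityProfile`, `profile_for_level`) are hypothetical and are not taken from the crate's `config.rs`. The byte counts cover only the vector payload and assume no per-vector scale factors or metadata.

```rust
/// Numeric precision used for model weights or stored embeddings.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Int8,
    Fp16,
    Fp32,
}

impl Precision {
    /// Bytes needed to store one value at this precision.
    fn bytes(self) -> usize {
        match self {
            Precision::Int8 => 1,
            Precision::Fp16 => 2,
            Precision::Fp32 => 4,
        }
    }
}

/// One row of the quality table (hypothetical type, not the crate's own).
#[derive(Debug, Clone, Copy, PartialEq)]
struct QualityProfile {
    level: u8,
    name: &'static str,
    model: &'static str,
    weights: Precision,
    embeddings: Precision,
    dimensions: usize,
}

impl QualityProfile {
    /// Raw payload size of one stored embedding, ignoring any metadata.
    fn bytes_per_vector(&self) -> usize {
        self.dimensions * self.embeddings.bytes()
    }
}

/// Look up a profile by level (1-4); returns None for anything else.
fn profile_for_level(level: u8) -> Option<QualityProfile> {
    let (name, model, weights, embeddings, dimensions) = match level {
        1 => ("Fast", "bge-small", Precision::Int8, Precision::Int8, 384),
        2 => ("Balanced", "bge-small", Precision::Fp32, Precision::Fp16, 384),
        3 => ("Quality", "bge-base", Precision::Int8, Precision::Int8, 768),
        4 => ("Best", "bge-base", Precision::Fp32, Precision::Fp16, 768),
        _ => return None,
    };
    Some(QualityProfile { level, name, model, weights, embeddings, dimensions })
}
```

Under these assumptions a level 1 vector takes 384 bytes and a level 4 vector takes 1536 bytes, which gives a rough idea of the storage cost of moving a namespace up the quality scale.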
Namespaces
Namespaces provide isolated document collections. Each namespace has its own index and storage.
// List namespaces
{"jsonrpc": "2.0", "id": 1, "method": "namespace.list", "params": []}
// Create namespace with quality level (1-4)
{"jsonrpc": "2.0", "id": 1, "method": "namespace.create", "params": ["my-docs", 2]}
// Delete namespace
{"jsonrpc": "2.0", "id": 1, "method": "namespace.delete", "params": ["my-docs"]}
All document operations accept an optional namespace parameter (defaults to "default"):
// Add to specific namespace
{"jsonrpc": "2.0", "id": 1, "method": "index.add", "params": [[{"id": "1", "text": "..."}], "my-docs"]}
// Search in namespace
{"jsonrpc": "2.0", "id": 1, "method": "search", "params": ["query", 10, "my-docs"]}
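The requests above use positional parameters, so argument order matters. The following sketch builds such request bodies with the standard library only. The helper names (`json_string`, `rpc_request`, `search_request`, `namespace_create_request`) are hypothetical and not part of the SDK. When no namespace is given, the sketch sends "default" explicitly, which should be equivalent to omitting the parameter according to the description above.

```rust
/// Escape a string as a JSON string literal (surrounding quotes included).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Wrap a method and already-serialized positional params in a JSON-RPC 2.0 envelope.
fn rpc_request(id: u64, method: &str, params: &[String]) -> String {
    format!(
        "{{\"jsonrpc\": \"2.0\", \"id\": {}, \"method\": {}, \"params\": [{}]}}",
        id,
        json_string(method),
        params.join(", ")
    )
}

/// Build a `search` request; the namespace falls back to "default" when omitted.
fn search_request(id: u64, query: &str, k: usize, namespace: Option<&str>) -> String {
    let ns = namespace.unwrap_or("default");
    rpc_request(id, "search", &[json_string(query), k.to_string(), json_string(ns)])
}

/// Build a `namespace.create` request with a quality level (1-4).
fn namespace_create_request(id: u64, name: &str, quality: u8) -> String {
    rpc_request(id, "namespace.create", &[json_string(name), quality.to_string()])
}
```

Calling `search_request(1, "query", 10, Some("my-docs"))` reproduces the search request shown above. For anything beyond a sketch, a JSON library is a safer choice than hand-built strings.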
API
JSON-RPC 2.0 endpoint at POST /rpc
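For readers who want to call the endpoint without the SDK, here is a minimal sketch that posts a JSON body to /rpc over a plain TCP connection using only the standard library. It assumes unencrypted HTTP on the address given in Quick Start and returns the raw response (status line, headers, and body) without parsing it. The function names `build_post` and `post_rpc` are hypothetical.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Format a minimal HTTP/1.1 POST request carrying a JSON body.
fn build_post(host: &str, path: &str, body: &str) -> String {
    format!(
        "POST {path} HTTP/1.1\r\nHost: {host}\r\nContent-Type: application/json\r\nContent-Length: {len}\r\nConnection: close\r\n\r\n{body}",
        path = path,
        host = host,
        len = body.len(), // length in bytes, as HTTP requires
        body = body
    )
}

/// Send a JSON-RPC body to `<addr>/rpc` and return the raw HTTP response text.
/// `addr` is a host:port pair such as "localhost:3752".
fn post_rpc(addr: &str, body: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(build_post(addr, "/rpc", body).as_bytes())?;
    // "Connection: close" lets us read until the server closes the socket.
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

With the server running, `post_rpc("localhost:3752", r#"{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}"#)` would send the health request listed under Server Info. An HTTP client library handles redirects, chunked responses, and timeouts far better than this sketch.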
Server Info
{"jsonrpc": "2.0", "id": 1, "method": "info", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}
Embedding
{"jsonrpc": "2.0", "id": 1, "method": "embed", "params": [["hello world", "another text"]]}
Index Management
// Add documents (with optional namespace)
{"jsonrpc": "2.0", "id": 1, "method": "index.add", "params": [[{"id": "doc1", "text": "hello"}], "namespace"]}
// Get document
{"jsonrpc": "2.0", "id": 1, "method": "index.get", "params": ["doc1", "namespace"]}
// Delete document
{"jsonrpc": "2.0", "id": 1, "method": "index.delete", "params": ["doc1", "namespace"]}
// Count documents
{"jsonrpc": "2.0", "id": 1, "method": "index.count", "params": ["namespace"]}
// Clear namespace
{"jsonrpc": "2.0", "id": 1, "method": "index.clear", "params": ["namespace"]}
Search
// Search with optional namespace and reranker
{"jsonrpc": "2.0", "id": 1, "method": "search", "params": ["query text", 10, "namespace", true]}
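As background for what a cosine similarity search computes, the sketch below scores every stored vector against a query vector and keeps the k best. It is a brute-force illustration only: how HeroEmbedder actually indexes and scans vectors is not described here, and the names `cosine_similarity` and `top_k` are hypothetical.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must share a dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score every document against the query and return the `k` best (id, score) pairs,
/// highest score first.
fn top_k<'a>(query: &[f32], docs: &'a [(&'a str, Vec<f32>)], k: usize) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = docs
        .iter()
        .map(|(id, v)| (*id, cosine_similarity(query, v)))
        .collect();
    // Sort descending by score; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

If embeddings are normalized to unit length when they are stored, the division can be skipped and the dot product alone gives the same ranking.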
Rerank
{"jsonrpc": "2.0", "id": 1, "method": "rerank", "params": ["query", [{"id": "1", "text": "..."}], 5]}
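Reranking is a second pass: each candidate text is scored against the query and the list is reordered. The sketch below shows that control flow with a pluggable scoring function. The word-overlap scorer is a toy stand-in for the cross-encoder model, and reading the trailing number in the request as "keep this many results" is an assumption. The names `overlap_score` and `rerank` are hypothetical.

```rust
use std::collections::HashSet;

/// Toy stand-in for a cross-encoder: fraction of query words found in the text.
fn overlap_score(query: &str, text: &str) -> f32 {
    let text_words: HashSet<String> =
        text.split_whitespace().map(|w| w.to_lowercase()).collect();
    let query_words: Vec<String> =
        query.split_whitespace().map(|w| w.to_lowercase()).collect();
    if query_words.is_empty() {
        return 0.0;
    }
    let hits = query_words.iter().filter(|w| text_words.contains(*w)).count();
    hits as f32 / query_words.len() as f32
}

/// Re-score (id, text) candidates against the query and keep the `top_n` best,
/// highest score first.
fn rerank<'a, F>(
    query: &str,
    candidates: &[(&'a str, &'a str)],
    top_n: usize,
    score: F,
) -> Vec<(&'a str, f32)>
where
    F: Fn(&str, &str) -> f32,
{
    let mut scored: Vec<(&'a str, f32)> = candidates
        .iter()
        .map(|&(id, text)| (id, score(query, text)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_n);
    scored
}
```

A typical pipeline retrieves a generous candidate set with vector search first, then passes only those candidates to the reranker, because scoring each query-document pair is much more expensive than comparing precomputed vectors.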
Corpus (TriviaQA)
// Download TriviaQA dataset
{"jsonrpc": "2.0", "id": 1, "method": "corpus.download", "params": []}
// Load into namespace
{"jsonrpc": "2.0", "id": 1, "method": "corpus.load", "params": [1000, "namespace"]}
// Benchmark
{"jsonrpc": "2.0", "id": 1, "method": "corpus.benchmark", "params": [null, 10, "namespace"]}
CLI Client
# Build with client feature
cargo build --features client
# Commands
heroembedder health
heroembedder stats
heroembedder embed "hello world"
heroembedder search "query" -k 10
heroembedder add doc1 "document text"
SDK Usage (Rust)
use hero_embedder::sdk::HeroEmbedderClient;
// Note: `Doc` must also be in scope; its import path is not shown in this snippet.

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = HeroEmbedderClient::new("http://localhost:3752/rpc");

    // List namespaces
    let namespaces = client.namespace_list().await?;

    // Add documents
    let docs = vec![
        Doc { id: "1".into(), text: "Hello world".into() },
    ];
    client.index_add(docs, Some("my-namespace")).await?;

    // Search (top 10 results; the last argument is left as None)
    let results = client.search("hello", 10, Some("my-namespace"), None).await?;

    Ok(())
}
Environment Variables
| Variable | Default | Description |
|---|---|---|
| EMBEDDER_MODELS | ~/hero/var/embedder/models | Models directory |
| EMBEDDER_DATA | ~/hero/var/embedder/data | Data directory |
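The lookup described by this table can be sketched as follows: use the variable when it is set, otherwise fall back to the default under the home directory. Treating an empty value as unset and reading the home directory from `HOME` are assumptions of this sketch, not documented behavior. The function names are hypothetical.

```rust
use std::env;
use std::path::PathBuf;

/// Pick the override if set and non-empty, otherwise `<home>/<default_rel>`.
fn resolve_dir(override_value: Option<String>, home: &str, default_rel: &str) -> PathBuf {
    match override_value {
        Some(v) if !v.is_empty() => PathBuf::from(v),
        _ => PathBuf::from(home).join(default_rel),
    }
}

/// Resolve the (models, data) directories from the process environment.
fn dirs_from_env() -> (PathBuf, PathBuf) {
    // Assumption: `~` in the defaults refers to $HOME; fall back to "." if unset.
    let home = env::var("HOME").unwrap_or_else(|_| ".".to_string());
    let models = resolve_dir(
        env::var("EMBEDDER_MODELS").ok(),
        &home,
        "hero/var/embedder/models",
    );
    let data = resolve_dir(
        env::var("EMBEDDER_DATA").ok(),
        &home,
        "hero/var/embedder/data",
    );
    (models, data)
}
```

Keeping the decision in `resolve_dir` separate from the environment access makes the fallback rule easy to test without touching process state.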
Data Storage
~/hero/var/embedder/
├── models/
│ ├── bge-small/
│ ├── bge-base/
│ └── bge-reranker-base/
└── data/
├── default/ # Namespace: default
│ └── q2/ # Quality level 2
│ └── rag.redb
├── my-namespace/ # Namespace: my-namespace
│ └── q3/ # Quality level 3
│ └── rag.redb
└── corpus.redb # Shared TriviaQA corpus
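The layout above follows a simple rule: one directory per namespace, one subdirectory per quality level, and a rag.redb file inside it. A small sketch of that rule, with hypothetical function names, assuming the data root has already been resolved:

```rust
use std::path::{Path, PathBuf};

/// Database path for a namespace: <data_root>/<namespace>/q<level>/rag.redb
fn namespace_db_path(data_root: &Path, namespace: &str, quality: u8) -> PathBuf {
    data_root
        .join(namespace)
        .join(format!("q{}", quality))
        .join("rag.redb")
}

/// Path of the shared TriviaQA corpus database: <data_root>/corpus.redb
fn corpus_db_path(data_root: &Path) -> PathBuf {
    data_root.join("corpus.redb")
}
```

For example, `namespace_db_path(Path::new("/data"), "my-namespace", 3)` yields `/data/my-namespace/q3/rag.redb`, matching the tree shown above.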
Building
# Server only (default)
cargo build --release
# Server + CLI client
cargo build --release --features client
# All features
cargo build --release --features full
Project Structure
hero_embedder/
├── src/
│ ├── main.rs # Server binary
│ ├── lib.rs # Library root
│ ├── config.rs # Quality profiles
│ ├── namespace.rs # Namespace manager
│ ├── state.rs # App state
│ ├── handlers.rs # HTML handlers
│ ├── api/ # JSON-RPC handlers
│ ├── ml/ # Embedding & reranking
│ ├── retrieval/ # Vector search
│ ├── storage/ # Persistence
│ ├── sdk/ # Client SDK
│ └── cli/ # CLI commands
├── templates/ # Web UI templates
├── docs/ # Documentation
│ └── API.md # Full API docs
├── download_models.sh # Model downloader
├── build.sh # Build script
└── run.sh # Run server
Documentation
See docs/API.md for complete API documentation.
License
MIT