hero_embedderd #18

Open
opened 2026-04-19 02:47:32 +00:00 by despiegk · 3 comments
Owner

Based on crates/hero_embedder_server, make hero_embedderd.

There is a server, embedder_server. We want to split it by function. One part becomes embedder_d (d for daemon), which is completely stateless and only accepts requests from mycelium-enabled networks. It also logs every request, so we know exactly what was done: the number of embeddings stored, not stored, and processed, and the tokens in and out. Usage is tracked by source IP address, which is a mycelium IPv6 address. Because the daemon is completely stateless, there is no authentication and no user management. Everyone on the machine can use it, and they will be billed per IPv6 address.

To talk to mycelium, use:

/home/despiegk/hero/code/hero_skills/claude/skills/mycelium_sdk/SKILL.md
Also use the /hero_sockets skill for how to attach to the mycelium network.

Author
Owner

Implementation Spec for Issue #18 — hero_embedderd

Objective

Create a new crate hero_embedderd (binary hero_embedderd) in crates/hero_embedderd/ that exposes the same JSON-RPC 2.0 surface as hero_embedder_server but is completely stateless (no auth, no users, no proxy state) and accepts requests only from peers reachable via the local mycelium overlay. For every accepted request the daemon emits a structured per-source-mycelium-IPv6 usage log (operation, tokens in, tokens out, embeddings stored / not-stored / processed) so an external billing aggregator can attribute work to a mycelium IPv6 address.

hero_embedder_server is left untouched. Shared business logic continues to live in hero_embedder_lib. The daemon is a thin transport + filtering + accounting wrapper around the same AppState and api::* handlers the server already calls.
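
To illustrate how the per-address usage lines could be consumed, here is a minimal sketch of the external billing aggregator mentioned above. It is not part of the daemon. `UsageRecord`, `Totals`, and `aggregate` are hypothetical names, the record carries only a subset of the logged fields, and parsing of the log line is assumed to have happened already.

```rust
use std::collections::HashMap;
use std::net::Ipv6Addr;

/// Hypothetical, already-parsed usage line (subset of the fields named in the spec).
#[derive(Debug, Clone)]
pub struct UsageRecord {
    pub src_ipv6: Ipv6Addr,
    pub tokens_in: u64,
    pub tokens_out: u64,
    pub embeddings_stored: u64,
}

/// Running totals attributed to one mycelium IPv6 address.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct Totals {
    pub requests: u64,
    pub tokens_in: u64,
    pub tokens_out: u64,
    pub embeddings_stored: u64,
}

/// Sum usage records per source address.
pub fn aggregate(records: &[UsageRecord]) -> HashMap<Ipv6Addr, Totals> {
    let mut by_addr: HashMap<Ipv6Addr, Totals> = HashMap::new();
    for r in records {
        // One bucket per source address, created on first sight.
        let t = by_addr.entry(r.src_ipv6).or_default();
        t.requests += 1;
        t.tokens_in += r.tokens_in;
        t.tokens_out += r.tokens_out;
        t.embeddings_stored += r.embeddings_stored;
    }
    by_addr
}
```

Keeping the aggregation outside the daemon is what lets the daemon itself stay stateless: it only emits lines, and any totals live in the consumer.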

Background

Today crates/hero_embedder_server (src/main.rs):

  • Loads ALL embedder models, the reranker, namespaces, embedding cache, corpus storage from disk into a single Arc<AppState> (defined in hero_embedder_lib::state).
  • Builds an Axum router with /health, POST /rpc, POST / (alias), /openrpc.json, /.well-known/heroservice.json, /favicon.svg, /mcp.
  • Binds only to a Unix Domain Socket at $HERO_SOCKET_DIR/hero_embedder/rpc.sock (no TCP).
  • Dispatches JSON-RPC method names to functions in hero_embedder_lib::api (info / health / embed / rerank / namespace.* / index.* / cache.* / search / stats / corpus.* / jobs.* / logs.* / kvs.* / vectors.*).
  • Logging today goes through OperationLogger (in hero_embedder_lib::logging) which records LogEntry { id, timestamp, operation, status, namespace, message, duration_ms, metadata } in a circular buffer and (best-effort) forwards to hero_proc. The current logger does not know about caller identity (mycelium IPv6) and does not count tokens.

The new daemon must:

  1. Bind only to UDS at $HERO_SOCKET_DIR/hero_embedderd/rpc.sock so that, by Hero socket strategy, all external traffic is forced through hero_router. To enforce "mycelium only", the daemon validates that each request's X-Forwarded-For (or the mycelium IPv6 supplied by hero_router) is in the local mycelium subnet. The subnet check is answered by the mycelium SDK at startup (get_info().node_subnet) and on demand (get_public_key_from_ip).
  2. Be stateless: no --hero-os-id, no auth headers (X-Hero-Claims), no user store. The only persistent side effect is structured logs.
  3. Per accepted request, emit a UsageLogEntry { ts, src_ipv6, method, tokens_in, tokens_out, embeddings_processed, embeddings_stored, embeddings_not_stored, duration_ms, status } to: (a) tracing::info! with stable JSON fields (machine-readable), and (b) hero_proc via the existing ProcClient::log helper under source hero_embedderd.usage.<method> for downstream aggregation.
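
The usage entry in item 3 could be rendered as one machine-readable line roughly as follows. This is a sketch under assumptions: the field names come from the spec (which writes the timestamp as both `ts` and `ts_ms`; `ts_ms` is used here), the `u64` and `String` types are guesses, and `to_json_line` and `json_escape` are hypothetical std-only helpers standing in for whatever `tracing::info!` setup the daemon actually uses.

```rust
use std::net::Ipv6Addr;

/// Field names follow the spec; the concrete types are assumptions.
#[derive(Debug, Clone)]
pub struct UsageLogEntry {
    pub ts_ms: u64,
    pub src_ipv6: Ipv6Addr,
    pub method: String,
    pub tokens_in: u64,
    pub tokens_out: u64,
    pub embeddings_processed: u64,
    pub embeddings_stored: u64,
    pub embeddings_not_stored: u64,
    pub duration_ms: u64,
    pub status: String,
}

/// Escape quotes, backslashes, and control characters for a JSON string value.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl UsageLogEntry {
    /// Render the entry as a single JSON object with a stable field order.
    pub fn to_json_line(&self) -> String {
        format!(
            concat!(
                "{{\"ts_ms\":{},\"src_ipv6\":\"{}\",\"method\":\"{}\",",
                "\"tokens_in\":{},\"tokens_out\":{},",
                "\"embeddings_processed\":{},\"embeddings_stored\":{},",
                "\"embeddings_not_stored\":{},",
                "\"duration_ms\":{},\"status\":\"{}\"}}"
            ),
            self.ts_ms,
            self.src_ipv6,
            json_escape(&self.method),
            self.tokens_in,
            self.tokens_out,
            self.embeddings_processed,
            self.embeddings_stored,
            self.embeddings_not_stored,
            self.duration_ms,
            json_escape(&self.status),
        )
    }
}
```

A fixed field order is not required by JSON, but it keeps the lines easy to diff and grep while the format is still settling.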

Requirements

  • Crate hero_embedderd lives in crates/hero_embedderd/ and is a workspace member.
  • Binary name: hero_embedderd (matches issue title verbatim — no underscore-d suffix mismatch).
  • Service name (for Hero socket dir): hero_embedderd → socket ~/hero/var/sockets/hero_embedderd/rpc.sock.
  • Reuses hero_embedder_lib::{api, state::AppState, namespace, storage, ml, logging, mcp, rpc, config, download, proc_client} verbatim — no fork of business logic.
  • Same JSON-RPC method surface as hero_embedder_server (every method dispatched in jsonrpc_handler of the server is also dispatched in the daemon).
  • Same auxiliary endpoints: GET /health, GET /openrpc.json, GET /.well-known/heroservice.json, GET /favicon.svg. The /mcp endpoint is intentionally not ported (out of scope for the stateless billable RPC).
  • POST /rpc and POST / enforce: caller's mycelium IPv6 ∈ local node's mycelium subnet. Non-mycelium traffic gets 403 Forbidden (HTTP) — but in normal operation it can only arrive through the UDS via hero_router, so the source IP is conveyed by header.
  • Per-request usage log records: source mycelium IPv6, method, tokens_in, tokens_out, embeddings_processed, embeddings_stored, embeddings_not_stored, duration_ms, status. Log line is structured JSON via tracing::info! and (best-effort) forwarded to hero_proc.
  • OpenRPC discovery manifest at /.well-known/heroservice.json advertises service name hero_embedderd.
  • Graceful shutdown on SIGINT and SIGTERM (mirrors existing server).
  • Reuses workspace deps; do not add new transitive deps if they already exist in the workspace.
  • Adds mycelium_sdk as a new workspace dep (git, branch=development) and the daemon depends on it.
  • No persistent state, no auth, no user management, no per-tenant prefixing logic. The proxy crate's hero_os_id model is explicitly out of scope.
  • hero_embedder_server keeps building and running unchanged (no edits to its source).

Files to Modify/Create

  • Cargo.toml (workspace root) — add crates/hero_embedderd to [workspace] members, add mycelium_sdk line in [workspace.dependencies].
  • crates/hero_embedderd/Cargo.toml — new crate manifest.
  • crates/hero_embedderd/heroservice.json — discovery manifest (name: "hero_embedderd").
  • crates/hero_embedderd/openrpc.json — copied content of crates/hero_embedder_server/openrpc.json so the discovery surface matches.
  • crates/hero_embedderd/favicon.svg — same favicon as the server.
  • crates/hero_embedderd/src/main.rs — entry point: load models, build state, build router, bind UDS, run.
  • crates/hero_embedderd/src/mycelium_guard.rs — boots a MyceliumClient, captures the local node's mycelium subnet, exposes is_mycelium_addr(ip) -> bool and extract_source_ipv6(headers) -> Option<Ipv6Addr>. Provides an Axum middleware that rejects non-mycelium callers with 403.
  • crates/hero_embedderd/src/usage_log.rs — defines UsageLogEntry, UsageMeter accumulator, and the emit(...) function that writes a structured tracing::info! line and forwards to hero_proc.
  • crates/hero_embedderd/src/dispatch.rs — dispatches the JSON-RPC method to hero_embedder_lib::api::* (mirrors the big match in hero_embedder_server::main::jsonrpc_handler).
  • crates/hero_embedderd/Makefile — build, install, clean targets matching crates/hero_embedder_server/Makefile.
  • crates/hero_embedderd/README.md — short description of the daemon, how it differs from hero_embedder_server, the per-IP usage log format.

(No edits to crates/hero_embedder_server/** or to crates/hero_embedder_lib/**.)

Implementation Plan

Step 1: Workspace wiring

Files: Cargo.toml (workspace root)

  • Add "crates/hero_embedderd" to [workspace] members.
  • Add a workspace dependency entry for the mycelium SDK: mycelium_sdk = { git = "https://github.com/threefoldtech/mycelium", branch = "development" }.
  • Do NOT modify any other workspace fields.
    Dependencies: none

Step 2: New crate skeleton (manifest + manifests/assets)

Files: crates/hero_embedderd/Cargo.toml, heroservice.json, openrpc.json, favicon.svg, Makefile, README.md

  • Cargo.toml: package name hero_embedderd, workspace inheritance, deps on hero_embedder_lib, mycelium_sdk, tokio, axum, tower-http, tower, hyper, hyper-util, serde, serde_json, anyhow, tracing, tracing-subscriber. Single [[bin]] name = "hero_embedderd".
  • heroservice.json: protocol openrpc, name hero_embedderd.
  • openrpc.json + favicon.svg: byte copies of the server's files.
  • Makefile: mirror server's build/install/clean with new binary name.
  • README.md: stub, fleshed out in Step 7.
    Dependencies: Step 1

Step 3: Mycelium guard module

Files: crates/hero_embedderd/src/mycelium_guard.rs

  • MyceliumGuard { node_subnet, client }. init() connects via the mycelium SDK and reads get_info().node_subnet.
  • extract_source_ipv6(headers) reads X-Forwarded-For then X-Real-IP.
  • is_mycelium(addr) does prefix containment.
  • Axum middleware require_mycelium rejects non-mycelium callers with 403 and stores the resolved IPv6 in request extensions for downstream handlers.
  • HERO_EMBEDDERD_REQUIRE_MYCELIUM=1 (default) means daemon refuses to start if mycelium is unreachable.
    Dependencies: Step 2
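
The guard logic in Step 3 can be sketched with the standard library alone. This is an illustration, not the crate's code: `Ipv6Prefix` and `check` are hypothetical names, headers are modelled as plain (name, value) pairs instead of an Axum header map, and taking the first `X-Forwarded-For` entry as the caller assumes that hero_router sets that header itself.

```rust
use std::net::Ipv6Addr;

/// An IPv6 prefix such as the node subnet reported by mycelium.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ipv6Prefix {
    pub network: Ipv6Addr,
    /// Prefix length in bits, 0..=128.
    pub len: u8,
}

impl Ipv6Prefix {
    /// True when `addr` shares the first `len` bits with `network`.
    pub fn contains(&self, addr: Ipv6Addr) -> bool {
        let len = u32::from(self.len.min(128));
        if len == 0 {
            return true; // a /0 prefix matches every address
        }
        let mask = u128::MAX << (128 - len);
        (u128::from(addr) & mask) == (u128::from(self.network) & mask)
    }
}

/// Pick the caller's IPv6 from `X-Forwarded-For` first, then `X-Real-IP`.
pub fn extract_source_ipv6(headers: &[(&str, &str)]) -> Option<Ipv6Addr> {
    for wanted in ["x-forwarded-for", "x-real-ip"] {
        for (name, value) in headers {
            if name.eq_ignore_ascii_case(wanted) {
                // The header may hold a comma-separated chain; use the first entry.
                let first = value.split(',').next().unwrap_or("").trim();
                if let Ok(ip) = first.parse::<Ipv6Addr>() {
                    return Some(ip);
                }
            }
        }
    }
    None
}

/// Ok(source address) lets the request through; Err(403) is the rejection status.
pub fn check(prefix: &Ipv6Prefix, headers: &[(&str, &str)]) -> Result<Ipv6Addr, u16> {
    match extract_source_ipv6(headers) {
        Some(ip) if prefix.contains(ip) => Ok(ip),
        _ => Err(403),
    }
}
```

A request with no usable header is rejected just like one from outside the prefix, so a missing header can never be mistaken for a local caller.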

Step 4: Usage log module

Files: crates/hero_embedderd/src/usage_log.rs

  • UsageMeter { tokens_in, tokens_out, embeddings_processed, embeddings_stored, embeddings_not_stored }.
  • UsageLogEntry { ts_ms, src_ipv6, method, tokens_in, tokens_out, embeddings_*, duration_ms, status }.
  • emit(entry, proc_client): tracing::info! with target hero_embedderd::usage + best-effort hero_proc log under source hero_embedderd.usage.<method>.
  • meter_from_params(method, params): pre-fills tokens_in for embed / rerank / search with an approximate whitespace token count (documented as approximate in README).
  • meter_from_result(method, result, meter): parses the result JSON to extract counts (embed → embeddings_processed; index.add/vectors.add → embeddings_stored + embeddings_not_stored; search/vectors.search → embeddings_processed).
    Dependencies: Step 2
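
The metering in Step 4 might look like the sketch below. The shapes of the request params and result JSON are not given in the spec, so this version works on already-extracted values: `meter_from_texts`, `ResultCounts`, and `apply_result` are hypothetical stand-ins for `meter_from_params` and `meter_from_result`, and the `u64` counter type is an assumption.

```rust
/// Counters named in the spec.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct UsageMeter {
    pub tokens_in: u64,
    pub tokens_out: u64,
    pub embeddings_processed: u64,
    pub embeddings_stored: u64,
    pub embeddings_not_stored: u64,
}

/// Approximate token count: the number of whitespace-separated words.
pub fn approx_tokens(text: &str) -> u64 {
    text.split_whitespace().count() as u64
}

/// Pre-fill `tokens_in` for the methods the spec lists; other methods stay at zero.
pub fn meter_from_texts(method: &str, texts: &[&str]) -> UsageMeter {
    let mut meter = UsageMeter::default();
    if matches!(method, "embed" | "rerank" | "search") {
        meter.tokens_in = texts.iter().map(|t| approx_tokens(t)).sum();
    }
    meter
}

/// Hypothetical counts pulled out of a handler result.
#[derive(Debug, Default, Clone, Copy)]
pub struct ResultCounts {
    pub items: u64,
    pub stored: u64,
    pub not_stored: u64,
}

/// Map result counts onto the meter using the per-method rules from the spec.
pub fn apply_result(method: &str, counts: &ResultCounts, meter: &mut UsageMeter) {
    match method {
        "embed" | "search" | "vectors.search" => {
            meter.embeddings_processed = counts.items;
        }
        "index.add" | "vectors.add" => {
            meter.embeddings_stored = counts.stored;
            meter.embeddings_not_stored = counts.not_stored;
        }
        _ => {} // methods without embedding work leave the counters untouched
    }
}
```

A whitespace count will generally differ from a model tokenizer's count, which is why the spec documents `tokens_in` as approximate.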

Step 5: Dispatch module

Files: crates/hero_embedderd/src/dispatch.rs

  • dispatch(state, request) -> (response, method_name) mirrors the match block in hero_embedder_server::main::jsonrpc_handler verbatim, dispatching to hero_embedder_lib::api::*. Excludes the /mcp path (out of scope).
    Dependencies: Step 2
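
The dispatcher shape from Step 5 can be sketched as a match on the method name that returns both the response body and the method name. Everything below is illustrative: `RpcRequest`, `Outcome`, `api_health`, and `api_info` are hypothetical stand-ins (the real handlers live in `hero_embedder_lib::api` and take the shared state), and only two methods are shown. The `-32601` code is the JSON-RPC 2.0 "Method not found" error.

```rust
/// Minimal view of a JSON-RPC request; params are omitted in this sketch.
pub struct RpcRequest<'a> {
    pub id: u64,
    pub method: &'a str,
}

/// Either a raw JSON result value or a JSON-RPC error.
pub enum Outcome {
    Success(String),
    Failure { code: i32, message: &'static str },
}

// Hypothetical stand-ins for handlers in `hero_embedder_lib::api`.
fn api_health() -> String {
    "{\"status\":\"ok\"}".to_string()
}

fn api_info() -> String {
    "{\"service\":\"hero_embedderd\"}".to_string()
}

/// Route by method name and return (response body, method name).
pub fn dispatch(req: &RpcRequest) -> (String, String) {
    let outcome = match req.method {
        "health" => Outcome::Success(api_health()),
        "info" => Outcome::Success(api_info()),
        _ => Outcome::Failure { code: -32601, message: "Method not found" },
    };
    let body = match outcome {
        Outcome::Success(result) => format!(
            "{{\"jsonrpc\":\"2.0\",\"id\":{},\"result\":{}}}",
            req.id, result
        ),
        Outcome::Failure { code, message } => format!(
            "{{\"jsonrpc\":\"2.0\",\"id\":{},\"error\":{{\"code\":{},\"message\":\"{}\"}}}}",
            req.id, code, message
        ),
    };
    (body, req.method.to_string())
}
```

Returning the method name alongside the response lets the caller tag the usage log without parsing the request a second time.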

Step 6: Daemon entrypoint (main.rs)

Files: crates/hero_embedderd/src/main.rs

  • Init tracing, resolve EMBEDDER_MODELS / EMBEDDER_DATA, macOS dylib auto-detect, connect ProcClient, ensure models, load embedders + reranker, open namespaces, build AppState.
  • Init MyceliumGuard (fail loud if mycelium is required and unreachable).
  • Build axum Router with the same endpoints (/health, /openrpc.json, /.well-known/heroservice.json, /favicon.svg, POST /rpc, POST /), apply require_mycelium middleware globally, plus permissive CORS.
  • rpc_handler reads SourceIpv6 from request extensions, calls dispatch::dispatch, builds UsageMeter from params + result, emits a usage log line, returns the JSON-RPC response.
  • Bind UDS at $HERO_SOCKET_DIR/hero_embedderd/rpc.sock. Mirror the server's accept-loop and SIGINT/SIGTERM shutdown.
    Dependencies: Steps 3, 4, 5
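
For the socket binding in Step 6, a std-only sketch of resolving and binding the path could look like this. It is an assumption-laden illustration: `socket_path` and `bind_uds` are hypothetical helpers, the fallback to `~/hero/var/sockets` is inferred from the socket path quoted in the requirements, and the real daemon would presumably use an async listener and mirror the server's accept loop.

```rust
use std::fs;
use std::io;
use std::os::unix::net::UnixListener;
use std::path::{Path, PathBuf};

pub const SERVICE_NAME: &str = "hero_embedderd";

/// Resolve `<socket dir>/hero_embedderd/rpc.sock`.
/// `socket_dir` is the value of `$HERO_SOCKET_DIR`, if set; otherwise the
/// assumed default `<home>/hero/var/sockets` is used.
pub fn socket_path(socket_dir: Option<&str>, home: &str) -> PathBuf {
    let base = match socket_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(home).join("hero/var/sockets"),
    };
    base.join(SERVICE_NAME).join("rpc.sock")
}

/// Create the socket directory, remove a stale socket file, and bind.
pub fn bind_uds(path: &Path) -> io::Result<UnixListener> {
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    match fs::remove_file(path) {
        Ok(()) => {}
        // A missing file just means there was no previous run to clean up.
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    UnixListener::bind(path)
}
```

Removing a stale socket file before binding avoids an "address in use" error after an unclean shutdown, at the cost of assuming that no other live instance owns that path.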

Step 7: README + verification build

Files: crates/hero_embedderd/README.md

  • Document purpose, how it differs from hero_embedder_server, mycelium-only enforcement, structured usage log line shape, env vars (HERO_SOCKET_DIR, EMBEDDER_MODELS, EMBEDDER_DATA, MYCELIUM_RPC_SOCKET, HERO_EMBEDDERD_REQUIRE_MYCELIUM), example tracing line.
  • Verify: cargo build -p hero_embedderd, cargo build -p hero_embedder_server, cargo build all succeed.
    Dependencies: Step 6

Acceptance Criteria

  • cargo build -p hero_embedderd succeeds from a clean workspace.
  • cargo build -p hero_embedder_server still succeeds (no regression).
  • cargo build (whole workspace) succeeds.
  • cargo test -p hero_embedder_lib still passes (no lib changes).
  • Running hero_embedderd with mycelium up creates ~/hero/var/sockets/hero_embedderd/rpc.sock and serves /health, /openrpc.json, /.well-known/heroservice.json with name: "hero_embedderd".
  • Running hero_embedderd with mycelium unreachable exits with a clear error.
  • A POST /rpc carrying an X-Forwarded-For outside the local mycelium subnet returns 403.
  • A POST /rpc with a valid mycelium-IPv6 X-Forwarded-For is processed and emits exactly one structured usage line via tracing (target hero_embedderd::usage) containing source IPv6, method, tokens_in, tokens_out, embeddings_* counters and duration_ms.
  • No file under crates/hero_embedder_server/ or crates/hero_embedder_lib/ is modified.
  • No new TCP listener is opened by the daemon.

Notes

  • Naming: the issue title is hero_embedderd (concatenated d). I am keeping that exact name throughout (crate, binary, service-socket-dir).
  • Service identity: the daemon owns its own socket dir hero_embedderd/ rather than co-tenanting under hero_embedder/, so it can run side by side with hero_embedder_server.
  • Mycelium source-IP propagation: the daemon binds only to a UDS, so mycelium peers reach it via hero_router, which forwards the original mycelium IPv6 in X-Forwarded-For. The daemon does NOT read the peer credential off the UDS itself.
  • Token counting accuracy (v1): an exact token count requires running the model's tokenizer. v1 uses an approximate whitespace count for tokens_in, documented in README. A v2 follow-up issue can plumb a count_tokens(...) method into the lib.
  • MCP endpoint: deliberately omitted from the daemon — MCP carries its own session/state model that doesn't fit the "stateless, billed per IPv6" charter.
  • Auth: no X-Hero-Claims enforcement, no hero_context checking. The mycelium subnet check IS the access control.
  • Why not amend the server in place?: the issue says "split into functions" — keep hero_embedder_server as the full-featured local-admin RPC, and hero_embedderd as the stateless billable surface for mycelium peers. They share hero_embedder_lib.
Author
Owner

Test Results

cargo check — workspace

  • cargo check -p hero_embedderd: PASS (0 warnings, 0 errors)
  • cargo check -p hero_embedder_server: PASS (1 pre-existing warning, 0 errors)
  • cargo check (full workspace): PASS (pre-existing warnings in hero_embedder_app, 0 errors)

cargo test

  • cargo test -p hero_embedderd: PASS — 0 tests (the daemon is a thin wrapper around hero_embedder_lib; there are no daemon-specific unit tests yet)
  • cargo test -p hero_embedder_lib --lib: PASS — 16 passed, 0 failed, 0 ignored
test logging::tests::test_filtering ... ok
test logging::tests::test_circular_buffer_wraparound ... ok
test ml::quantize::tests::test_int8_cosine ... ok
test ml::quantize::tests::test_fp16_cosine ... ok
test ml::quantize::tests::test_fp16_roundtrip ... ok
test ml::quantize::tests::test_int8_roundtrip ... ok
test storage::db::tests::test_fp16_roundtrip ... ok
test retrieval::quantized::tests::test_fp16_index ... ok
test retrieval::quantized::tests::test_int8_index ... ok
test storage::db::tests::test_int8_roundtrip ... ok
test storage::embedding_cache::tests::test_roundtrip ... ok
test storage::embedding_cache::tests::test_cache_miss ... ok
test storage::embedding_cache::tests::test_batch_mixed_hits_misses ... ok
test storage::embedding_cache::tests::test_clear ... ok
test storage::embedding_cache::tests::test_count_and_size ... ok
test storage::embedding_cache::tests::test_quality_isolation ... ok

test result: ok. 16 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.27s

Pre-existing test failure (out of scope)

cargo test -p hero_embedder_lib (without --lib) fails to compile the integration test target tests/auth_integration_test.rs because jsonwebtoken is not declared as a dev-dependency in crates/hero_embedder_lib/Cargo.toml. Verified pre-existing on a clean development branch (git stash + re-run reproduces the same compile error). Not caused by, and not in scope of, this PR. Worth a separate small follow-up to add jsonwebtoken = { version = "...", default-features = false, features = ["..."] } under [dev-dependencies].

Summary

The new crate compiles cleanly. The pre-existing library and server crates continue to compile and pass all 16 unit tests. No regression introduced.

Author
Owner

Implementation Summary

A new crate hero_embedderd was added that exposes the same JSON-RPC 2.0 surface as hero_embedder_server but is stateless and mycelium-only.

Files created

  • crates/hero_embedderd/Cargo.toml — new workspace member; depends on hero_embedder_lib (path) and mycelium_sdk (workspace dep, git).
  • crates/hero_embedderd/heroservice.json — discovery manifest; service name hero_embedderd.
  • crates/hero_embedderd/openrpc.json — copy of the server's OpenRPC spec (same RPC method surface).
  • crates/hero_embedderd/favicon.svg — copy of the server's favicon.
  • crates/hero_embedderd/Makefile — build / check / test / clippy / fmt targets.
  • crates/hero_embedderd/README.md — full documentation: how it differs from the server, env vars, usage log line shape, token-counting accuracy notes.
  • crates/hero_embedderd/src/mycelium_guard.rs — boots a MyceliumClient, reads the local node's mycelium subnet via get_info(), and provides:
    • MyceliumGuard::is_mycelium(addr) — cheap prefix containment check.
    • extract_source_ipv6(headers) — pulls the source IPv6 from X-Forwarded-For then X-Real-IP.
    • require_mycelium Axum middleware — rejects non-mycelium callers with 403, stashes the verified SourceIpv6 in request extensions for downstream use.
    • crates/hero_embedderd/src/usage_log.rs — UsageMeter, UsageLogEntry, meter_from_params, meter_from_result, and emit(entry, proc_client) which writes a structured tracing::info! line under target hero_embedderd::usage and best-effort forwards to hero_proc under source hero_embedderd.usage.<method>.
  • crates/hero_embedderd/src/dispatch.rs — JSON-RPC method dispatcher, mirroring the server's match block (excluding the /mcp path which is intentionally out of scope for the stateless billable surface).
  • crates/hero_embedderd/src/main.rs — daemon entrypoint: tracing init, model + reranker + namespace + cache load, AppState build, MyceliumGuard::init(), Axum router with require_mycelium middleware applied globally, UDS bind at $HERO_SOCKET_DIR/hero_embedderd/rpc.sock, accept loop with SIGINT/SIGTERM shutdown.
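
The prefix containment check and header extraction described for mycelium_guard.rs can be sketched roughly as follows. This is a standalone illustration using only the standard library, not the code in the crate: Ipv6Prefix and source_ipv6_from are hypothetical names, the real extract_source_ipv6 takes a header map, and trusting the first X-Forwarded-For entry is an assumption about how hero_router populates that header.

```rust
use std::net::Ipv6Addr;

/// An IPv6 prefix (base address plus length). Hypothetical type for this sketch.
#[derive(Debug, Clone, Copy)]
pub struct Ipv6Prefix {
    network: u128,
    mask: u128,
}

impl Ipv6Prefix {
    /// Build a prefix from a base address and a length in bits (0..=128).
    pub fn new(base: Ipv6Addr, len: u8) -> Option<Self> {
        if len > 128 {
            return None;
        }
        // Shifting a u128 by 128 bits would overflow, so /0 is handled separately.
        let mask = if len == 0 {
            0
        } else {
            u128::MAX << (128 - u32::from(len))
        };
        Some(Self {
            network: u128::from(base) & mask,
            mask,
        })
    }

    /// Cheap containment check: compare the masked address with the network bits.
    pub fn contains(&self, addr: Ipv6Addr) -> bool {
        (u128::from(addr) & self.mask) == self.network
    }
}

/// Parse the first comma-separated entry of a header value as an IPv6 address.
fn first_ipv6(value: &str) -> Option<Ipv6Addr> {
    value.split(',').next()?.trim().parse().ok()
}

/// Try X-Forwarded-For first, then fall back to X-Real-IP.
/// Assumption: the first X-Forwarded-For entry is the original caller.
pub fn source_ipv6_from(xff: Option<&str>, x_real_ip: Option<&str>) -> Option<Ipv6Addr> {
    xff.and_then(first_ipv6)
        .or_else(|| x_real_ip.and_then(first_ipv6))
}
```

A middleware built on this would return 403 whenever source_ipv6_from yields None or contains returns false for the subnet reported by the local node.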

Files modified

  • Cargo.toml (workspace root) — added crates/hero_embedderd to [workspace] members and added mycelium_sdk to [workspace.dependencies] ({ git = "https://forge.ourworld.tf/geomind_code/mycelium_network.git", branch = "development_crate_layout" }). The development_crate_layout branch is where mycelium_sdk currently lives as a workspace member; the dependency should switch to development once the crate-layout migration lands upstream.

Files NOT modified

  • crates/hero_embedder_server/** — untouched; the existing server keeps building and running.
  • crates/hero_embedder_lib/** — untouched; all 16 lib unit tests still pass.

Behavior

  • Daemon binds only to UDS at $HERO_SOCKET_DIR/hero_embedderd/rpc.sock (no TCP listener).
  • Every request must arrive through hero_router carrying an X-Forwarded-For whose IPv6 is in the local mycelium subnet, otherwise it gets 403 Forbidden.
  • For each accepted request the daemon emits one structured JSON line via tracing (target hero_embedderd::usage) and forwards it best-effort to hero_proc. Example shape:
    {"ts_ms":1721059200123,"src_ipv6":"4xx:abcd:1234:0:0:0:0:42","method":"embed","tokens_in":42,"tokens_out":3,"embeddings_processed":3,"embeddings_stored":0,"embeddings_not_stored":0,"duration_ms":17,"status":"ok"}
    
  • No hero_context, no X-Hero-Claims, no user / auth handling — billing is purely per source mycelium IPv6.
  • HERO_EMBEDDERD_REQUIRE_MYCELIUM=0 (default 1) allows startup when the mycelium daemon socket is missing; in that mode all requests are rejected as non-mycelium.
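
As an illustration of the usage line shape above, here is a minimal std-only sketch that renders one entry as a single JSON line. The field names follow the example line; the UsageLine struct, the Status enum, and the "error" status value are assumptions made for this sketch (the source only shows "ok"), and the actual crate presumably serializes its UsageLogEntry differently.

```rust
use std::fmt::Write as _;
use std::net::Ipv6Addr;

/// Outcome of a request. Only "ok" appears in the source; "error" is assumed.
pub enum Status {
    Ok,
    Error,
}

/// One per-request usage record, keyed by the caller's mycelium IPv6 address.
pub struct UsageLine<'a> {
    pub ts_ms: u64,
    pub src_ipv6: Ipv6Addr,
    pub method: &'a str,
    pub tokens_in: u64,
    pub tokens_out: u64,
    pub embeddings_processed: u64,
    pub embeddings_stored: u64,
    pub embeddings_not_stored: u64,
    pub duration_ms: u64,
    pub status: Status,
}

/// Escape the characters that must not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => {
                // Writing to a String cannot fail, so the result is ignored.
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out
}

impl UsageLine<'_> {
    /// Render the record as one JSON object on a single line.
    pub fn to_json_line(&self) -> String {
        let status = match self.status {
            Status::Ok => "ok",
            Status::Error => "error",
        };
        format!(
            "{{\"ts_ms\":{},\"src_ipv6\":\"{}\",\"method\":\"{}\",\"tokens_in\":{},\"tokens_out\":{},\"embeddings_processed\":{},\"embeddings_stored\":{},\"embeddings_not_stored\":{},\"duration_ms\":{},\"status\":\"{}\"}}",
            self.ts_ms,
            self.src_ipv6,
            json_escape(self.method),
            self.tokens_in,
            self.tokens_out,
            self.embeddings_processed,
            self.embeddings_stored,
            self.embeddings_not_stored,
            self.duration_ms,
            status
        )
    }
}
```

Keeping every counter in one flat object means an external aggregator can sum per src_ipv6 without knowing anything about the RPC methods themselves.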

Test results

  • cargo check -p hero_embedderd: PASS (0 warnings, 0 errors).
  • cargo check -p hero_embedder_server: PASS (1 pre-existing warning, 0 errors).
  • cargo check (full workspace): PASS.
  • cargo test -p hero_embedder_lib --lib: PASS — 16 passed, 0 failed.
  • cargo test -p hero_embedderd: PASS — 0 tests.

Notes / follow-ups

  • tokens_in is approximate in v1 (whitespace count). A v2 follow-up can plumb a count_tokens(...) helper into hero_embedder_lib for exact tokenization.
  • The mycelium_sdk git dep tracks development_crate_layout — switch to development once the crate-layout migration lands upstream in mycelium_network.
  • Pre-existing: crates/hero_embedder_lib/tests/auth_integration_test.rs does not compile because jsonwebtoken is not declared as a dev-dependency. Out of scope for this PR; worth a separate one-line follow-up.
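
For reference, the whitespace-based approximation mentioned for tokens_in could look like the sketch below. The function names are hypothetical and not taken from the crate; the count is simply the number of whitespace-separated words, which will generally differ from what the model's real tokenizer produces.

```rust
/// Approximate token count for one input: number of whitespace-separated words.
pub fn approx_tokens(text: &str) -> u64 {
    text.split_whitespace().count() as u64
}

/// Approximate token count summed over a batch of inputs.
pub fn approx_tokens_batch<'a, I>(texts: I) -> u64
where
    I: IntoIterator<Item = &'a str>,
{
    texts.into_iter().map(approx_tokens).sum()
}
```

Subword tokenizers usually emit more tokens than there are words, so billing based on this figure will tend to undercount until an exact count_tokens helper is available.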
Reference
lhumina_code/hero_embedder#18