hero_embedderd #18
Reference
lhumina_code/hero_embedder#18
Based on crates/hero_embedder_server, make hero_embedderd.
There is a server, embedder_server. We want to split it by function. One part becomes embedder_d (the "d" is for daemon), which is completely stateless and only accepts requests from mycelium-enabled networks. It also logs every request, so we know exactly what was done: the number of embeddings stored, not stored, and processed, plus the tokens in and out. Usage is recorded per source IP address, which is a mycelium IPv6 address. Because the daemon is completely stateless, there is no authentication and no user management. Everyone on the machine can use it, and they will be billed per IPv6 address.
To talk to mycelium, use the skill at /home/despiegk/hero/code/hero_skills/claude/skills/mycelium_sdk/SKILL.md.
Also use the /hero_sockets skill for how to attach to the mycelium network.
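The per-address billing idea can be sketched as follows. This is only an illustration of what a downstream aggregator might do with parsed usage records; `Totals`, `UsageRecord` and `aggregate` are hypothetical names, and the daemon itself stays stateless and would not hold such a map.

```rust
use std::collections::HashMap;
use std::net::Ipv6Addr;

/// Running totals for one source address (hypothetical shape).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Totals {
    pub requests: u64,
    pub tokens_in: u64,
    pub tokens_out: u64,
}

/// One usage record as an external aggregator might see it
/// after parsing a usage log line (hypothetical shape).
pub struct UsageRecord {
    pub src_ipv6: Ipv6Addr,
    pub tokens_in: u64,
    pub tokens_out: u64,
}

/// Fold usage records into per-source-address totals.
pub fn aggregate(records: &[UsageRecord]) -> HashMap<Ipv6Addr, Totals> {
    let mut out: HashMap<Ipv6Addr, Totals> = HashMap::new();
    for r in records {
        let t = out.entry(r.src_ipv6).or_default();
        t.requests += 1;
        t.tokens_in += r.tokens_in;
        t.tokens_out += r.tokens_out;
    }
    out
}
```

Keeping the fold outside the daemon matches the stateless requirement: the daemon only emits records, and whoever bills decides how to sum them.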
Implementation Spec for Issue #18 — hero_embedderd
Objective
Create a new crate `hero_embedderd` (binary `hero_embedderd`) in `crates/hero_embedderd/` that exposes the same JSON-RPC 2.0 surface as `hero_embedder_server` but is completely stateless (no auth, no users, no proxy state) and accepts requests only from peers reachable via the local mycelium overlay. For every accepted request the daemon emits a structured per-source-mycelium-IPv6 usage log (operation, tokens in, tokens out, embeddings stored / not-stored / processed) so an external billing aggregator can attribute work to a mycelium IPv6 address.
`hero_embedder_server` is left untouched. Shared business logic continues to live in `hero_embedder_lib`. The daemon is a thin transport + filtering + accounting wrapper around the same `AppState` and `api::*` handlers the server already calls.
Background
Today `crates/hero_embedder_server` (`src/main.rs`):
- Holds an `Arc<AppState>` (defined in `hero_embedder_lib::state`).
- Serves `/health`, `POST /rpc`, `POST /` (alias), `/openrpc.json`, `/.well-known/heroservice.json`, `/favicon.svg`, `/mcp`.
- Binds `$HERO_SOCKET_DIR/hero_embedder/rpc.sock` (no TCP).
- Dispatches to `hero_embedder_lib::api` (info / health / embed / rerank / namespace.* / index.* / cache.* / search / stats / corpus.* / jobs.* / logs.* / kvs.* / vectors.*).
- Logs through `OperationLogger` (in `hero_embedder_lib::logging`), which records `LogEntry { id, timestamp, operation, status, namespace, message, duration_ms, metadata }` in a circular buffer and (best-effort) forwards to `hero_proc`. The current logger does not know about caller identity (mycelium IPv6) and does not count tokens.
The new daemon must:
- Bind `$HERO_SOCKET_DIR/hero_embedderd/rpc.sock` so that, by Hero socket strategy, all external traffic is forced through `hero_router`. To enforce "mycelium only", the daemon validates that each request's `X-Forwarded-For` (or the mycelium IPv6 supplied by `hero_router`) is in the local mycelium subnet, as reported by the mycelium SDK at startup (`get_info().node_subnet`) and on demand (`get_public_key_from_ip`).
- Be stateless: no `--hero-os-id`, no auth headers (`X-Hero-Claims`), no user store. The only persistent side effect is structured logs.
- Emit a `UsageLogEntry { ts, src_ipv6, method, tokens_in, tokens_out, embeddings_processed, embeddings_stored, embeddings_not_stored, duration_ms, status }` to: (a) `tracing::info!` with stable JSON fields (machine-readable), and (b) `hero_proc` via the existing `ProcClient::log` helper under source `hero_embedderd.usage.<method>` for downstream aggregation.
Requirements
- The crate `hero_embedderd` lives in `crates/hero_embedderd/` and is a workspace member.
- The binary is named `hero_embedderd` (matches the issue title verbatim, no underscore-d suffix mismatch).
- Service name `hero_embedderd` → socket `~/hero/var/sockets/hero_embedderd/rpc.sock`.
- Reuses `hero_embedder_lib::{api, state::AppState, namespace, storage, ml, logging, mcp, rpc, config, download, proc_client}` verbatim, with no fork of business logic.
- Same JSON-RPC method surface as `hero_embedder_server` (every method dispatched in `jsonrpc_handler` of the server is also dispatched in the daemon).
- Serves `GET /health`, `GET /openrpc.json`, `GET /.well-known/heroservice.json`, `GET /favicon.svg`. The `/mcp` endpoint is intentionally not ported (out of scope for the stateless billable RPC).
- `POST /rpc` and `POST /` enforce: caller's mycelium IPv6 ∈ local node's mycelium subnet. Non-mycelium traffic gets `403 Forbidden` (HTTP). In normal operation traffic can only arrive through the UDS via `hero_router`, so the source IP is conveyed by header.
- Each accepted request is logged with `tokens_in`, `tokens_out`, `embeddings_processed`, `embeddings_stored`, `embeddings_not_stored`, `duration_ms`, `status`. The log line is structured JSON via `tracing::info!` and (best-effort) forwarded to `hero_proc`.
- The OpenRPC discovery manifest at `/.well-known/heroservice.json` advertises service name `hero_embedderd`.
- Adds `mycelium_sdk` as a new workspace dep (git, branch=development) and the daemon depends on it.
- The `hero_os_id` model is explicitly out of scope.
- `hero_embedder_server` keeps building and running unchanged (no edits to its source).
Files to Modify/Create
- `Cargo.toml` (workspace root): add `crates/hero_embedderd` to `[workspace] members`, add `mycelium_sdk` line in `[workspace.dependencies]`.
- `crates/hero_embedderd/Cargo.toml`: new crate manifest.
- `crates/hero_embedderd/heroservice.json`: discovery manifest (`name: "hero_embedderd"`).
- `crates/hero_embedderd/openrpc.json`: copied content of `crates/hero_embedder_server/openrpc.json` so the discovery surface matches.
- `crates/hero_embedderd/favicon.svg`: same favicon as the server.
- `crates/hero_embedderd/src/main.rs`: entry point (load models, build state, build router, bind UDS, run).
- `crates/hero_embedderd/src/mycelium_guard.rs`: boots a `MyceliumClient`, captures the local node's mycelium subnet, exposes `is_mycelium_addr(ip) -> bool` and `extract_source_ipv6(headers) -> Option<Ipv6Addr>`. Provides an Axum middleware that rejects non-mycelium callers with `403`.
- `crates/hero_embedderd/src/usage_log.rs`: defines `UsageLogEntry`, the `UsageMeter` accumulator, and the `emit(...)` function that writes a structured `tracing::info!` line and forwards to hero_proc.
- `crates/hero_embedderd/src/dispatch.rs`: dispatches the JSON-RPC method to `hero_embedder_lib::api::*` (mirrors the big `match` in `hero_embedder_server::main::jsonrpc_handler`).
- `crates/hero_embedderd/Makefile`: `build`, `install`, `clean` targets matching `crates/hero_embedder_server/Makefile`.
- `crates/hero_embedderd/README.md`: short description of the daemon, how it differs from `hero_embedder_server`, the per-IP usage log format.
(No edits to `crates/hero_embedder_server/**` or to `crates/hero_embedder_lib/**`.)
Implementation Plan
Step 1: Workspace wiring
Files: `Cargo.toml` (workspace root)
- Add `"crates/hero_embedderd"` to `[workspace] members`.
- Add `mycelium_sdk = { git = "https://github.com/threefoldtech/mycelium", branch = "development" }`.
Dependencies: none
Step 2: New crate skeleton (manifest + manifests/assets)
Files: `crates/hero_embedderd/Cargo.toml`, `heroservice.json`, `openrpc.json`, `favicon.svg`, `Makefile`, `README.md`
- `Cargo.toml`: package name `hero_embedderd`, workspace inheritance, deps on `hero_embedder_lib`, `mycelium_sdk`, `tokio`, `axum`, `tower-http`, `tower`, `hyper`, `hyper-util`, `serde`, `serde_json`, `anyhow`, `tracing`, `tracing-subscriber`. Single `[[bin]] name = "hero_embedderd"`.
- `heroservice.json`: protocol openrpc, name `hero_embedderd`.
- `openrpc.json` + `favicon.svg`: byte copies of the server's files.
- `Makefile`: mirror the server's `build` / `install` / `clean` with the new binary name.
- `README.md`: stub, fleshed out in Step 7.
Dependencies: Step 1
Step 3: Mycelium guard module
Files: `crates/hero_embedderd/src/mycelium_guard.rs`
- `MyceliumGuard { node_subnet, client }`.
- `init()` connects via the mycelium SDK and reads `get_info().node_subnet`.
- `extract_source_ipv6(headers)` reads `X-Forwarded-For`, then `X-Real-IP`.
- `is_mycelium(addr)` does prefix containment.
- `require_mycelium` rejects non-mycelium callers with 403 and stores the resolved IPv6 in request extensions for downstream handlers.
- `HERO_EMBEDDERD_REQUIRE_MYCELIUM=1` (default) means the daemon refuses to start if mycelium is unreachable.
Dependencies: Step 2
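The header extraction and prefix containment described in this step can be sketched with the standard library alone. This is a minimal sketch, not the module's actual code: the `Subnet` type is hypothetical, headers are modelled as plain `(name, value)` pairs instead of an Axum `HeaderMap`, and taking the first hop of `X-Forwarded-For` is an assumption about how `hero_router` fills that header.

```rust
use std::net::{IpAddr, Ipv6Addr};

/// A subnet as prefix + length, e.g. the node subnet reported by mycelium
/// (hypothetical type; the real SDK type may differ).
#[derive(Debug, Clone, Copy)]
pub struct Subnet {
    pub prefix: Ipv6Addr,
    pub len: u8, // prefix length in bits, 0..=128
}

impl Subnet {
    /// True when `addr` shares the first `len` bits with `prefix`.
    pub fn contains(&self, addr: Ipv6Addr) -> bool {
        let len = u32::from(self.len.min(128));
        // A /0 matches everything; this also avoids shifting a u128 by 128 bits.
        let mask: u128 = if len == 0 { 0 } else { u128::MAX << (128 - len) };
        (u128::from(addr) & mask) == (u128::from(self.prefix) & mask)
    }
}

/// Pick the caller's IPv6 from `X-Forwarded-For`, then `X-Real-IP`.
/// Header names are compared case-insensitively.
pub fn extract_source_ipv6(headers: &[(&str, &str)]) -> Option<Ipv6Addr> {
    for wanted in ["x-forwarded-for", "x-real-ip"] {
        let value = headers
            .iter()
            .find(|(name, _)| name.eq_ignore_ascii_case(wanted))
            .map(|(_, v)| *v);
        if let Some(v) = value {
            // X-Forwarded-For may be a comma-separated chain; take the first hop
            // (assumption: the router puts the original caller first).
            let first = v.split(',').next().unwrap_or("").trim();
            if let Ok(IpAddr::V6(ip)) = first.parse::<IpAddr>() {
                return Some(ip);
            }
        }
    }
    None
}
```

A middleware built on these two pieces would answer `403` whenever `extract_source_ipv6` returns `None` or `contains` returns `false`.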
Step 4: Usage log module
Files: `crates/hero_embedderd/src/usage_log.rs`
- `UsageMeter { tokens_in, tokens_out, embeddings_processed, embeddings_stored, embeddings_not_stored }`.
- `UsageLogEntry { ts_ms, src_ipv6, method, tokens_in, tokens_out, embeddings_*, duration_ms, status }`.
- `emit(entry, proc_client)`: `tracing::info!` with target `hero_embedderd::usage` + best-effort hero_proc log under source `hero_embedderd.usage.<method>`.
- `meter_from_params(method, params)`: pre-fills `tokens_in` for `embed` / `rerank` / `search` with an approximate whitespace token count (documented as approximate in README).
- `meter_from_result(method, result, meter)`: parses the result JSON to extract counts (`embed` → `embeddings_processed`; `index.add` / `vectors.add` → `embeddings_stored` + `embeddings_not_stored`; `search` / `vectors.search` → `embeddings_processed`).
Dependencies: Step 2
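The meter and the approximate token count from this step could look roughly like the sketch below. It is std-only for illustration: `approx_tokens`, `meter_from_texts` and `usage_json_line` are hypothetical helpers, the input texts are passed in directly rather than pulled out of JSON params, and the JSON line is hand-formatted where the real module would presumably go through `tracing` and `serde_json`.

```rust
use std::net::Ipv6Addr;

/// Counters gathered while handling one request (field names from the spec).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct UsageMeter {
    pub tokens_in: u64,
    pub tokens_out: u64,
    pub embeddings_processed: u64,
    pub embeddings_stored: u64,
    pub embeddings_not_stored: u64,
}

/// Approximate token count: the number of whitespace-separated words.
pub fn approx_tokens(text: &str) -> u64 {
    text.split_whitespace().count() as u64
}

/// Pre-fill `tokens_in` for the methods that carry input text.
pub fn meter_from_texts(method: &str, texts: &[&str]) -> UsageMeter {
    let mut meter = UsageMeter::default();
    if matches!(method, "embed" | "rerank" | "search") {
        meter.tokens_in = texts.iter().map(|t| approx_tokens(t)).sum();
    }
    meter
}

/// Render one usage entry as a single JSON line.
/// Assumes `method` and `status` contain nothing that needs JSON escaping.
pub fn usage_json_line(
    ts_ms: u64,
    src_ipv6: Ipv6Addr,
    method: &str,
    m: &UsageMeter,
    duration_ms: u64,
    status: &str,
) -> String {
    format!(
        "{{\"ts_ms\":{},\"src_ipv6\":\"{}\",\"method\":\"{}\",\"tokens_in\":{},\"tokens_out\":{},\
         \"embeddings_processed\":{},\"embeddings_stored\":{},\"embeddings_not_stored\":{},\
         \"duration_ms\":{},\"status\":\"{}\"}}",
        ts_ms, src_ipv6, method, m.tokens_in, m.tokens_out,
        m.embeddings_processed, m.embeddings_stored, m.embeddings_not_stored,
        duration_ms, status
    )
}
```

A whitespace count will generally differ from what the embedding model's tokenizer reports, which is why the spec treats `tokens_in` as approximate.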
Step 5: Dispatch module
Files: `crates/hero_embedderd/src/dispatch.rs`
- `dispatch(state, request) -> (response, method_name)` mirrors the `match` block in `hero_embedder_server::main::jsonrpc_handler` verbatim, dispatching to `hero_embedder_lib::api::*`. Excludes the `/mcp` path (out of scope).
Dependencies: Step 2
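The shape of such a dispatcher, returning both the response and the method name so the caller can label the usage line, might look like this. It is a simplified sketch: `RpcRequest`, `RpcOutcome`, `api_health` and `api_embed` are hypothetical stand-ins (the real handlers live in `hero_embedder_lib::api` and take the shared state), and JSON is handled as raw strings to avoid external crates.

```rust
/// Minimal stand-in for a parsed JSON-RPC request (hypothetical shape).
pub struct RpcRequest {
    pub id: u64,
    pub method: String,
    pub params: String, // raw JSON params, left unparsed in this sketch
}

/// Outcome of a call: a raw JSON result or a JSON-RPC error.
pub enum RpcOutcome {
    Result(String),
    Error { code: i64, message: &'static str },
}

// Hypothetical handlers standing in for `hero_embedder_lib::api::*`.
fn api_health() -> String {
    "{\"status\":\"ok\"}".to_string()
}

fn api_embed(params: &str) -> String {
    format!("{{\"echo\":{}}}", params)
}

/// Route the request by method name and return (response body, method name).
pub fn dispatch(req: &RpcRequest) -> (String, String) {
    let outcome = match req.method.as_str() {
        "health" => RpcOutcome::Result(api_health()),
        "embed" => RpcOutcome::Result(api_embed(&req.params)),
        // -32601 is the JSON-RPC 2.0 "Method not found" error code.
        _ => RpcOutcome::Error { code: -32601, message: "Method not found" },
    };
    let body = match outcome {
        RpcOutcome::Result(result) => format!(
            "{{\"jsonrpc\":\"2.0\",\"id\":{},\"result\":{}}}",
            req.id, result
        ),
        RpcOutcome::Error { code, message } => format!(
            "{{\"jsonrpc\":\"2.0\",\"id\":{},\"error\":{{\"code\":{},\"message\":\"{}\"}}}}",
            req.id, code, message
        ),
    };
    (body, req.method.clone())
}
```

Returning the method name alongside the response keeps the accounting code out of the dispatcher itself.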
Step 6: Daemon entrypoint (main.rs)
Files: `crates/hero_embedderd/src/main.rs`
- Read `EMBEDDER_MODELS` / `EMBEDDER_DATA`, macOS dylib auto-detect, connect `ProcClient`, ensure models, load embedders + reranker, open namespaces, build `AppState`.
- Init `MyceliumGuard` (fail loud if mycelium is required and unreachable).
- Build the router (`/health`, `/openrpc.json`, `/.well-known/heroservice.json`, `/favicon.svg`, `POST /rpc`, `POST /`), apply the `require_mycelium` middleware globally, plus permissive CORS.
- `rpc_handler` reads `SourceIpv6` from request extensions, calls `dispatch::dispatch`, builds a `UsageMeter` from params + result, emits a usage log line, returns the JSON-RPC response.
- Bind the UDS at `$HERO_SOCKET_DIR/hero_embedderd/rpc.sock`. Mirror the server's accept-loop and SIGINT/SIGTERM shutdown.
Dependencies: Steps 3, 4, 5
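Two small startup decisions from this step, where the socket goes and whether mycelium is mandatory, can be sketched as pure functions. The helper names are hypothetical, and the fallback to `~/hero/var/sockets` when `HERO_SOCKET_DIR` is unset is an assumption inferred from the socket path quoted in the requirements.

```rust
use std::path::PathBuf;

/// Resolve the daemon's UDS path.
/// Falls back to `<home>/hero/var/sockets` when no socket dir is given
/// (assumed default, inferred from the spec's example path).
pub fn socket_path(hero_socket_dir: Option<&str>, home: &str) -> PathBuf {
    let base = match hero_socket_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(home).join("hero").join("var").join("sockets"),
    };
    base.join("hero_embedderd").join("rpc.sock")
}

/// Interpret `HERO_EMBEDDERD_REQUIRE_MYCELIUM`: only an explicit "0"
/// turns the requirement off; unset or anything else keeps the default (on).
pub fn require_mycelium(flag: Option<&str>) -> bool {
    !matches!(flag.map(str::trim), Some("0"))
}
```

In `main` these would typically be fed from the environment, for example `socket_path(std::env::var("HERO_SOCKET_DIR").ok().as_deref(), &home)`, which keeps the logic testable without touching process state.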
Step 7: README + verification build
Files: `crates/hero_embedderd/README.md`
- README covers: how the daemon differs from `hero_embedder_server`, mycelium-only enforcement, structured usage log line shape, env vars (`HERO_SOCKET_DIR`, `EMBEDDER_MODELS`, `EMBEDDER_DATA`, `MYCELIUM_RPC_SOCKET`, `HERO_EMBEDDERD_REQUIRE_MYCELIUM`), example `tracing` line.
- Verify that `cargo build -p hero_embedderd`, `cargo build -p hero_embedder_server`, and `cargo build` all succeed.
Dependencies: Step 6
Acceptance Criteria
- `cargo build -p hero_embedderd` succeeds from a clean workspace.
- `cargo build -p hero_embedder_server` still succeeds (no regression).
- `cargo build` (whole workspace) succeeds.
- `cargo test -p hero_embedder_lib` still passes (no lib changes).
- Running `hero_embedderd` with mycelium up creates `~/hero/var/sockets/hero_embedderd/rpc.sock` and serves `/health`, `/openrpc.json`, `/.well-known/heroservice.json` with `name: "hero_embedderd"`.
- Running `hero_embedderd` with mycelium unreachable exits with a clear error.
- A `POST /rpc` carrying an `X-Forwarded-For` outside the local mycelium subnet returns `403`.
- A `POST /rpc` with a valid mycelium-IPv6 `X-Forwarded-For` is processed and emits exactly one structured usage line via `tracing` (target `hero_embedderd::usage`) containing source IPv6, method, tokens_in, tokens_out, embeddings_* counters and duration_ms.
- No file under `crates/hero_embedder_server/` or `crates/hero_embedder_lib/` is modified.
Notes
- The issue title is `hero_embedderd` (concatenated `d`). I am keeping that exact name throughout (crate, binary, service-socket-dir).
- The socket directory is `hero_embedderd/` rather than co-tenanting under `hero_embedder/`, so it can run side by side with `hero_embedder_server`.
- External traffic arrives via `hero_router`, which forwards the original mycelium IPv6 in `X-Forwarded-For`. The daemon does NOT read the peer credential off the UDS itself.
- Token counting is an approximate whitespace count for `tokens_in`, documented in README. A v2 follow-up issue can plumb a `count_tokens(...)` method into the lib.
- No `X-Hero-Claims` enforcement, no `hero_context` checking. The mycelium subnet check IS the access control.
- Think of `hero_embedder_server` as the full-featured local-admin RPC, and `hero_embedderd` as the stateless billable surface for mycelium peers. They share `hero_embedder_lib`.
Test Results
`cargo check` (workspace)
- `cargo check -p hero_embedderd`: PASS (0 warnings, 0 errors)
- `cargo check -p hero_embedder_server`: PASS (1 pre-existing warning, 0 errors)
- `cargo check` (full workspace): PASS (pre-existing warnings in `hero_embedder_app`, 0 errors)
`cargo test`
- `cargo test -p hero_embedderd`: PASS, 0 tests (the daemon is a thin wrapper around `hero_embedder_lib`; there are no daemon-specific unit tests yet)
- `cargo test -p hero_embedder_lib --lib`: PASS, 16 passed, 0 failed, 0 ignored
Pre-existing test failure (out of scope)
`cargo test -p hero_embedder_lib` (without `--lib`) fails to compile the integration test target `tests/auth_integration_test.rs` because `jsonwebtoken` is not declared as a dev-dependency in `crates/hero_embedder_lib/Cargo.toml`. Verified pre-existing on a clean `development` branch (`git stash` + re-run reproduces the same compile error). Not caused by, and not in scope of, this PR. Worth a separate small follow-up to add `jsonwebtoken = { version = "...", default-features = false, features = ["..."] }` under `[dev-dependencies]`.
Summary
The new crate compiles cleanly. The pre-existing library and server crates continue to compile and pass all 16 unit tests. No regression introduced.
Implementation Summary
A new crate `hero_embedderd` was added that exposes the same JSON-RPC 2.0 surface as `hero_embedder_server` but is stateless and mycelium-only.
Files created
- `crates/hero_embedderd/Cargo.toml`: new workspace member; depends on `hero_embedder_lib` (path) and `mycelium_sdk` (workspace dep, git).
- `crates/hero_embedderd/heroservice.json`: discovery manifest; service name `hero_embedderd`.
- `crates/hero_embedderd/openrpc.json`: copy of the server's OpenRPC spec (same RPC method surface).
- `crates/hero_embedderd/favicon.svg`: copy of the server's favicon.
- `crates/hero_embedderd/Makefile`: `build` / `check` / `test` / `clippy` / `fmt` targets.
- `crates/hero_embedderd/README.md`: full documentation (how it differs from the server, env vars, usage log line shape, token-counting accuracy notes).
- `crates/hero_embedderd/src/mycelium_guard.rs`: boots a `MyceliumClient`, reads the local node's mycelium subnet via `get_info()`, and provides:
  - `MyceliumGuard::is_mycelium(addr)`: cheap prefix containment check.
  - `extract_source_ipv6(headers)`: pulls the source IPv6 from `X-Forwarded-For`, then `X-Real-IP`.
  - `require_mycelium` Axum middleware: rejects non-mycelium callers with `403`, stashes the verified `SourceIpv6` in request extensions for downstream use.
- `crates/hero_embedderd/src/usage_log.rs`: `UsageMeter`, `UsageLogEntry`, `meter_from_params`, `meter_from_result`, and `emit(entry, proc_client)`, which writes a structured `tracing::info!` line under target `hero_embedderd::usage` and best-effort forwards to `hero_proc` under source `hero_embedderd.usage.<method>`.
- `crates/hero_embedderd/src/dispatch.rs`: JSON-RPC method dispatcher, mirroring the server's `match` block (excluding the `/mcp` path, which is intentionally out of scope for the stateless billable surface).
- `crates/hero_embedderd/src/main.rs`: daemon entrypoint (tracing init, model + reranker + namespace + cache load, `AppState` build, `MyceliumGuard::init()`, Axum router with `require_mycelium` middleware applied globally, UDS bind at `$HERO_SOCKET_DIR/hero_embedderd/rpc.sock`, accept loop with SIGINT/SIGTERM shutdown).
Files modified
- `Cargo.toml` (workspace root): added `crates/hero_embedderd` to `[workspace] members` and added `mycelium_sdk` to `[workspace.dependencies]` (`{ git = "https://forge.ourworld.tf/geomind_code/mycelium_network.git", branch = "development_crate_layout" }`). The `development_crate_layout` branch is where `mycelium_sdk` currently lives as a workspace member; this should track `development` once the crate-layout migration lands upstream.
Files NOT modified
- `crates/hero_embedder_server/**`: untouched; the existing server keeps building and running.
- `crates/hero_embedder_lib/**`: untouched; all 16 lib unit tests still pass.
Behavior
- Listens on `$HERO_SOCKET_DIR/hero_embedderd/rpc.sock` (no TCP listener).
- Every request must arrive via `hero_router` carrying an `X-Forwarded-For` whose IPv6 is in the local mycelium subnet, otherwise it gets `403 Forbidden`.
- Each accepted request emits one structured usage line via `tracing` (target `hero_embedderd::usage`) and forwards it best-effort to `hero_proc`.
- Stateless: no `hero_context`, no `X-Hero-Claims`, no user / auth handling. Billing is purely per source mycelium IPv6.
- `HERO_EMBEDDERD_REQUIRE_MYCELIUM=0` (default `1`) allows startup when the mycelium daemon socket is missing; in that mode all requests are rejected as non-mycelium.
Test results
- `cargo check -p hero_embedderd`: PASS (0 warnings, 0 errors).
- `cargo check -p hero_embedder_server`: PASS (1 pre-existing warning, 0 errors).
- `cargo check` (full workspace): PASS.
- `cargo test -p hero_embedder_lib --lib`: PASS, 16 passed, 0 failed.
- `cargo test -p hero_embedderd`: PASS, 0 tests.
Notes / follow-ups
- `tokens_in` is approximate in v1 (whitespace count). A v2 follow-up can plumb a `count_tokens(...)` helper into `hero_embedder_lib` for exact tokenization.
- The `mycelium_sdk` git dep tracks `development_crate_layout`; switch to `development` once the crate-layout migration lands upstream in `mycelium_network`.
- `crates/hero_embedder_lib/tests/auth_integration_test.rs` does not compile because `jsonwebtoken` is not declared as a dev-dependency. Out of scope for this PR; worth a separate one-line follow-up.