[bug] hero_rpc OpenRpcTransport UDS drops X-Hero-Context/X-Hero-Claims headers — latent cross-service breakage #233
Summary
In hero_rpc, the OpenRpcTransport::post_raw_json_with_headers method silently drops all extra headers (e.g. X-Hero-Context, X-Hero-Claims) when the transport is a Unix Domain Socket. Only the HTTP transport branch forwards them.
Tracked in: lhumina_code/hero_rpc#42
Root cause
http_post_unix (transport.rs:354) has no extra_headers parameter. The UnixSocket branch of post_raw_json_with_headers (line 248) calls it without passing the headers.
This was a known gap when the function was introduced in commit ed6e7eb (April 13). The comment there reads "Unix-socket transport currently drops the headers — raw socket framing has no place to carry them", but this is incorrect: we use HTTP-over-UDS, so headers can and must be carried in the HTTP framing.
Current impact
No repos are broken today because the only real caller of post_raw_json_with_headers is OsisClient (in openrpc_http_client_lib), which uses OpenRpcTransport::http (via hero_router), not UDS. So X-Hero-Context reaches the service correctly through that path.
When it becomes a real problem
The moment any service makes a direct server-to-server OSIS call over UDS (bypassing hero_router) while passing a context, e.g. through a future OsisClient::new_unix() constructor, the context header will be silently dropped. The service will return default-context data without any error, making the failure invisible and hard to diagnose.
Fix (small, isolated to one file)
All changes are in hero_rpc/crates/openrpc/src/transport.rs:
- Add an extra_headers: &[(&str, &str)] parameter to http_post_unix
- Apply those headers in the hyper::Request::builder() chain inside http_post_unix
- Pass extra_headers through in the UnixSocket branch of post_raw_json_with_headers
- Pass &[] from call_unix_http (no headers needed there)
No public API changes. No other files need to change.
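As a rough illustration of the change listed above, here is a minimal, std-only sketch of how extra headers fit into HTTP-over-UDS framing. It is not the actual transport.rs code: the real function builds its request with hyper, while this sketch renders the request text by hand. build_unix_http_request, the Host value and the fixed Content-Type are hypothetical choices made for the example; only the extra_headers parameter shape and the &[] call from call_unix_http come from the issue text.

```rust
/// Hypothetical stand-in for the request-building part of `http_post_unix`.
/// Renders the bytes that would be written to the Unix socket.
/// `extra_headers` is the parameter proposed in the fix list.
fn build_unix_http_request(path: &str, body: &str, extra_headers: &[(&str, &str)]) -> String {
    // Standard request line and fixed headers (values assumed for this sketch).
    let mut request = format!(
        "POST {} HTTP/1.1\r\nHost: localhost\r\nContent-Type: application/json\r\nContent-Length: {}\r\n",
        path,
        body.len()
    );
    // The proposed change: each extra header becomes one more header line.
    // A real implementation should reject names or values containing CR/LF.
    for (name, value) in extra_headers {
        request.push_str(&format!("{}: {}\r\n", name, value));
    }
    // A blank line ends the header section, then the JSON body follows.
    request.push_str("\r\n");
    request.push_str(body);
    request
}

/// Hypothetical stand-in for `call_unix_http`: it has no context to forward,
/// so it passes an empty header slice.
fn build_plain_unix_request(path: &str, body: &str) -> String {
    build_unix_http_request(path, body, &[])
}
```

Calling build_unix_http_request("/rpc", "{}", &[("X-Hero-Context", "tenant-a")]) yields a request whose header section contains an X-Hero-Context line, which is the point of the proposal: the HTTP framing used over the socket has room for the headers.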
@despiegk @timur — question before this gets fixed:
If the architectural rule is that all calls go through hero_router, then the scenario described above (a service calling OSIS directly over UDS with X-Hero-Context) should never happen in the first place. OsisClient::new_unix() would be an architectural violation, and the broken header forwarding would actually enforce the constraint correctly: you get silent wrong behavior if you bypass the router, which signals you are doing it wrong.
In that case, hero_rpc#42 does not need to be fixed, and the right resolution is to close it as by design and document that all context-aware calls must route through hero_router.
Is that the intent, or are there legitimate cases where a service should call another service directly over UDS while carrying a context?
Root cause found — and it is different from all previous analyses
After tracing the full call chain from hero_biz through hero_router down to the OSIS dispatch layer, the actual bug was found in hero_rpc/crates/osis/src/rpc/dispatch.rs.
What was actually broken
dispatch_jsonrpc_auto_context read the context name exclusively from params._context in the JSON body, defaulting to "root" if absent.
OsisClient sends context via the X-Hero-Context HTTP header, not in the body. RequestContext::from_headers correctly parses the header into hero_context_name, but dispatch_jsonrpc_auto_context never consulted it, so every call silently fell back to "root".
Fix applied
dispatch_jsonrpc_auto_context now prefers request_context.hero_context_name (from the header) and falls back to params._context for backwards compatibility.
Why the previous analyses went off track
- hero_rpc#42 / home#233: focused on whether UDS transport drops X-Hero-Context headers. The transport was a red herring: the header was being forwarded correctly all the way to the server. The server just never used it for context selection.
- The UDS header fix was architecturally wrong anyway: per the herolib_openrpc_authorize and hero_context skills, all context-aware calls go through hero_router, which injects headers. Direct UDS calls are trusted internal calls with no context header, so the UDS header-dropping behaviour is correct by design.
- The Cargo.lock / endpoint format analyses: also red herrings. OsisClient was already using the correct endpoint and the correct header. The problem was one layer deeper, in how the server processed that header.
Summary
The bug was always in the dispatch layer, not in the transport. Fixed in hero_rpc/crates/osis/src/rpc/dispatch.rs.
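For reference, the context-selection order described under "Fix applied" can be sketched roughly as follows. This is an illustration under assumptions, not the code in dispatch.rs: the RequestContext struct here is a cut-down stand-in with a single field, resolve_context_name is a hypothetical helper, and params._context is modelled as a plain Option rather than a JSON lookup.

```rust
/// Cut-down, hypothetical stand-in for the real RequestContext.
/// Only the field named in this thread is modelled.
struct RequestContext {
    /// Context name parsed from the X-Hero-Context header, if present.
    hero_context_name: Option<String>,
}

/// Hypothetical helper showing the lookup order after the fix:
/// 1. the header-derived name, 2. params._context from the body, 3. "root".
fn resolve_context_name<'a>(
    request_context: &'a RequestContext,
    params_context: Option<&'a str>,
) -> &'a str {
    request_context
        .hero_context_name
        .as_deref()          // Option<String> -> Option<&str>
        .or(params_context)  // fall back to the body field for older callers
        .unwrap_or("root")   // default used when neither source is set
}
```

Before the fix, the equivalent logic was only params_context.unwrap_or("root"), which explains why a client that sent the context in the header alone always landed in "root".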