[bug] hero_indexer discards X-Hero-Context — every caller can hit every Tantivy DB #20

Open
opened 2026-05-01 04:09:24 +00:00 by mik-tf · 0 comments
Owner

## Summary

`hero_indexer` parses the `X-Hero-Context` header but binds it to `_`-prefixed locals and never enforces it downstream. Every caller can read every Tantivy DB the indexer hosts. Same shape as the parallel `hero_embedder` issue.

## Source

- [`crates/hero_indexer_server/src/main.rs:275-292`](https://forge.ourworld.tf/lhumina_code/hero_indexer/src/branch/development/crates/hero_indexer_server/src/main.rs): header parsed into `_hero_context` / `_hero_claims` etc., never used.
- `handlers.rs:42-66`: request dispatch keys off DB name from request params, not from header.

## Why this matters

`hero_indexer` is a Tantivy-backed full-text (BM25) search service, sibling to `hero_embedder`'s semantic engine. Multi-tenant deployments rely on per-context DB selection at the entry handler; today every caller can hit every DB on the same socket.

## Proposed fix

Same two-layer pattern as the `hero_embedder` issue:

1. **Bind context to DB name at the entry handler**: reject or rewrite DB names that don't match the caller's context.
2. **Enforce on listing methods** so callers can only enumerate DBs under their own context prefix.

## Note: spec/dispatcher mismatch (separate issue)

`docs/specs.md` advertises `doc.update`, `index.merge`, aggregations, and `search.explain` that are **not** in `handlers.rs:42-66`. Filed as a separate docs issue.

## Severity

Medium-high. Same threat model as the embedder issue: filesystem-bound socket, but invalidates the per-context isolation claim.

## Cross-refs

- `hero_embedder` parallel issue (this session)
- `hero_aibroker` parallel issue (this session)
- [hero_demo#52: vision](https://forge.ourworld.tf/lhumina_code/hero_demo/issues/52)

Spotted during docs_hero Phase 1 source-grounded read (session 52). Reconciliation memo: `memory/investigation_roadmap_reconciliation.md`.
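## Sketch of the proposed fix

The two layers under "Proposed fix" could look roughly like the following. This is a minimal sketch, not the actual `hero_indexer` code: the names `ContextError`, `require_context`, `owns`, `authorize_db`, and `visible_dbs` are hypothetical, and the `<context>/<name>` DB naming convention is an assumption made only for illustration. The real prefix scheme should follow whatever the `hero_embedder` fix settles on.

```rust
// Hypothetical sketch: none of these names exist in hero_indexer today.
// Assumption: a context `acme` owns exactly the DBs named `acme/<name>`.

/// Reasons a request is refused before it reaches the dispatcher.
#[derive(Debug, PartialEq)]
pub enum ContextError {
    /// The X-Hero-Context value was absent or blank.
    MissingContext,
    /// The requested DB is not under the caller's context prefix.
    Forbidden { context: String, db: String },
}

/// Turn the parsed header value into a usable context, or fail closed.
fn require_context(context: Option<&str>) -> Result<&str, ContextError> {
    match context.map(str::trim) {
        Some(c) if !c.is_empty() => Ok(c),
        _ => Err(ContextError::MissingContext),
    }
}

/// True when `db` is `<context>/<non-empty name>`.
/// Matching on the full `context/` prefix keeps `acme` from owning `acmecorp/...`.
fn owns(context: &str, db: &str) -> bool {
    db.strip_prefix(context)
        .and_then(|rest| rest.strip_prefix('/'))
        .map_or(false, |name| !name.is_empty())
}

/// Layer 1: bind the caller's context to the DB name at the entry handler.
pub fn authorize_db(context: Option<&str>, db: &str) -> Result<(), ContextError> {
    let context = require_context(context)?;
    if owns(context, db) {
        Ok(())
    } else {
        Err(ContextError::Forbidden {
            context: context.to_string(),
            db: db.to_string(),
        })
    }
}

/// Layer 2: listing methods only return DBs under the caller's prefix.
pub fn visible_dbs<'a>(
    context: Option<&str>,
    all_dbs: &'a [String],
) -> Result<Vec<&'a str>, ContextError> {
    let context = require_context(context)?;
    Ok(all_dbs
        .iter()
        .map(String::as_str)
        .filter(|db| owns(context, db))
        .collect())
}
```

In this shape, the entry handler would pass the value currently bound to `_hero_context` into `authorize_db` before dispatching on the DB name from the request params, and any listing method would return `visible_dbs(...)` instead of the full set. This sketch takes the "reject" option from layer 1; rewriting the DB name to force the caller's prefix is the alternative the issue mentions and is not shown here.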
Reference
lhumina_code/hero_indexer#20