Correctness: Domain supervisor kills server on any domain socket failure #119

Open
opened 2026-05-11 23:08:58 +00:00 by thabeta · 0 comments

Severity: Medium

Location

crates/hero_aibroker_server/src/main.rs, domain_supervisor

Finding

The server uses select_all on the domain accept loops. select_all resolves as soon as the first task finishes, so if ANY domain socket's accept loop fails, the entire server shuts down:

let domain_supervisor = async {
    let (res, _, _) = futures::future::select_all(domain_handles).await;
    match res {
        Ok(inner) => inner,
        Err(join_err) => Err(anyhow::anyhow!("domain accept task panicked: {}", join_err)),
    }
};

Impact

  • A single domain socket failure kills the entire broker
  • No graceful degradation (other domains could continue serving)
  • No distinction between transient and permanent failures
  • No automatic restart of failed domain sockets

Recommendation

  • Log and restart failed domain accept loops
  • Allow the server to continue serving other domains
  • Distinguish between transient errors (e.g. EMFILE) and permanent errors
  • Add domain-level health checks
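
A minimal sketch of the first three recommendations, assuming a simplified synchronous model: the real server runs async accept tasks, so this only illustrates the classify-and-restart logic. The names `Failure`, `classify` and `supervise_domain` are hypothetical and do not exist in the codebase. The errno constants are assumed Linux/macOS values, and which errors count as transient is a judgment call.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Hypothetical classification of an accept-loop failure.
#[derive(Debug, PartialEq)]
enum Failure {
    /// Likely to clear on its own (e.g. file descriptor exhaustion).
    Transient,
    /// Retrying is not expected to help (e.g. permission denied).
    Permanent,
}

// Assumed errno values: EMFILE = 24 and ENFILE = 23 on Linux and macOS.
const EMFILE: i32 = 24;
const ENFILE: i32 = 23;

/// Decide whether an accept error is worth retrying.
fn classify(err: &io::Error) -> Failure {
    // Check the raw errno first, since EMFILE/ENFILE have no dedicated ErrorKind.
    match err.raw_os_error() {
        Some(EMFILE) | Some(ENFILE) => return Failure::Transient,
        _ => {}
    }
    match err.kind() {
        io::ErrorKind::ConnectionAborted
        | io::ErrorKind::Interrupted
        | io::ErrorKind::WouldBlock => Failure::Transient,
        _ => Failure::Permanent,
    }
}

/// Run one domain's accept loop, restarting it after transient errors.
/// Gives up on a permanent error or once `max_restarts` is used up, and
/// returns that error for this domain only, so the caller can keep
/// serving the other domains.
fn supervise_domain<F>(
    name: &str,
    max_restarts: u32,
    backoff: Duration,
    mut accept_loop: F,
) -> io::Result<()>
where
    F: FnMut() -> io::Result<()>,
{
    let mut restarts = 0;
    loop {
        match accept_loop() {
            // The loop ended cleanly (e.g. shutdown requested).
            Ok(()) => return Ok(()),
            Err(err) if classify(&err) == Failure::Transient && restarts < max_restarts => {
                restarts += 1;
                eprintln!(
                    "domain {name}: transient accept error ({err}), restart {restarts}/{max_restarts}"
                );
                thread::sleep(backoff);
            }
            Err(err) => {
                eprintln!("domain {name}: giving up: {err}");
                return Err(err);
            }
        }
    }
}
```

Under this design, each domain would get its own supervisor, and the top-level task would treat one supervisor's `Err` as the loss of a single domain (logged and reported through a health check) rather than a reason to stop the whole broker.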
Reference
lhumina_code/hero_aibroker#119