[P0] Pidfile/deploy race — new daemon exits silently while old binary keeps serving #35

Open
opened 2026-05-23 21:52:21 +00:00 by thabeta · 0 comments
Owner

Problem
When a stale pidfile rejects a new daemon's start, the new daemon exits silently and the OLD binary keeps serving. Readiness checks then report a false positive, so a bad deploy looks healthy.

Evidence

  • crates/hero_shrimp_server/src/rpc/pidfile.rs (takeover path).
  • ARCHITECTURE_CLEANUP_PLAN.md documents this as severe; redeploy must work around it (kill via /proc/*/exe, clear pidfile+socket, verify socket-owner binary).
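
The "verify socket-owner binary" step of the workaround could look roughly like the sketch below. It relies on documented Linux behavior: the `/proc/<pid>/exe` symlink gets a ` (deleted)` suffix once the original file has been unlinked or replaced. The names `ExeStatus`, `classify_exe`, and `exe_status` are hypothetical and do not come from `pidfile.rs`. How the socket owner's PID is found is left out, and `expected` is assumed to be the canonical path of the freshly deployed binary.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// How a process's executable link relates to the binary we expect.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum ExeStatus {
    /// The link points at the expected on-disk binary.
    Current,
    /// The link carries the kernel's " (deleted)" suffix: the process is
    /// still running a binary that has been unlinked or replaced.
    Deleted,
    /// The link points at some other binary.
    Other,
}

/// Classify an already-read `/proc/<pid>/exe` link target.
/// Caveat: a binary whose real name ends in " (deleted)" would be
/// misclassified; this sketch assumes no such binary exists.
pub fn classify_exe(link_target: &Path, expected: &Path) -> ExeStatus {
    if link_target.to_string_lossy().ends_with(" (deleted)") {
        ExeStatus::Deleted
    } else if link_target == expected {
        ExeStatus::Current
    } else {
        ExeStatus::Other
    }
}

/// Read `/proc/<pid>/exe` (Linux only) and classify it.
pub fn exe_status(pid: u32, expected: &Path) -> io::Result<ExeStatus> {
    let link = fs::read_link(format!("/proc/{pid}/exe"))?;
    Ok(classify_exe(&link, expected))
}
```

A redeploy script or readiness probe would treat anything other than `ExeStatus::Current` for the socket owner as a failed deploy.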

Proposed fix
Fail loudly on a stale-pidfile takeover (log the rejection, then either exit non-zero or require an explicit force-takeover), and add a readiness check that verifies the socket owner's /proc/<pid>/exe points at the new (non-deleted) binary.
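
A minimal sketch of the fail-loud start decision follows. It is not the current logic in `pidfile.rs`. The names `StartDecision`, `StartError`, and `decide_start` are hypothetical, the liveness probe is passed in as a closure, and treating an unparsable pidfile as an error is an assumption of this sketch.

```rust
use std::fmt;

/// Outcome when startup may continue. (Hypothetical type.)
#[derive(Debug, PartialEq, Eq)]
pub enum StartDecision {
    /// No pidfile, or it names a dead process: safe to write our own pid.
    Proceed,
    /// A live process holds the pidfile and the operator asked for takeover.
    ForceTakeover { old_pid: u32 },
}

/// Reasons to refuse startup; the caller should log these and exit non-zero.
#[derive(Debug, PartialEq, Eq)]
pub enum StartError {
    AlreadyRunning { old_pid: u32 },
    CorruptPidfile(String),
}

impl fmt::Display for StartError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StartError::AlreadyRunning { old_pid } => write!(
                f,
                "pidfile is held by live process {old_pid}; refusing to start without force"
            ),
            StartError::CorruptPidfile(raw) => {
                write!(f, "pidfile content {raw:?} is not a valid pid")
            }
        }
    }
}

/// Decide what to do given the pidfile contents (if any), a liveness probe,
/// and whether the operator passed an explicit force flag.
pub fn decide_start(
    pidfile: Option<&str>,
    is_alive: impl Fn(u32) -> bool,
    force: bool,
) -> Result<StartDecision, StartError> {
    let raw = match pidfile {
        Some(raw) => raw.trim(),
        None => return Ok(StartDecision::Proceed),
    };
    let old_pid: u32 = raw
        .parse()
        .map_err(|_| StartError::CorruptPidfile(raw.to_string()))?;
    if !is_alive(old_pid) {
        // Pidfile left behind by a dead process: nothing to take over.
        return Ok(StartDecision::Proceed);
    }
    if force {
        Ok(StartDecision::ForceTakeover { old_pid })
    } else {
        Err(StartError::AlreadyRunning { old_pid })
    }
}
```

The daemon's entry point would print any `StartError` to stderr and call `std::process::exit` with a non-zero code, so the deploy tooling sees the rejection instead of a silent exit.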


Filed from a comparative audit of Hero Shrimp vs Qwen-Code / kimi-cli / picoclaw (2026-05-23). Severity in title: P0=correctness/trust, P1=reliability/UX, P2=cleanup.

Reference
lhumina_code/hero_shrimp#35