[P2] Verification requires a root-runnable command — nested-manifest builds false-negative #43

Open
opened 2026-05-23 21:52:26 +00:00 by thabeta · 0 comments
Owner

Problem
The zero-hardcoding verdict runs the LLM's verify command from the run root. When the model nests the project in a subdirectory (e.g. a crate in `stackcrate/`), `cargo test` at the root exits non-zero, so a run whose tests may actually pass is reported as failed (a false negative). This is an accepted trade-off of "no hardcoding", but it still bites real runs (job-233/244).

Evidence

  • `crates/hero_shrimp_engine/src/verification/runner.rs` (`command_succeeds` runs from workspace root).

Proposed fix
Either nudge the executor to emit a root-runnable command, or let the contract carry a working directory for the command. Neither option should reintroduce language/manifest detection.
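
A minimal sketch of the second option, assuming the contract can carry an optional relative working directory. The `VerifyCommand` struct, its fields, the `resolve_working_dir` helper, and the `command_succeeds` signature shown here are all hypothetical and are not taken from `runner.rs`. The directory is supplied by the model alongside the command, so the runner never inspects manifests or guesses a language; it only checks that the path stays inside the run root.

```rust
use std::io;
use std::path::{Component, Path, PathBuf};
use std::process::Command;

/// Hypothetical contract entry: the command plus an optional working
/// directory, relative to the run root, both emitted by the model.
pub struct VerifyCommand {
    pub program: String,
    pub args: Vec<String>,
    pub working_dir: Option<PathBuf>,
}

/// Resolve the directory to run in. `None` keeps today's behaviour (run root).
/// Absolute paths and `..` components are rejected so the command cannot
/// escape the run root.
pub fn resolve_working_dir(run_root: &Path, rel: Option<&Path>) -> io::Result<PathBuf> {
    let Some(rel) = rel else {
        return Ok(run_root.to_path_buf());
    };
    let stays_inside = rel
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir));
    if !stays_inside {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "working_dir must be a relative path inside the run root",
        ));
    }
    Ok(run_root.join(rel))
}

/// Assumed shape of the check: run the command in the resolved directory
/// and report whether it exited with status zero.
pub fn command_succeeds(run_root: &Path, cmd: &VerifyCommand) -> io::Result<bool> {
    let dir = resolve_working_dir(run_root, cmd.working_dir.as_deref())?;
    let status = Command::new(&cmd.program)
        .args(&cmd.args)
        .current_dir(dir)
        .status()?;
    Ok(status.success())
}
```

With the first option no contract change is needed: for the nested-crate case the executor would be nudged to emit something like `cargo test --manifest-path stackcrate/Cargo.toml`, which runs from the root as-is.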


Filed from a comparative audit of Hero Shrimp vs Qwen-Code / kimi-cli / picoclaw (2026-05-23). Severity in title: P0=correctness/trust, P1=reliability/UX, P2=cleanup.

Reference
lhumina_code/hero_shrimp#43