[P1] isIncompleteReply() relies on 7 hardcoded magic strings #40
Labels
No labels
prio_critical
prio_low
type_bug
type_contact
type_issue
type_lead
type_question
type_story
type_task
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
lhumina_code/hero_shrimp#40
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Problem
The UI decides whether a turn hit the step limit / is incomplete by regex-matching ~7 fixed phrases ("step limit", "iteration budget", …). If the model's wording drifts, a step-limited turn is shown as complete and the operator thinks the work finished.
Evidence
crates/hero_shrimp_web/ui/src/store.tsisIncompleteReply().Proposed fix
Have the server stamp an explicit
incomplete/stop_reasonfield on the turn/job result and key the UI off that, not prose.Filed from a comparative audit of Hero Shrimp vs Qwen-Code / kimi-cli / picoclaw (2026-05-23). Severity in title: P0=correctness/trust, P1=reliability/UX, P2=cleanup.
Implementation Spec for Issue #40
Objective
Replace the UI's brittle prose-matching (
isIncompleteReply()regex on ~8 hardcoded phrases) with an explicit, structuredincomplete(boolean) andstop_reason(string) field stamped by the server on every turn/job reply. The UI keys its Continue/Retry affordance off that field, falling back to the existing prose heuristic only when the field is absent (backward compatibility with older server responses).Requirements
incomplete: boolandstop_reason: stringon the structured results the UI consumes: the synchronousmessage.sendchat response, thechat.completeSSE event, theturn:endSSE event, and the autonomy/proof job result (details+job.detail).agent_step_limit_reachedevent already recorded intoexecution_control(surfaced bystep_limit_event_from_execution_control), the existingreply_indicates_incomplete_run()matcher, and theproof.step_limit.reachedflag.reply_indicates_incomplete_run()becomes the single, server-owned text fallback (one place, not eight in the UI).incomplete?/stopReason?to the response TS shapes; at all 3 call sites prefer the server field; keepisIncompleteReply()only as a fallback when the server field isundefined.Files to Modify/Create
All modifications (no new files):
crates/hero_shrimp_server/src/rpc/methods/session.rs— stampincomplete/stop_reasonon therun_simple_chat_turnresponse +chat.completeevent.crates/hero_shrimp_engine/src/agent_core/agent/core.rs— stampincomplete/stop_reasonon theturn:endSSE event.crates/hero_shrimp_server/src/rpc/methods/job/reconcile.rs(and/orjob/contract.rs) — stampincomplete/stop_reasoninto the persisted jobdetailssojob.detailcarries it.crates/hero_shrimp_server/src/rpc/methods/job/proof_run.rs—reply_indicates_incomplete_run()is the reused single text fallback; expose a small helper returning the stop_reason string.crates/hero_shrimp_web/ui/src/store.ts— add TS fields; consume server field at the 3 call sites; demoteisIncompleteReply()to fallback.Implementation Plan
Step 1: Add a server-side helper that derives
(incomplete, stop_reason)from existing signalsFiles:
crates/hero_shrimp_server/src/rpc/methods/job/proof_run.rs(nearreply_indicates_incomplete_run, ~line 93).pub fn turn_incomplete_signal(reply: &str) -> Option<&'static str>returningSome(stop_reason)when incomplete, elseNone. Reuses the existingreply_indicates_incomplete_run(reply)matcher (no new phrases):Some("step_limit")on match,Noneotherwise.step_limit_event/proof.step_limit.reachedis available, that takes precedence.Dependencies: none.
Step 2: Stamp
incomplete/stop_reasonon the synchronous chat response andchat.completeeventFiles:
crates/hero_shrimp_server/src/rpc/methods/session.rs(inrun_simple_chat_turn,Ok(raw_reply)arm, ~lines 967-997).let reply = strip_fake_tool_call_envelopes(&raw_reply);computelet stop_reason = crate::rpc::methods::job::proof_run::turn_incomplete_signal(&reply);complete_payload(chat.complete, ~line 969) and theresponseJSON (~line 984):"incomplete": stop_reason.is_some()and"stop_reason": stop_reason(serializes tonullwhenNone).Dependencies: Step 1.
Step 3: Stamp
incomplete/stop_reasonon theturn:endSSE eventFiles:
crates/hero_shrimp_engine/src/agent_core/agent/core.rs(~lines 84-93, thematch &resultthat emitsturn:end).Ok(reply)arm, derive incompleteness from the engine's own fixed step-limit fallback sentence (engine-owned, not drift-prone). Add"incomplete": <bool>and"stop_reason": "step_limit" | null. In theErrarm:"incomplete": true, "stop_reason": "error"(already setsok: false).loop_finalize.rs:52,iteration_shell.rs:121,chat_turn.rs,autonomy_dispatch.rs) so the emitter compares against the same constant the finalizer writes. If a constant is out of scope, match the stable substring"agent step limit"only (one place, engine-owned).Dependencies: none (can share the token string with Step 1).
Step 4: Stamp
incomplete/stop_reasoninto persisted jobdetailsfor autonomy/proof jobsFiles:
crates/hero_shrimp_server/src/rpc/methods/job/reconcile.rs(terminal-details reconciliation, ~lines 339-400, 478) and/orcrates/hero_shrimp_server/src/rpc/methods/job/contract.rs(cleanup_terminal_job_details, ~line 612).details, setdetails["incomplete"] = true/details["stop_reason"] = "step_limit"whenproof.step_limit.reached == true(stamped atproof.rs:480-488) ORdetails["failure_kind"] == "agent_step_limit"(set atproof_run.rs:759) ORreply_indicates_incomplete_run(details["reply"]). Otherwiseincomplete = false/ omitstop_reason.failure_kindis cleared during reconciliation (contract.rs:538); stamp BEFORE that removal or derive fromproof.step_limit(which persists). This path has the richest structured signal and should NOT rely on prose.Dependencies: Step 1.
Step 5: Add TS fields and consume the server field at the 3 store.ts call sites
Files:
crates/hero_shrimp_web/ui/src/store.ts.Msg.incomplete?: boolean(~line 43).isIncompleteReply()(lines 57-69) but re-document as the LEGACY FALLBACK used only when the server omits the field.trackJobUntilDoneterminal):const serverIncomplete = detailJob?.details?.incomplete;thenincomplete: ok && (serverIncomplete ?? isIncompleteReply(kept)).message.sendreturn): response now carriesincomplete; setincomplete: result?.incomplete ?? isIncompleteReply(finalText)(verify the RPC-result variable name near line 1204).turn:endSSE): replaceisIncompleteReply(strippedReply || reply)with(p.incomplete ?? isIncompleteReply(strippedReply || reply)), still gated byp.ok !== false.Dependencies: Steps 2, 3, 4.
Step 6: Tests
Files:
crates/hero_shrimp_server/src/rpc/jsonrpc/tests/chunk_3.rs(nearexecution_control_surfaces_step_limit_event, line 597) andchunk_2.rs(nearstep_limit_reply_is_not_successful_completion, line 372).detailscontainsincomplete == true/stop_reason == "step_limit"whenproof.step_limit.reachedis set, and a clean reply yieldsincomplete == false/ nostop_reason. Add a unit test forturn_incomplete_signal().Dependencies: Steps 1, 4.
Acceptance Criteria
incomplete: true+stop_reason: "step_limit"on themessage.sendsync response, thechat.completeevent, theturn:endevent, and the jobdetails(viajob.detail) — without relying on UI prose matching.incomplete: false(or absentstop_reason) and renders as a complete answer.isIncompleteReply()fallback.reply_indicates_incomplete_run().Notes
EngineClient::sendreturnsResult<String>with two impls (in_process.rs:161,rpc_client.rs:199). Widening it to a struct is a large blast-radius change and is intentionally avoided — the structured signal is carried on the JSON envelopes the UI already consumes.agent_step_limit_reachedevent wherever available; only the synchronous chat turn needs the text fallback, now in one function.stop_reasonasOption<String>→nullwhen not incomplete; UI treatsnull/undefined/absent uniformly via?? fallback.Test Results
Rust —
cargo test -p hero_shrimp_serverRelevant cases:
step_limit_reply_is_not_successful_completion— passedturn_incomplete_signal_maps_step_limit_to_stop_reason(new) — passedexecution_control_surfaces_step_limit_event— passedcargo check -p hero_shrimp_engine -p hero_shrimp_server— clean.Web UI —
crates/hero_shrimp_webvite build— succeeded (99 modules transformed); built bundle regenerated and committed.tsc --noEmitreports only pre-existing type errors in unrelated files (ChatActivity, CostBadge, HeroWelcome, and store.ts lines 1375/1385reasoningText); none are introduced by this change. The production build path is esbuild via vite, which bundles cleanly.Status: PASS.
Test Results
Rust —
cargo test -p hero_shrimp_serverNew cases (all passed):
turn_incomplete_signal_maps_step_limit_to_stop_reasonstamp_incomplete_signal_from_structured_step_limitstamp_incomplete_signal_from_failure_kindstamp_incomplete_signal_clean_run_has_no_stop_reasonExisting regression cases still pass:
step_limit_reply_is_not_successful_completionexecution_control_surfaces_step_limit_eventcargo check -p hero_shrimp_engine -p hero_shrimp_server— clean.Web UI —
crates/hero_shrimp_webvite build— succeeded (99 modules transformed); built bundle regenerated and committed.tsc --noEmitreports only pre-existing type errors in unrelated code (thereasoningTextaccesses in store.ts and a few other components); none are introduced by this change.Status: PASS.
Implementation Summary
The UI no longer decides incompleteness by regex-matching model prose. The server now stamps an explicit
incomplete/stop_reasonon every turn/job reply, and the UI reads that field (prose heuristic kept only as a fallback for older daemons).Files changed
crates/hero_shrimp_server/src/rpc/methods/job/proof_run.rs— addedturn_incomplete_signal()(single text fallback, reusesreply_indicates_incomplete_run); stamp signal at job-details persistence.crates/hero_shrimp_server/src/rpc/methods/session.rs— stampincomplete/stop_reasonon the syncmessage.sendresponse andchat.completeevent.crates/hero_shrimp_engine/src/agent_core/agent/core.rs— stampincomplete/stop_reasonon theturn:endSSE event (engine-owned marker;erroron the failure arm).crates/hero_shrimp_server/src/rpc/methods/job/contract.rs—stamp_incomplete_signal_into_details()(structured step-limit → failure_kind → reply fallback); also stamped during subagent reconciliation beforefailure_kindis cleared.crates/hero_shrimp_web/ui/src/store.ts— prefer the server field at all 3 call sites;isIncompleteReply()demoted to a legacy fallback. Bundle regenerated.crates/hero_shrimp_server/src/rpc/jsonrpc/tests/chunk_2.rs,chunk_3.rs— 4 new tests.Acceptance criteria
incomplete: true/stop_reason: "step_limit"on the sync response,chat.complete,turn:end, and persisted jobdetails— without UI prose matching.incomplete: falseand nostop_reason.isIncompleteReply()fallback.reply_indicates_incomplete_run().Tests
cargo test -p hero_shrimp_server— 301 passed, 0 failed.vite buildsucceeded.Pull request opened: #96
This PR implements the changes discussed in this issue, targeting the integration branch.