[nu-demo] hero_agent prompt.rs: tool-use directive for hero_* questions needs strengthening (and reliable rebuild) #149
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Symptom
On heronu, asking hero_agent
agent.chata plain question like"What is Hero OS and what can it do?"returns a generic "lightweight operating system for all devices" answer even when:search_hero_docstool is registered and, when called explicitly ("Call search_hero_docs with query=..."), returns clean grounded Q&A content describing Hero OS correctly as a "sovereign digital workspace";hero_agent/src/tool_router.rslistssearch_hero_docsin the defaultalways_include(patch landed,strings hero_agent_server | grep always_includeconfirms).So the tool is offered to the LLM, and Claude honors explicit instructions — but it doesn't pro-actively call
search_hero_docswithout an instruction. The LLM is comfortably describing from training data when the system prompt doesn't force grounding.Two gaps
1. The system-prompt directive is too soft
hero_agent/crates/hero_agent/src/prompt.rs::build_system_promptcurrently ends with a Guidelines block:Claude (and any well-trained LLM) reads "when needed" and "first" as guidance, not mandate. It weighs "I already know this" against the guideline and picks "already know." For a documentation-grounded agent, we want the opposite: any question mentioning
hero_*names must trigger asearch_hero_docscall first, with the answer citing the doc page.Proposed replacement (pragmatic — apply on
development_mik_nu_demo, then roll intodevelopmentbehind tests):2. Rebuild story is fragile
The prompt.rs edit was made on heronu and the agent was restarted, but
strings ~/hero/bin/hero_agent_server | grep -c MANDATORYreturned0— the binary was NOT actually rebuilt with the change (cargo incremental compile likely skipped the crate because another file was the ultimatecargo build -p hero_agent_serverroot, and prompt.rs is inhero_agentlib crate).Specifically: the binary at
~/hero/bin/hero_agent_servertimestamped 00:56 was identical to a build from BEFORE the prompt.rs edit — the nu-shell install script that copied it across didn't invokecargo buildwith proper dep tracking.Action:
service_agent.nu(see #135 — still missing) must:cargo build -p hero_agent_server --release(or debug) in the hero_agent workspacestrings $binary | grep $unique_marker_from_src)~/hero/bin/Without step (2) as a guard, prompt.rs edits are silently no-op'd across rebuilds.
Verification plan (once fixed)
After rebuild + restart:
Expected: response should contain
sovereign digital workspace,browser-based, or explicit doc citation like(hero_os_guide, overview). Generic answers ("user-friendly OS", "lightweight") are a regression.Related
[nu-demo] hero_agent should support tool_choice="required" for grounded modes(filed separately — belt-and-suspenders for when system prompts alone aren't enough)Signed-off-by: mik-tf
Fixed in hero_agent commit
61d702dondevelopment.Two soft lines in
build_system_prompt()Guidelines were replaced with one explicit MANDATORY directive:Design choices that match the issue body's analysis:
(hero_os_guide, overview)) makes the grounding signal visible in the response.search_hero_docsreturns nothing, the LLM must say so rather than silently fall back to training data.Regression test —
test_mandatory_grounding_directive(new) asserts presence of the MANDATORY + search_hero_docs + Hero Vired markers in the rendered prompt. This addresses the issue body's section-2 concern ("the rebuild story is fragile — prompt.rs edits were silently no-op'd across rebuilds because the binary on disk wasn't the latest"): any future softening of the directive, OR a build-cache miss producing a stale binary that doesn't include the directive, fails CI atcargo test -p hero_agent.The service-agent rebuild concern from section 2 is otherwise resolved —
service_agent.nu(home#135) is now live andsvc_cargo_installdoes a deterministic rebuild + copy each invocation.Verification:
cargo fmt --check -p hero_agentclean,cargo check -p hero_agentclean, all 7 prompt tests pass (including the new regression).Meta-tracker: home#193.
Signed-off-by: mik-tf