lhumina_code/hero_shrimp

Port Tier 1/2/3 learnings from xmoncode/shrimp #90

Open

thabeta wants to merge 1 commit from port-from-xmoncode-shrimp into integration

thabeta commented

2026-06-04 23:22:14 +00:00

Owner

Summary

Ports a batch of ideas from xmonader's personal shrimp agent (~/xmoncode/shrimp) into hero_shrimp — all of Tier 1/2/3 from the comparison write-up. Each item is adapted to hero_shrimp's architecture (not a copy — upstream is a different crate layout), wired into the engine/runtime/CLI, and unit-tested.

Full details: docs/ports-from-xmoncode-shrimp.md.

What's included

New tools (registered + routed):

repo_wiki — drift-tracked ARCHITECTURE.md from the repo map
find_clones — near-duplicate function bodies (token-bag cosine)
impacted_tests — tests depending on changed files (via blast_radius)
ast_edit — tree-sitter symbol replacement (Rust) with a post-parse rollback gate
expand_context — retrieve the full text of an elided tool output on demand
fork — best-of-N candidate race in isolated git worktrees
mcp_search — BM25 ranking + name-resolve over MCP tools
skill_evolve — deterministic skill minting from recurring success patterns
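
As an illustration of the "token-bag cosine" idea behind `find_clones`, here is a minimal sketch. It is not the code in this PR: the tokenizer, the function names (`token_bag`, `cosine`, `is_near_clone`), and the threshold parameter are all assumptions made for the example. Each function body is reduced to a multiset of tokens, and two bodies count as near-duplicates when the cosine of their count vectors is high.

```rust
use std::collections::HashMap;

/// Split source text into a bag (multiset) of tokens.
/// Simplified tokenizer (an assumption for this sketch): runs of
/// alphanumerics/underscores are one token, every other non-whitespace
/// character is its own token.
fn token_bag(src: &str) -> HashMap<String, u32> {
    let mut bag: HashMap<String, u32> = HashMap::new();
    let mut word = String::new();
    for ch in src.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            word.push(ch);
        } else {
            if !word.is_empty() {
                // Flush the pending identifier/number token.
                *bag.entry(std::mem::take(&mut word)).or_insert(0) += 1;
            }
            if !ch.is_whitespace() {
                *bag.entry(ch.to_string()).or_insert(0) += 1;
            }
        }
    }
    if !word.is_empty() {
        *bag.entry(word).or_insert(0) += 1;
    }
    bag
}

/// Cosine similarity between two token bags; 0.0 if either bag is empty.
fn cosine(a: &HashMap<String, u32>, b: &HashMap<String, u32>) -> f64 {
    let dot: f64 = a
        .iter()
        .filter_map(|(tok, &ca)| b.get(tok).map(|&cb| ca as f64 * cb as f64))
        .sum();
    let norm = |m: &HashMap<String, u32>| -> f64 {
        m.values().map(|&c| (c as f64).powi(2)).sum::<f64>().sqrt()
    };
    let denom = norm(a) * norm(b);
    if denom == 0.0 { 0.0 } else { dot / denom }
}

/// Hypothetical helper: true when two bodies meet a similarity threshold.
fn is_near_clone(a: &str, b: &str, threshold: f64) -> bool {
    cosine(&token_bag(a), &token_bag(b)) >= threshold
}
```

Because a bag ignores token order, this measure tolerates reordered statements and renamed locals, but it can also report false positives for bodies that merely share vocabulary. A real tool would likely pair it with a minimum body size.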

Behavior / hot paths:

loop-detection cold-start exploration grace
Anthropic prompt cache anchored on the last stable (assistant) message
per-server MCP circuit breaker
RRF + MMR diversity re-rank in memory recall
per-segment shell grant keys in the session approval cache
conversational approve-over-chat (Telegram) + reject-with-feedback
declarative file-defined crews (dependency-wave DAG + typed handoffs)
macOS Seatbelt (sandbox-exec) shell backend
typed llm:delta → MessagePartial at the client edge
council raised 3 → 4 members (MAX_COUNCILORS + tier clamp)
new tools wired into tool_routing groups
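
For readers unfamiliar with the RRF half of "RRF + MMR", the following is a minimal sketch of Reciprocal Rank Fusion under stated assumptions. It is not taken from this PR: the function name `rrf_fuse`, the string ids, and the choice of `k` are illustrative, and the MMR diversity step is not shown. Each candidate receives `1 / (k + rank)` from every ranked list it appears in (ranks are 1-based), and candidates are ordered by the summed score.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over several ranked lists of ids.
/// score(id) = sum over lists of 1 / (k + rank), with 1-based ranks.
/// `k` damps the advantage of top positions; 60 is a commonly used value.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            let rank = i as f64 + 1.0;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Fusing by rank rather than by raw score means the input lists (for example a keyword ranking and an embedding ranking) do not need comparable score scales. A diversity re-rank such as MMR would then run over the fused list.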

Harnesses:

5 new behavioral eval scenarios that assert real on-disk effects
eval/fromscratch/ — held-out-oracle capability harness (ported)

Verification

cargo build --workspace clean, 0 warnings on changed crates.
Unit suite: 1717 passed (2 failures are a sandbox artifact — /tmp is itself a git repo in CI; they pass under a non-git TMPDIR).
Behavioral eval: 16/16 through the real agent loop (scripted LLM). The 5 new scenarios assert real file effects (ast_edit rewrites a file, repo_wiki writes the doc, etc.) — this is what caught a real routing bug where the new tools were registered but never offered to the model.
From-scratch harness, run live: deepseek-v4-flash built a complete bencode encoder/decoder from a spec; the held-out acceptance test passed 5/5.
Live multi-model verification: executor deepseek-v4-flash + a 4-model council (deepseek-v4-pro, z-ai/glm-5.1, minimax/minimax-m3, moonshotai/kimi-k2.6) — all confirmed responding (authoritative: cost ledger + council_positions table). ~$0.14 total.

Notes for the reviewer

Council cap 3 → 4 is included intentionally (raises council size, ~33% more cost per consult). Easy to revert to config-only if undesired.
Not yet exercised end-to-end (unit-tested + isolated, low blast radius): fork, declarative crews, watch, expand_context, conversational approvals. Prompt-cache anchoring and MMR recall are hot-path changes validated by structure/unit tests but not against live external behavior.
Adds 3 dependencies (tree-sitter, tree-sitter-rust, streaming-iterator) for ast_edit.

Test plan

cargo build --workspace
cargo test --workspace (engine 1717 pass; 2 env-only)
make eval → 16/16
eval/fromscratch/run.sh bencode (live) → 5/5 held-out
live executor + 4-model council run


thabeta added 3 commits

2026-06-04 23:22:14 +00:00

Merge pull request 'update main' (#83) from development into main

Build Linux / build-linux (push) Successful in 12m16s

Verify / verify (push) Successful in 38m10s

7da7d6f587

Reviewed-on: #83

chore: build on main — hero_lifecycle factor-out + herolib_openrpc, CI 1.96

Build Linux / build-linux (push) Successful in 4m59s

Verify / verify (push) Successful in 32m12s

5644285cce

feat: port Tier 1/2/3 learnings from xmoncode/shrimp

Verify / verify (push) Failing after 21s

bf6c279992

Adapts a batch of ideas from xmonader's personal `shrimp` agent into
hero_shrimp, each wired into the engine/runtime/CLI and unit-tested.
Workspace builds clean (0 warnings); behavioral eval 16/16; live-verified
against deepseek-v4-flash (executor) + a 4-model council (deepseek-v4-pro,
z-ai/glm-5.1, minimax/minimax-m3, moonshotai/kimi-k2.6).

New tools (registered + routed):
- repo_wiki        drift-tracked ARCHITECTURE.md from the repo map
- find_clones      near-dup function bodies (token-bag cosine)
- impacted_tests   tests depending on changed files (blast_radius)
- ast_edit         tree-sitter symbol replacement (Rust) + rollback gate
- expand_context   retrieve full elided tool output on demand
- fork             best-of-N candidate race in git worktrees
- mcp_search       BM25 ranking + name-resolve over MCP tools
- skill_evolve     deterministic skill minting from success patterns

Behavior / hot paths:
- loop-detection cold-start exploration grace
- Anthropic prompt cache anchored on the last stable (assistant) message
- per-server MCP circuit breaker
- RRF+MMR diversity re-rank in memory recall
- per-segment shell grant keys in the session approval cache
- conversational approve-over-chat (Telegram) + reject-with-feedback
- declarative file-defined crews (dependency-wave DAG + typed handoffs)
- macOS Seatbelt (sandbox-exec) shell backend
- typed llm:delta -> MessagePartial at the client edge
- council raised 3 -> 4 members (MAX_COUNCILORS + tier clamp)
- new tools wired into tool_routing groups

Harnesses:
- 5 new behavioral eval scenarios that assert real on-disk effects
- eval/fromscratch/ held-out-oracle capability harness (ported)

Docs: docs/ports-from-xmoncode-shrimp.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

thabeta changed title from ~~port-from-xmoncode-shrimp~~ to Port Tier 1/2/3 learnings from xmoncode/shrimp

2026-06-04 23:23:31 +00:00

Verify / verify (push) Failing after 21s

This pull request can be merged automatically.

This branch is out-of-date with the base branch

Checkout

From your project repository, check out a new branch and test the changes.

git fetch -u origin port-from-xmoncode-shrimp:port-from-xmoncode-shrimp

git switch port-from-xmoncode-shrimp

Merge

Merge the changes and update on Forgejo.

Warning: The "Autodetect manual merge" setting is not enabled for this repository, you will have to mark this pull request as manually merged afterwards.

Use one of the following strategies.

Create a merge commit:

git switch integration
git merge --no-ff port-from-xmoncode-shrimp

Rebase, then fast-forward:

git switch port-from-xmoncode-shrimp
git rebase integration
git switch integration
git merge --ff-only port-from-xmoncode-shrimp

Rebase, then create a merge commit:

git switch port-from-xmoncode-shrimp
git rebase integration
git switch integration
git merge --no-ff port-from-xmoncode-shrimp

Squash:

git switch integration
git merge --squash port-from-xmoncode-shrimp