lab checks #304

Open
opened 2026-05-30 08:50:16 +00:00 by despiegk · 3 comments
Owner

We want to build a self-checker for the repository. Each binary lives in a crate, and a crate can contain more than one binary. Each crate also has a service TOML file that specifies how its binary or binaries are used: which sockets they use, what their role is, and so on. We now want to create multiple checks, with one Rust file per check. Based on these files, each check validates as much as it can in code, for example the versions of the dependencies we use and whether certain things are done properly, and some checks also use AI. In the root of the repository we keep a checks.toml that tracks all the checks. Each check is labeled with a version, so when our code ships a new version of a check we know it has to be run again. Each check also has a priority (1, 2, or 3) so that we can select which checks to execute. Some checks just call an agent, which we run through our agent SDK; the check defines which agent to use, its quality level, what it is allowed to do, and so on. All of this goes through HeroProc: the lab command drives the checks, calling an agent where needed and doing whatever can be done quickly directly in Rust. We also keep track of status: in checks.toml we record, per binary and per crate, the name of the check, its version, its status, and the date it last succeeded, so we really know whether the checks were successful.
This also makes re-runs easy: if we remove an entry from that TOML file and run lab checks again with a given level, it goes over the repo and runs the checks again. We can also attach fixes to a check, identifying an agent or skills to use.

use skills

/hero_service_toml_info

/hero_proc_sdk

implementation

  • define a run name; if none is given on the command line, use hero_check_find_$mm_dd_hh_mm. Create a run for it to which we can add jobs (use /hero_proc_sdk), with at most 8 actions running at the same time
  • add jobs/actions to the run for everything mentioned below

run over multiple repos:

  • by default these are the repos in $PATH_ROOT/code whose names start with hero_
  • for each repo, create actions and jobs as needed
  • add dependencies between the actions so the multiple steps defined below run in order; as part of an action we call 'lab check ...' with the right option (which selects the test); at the end of the test, checks.toml is filled in
  • each action is named $runname_... followed by whatever the action is

so running 'lab check' itself doesn't take long: it only creates lots of jobs, based on actions, in a run

we need lab check with these options:
--delete (also stops/deletes the jobs and actions; check the code in proc)
--list (so we see the runs)
--status
--test ... (the specific test we want to run; also needed to run a test inside a specific job)
--force (run regardless of recorded state)
--repos name,name (namefixed and optional; "all" means every hero_... repo in PATH_CODE; if not specified, the repo we are in, found by walking up to its top; functions for this exist in the hero_lib core base class)

.. add more as needed

check_cargo

  • walk over all Cargo.toml files (is there a lib crate for this?); only do this where there is a service.toml file
  • build the version list from the /rust_versions skill; it is hardcoded, shipped as a .toml file we embed
  • allow for exceptions, which can be defined in the Cargo file (define how)
  • check whether we already have a checker for this in lab
  • if there are issues, identify them
  • if the check fails we call 'lab agent --model 1 "fix cargo in $CRATEPATH, the skills to use are /blueprint_base_check and /rust_versions"'
  • then run the test again, looping at most 5 times, until the Cargo file is in order
  • mark in repodir/checks.toml that the check succeeded: status, date, crate, ...
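
The comparison step of check_cargo could look roughly like the sketch below. It is not the lab implementation: the Deviation type, the find_deviations function, and the plain name-to-version maps are assumptions made for illustration, and real Cargo.toml parsing and the exception syntax ("define how") are left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One dependency whose declared version differs from the pinned policy version.
/// (Hypothetical type, not part of lab.)
#[derive(Debug, PartialEq)]
pub struct Deviation {
    pub dep: String,
    pub expected: String,
    pub found: String,
}

/// Compare a crate's declared dependency versions against the policy list.
/// Dependencies named in `exceptions` are skipped, and dependencies that the
/// policy does not mention are ignored.
pub fn find_deviations(
    policy: &BTreeMap<String, String>,
    declared: &BTreeMap<String, String>,
    exceptions: &BTreeSet<String>,
) -> Vec<Deviation> {
    declared
        .iter()
        .filter(|(dep, _)| !exceptions.contains(*dep))
        .filter_map(|(dep, found)| {
            // Only dependencies pinned by the policy can deviate from it.
            let expected = policy.get(dep)?;
            (expected != found).then(|| Deviation {
                dep: dep.clone(),
                expected: expected.clone(),
                found: found.clone(),
            })
        })
        .collect()
}
```

An empty result means the crate passes; a non-empty one is the failure detail that would be handed to the agent.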

cargo_check_update.rs

  • walk over the repo crate by crate and make sure there are no easy-to-find issues (cargo check); no AI needed; only do this where there is a service.toml file and the crate is not skipped
  • same for cargo update
  • if the check fails we call 'lab agent --model 1 "fix build in $CRATEPATH, the build was done as $BUILDCHECKCMD, if issue retry the command, try to fix the issues, retry max 5 times"'
  • mark in repodir/checks.toml that the check succeeded: status, date, crate, ...

cargo_build.rs

  • for each crate, do a build using 'lab build ...', specifying the crate; only do this where there is a service.toml file and the crate is not skipped
  • if there is an issue, call 'lab agent --model 1 "fix build in $CRATEPATH, the build was done as $BUILDCHECKCMD, if issue retry the command, try to fix the issues, retry max 5 times"'
  • mark in repodir/checks.toml that the check succeeded: status, date, crate, ...

proc_run.sh

  • for each crate, do a build using 'lab build --restart --force ...', specifying the crate; only do this where there is a service.toml file, the crate is not skipped, and the service has a server part
  • if there is an issue, call an agent; the prompt refers to the hero_proc_cmd skill and asks for starting, testing, stopping, asking heroproc for logs, ...

upload.sh

  • for each crate, do a build using 'lab build --platforms macos-arm64 --upload --update ', specifying the crate; only do this where there is a service.toml file and the crate is not skipped; check which OS we are on and only do it for that OS
  • if there is an issue, call an agent; the prompt refers to the hero_proc_cmd skill and asks for uploading ... and tries to fix it; choose skills if needed
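
Picking the platform for the OS we are on could be sketched as follows. Only the macos-arm64 label appears in this issue; the other labels, and the function names platform_label and host_platform_label, are assumptions that follow the same pattern and may not match what lab build accepts.

```rust
/// Map an OS/architecture pair to a platform label in the style of the
/// `macos-arm64` value used above. The remaining labels are assumptions.
pub fn platform_label(os: &str, arch: &str) -> Option<String> {
    let os = match os {
        "macos" => "macos",
        "linux" => "linux",
        _ => return None, // unknown OS: skip the upload check
    };
    let arch = match arch {
        "aarch64" => "arm64",
        "x86_64" => "amd64",
        _ => return None,
    };
    Some(format!("{os}-{arch}"))
}

/// Label for the machine the check is running on, if it is a known platform.
pub fn host_platform_label() -> Option<String> {
    platform_label(std::env::consts::OS, std::env::consts::ARCH)
}
```

Returning None for an unrecognized host lets the upload check be skipped instead of uploading under a wrong label.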
Author
Owner

Implementation Spec for Issue #304

Objective

Add a lab check (alias lab checks) subcommand to the lab crate that drives a repository self-checker. lab check is a fast orchestrator: it discovers crates that have a service.toml, then for each crate + check it creates jobs/actions in a single hero_proc run (max 8 concurrent), wiring dependencies between steps. Each job re-invokes lab check --test <name> ... to run one specific test fast in Rust; on failure the test escalates to lab agent --model 1 with the right skills, retrying up to 5 times. Results (status, date, version, crate, test name) are persisted in a checks.toml at each repo root. The command supports --delete, --list, --status, --test, --force, and --repos.

Requirements

  • New top-level subcommand lab check (with checks alias) dispatched from main.rs, all logic under crates/lab/src/checker/.
  • Run-name resolution: use --name if given, else hero_check_find_$mm_dd_hh_mm.
  • Create one hero_proc run per invocation; add jobs/actions for every (repo x crate x test); max_concurrency = 8.
  • Each action/job named $runname_<repo>_<crate>_<test>; jobs re-invoke lab check --test <test> --crate <crate> --repos <repo> so the orchestrator returns fast.
  • Express inter-step dependencies via ActionBuilder::depends_on so per-crate test stages run in order (check_cargo -> cargo_check_update -> cargo_build -> proc_run -> upload).
  • Repo selection is driven by crates/lab/src/service/services.toml (the existing service manifest), via the existing loader crate::service::services_toml:
    • --repos all => every enabled (disabled = false) entry from services.toml, i.e. crate::service::services_toml::all_services().
    • --repos <name,name> => comma-split; each token resolved through services_toml::resolve_name_or_tag (so a tag like infra/ai/dev expands to its services, and a bare name passes through).
    • --repos not specified => the current repo only (walk up to .git via crate::repo::paths::path_home(None)), so a developer can check just the repo they are in.
    • Each resolved service name maps to a repo dir under $PATH_CODE (reuse the repo_local_path/path_code_for pattern from builder/all.rs).
  • Crate selection: only crates that have a service.toml at their root; respect a per-crate skip mechanism.
  • Five concrete checks (one Rust file per test): check_cargo, cargo_check_update, cargo_build, proc_run, upload. Each marks checks.toml on success and runs an agent-repair loop (max 5) on failure.
  • checks.toml at repo root records per crate + per test: test name, version, status, last-success date, priority.
  • Tests carry a version; bumping a test's version forces a re-run; a priority (1/2/3) selects which tests execute at a given level.
  • Embed the canonical dependency-version list (already present as [package.metadata.hero_builder.rust_versions] in lab/Cargo.toml, generated into OUT_DIR/policy.rs) for check_cargo via lab::builder::policy::embedded().
  • --delete stops/deletes jobs, runs, and actions for the run; --list lists runs; --status shows status; --force runs regardless of recorded state.

Files to Modify/Create

  • crates/lab/src/main.rs — add TopCmd::Check { ... } variant + dispatch arm cmd_check(...). (Modify)
  • crates/lab/src/lib.rs — add pub mod checker;. (Modify)
  • crates/lab/src/checker/mod.rs — orchestrator: run-name resolution, repo/crate discovery, hero_proc run + job/action creation, single --test dispatch, --delete/--list/--status handlers, the check registry (id, version, priority, level, skills). (Create)
  • crates/lab/src/checker/checks_toml.rs — ChecksToml schema + load/save at <repo>/checks.toml; per-crate-per-test entry { name, version, priority, status, last_success, source_hash }; re-run policy. (Create)
  • crates/lab/src/checker/discovery.rs — repo-set resolution via service::services_toml, plus crate discovery filtered by sibling service.toml, server-part detection, and skip detection. (Create)
  • crates/lab/src/checker/check_cargo.rs — deterministic Cargo.toml version/policy check (consumes policy::embedded()), exception support, agent fallback. (Create)
  • crates/lab/src/checker/cargo_check_update.rs — cargo check + cargo update per crate, agent fallback. (Create)
  • crates/lab/src/checker/cargo_build.rs — lab build <crate> per crate, agent fallback. (Create)
  • crates/lab/src/checker/proc_run.rs — lab build --restart --force for crates whose service has a server part, agent fallback prompting /hero_proc_cmd. (Create)
  • crates/lab/src/checker/upload.rs — lab build --platforms <host-os> --upload --update for current OS, agent fallback prompting /hero_proc_cmd. (Create)
  • crates/lab/src/agent/agent_check.rs — crate::agent wrapper for check-repair invocations (model 1/quality + skill attachment in the prompt), mirroring agent_repair.rs. (Create)
  • crates/lab/src/agent/mod.rs — re-export the new wrapper. (Modify)

Implementation Plan

Step 1: Wire the lab check subcommand skeleton

Files: crates/lab/src/main.rs, crates/lab/src/lib.rs, crates/lab/src/checker/mod.rs

  • In lib.rs add pub mod checker;.
  • In main.rs add a TopCmd::Check variant (model the flag block on the existing Build/Infocheck variants). Flags: repos: Option<String>, test: Option<String>, crate_: Option<String> (clap --crate), name: Option<String>, level: Option<u8> (priority 1/2/3, default 1), force: bool, delete: bool, list: bool, status: bool, json: bool. Add #[command(alias = "checks")].
  • Add dispatch arm calling cmd_check(...) (follow the cmd_infocheck pattern), building a CheckOpts and calling lab::checker::run(opts).
  • checker::mod.rs exposes pub fn run(opts: CheckOpts) -> anyhow::Result<()> and pub struct CheckOpts. Branch: --list/--status/--delete short-circuit; --test set => run a single check inline; otherwise orchestrate.
    Dependencies: none
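
The branching in run can be sketched without clap as a pure function over the flags listed above. The Mode enum and resolve_mode are hypothetical names, and the precedence among --list, --status, and --delete is an assumption, since the spec only says they short-circuit.

```rust
/// Flags from Step 1, reduced to plain data (no clap attributes in this sketch).
#[derive(Debug, Default, Clone)]
pub struct CheckOpts {
    pub repos: Option<String>,
    pub test: Option<String>,
    pub crate_: Option<String>,
    pub name: Option<String>,
    pub level: Option<u8>,
    pub force: bool,
    pub delete: bool,
    pub list: bool,
    pub status: bool,
    pub json: bool,
}

/// What a single invocation should do. (Hypothetical type.)
#[derive(Debug, PartialEq)]
pub enum Mode {
    List,
    Status,
    Delete,
    SingleTest(String),
    Orchestrate,
}

/// Management flags short-circuit first; `--test` runs one check inline;
/// anything else creates the hero_proc run.
pub fn resolve_mode(opts: &CheckOpts) -> Mode {
    if opts.list {
        Mode::List
    } else if opts.status {
        Mode::Status
    } else if opts.delete {
        Mode::Delete
    } else if let Some(test) = &opts.test {
        Mode::SingleTest(test.clone())
    } else {
        Mode::Orchestrate
    }
}
```

Keeping the decision in one small function makes the precedence easy to unit-test separately from hero_proc.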

Step 2: checks.toml schema + load/save + re-run policy

Files: crates/lab/src/checker/checks_toml.rs

  • ChecksToml { crates: BTreeMap<String, CrateChecks> }, CrateChecks { checks: BTreeMap<String, CheckEntry> }, CheckEntry { name, version: u32, priority: u8, status: String, last_success: Option<String>, source_hash: Option<String>, reason: Option<String> }.
  • path() = <repo_root>/checks.toml (per the issue, repo root — NOT ~/hero/var).
  • load_or_default / save copying the toml read/write pattern from builder/all.rs AllState.
  • needs_run(entry, current_version, current_hash, force): true if force, missing, entry.version < current_version, hash changed, or status != "pass".
  • mark_success / mark_fail mutators that persist.
  • Per-crate hash via builder::hashing::source_hash(crate_root).
    Dependencies: none (parallel with Step 1)
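
A minimal sketch of the re-run policy follows, using the CheckEntry fields listed above. Passing the entry as an Option to model the "missing" case is an assumption about the signature; serialization and the on-disk layout are omitted.

```rust
/// One recorded result for a (crate, test) pair, with the fields from Step 2.
#[derive(Debug, Clone)]
pub struct CheckEntry {
    pub name: String,
    pub version: u32,
    pub priority: u8,
    pub status: String,
    pub last_success: Option<String>,
    pub source_hash: Option<String>,
    pub reason: Option<String>,
}

/// Decide whether a check must run again. `entry` is None when checks.toml
/// has no record for this crate + test (for example after the entry was removed).
pub fn needs_run(
    entry: Option<&CheckEntry>,
    current_version: u32,
    current_hash: &str,
    force: bool,
) -> bool {
    if force {
        return true;
    }
    let Some(entry) = entry else {
        return true; // never recorded, so it has never passed
    };
    entry.version < current_version
        || entry.source_hash.as_deref() != Some(current_hash)
        || entry.status != "pass"
}
```

Under this policy, deleting an entry from checks.toml forces a re-run of that check on the next invocation.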

Step 3: Crate/repo discovery (driven by services.toml)

Files: crates/lab/src/checker/discovery.rs

  • resolve_repos(repos: Option<&str>) -> Vec<PathBuf>:
    • None => current repo only via crate::repo::paths::path_home(None).
    • Some("all") => service::services_toml::all_services() (enabled entries), each mapped to its repo dir under PATH_CODE.
    • Some(list) => split on ,, each token through service::services_toml::resolve_name_or_tag (expands tags), flatten + dedupe, map to repo dirs.
    • Map a service name to its repo path with the repo_local_path/path_code_for pattern from builder/all.rs; skip repos not present locally with a clear warning.
  • crates_with_service_toml(repo_root): call builder::cargo_discovery::load_all_in_repo + discover_binaries_in_repo (as infocheck::run does), dedupe by crate dir, keep only dirs where crate_dir.join("service.toml").exists().
  • crate_has_server(crate_dir): parse service.toml into herolib_core::base::ServiceToml and check for an RPC/server socket or _server binary — needed by proc_run.
  • is_skipped(crate_dir, test_id): read a [package.metadata.hero_checks] skip = [...] table from the crate's Cargo.toml.
    Dependencies: none (parallel with Step 1)
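
The first half of resolve_repos, turning the --repos value into a selector, might look like this sketch. RepoSelector and parse_repos are hypothetical names; tag expansion through resolve_name_or_tag and the mapping to directories under PATH_CODE are left out on purpose.

```rust
/// How the set of repos was requested on the command line. (Hypothetical type.)
#[derive(Debug, PartialEq)]
pub enum RepoSelector {
    /// No --repos flag: only the repo we are in.
    Current,
    /// --repos all: every enabled entry from services.toml.
    All,
    /// --repos a,b,c: names or tags, still to be resolved.
    Named(Vec<String>),
}

/// Parse the raw --repos value: trim, split on commas, drop empty tokens,
/// and dedupe while keeping the first-seen order.
pub fn parse_repos(repos: Option<&str>) -> RepoSelector {
    match repos.map(str::trim) {
        None => RepoSelector::Current,
        Some("all") => RepoSelector::All,
        Some(list) => {
            let mut names: Vec<String> = Vec::new();
            for token in list.split(',').map(str::trim).filter(|t| !t.is_empty()) {
                if !names.iter().any(|n| n.as_str() == token) {
                    names.push(token.to_string());
                }
            }
            RepoSelector::Named(names)
        }
    }
}
```

Deduplicating here only removes repeated tokens; a second dedupe is still needed after tags expand, because two tags can name the same service.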

Step 4: Orchestrator — build the hero_proc run with jobs/actions + dependencies

Files: crates/lab/src/checker/mod.rs

  • Mirror builder/all.rs submit_build_run + poll_until_done: use hero_proc_sdk::{RunBuilder, ActionBuilder, hero_proc_factory}; guard with crate::service::hero_proc_exception::is_hero_proc_alive().
  • Run name: opts.name or format!("hero_check_find_{}", now.format("%m_%d_%H_%M")).
  • RunBuilder::new(run_name).max_concurrency(8); do not enable cleanup_on_success (keep history).
  • For each repo -> each crate-with-service.toml -> each selected test (filter by priority <= opts.level and checks_toml::needs_run(...)):
    • Add an inline action named format!("{run_name}_{repo}_{crate}_{test}") whose script is lab check --test <test> --crate <crate> --repos <repo_path> [--force], tagged "hero_check".
    • Chain per-crate ordering with .depends_on(prev_action_name): check_cargo -> cargo_check_update -> cargo_build -> proc_run -> upload.
  • .submit(&hp).await? once; print the run URL (http://localhost:9988/hero_proc/admin/#/runs/{run_id}). Returns fast.
    Dependencies: Steps 1, 2, 3
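
The naming and per-crate chaining can be sketched as plain data before anything is handed to hero_proc. PlannedAction and plan_crate_actions are hypothetical; in the real orchestrator each planned action would become an ActionBuilder with depends_on, and the hero_proc_sdk calls are deliberately not shown because their exact signatures are not given here.

```rust
/// Stage order from the spec; each stage depends on the previous selected one.
pub const STAGES: [&str; 5] = [
    "check_cargo",
    "cargo_check_update",
    "cargo_build",
    "proc_run",
    "upload",
];

/// One action to add to the run. (Hypothetical type.)
#[derive(Debug, PartialEq)]
pub struct PlannedAction {
    pub name: String,
    pub script: String,
    pub depends_on: Option<String>,
}

/// Build the ordered action chain for one crate. `selected` holds the tests
/// that survived the priority, needs_run, and skip filters.
pub fn plan_crate_actions(
    run_name: &str,
    repo: &str,
    repo_path: &str,
    krate: &str,
    selected: &[&str],
) -> Vec<PlannedAction> {
    let mut actions = Vec::new();
    let mut prev: Option<String> = None;
    for stage in STAGES.iter().filter(|s| selected.contains(*s)) {
        let name = format!("{run_name}_{repo}_{krate}_{stage}");
        let depends_on = prev.clone();
        prev = Some(name.clone());
        actions.push(PlannedAction {
            name,
            script: format!("lab check --test {stage} --crate {krate} --repos {repo_path}"),
            depends_on,
        });
    }
    actions
}
```

When a stage is filtered out (for example proc_run on a crate without a server part), the next stage links to the last stage that was kept, so the chain never points at a missing action.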

Step 5: Agent-repair wrapper for checks

Files: crates/lab/src/agent/agent_check.rs, crates/lab/src/agent/mod.rs

  • agent_check.rs modeled on agent_repair.rs: pub async fn agent_fix_check(ctx: &CheckFixContext) -> AgentFixResult, using herolib_ai::agent::Agent::claude(quality) with PermissionMode::DangerousSkipPermissions, .streaming(true), .working_dir(crate_root).
  • CheckFixContext { crate_root, crate_name, test_id, attempt_number, max_attempts, skills: Vec<&str>, failure_detail }.
  • Build the instruction with the issue-mandated skill references (check_cargo => /blueprint_base_check + /rust_versions; proc_run/upload => /hero_proc_cmd). Map --model 1 to the fast/haiku tier, escalate by attempt.
  • Re-export from agent/mod.rs. Per the checker boundary rule, checks call crate::agent::agent_fix_check, never herolib_ai directly.
    Dependencies: none (parallel with Steps 1-4); used by Step 6

Step 6: The five check implementations (single-test execution path)

Files: crates/lab/src/checker/{check_cargo,cargo_check_update,cargo_build,proc_run,upload}.rs
Each file: pub const ID: &str, pub const VERSION: u32, pub const PRIORITY: u8, pub const SKILLS: &[&str], and pub fn run(crate_dir: &Path, force: bool) -> Result<bool>. The fast Rust path runs; on failure it loops up to 5 times calling crate::agent::agent_fix_check then re-running the fast path; on success it calls checks_toml::mark_success.

  • check_cargo.rs: load policy::embedded(); walk the crate's Cargo.toml with toml_edit, flag deps deviating from policy; honor [package.metadata.hero_checks.cargo_exceptions]. On fail => agent with ["blueprint_base_check","rust_versions"].
  • cargo_check_update.rs: run cargo check then cargo update in the workspace root; on fail => agent.
  • cargo_build.rs: invoke lab build for the crate; on fail => agent.
  • proc_run.rs: only if crate_has_server; run lab build --restart --force; on fail => agent with /hero_proc_cmd.
  • upload.rs: detect host OS; run lab build --platforms <host-label> --upload --update only for that OS; on fail => agent with /hero_proc_cmd.
  • Register all five in mod.rs's check table.
    Dependencies: Steps 2, 3, 5
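
The shared fast-path / repair loop could be factored out roughly as below. This is a synchronous sketch with closures standing in for the Rust check and for crate::agent::agent_fix_check; run_with_repair and its return type are assumptions and differ from the run signature given above.

```rust
/// Upper bound on agent repair attempts, as required by the issue.
pub const MAX_REPAIR_ATTEMPTS: u32 = 5;

/// Run `check`; while it fails, call `fix` with the attempt number and the
/// last failure detail, then re-run `check`, at most MAX_REPAIR_ATTEMPTS times.
/// Returns the number of repair attempts used, or the last failure detail.
pub fn run_with_repair<C, F>(mut check: C, mut fix: F) -> Result<u32, String>
where
    C: FnMut() -> Result<(), String>,
    F: FnMut(u32, &str),
{
    let mut failure = match check() {
        Ok(()) => return Ok(0), // fast path passed, no agent needed
        Err(e) => e,
    };
    for attempt in 1..=MAX_REPAIR_ATTEMPTS {
        fix(attempt, &failure);
        match check() {
            Ok(()) => return Ok(attempt),
            Err(e) => failure = e,
        }
    }
    Err(failure)
}
```

The caller would mark success in checks.toml on Ok and record the failure detail on Err; passing the attempt number to fix leaves room for escalating the model tier by attempt.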

Step 7: --delete, --list, --status handlers

Files: crates/lab/src/checker/mod.rs

  • --list: list hero_proc runs whose name starts with hero_check_; print id/name/status.
  • --status: for the named/most-recent check run, print per-job phase (like poll_until_done).
  • --delete: mirror service/fast_teardown.rs::wipe_service: job_stop/SIGKILL, then run_delete + action_delete; or use action_clean_by_tag with the "hero_check" tag for one-call teardown.
    Dependencies: Steps 1, 4

Acceptance Criteria

  • lab check --help lists --repos, --test, --crate, --name, --level, --force, --delete, --list, --status; lab checks works as an alias.
  • lab check creates one hero_proc run named hero_check_find_<mm_dd_hh_mm> (or --name) with max_concurrency=8, returns quickly, prints the run URL.
  • Jobs are named <runname>_<repo>_<crate>_<test> and per-crate run in the order check_cargo -> cargo_check_update -> cargo_build -> proc_run -> upload via depends_on.
  • Repo selection comes from services.toml: --repos all = all enabled entries, tags expand, bare names pass through, no --repos = current repo only.
  • Only crates with a service.toml are checked; proc_run only for server crates; upload only on the host OS; skipped crates/tests honored.
  • lab check --test <t> --crate <c> runs the fast Rust check, and on failure calls an agent (--model 1) with the mandated skills, retrying up to 5 times.
  • On success, <repo>/checks.toml records test name, version, priority, status pass, and last_success for that crate+test; a version bump or source-hash change forces a re-run; --force re-runs regardless.
  • --list shows check runs, --status shows per-job status, --delete stops and removes the run's jobs/runs/actions.
  • check_cargo validates against the embedded rust_versions policy and honors per-crate exceptions.
  • cargo build -p lab succeeds; lab check twice with no source changes does no work the second time (idempotent).

Notes

  • The checker/ directory already exists with design docs (instructions.md, ideas.md). Honor its hard rules: reuse builder/cargo_discovery (load_all_in_repo, discover_binaries_in_repo) and builder/hashing::source_hash; do not write a second crate walker or hash function.
  • Per #304, checks.toml lives at the repo root and is per-crate/per-binary (not ~/hero/var).
  • The dependency-version list is already embedded in lab/Cargo.toml under [package.metadata.hero_builder.rust_versions] and generated into OUT_DIR/policy.rs; consume via lab::builder::policy::embedded(). Do not re-embed /rust_versions.
  • Repo selection reuses the existing crate::service::services_toml loader (all_services, resolve_name_or_tag) — services.toml is the single source of truth for which repos all covers (enabled = not disabled).
  • Closest precedent: crates/lab/src/builder/all.rs (lab build all) already does this exact pattern — discover repos, submit one hero_proc run of inline jobs that re-invoke the lab binary, poll, persist a state TOML. The check orchestrator is structurally lab build all with a different job command and per-crate dependency chains.
  • hero_proc must be running; guard with is_hero_proc_alive() and emit the same helpful error as all.rs.
  • Agent skills are passed by name inside the prompt string (e.g. the skills to use are /blueprint_base_check and /rust_versions), not embedded as files. --model 1 maps to a fast/haiku quality tier; escalate by attempt.
- `proc_run.rs`: only if `crate_has_server`; run `lab build --restart --force`; on fail => agent with `/hero_proc_cmd`. - `upload.rs`: detect host OS; run `lab build --platforms <host-label> --upload --update` only for that OS; on fail => agent with `/hero_proc_cmd`. - Register all five in `mod.rs`'s check table. Dependencies: Steps 2, 3, 5 #### Step 7: `--delete`, `--list`, `--status` handlers Files: `crates/lab/src/checker/mod.rs` - `--list`: list hero_proc runs whose name starts with `hero_check_`; print id/name/status. - `--status`: for the named/most-recent check run, print per-job phase (like `poll_until_done`). - `--delete`: mirror `service/fast_teardown.rs::wipe_service` — `job_stop`/SIGKILL then `run_delete` + `action_delete`; or use `action_clean_by_tag` with the `"hero_check"` tag for one-call teardown. Dependencies: Steps 1, 4 ### Acceptance Criteria - [ ] `lab check --help` lists `--repos`, `--test`, `--crate`, `--name`, `--level`, `--force`, `--delete`, `--list`, `--status`; `lab checks` works as an alias. - [ ] `lab check` creates one hero_proc run named `hero_check_find_<mm_dd_hh_mm>` (or `--name`) with `max_concurrency=8`, returns quickly, prints the run URL. - [ ] Jobs are named `<runname>_<repo>_<crate>_<test>` and per-crate run in the order check_cargo -> cargo_check_update -> cargo_build -> proc_run -> upload via depends_on. - [ ] Repo selection comes from `services.toml`: `--repos all` = all enabled entries, tags expand, bare names pass through, no `--repos` = current repo only. - [ ] Only crates with a `service.toml` are checked; `proc_run` only for server crates; `upload` only on the host OS; skipped crates/tests honored. - [ ] `lab check --test <t> --crate <c>` runs the fast Rust check, and on failure calls an agent (`--model 1`) with the mandated skills, retrying up to 5 times. 
- [ ] On success, `<repo>/checks.toml` records test name, version, priority, status `pass`, and `last_success` for that crate+test; a version bump or source-hash change forces a re-run; `--force` re-runs regardless. - [ ] `--list` shows check runs, `--status` shows per-job status, `--delete` stops and removes the run's jobs/runs/actions. - [ ] `check_cargo` validates against the embedded `rust_versions` policy and honors per-crate exceptions. - [ ] `cargo build -p lab` succeeds; `lab check` twice with no source changes does no work the second time (idempotent). ### Notes - The `checker/` directory already exists with design docs (`instructions.md`, `ideas.md`). Honor its hard rules: reuse `builder/cargo_discovery` (`load_all_in_repo`, `discover_binaries_in_repo`) and `builder/hashing::source_hash`; do not write a second crate walker or hash function. - Per #304, `checks.toml` lives at the repo root and is per-crate/per-binary (not `~/hero/var`). - The dependency-version list is already embedded in `lab/Cargo.toml` under `[package.metadata.hero_builder.rust_versions]` and generated into `OUT_DIR/policy.rs`; consume via `lab::builder::policy::embedded()`. Do not re-embed `/rust_versions`. - Repo selection reuses the existing `crate::service::services_toml` loader (`all_services`, `resolve_name_or_tag`) — `services.toml` is the single source of truth for which repos `all` covers (enabled = not `disabled`). - Closest precedent: `crates/lab/src/builder/all.rs` (`lab build all`) already does this exact pattern — discover repos, submit one hero_proc run of inline jobs that re-invoke the `lab` binary, poll, persist a state TOML. The check orchestrator is structurally `lab build all` with a different job command and per-crate dependency chains. - hero_proc must be running; guard with `is_hero_proc_alive()` and emit the same helpful error as `all.rs`. - Agent skills are passed by name inside the prompt string (e.g. 
`the skills to use are /blueprint_base_check and /rust_versions`), not embedded as files. `--model 1` maps to a fast/haiku quality tier; escalate by attempt.
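
The naming and chaining rule from Step 4 of the plan can be sketched in plain Rust without the hero_proc SDK. This is only an illustration: `PlannedAction` and `plan_crate` are hypothetical names, not types from `hero_proc_sdk`, and the real orchestrator would hand the same names and scripts to `ActionBuilder` with `.depends_on(...)`.

```rust
/// Canonical per-crate order from Step 4 of the plan.
const ORDER: [&str; 5] = [
    "check_cargo",
    "cargo_check_update",
    "cargo_build",
    "proc_run",
    "upload",
];

/// Hypothetical stand-in for one inline hero_proc action (not the SDK type).
#[derive(Debug, Clone, PartialEq)]
pub struct PlannedAction {
    pub name: String,
    pub script: String,
    /// Name of the previous action in this crate's chain, if any.
    pub depends_on: Option<String>,
}

/// Build the chained actions for one crate, keeping only the `selected` tests.
/// Tests are always emitted in canonical order, whatever order `selected` uses.
pub fn plan_crate(
    run_name: &str,
    repo: &str,
    repo_path: &str,
    krate: &str,
    selected: &[&str],
    force: bool,
) -> Vec<PlannedAction> {
    let mut actions = Vec::new();
    let mut prev: Option<String> = None;
    for test in ORDER.iter().filter(|t| selected.contains(*t)) {
        let name = format!("{run_name}_{repo}_{krate}_{test}");
        let mut script =
            format!("lab check --test {test} --crate {krate} --repos {repo_path}");
        if force {
            script.push_str(" --force");
        }
        actions.push(PlannedAction {
            name: name.clone(),
            script,
            depends_on: prev.clone(),
        });
        // The next selected test for this crate waits on this one.
        prev = Some(name);
    }
    actions
}
```

In this sketch a test that is filtered out (by level or by `needs_run`) simply drops out of the chain, so the next selected test depends on the last one that was kept. Whether the real orchestrator links across skipped tests the same way is an assumption.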
Author
Owner

Test Results

  • cargo build -p lab: success (clean compile, 1 pre-existing unrelated warning)
  • cargo test -p lab: all targets green
unit (src/lib + bin):  53 passed; 0 failed
blueprint tests:        8 passed; 0 failed
skills_loader tests:    9 passed; 0 failed
doctests:               1 passed; 1 ignored
----------------------------------------
Total:                 71 passed; 0 failed

Smoke checks (no hero_proc run triggered)

  • lab check --help and lab checks --help (alias) list all flags: --repos, --test, --crate, --name, --level, --force, --delete, --list, --status, --json.
  • lab check --list connects to hero_proc and reports No check runs found. (exit 0).

The orchestration, single-test, and agent-repair paths are I/O- and hero_proc-bound; so far they are covered only by the clean build and the CLI wiring checks above. A live end-to-end run (submitting jobs, agent repair loops) is left for manual validation against a running hero_proc.

Author
Owner

Implementation Summary

Added the lab check (alias lab checks) repository self-checker to the lab crate.

What it does

lab check is a fast orchestrator. It resolves the target repos from services.toml, discovers the crates in each that ship a service.toml, and for every (crate x check) it submits one inline job into a single hero_proc run (max 8 concurrent). Each job re-invokes lab check --test <id> --crate <name> --repos <path> to run one check fast in Rust; per crate the jobs are chained with depends_on in the canonical order. On failure a check escalates to an agent (crate::agent, mapping the issue's --model 1) and retries up to 5 times. Results are recorded per crate + per check in a checks.toml at each repo root.
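
The "fast path first, then agent, retry up to 5 times" behaviour described above can be sketched as a small generic loop. The function name `run_with_repair` and the closure-based shape are assumptions made for illustration; the actual checks call `crate::agent::agent_fix_check` with a `CheckFixContext` rather than a closure.

```rust
/// Maximum number of agent repair attempts per check (from the issue).
pub const MAX_ATTEMPTS: u32 = 5;

/// Run `fast_path`; while it fails, ask `agent_fix` to repair and re-run,
/// up to `MAX_ATTEMPTS` times.
///
/// Returns `Ok(n)` with the number of agent attempts that were needed
/// (0 means the fast path passed on its own), or `Err(detail)` carrying
/// the last failure detail once all attempts are used up.
pub fn run_with_repair<F, A>(mut fast_path: F, mut agent_fix: A) -> Result<u32, String>
where
    F: FnMut() -> Result<(), String>,
    A: FnMut(u32, &str),
{
    let mut detail = match fast_path() {
        Ok(()) => return Ok(0),
        Err(e) => e,
    };
    for attempt in 1..=MAX_ATTEMPTS {
        // Hand the attempt number and latest failure detail to the agent.
        agent_fix(attempt, &detail);
        match fast_path() {
            Ok(()) => return Ok(attempt),
            Err(e) => detail = e,
        }
    }
    Err(detail)
}
```

Passing the attempt number to the repair step leaves room for the "escalate by attempt" rule, for example choosing a stronger model tier on later attempts.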

Repo selection (from services.toml)

  • --repos all => every enabled entry in services.toml
  • --repos infra,hero_db => tags expand, bare names pass through
  • no --repos => the current repo only (walk up to .git)
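
A minimal sketch of these three selection rules, using an in-memory map in place of the real `service::services_toml` loader. The `Service` struct, the `resolve_repos` signature, and the choice to exclude disabled entries from tag expansion are all assumptions for illustration; the real function returns repo paths rather than names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory view of one services.toml entry.
pub struct Service {
    pub tags: Vec<String>,
    pub disabled: bool,
}

/// Resolve a `--repos` value to an ordered, de-duplicated list of names.
pub fn resolve_repos(
    spec: Option<&str>,
    services: &BTreeMap<String, Service>,
    current: &str,
) -> Vec<String> {
    // No --repos: the current repo only.
    let Some(spec) = spec else {
        return vec![current.to_string()];
    };
    // --repos all: every enabled entry.
    if spec == "all" {
        return services
            .iter()
            .filter(|(_, s)| !s.disabled)
            .map(|(name, _)| name.clone())
            .collect();
    }
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for token in spec.split(',').map(str::trim).filter(|t| !t.is_empty()) {
        // A token that matches a tag expands to every enabled service with it.
        let tagged: Vec<String> = services
            .iter()
            .filter(|(_, s)| !s.disabled && s.tags.iter().any(|t| t.as_str() == token))
            .map(|(name, _)| name.clone())
            .collect();
        // Otherwise the token is treated as a bare name and passes through.
        let names = if tagged.is_empty() { vec![token.to_string()] } else { tagged };
        for name in names {
            if seen.insert(name.clone()) {
                out.push(name);
            }
        }
    }
    out
}
```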

Checks (one Rust file each)

  • check_cargo (prio 1) - Cargo.toml deps validated against the embedded rust_versions policy; honors [package.metadata.hero_checks.cargo_exceptions]
  • cargo_check_update (prio 1) - cargo check + cargo update
  • cargo_build (prio 2) - lab build <crate>
  • proc_run (prio 2) - lab build --restart --force for crates with a server part
  • upload (prio 3) - lab build --platforms <host> --upload --update for the host OS only
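
How the priorities above interact with `--level` can be sketched as a small check table filtered by `priority <= level`. The `CheckSpec` struct and `select_by_level` are hypothetical names; the real modules expose `ID`, `VERSION`, `PRIORITY`, and `SKILLS` constants per file.

```rust
/// Hypothetical registry row: one entry per check module.
pub struct CheckSpec {
    pub id: &'static str,
    pub priority: u8,
}

/// The five checks with the priorities listed above, in canonical order.
pub const CHECKS: &[CheckSpec] = &[
    CheckSpec { id: "check_cargo", priority: 1 },
    CheckSpec { id: "cargo_check_update", priority: 1 },
    CheckSpec { id: "cargo_build", priority: 2 },
    CheckSpec { id: "proc_run", priority: 2 },
    CheckSpec { id: "upload", priority: 3 },
];

/// Keep the checks whose priority is at or below the requested level.
pub fn select_by_level(level: u8) -> Vec<&'static str> {
    CHECKS
        .iter()
        .filter(|c| c.priority <= level)
        .map(|c| c.id)
        .collect()
}
```

With these priorities, level 1 selects only the two Cargo checks, level 2 adds `cargo_build` and `proc_run`, and level 3 adds `upload`.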

Per-crate skips via [package.metadata.hero_checks] skip = [...]. Re-run policy: a check re-runs on --force, a version bump, a source-hash change, or a prior non-pass.
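
The re-run policy can be written as a single predicate. The sketch below trims `CheckEntry` to the three fields the decision needs and passes a missing entry as `None`; the exact signature in `checks_toml.rs` may differ.

```rust
/// Trimmed-down view of a checks.toml entry: only the fields that the
/// re-run decision reads.
#[derive(Debug, Clone)]
pub struct CheckEntry {
    pub version: u32,
    pub status: String,
    pub source_hash: Option<String>,
}

/// Decide whether a check must run again for one crate.
pub fn needs_run(
    entry: Option<&CheckEntry>,
    current_version: u32,
    current_hash: &str,
    force: bool,
) -> bool {
    if force {
        return true;
    }
    // No recorded entry (never run, or removed from checks.toml by hand).
    let Some(e) = entry else {
        return true;
    };
    e.version < current_version                              // check code was bumped
        || e.source_hash.as_deref() != Some(current_hash)    // crate sources changed
        || e.status != "pass"                                // last run did not pass
}
```

Treating a missing entry as "must run" is what makes the workflow from the issue work: deleting an entry from `checks.toml` and running `lab checks` again re-executes that check.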

CLI

--repos, --test, --crate, --name, --level (1..=3), --force, --delete (one-call teardown via action_clean_by_tag on the hero_check tag), --list, --status, --json.

Files

Created under crates/lab/src/checker/: mod.rs (registry + dispatch), checks_toml.rs, discovery.rs, orchestrator.rs, management.rs, and the five check modules. Added crates/lab/src/agent/agent_check.rs (+ re-export). Wired Check into main.rs and pub mod checker; into lib.rs.

Reuse (no forks)

Crate discovery via builder::cargo_discovery, hashing via builder::hashing::source_hash, the embedded version policy via builder::policy::embedded(), and repo resolution via service::services_toml.

Notes / caveats

  • A live end-to-end run against a running hero_proc (job submission + agent repair loops) is left for manual validation.
  • proc_run/upload run real builds/uploads; gate them with --level accordingly (defaults to level 1, which excludes them).
Reference
lhumina_code/hero_skills#304