lab checks #304
Reference
lhumina_code/hero_skills#304
We want a self-checker for the repository. The repository contains binaries; each binary sits in a crate, and a crate can hold more than one binary. Each crate also has a service TOML file that specifies how its binary or binaries are used: which sockets, what their role is, and so on.
We now want to create multiple checks. There is one Rust file per check, in which we use these files to validate as much as we can in code, for example the versions of code we use and whether certain things are done properly, and also run some AI checks. In the root of the repository we keep checks.toml, which tracks all the checks. Each check is labeled with a version, so that when we ship a new version of a check in our code we know it has to be run again. Each check also gets a priority (1, 2 or 3), so that we can define which checks we want to execute.
Some checks just call an agent. We use our agent SDK to run it, and the check defines which agent is used, the quality of the agent, what it can do, and so on. All of this goes through HeroProc: the lab command drives the checks, whether that means calling an agent or doing what can be done fast in Rust directly in Rust.
We keep track of the status in the checks.toml metadata file: the date of the last success, the status of the check, its version and its name, per binary and per crate, so we really know whether the checks were successful.
This also makes re-running easy: if we remove an entry from that TOML file and run lab checks again with a level specified, it goes over the repo and runs all of these checks again. We can also attach fixes to a check, identifying an agent or skills that could repair it.
use skills
/hero_service_toml_info
/hero_proc_sdk
implementation
run over multiple repos
so that 'lab check' itself doesn't take long, it creates lots of jobs based on actions in a run
we need lab check --delete (stops/deletes jobs and actions too; check the code in proc)
--list (so we see the runs)
--status
--test ... (specify the specific test we want to run; also needed to run inside a specific job)
--force (run regardless of recorded state)
--repos name,name (namefixed and optional; if specified as "all" then it is hero_... in PATH_CODE; if not specified then the repo we are in (find the top; functions exist in hero_lib core base class))
... add more as needed
check_cargo
cargo_check_update.rs
cargo_build.rs
proc_run.sh
upload.sh
Implementation Spec for Issue #304
Objective
Add a lab check (alias lab checks) subcommand to the lab crate that drives a repository self-checker. lab check is a fast orchestrator: it discovers crates that have a service.toml, then for each crate + check it creates jobs/actions in a single hero_proc run (max 8 concurrent), wiring dependencies between steps. Each job re-invokes lab check --test <name> ... to run one specific test fast in Rust; on failure the test escalates to lab agent --model 1 with the right skills, retrying up to 5 times. Results (status, date, version, crate, test name) are persisted in a checks.toml at each repo root. The command supports --delete, --list, --status, --test, --force, and --repos.
Requirements
lab check (with checks alias) dispatched from main.rs, all logic under crates/lab/src/checker/.
Run name: --name if given, else hero_check_find_$mm_dd_hh_mm.
One hero_proc run with max_concurrency = 8.
Jobs are named $runname_<repo>_<crate>_<test>; jobs re-invoke lab check --test <test> --crate <crate> --repos <repo> so the orchestrator returns fast.
Dependencies are wired with ActionBuilder::depends_on so per-crate test stages run in order (check_cargo -> cargo_check_update -> cargo_build -> proc_run -> upload).
The repo set is resolved from crates/lab/src/service/services.toml (the existing service manifest), via the existing loader crate::service::services_toml:
  --repos all => every enabled (disabled = false) entry from services.toml, i.e. crate::service::services_toml::all_services().
  --repos <name,name> => comma-split; each token resolved through services_toml::resolve_name_or_tag (so a tag like infra/ai/dev expands to its services, and a bare name passes through).
  --repos not specified => the current repo only (walk up to .git via crate::repo::paths::path_home(None)), so a developer can check just the repo they are in.
Repos are mapped to directories under $PATH_CODE (reuse the repo_local_path / path_code_for pattern from builder/all.rs).
Only check crates that have a service.toml at their root; respect a per-crate skip mechanism.
Five checks: check_cargo, cargo_check_update, cargo_build, proc_run, upload. Each marks checks.toml on success and runs an agent-repair loop (max 5) on failure.
checks.toml at repo root records per crate + per test: test name, version, status, last-success date, priority.
Use the embedded version policy ([package.metadata.hero_builder.rust_versions] in lab/Cargo.toml, generated into OUT_DIR/policy.rs) for check_cargo via lab::builder::policy::embedded().
--delete stops/deletes jobs, runs, and actions for the run; --list lists runs; --status shows status; --force runs regardless of recorded state.
Files to Modify/Create
crates/lab/src/main.rs: add TopCmd::Check { ... } variant + dispatch arm cmd_check(...). (Modify)
crates/lab/src/lib.rs: add pub mod checker;. (Modify)
crates/lab/src/checker/mod.rs: orchestrator: run-name resolution, repo/crate discovery, hero_proc run + job/action creation, single --test dispatch, --delete/--list/--status handlers, the check registry (id, version, priority, level, skills). (Create)
crates/lab/src/checker/checks_toml.rs: ChecksToml schema + load/save at <repo>/checks.toml; per-crate-per-test entry { name, version, priority, status, last_success, source_hash }; re-run policy. (Create)
crates/lab/src/checker/discovery.rs: repo-set resolution via service::services_toml, plus crate discovery filtered by sibling service.toml, server-part detection, and skip detection. (Create)
crates/lab/src/checker/check_cargo.rs: deterministic Cargo.toml version/policy check (consumes policy::embedded()), exception support, agent fallback. (Create)
crates/lab/src/checker/cargo_check_update.rs: cargo check + cargo update per crate, agent fallback. (Create)
crates/lab/src/checker/cargo_build.rs: lab build <crate> per crate, agent fallback. (Create)
crates/lab/src/checker/proc_run.rs: lab build --restart --force for crates whose service has a server part, agent fallback prompting /hero_proc_cmd. (Create)
crates/lab/src/checker/upload.rs: lab build --platforms <host-os> --upload --update for current OS, agent fallback prompting /hero_proc_cmd. (Create)
crates/lab/src/agent/agent_check.rs: crate::agent wrapper for check-repair invocations (model 1/quality + skill attachment in the prompt), mirroring agent_repair.rs. (Create)
crates/lab/src/agent/mod.rs: re-export the new wrapper. (Modify)
Implementation Plan
Step 1: Wire the lab check subcommand skeleton
Files: crates/lab/src/main.rs, crates/lab/src/lib.rs, crates/lab/src/checker/mod.rs
lib.rs: add pub mod checker;.
main.rs: add a TopCmd::Check variant (model the flag block on the existing Build/Infocheck variants). Flags: repos: Option<String>, test: Option<String>, crate_: Option<String> (clap --crate), name: Option<String>, level: Option<u8> (priority 1/2/3, default 1), force: bool, delete: bool, list: bool, status: bool, json: bool. Add #[command(alias = "checks")].
cmd_check(...) (follow the cmd_infocheck pattern), building a CheckOpts and calling lab::checker::run(opts).
checker::mod.rs exposes pub fn run(opts: CheckOpts) -> anyhow::Result<()> and pub struct CheckOpts. Branch: --list/--status/--delete short-circuit; --test set => run a single check inline; otherwise orchestrate.
Dependencies: none
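
The branching described in Step 1 can be sketched with the standard library alone. This is a minimal illustration, not the actual checker code: the CheckOpts fields follow the flag list above, while the Mode enum, select_mode, effective_level, and the precedence order among --list, --status and --delete are assumptions made for the example.

```rust
/// Options collected from the CLI; field names follow the flag list in Step 1.
#[derive(Debug, Default, Clone)]
pub struct CheckOpts {
    pub repos: Option<String>,
    pub test: Option<String>,
    pub crate_: Option<String>,
    pub name: Option<String>,
    pub level: Option<u8>,
    pub force: bool,
    pub delete: bool,
    pub list: bool,
    pub status: bool,
    pub json: bool,
}

/// Hypothetical enum naming the branches that `checker::run` chooses between.
#[derive(Debug, PartialEq, Eq)]
pub enum Mode {
    List,
    Status,
    Delete,
    SingleTest(String),
    Orchestrate,
}

/// Management flags short-circuit first, then `--test`, else orchestrate.
/// The order among list/status/delete is an assumption of this sketch.
pub fn select_mode(opts: &CheckOpts) -> Mode {
    if opts.list {
        Mode::List
    } else if opts.status {
        Mode::Status
    } else if opts.delete {
        Mode::Delete
    } else if let Some(test) = &opts.test {
        Mode::SingleTest(test.clone())
    } else {
        Mode::Orchestrate
    }
}

/// Priority level to run; the spec gives 1 as the default.
pub fn effective_level(opts: &CheckOpts) -> u8 {
    opts.level.unwrap_or(1)
}
```

Keeping the mode decision in a pure function like this makes the dispatch testable without a running hero_proc.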
Step 2: checks.toml schema + load/save + re-run policy
Files: crates/lab/src/checker/checks_toml.rs
Types: ChecksToml { crates: BTreeMap<String, CrateChecks> }, CrateChecks { checks: BTreeMap<String, CheckEntry> }, CheckEntry { name, version: u32, priority: u8, status: String, last_success: Option<String>, source_hash: Option<String>, reason: Option<String> }.
path() = <repo_root>/checks.toml (per the issue, repo root, NOT ~/hero/var).
load_or_default / save copying the toml read/write pattern from builder/all.rs AllState.
needs_run(entry, current_version, current_hash, force): true if force, missing, entry.version < current_version, hash changed, or status != "pass".
mark_success / mark_fail mutators that persist.
Source hash via builder::hashing::source_hash(crate_root).
Dependencies: none (parallel with Step 1)
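
The re-run policy in Step 2 is a pure decision function, so it can be sketched without any TOML handling. The following is an illustrative version under assumptions: the CheckEntry fields follow the spec, but passing the entry as an Option (to model a missing entry) and the field types for name and status are choices made for this example.

```rust
/// One recorded result for a (crate, check) pair; fields follow the spec's CheckEntry.
#[derive(Debug, Clone)]
pub struct CheckEntry {
    pub name: String,
    pub version: u32,
    pub priority: u8,
    pub status: String,
    pub last_success: Option<String>,
    pub source_hash: Option<String>,
    pub reason: Option<String>,
}

/// Returns true when the check has to run again:
/// forced, never recorded, check version bumped, source hash changed, or last status not "pass".
pub fn needs_run(
    entry: Option<&CheckEntry>,
    current_version: u32,
    current_hash: &str,
    force: bool,
) -> bool {
    if force {
        return true;
    }
    // A missing entry means the check never ran for this crate.
    let Some(entry) = entry else {
        return true;
    };
    entry.version < current_version
        || entry.source_hash.as_deref() != Some(current_hash)
        || entry.status != "pass"
}
```

With this shape, deleting an entry from checks.toml and running lab check again re-runs that check, which matches the workflow described in the issue.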
Step 3: Crate/repo discovery (driven by services.toml)
Files: crates/lab/src/checker/discovery.rs
resolve_repos(repos: Option<&str>) -> Vec<PathBuf>:
  None => current repo only via crate::repo::paths::path_home(None).
  Some("all") => service::services_toml::all_services() (enabled entries), each mapped to its repo dir under PATH_CODE.
  Some(list) => split on commas, each token through service::services_toml::resolve_name_or_tag (expands tags), flatten + dedupe, map to repo dirs.
  Reuse the repo_local_path / path_code_for pattern from builder/all.rs; skip repos not present locally with a clear warning.
crates_with_service_toml(repo_root): call builder::cargo_discovery::load_all_in_repo + discover_binaries_in_repo (as infocheck::run does), dedupe by crate dir, keep only dirs where crate_dir.join("service.toml").exists().
crate_has_server(crate_dir): parse service.toml into herolib_core::base::ServiceToml and check for an RPC/server socket or _server binary (needed by proc_run).
is_skipped(crate_dir, test_id): read a [package.metadata.hero_checks] skip = [...] table from the crate's Cargo.toml.
Dependencies: none (parallel with Step 1)
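
The three --repos cases in Step 3 can be separated from the services.toml lookup by first classifying the argument. This is a sketch only: RepoSelector and parse_repos are hypothetical names, and tag expansion via resolve_name_or_tag and the mapping to directories under PATH_CODE are deliberately left out.

```rust
/// Hypothetical classification of the `--repos` argument before any lookup happens.
#[derive(Debug, PartialEq, Eq)]
pub enum RepoSelector {
    /// `--repos` not given: only the repo we are in.
    CurrentRepo,
    /// `--repos all`: every enabled services.toml entry.
    All,
    /// `--repos a,b`: names or tags, still unexpanded.
    Tokens(Vec<String>),
}

/// Splits on commas, trims whitespace, drops empty tokens and de-duplicates
/// while preserving the order in which tokens were given.
pub fn parse_repos(arg: Option<&str>) -> RepoSelector {
    let Some(raw) = arg else {
        return RepoSelector::CurrentRepo;
    };
    if raw.trim() == "all" {
        return RepoSelector::All;
    }
    let mut tokens: Vec<String> = Vec::new();
    for token in raw.split(',').map(str::trim).filter(|t| !t.is_empty()) {
        if !tokens.iter().any(|seen| seen.as_str() == token) {
            tokens.push(token.to_string());
        }
    }
    RepoSelector::Tokens(tokens)
}
```

Each token in the Tokens case would then be passed through the existing loader to expand tags into service names.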
Step 4: Orchestrator — build the hero_proc run with jobs/actions + dependencies
Files: crates/lab/src/checker/mod.rs
Mirror builder/all.rs submit_build_run + poll_until_done: use hero_proc_sdk::{RunBuilder, ActionBuilder, hero_proc_factory}; guard with crate::service::hero_proc_exception::is_hero_proc_alive().
Run name: opts.name or format!("hero_check_find_{}", now.format("%m_%d_%H_%M")).
RunBuilder::new(run_name).max_concurrency(8); do not enable cleanup_on_success (keep history).
For each repo -> crate -> check (where priority <= opts.level and checks_toml::needs_run(...)):
  add a job named format!("{run_name}_{repo}_{crate}_{test}") whose script is lab check --test <test> --crate <crate> --repos <repo_path> [--force], tagged "hero_check".
  chain it with .depends_on(prev_action_name): check_cargo -> cargo_check_update -> cargo_build -> proc_run -> upload.
.submit(&hp).await? once; print the run URL (http://localhost:9988/hero_proc/admin/#/runs/{run_id}). Returns fast.
Dependencies: Steps 1, 2, 3
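
The job naming and per-crate dependency chain from Step 4 can be planned as plain data before anything is handed to hero_proc. The sketch below assumes the priorities listed in the Implementation Summary and uses hypothetical names (PlannedJob, plan_crate_jobs, default_run_name); it does not call hero_proc_sdk, skips the needs_run and server-crate filters, and takes the timestamp parts as arguments instead of reading the clock.

```rust
/// Check ids in canonical order with the priorities given in the Implementation Summary.
pub const CHECKS: [(&str, u8); 5] = [
    ("check_cargo", 1),
    ("cargo_check_update", 1),
    ("cargo_build", 2),
    ("proc_run", 2),
    ("upload", 3),
];

/// Default run name in the hero_check_find_<mm_dd_hh_mm> form, zero-padded.
pub fn default_run_name(month: u32, day: u32, hour: u32, minute: u32) -> String {
    format!("hero_check_find_{:02}_{:02}_{:02}_{:02}", month, day, hour, minute)
}

/// Hypothetical plain-data description of one job in the run.
pub struct PlannedJob {
    pub name: String,
    pub command: String,
    pub depends_on: Option<String>,
}

/// Plans the jobs for one crate: keep checks with priority <= level,
/// and make each job depend on the previous one that was kept.
pub fn plan_crate_jobs(run_name: &str, repo: &str, krate: &str, level: u8) -> Vec<PlannedJob> {
    let mut jobs = Vec::new();
    let mut prev: Option<String> = None;
    for (test, priority) in CHECKS {
        if priority > level {
            continue;
        }
        let name = format!("{run_name}_{repo}_{krate}_{test}");
        let command = format!("lab check --test {test} --crate {krate} --repos {repo}");
        jobs.push(PlannedJob {
            name: name.clone(),
            command,
            depends_on: prev.clone(),
        });
        prev = Some(name);
    }
    jobs
}
```

For example, plan_crate_jobs("r", "hero_db", "hero_db_server", 2) (the crate name is made up) yields four chained jobs and leaves out upload, which has priority 3.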
Step 5: Agent-repair wrapper for checks
Files: crates/lab/src/agent/agent_check.rs, crates/lab/src/agent/mod.rs
agent_check.rs modeled on agent_repair.rs: pub async fn agent_fix_check(ctx: &CheckFixContext) -> AgentFixResult, using herolib_ai::agent::Agent::claude(quality) with PermissionMode::DangerousSkipPermissions, .streaming(true), .working_dir(crate_root).
CheckFixContext { crate_root, crate_name, test_id, attempt_number, max_attempts, skills: Vec<&str>, failure_detail }.
Skills are named in the prompt (check_cargo => /blueprint_base_check + /rust_versions; proc_run/upload => /hero_proc_cmd). Map --model 1 to the fast/haiku tier, escalate by attempt.
Re-export from agent/mod.rs. Per the checker boundary rule, checks call crate::agent::agent_fix_check, never herolib_ai directly.
Dependencies: none (parallel with Steps 1-4); used by Step 6
Step 6: The five check implementations (single-test execution path)
Files: crates/lab/src/checker/{check_cargo,cargo_check_update,cargo_build,proc_run,upload}.rs
Each file: pub const ID: &str, pub const VERSION: u32, pub const PRIORITY: u8, pub const SKILLS: &[&str], and pub fn run(crate_dir: &Path, force: bool) -> Result<bool>. The fast Rust path runs; on failure it loops up to 5 times calling crate::agent::agent_fix_check then re-running the fast path; on success it calls checks_toml::mark_success.
check_cargo.rs: load policy::embedded(); walk the crate's Cargo.toml with toml_edit, flag deps deviating from policy; honor [package.metadata.hero_checks.cargo_exceptions]. On fail => agent with ["blueprint_base_check","rust_versions"].
cargo_check_update.rs: run cargo check then cargo update in the workspace root; on fail => agent.
cargo_build.rs: invoke lab build for the crate; on fail => agent.
proc_run.rs: only if crate_has_server; run lab build --restart --force; on fail => agent with /hero_proc_cmd.
upload.rs: detect host OS; run lab build --platforms <host-label> --upload --update only for that OS; on fail => agent with /hero_proc_cmd.
Register each check in mod.rs's check table.
Dependencies: Steps 2, 3, 5
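
The shared control flow of the five checks (fast path, then up to 5 repair-and-retry rounds) can be written once and reused. The sketch below is an assumption about how that loop might look: run_with_repair and Outcome are hypothetical names, the fast check and the agent call are passed in as closures, and the real agent_fix_check is async and takes a CheckFixContext rather than an attempt number and a message.

```rust
/// Maximum number of agent repair attempts, per the spec.
pub const MAX_ATTEMPTS: u32 = 5;

/// Hypothetical result of one check run, including how many repairs were needed.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Passed { repairs: u32 },
    Failed { detail: String },
}

/// Runs the fast check; on failure calls `repair` with the attempt number and the
/// last failure detail, then re-runs the fast check, up to MAX_ATTEMPTS times.
pub fn run_with_repair<C, R>(mut fast_check: C, mut repair: R) -> Outcome
where
    C: FnMut() -> Result<(), String>,
    R: FnMut(u32, &str),
{
    let mut last_error = match fast_check() {
        Ok(()) => return Outcome::Passed { repairs: 0 },
        Err(detail) => detail,
    };
    for attempt in 1..=MAX_ATTEMPTS {
        repair(attempt, &last_error);
        match fast_check() {
            Ok(()) => return Outcome::Passed { repairs: attempt },
            Err(detail) => last_error = detail,
        }
    }
    Outcome::Failed { detail: last_error }
}
```

A caller would record Passed through checks_toml::mark_success and Failed through mark_fail; passing the attempt number to the repair step leaves room for the "escalate by attempt" rule.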
Step 7: --delete, --list, --status handlers
Files: crates/lab/src/checker/mod.rs
--list: list hero_proc runs whose name starts with hero_check_; print id/name/status.
--status: for the named/most-recent check run, print per-job phase (like poll_until_done).
--delete: mirror service/fast_teardown.rs::wipe_service (job_stop/SIGKILL then run_delete + action_delete); or use action_clean_by_tag with the "hero_check" tag for one-call teardown.
Dependencies: Steps 1, 4
Acceptance Criteria
lab check --help lists --repos, --test, --crate, --name, --level, --force, --delete, --list, --status; lab checks works as an alias.
lab check creates one hero_proc run named hero_check_find_<mm_dd_hh_mm> (or --name) with max_concurrency=8, returns quickly, prints the run URL.
Jobs are named <runname>_<repo>_<crate>_<test> and per-crate run in the order check_cargo -> cargo_check_update -> cargo_build -> proc_run -> upload via depends_on.
The repo set comes from services.toml: --repos all = all enabled entries, tags expand, bare names pass through, no --repos = current repo only.
Only crates with a sibling service.toml are checked; proc_run only for server crates; upload only on the host OS; skipped crates/tests honored.
lab check --test <t> --crate <c> runs the fast Rust check, and on failure calls an agent (--model 1) with the mandated skills, retrying up to 5 times.
On success, <repo>/checks.toml records test name, version, priority, status pass, and last_success for that crate+test; a version bump or source-hash change forces a re-run; --force re-runs regardless.
--list shows check runs, --status shows per-job status, --delete stops and removes the run's jobs/runs/actions.
check_cargo validates against the embedded rust_versions policy and honors per-crate exceptions.
cargo build -p lab succeeds; lab check twice with no source changes does no work the second time (idempotent).
Notes
The checker/ directory already exists with design docs (instructions.md, ideas.md). Honor its hard rules: reuse builder/cargo_discovery (load_all_in_repo, discover_binaries_in_repo) and builder/hashing::source_hash; do not write a second crate walker or hash function.
checks.toml lives at the repo root and is per-crate/per-binary (not ~/hero/var).
The version policy is embedded in lab/Cargo.toml under [package.metadata.hero_builder.rust_versions] and generated into OUT_DIR/policy.rs; consume via lab::builder::policy::embedded(). Do not re-embed /rust_versions.
Repo resolution uses the existing crate::service::services_toml loader (all_services, resolve_name_or_tag); services.toml is the single source of truth for which repos all covers (enabled = not disabled).
crates/lab/src/builder/all.rs (lab build all) already does this exact pattern: discover repos, submit one hero_proc run of inline jobs that re-invoke the lab binary, poll, persist a state TOML. The check orchestrator is structurally lab build all with a different job command and per-crate dependency chains.
Guard with is_hero_proc_alive() and emit the same helpful error as all.rs.
Skills are referenced by name in the agent prompt (the skills to use are /blueprint_base_check and /rust_versions), not embedded as files. --model 1 maps to a fast/haiku quality tier; escalate by attempt.
Test Results
cargo build -p lab: success (clean compile, 1 pre-existing unrelated warning)
cargo test -p lab: all targets green
Smoke checks (no hero_proc run triggered)
lab check --help and lab checks --help (alias) list all flags: --repos, --test, --crate, --name, --level, --force, --delete, --list, --status, --json.
lab check --list connects to hero_proc and reports No check runs found. (exit 0).
The orchestration, single-test, and agent-repair paths are I/O- and hero_proc-bound; so far they are covered only by the clean build and the CLI wiring. A live end-to-end run (submitting jobs, agent repair loops) is left for manual validation against a running hero_proc.
Implementation Summary
Added the lab check (alias lab checks) repository self-checker to the lab crate.
What it does
lab check is a fast orchestrator. It resolves the target repos from services.toml, discovers the crates in each that ship a service.toml, and for every (crate x check) it submits one inline job into a single hero_proc run (max 8 concurrent). Each job re-invokes lab check --test <id> --crate <name> --repos <path> to run one check fast in Rust; per crate the jobs are chained with depends_on in the canonical order. On failure a check escalates to an agent (crate::agent, mapping the issue's --model 1) and retries up to 5 times. Results are recorded per crate + per check in a checks.toml at each repo root.
Repo selection (from services.toml)
--repos all => every enabled entry in services.toml
--repos infra,hero_db => tags expand, bare names pass through
no --repos => the current repo only (walk up to .git)
Checks (one Rust file each)
check_cargo (prio 1): Cargo.toml deps validated against the embedded rust_versions policy; honors [package.metadata.hero_checks.cargo_exceptions]
cargo_check_update (prio 1): cargo check + cargo update
cargo_build (prio 2): lab build <crate>
proc_run (prio 2): lab build --restart --force for crates with a server part
upload (prio 3): lab build --platforms <host> --upload --update for the host OS only
Per-crate skips via [package.metadata.hero_checks] skip = [...]. Re-run policy: a check re-runs on --force, a version bump, a source-hash change, or a prior non-pass.
CLI
--repos, --test, --crate, --name, --level (1..=3), --force, --delete (one-call teardown via action_clean_by_tag on the hero_check tag), --list, --status, --json.
Files
Created under crates/lab/src/checker/: mod.rs (registry + dispatch), checks_toml.rs, discovery.rs, orchestrator.rs, management.rs, and the five check modules. Added crates/lab/src/agent/agent_check.rs (+ re-export). Wired Check into main.rs and pub mod checker; into lib.rs.
Reuse (no forks)
Crate discovery via builder::cargo_discovery, hashing via builder::hashing::source_hash, the embedded version policy via builder::policy::embedded(), and repo resolution via service::services_toml.
Notes / caveats
proc_run/upload run real builds/uploads; gate them with --level accordingly (defaults to level 1, which excludes them).