lab: harden fresh-install path on Ubuntu 24 — install.sh, user init, apt SIGHUP, env hydration #286
Reference
lhumina_code/hero_skills!286
Summary
Hardens the documented new-user onboarding path (`curl … install.sh | bash` → `lab user init` → `lab install core`) so it succeeds end-to-end on a fresh Ubuntu 24 box. Each fix below was reproduced on a clean VM before and verified after.

Refs: hero_skills#281 (the umbrella issue this PR is based on; the four bugs listed in #281 are addressed, plus four more found during verification).
What changes (3 commits, 7 files, +454/-20)
crates/lab/install.sh
- Appends `export PATH="$HOME/hero/bin:$PATH"` to `~/.bashrc` and `~/.zshrc` (grep-then-append, marker line) so new shells find `lab` without the user pasting cargo's PATH hint.
- Prints `Next step: lab user init` at the end. Closes the gap where `lab path` errored with "PATH_ROOT is not set" because no one told the new user about `lab user init`.

lab user init (crates/lab/src/main.rs)
- `install_lab_to_path_root` no longer warns "could not move lab into …: ~/.local/bin/lab not found" when the binary is already correctly placed at `~/hero/bin/lab` (which is what `crates/lab/install.sh`'s `cargo install --root` does).
- Configures `git config --global user.{name,email}` from Forge `/api/v1/user` so the next `lab secrets sync` (which commits to the secrets repo) doesn't fail with "Author identity unknown" on a fresh root account. Skipped silently when an identity is already set. Falls back to `<login>@users.noreply.<forge-host>` if the forge user's email is hidden.
- Rewords the closing message to "secrets cloned; they will be imported … `lab secrets sync` (hero_proc is started by `lab install core`)".

lab path (crates/lab/src/main.rs)
- When stdout is a TTY, prints a normal stderr error ("lab path: PATH_ROOT is not set — run lab user init first") instead of literal `echo "ERROR..." >&2 / exit 1` shell code. The eval-friendly form is still emitted when piped, so `eval "$(lab path)"` still surfaces the error properly.

lab install apt robustness (crates/lab/src/installers/util.rs + installers/base.rs)
- `apt_install` now runs apt-get via a new `run_cmd_detached` helper that calls `nix::unistd::setsid()` in a `pre_exec` hook. This stops the apt batch from being killed mid-run by SIGHUP when its own systemd/udev upgrade triggers an sshd restart. Linux-only; macOS unchanged.
- After apt-get returns, runs `dpkg --audit` and bails (with the audit output + `sudo apt-get -f install` remediation) if any half-configured packages remain. Apt's exit code alone is insufficient: the "Operation was interrupted" warning emits with status 0.
- `cmd_install_base` runs a new `dpkg_repair_if_broken` preflight on entry: if dpkg state is broken from a previous killed install, it automatically runs `dpkg --configure -a` and `apt-get -f install -y` before retrying. The user no longer has to know those incantations.

Env hydration at lab startup (crates/lab/src/main.rs + user/cfg.rs)
Caught during verification on a fresh VM, separately from the original bug list:
- After `lab user init`, running `lab install core` in the same shell (without `source ~/hero/cfg/init.sh`) would bail at the Rust step with "PATH_BUILD is not set — re-install your lab environment", which is misleading because the env was set up correctly on disk, just not in the current shell's env.
- At the top of `main()`, before any subcommand dispatches, hydrate `PATH_ROOT` / `PATH_BUILD` / `PATH_CODE` / `PATH_VAR` / `CARGO_*` / `SCCACHE_DIR` / `FORGE_*` from `$HOME/hero/cfg/hero_cfg.toml` if they're not already set. Pre-existing env always wins (so explicit `export` still takes precedence); no-op when no config exists yet.
- Adds two helpers in `user::cfg`: `apply_env(cfg)` (in-process mirror of `source init.sh`) and `ensure_env_from_disk()` (best-effort load from canonical `$HOME/hero` location).

README
- New "… (`lab user init`)" section documenting the 6 things `lab user init` does, plus a real `hero_cfg.toml` schema example with optional `[mycelium]` / `[ssh]` sections, and an explicit "you do NOT normally write this file by hand" note.
- … `install.sh` now wires it.

Test plan
Verified on a freshly-provisioned Ubuntu 24.04 root VM:
- `curl … install.sh | BRANCH=fresh-installer-fixes bash` completes, prints "wired …/bin into ~/.bashrc" and "Next step: lab user init"
- `exec bash` (or new shell) picks up PATH automatically
- `lab user init` (entering FORGE_TOKEN) emits no false "could not move lab" warning; prints `✓ configured git identity: <login> <<email>>`; ends with the reworded "secrets cloned; they will be imported …" message
- `lab install core` runs to completion ("Hero Shell core installation complete!"), survives the systemd upgrade that previously SIGHUP'd it
- `lab secrets sync` succeeds (was failing with "Author identity unknown" before)
- `lab path` in a non-init'd shell prints a normal stderr error to a TTY (vs. raw shell-script output before)
- `lab install core` works even when `~/hero/cfg/init.sh` is not sourced (run via `env -i` to verify)
- `cargo build -p lab` passes (same 5 pre-existing warnings; those are tracked separately and stay out of scope for this PR)

What's intentionally NOT in this PR
For honesty about remaining work — these are open issues, separate scope:
- `hero_aibroker` chain in `lab service core` (hero_skills#282): `lab service core` still fails on `hero_aibroker` smoke tests on a fresh box. Needs a design call.
- `lab repo find | head`, `lab secrets list | head`, etc. still panic with "Broken pipe (os error 32)". Trivial fix (one `signal(SIGPIPE, SIG_DFL)` in main) but kept separate for clean review.
- `lab service core` emits ~430 identical WARN lines per dep wait. Tracked separately.
- Only the `lab user init` / `hero_cfg.toml` README sections were updated. The "Build mode" / "Service lifecycle" / env-var-name sections still reference retired top-level flags and the old `CODEROOT` name.

Caught during fresh-install verification on Ubuntu 24. After `lab user init` succeeded, running `lab install core` in the same shell, without first `source`-ing `~/hero/cfg/init.sh`, blew up at the Rust step:

    === Installing Rust toolchain ===
    lab install: PATH_BUILD is not set — re-install your lab environment

The error message is misleading: the env was set up correctly on disk (`init.sh` was written and the shell rc was patched), but the *current* shell hadn't sourced it yet. Telling the user to "re-install" is wrong; they just installed.

Fix: at the top of `lab`'s main(), before any subcommand dispatches, hydrate PATH_ROOT / PATH_BUILD / PATH_CODE / PATH_VAR / CARGO_* / SCCACHE_DIR / FORGE_* from `$HOME/hero/cfg/hero_cfg.toml` if they're not already set. No-op when env is already populated (so an explicit `export` always wins) or when no config exists yet (pre-init).

Adds two helpers in `user::cfg`:
- `apply_env(cfg)`: mirror of what `source init.sh` does, applied to the current process's env.
- `ensure_env_from_disk()`: best-effort load of hero_cfg.toml from the canonical `$HOME/hero` location, then apply_env. Returns the resolved root or None when no config is found.

Refs: hero_skills#281
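
For reviewers who want the precedence rule in isolation, here is a minimal sketch of "hydrate only what is missing". It is not the code in `user::cfg`: the function name `hydrate_missing`, the use of plain maps instead of the real process environment, and the example paths are all assumptions made for illustration. The actual helpers (`apply_env`, `ensure_env_from_disk`) read `hero_cfg.toml` and write to the process env, which this sketch deliberately avoids so it stays self-contained.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: copy each configured variable into `env` unless it
/// is already present. Returns the names that were filled in, sorted by key.
fn hydrate_missing(
    env: &mut BTreeMap<String, String>,
    from_disk: &BTreeMap<String, String>,
) -> Vec<String> {
    let mut filled = Vec::new();
    for (key, value) in from_disk {
        // A value that is already present always wins over the on-disk one.
        if !env.contains_key(key) {
            env.insert(key.clone(), value.clone());
            filled.push(key.clone());
        }
    }
    filled
}

fn main() {
    // Made-up values standing in for what a parsed config would provide.
    let mut from_disk = BTreeMap::new();
    from_disk.insert("PATH_ROOT".to_string(), "/home/demo/hero".to_string());
    from_disk.insert("PATH_BUILD".to_string(), "/home/demo/hero/build".to_string());

    // Simulated shell environment with an explicit PATH_ROOT export.
    let mut env = BTreeMap::new();
    env.insert("PATH_ROOT".to_string(), "/opt/custom".to_string());

    let filled = hydrate_missing(&mut env, &from_disk);
    println!("filled: {:?}", filled);
    println!("PATH_ROOT={}", env["PATH_ROOT"]);
    println!("PATH_BUILD={}", env["PATH_BUILD"]);
}
```

In this run only `PATH_BUILD` is filled in, and the explicit `PATH_ROOT=/opt/custom` survives. An empty `from_disk` map leaves `env` untouched, which corresponds to the pre-init case where no config exists yet.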