lab: harden fresh-install path on Ubuntu 24 — install.sh, user init, apt SIGHUP, env hydration #286

Merged
nabil_salah merged 3 commits from fresh-installer-fixes into development 2026-05-24 10:27:30 +00:00

Summary

Hardens the documented new-user onboarding path (curl … install.sh | bash → lab user init → lab install core) so it succeeds end-to-end on a fresh Ubuntu 24 box. Each fix below was reproduced on a clean VM before and verified after.

Refs: hero_skills#281 (the umbrella issue this PR is based on; the four bugs listed there are addressed, plus four more found during verification).

What changes (3 commits, 7 files, +454/-20)

crates/lab/install.sh

  • Appends an idempotent export PATH="$HOME/hero/bin:$PATH" to ~/.bashrc and ~/.zshrc (grep-then-append, marker line) so new shells find lab without the user pasting cargo's PATH hint.
  • Prints a clear Next step: lab user init at the end. Closes the gap where lab path errored with "PATH_ROOT is not set" because no one told the new user about lab user init.
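
The grep-then-append idea can be sketched as a pure function. install.sh itself is a shell script; the sketch below is in Rust only to match the rest of the crate, and the function name and marker text are hypothetical, not taken from the PR.

```rust
/// Return the rc-file contents with `export_line` appended under `marker`,
/// or `None` when the marker is already present (nothing to do).
fn wire_path(rc_contents: &str, marker: &str, export_line: &str) -> Option<String> {
    // The "grep" step: a line equal to the marker means an earlier run
    // already wired PATH, so the file is left untouched.
    if rc_contents.lines().any(|line| line.trim() == marker) {
        return None;
    }
    let mut out = String::from(rc_contents);
    // Make sure the appended block starts on its own line.
    if !out.is_empty() && !out.ends_with('\n') {
        out.push('\n');
    }
    out.push_str(marker);
    out.push('\n');
    out.push_str(export_line);
    out.push('\n');
    Some(out)
}
```

Calling it a second time on its own output returns `None`, which is what makes re-running the installer safe.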

lab user init (crates/lab/src/main.rs)

  • install_lab_to_path_root no longer warns "could not move lab into …: ~/.local/bin/lab not found" when the binary is already correctly placed at ~/hero/bin/lab (which is what crates/lab/install.sh's cargo install --root does).
  • Configures git config --global user.{name,email} from Forge /api/v1/user so the next lab secrets sync (which commits to the secrets repo) doesn't fail with "Author identity unknown" on a fresh root account. Skipped silently when an identity is already set. Falls back to <login>@users.noreply.<forge-host> if the forge user's email is hidden.
  • Reworded the "skipping secrets import" message: previously read like a failure ("hero_proc not running — skipping secrets import"), now says "secrets cloned; they will be imported into hero_proc on the next lab secrets sync (hero_proc is started by lab install core)".
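
A minimal sketch of the git identity decision described above. The function name is hypothetical, and it assumes that a hidden email arrives from the Forge API as a missing or empty field, which the PR does not state.

```rust
/// Decide which (user.name, user.email) pair to configure, or `None` when
/// git already has an identity and should be left alone.
fn git_identity(
    already_set: bool,
    login: &str,
    forge_email: Option<&str>,
    forge_host: &str,
) -> Option<(String, String)> {
    if already_set {
        return None; // skipped silently, as described in the PR
    }
    let email = match forge_email.map(str::trim) {
        // Use the address the forge reports when there is one.
        Some(addr) if !addr.is_empty() => addr.to_string(),
        // Assumption: hidden email shows up as absent or empty.
        _ => format!("{login}@users.noreply.{forge_host}"),
    };
    Some((login.to_string(), email))
}
```

For example, `git_identity(false, "alice", None, "forge.example")` yields the login plus `alice@users.noreply.forge.example`, where `forge.example` is a placeholder host.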

lab path (crates/lab/src/main.rs)

  • TTY-aware error: when stdout is a terminal, prints a normal stderr error ("lab path: PATH_ROOT is not set — run lab user init first") instead of literal echo "ERROR..." >&2 / exit 1 shell code. The eval-friendly form is still emitted when piped, so eval "$(lab path)" still surfaces the error properly.
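
The TTY switch can be sketched as follows, using only the standard library. The type and function names are hypothetical, and the exact shell text that lab emits is not reproduced here; the snippet only illustrates the two output shapes.

```rust
use std::io::{self, IsTerminal};

/// Where a rendered error should go.
#[derive(Debug, PartialEq)]
enum Rendered {
    /// Plain text for a human, written to stderr.
    Stderr(String),
    /// Shell code written to stdout, meant to be consumed by `eval`.
    Stdout(String),
}

/// Quote `s` for POSIX sh by wrapping it in single quotes.
fn sh_quote(s: &str) -> String {
    format!("'{}'", s.replace('\'', r"'\''"))
}

/// Pick the output shape from whether stdout is a terminal.
fn render_error(msg: &str, stdout_is_tty: bool) -> Rendered {
    if stdout_is_tty {
        Rendered::Stderr(format!("lab path: {msg}"))
    } else {
        // Piped (for example into eval): emit code that reports the error.
        Rendered::Stdout(format!("echo {} >&2; exit 1", sh_quote(msg)))
    }
}

/// Emit the error on the stream chosen for the current stdout.
fn report(msg: &str) {
    match render_error(msg, io::stdout().is_terminal()) {
        Rendered::Stderr(text) => eprintln!("{text}"),
        Rendered::Stdout(code) => println!("{code}"),
    }
}
```

Keeping the decision in `render_error` and the I/O in `report` lets the two branches be checked without a real terminal.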

lab install apt robustness (crates/lab/src/installers/util.rs + installers/base.rs)

  • apt_install now runs apt-get via a new run_cmd_detached helper that calls nix::unistd::setsid() in a pre_exec hook. This stops the apt batch from being killed mid-run by SIGHUP when its own systemd/udev upgrade triggers an sshd restart. Linux-only; macOS unchanged.
  • After every apt invocation, runs dpkg --audit and bails (with the audit output + sudo apt-get -f install remediation) if any half-configured packages remain. Apt's exit code alone is insufficient: the "Operation was interrupted" warning is emitted with exit status 0.
  • cmd_install_base runs a new dpkg_repair_if_broken preflight on entry: if dpkg state is broken from a previous killed install, automatically runs dpkg --configure -a and apt-get -f install -y before retrying. The user no longer has to know those incantations.
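
A dependency-free sketch of the detached-run idea, for Unix only. The PR uses nix::unistd::setsid(); here setsid(2) is declared directly over FFI so the example compiles without external crates. `run_detached` and `audit_is_clean` are hypothetical stand-ins, not the PR's helpers, and the audit check assumes a healthy dpkg database produces no audit output.

```rust
use std::io;
use std::os::unix::process::CommandExt;
use std::process::{Command, ExitStatus};

// POSIX setsid(2), declared by hand instead of pulling in nix or libc.
unsafe extern "C" {
    fn setsid() -> i32;
}

/// Runs in the forked child just before exec: start a new session so the
/// process no longer belongs to the session of the SSH login.
fn new_session() -> io::Result<()> {
    // SAFETY: setsid(2) is async-signal-safe and takes no arguments.
    let rc = unsafe { setsid() };
    if rc == -1 {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}

/// Run `program` to completion in its own session and return its status.
fn run_detached(program: &str, args: &[&str]) -> io::Result<ExitStatus> {
    let mut cmd = Command::new(program);
    cmd.args(args);
    // SAFETY: the hook only calls setsid(2) and allocates nothing on success.
    unsafe {
        cmd.pre_exec(new_session);
    }
    cmd.status()
}

/// Treat any non-blank `dpkg --audit` output as "packages need attention".
fn audit_is_clean(audit_stdout: &str) -> bool {
    audit_stdout.trim().is_empty()
}
```

The child is never a process-group leader right after fork, so the setsid call is expected to succeed there; the parent still waits for the exit status as usual.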

Env hydration at lab startup (crates/lab/src/main.rs + user/cfg.rs)

Caught during verification on a fresh VM, separately from the original bug list:

  • After a successful lab user init, running lab install core in the same shell (without source ~/hero/cfg/init.sh) would bail at the Rust step with "PATH_BUILD is not set — re-install your lab environment" — which is misleading because the env was set up correctly on disk, just not in the current shell's env.
  • Fix: at the top of main(), before any subcommand dispatches, hydrate PATH_ROOT / PATH_BUILD / PATH_CODE / PATH_VAR / CARGO_* / SCCACHE_DIR / FORGE_* from $HOME/hero/cfg/hero_cfg.toml if they're not already set. Pre-existing env always wins (so explicit export still takes precedence); no-op when no config exists yet.
  • Adds two helpers in user::cfg: apply_env(cfg) (in-process mirror of source init.sh) and ensure_env_from_disk() (best-effort load from canonical $HOME/hero location).
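
The "pre-existing env always wins" rule can be sketched like this. `hydrate_env` is a hypothetical name, and the pairs are assumed to have been read from hero_cfg.toml already; TOML parsing is left out.

```rust
use std::env;

/// Export each `(name, value)` pair into this process's environment unless
/// the variable is already set; returns the names that were filled in.
fn hydrate_env(pairs: &[(&str, &str)]) -> Vec<String> {
    let mut applied = Vec::new();
    for (name, value) in pairs {
        if env::var_os(name).is_some() {
            continue; // an explicit export in the shell takes precedence
        }
        // SAFETY: meant to run at the top of main(), before any threads
        // exist. `set_var` is an unsafe fn from the 2024 edition on; older
        // editions accept the block and only warn that it is redundant.
        unsafe { env::set_var(name, value) };
        applied.push(name.to_string());
    }
    applied
}
```

Running it twice with different values leaves the first value in place, since the second call sees the variable as already set. With no pairs (no config on disk yet) it does nothing.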

README

  • New "Prerequisites" section with copy-paste blocks for Debian/Ubuntu, macOS, and Fedora (git + build-essential + pkg-config + libssl-dev + rustup).
  • New "First-time setup (lab user init)" section documenting the 6 things lab user init does, plus a real hero_cfg.toml schema example with optional [mycelium] / [ssh] sections, and an explicit "you do NOT normally write this file by hand" note.
  • Tightened the older "make sure ~/hero/bin is on PATH" wording since install.sh now wires it.

Test plan

Verified on a freshly-provisioned Ubuntu 24.04 root VM:

  • curl … install.sh | BRANCH=fresh-installer-fixes bash completes, prints "wired …/bin into ~/.bashrc" and "Next step: lab user init"
  • exec bash (or new shell) picks up PATH automatically
  • lab user init (entering FORGE_TOKEN) emits no false "could not move lab" warning; prints ✓ configured git identity: <login> <<email>>; ends with the reworded "secrets cloned; they will be imported …" message
  • lab install core runs to completion ("Hero Shell core installation complete!"), survives the systemd upgrade that previously SIGHUP'd it
  • lab secrets sync succeeds (was failing with "Author identity unknown" before)
  • lab path in a non-init'd shell prints a normal stderr error to a TTY (vs. raw shell-script output before)
  • Env hydration confirmed: lab install core works even when ~/hero/cfg/init.sh is not sourced (run via env -i to verify)
  • cargo build -p lab passes (same 5 pre-existing warnings — those are tracked separately and stay out of scope for this PR)
  • lab build --fast --restart hero_proc

What's intentionally NOT in this PR

To be clear about remaining work: these are open issues with separate scope.

  • hero_aibroker chain in lab service core (hero_skills#282) — lab service core still fails on hero_aibroker smoke tests on a fresh box. Needs a design call.
  • SIGPIPE panic on piped output: lab repo find | head, lab secrets list | head, etc. still panic with "Broken pipe (os error 32)". Trivial fix (one signal(SIGPIPE, SIG_DFL) in main) but kept separate for clean review.
  • Verbose dependency-wait polling in lab service core — emits ~430 identical WARN lines per dep wait. Tracked separately.
  • 5 (now 6) pre-existing compiler warnings in completions.rs / uninstall.rs / secrets/client.rs / fast_teardown.rs.
  • Full README audit — only the prerequisites + lab user init / hero_cfg.toml sections were updated. The "Build mode" / "Service lifecycle" / env-var-name sections still reference retired top-level flags and the old CODEROOT name.
Signed-off-by: Nabil-Salah <nabil.salah203@gmail.com>
Caught during fresh-install verification on Ubuntu 24.

After `lab user init` succeeded, running `lab install core` in the same
shell — without first `source`-ing `~/hero/cfg/init.sh` — blew up at the
Rust step:

    === Installing Rust toolchain ===
    lab install: PATH_BUILD is not set — re-install your lab environment

The error message is misleading: the env was set up correctly on disk
(`init.sh` was written and the shell rc was patched), but the *current*
shell hadn't sourced it yet. Telling the user to "re-install" is wrong;
they just installed.

Fix: at the top of `lab`'s main(), before any subcommand dispatches,
hydrate PATH_ROOT / PATH_BUILD / PATH_CODE / PATH_VAR / CARGO_* /
SCCACHE_DIR / FORGE_* from `$HOME/hero/cfg/hero_cfg.toml` if they're
not already set. No-op when env is already populated (so an explicit
`export` always wins) or when no config exists yet (pre-init).

Adds two helpers in `user::cfg`:
- `apply_env(cfg)`     — mirror of what `source init.sh` does, applied
                         to the current process's env.
- `ensure_env_from_disk()` — best-effort load of hero_cfg.toml from the
                             canonical `$HOME/hero` location, then
                             apply_env. Returns the resolved root or
                             None when no config is found.

Refs: hero_skills#281
Signed-off-by: Nabil-Salah <nabil.salah203@gmail.com>
mahmoud approved these changes 2026-05-24 10:26:38 +00:00
nabil_salah merged commit a4e2308579 into development 2026-05-24 10:27:30 +00:00
nabil_salah deleted branch fresh-installer-fixes 2026-05-24 10:27:37 +00:00
Reference
lhumina_code/hero_skills!286