lab_install.sh defaults to development/edge channel — reinstalls multi-socket lab, breaks lifecycle against main hero_proc (rpc_jobs.sock) #316

Open
opened 2026-06-08 14:33:43 +00:00 by sameh-farouk · 0 comments
Member

Summary

scripts/lab_install.sh defaults the install channel to development/edge, which ships a lab built against hero_proc_sdk@development (multi-domain sockets: rpc_jobs.sock, rpc_logs.sock, rpc_secrets.sock, rpc_system.sock). On any host running main hero_proc (single rpc.sock), lab's service lifecycle then fails to connect:

cannot connect to hero_proc: failed to connect to hero_proc sockets: socket bind: transport connect:
Connection error: socket not found at '.../hero_proc/rpc_jobs.sock' — service not running or socket directory not mounted

Why this is a footgun

Even after fixing everything else (rebuild hero_proc from main; rebuild lab from main, which eb9f586 already repins to hero_proc_sdk = { branch = "main" }), a later bare lab_install.sh (no BRANCH=main) or a lab self-update silently reinstalls the development/edge multi-socket lab and the rpc_jobs.sock error returns. The failure looks like a hero_proc problem, but the real culprit is which hero_proc_sdk the installed lab was built against.

Root cause

  • scripts/lab_install.sh:112BRANCH="${BRANCH:-development}" (default = development)
  • scripts/lab_install.sh:114-116development → fetches the edge prerelease binary
  • crates/lab/Cargo.toml:50 — on development, hero_proc_sdk = { branch = "development" } (multi-socket). main now pins branch = "main" (single-socket) via eb9f586.

So edge lab is multi-socket; latest/stable lab (built from main) is single-socket.

Suggested fixes (any of)

  1. Change the installer default from development to the stable/main channel, so a bare lab_install.sh matches a main hero_proc. (Opt into edge explicitly with BRANCH=development.)
  2. Repin hero_proc_sdk to main on the development branch too, or finish the socket migration so main == development and the dialect mismatch disappears.
  3. Make lab fail with an actionable message when its socket dialect doesn't match the running hero_proc, e.g. "this lab expects multi-domain sockets but hero_proc serves rpc.sock — reinstall with BRANCH=main."

Workaround (current)

Install with BRANCH=main lab_install.sh, or build from a main checkout via scripts/lab_build.sh.

## Summary `scripts/lab_install.sh` defaults the install channel to **`development`/`edge`**, which ships a `lab` built against `hero_proc_sdk@development` (multi-domain sockets: `rpc_jobs.sock`, `rpc_logs.sock`, `rpc_secrets.sock`, `rpc_system.sock`). On any host running **`main`** hero_proc (single `rpc.sock`), `lab`'s service lifecycle then fails to connect: ``` cannot connect to hero_proc: failed to connect to hero_proc sockets: socket bind: transport connect: Connection error: socket not found at '.../hero_proc/rpc_jobs.sock' — service not running or socket directory not mounted ``` ## Why this is a footgun Even after fixing everything else (rebuild hero_proc from `main`; rebuild `lab` from `main`, which `eb9f586` already repins to `hero_proc_sdk = { branch = "main" }`), a later bare `lab_install.sh` (no `BRANCH=main`) or a `lab self-update` **silently reinstalls the development/edge multi-socket `lab`** and the `rpc_jobs.sock` error returns. The failure looks like a hero_proc problem, but the real culprit is which `hero_proc_sdk` the installed `lab` was built against. ## Root cause - `scripts/lab_install.sh:112` — `BRANCH="${BRANCH:-development}"` (default = development) - `scripts/lab_install.sh:114-116` — `development` → fetches the `edge` prerelease binary - `crates/lab/Cargo.toml:50` — on `development`, `hero_proc_sdk = { branch = "development" }` (multi-socket). `main` now pins `branch = "main"` (single-socket) via `eb9f586`. So `edge` lab is multi-socket; `latest`/stable lab (built from `main`) is single-socket. ## Suggested fixes (any of) 1. **Change the installer default** from `development` to the stable/`main` channel, so a bare `lab_install.sh` matches a `main` hero_proc. (Opt into edge explicitly with `BRANCH=development`.) 2. **Repin `hero_proc_sdk` to `main` on the `development` branch too**, or finish the socket migration so `main` == `development` and the dialect mismatch disappears. 3. **Make `lab` fail with an actionable message** when its socket dialect doesn't match the running hero_proc, e.g. *"this lab expects multi-domain sockets but hero_proc serves rpc.sock — reinstall with `BRANCH=main`."* ## Workaround (current) Install with `BRANCH=main lab_install.sh`, or build from a `main` checkout via `scripts/lab_build.sh`.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lhumina_code/hero_skills#316
No description provided.