hero_assistance --start does not spawn _admin or _ui child processes #16

Closed
opened 2026-05-23 03:33:10 +00:00 by mik-tf · 2 comments
Owner

Running `hero_assistance --start` from the CLI only spawns `hero_assistance_server`. The expected behavior, matching what `lab service hero_assistance --start` does via hero_proc, is to also spawn `hero_assistance_ui` and `hero_assistance_admin` as supervised children so the full demo surface is up. The dispatch logic lives in `crates/hero_assistance/src/main.rs` around the `Start` subcommand. The fix is to add the `_admin` and `_ui` actions to the start sequence alongside `_server`, mirroring how hero_proc's service definition does it.

Author
Owner

Reproduced today against a fresh local install. With `PATH_ROOT` set, `hero_assistance --start` registers a single supervised service named `hero_assistance` but only spawns `hero_assistance_server`; the `_admin` and `_ui` daemons never start, so the customer UI at `/hero_assistance/app/` and the operator admin at `/hero_assistance/admin/` are both unreachable until they are started by hand.

The call site is `crates/hero_assistance/src/main.rs:333` (`self_start` calls `hp.restart_service(SERVICE_NAME, service, 30)`). `build_service_definition` correctly returns a `ServiceBuildResult` with three `actions` (server, ui, admin), and the `phase24c_*` unit tests at lines 1100 to 1180 pin those registrations, so the in-code action registration is intact; the runtime gap is that `restart_service` is only spawning the primary action.

Two viable fix paths inside this repo: (a) iterate the three action names after `restart_service` and call `hp.action_start` for `hero_assistance_ui` and `hero_assistance_admin`; (b) re-shape the service definition so the auto-start contract on hero_proc's side picks up all three actions, and update the `phase24c_*` test gates accordingly.

The per-component workaround (`lab service hero_assistance_server --install --start` plus the same for `_admin` and `_ui`) registers all three independently and passes smoke, so this issue blocks only the single-command operator flow, not the underlying stack.

Recommended fix-owner: this repo (lhumina_code/hero_assistance).
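
For illustration only, fix path (a) could look roughly like the sketch below. The `Supervisor` trait and the `Recorder` fake are hypothetical stand-ins for the hero_proc client (`hp`); the real `restart_service` and `action_start` signatures are not shown in this thread and may differ, and the service-definition argument is omitted here.

```rust
/// Hypothetical stand-in for the hero_proc client. The method names come from
/// the comment above; the signatures are assumptions made for this sketch.
trait Supervisor {
    fn restart_service(&mut self, name: &str, timeout_secs: u64) -> Result<(), String>;
    fn action_start(&mut self, action: &str) -> Result<(), String>;
}

const SERVICE_NAME: &str = "hero_assistance";

/// Actions that would need an explicit start after the primary one.
const SECONDARY_ACTIONS: &[&str] = &["hero_assistance_ui", "hero_assistance_admin"];

/// Restart the service, then start each secondary action in order.
/// Stops at the first error so a partial start is reported to the caller.
fn self_start(hp: &mut dyn Supervisor) -> Result<(), String> {
    hp.restart_service(SERVICE_NAME, 30)?;
    for action in SECONDARY_ACTIONS {
        hp.action_start(action)?;
    }
    Ok(())
}

/// Fake supervisor that records the calls it receives, in order.
#[derive(Default)]
struct Recorder {
    calls: Vec<String>,
}

impl Supervisor for Recorder {
    fn restart_service(&mut self, name: &str, _timeout_secs: u64) -> Result<(), String> {
        self.calls.push(format!("restart:{name}"));
        Ok(())
    }

    fn action_start(&mut self, action: &str) -> Result<(), String> {
        self.calls.push(format!("start:{action}"));
        Ok(())
    }
}

fn main() {
    let mut hp = Recorder::default();
    self_start(&mut hp).expect("start sequence failed");
    for call in &hp.calls {
        println!("{call}");
    }
}
```

Putting the calls behind a trait keeps the start sequence testable without a running hero_proc instance.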

Author
Owner

Fixed in #20 (https://forge.ourworld.tf/lhumina_code/hero_assistance/pulls/20), squash-merged as `49ea76a7` on development.

Root cause was the Lesson #19 trap: `build_service_definition` didn't thread `PATH_ROOT` and `HERO_SOCKET_DIR` into the action specs for the spawned daemons. `hero_assistance_server` happened to survive without `PATH_ROOT` because its startup path doesn't call `herolib_core::base::prepare_sockets`; the other two daemons do, and `resolve_socket_dir` falls through to `path_var`, which calls `path_root().expect(...)` when `HERO_SOCKET_DIR` is unset in the child env. hero_proc spawns children with whatever env `hero_proc_server` itself has, so the operator's shell `PATH_ROOT` never reached the daemons.

Fix is the same `forward_env_if_set` plus `FORWARDED_ENV` pattern already in hero_os, hero_indexer, hero_matrixchat, hero_wallet, hero_runner_rhai, hero_code_indexer, and hero_onboarding. New `phase24c_build_service_definition_forwards_lesson19_env_to_all_actions` unit test pins the contract.
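
A minimal sketch of what such an env-forwarding pattern might look like, not the code merged in #20. `ActionSpec` and `build_actions` are hypothetical stand-ins for the real hero_proc types, and the body of `forward_env_if_set` is an assumption; only the two names `forward_env_if_set` and `FORWARDED_ENV` and the two variable names come from this thread. The lookup is passed in as a closure so the behavior can be checked without touching the process environment.

```rust
use std::collections::BTreeMap;

/// Variables the spawned daemons need; both names are taken from the issue.
const FORWARDED_ENV: &[&str] = &["PATH_ROOT", "HERO_SOCKET_DIR"];

/// Hypothetical stand-in for a hero_proc action spec (the real type may differ).
#[derive(Debug)]
struct ActionSpec {
    name: String,
    env: BTreeMap<String, String>,
}

/// Copy `key` into the action's env only when the lookup yields a non-empty value.
/// Unset variables are skipped rather than forwarded as empty strings.
fn forward_env_if_set(
    action: &mut ActionSpec,
    key: &str,
    lookup: &dyn Fn(&str) -> Option<String>,
) {
    if let Some(value) = lookup(key) {
        if !value.is_empty() {
            action.env.insert(key.to_string(), value);
        }
    }
}

/// Build the three actions and forward every FORWARDED_ENV entry to each of them.
fn build_actions(lookup: &dyn Fn(&str) -> Option<String>) -> Vec<ActionSpec> {
    ["hero_assistance_server", "hero_assistance_ui", "hero_assistance_admin"]
        .iter()
        .map(|name| {
            let mut action = ActionSpec {
                name: name.to_string(),
                env: BTreeMap::new(),
            };
            for key in FORWARDED_ENV {
                forward_env_if_set(&mut action, key, lookup);
            }
            action
        })
        .collect()
}

fn main() {
    // In real use the lookup would read the operator's shell environment.
    let actions = build_actions(&|key: &str| std::env::var(key).ok());
    for action in &actions {
        println!("{} -> {:?}", action.name, action.env);
    }
}
```

The point of the pattern is that the env is captured where the operator runs the CLI and written into each action spec, so it no longer matters which env `hero_proc_server` itself was started with.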

Verified live on this host: `hero_assistance --start` brought up all three daemons (server PID 3230295, ui PID 3232160, admin PID 3231628) with `PATH_ROOT=/home/pctwo/hero` propagated into each child env (confirmed via `/proc/<pid>/environ`). curl returned 200 on the customer SPA root via `app.sock`, on the admin SPA root via `admin.sock`, and on `/health` via `rpc.sock`.
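
The `/proc/<pid>/environ` check can also be scripted. The sketch below is one possible way to do it on Linux, where that file holds NUL-separated `KEY=VALUE` records; `parse_environ` is a hypothetical helper written for this example, not part of the repo.

```rust
use std::collections::HashMap;

/// Parse the NUL-separated KEY=VALUE records found in /proc/<pid>/environ.
/// Records without '=' are ignored; invalid UTF-8 is replaced lossily.
fn parse_environ(raw: &[u8]) -> HashMap<String, String> {
    raw.split(|byte| *byte == 0)
        .filter(|record| !record.is_empty())
        .filter_map(|record| {
            let text = String::from_utf8_lossy(record);
            let (key, value) = text.split_once('=')?;
            Some((key.to_string(), value.to_string()))
        })
        .collect()
}

fn main() -> std::io::Result<()> {
    // Usage: pass a PID as the first argument; defaults to the current process.
    let pid = std::env::args().nth(1).unwrap_or_else(|| "self".to_string());
    let raw = std::fs::read(format!("/proc/{pid}/environ"))?;
    let env = parse_environ(&raw);
    match env.get("PATH_ROOT") {
        Some(value) => println!("PATH_ROOT={value}"),
        None => println!("PATH_ROOT is not set for {pid}"),
    }
    Ok(())
}
```

Reading another user's process environment usually requires matching ownership or elevated privileges, so this would be run as the user that owns the daemons.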

Reference
lhumina_code/hero_assistance#16