Admin UI: surface progress for Provision and Install (both are multi-minute waits) #18

Open
opened 2026-05-28 05:08:03 +00:00 by mik-tf · 1 comment
Owner

The deployer admin UI has two operator-driven actions that take a long time but currently look indistinguishable from a hung browser:

- Provision: ~55-60s for the substrate write plus deploy_webgateway plus per-tester OAuth app creation.
- Install: ~9 minutes for setup-binaries.sh plus secret pushes plus the hero_proxy restart plus domain.add.

Both currently render as a blank loading tab while the synchronous POST is in flight, with no feedback about which step is running or how long is left. A live walk in May 2026 surfaced this from the operator side: clicking Provision for a fresh tester user (alice123) appeared to hang for over a minute before the success page rendered with the VM details.

Proposal: switch both actions to a kick-off-plus-poll shape. Provision returns 202 immediately with a status URL that the admin page polls (matching the install_hero_stack pattern already in place), and Install does the same. The VMs table on the user-detail page becomes the canonical place to watch progress: the install_state column already transitions `none -> installing -> ready -> error`, and provision can grow a sibling provision_state column with similar transitions (`queued -> deploying_vm -> deploying_webgateway -> creating_oauth_app -> ready -> error`). The admin page renders a friendly label per state plus an elapsed-time indicator, so the operator knows the wait is real work in progress, not a hang.
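As a rough illustration of the "friendly labels plus elapsed time" idea, here is a minimal TypeScript sketch. The state names come from the proposal above; the label wording and the helper names (`PROVISION_LABELS`, `formatElapsed`, `renderStatus`) are hypothetical and not taken from the deployer codebase.

```typescript
// State names as proposed in this issue; everything else is illustrative.
type ProvisionState =
  | "queued"
  | "deploying_vm"
  | "deploying_webgateway"
  | "creating_oauth_app"
  | "ready"
  | "error";

// Hypothetical operator-facing labels, one per state.
const PROVISION_LABELS: Record<ProvisionState, string> = {
  queued: "Queued",
  deploying_vm: "Deploying VM",
  deploying_webgateway: "Deploying web gateway",
  creating_oauth_app: "Creating OAuth app",
  ready: "Ready",
  error: "Failed",
};

// A terminal state means no further work is in progress.
function isTerminal(state: ProvisionState): boolean {
  return state === "ready" || state === "error";
}

// Format the time since kick-off as "Xm YYs"; negative spans clamp to zero.
function formatElapsed(startedAtMs: number, nowMs: number): string {
  const totalSeconds = Math.max(0, Math.floor((nowMs - startedAtMs) / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  const paddedSeconds = seconds < 10 ? "0" + seconds : String(seconds);
  return `${minutes}m ${paddedSeconds}s`;
}

// Text for the status cell: label only when finished, label plus elapsed time otherwise.
function renderStatus(state: ProvisionState, startedAtMs: number, nowMs: number): string {
  const label = PROVISION_LABELS[state];
  return isTerminal(state)
    ? label
    : `${label} (${formatElapsed(startedAtMs, nowMs)} elapsed)`;
}
```

Keeping the label table keyed by the state type means a state added later without a label becomes a compile error rather than a blank cell.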

Author
Owner


Operator follow-up from the live walk: clicking Install kicks off install_hero_stack correctly, but the install_state badge in the VMs table does not refresh automatically. The operator only sees the badge transition from `installing` to `ready` after a manual page reload.

So the kick-off-plus-poll story this issue captures for Provision applies to Install too, in two parts: (a) the existing install_state column is already correct (the state machine works), but (b) the UI itself needs a periodic background refresh of that column so the operator does not have to remember to reload.

Concretely: a small polling JS snippet that hits a JSON endpoint every 5 seconds and re-renders the badge plus the cockpit-URL cell would close the loop. The same primitive (poll-and-rerender) is needed for the new provision_state column proposed above, so it is worth bundling both into a single UI change instead of two passes.
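The poll-and-rerender primitive could look roughly like the TypeScript sketch below. It is only a sketch under assumptions: the JSON shape (`install_state`, `cockpit_url`), the endpoint path in the usage comment, and the function names are hypothetical, since this issue does not specify them. The 5-second interval and the state names are taken from the text above.

```typescript
type InstallState = "none" | "installing" | "ready" | "error";

// Assumed response shape of the (not yet existing) JSON status endpoint.
interface VmStatus {
  install_state: InstallState;
  cockpit_url: string | null; // assumed to be null until the install finishes
}

const POLL_INTERVAL_MS = 5000; // "every 5 seconds", per the comment above

// Keep polling only while work is in flight.
function shouldKeepPolling(status: VmStatus): boolean {
  return status.install_state === "installing";
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Fetch, re-render, and repeat until the VM leaves the "installing" state.
// fetchStatus and render are injected so the loop itself stays DOM-free.
async function pollVm(
  fetchStatus: () => Promise<VmStatus>,
  render: (status: VmStatus) => void,
  intervalMs: number = POLL_INTERVAL_MS,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<VmStatus> {
  for (;;) {
    const status = await fetchStatus();
    render(status);
    if (!shouldKeepPolling(status)) return status;
    await wait(intervalMs);
  }
}

// Possible browser usage (the URL is a placeholder, not a real route):
//   pollVm(
//     () => fetch("/admin/vms/123/status.json").then((r) => r.json()),
//     (s) => { /* update the badge and the cockpit-URL cell */ },
//   );
```

In this sketch a failed fetch rejects the returned promise and ends the loop; a real implementation would probably want to retry or show a "status unavailable" badge instead. The same loop would serve provision_state by widening `shouldKeepPolling` to cover its in-progress states.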
Reference
lhumina_code/hero_os_tfgrid_deployer#18