[--from-ci] service_X start commands purge binaries and re-cargo-build, defeating --from-ci installs #64
Summary
`service_X start` (and `start --reset`) currently always invokes the cargo install path internally, purging existing binaries and rebuilding from source. This wipes out binaries installed via `service_X install --from-ci` and immediately fails on hosts without the source repos / `ROOTDIR` configured (every `--from-ci`-only deploy host).
Reproduction
On heroci.gent01.grid.tf (a fresh TFGrid VM with no Hero source repos, only nu + hero_skills):
The same shape applies to the `start` function in every `service_X.nu`: each calls `svc_drop_registration → svc_wait_processes_gone → svc_purge_binaries → install → register → start`. The `install` step inside `start` is the cargo path; there is no branch for `--from-ci`.
Why this matters
This blocks the natural goal of hero_demo#54: stand up a full Hero stack on a CI-paved VM with no source builds. We can `--from-ci` install all services (already done for 7 in PR-195/196/197 etc.), but the moment we want to actually run them under hero_proc supervision, we're back to needing source + cargo.
Two ways to fix
Option A: extend `--from-ci` to `start` commands (mechanical, mirrors install pattern)
Each `service_X.nu` `start` function gets `--from-ci` and `--version` flags, identical in shape to install. Internally, `start` would:
- Skip `svc_purge_binaries` when `--from-ci` is set and binaries are already in place from a CI install; if `--reset` is also set, purge and re-fetch.
- Switch the `install` call between `svc_install_from_ci` and `svc_install` based on `$from_ci`.
- Treat `~/hero/bin/<name>` as byte-equivalent regardless of how it got there.
Per-service patch is ~5 lines. Mirrors PRs #195 / #196 / #197.
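The decision logic proposed for Option A can be modelled in a few lines. This is a minimal sketch in Python, used here only as a neutral notation and not as the actual Nushell module code. `StartOptions`, `plan_start`, and `binary_present` are hypothetical names introduced for illustration; the step names are the helpers listed in this issue. The issue does not say whether the install step is also skipped when the purge is skipped, so the sketch assumes it always runs and only switches between the two install helpers.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StartOptions:
    """Hypothetical stand-in for the flags and host state `start` would see."""
    from_ci: bool = False          # proposed --from-ci flag
    reset: bool = False            # existing --reset flag
    binary_present: bool = False   # assumed check: does ~/hero/bin/<name> exist?


def plan_start(opts: StartOptions) -> list[str]:
    """Return the ordered helper steps `start` would run under Option A."""
    steps = ["svc_drop_registration", "svc_wait_processes_gone"]

    # Keep the binary only for a CI install that is already in place
    # and when no --reset was requested.
    keep_binary = opts.from_ci and opts.binary_present and not opts.reset
    if not keep_binary:
        steps.append("svc_purge_binaries")

    # Assumption: install always runs; only the helper differs.
    steps.append("svc_install_from_ci" if opts.from_ci else "svc_install")

    steps += ["register", "start"]
    return steps


if __name__ == "__main__":
    # Source-build path: same sequence as today.
    print(plan_start(StartOptions()))
    # CI-installed host: the purge is skipped.
    print(plan_start(StartOptions(from_ci=True, binary_present=True)))
```

With no flags set, the plan matches the current sequence, so source-build workflows would be unaffected under these assumptions.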
Pros: boring, mechanical, fully consistent with the merged `--from-ci` install pattern. No new abstractions.
Cons: ~10 service modules need the same patch (one PR per logical batch). Adds another flag to `start` signatures. `start` becomes coupled to "how was it installed", which is a smell.
Option B: separate `register` / `up` command that doesn't reinstall
Split `start` into two pieces:
- `service_X register`: ensures the binary is present, registers actions + service with hero_proc, starts. No install / purge. If the binary is missing, errors out with a clear message naming `install` / `install --from-ci` as the prerequisite.
- `service_X start` (existing): keeps the convenience "ensure installed + register + start" behaviour for source-build dev workflows.
`register` is the one used for CI-deploy paths.
Pros: clean separation. `register` is independent of install method. Forces explicit "install before register" on deploy paths (which matches industry-standard practice: `apt install foo` then `systemctl start foo`).
Cons: new command name to learn. Two paths to maintain. Documentation cost across all `service_*` modules.
Recommendation
Option A for the immediate `--from-ci` rollout: fastest path, mirrors merged pattern, keeps the invariant that `start` is the one-shot "make this service running" command regardless of install method.
Option B is a bigger architectural cleanup that's worth its own design pass once the install side is fully rolled out across all easy-tier services. The two are not mutually exclusive: A unblocks deploys today, B refines the contract once the smoke pass is settled.
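For comparison, the `register` contract described under Option B can be sketched the same way. Again this is Python as a neutral model, not the Nushell implementation. `register_plan`, `BinaryMissingError`, and the step labels are hypothetical; the only behaviour taken from the issue is "never install or purge, and fail with a message naming `install` / `install --from-ci` when the binary is missing".

```python
from __future__ import annotations

from pathlib import Path


class BinaryMissingError(RuntimeError):
    """Raised when `register` is called before any install has happened."""


def register_plan(name: str, bin_dir: Path) -> list[str]:
    """Return the steps `register` would run; never installs or purges."""
    binary = bin_dir / name
    if not binary.is_file():
        # Name both install routes as the prerequisite, as Option B asks.
        raise BinaryMissingError(
            f"{binary} not found: run `{name} install` or "
            f"`{name} install --from-ci` first"
        )
    # Hypothetical labels for "registers actions + service with hero_proc, starts".
    return ["register_actions", "register_service", "start"]


if __name__ == "__main__":
    try:
        register_plan("service_x", Path.home() / "hero" / "bin")
    except BinaryMissingError as err:
        print(err)
```

The point of the split is visible in the signature: `register_plan` takes no install-related flag at all, so it cannot become coupled to how the binary was installed.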
Workaround for now
On a `--from-ci` host:
1. Run `service_X install --from-ci --root` for every needed service.
2. Use `proc service ...` / `proc action ...` RPC calls directly, bypassing `service_X start`'s reinstall step.
This is workable for one-off validation but not for production deploys.
Out of scope
- Adding `--from-ci` to `service_install_all` (Phase 3 of hero_demo#54). Same fix shape, but lifted to the rollup.
- Services (`service_voice`, `service_embedder`) where the ONNX bundling decision is still pending.
Found while running the install-side smoke pass for 7 services on heroci. Install path for all 7 is verified working; full lifecycle (`start` under hero_proc supervision) is blocked by this.
Signed-off-by: mik-tf