lab-publish.yaml: CI exits success but no latest release created on 3 repos #277

Closed
opened 2026-05-20 12:44:00 +00:00 by mik-tf · 2 comments
Owner

Three repos (hero_rpc, hero_codescalers, hero_wallet) ran lab-publish.yaml to CI status success in the last 24h but did not produce a latest release on Forgejo. The .../releases/tags/latest endpoint returns "The target couldn't be found." for all three; GET /releases lists only old, unrelated tags.

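This is independent of [hero_skills#270](https://forge.ourworld.tf/lhumina_code/hero_skills/issues/270), which is now fixed at [de34fe0](https://forge.ourworld.tf/lhumina_code/hero_skills/commit/de34fe0): the very next hero_rpc dispatch after the patch returned publish | success, but still without a latest release.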

Confirmed instances

| Repo | Last publish task | CI conclusion | releases/tags/latest |
|---|---|---|---|
| hero_rpc | 2026-05-20T12:42Z (workflow_dispatch, post-#270 patch) | success | absent |
| hero_codescalers | 2026-05-19T22:39Z (push: fix(ci): flip hero_admin_lib path→git) | success | absent |
| hero_wallet | 2026-05-19T22:37Z (push: fix(ci): drop removed askama with-axum feature) | success | absent |

For comparison, the 9 other s128 in-flight pushes that landed publish | success all DO have musl-x86_64 assets attached to latest:

| Repo | musl-x86_64 binaries |
|---|---|
| hero_agent | 3 |
| hero_books | 3 |
| hero_collab | 3 |
| hero_foundry | 4 |
| hero_lib_rhai | 2 |
| hero_logic | 3 |
| hero_matrixchat | 3 |
| hero_planner | 3 |
| hero_slides | 4 |

Hypotheses to investigate

  1. lab build exits 0 on failed builds (already known per CLAUDE.md "Tooling concerns"). The publish run may report success even when no binary makes it through, whether because of the MIN_BIN_BYTES check, a cargo build failure, or something else.
  2. --upload path silently skips when token scope lacks write:repository on those three repos' Actions secrets. (Token is per-repo.)
  3. Race / quota / forgejo upload-API hiccup unique to those repos.
  4. Different per-repo gating — some metadata in hero_rpc / hero_codescalers / hero_wallet that lab's upload logic treats specially.

Repro

gh api -X POST /repos/lhumina_code/hero_rpc/actions/workflows/lab-publish.yaml/dispatches -f ref=development
sleep 720
curl -s https://forge.ourworld.tf/api/v1/repos/lhumina_code/hero_rpc/releases/tags/latest
# → {"message":"The target couldn't be found."}

Fix surface (best guess)

Add an explicit asset-count verification at the end of the publish job: exit non-zero if fewer than <expected_count> musl-x86_64 binaries are present on latest after the run. This would surface the silent-skip class definitively.
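A minimal sketch of what such an assertion could look like, assuming the Forgejo release JSON lists each asset with a "name" field (as in the Gitea-compatible API). The function names count_musl_assets and assert_asset_count are hypothetical and not part of lab or the workflow; grep is used only to avoid extra dependencies, and a real JSON parser such as jq would be more robust.

```bash
# Sketch only: count release assets whose name contains "musl-x86_64".
# Reads a release JSON document on stdin.
count_musl_assets() {
  # grep -o emits one line per matching "name" field; wc -l counts those lines.
  { grep -o '"name": *"[^"]*musl-x86_64[^"]*"' || true; } | wc -l | tr -d ' '
}

# Fail unless at least $1 musl-x86_64 assets are present (release JSON on stdin).
assert_asset_count() {
  local expected="$1" actual
  actual="$(count_musl_assets)"
  if [ "$actual" -lt "$expected" ]; then
    echo "publish check failed: ${actual}/${expected} musl-x86_64 assets on latest" >&2
    return 1
  fi
  echo "publish check ok: ${actual} musl-x86_64 assets on latest"
}
```

As a final workflow step this might be invoked as `curl -sf https://forge.ourworld.tf/api/v1/repos/lhumina_code/hero_rpc/releases/tags/latest | assert_asset_count 6`. With `-f`, a 404 yields empty input, so the count is 0 and the step fails. The expected count per repo would have to come from somewhere the thread does not specify.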

Filed during s129 right after closing #270. Three repos are the only members of the s128 9-repo unblock set that didn't go from green-CI to green-release.

Author
Owner

Update — likely false alarm.

Re-dispatched the same lab-publish.yaml on hero_wallet (no code change, just workflow_dispatch) and it produced 3 musl-x86_64 binaries on latest:

  • hero_wallet-linux-musl-x86_64
  • hero_wallet_admin-linux-musl-x86_64
  • hero_wallet_server-linux-musl-x86_64

So the original publish | success runs likely hit a transient condition (token-scope timing, forgejo upload-API hiccup, runner state) rather than a structural lab/workflow bug.

Re-dispatching hero_codescalers + hero_rpc now to confirm the same pattern. Will close this issue if both come back with assets on latest.

Fix-surface change: if both re-fires produce assets, the only durable improvement worth landing is the explicit asset-count assertion at the end of the publish job so a future transient silently-failed upload surfaces as a CI red instead of a green-with-empty-release.

Author
Owner

Resolved as transient. Re-dispatching the three repos (hero_wallet, hero_codescalers, hero_rpc) without any code change produced the expected latest releases:

  • hero_wallet → 3 musl-x86_64 binaries (cli + server + admin)
  • hero_codescalers → 3 musl-x86_64 binaries
  • hero_rpc → 6 musl-x86_64 binaries (including the two nested-workspace bins unblocked by #270)

The root cause was that the per-repo FORGEJO_TOKEN secret had not yet been populated when the original CI runs fired (user confirmed mid-session). The workflow itself behaves correctly once the token is configured.
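Given that root cause, a preflight guard is one way to turn a missing secret into a red run before any build or upload starts. This is a sketch under the assumption that the secret reaches the job as an environment variable named FORGEJO_TOKEN; the thread does not show how lab actually reads it, and require_forgejo_token is a hypothetical helper name.

```bash
# Sketch only: fail fast when the upload token is missing or empty.
require_forgejo_token() {
  if [ -z "${FORGEJO_TOKEN:-}" ]; then
    echo "FORGEJO_TOKEN is empty or unset; refusing to publish" >&2
    return 1
  fi
}
```

Calling `require_forgejo_token || exit 1` as the first step of the publish job would cover the unset-secret case only. It does not validate the token's scope, so the asset-count check remains useful as a second line of defence.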

The proposed asset-count assertion is still worthwhile hardening (it surfaces future transient token or Forgejo API issues as CI red instead of CI green with an empty release), but it is quality-of-life polish, not a fix for a structural bug. I will file it as a separate optional improvement in case anyone wants to land it, and am closing this issue because the immediate symptom is moot.

Also noted during s129's post-dispatch sweep: separate from the transient symptom, three repos ship fewer binaries than cargo metadata expects despite lab-publish.yaml running — hero_memory (1/5, ONNX-related), hero_lib_rhai (2/5), hero_office (2/3). All three show publish | failure in CI status (NOT the silent-success symptom this issue catalogues). Those are the strict-coverage gap and will be tracked in the s130 head.
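The shipped/expected ratios above (for example 1/5) could be computed along these lines. This sketch assumes that "what cargo metadata expects" means the number of bin targets across workspace members, which the thread does not spell out. The helper names are hypothetical, and grep again stands in for a proper JSON parser.

```bash
# Sketch only: count bin targets in `cargo metadata --no-deps --format-version 1`
# output read from stdin. Assumes each bin target is serialized as "kind":["bin"].
count_bin_targets() {
  { grep -o '"kind": *\[ *"bin" *\]' || true; } | wc -l | tr -d ' '
}

# Print "shipped/expected" and return non-zero when fewer binaries shipped than expected.
report_coverage() {
  local shipped="$1" expected="$2"
  echo "${shipped}/${expected}"
  [ "$shipped" -ge "$expected" ]
}
```

For example, `expected="$(cargo metadata --no-deps --format-version 1 | count_bin_targets)"` followed by `report_coverage "$shipped" "$expected"`, where `$shipped` is the number of assets actually attached to latest.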


Reference
lhumina_code/hero_skills#277