Auto-publish lab-latest binaries from CI on push-to-development #268

Open
opened 2026-05-19 17:23:06 +00:00 by mik-tf · 0 comments
Owner

Summary

Replace the ad-hoc, currently-broken .forgejo/workflows/build-linux.yaml (and siblings) across all hero repos with one canonical workflow that runs lab build --release --platforms linux-musl-x86_64 --upload --workspace. The workflow fires on push-to-development and on v* tag push, populating each repo's latest release on Forgejo. Once in place, every push to development means fresh binaries on Forgejo within ~5–10 min, and any VM (or new herodemo) can be brought up to date with a simple lab build <repo> --download --install loop — no human in the loop, no per-repo manual lab build --upload.

Why

Today the publish flow is fully manual. The lab build --upload / lab build --download --install pair works (validated by populating all ~20 latest releases for herodemo on 2026-05-19), but no automation fires it on push, so every release means a workstation operator running 20 builds by hand. Forgejo Actions infrastructure already exists across the org — every hero_* repo has .forgejo/workflows/build-linux.yaml or similar — but those workflows:

  1. Trigger only on v* tag push, never on development push
  2. Call raw cargo build + curl -X POST upload, not lab build --upload (so naming + tag are inconsistent — ${BIN}-x86_64-unknown-linux-musl vs lab's ${BIN}-linux-musl-x86_64, and upload tag is the version v0.1.5 rather than latest)
  3. Are currently broken — the last 5 push runs for hero_proc all conclude failure. Auto-fetched verdict via GET /api/v1/repos/lhumina_code/hero_proc/actions/runs?limit=5.

Net: we have the CI surface, we have lab, but they aren't wired together.

Proposed canonical workflow

Single file, ~25 lines, lives at .forgejo/workflows/lab-publish.yaml in every hero_* repo:

name: lab publish
on:
  push:
    branches: [development]
    tags: ["v*"]
  workflow_dispatch:

jobs:
  publish:
    runs-on: docker
    container:
      image: ghcr.io/despiegk/builder:latest
    steps:
      - uses: actions/checkout@v4
      - name: Install lab
        run: |
          curl -sSfL https://forge.ourworld.tf/lhumina_code/hero_skills/raw/branch/development/crates/lab/install.sh | bash
          echo "$HOME/.local/bin" >> $GITHUB_PATH
          echo "$HOME/hero/bin" >> $GITHUB_PATH
      - name: Build + upload
        env:
          FORGE_TOKEN: ${{ secrets.FORGEJO_TOKEN }}
          FORGEJO_TOKEN: ${{ secrets.FORGEJO_TOKEN }}
        run: |
          lab build --release --platforms linux-musl-x86_64 --upload --workspace

That's it. lab handles platform, naming (${BIN}-linux-musl-x86_64), the latest tag, md5 sidecars, skip-by-fingerprint, and the policy enforcement that the old workflows didn't.

Roll-out plan

  1. Land the YAML in hero_proc first (already has CI surface, mature). Push to development. Verify a green run lands binaries at https://forge.ourworld.tf/lhumina_code/hero_proc/releases/tag/latest.
  2. Templating script in hero_skills/scripts/ — iterates over the 37 repos with existing CI:
    • Deletes .forgejo/workflows/{build-linux,build,release,build.yml}.yaml (the broken ones)
    • Drops the canonical lab-publish.yaml in
    • git add -A && git commit -m "ci: switch to lab-publish canonical workflow" && git push origin development
  3. First push triggers the new CI on every repo → all latest releases get a green refresh
  4. Document the canonical flow in hero_skills/docs/ and link from lab skill body

Gotcha 1 — install.sh 404

The lab skill currently advertises https://forge.ourworld.tf/lhumina_code/hero_skills/raw/branch/development/crates/lab/install.sh as the install path. That URL returns 404 today (validated 2026-05-19; was the blocker when trying to install lab on the herodemo VM — had to scp the binary from the workstation instead).

Fix: add install.sh at crates/lab/install.sh in hero_skills on development. The content already exists in the wild (was at this path historically); just needs to be restored or rewritten. Minimal version:

#!/usr/bin/env bash
set -euo pipefail
TMPDIR=$(mktemp -d)
cd "$TMPDIR"
forge_clone() { git clone --depth 1 -b development "https://forge.ourworld.tf/lhumina_code/hero_skills" hero_skills_tmp; }
forge_clone
cd hero_skills_tmp/crates/lab
cargo install --path . --root "${HOME}/.local" --locked --force
# symlink into ~/hero/bin if it exists
[ -d "${HOME}/hero/bin" ] && ln -sf "${HOME}/.local/bin/lab" "${HOME}/hero/bin/lab" || true

Without this fix the proposed CI workflow's "Install lab" step also fails — same root cause.

Gotcha 2 — reusable workflows vs templated copies

Forgejo Actions supports uses: org/repo/.forgejo/workflows/file.yaml@branch in some versions ("reusable workflows" feature, GHA-compatible).

If supported on our Forgejo deployment: the canonical workflow lives in one place (hero_skills/.forgejo/workflows/lab-publish-reusable.yaml) and every per-repo workflow shrinks to a 3-line caller:

name: lab publish
on:
  push: { branches: [development], tags: ["v*"] }
jobs:
  publish:
    uses: lhumina_code/hero_skills/.forgejo/workflows/lab-publish-reusable.yaml@development
    secrets: inherit

37 × 3 lines instead of 37 × 25 lines. One update to hero_skills propagates everywhere. No drift.

If not supported: fall back to copying the 25-line YAML into each repo (still way better than today's 37 broken ad-hoc workflows). Document the templated form so updates can be re-applied across the org with a single script run.

Check first: run forgejo --version on the runner host, or curl -s https://forge.ourworld.tf/api/v1/version and cross-reference against the Forgejo changelog for reusable-workflow support (landed in Forgejo 7.x). If yes, prefer the reusable path; if no, templated copies.

Definition of done

  • install.sh lands at crates/lab/install.sh on development branch of hero_skills, verified by curl -sSfL <URL> | head returning the script
  • Forgejo reusable-workflow support confirmed or denied (check GET /api/v1/version)
  • .forgejo/workflows/lab-publish.yaml (or reusable form) lands in hero_proc on development; one push-triggered green run produces fresh musl binaries at releases/tag/latest
  • [~] Templating script rolls it out to the other 36 hero_* repos that currently have CI; one green run on each
  • CLAUDE.md "Release pipeline" section updated to reference the automatic publish flow; manual lab build --upload becomes the bypass case, not the canonical case
  • lab skill body updated with the install.sh URL working

Scope notes

  • This is publish-only. Deploy-to-herodemo remains separate (the lab build --download --install step). A follow-up issue could automate "VM auto-pulls latest" via a cron or webhook receiver, but that's outside this scope.
  • Does not touch hero_zero container-based deploy (different architecture, separate update.sh flow).
  • service.toml validation already happens inside lab build; no separate CI step needed for that.
  • Workspaces with non-canonical bins (e.g., hero_os/_app WASM target) are handled by lab already — those targets just fail individually with 1 failed, 4 built and the workflow still publishes the 4 successful binaries.

s127 close — post-rollout status (2026-05-19)

Workflow file landed in 31 of 35 D-07 set repos (skipped 4 special-case: hero_voice / hero_os / hero_office / mycelium_network — deferred for separate session-level decisions).

Currently green (binaries auto-publishing on every push to development): 17 of 31 (55%).

✓ green ✗ red (pre-existing per-repo bug)
hero_skills, hero_proc, hero_router, hero_db, hero_proxy, hero_osis, hero_livekit, hero_biz, hero_whiteboard, hero_indexer, hero_code_indexer, hero_researcher, hero_webbuilder, hero_website_framework, hero_aibroker, hero_compute, hero_lib hero_code, hero_codescalers, hero_collab, hero_lib_rhai, hero_matrixchat, hero_planner (path-dep × 6); hero_embedder, hero_memory (C++ musl-g++ missing × 2); hero_rpc (stale service.toml refs); hero_agent (cargo feature); hero_wallet (cargo feature); hero_foundry (links= collision); hero_logic (musl portability); hero_slides (merge marker); hero_books (stale herolib_core API)

In-session bonus fix (s127 final hour): lab build post-process < 512 KB "likely corrupt output" guard is now overridable via HERO_MIN_BIN_BYTES env var (default unchanged for local lab; canonical workflow sets 0 to disable). Unblocked 3 repos (hero_aibroker, hero_compute, hero_lib) that were tripping on legitimately-small Rust binaries. Landed in lab via commit 777a7ec.

Follow-up issue: per-repo cleanup catalogue at hero_skills#269 — 14 repos with their failure category + fix surface. Repo owners can pick up their own cleanup independently; the workflow file is already in place, so each per-repo fix lands → next push auto-goes-green.

Files shipped this session

  • crates/lab/install.sh (63e0ef7)
  • .forgejo/workflows/lab-publish.yaml (ad537ae → iterations through 4767f67777a7ec)
  • scripts/wire-lab-publish.sh (7ca5874 + fix b9f7ae0)
  • crates/lab/src/builder/orchestrator.rs env-overridable size guard (777a7ec)
  • skills/lab/lab.md drift fix (d660a96)
## Summary Replace the ad-hoc, currently-broken `.forgejo/workflows/build-linux.yaml` (and siblings) across all hero repos with one canonical workflow that runs `lab build --release --platforms linux-musl-x86_64 --upload --workspace`. The workflow fires on push-to-`development` and on `v*` tag push, populating each repo's `latest` release on Forgejo. Once in place, **every push to development means fresh binaries on Forgejo within ~5–10 min**, and any VM (or new herodemo) can be brought up to date with a simple `lab build <repo> --download --install` loop — no human in the loop, no per-repo manual `lab build --upload`. ## Why Today the publish flow is fully manual. The `lab build --upload` / `lab build --download --install` pair works (validated by populating all ~20 `latest` releases for herodemo on 2026-05-19), but no automation fires it on push, so every release means a workstation operator running 20 builds by hand. Forgejo Actions infrastructure already exists across the org — every hero_* repo has `.forgejo/workflows/build-linux.yaml` or similar — but those workflows: 1. Trigger only on `v*` tag push, never on `development` push 2. Call raw `cargo build` + `curl -X POST` upload, not `lab build --upload` (so naming + tag are inconsistent — `${BIN}-x86_64-unknown-linux-musl` vs `lab`'s `${BIN}-linux-musl-x86_64`, and upload tag is the version `v0.1.5` rather than `latest`) 3. Are **currently broken** — the last 5 push runs for hero_proc all conclude `failure`. Auto-fetched verdict via `GET /api/v1/repos/lhumina_code/hero_proc/actions/runs?limit=5`. Net: we have the CI surface, we have `lab`, but they aren't wired together. ## Proposed canonical workflow Single file, ~25 lines, lives at `.forgejo/workflows/lab-publish.yaml` in every hero_* repo: ```yaml name: lab publish on: push: branches: [development] tags: ["v*"] workflow_dispatch: jobs: publish: runs-on: docker container: image: ghcr.io/despiegk/builder:latest steps: - uses: actions/checkout@v4 - name: Install lab run: | curl -sSfL https://forge.ourworld.tf/lhumina_code/hero_skills/raw/branch/development/crates/lab/install.sh | bash echo "$HOME/.local/bin" >> $GITHUB_PATH echo "$HOME/hero/bin" >> $GITHUB_PATH - name: Build + upload env: FORGE_TOKEN: ${{ secrets.FORGEJO_TOKEN }} FORGEJO_TOKEN: ${{ secrets.FORGEJO_TOKEN }} run: | lab build --release --platforms linux-musl-x86_64 --upload --workspace ``` That's it. `lab` handles platform, naming (`${BIN}-linux-musl-x86_64`), the `latest` tag, md5 sidecars, skip-by-fingerprint, and the policy enforcement that the old workflows didn't. ## Roll-out plan 1. **Land the YAML in `hero_proc` first** (already has CI surface, mature). Push to `development`. Verify a green run lands binaries at `https://forge.ourworld.tf/lhumina_code/hero_proc/releases/tag/latest`. 2. **Templating script** in `hero_skills/scripts/` — iterates over the 37 repos with existing CI: - Deletes `.forgejo/workflows/{build-linux,build,release,build.yml}.yaml` (the broken ones) - Drops the canonical `lab-publish.yaml` in - `git add -A && git commit -m "ci: switch to lab-publish canonical workflow" && git push origin development` 3. **First push triggers the new CI on every repo** → all `latest` releases get a green refresh 4. **Document the canonical flow** in `hero_skills/docs/` and link from `lab` skill body ## Gotcha 1 — `install.sh` 404 The lab skill currently advertises `https://forge.ourworld.tf/lhumina_code/hero_skills/raw/branch/development/crates/lab/install.sh` as the install path. **That URL returns 404 today** (validated 2026-05-19; was the blocker when trying to install `lab` on the herodemo VM — had to scp the binary from the workstation instead). **Fix:** add `install.sh` at `crates/lab/install.sh` in `hero_skills` on `development`. The content already exists in the wild (was at this path historically); just needs to be restored or rewritten. Minimal version: ```bash #!/usr/bin/env bash set -euo pipefail TMPDIR=$(mktemp -d) cd "$TMPDIR" forge_clone() { git clone --depth 1 -b development "https://forge.ourworld.tf/lhumina_code/hero_skills" hero_skills_tmp; } forge_clone cd hero_skills_tmp/crates/lab cargo install --path . --root "${HOME}/.local" --locked --force # symlink into ~/hero/bin if it exists [ -d "${HOME}/hero/bin" ] && ln -sf "${HOME}/.local/bin/lab" "${HOME}/hero/bin/lab" || true ``` Without this fix the proposed CI workflow's "Install lab" step also fails — same root cause. ## Gotcha 2 — reusable workflows vs templated copies Forgejo Actions supports `uses: org/repo/.forgejo/workflows/file.yaml@branch` in some versions ("reusable workflows" feature, GHA-compatible). **If supported on our Forgejo deployment:** the canonical workflow lives in **one** place (`hero_skills/.forgejo/workflows/lab-publish-reusable.yaml`) and every per-repo workflow shrinks to a 3-line caller: ```yaml name: lab publish on: push: { branches: [development], tags: ["v*"] } jobs: publish: uses: lhumina_code/hero_skills/.forgejo/workflows/lab-publish-reusable.yaml@development secrets: inherit ``` 37 × 3 lines instead of 37 × 25 lines. One update to `hero_skills` propagates everywhere. No drift. **If not supported:** fall back to copying the 25-line YAML into each repo (still way better than today's 37 broken ad-hoc workflows). Document the templated form so updates can be re-applied across the org with a single script run. **Check first:** run `forgejo --version` on the runner host, or `curl -s https://forge.ourworld.tf/api/v1/version` and cross-reference against the Forgejo changelog for reusable-workflow support (landed in Forgejo 7.x). If yes, prefer the reusable path; if no, templated copies. ## Definition of done - [x] `install.sh` lands at `crates/lab/install.sh` on `development` branch of `hero_skills`, verified by `curl -sSfL <URL> | head` returning the script - [x] Forgejo reusable-workflow support confirmed or denied (check `GET /api/v1/version`) - [x] `.forgejo/workflows/lab-publish.yaml` (or reusable form) lands in `hero_proc` on `development`; one push-triggered green run produces fresh musl binaries at `releases/tag/latest` - [~] Templating script rolls it out to the other 36 hero_* repos that currently have CI; one green run on each - [x] CLAUDE.md "Release pipeline" section updated to reference the automatic publish flow; manual `lab build --upload` becomes the bypass case, not the canonical case - [x] `lab` skill body updated with the `install.sh` URL working ## Scope notes - This is **publish-only**. Deploy-to-herodemo remains separate (the `lab build --download --install` step). A follow-up issue could automate "VM auto-pulls latest" via a cron or webhook receiver, but that's outside this scope. - Does **not** touch `hero_zero` container-based deploy (different architecture, separate `update.sh` flow). - `service.toml` validation already happens inside `lab build`; no separate CI step needed for that. - Workspaces with non-canonical bins (e.g., `hero_os/_app` WASM target) are handled by `lab` already — those targets just fail individually with `1 failed, 4 built` and the workflow still publishes the 4 successful binaries. --- ## s127 close — post-rollout status (2026-05-19) **Workflow file landed in 31 of 35 D-07 set repos** (skipped 4 special-case: hero_voice / hero_os / hero_office / mycelium_network — deferred for separate session-level decisions). **Currently green** (binaries auto-publishing on every push to `development`): **17 of 31** (55%). | ✓ green | ✗ red (pre-existing per-repo bug) | |---|---| | hero_skills, hero_proc, hero_router, hero_db, hero_proxy, hero_osis, hero_livekit, hero_biz, hero_whiteboard, hero_indexer, hero_code_indexer, hero_researcher, hero_webbuilder, hero_website_framework, hero_aibroker, hero_compute, hero_lib | hero_code, hero_codescalers, hero_collab, hero_lib_rhai, hero_matrixchat, hero_planner (path-dep × 6); hero_embedder, hero_memory (C++ musl-g++ missing × 2); hero_rpc (stale service.toml refs); hero_agent (cargo feature); hero_wallet (cargo feature); hero_foundry (links= collision); hero_logic (musl portability); hero_slides (merge marker); hero_books (stale herolib_core API) | **In-session bonus fix** (s127 final hour): `lab build` post-process `< 512 KB` "likely corrupt output" guard is now overridable via `HERO_MIN_BIN_BYTES` env var (default unchanged for local lab; canonical workflow sets `0` to disable). Unblocked 3 repos (hero_aibroker, hero_compute, hero_lib) that were tripping on legitimately-small Rust binaries. Landed in lab via [commit `777a7ec`](https://forge.ourworld.tf/lhumina_code/hero_skills/commit/777a7ec). **Follow-up issue**: per-repo cleanup catalogue at **[hero_skills#269](https://forge.ourworld.tf/lhumina_code/hero_skills/issues/269)** — 14 repos with their failure category + fix surface. Repo owners can pick up their own cleanup independently; the workflow file is already in place, so each per-repo fix lands → next push auto-goes-green. ### Files shipped this session - `crates/lab/install.sh` ([`63e0ef7`](https://forge.ourworld.tf/lhumina_code/hero_skills/commit/63e0ef7)) - `.forgejo/workflows/lab-publish.yaml` ([`ad537ae`](https://forge.ourworld.tf/lhumina_code/hero_skills/commit/ad537ae) → iterations through `4767f67` → `777a7ec`) - `scripts/wire-lab-publish.sh` ([`7ca5874`](https://forge.ourworld.tf/lhumina_code/hero_skills/commit/7ca5874) + fix [`b9f7ae0`](https://forge.ourworld.tf/lhumina_code/hero_skills/commit/b9f7ae0)) - `crates/lab/src/builder/orchestrator.rs` env-overridable size guard ([`777a7ec`](https://forge.ourworld.tf/lhumina_code/hero_skills/commit/777a7ec)) - `skills/lab/lab.md` drift fix ([`d660a96`](https://forge.ourworld.tf/lhumina_code/hero_skills/commit/d660a96))
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lhumina_code/hero_skills#268
No description provided.