fix(service_router): use svc_mycelium_address helper, not raw /7 scan #121

Merged
sameh-farouk merged 2 commits from fix/service-router-mycelium-address into development 2026-04-22 19:59:22 +00:00

Summary

service_router start's auto-detector (svx_mycelium_node_addr) scans ip -6 addr and filters for /7 scope global addresses to pick a mycelium IP. On multi-user dev boxes this only ever matches the host's canonical mycelium node ID — each user's per-bridge address is a /64 subnet under that /7 prefix and never matches the filter.

Result on a shared dev box: every user who runs service_router start binds their UI to the host IP instead of their own multi_user_add-provisioned bridge IP.
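
The mismatch can be reproduced on two sample lines. This is a minimal sketch that assumes the `ip -6 addr` output looks roughly like the strings below (addresses taken from the Testing section). The actual filter inside `svx_mycelium_node_addr` may be written differently.

```nu
# Hypothetical sample lines; the real `ip -6 addr` output has more fields.
let sample = [
    "    inet6 543:66c5:6430:8f31:5293:1ad9:694a:70f3/7 scope global"
    "    inet6 543:66c5:6430:8f31:1::1/64 scope global"
]

# Legacy-style filter: keep only lines that contain "/7 scope global".
let matches = ($sample | where $it =~ '/7 scope global')

# Only the host line survives; the per-user /64 bridge line is dropped.
print $matches
```

Because the filter keys on the literal prefix length, any address announced as `/64` is invisible to it, regardless of which `/7` it falls under.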

End-user impact

A browser from the user's laptop connects to the dev box over mycelium using the user's bridge IP (the one documented in ~/hero/cfg/hero_cfg.toml). But the router is actually bound to the host IP. If the user has set ADMIN_SECRETS in hero_router's admin UI to their own bridge IP, the router's serve_tcp_with_admin_list accept loop silently TCP-RSTs requests. Symptom: ERR_CONNECTION_RESET in the browser, or a WebSocket reconnect loop that looks like a hero_router code regression but is actually an IP mismatch at the hero_skills module layer.

This class of failure ate most of a debugging day during PR #115's validation on a shared dev box.

Fix — two commits

1. fix(service_router): use svc_mycelium_address helper, not raw /7 scan

Prefer the ipv6_address field from ~/hero/cfg/hero_cfg.toml (written by multi_user_add at provisioning time) via the shared svc_mycelium_address helper in lib.nu:546. That helper already existed but had no consumers. service_livekit and service_collab effectively use the same source via $env.MYCELIUM_IP (populated by hero_loader from the same hero_cfg.toml), so this aligns hero_router with the established pattern.

The legacy raw /7 scan is preserved as a fallback for hosts without hero_cfg.toml (single-user installs, pre-provisioning bootstrap). Strict expansion — nothing that worked before stops working.
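
The resolution order described above can be sketched as follows. `pick_bind_address` is a hypothetical name used only for illustration and is not the code in `service_router.nu`. It assumes the helper returns an empty string when no address is configured, which is what the commit message reports.

```nu
# Hypothetical sketch: prefer the configured address, else use the scan result.
# `configured` stands in for the value returned by svc_mycelium_address;
# `scan_fallback` stands in for the result of the legacy /7 scan.
def pick_bind_address [configured: string, scan_fallback: string] {
    if ($configured | is-empty) { $scan_fallback } else { $configured }
}

# Provisioned multi-user box: the configured bridge IP is chosen.
print (pick_bind_address "543:66c5:6430:8f31:1::1" "543:66c5:6430:8f31:5293:1ad9:694a:70f3")

# Host without hero_cfg.toml: the scan result is chosen, as before.
print (pick_bind_address "" "543:66c5:6430:8f31:5293:1ad9:694a:70f3")
```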

2. fix(lib): svc_mycelium_config silently returned {} due to invalid has

While testing the first commit on the dev box, the router still auto-detected the host IP. Turns out svc_mycelium_config had been silently broken since it was written:

# BEFORE (broken):
if ($content | type) == "record" and ($content | has mycelium) {
    $content.mycelium
} else { {} }

| has and | type are not valid nushell commands — both raise parse errors. Both were inside an outer try/catch that swallowed the error and made the helper return {} for every caller. That's why the helper appeared unused — it was unusable.
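
The masking effect of the outer try/catch can be illustrated with a simulated failure. This is a sketch only: `fragile_lookup` is a hypothetical stand-in, and `error make` is used here to force an error rather than reproducing the original invalid commands.

```nu
# Hypothetical stand-in for a helper whose body always fails.
def fragile_lookup [] {
    try {
        error make { msg: "simulated failure inside the helper" }
    } catch {
        {}   # the error is dropped; callers only ever see an empty record
    }
}

# The caller cannot tell "key missing" apart from "helper is broken".
print (fragile_lookup | is-empty)
```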

The service_router change in commit 1 was the first attempt to actually read from it, so the bug surfaced immediately:

$ /home/sameh/hero/bin/nu -l -c 'print (svc_mycelium_address false)'
  # expected: 543:66c5:6430:8f31:1::1
  # actual:   (empty — fell through to /7 scan, picked host IP)

Fixed by replacing with the idiomatic nu form:

# AFTER:
$content | get -o mycelium | default {}

get -o <key> returns null for missing keys instead of throwing, and | default {} keeps the contract intact. open on a .toml file always returns a record, so no separate type-check is needed.
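
A small sketch of that behaviour on in-memory records, assuming a Nushell version that supports `get -o` as used in the fix. The record shapes are illustrative and only mirror the `mycelium` / `ipv6_address` keys named in this PR.

```nu
# Illustrative records; the real data comes from `open` on hero_cfg.toml.
let with_key = { mycelium: { ipv6_address: "543:66c5:6430:8f31:1::1" } }
let without_key = { other: 1 }

# Key present: the nested record is returned and the address can be read.
let found = ($with_key | get -o mycelium | default {} | get -o ipv6_address | default "")

# Key missing: get -o yields null, default {} turns it into an empty record,
# and the final default "" gives callers an empty string instead of an error.
let missing = ($without_key | get -o mycelium | default {} | get -o ipv6_address | default "")

print $found
print ($missing | is-empty)
```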

Testing

  • Parse check on both service_router.nu and lib.nu — clean.
  • Standalone helper test with a fixture hero_cfg.toml: svc_mycelium_address false now returns 543:66c5:6430:8f31:1::1 as expected.
  • Full integration on the affected dev box: after these two commits, service_router start auto-binds to [543:66c5:6430:8f31:1::1]:9988 (the user's bridge IP from hero_cfg.toml) instead of [543:66c5:6430:8f31:5293:1ad9:694a:70f3]:9988 (host IP, previous buggy behavior).

Scope

Two files, two small commits. No API changes. Every existing caller of svc_mycelium_config / svc_mycelium_address (currently only service_router after this PR) now receives the correct value. Zero effect on single-user hosts without hero_cfg.toml — falls back to the legacy scan.

fix(service_router): use svc_mycelium_address helper, not raw /7 scan

The raw `ip -6 addr` scan in svx_mycelium_node_addr filters for
`/7 scope global` addresses, which only ever matches the *host's*
canonical mycelium node ID. Per-user bridge addresses provisioned by
`multi_user_add` are /64 subnets under that /7 prefix and never match
the filter — so on a multi-user dev box, every user's `service_router
start` would auto-detect the host address and bind there instead of
their own bridge IP.

End-user impact: a browser connecting to the dev box via mycelium hits
an IP that doesn't match the hero_router ADMIN_SECRETS allowlist entry
the user set for their own bridge → silent TCP RST from
`serve_tcp_with_admin_list` → "WebSocket reconnect loop" that looks
like a router code bug but is actually an IP mismatch.

Fix: prefer the `ipv6_address` field from `~/hero/cfg/hero_cfg.toml`
(populated at provisioning time by multi_user_add) via the shared
`svc_mycelium_address` helper in lib.nu — which already existed but
had no consumers. This aligns hero_router with how service_livekit +
service_collab already resolve the mycelium address (both read
`$env.MYCELIUM_IP`, which hero_loader populates from the same
hero_cfg.toml source).

The legacy /7 scan is kept as a fallback for hosts without
hero_cfg.toml (single-user installs, pre-provisioning bootstrap), so
this is a strict expansion — nothing that worked before stops working.

fix(lib): svc_mycelium_config silently returned {} due to invalid has

`($content | has mycelium)` and `($content | type)` are not valid
nushell commands — they raise parse errors. Both were inside a
try/catch block, which swallowed the error and made
svc_mycelium_config return {} for every caller. svc_mycelium_address
(the only consumer) therefore always returned "".

This went unnoticed because svc_mycelium_address had no consumers in
the tree until PR #121, which wires service_router up to use it. The
router PR then appeared to fall back to the legacy /7 scan in all
cases, masking the fact that the helper it was calling was broken.

Fix: use `get -o mycelium | default {}` — the idiomatic nu form that
returns null for missing keys instead of throwing, combined with
`default {}` to produce the same empty-record fallback the original
intended. `open` on a .toml file always returns a record, so no
type-check is needed.

Verified: with a test hero_cfg.toml the helper now returns the
configured ipv6_address correctly.
sameh-farouk merged commit 89655d74d0 into development 2026-04-22 19:59:22 +00:00
Reference
lhumina_code/hero_skills!121