feat(11-C2): per-Play span socket listener #18

Merged
timur merged 1 commit from feat/11-phase-c2-span-socket into development 2026-05-05 12:29:25 +00:00

Summary

Phase C2 of #11. Mirror of Phase B (#16, merged) on the Rust side: this listener consumes the JSONL events the SDK writes and turns them into Play.spans rows on disk.

Module: engine/span_socket.rs

| API | Use |
|---|---|
| `SpanListener::new(play_sid, domain)` | Construct against a Play SID + the OsisLogic domain. |
| `.bind()` | Create the UDS at `/tmp/spans-{play_sid}.sock`. Must complete before the Python subprocess is spawned (Phase C3 enforces this) so `accept()` doesn't race the Python `connect()`. |
| `.run(listener)` | Spawn a tokio task that accepts one connection, processes events to EOF, then unlinks the socket. Returns `JoinHandle<()>` for the executor to await on subprocess exit. |
| `play_socket_path(sid)` | Path resolver, exposed so Phase C3's subprocess spawn can pass `HERO_FLOW_SPAN_SOCK` without depending on `SpanListener` internals. |
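
The bind, accept, drain-to-EOF, unlink lifecycle described above can be sketched roughly as follows. This is not the code in engine/span_socket.rs: it uses blocking std sockets and a thread as stand-ins for the tokio listener and task, and `socket_path_in`, `bind`, `run`, and `demo` are illustrative names. The directory parameter is an assumption added so the example need not hard-code /tmp.

```rust
use std::io::{BufRead, BufReader, Write};
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::{Path, PathBuf};
use std::thread::{self, JoinHandle};

/// Follows the documented `spans-{play_sid}.sock` naming; the directory
/// argument is an assumption of this sketch.
fn socket_path_in(dir: &Path, play_sid: &str) -> PathBuf {
    dir.join(format!("spans-{play_sid}.sock"))
}

/// Bind first. Once this returns, the socket file exists and the kernel
/// queues incoming connections, so a later connect() cannot race accept().
fn bind(path: &Path) -> std::io::Result<UnixListener> {
    UnixListener::bind(path)
}

/// Accept exactly one connection, read lines until EOF, then unlink the
/// socket. The thread handle plays the role of the tokio JoinHandle.
fn run(listener: UnixListener, path: PathBuf) -> JoinHandle<Vec<String>> {
    thread::spawn(move || {
        let mut lines = Vec::new();
        if let Ok((stream, _addr)) = listener.accept() {
            for line in BufReader::new(stream).lines().map_while(Result::ok) {
                lines.push(line);
            }
        }
        let _ = std::fs::remove_file(&path);
        lines
    })
}

/// Stand-in for the Python flow: connect, write two lines, close.
/// Returns the lines the listener saw and whether the socket file remains.
fn demo() -> (Vec<String>, bool) {
    let sid = format!("demo-{}", std::process::id());
    let path = socket_path_in(&std::env::temp_dir(), &sid);
    let listener = bind(&path).expect("bind");
    let handle = run(listener, path.clone());
    {
        let mut client = UnixStream::connect(&path).expect("connect");
        client.write_all(b"first\nsecond\n").expect("write");
    } // dropping the stream closes it, which the listener sees as EOF
    let lines = handle.join().expect("listener thread");
    (lines, path.exists())
}
```

Because `bind` returns before the client is started, the connect succeeds even if the listener thread has not yet reached `accept()`.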

Behavioural details

  • Idempotent span_start: a duplicate event for the same span_id (flaky reconnect) updates the existing row rather than appending.
  • Tag merge: span_start with initial tags + later span_tag events accumulate via JSON merge.
  • Unknown SpanStatus → Failed: Python/Rust contract drift surfaces visibly; never silently maps to Ok.
  • Bad lines: malformed JSON is logged with a truncated preview and skipped. One bad line doesn't kill the listener for the whole flow.
  • Per-event play_get/play_set: known O(events) storage churn. Acceptable for C2 simplicity; can buffer if profiling shows it as a hotspot.
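
Two of the rules above, the Failed fallback and accumulating tags, can be illustrated with a small sketch. The wire strings ("running", "ok", "failed"), the `from_wire` and `merge_tags` names, and the use of a `BTreeMap` in place of the JSON tags string are all assumptions made for the example. "Later key wins" is likewise an assumed reading of "JSON merge".

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SpanStatus {
    Running,
    Ok,
    Failed,
}

impl SpanStatus {
    /// Map a wire string to a status. The exact strings are hypothetical;
    /// the point is the final arm.
    fn from_wire(s: &str) -> SpanStatus {
        match s {
            "running" => SpanStatus::Running,
            "ok" => SpanStatus::Ok,
            "failed" => SpanStatus::Failed,
            // Unknown value: surface contract drift as a failure, never as Ok.
            _ => SpanStatus::Failed,
        }
    }
}

/// Merge incoming tags into the existing set: new keys are added, existing
/// keys are overwritten, and keys not mentioned are left alone.
fn merge_tags(existing: &mut BTreeMap<String, String>, incoming: &[(&str, &str)]) {
    for (key, value) in incoming {
        existing.insert(key.to_string(), value.to_string());
    }
}
```

With this shape, a misspelled status from the SDK shows up as a failed span in the UI instead of passing as a success.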

Tests

13 total: 8 unit + 5 integration.

  • Unit: parser, status mapping, tag stringify/parse round-trip, log truncation, unknown-type rejection, edge cases.
  • Integration (real OsisLogic + tempdir): full lifecycle, nested parent linkage, failed-status persistence, RPC metadata round-trip, malformed-line resilience.
  • Tests use SpanListener::with_socket_path (cfg(test)-only) so parallel test runs don't collide on /tmp/spans-*.sock when OSIS hands every fresh tempdir database the same first SID.

What this PR is NOT

  • No subprocess spawn (C3).
  • No sandbox (C3).
  • No RPC handlers / SSE / play_start routing (C4).

Phase plan (#11)

  • A — schema additive (#15, merged)
  • B — hero_tracing.py SDK (#16, merged)
  • C1 — staging (#17, merged)
  • C2 — this PR — span socket listener
  • C3 — python3 subprocess + Tier 0 sandbox
  • C4 — new RPCs + SSE + play_start routing
  • D — migration tool + delete legacy DAG

Test plan

  • cargo test -p hero_logic --lib span_socket — 13/13 pass
  • cargo test --workspace --lib — 23 total, all green
  • cargo build --workspace clean

🤖 Generated with Claude Code

Phase C2 of #11. Accepts a Python flow's UDS connection at
`/tmp/spans-{play_sid}.sock`, parses JSONL events, and incrementally
persists them into `Play.spans` via OSIS storage. The hero_tracing.py
SDK from Phase B writes; this listener reads.

Wire protocol mirrors `sdk/python/hero_tracing.py`:
  - span_start  → push or upsert a Span (status: Running)
  - span_tag    → JSON-merge into the span's tags string
  - span_log    → append "[ts_ms] text" to logs
  - span_end    → set end_ms / status / error
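
The four event types above map naturally onto a dispatch over an in-memory span list. The sketch below is an illustration only: the `Event` and `Span` shapes, the `name` field, the `Vec<String>` log buffer, and the choice to ignore events for unknown span IDs are assumptions, and JSON parsing and OSIS storage are left out.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SpanStatus {
    Running,
    Ok,
    Failed,
}

#[derive(Debug, Clone, PartialEq)]
struct Span {
    span_id: String,
    name: String, // hypothetical payload field
    status: SpanStatus,
    tags: BTreeMap<String, String>,
    logs: Vec<String>,
    end_ms: Option<u64>,
    error: Option<String>,
}

enum Event {
    Start { span_id: String, name: String },
    Tag { span_id: String, key: String, value: String },
    Log { span_id: String, ts_ms: u64, text: String },
    End { span_id: String, end_ms: u64, status: SpanStatus, error: Option<String> },
}

fn find_mut<'a>(spans: &'a mut [Span], id: &str) -> Option<&'a mut Span> {
    spans.iter_mut().find(|s| s.span_id == id)
}

/// Apply one event to the span list, following the four rules listed above.
fn apply(spans: &mut Vec<Span>, event: Event) {
    match event {
        Event::Start { span_id, name } => {
            let idx = spans.iter().position(|s| s.span_id == span_id);
            match idx {
                // Duplicate start: update the existing row, do not append.
                Some(i) => spans[i].name = name,
                None => spans.push(Span {
                    span_id,
                    name,
                    status: SpanStatus::Running,
                    tags: BTreeMap::new(),
                    logs: Vec::new(),
                    end_ms: None,
                    error: None,
                }),
            }
        }
        Event::Tag { span_id, key, value } => {
            if let Some(span) = find_mut(spans, &span_id) {
                span.tags.insert(key, value);
            }
        }
        Event::Log { span_id, ts_ms, text } => {
            if let Some(span) = find_mut(spans, &span_id) {
                span.logs.push(format!("[{ts_ms}] {text}"));
            }
        }
        Event::End { span_id, end_ms, status, error } => {
            if let Some(span) = find_mut(spans, &span_id) {
                span.end_ms = Some(end_ms);
                span.status = status;
                span.error = error;
            }
        }
    }
}
```

Sending the same start event twice leaves one row, and a later end event fills in `end_ms`, `status`, and `error` on that row.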

Module: `engine/span_socket.rs`. Public surface:
  - SpanListener::new(play_sid, domain) → bind() → run(listener)
  - bind() must complete BEFORE the subprocess is spawned (Phase C3 will
    enforce this) so the Python `connect()` doesn't race the
    listener's `accept()`.
  - run() spawns a tokio task; returns a JoinHandle the executor awaits
    on subprocess exit.

Behavioural details worth pinning:
  - Idempotent span_start: a duplicate event for the same span_id (e.g.
    flaky reconnect) updates the existing record rather than appending
    a duplicate row.
  - Tags are JSON-merged, not replaced — so a `span_start` with initial
    tags + later `span_tag` events accumulate.
  - Unknown SpanStatus from the wire falls back to Failed, not Ok, so a
    Python/Rust contract drift surfaces as a visible problem rather
    than a silent success.
  - Malformed JSON lines are logged with a truncated preview and
    skipped — one bad line doesn't kill the listener for a whole flow.
  - Each event triggers play_get / mutate / play_set. Known O(events)
    storage churn — acceptable for Phase C2 simplicity; revisit if
    profiling shows it as a hotspot.
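
The skip-and-continue handling of malformed lines can be sketched as below. The parser is passed in as a closure so the example stays independent of any JSON library. `preview`, `process_lines`, and the 80-character limit are hypothetical, and the real listener's log format is not shown in this PR.

```rust
/// Shorten a line for logging. Works on characters rather than bytes so a
/// multi-byte UTF-8 character is never split.
fn preview(line: &str, max_chars: usize) -> String {
    let mut out: String = line.chars().take(max_chars).collect();
    if line.chars().count() > max_chars {
        out.push_str("...");
    }
    out
}

/// Feed every line to `parse`. Successes are collected; failures are logged
/// with a truncated preview and counted, and processing continues.
fn process_lines<T, E: std::fmt::Display>(
    input: &str,
    parse: impl Fn(&str) -> Result<T, E>,
) -> (Vec<T>, usize) {
    let mut parsed = Vec::new();
    let mut skipped = 0;
    for line in input.lines() {
        match parse(line) {
            Ok(value) => parsed.push(value),
            Err(err) => {
                eprintln!("skipping malformed span event ({err}): {}", preview(line, 80));
                skipped += 1;
            }
        }
    }
    (parsed, skipped)
}
```

For example, `process_lines("1\nnot-a-number\n3\n", |l| l.parse::<u32>())` keeps the two valid values and reports one skipped line, which is the behaviour the malformed-line integration test checks for in the listener.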

Tests:
  - 8 unit tests (parser, status mapping, tag round-trip, truncation,
    unknown-type rejection, edge cases for malformed tag JSON)
  - 5 integration tests on real OsisLogic in tempdirs:
    * full lifecycle (start → tag → log → end → assert Play.spans)
    * nested spans link parent to child correctly
    * failed span persists status + error message
    * RPC metadata (rpc_service, rpc_method) round-trips
    * malformed line is dropped without killing the listener
  - Tests use SpanListener::with_socket_path so parallel runs don't
    collide on /tmp/spans-*.sock when OSIS hands every fresh database
    the same first SID. The cfg(test) constructor stays out of the
    public API.
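
One way to produce collision-free socket paths for parallel tests is sketched below. It only illustrates the idea behind `with_socket_path`: the helper name and the pid-plus-counter scheme are assumptions, and the actual tests may simply place the socket inside each test's tempdir.

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide counter so two tests in the same process never share a path.
static NEXT_SOCKET: AtomicU64 = AtomicU64::new(0);

/// Hypothetical test helper: a socket path that stays unique even when every
/// fresh database hands out the same first SID.
fn unique_test_socket_path(play_sid: &str) -> PathBuf {
    let n = NEXT_SOCKET.fetch_add(1, Ordering::Relaxed);
    let pid = std::process::id();
    std::env::temp_dir().join(format!("spans-{play_sid}-{pid}-{n}.sock"))
}
```

Two calls with the same SID return different paths, so tests running on separate threads can each bind their own listener.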

What's NOT here:
  - Subprocess spawn (C3)
  - Tier 0 sandbox (C3)
  - RPC handlers / play_start routing / SSE (C4)
  - Timeout-driven socket teardown (C3 owns the lifecycle)

Refs hero_logic#11 (Story 1: Foundation), hero_logic#10 (epic).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
timur merged commit d4e16c6ffe into development 2026-05-05 12:29:25 +00:00
Reference
lhumina_code/hero_logic!18