baobab/docs/REDIS_QUEUES_GUIDE.md

# Redis Queues Guide: Who Pushes Where, When, and How to Inspect

This guide documents the canonical queues used in the project, explains which component pushes to which queue at each step, and provides redis-cli commands to inspect state during development.

Canonical keys
- Job hash (immutable key shape):
  - hero:job:{job_id}
  - Builder: [rust.keys::job_hash()](core/job/src/lib.rs:396)
- Work queues (push here to dispatch work):
  - Type queue: hero:q:work:type:{script_type}
  - Builders (type, group, and instance queue variants):
    - [rust.keys::work_type()](core/job/src/lib.rs:405)
    - [rust.keys::work_group()](core/job/src/lib.rs:411)
    - [rust.keys::work_instance()](core/job/src/lib.rs:420)
- Reply queue (optional, for actors that send explicit replies):
  - hero:q:reply:{job_id}
  - Builder: [rust.keys::reply()](core/job/src/lib.rs:401)
- Control queue (optional stop/control per-type):
  - hero:q:ctl:type:{script_type}
  - Builder: [rust.keys::stop_type()](core/job/src/lib.rs:429)
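
The key shapes above can be sketched as plain string builders. This is a minimal illustration, not the code in core/job/src/lib.rs: the function names match the builders linked above, but the `&str` parameters are an assumption (the real builders may take typed arguments), and the group and instance builders are omitted because their key shapes are not listed here.

```rust
/// Illustrative key builders mirroring the key shapes listed in this guide.
/// Signatures are assumptions; the real builders live in core/job/src/lib.rs.
pub mod keys {
    /// Job hash key: hero:job:{job_id}
    pub fn job_hash(job_id: &str) -> String {
        format!("hero:job:{}", job_id)
    }

    /// Reply queue key: hero:q:reply:{job_id}
    pub fn reply(job_id: &str) -> String {
        format!("hero:q:reply:{}", job_id)
    }

    /// Type work queue key: hero:q:work:type:{script_type}
    pub fn work_type(script_type: &str) -> String {
        format!("hero:q:work:type:{}", script_type)
    }

    /// Per-type control queue key: hero:q:ctl:type:{script_type}
    pub fn stop_type(script_type: &str) -> String {
        format!("hero:q:ctl:type:{}", script_type)
    }
}
```

Keeping every key behind one builder module means a typo in a queue name cannot silently split producers and consumers onto different lists.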


1) Who pushes where

A. Supervisor: creating, starting, and running jobs
- Create job (stores job hash):
  - [rust.Supervisor::create_job()](core/supervisor/src/lib.rs:660)
  - Persists hero:job:{job_id} via [rust.Job::store_in_redis()](core/job/src/lib.rs:147)
- Start job (dispatch to worker queue):
  - [rust.Supervisor::start_job()](core/supervisor/src/lib.rs:675) → [rust.Supervisor::start_job_using_connection()](core/supervisor/src/lib.rs:599)
  - LPUSH hero:q:work:type:{script_type} using [rust.keys::work_type()](core/job/src/lib.rs:405)
- Run-and-wait (one-shot):
  - [rust.Supervisor::run_job_and_await_result()](core/supervisor/src/lib.rs:689)
  - Stores hero:job:{job_id}, LPUSH hero:q:work:type:{script_type} (same as start)
  - Waits on hero:q:reply:{job_id} (via [rust.keys::reply()](core/job/src/lib.rs:401)) and also polls hero:job:{job_id} for output to support hash-only actors

B. Terminal UI: quick dispatch from the actor TUI
- Stores job using Job::store_in_redis, then pushes to type queue:
  - Dispatch code: [core/actor/src/terminal_ui.rs](core/actor/src/terminal_ui.rs:460)
  - LPUSH hero:q:work:type:{script_type} using [rust.keys::work_type()](core/job/src/lib.rs:405)

C. Actors: consuming and completing work
- Consume jobs:
  - Standalone Rhai actor: [rust.spawn_rhai_actor()](core/actor/src/lib.rs:211)
    - BLPOP hero:q:work:type:{script_type} (queue selection computed via [rust.derive_script_type_from_actor_id()](core/actor/src/lib.rs:262), then [rust.keys::work_type()](core/job/src/lib.rs:405))
  - Trait-based actor loop: [rust.Actor::spawn()](core/actor/src/actor_trait.rs:119)
    - BLPOP hero:q:work:type:{script_type} using [rust.keys::work_type()](core/job/src/lib.rs:405)
- Write results:
  - Hash-only (current default): [rust.Job::set_result()](core/job/src/lib.rs:322) updates hero:job:{job_id} with output and status=finished
  - Optional reply queue model: actor may LPUSH hero:q:reply:{job_id} (if implemented)
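
The push and pop steps above can be modeled without a Redis server. The sketch below uses an in-memory stand-in (`FakeRedis`, a hypothetical type) that covers only the commands this guide mentions. The `script` field name and the `dispatched` status value are assumptions for illustration; `status`, `output`, and `finished` come from the flows described here.

```rust
use std::collections::{HashMap, VecDeque};

/// Toy stand-in for the subset of Redis used by the flows in this guide.
#[derive(Default)]
pub struct FakeRedis {
    hashes: HashMap<String, HashMap<String, String>>,
    lists: HashMap<String, VecDeque<String>>,
}

impl FakeRedis {
    /// HSET key field value
    pub fn hset(&mut self, key: &str, field: &str, value: &str) {
        self.hashes
            .entry(key.to_string())
            .or_default()
            .insert(field.to_string(), value.to_string());
    }

    /// HGET key field
    pub fn hget(&self, key: &str, field: &str) -> Option<String> {
        self.hashes.get(key)?.get(field).cloned()
    }

    /// LPUSH key value: insert at the head of the list.
    pub fn lpush(&mut self, key: &str, value: &str) {
        self.lists
            .entry(key.to_string())
            .or_default()
            .push_front(value.to_string());
    }

    /// Non-blocking analogue of BLPOP: remove from the head of the list.
    pub fn lpop(&mut self, key: &str) -> Option<String> {
        self.lists.get_mut(key)?.pop_front()
    }

    /// LLEN key
    pub fn llen(&self, key: &str) -> usize {
        self.lists.get(key).map_or(0, |l| l.len())
    }
}

/// Supervisor side: store the job hash, then push the id on the type queue.
pub fn dispatch(r: &mut FakeRedis, job_id: &str, script_type: &str, script: &str) {
    let job_key = format!("hero:job:{}", job_id);
    r.hset(&job_key, "script", script); // field name is an assumption
    r.hset(&job_key, "status", "dispatched"); // status value is an assumption
    r.lpush(&format!("hero:q:work:type:{}", script_type), job_id);
}

/// Actor side (hash-only model): pop one id, pretend to run it,
/// then write output and status=finished into the job hash.
pub fn consume_one(r: &mut FakeRedis, script_type: &str) -> Option<String> {
    let job_id = r.lpop(&format!("hero:q:work:type:{}", script_type))?;
    let job_key = format!("hero:job:{}", job_id);
    let script = r.hget(&job_key, "script").unwrap_or_default();
    r.hset(&job_key, "output", &format!("ran: {}", script));
    r.hset(&job_key, "status", "finished");
    Some(job_id)
}
```

One property worth noticing when reading LRANGE output: pushing at the head and popping from the head returns the most recently pushed id first, and real Redis behaves the same way for LPUSH paired with BLPOP.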


2) End-to-end flows and the queues involved

Flow A: Two-step (create + start) with Supervisor
- Code path:
  - [rust.Supervisor::create_job()](core/supervisor/src/lib.rs:660)
  - [rust.Supervisor::start_job()](core/supervisor/src/lib.rs:675)
- Keys touched:
  - hero:job:{job_id} (created)
  - hero:q:work:type:{script_type} (LPUSH job_id)
- Expected actor behavior:
  - BLPOP hero:q:work:type:{script_type}
  - Execute script, then [rust.Job::set_result()](core/job/src/lib.rs:322)
- How to inspect with redis-cli:
  - FLUSHALL (fresh dev instance only; this deletes every key in every database), then run create and start
  - Verify job hash:
    - HGETALL hero:job:{job_id}
  - Verify queue length before consumption:
    - LLEN hero:q:work:type:osis
  - See pending items:
    - LRANGE hero:q:work:type:osis 0 -1
  - After actor runs, verify result in job hash:
    - HGET hero:job:{job_id} status
    - HGET hero:job:{job_id} output

Flow B: One-shot (run and await result) with Supervisor
- Code path:
  - [rust.Supervisor::run_job_and_await_result()](core/supervisor/src/lib.rs:689)
  - Uses [rust.keys::reply()](core/job/src/lib.rs:401) and polls the hash for output
- Keys touched:
  - hero:job:{job_id}
  - hero:q:work:type:{script_type}
  - hero:q:reply:{job_id} (only if an actor uses reply queues)
- How to inspect with redis-cli:
  - While waiting:
    - LLEN hero:q:work:type:osis
    - HGET hero:job:{job_id} status
  - If an actor uses reply queues (optional):
    - LLEN hero:q:reply:{job_id}
    - LRANGE hero:q:reply:{job_id} 0 -1
  - After completion:
    - HGET hero:job:{job_id} output
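
The dual wait in Flow B (reply queue first, job hash as a fallback) can be sketched as a polling loop. This is an illustration under stated assumptions, not the body of run_job_and_await_result(): `await_result`, `Outcome`, and the two closures are hypothetical names, with the closures standing in for a pop on hero:q:reply:{job_id} and an HGET of the output field.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Where the result came from, or that the wait gave up.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Reply(String),
    HashOutput(String),
    TimedOut,
}

/// Poll both result channels until one yields a value or the timeout elapses.
/// `pop_reply` stands in for a pop on hero:q:reply:{job_id};
/// `read_output` stands in for HGET hero:job:{job_id} output.
pub fn await_result<R, H>(
    mut pop_reply: R,
    mut read_output: H,
    timeout: Duration,
    poll_every: Duration,
) -> Outcome
where
    R: FnMut() -> Option<String>,
    H: FnMut() -> Option<String>,
{
    let deadline = Instant::now() + timeout;
    loop {
        // Actors that use the reply queue model are served first.
        if let Some(v) = pop_reply() {
            return Outcome::Reply(v);
        }
        // Hash-only actors are detected through the output field.
        if let Some(v) = read_output() {
            return Outcome::HashOutput(v);
        }
        if Instant::now() >= deadline {
            return Outcome::TimedOut;
        }
        sleep(poll_every);
    }
}
```

Passing the two lookups as closures keeps the loop testable without a Redis connection; a real caller would wrap its connection calls in them.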

Flow C: Dispatch from the Actor TUI (manual testing)
- Code path:
  - [core/actor/src/terminal_ui.rs](core/actor/src/terminal_ui.rs:460) stores job and LPUSH to [rust.keys::work_type()](core/job/src/lib.rs:405)
- Keys touched:
  - hero:job:{job_id}
  - hero:q:work:type:{script_type}
- How to inspect with redis-cli:
  - List all work queues:
    - KEYS hero:q:work:type:* (dev only; KEYS walks the whole keyspace, so prefer SCAN on large instances)
  - Show items in a specific type queue:
    - LRANGE hero:q:work:type:osis 0 -1
  - Read one pending job:
    - HGETALL hero:job:{job_id}
  - After actor runs:
    - HGET hero:job:{job_id} status
    - HGET hero:job:{job_id} output


3) Example redis-cli sequences

A. Basic OSIS job lifecycle (two-step)
- Prepare
  - FLUSHALL
- Create and start (via code or supervisor-cli)
- Inspect queue and job
  - KEYS hero:q:work:type:*
  - LLEN hero:q:work:type:osis
  - LRANGE hero:q:work:type:osis 0 -1
  - HGETALL hero:job:{job_id}
- After actor consumes the job:
  - HGET hero:job:{job_id} status           → finished
  - HGET hero:job:{job_id} output           → script result
  - LLEN hero:q:work:type:osis              → likely 0 if all consumed

B. One-shot run-and-wait (hash-only actor)
- Prepare
  - FLUSHALL
- Submit via run_job_and_await_result()
- While supervisor waits:
  - HGET hero:job:{job_id} status           → started/finished
  - (Optional) LLEN hero:q:reply:{job_id}   → typically 0 if actor doesn’t use reply queues
- When done:
  - HGET hero:job:{job_id} output           → result

C. Listing and cleanup helpers
- List jobs
  - KEYS hero:job:*
- Show a specific job
  - HGETALL hero:job:{job_id}
- Clear all keys (dev only)
  - FLUSHALL


4) Where the queue names are computed in code

- Builders for canonical keys:
  - [rust.keys::job_hash()](core/job/src/lib.rs:396)
  - [rust.keys::reply()](core/job/src/lib.rs:401)
  - [rust.keys::work_type()](core/job/src/lib.rs:405)
  - [rust.keys::work_group()](core/job/src/lib.rs:411)
  - [rust.keys::work_instance()](core/job/src/lib.rs:420)
- Supervisor routing and waiting:
  - Type queue selection: [rust.Supervisor::get_actor_queue_key()](core/supervisor/src/lib.rs:410)
  - LPUSH to type queue: [rust.Supervisor::start_job_using_connection()](core/supervisor/src/lib.rs:599)
  - One-shot run and wait: [rust.Supervisor::run_job_and_await_result()](core/supervisor/src/lib.rs:689)
- Actor consumption:
  - Standalone Rhai actor: [rust.spawn_rhai_actor()](core/actor/src/lib.rs:211)
    - Type queue computed via [rust.derive_script_type_from_actor_id()](core/actor/src/lib.rs:262) + [rust.keys::work_type()](core/job/src/lib.rs:405)
  - Trait-based actor loop: [rust.Actor::spawn()](core/actor/src/actor_trait.rs:119)
    - BLPOP type queue via [rust.keys::work_type()](core/job/src/lib.rs:405)


5) Quick checklist for debugging

- Nothing consumes from the type queue
  - Is at least one actor process running that BLPOPs hero:q:work:type:{script_type}?
  - If LLEN hero:q:work:type:{script_type} returns a value greater than 0, there is an unconsumed backlog
- Job “Dispatched” but never “Finished”
  - HGET hero:job:{job_id} status
  - Actor logs: check for script errors and verify it is connected to the same Redis
- “run-and-wait” timeout
  - Hash-only actors don’t push to reply queues; the supervisor still returns once it sees the output field of hero:job:{job_id} set by [rust.Job::set_result()](core/job/src/lib.rs:322). If the call times out anyway, check HGET hero:job:{job_id} status and the actor logs
- Mixed types:
  - Verify you targeted the correct type queue (e.g., osis vs sal): run LLEN hero:q:work:type:osis and LLEN hero:q:work:type:sal separately, since LLEN takes a single key
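
The checklist can be condensed into a small decision function that takes the two values you would read with LLEN and HGET ... status. It is a rough sketch: `triage` and `Hint` are hypothetical names, the mapping is a simplification of the checks above, and a non-zero queue length does not prove that this particular job id is still queued.

```rust
/// Rough next step suggested by the checklist.
#[derive(Debug, PartialEq)]
pub enum Hint {
    /// Ids are still waiting on the type queue: check that an actor is consuming it.
    Backlog,
    /// The hash has no status: the job may not exist in this Redis instance.
    MissingJob,
    /// The queue is drained but the job is not finished: check the actor logs.
    InFlightOrFailed,
    /// status=finished: read the output field.
    Finished,
}

/// `queue_len` is the LLEN of the type queue;
/// `status` is the result of HGET hero:job:{job_id} status.
pub fn triage(queue_len: usize, status: Option<&str>) -> Hint {
    match (queue_len, status) {
        (_, Some("finished")) => Hint::Finished,
        (_, None) => Hint::MissingJob,
        (n, Some(_)) if n > 0 => Hint::Backlog,
        (_, Some(_)) => Hint::InFlightOrFailed,
    }
}
```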


6) Canonical patterns to remember

- To dispatch a job:
  - LPUSH hero:q:work:type:{script_type} {job_id}
- To read job data:
  - HGETALL hero:job:{job_id}
- To wait for output (optional reply model):
  - BLPOP hero:q:reply:{job_id} {timeout_secs}
- To verify system state:
  - KEYS hero:q:*
  - KEYS hero:job:*


This guide reflects the canonical scheme implemented in:
- [rust.Supervisor](core/supervisor/src/lib.rs:1)
- [rust.keys](core/job/src/lib.rs:392)
- [core/actor/src/lib.rs](core/actor/src/lib.rs:1)
- [core/actor/src/actor_trait.rs](core/actor/src/actor_trait.rs:1)
- [core/actor/src/terminal_ui.rs](core/actor/src/terminal_ui.rs:1)