# Redis Queues Guide: Who Pushes Where, When, and How to Inspect

This guide documents the canonical queues used in the project, explains which component pushes to which queue at each step, and provides redis-cli commands to inspect state during development.

Canonical keys
- Job hash (immutable key shape):
  - hero:job:{job_id}
  - Builder: [rust.keys::job_hash()](core/job/src/lib.rs:396)
- Work queues (push here to dispatch work):
  - Type queue: hero:q:work:type:{script_type}
  - Builders:
    - [rust.keys::work_type()](core/job/src/lib.rs:405)
    - [rust.keys::work_group()](core/job/src/lib.rs:411)
    - [rust.keys::work_instance()](core/job/src/lib.rs:420)
- Reply queue (optional, for actors that send explicit replies):
  - hero:q:reply:{job_id}
  - Builder: [rust.keys::reply()](core/job/src/lib.rs:401)
- Control queue (optional stop/control per-type):
  - hero:q:ctl:type:{script_type}
  - Builder: [rust.keys::stop_type()](core/job/src/lib.rs:429)

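To make the key shapes above concrete, here is a minimal sketch of what such builders could look like. The key patterns are taken from this guide, but the function signatures are assumptions and may differ from the real code in core/job/src/lib.rs. The group and instance builders are left out because this guide does not state their key shapes.

```rust
// Hypothetical stand-ins for the key builders; only the key shapes
// come from this guide, the signatures are assumed.
pub mod keys {
    /// Job hash key: hero:job:{job_id}
    pub fn job_hash(job_id: &str) -> String {
        format!("hero:job:{}", job_id)
    }

    /// Reply queue key: hero:q:reply:{job_id}
    pub fn reply(job_id: &str) -> String {
        format!("hero:q:reply:{}", job_id)
    }

    /// Type work queue key: hero:q:work:type:{script_type}
    pub fn work_type(script_type: &str) -> String {
        format!("hero:q:work:type:{}", script_type)
    }

    /// Per-type control queue key: hero:q:ctl:type:{script_type}
    pub fn stop_type(script_type: &str) -> String {
        format!("hero:q:ctl:type:{}", script_type)
    }
}
```

Building every key through one module keeps producers and consumers on the same strings, so a typo in a queue name cannot silently split them across two queues.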
1) Who pushes where

A. Supervisor: creating, starting, and running jobs
- Create job (stores job hash):
  - [rust.Supervisor::create_job()](core/supervisor/src/lib.rs:660)
  - Persists hero:job:{job_id} via [rust.Job::store_in_redis()](core/job/src/lib.rs:147)
- Start job (dispatch to worker queue):
  - [rust.Supervisor::start_job()](core/supervisor/src/lib.rs:675) → [rust.Supervisor::start_job_using_connection()](core/supervisor/src/lib.rs:599)
  - LPUSH hero:q:work:type:{script_type} using [rust.keys::work_type()](core/job/src/lib.rs:405)
- Run-and-wait (one-shot):
  - [rust.Supervisor::run_job_and_await_result()](core/supervisor/src/lib.rs:689)
  - Stores hero:job:{job_id}, LPUSH hero:q:work:type:{script_type} (same as start)
  - Waits on hero:q:reply:{job_id} (via [rust.keys::reply()](core/job/src/lib.rs:401)) and also polls hero:job:{job_id} for output to support hash-only actors

B. Terminal UI: quick dispatch from the actor TUI
- Stores job using Job::store_in_redis, then pushes to type queue:
  - Dispatch code: [core/actor/src/terminal_ui.rs](core/actor/src/terminal_ui.rs:460)
  - LPUSH hero:q:work:type:{script_type} using [rust.keys::work_type()](core/job/src/lib.rs:405)

C. Actors: consuming and completing work
- Consume jobs:
  - Standalone Rhai actor: [rust.spawn_rhai_actor()](core/actor/src/lib.rs:211)
    - BLPOP hero:q:work:type:{script_type} (queue selection computed via [rust.derive_script_type_from_actor_id()](core/actor/src/lib.rs:262), then [rust.keys::work_type()](core/job/src/lib.rs:405))
  - Trait-based actor loop: [rust.Actor::spawn()](core/actor/src/actor_trait.rs:119)
    - BLPOP hero:q:work:type:{script_type} using [rust.keys::work_type()](core/job/src/lib.rs:405)
- Write results:
  - Hash-only (current default): [rust.Job::set_result()](core/job/src/lib.rs:322) updates hero:job:{job_id} with output and status=finished
  - Optional reply queue model: actor may LPUSH hero:q:reply:{job_id} (if implemented)

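The division of labor above (a producer stores the job hash and LPUSHes the job ID, an actor pops the ID and writes the result back into the hash) can be modeled without a Redis server. The sketch below is an in-memory stand-in, not a Redis client. `FakeRedis`, `dispatch`, and `consume_one` are hypothetical names, and the `script` field and the `dispatched` status value are assumptions. Only the `output` field and `status=finished` come from this guide.

```rust
use std::collections::{HashMap, VecDeque};

/// In-memory stand-in for the two Redis structures the guide uses:
/// hashes (job records) and lists (queues).
#[derive(Default)]
pub struct FakeRedis {
    hashes: HashMap<String, HashMap<String, String>>,
    lists: HashMap<String, VecDeque<String>>,
}

impl FakeRedis {
    /// Models HSET key field value.
    pub fn hset(&mut self, key: &str, field: &str, value: &str) {
        self.hashes
            .entry(key.to_string())
            .or_default()
            .insert(field.to_string(), value.to_string());
    }

    /// Models HGET key field.
    pub fn hget(&self, key: &str, field: &str) -> Option<&str> {
        self.hashes.get(key)?.get(field).map(String::as_str)
    }

    /// Models LPUSH key value: insert at the head of the list.
    pub fn lpush(&mut self, key: &str, value: &str) {
        self.lists
            .entry(key.to_string())
            .or_default()
            .push_front(value.to_string());
    }

    /// Non-blocking stand-in for BLPOP: remove from the head of the list.
    pub fn lpop(&mut self, key: &str) -> Option<String> {
        self.lists.get_mut(key)?.pop_front()
    }
}

/// Producer side: store the job hash, then push the job ID to the type queue.
pub fn dispatch(r: &mut FakeRedis, script_type: &str, job_id: &str, script: &str) {
    let job_key = format!("hero:job:{}", job_id);
    r.hset(&job_key, "script", script); // assumed field name
    r.hset(&job_key, "status", "dispatched"); // assumed status value
    r.lpush(&format!("hero:q:work:type:{}", script_type), job_id);
}

/// Actor side (hash-only model): pop one job ID, run its script,
/// then write output and status=finished back into the job hash.
pub fn consume_one(
    r: &mut FakeRedis,
    script_type: &str,
    run: impl Fn(&str) -> String,
) -> Option<String> {
    let job_id = r.lpop(&format!("hero:q:work:type:{}", script_type))?;
    let job_key = format!("hero:job:{}", job_id);
    let script = r.hget(&job_key, "script").unwrap_or("").to_string();
    let output = run(&script);
    r.hset(&job_key, "output", &output);
    r.hset(&job_key, "status", "finished");
    Some(job_id)
}
```

One property worth knowing when reading queue contents: LPUSH and BLPOP both operate on the head of the list, so the most recently pushed job ID is popped first, and LRANGE key 0 -1 lists it first. Whether that ordering matters depends on the workload.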
2) End-to-end flows and the queues involved

Flow A: Two-step (create + start) with Supervisor
- Code path:
  - [rust.Supervisor::create_job()](core/supervisor/src/lib.rs:660)
  - [rust.Supervisor::start_job()](core/supervisor/src/lib.rs:675)
- Keys touched:
  - hero:job:{job_id} (created)
  - hero:q:work:type:{script_type} (LPUSH job_id)
- Expected actor behavior:
  - BLPOP hero:q:work:type:{script_type}
  - Execute script, then [rust.Job::set_result()](core/job/src/lib.rs:322)
- How to inspect with redis-cli:
  - FLUSHALL (fresh dev instance only), then run create and start
  - Verify job hash:
    - HGETALL hero:job:{job_id}
  - Verify queue length before consumption:
    - LLEN hero:q:work:type:osis
  - See pending items:
    - LRANGE hero:q:work:type:osis 0 -1
  - After actor runs, verify result in job hash:
    - HGET hero:job:{job_id} status
    - HGET hero:job:{job_id} output

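The inspection steps for Flow A all derive from two values, the script type and the job ID. As a convenience, a small helper can expand the placeholders into commands that are ready to paste into a shell. `flow_a_inspect_commands` is a hypothetical name and is not part of the project.

```rust
/// Build the Flow A inspection commands for one concrete job.
/// Hypothetical helper: it only substitutes the placeholders used in this guide.
pub fn flow_a_inspect_commands(script_type: &str, job_id: &str) -> Vec<String> {
    let job_key = format!("hero:job:{}", job_id);
    let queue = format!("hero:q:work:type:{}", script_type);
    vec![
        // Job hash as stored at creation time.
        format!("redis-cli HGETALL {}", job_key),
        // Queue length and pending job IDs before an actor consumes them.
        format!("redis-cli LLEN {}", queue),
        format!("redis-cli LRANGE {} 0 -1", queue),
        // Result fields after the actor has run.
        format!("redis-cli HGET {} status", job_key),
        format!("redis-cli HGET {} output", job_key),
    ]
}
```

Printing the returned lines for, say, script type osis and a real job ID gives the same sequence as the list above with the placeholders filled in.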
Flow B: One-shot (run and await result) with Supervisor
- Code path:
  - [rust.Supervisor::run_job_and_await_result()](core/supervisor/src/lib.rs:689)
  - Uses [rust.keys::reply()](core/job/src/lib.rs:401) and polls the job hash for output
- Keys touched:
  - hero:job:{job_id}
  - hero:q:work:type:{script_type}
  - hero:q:reply:{job_id} (only if an actor uses reply queues)
- How to inspect with redis-cli:
  - While waiting:
    - LLEN hero:q:work:type:osis
    - HGET hero:job:{job_id} status
  - If an actor uses reply queues (optional):
    - LLEN hero:q:reply:{job_id}
    - LRANGE hero:q:reply:{job_id} 0 -1
  - After completion:
    - HGET hero:job:{job_id} output

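The waiting strategy in Flow B (accept a reply-queue message if one arrives, otherwise fall back to the output field of the job hash) can be sketched as a polling loop. This is an illustration of the idea only. The real run_job_and_await_result() may be structured differently, for example around a blocking BLPOP or async code. `await_result` and `ResultSource` are hypothetical names, and the two closures stand in for the Redis calls.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Where the result was found.
#[derive(Debug, PartialEq)]
pub enum ResultSource {
    ReplyQueue(String),
    JobHash(String),
}

/// Poll both result channels until one yields a value or the timeout expires.
/// `pop_reply` stands in for a pop on hero:q:reply:{job_id};
/// `read_output` stands in for HGET hero:job:{job_id} output.
pub fn await_result(
    mut pop_reply: impl FnMut() -> Option<String>,
    mut read_output: impl FnMut() -> Option<String>,
    timeout: Duration,
    poll_interval: Duration,
) -> Option<ResultSource> {
    let deadline = Instant::now() + timeout;
    loop {
        // Actors that use the reply queue model are served first.
        if let Some(reply) = pop_reply() {
            return Some(ResultSource::ReplyQueue(reply));
        }
        // Hash-only actors never push a reply, so check the job hash too.
        if let Some(output) = read_output() {
            return Some(ResultSource::JobHash(output));
        }
        if Instant::now() >= deadline {
            return None; // timed out without a result
        }
        thread::sleep(poll_interval);
    }
}
```

Checking the hash on every iteration is what lets a hash-only actor complete a run-and-wait call even though hero:q:reply:{job_id} stays empty.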
Flow C: Dispatch from the Actor TUI (manual testing)
- Code path:
  - [core/actor/src/terminal_ui.rs](core/actor/src/terminal_ui.rs:460) stores the job and LPUSHes to [rust.keys::work_type()](core/job/src/lib.rs:405)
- Keys touched:
  - hero:job:{job_id}
  - hero:q:work:type:{script_type}
- How to inspect with redis-cli:
  - List all work queues:
    - KEYS hero:q:work:type:*
  - Show items in a specific type queue:
    - LRANGE hero:q:work:type:osis 0 -1
  - Read one pending job:
    - HGETALL hero:job:{job_id}
  - After actor runs:
    - HGET hero:job:{job_id} status
    - HGET hero:job:{job_id} output


3) Example redis-cli sequences

A. Basic OSIS job lifecycle (two-step)
- Prepare
  - FLUSHALL
- Create and start (via code or supervisor-cli)
- Inspect queue and job
  - KEYS hero:q:work:type:*
  - LLEN hero:q:work:type:osis
  - LRANGE hero:q:work:type:osis 0 -1
  - HGETALL hero:job:{job_id}
- After actor consumes the job:
  - HGET hero:job:{job_id} status → finished
  - HGET hero:job:{job_id} output → script result
  - LLEN hero:q:work:type:osis → likely 0 if all consumed

B. One-shot run-and-wait (hash-only actor)
- Prepare
  - FLUSHALL
- Submit via run_job_and_await_result()
- While supervisor waits:
  - HGET hero:job:{job_id} status → started/finished
  - (Optional) LLEN hero:q:reply:{job_id} → typically 0 if actor doesn’t use reply queues
- When done:
  - HGET hero:job:{job_id} output → result

C. Listing and cleanup helpers
- List jobs
  - KEYS hero:job:*
- Show a specific job
  - HGETALL hero:job:{job_id}
- Clear all keys in every database (dev only)
  - FLUSHALL


4) Where the queue names are computed in code

- Builders for canonical keys:
  - [rust.keys::job_hash()](core/job/src/lib.rs:396)
  - [rust.keys::reply()](core/job/src/lib.rs:401)
  - [rust.keys::work_type()](core/job/src/lib.rs:405)
  - [rust.keys::work_group()](core/job/src/lib.rs:411)
  - [rust.keys::work_instance()](core/job/src/lib.rs:420)
- Supervisor routing and waiting:
  - Type queue selection: [rust.Supervisor::get_actor_queue_key()](core/supervisor/src/lib.rs:410)
  - LPUSH to type queue: [rust.Supervisor::start_job_using_connection()](core/supervisor/src/lib.rs:599)
  - One-shot run and wait: [rust.Supervisor::run_job_and_await_result()](core/supervisor/src/lib.rs:689)
- Actor consumption:
  - Standalone Rhai actor: [rust.spawn_rhai_actor()](core/actor/src/lib.rs:211)
    - Type queue computed via [rust.derive_script_type_from_actor_id()](core/actor/src/lib.rs:262) + [rust.keys::work_type()](core/job/src/lib.rs:405)
  - Trait-based actor loop: [rust.Actor::spawn()](core/actor/src/actor_trait.rs:119)
    - BLPOP type queue via [rust.keys::work_type()](core/job/src/lib.rs:405)


5) Quick checklist for debugging

- Nothing consumes from the type queue
  - Is at least one actor process running that BLPOPs hero:q:work:type:{script_type}?
  - If LLEN hero:q:work:type:{script_type} returns a value greater than 0, there is an unconsumed backlog
- Job “Dispatched” but never “Finished”
  - HGET hero:job:{job_id} status
  - Actor logs: check for script errors and verify the actor is connected to the same Redis instance
- “run-and-wait” timeout
  - Hash-only actors don’t push to reply queues; the supervisor will still return once it sees the output field of hero:job:{job_id} set by [rust.Job::set_result()](core/job/src/lib.rs:322)
- Mixed types:
  - Verify you targeted the correct type queue (e.g., osis vs sal): LLEN hero:q:work:type:osis, LLEN hero:q:work:type:sal


6) Canonical patterns to remember

- To dispatch a job:
  - LPUSH hero:q:work:type:{script_type} {job_id}
- To read job data:
  - HGETALL hero:job:{job_id}
- To wait for output (optional reply model):
  - BLPOP hero:q:reply:{job_id} {timeout_secs}
- To verify system state:
  - KEYS hero:q:*
  - KEYS hero:job:*


This guide reflects the canonical scheme implemented in:
- [rust.Supervisor](core/supervisor/src/lib.rs:1)
- [rust.keys](core/job/src/lib.rs:392)
- [core/actor/src/lib.rs](core/actor/src/lib.rs:1)
- [core/actor/src/actor_trait.rs](core/actor/src/actor_trait.rs:1)
- [core/actor/src/terminal_ui.rs](core/actor/src/terminal_ui.rs:1)