# Hero Supervisor

The **Hero Supervisor** supervises the lifecycle of workers and dispatches jobs to them via Redis queues.

## Overview

The system involves four primary actors:

1. **OSIS**: A worker that executes Rhai and HeroScript.
2. **SAL**: A worker that provides system abstraction layer functionality using Rhai.
3. **V**: A worker that executes HeroScript in the V programming language.
4. **Python**: A worker that executes HeroScript in Python.

The Supervisor uses **Zinit** to start and monitor these workers, ensuring they are running correctly.

### Key Features

- **Worker Lifecycle Supervision**: Oversees the lifecycle of workers, including starting, stopping, restarting, and load balancing based on job demand.
- **Job Supervision**: Provides an API for managing jobs dispatched to workers over Redis queues.

## Worker Lifecycle Supervision

The Supervisor oversees the lifecycle of the workers, ensuring they are operational and efficiently allocated. Load balancing dynamically adjusts the number of active workers based on job demand.

The Supervisor also monitors the health of worker engines: if an engine has not received a job within 10 minutes, the Supervisor sends it a ping job. The engine must respond immediately; if it does not, the Supervisor restarts that engine.
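
The health-check rule above can be sketched as a small decision function. This is a minimal illustration, not the Supervisor's actual code: the `HealthAction` type, the `next_action` function, and the `ping_timeout` parameter (a stand-in for "must respond immediately") are all hypothetical names introduced here.

```rust
use std::time::Duration;

/// Idle threshold from the text: ping a worker that has had no job for 10 minutes.
const IDLE_BEFORE_PING: Duration = Duration::from_secs(10 * 60);

/// What the supervisor should do next for one worker (hypothetical type).
#[derive(Debug, PartialEq)]
enum HealthAction {
    /// Nothing to do yet (recent job, or a ping is still within its timeout).
    Wait,
    /// The worker has been idle too long; dispatch a ping job.
    SendPing,
    /// A ping was sent and not answered in time; restart the worker.
    Restart,
}

/// Decide the next action from how long the worker has been idle and,
/// if a ping is outstanding, how long it has gone unanswered.
/// `ping_timeout` is an assumed grace period, not a documented setting.
fn next_action(
    idle_for: Duration,
    ping_pending_for: Option<Duration>,
    ping_timeout: Duration,
) -> HealthAction {
    match ping_pending_for {
        Some(waited) if waited >= ping_timeout => HealthAction::Restart,
        Some(_) => HealthAction::Wait,
        None if idle_for >= IDLE_BEFORE_PING => HealthAction::SendPing,
        None => HealthAction::Wait,
    }
}
```

Keeping the decision separate from the code that actually sends pings or restarts workers makes the rule easy to test without Redis or Zinit running.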

### Prerequisites

**Important**: Before running any lifecycle examples or using worker management features, you must start the Zinit daemon:

```bash
# Start Zinit daemon (required for worker lifecycle management)
sudo zinit init

# Or start Zinit with a custom socket path
sudo zinit --socket /var/run/zinit.sock init
```

**Note**: The Supervisor uses Zinit as the process manager for worker lifecycle operations. The default socket path is `/var/run/zinit.sock`, but you can configure a custom path using the `SupervisorBuilder::zinit_socket_path()` method.

**Troubleshooting**: If you get connection errors when running examples, ensure:
1. Zinit daemon is running (`zinit list` should work)
2. The socket path matches between Zinit and your Supervisor configuration
3. You have appropriate permissions to access the Zinit socket
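
The troubleshooting checks above can also be done from Rust before the Supervisor is constructed. The sketch below uses only the standard library (Unix only); `check_zinit_socket` is a hypothetical helper name, not part of the Supervisor crate, and it only tests that the socket exists and accepts a connection.

```rust
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Default Zinit socket path named in this README.
const DEFAULT_ZINIT_SOCKET: &str = "/var/run/zinit.sock";

/// Check that a Unix socket exists at `path` and accepts connections.
/// Returns a human-readable hint on failure.
fn check_zinit_socket(path: &str) -> Result<(), String> {
    // Check 1: the socket file is missing, so the daemon is probably not running
    // or it is listening on a different path.
    if !Path::new(path).exists() {
        return Err(format!("{path}: no such socket; is the Zinit daemon running?"));
    }
    // Check 2: the file exists but connecting fails, typically because of
    // permissions or a stale socket left behind by a stopped daemon.
    UnixStream::connect(path)
        .map(|_stream| ())
        .map_err(|e| format!("{path}: cannot connect ({e}); check permissions and that the daemon is alive"))
}
```

Calling `check_zinit_socket(DEFAULT_ZINIT_SOCKET)` at startup turns a later, less obvious connection error into an immediate message that points at the likely cause.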

### Supervisor API for Worker Lifecycle

The Supervisor provides the following methods for supervising the worker lifecycle:

- **`start_worker()`**: Initializes and starts a specified worker.
- **`stop_worker()`**: Gracefully stops a specified worker.
- **`restart_worker()`**: Restarts a specified worker to ensure it operates correctly.
- **`get_worker_status()`**: Checks the status of a specific worker.
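
To show how these four calls relate to each other, here is an in-memory stand-in. It is only a sketch: the method names come from the list above, but the signatures, the `WorkerStatus` values, and the `MockSupervisor` type are assumptions made for illustration. The real methods talk to Zinit and may be asynchronous and fallible.

```rust
use std::collections::HashMap;

/// Hypothetical status values; the real crate may report different states.
#[derive(Debug, Clone, Copy, PartialEq)]
enum WorkerStatus {
    Running,
    Stopped,
}

/// In-memory stand-in for the lifecycle API. It only updates maps and
/// never starts a process.
#[derive(Default)]
struct MockSupervisor {
    workers: HashMap<String, WorkerStatus>,
    restarts: HashMap<String, u32>,
}

impl MockSupervisor {
    fn start_worker(&mut self, name: &str) {
        self.workers.insert(name.to_string(), WorkerStatus::Running);
    }

    fn stop_worker(&mut self, name: &str) {
        self.workers.insert(name.to_string(), WorkerStatus::Stopped);
    }

    /// Restart is modelled as stop followed by start, plus a counter.
    fn restart_worker(&mut self, name: &str) {
        self.stop_worker(name);
        self.start_worker(name);
        *self.restarts.entry(name.to_string()).or_insert(0) += 1;
    }

    /// `None` means the worker was never started or stopped through this mock.
    fn get_worker_status(&self, name: &str) -> Option<WorkerStatus> {
        self.workers.get(name).copied()
    }
}
```

A mock like this is mainly useful in tests of code that drives the lifecycle, where starting real workers through Zinit would be too heavy.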

## Job Supervision

Jobs are dispatched to workers through their designated Redis queues, and the Supervisor provides an API for comprehensive job supervision.

### Supervisor API for Job Supervision

The Supervisor offers the following methods for handling jobs:

- **`new_job()`**: Creates a new `JobBuilder` for configuring a job.
- **`create_job()`**: Stores a job in Redis.
- **`run_job_and_await_result()`**: Executes a job and waits for its completion.
- **`get_job_status()`**: Checks the current execution status of a job.
- **`get_job_output()`**: Retrieves the results of a completed job.

## Running Examples

The Supervisor includes several examples demonstrating lifecycle management:

```bash
# 1. First, start the Zinit daemon
sudo zinit init

# 2. In another terminal, start Redis (if not already running)
redis-server

# 3. Run the lifecycle demo
cargo run --example simple_lifecycle_demo

# Or run the comprehensive lifecycle demo
cargo run --example lifecycle_demo
```

**Example Configuration**: The examples use these defaults:
- Redis: `redis://localhost:6379`
- Zinit socket: `/var/run/zinit.sock`

You can modify these in the example source code if your setup differs.

### Redis Schema for Job Supervision

Jobs are managed within the `hero:` namespace in Redis:

- **`hero:job:{job_id}`**: Stores job parameters as a Redis hash.
- **`hero:work_queue:{worker_id}`**: Contains worker-specific job queues for dispatching jobs.
- **`hero:reply:{job_id}`**: Dedicated queues for job results.
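
The following sketch shows how the three key patterns fit together, using an in-memory stand-in rather than a Redis client. Only the key formats are taken from the schema above. The `FakeRedis` type, the `script` field name, and the choice of an LPUSH/RPOP-style FIFO queue are assumptions made for illustration and may not match the actual implementation.

```rust
use std::collections::{HashMap, VecDeque};

/// Key builders for the `hero:` namespace described above.
fn job_key(job_id: &str) -> String { format!("hero:job:{job_id}") }
fn work_queue_key(worker_id: &str) -> String { format!("hero:work_queue:{worker_id}") }
fn reply_key(job_id: &str) -> String { format!("hero:reply:{job_id}") }

/// Tiny in-memory stand-in for the two Redis types the schema uses:
/// hashes (job parameters) and lists (queues). Not a Redis client.
#[derive(Default)]
struct FakeRedis {
    hashes: HashMap<String, HashMap<String, String>>,
    lists: HashMap<String, VecDeque<String>>,
}

impl FakeRedis {
    /// Like HSET: set one field of the hash at `key`.
    fn hset(&mut self, key: &str, field: &str, value: &str) {
        self.hashes.entry(key.to_string()).or_default()
            .insert(field.to_string(), value.to_string());
    }
    /// Like LPUSH: push onto the head of the list at `key`.
    fn lpush(&mut self, key: &str, value: &str) {
        self.lists.entry(key.to_string()).or_default().push_front(value.to_string());
    }
    /// Like RPOP: pop from the tail, so LPUSH + RPOP behaves as a FIFO queue.
    fn rpop(&mut self, key: &str) -> Option<String> {
        self.lists.get_mut(key)?.pop_back()
    }
}

/// Supervisor side: store the job hash, then enqueue its id for the worker.
/// The `script` field name is an assumption, not the crate's actual schema.
fn dispatch(db: &mut FakeRedis, worker_id: &str, job_id: &str, script: &str) {
    db.hset(&job_key(job_id), "script", script);
    db.lpush(&work_queue_key(worker_id), job_id);
}

/// Worker side: take the next job id, "run" it, and push a reply.
fn work_once(db: &mut FakeRedis, worker_id: &str) -> Option<String> {
    let job_id = db.rpop(&work_queue_key(worker_id))?;
    let script = db.hashes.get(&job_key(&job_id))?.get("script")?.clone();
    db.lpush(&reply_key(&job_id), &format!("ran: {script}"));
    Some(job_id)
}
```

Because each job has its own reply key, a caller waiting on one job's result only needs to watch `hero:reply:{job_id}` and is not affected by other jobs finishing.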

## Prerequisites

- A Redis server must be accessible to both the Supervisor and the workers.