add some documentation for blue book

docs/.collection (new file, empty)

docs/README.md (new file, 67 lines)

# Horus Documentation

**Hierarchical Orchestration Runtime for Universal Scripts**

Horus is a distributed job execution system with three layers: Coordinator, Supervisor, and Runner.

## Quick Links

- **[Getting Started](./getting-started.md)** - Install and run your first job
- **[Architecture](./architecture.md)** - System design and components
- **[Etymology](./ethymology.md)** - The meaning behind the name

## Components

### Coordinator
Workflow orchestration engine for DAG-based execution.

- [Overview](./coordinator/overview.md)

### Supervisor
Job dispatcher with authentication and routing.

- [Overview](./supervisor/overview.md)
- [Authentication](./supervisor/auth.md)
- [OpenRPC API](./supervisor/openrpc.json)

### Runners
Job executors for different workload types.

- [Runner Overview](./runner/overview.md)
- [Hero Runner](./runner/hero.md) - Heroscript execution
- [SAL Runner](./runner/sal.md) - System operations
- [Osiris Runner](./runner/osiris.md) - Database operations

## Core Concepts

### Jobs
Units of work executed by runners. Each job contains:
- Target runner ID
- Payload (script/command)
- Cryptographic signature
- Optional timeout and environment variables

### Workflows
Multi-step DAGs executed by the Coordinator. Steps can:
- Run in parallel or sequence
- Pass data between steps
- Target different runners
- Handle errors and retries

### Signatures
All jobs must be cryptographically signed:
- Ensures job authenticity
- Prevents tampering
- Enables authorization

## Use Cases

- **Automation**: Execute system tasks and scripts
- **Data Pipelines**: Multi-step ETL workflows
- **CI/CD**: Build, test, and deployment pipelines
- **Infrastructure**: Manage cloud resources and containers
- **Integration**: Connect systems via scripted workflows

## Repository

[git.ourworld.tf/herocode/horus](https://git.ourworld.tf/herocode/horus)

docs/architecture.md (updated: -15/+185 lines)

# Architecture

Horus is a hierarchical orchestration runtime with three layers: Coordinator, Supervisor, and Runner.

## Overview

```
┌─────────────────────────────────────────────────────────┐
│                       Coordinator                       │
│            (Workflow Engine - DAG Execution)            │
│                                                         │
│  • Parses workflow definitions                          │
│  • Resolves dependencies                                │
│  • Dispatches ready steps                               │
│  • Tracks workflow state                                │
└────────────────────┬────────────────────────────────────┘
                     │ OpenRPC (HTTP/Mycelium)
                     │
┌────────────────────▼────────────────────────────────────┐
│                       Supervisor                        │
│            (Job Dispatcher & Authenticator)             │
│                                                         │
│  • Verifies job signatures                              │
│  • Routes jobs to runners                               │
│  • Manages runner registry                              │
│  • Tracks job lifecycle                                 │
└────────────────────┬────────────────────────────────────┘
                     │ Redis Queue Protocol
                     │
┌────────────────────▼────────────────────────────────────┐
│                        Runners                          │
│                    (Job Executors)                      │
│                                                         │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐          │
│   │   Hero   │    │   SAL    │    │  Osiris  │          │
│   │  Runner  │    │  Runner  │    │  Runner  │          │
│   └──────────┘    └──────────┘    └──────────┘          │
└─────────────────────────────────────────────────────────┘
```

## Layers

### 1. Coordinator (Optional)
**Purpose:** Workflow orchestration and DAG execution

**Responsibilities:**
- Parse and validate workflow definitions
- Execute DAG-based flows
- Manage step dependencies
- Route jobs to appropriate supervisors
- Handle multi-step workflows

**Use When:**
- You need multi-step workflows
- Jobs have dependencies
- Parallel execution is required
- You are building complex data pipelines

[→ Coordinator Documentation](./coordinator/overview.md)

### 2. Supervisor (Required)
**Purpose:** Job admission, authentication, and routing

**Responsibilities:**
- Receive jobs via OpenRPC interface
- Verify cryptographic signatures
- Route jobs to appropriate runners
- Manage runner registry
- Track job status and results

**Features:**
- OpenRPC API for job management
- HTTP and Mycelium transport
- Signature-based authentication
- Runner health monitoring

[→ Supervisor Documentation](./supervisor/overview.md)

### 3. Runners (Required)
**Purpose:** Execute actual job workloads

**Available Runners:**
- **Hero Runner**: Executes heroscripts via Hero CLI
- **SAL Runner**: System operations (OS, K8s, cloud, etc.)
- **Osiris Runner**: Database operations with Rhai scripts

**Common Features:**
- Redis queue-based job polling
- Signature verification
- Timeout support
- Environment variable handling

[→ Runner Documentation](./runner/overview.md)

## Communication Protocols

### Client ↔ Coordinator
- **Protocol:** OpenRPC
- **Transport:** HTTP or Mycelium
- **Operations:** Submit workflow, check status, retrieve results

### Coordinator ↔ Supervisor
- **Protocol:** OpenRPC
- **Transport:** HTTP or Mycelium
- **Operations:** Create job, get status, retrieve logs

### Supervisor ↔ Runner
- **Protocol:** Redis Queue
- **Transport:** Redis pub/sub and lists
- **Operations:** Push job, poll queue, store result
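
For illustration, here is a minimal sketch of the supervisor side of this queue protocol, using the `redis` crate and the key names documented in the [runner docs](./runner/overview.md). The JSON job encoding is an assumption, not the confirmed wire format.

```rust
// Hypothetical sketch of the supervisor-side queue protocol.
use redis::Commands;

fn dispatch_job(conn: &mut redis::Connection, runner_id: &str, job_json: &str) -> redis::RedisResult<()> {
    // Push the serialized job onto the runner's queue.
    let _: () = conn.lpush(format!("runner:{}:jobs", runner_id), job_json)?;
    Ok(())
}

fn fetch_result(conn: &mut redis::Connection, job_id: &str) -> redis::RedisResult<Option<String>> {
    // Results appear under a per-job key once the runner finishes.
    conn.get(format!("job:{}:result", job_id))
}

fn main() -> redis::RedisResult<()> {
    let client = redis::Client::open("redis://127.0.0.1:6379")?;
    let mut conn = client.get_connection()?;
    dispatch_job(&mut conn, "my-runner", r#"{"id":"job-1","payload":"print('hi')"}"#)?;
    println!("{:?}", fetch_result(&mut conn, "job-1")?);
    Ok(())
}
```
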
## Job Flow

### Simple Job (No Coordinator)
```
1. Client → Supervisor: create_job()
2. Supervisor: Verify signature
3. Supervisor → Redis: Push to runner queue
4. Runner ← Redis: Pop job
5. Runner: Execute job
6. Runner → Redis: Store result
7. Client ← Supervisor: get_job_result()
```

### Workflow (With Coordinator)
```
1. Client → Coordinator: submit_workflow()
2. Coordinator: Parse DAG
3. Coordinator: Identify ready steps
4. Coordinator → Supervisor: create_job() for each ready step
5. Supervisor → Runner: Route via Redis
6. Runner: Execute and return result
7. Coordinator: Update workflow state
8. Coordinator: Dispatch next ready steps
9. Repeat until workflow complete
```

## Security Model

### Authentication
- Jobs must be cryptographically signed
- Signatures verified at Supervisor layer
- Public key infrastructure for identity

### Authorization
- Runners only execute signed jobs
- Signature verification before execution
- Untrusted jobs rejected

### Transport Security
- Optional TLS for HTTP transport
- End-to-end encryption via Mycelium
- No plaintext credentials

[→ Authentication Details](./supervisor/auth.md)

## Deployment Patterns

### Minimal Setup
```
Redis + Supervisor + Runner(s)
```
Single machine, simple job execution.

### Distributed Setup
```
Redis Cluster + Multiple Supervisors + Runner Pool
```
High availability, load balancing.

### Full Orchestration
```
Coordinator + Multiple Supervisors + Runner Pool
```
Complex workflows, multi-step pipelines.

## Design Principles

1. **Hierarchical**: Clear separation of concerns across layers
2. **Secure**: Signature-based authentication throughout
3. **Scalable**: Horizontal scaling at each layer
4. **Observable**: Comprehensive logging and status tracking
5. **Flexible**: Multiple runners for different workload types

docs/coordinator/overview.md (new file, 145 lines)

# Coordinator Overview

The Coordinator is the workflow orchestration layer in Horus. It executes DAG-based flows by managing job dependencies and dispatching ready steps to supervisors.

## Architecture

```
Client → Coordinator → Supervisor(s) → Runner(s)
```

## Responsibilities

### 1. **Workflow Management**
- Parse and validate DAG workflow definitions
- Track workflow execution state
- Manage step dependencies

### 2. **Job Orchestration**
- Determine which steps are ready to execute
- Dispatch jobs to appropriate supervisors
- Handle step failures and retries

### 3. **Dependency Resolution**
- Track step completion
- Resolve data dependencies between steps
- Pass outputs from completed steps to dependent steps

### 4. **Multi-Supervisor Coordination**
- Route jobs to specific supervisors
- Handle supervisor failures
- Load balance across supervisors

## Workflow Definition

Workflows are defined as Directed Acyclic Graphs (DAGs):

```yaml
workflow:
  name: "data-pipeline"
  steps:
    - id: "fetch"
      runner: "hero"
      payload: "!!http.get url:'https://api.example.com/data'"

    - id: "process"
      runner: "sal"
      depends_on: ["fetch"]
      payload: |
        let data = input.fetch;
        let processed = process_data(data);
        processed

    - id: "store"
      runner: "osiris"
      depends_on: ["process"]
      payload: |
        let model = osiris.model("results");
        model.create(input.process);
```

## Features

### DAG Execution
- Parallel execution of independent steps
- Sequential execution of dependent steps
- Automatic dependency resolution

### Error Handling
- Step-level retry policies
- Workflow-level error handlers
- Partial workflow recovery

### Data Flow
- Pass outputs between steps
- Transform data between steps
- Aggregate results from parallel steps

### Monitoring
- Real-time workflow status
- Step-level progress tracking
- Execution metrics and logs

## Workflow Lifecycle

1. **Submission**: Client submits workflow definition
2. **Validation**: Coordinator validates DAG structure
3. **Scheduling**: Determine ready steps (no pending dependencies; see the sketch below)
4. **Dispatch**: Send jobs to supervisors
5. **Tracking**: Monitor step completion
6. **Progression**: Execute next ready steps
7. **Completion**: Workflow finishes when all steps complete
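
To make step 3 concrete, here is a small, illustrative sketch of ready-step resolution over a dependency map. The function and types are hypothetical, not the Coordinator's actual internals.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical helper: given each step's dependencies and the set of
/// completed steps, return the steps ready to dispatch (all dependencies
/// satisfied, step itself not yet completed).
fn ready_steps(
    deps: &HashMap<String, Vec<String>>,
    completed: &HashSet<String>,
) -> Vec<String> {
    deps.iter()
        .filter(|(id, _)| !completed.contains(*id))
        .filter(|(_, ds)| ds.iter().all(|d| completed.contains(d)))
        .map(|(id, _)| id.clone())
        .collect()
}

fn main() {
    // The "data-pipeline" example above: fetch → process → store.
    let deps: HashMap<String, Vec<String>> = HashMap::from([
        ("fetch".into(), vec![]),
        ("process".into(), vec!["fetch".into()]),
        ("store".into(), vec!["process".into()]),
    ]);
    let completed: HashSet<String> = HashSet::from(["fetch".into()]);
    println!("{:?}", ready_steps(&deps, &completed)); // ["process"]
}
```
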
## Use Cases

### Data Pipelines
```
Extract → Transform → Load
```

### CI/CD Workflows
```
Build → Test → Deploy
```

### Multi-Stage Processing
```
Fetch Data → Process → Validate → Store → Notify
```

### Parallel Execution
```
        ┌─ Task A ─┐
Start ──┼─ Task B ─┼── Aggregate → Finish
        └─ Task C ─┘
```

## Configuration

```bash
# Start coordinator
coordinator --port 9090 --redis-url redis://localhost:6379

# With multiple supervisors
coordinator --port 9090 \
  --supervisor http://supervisor1:8080 \
  --supervisor http://supervisor2:8080
```

## API

The Coordinator exposes an OpenRPC API:

- `submit_workflow`: Submit a new workflow
- `get_workflow_status`: Check workflow progress
- `list_workflows`: List all workflows
- `cancel_workflow`: Stop a running workflow
- `get_workflow_logs`: Retrieve execution logs
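
As a hedged illustration of calling this API over HTTP, here is a sketch using the `jsonrpsee` client crate. The parameter and return shapes for `submit_workflow` are assumptions; consult the OpenRPC spec for the real schema.

```rust
// Illustrative only: assumes the coordinator accepts the workflow YAML as a
// single string parameter, which is NOT confirmed by this documentation.
use jsonrpsee::core::client::ClientT;
use jsonrpsee::http_client::HttpClientBuilder;
use jsonrpsee::rpc_params;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = HttpClientBuilder::default().build("http://localhost:9090")?;

    let workflow_yaml = std::fs::read_to_string("data-pipeline.yaml")?;

    // Method names come from the API list above.
    let workflow_id: String = client
        .request("submit_workflow", rpc_params![workflow_yaml])
        .await?;
    let status: serde_json::Value = client
        .request("get_workflow_status", rpc_params![&workflow_id])
        .await?;
    println!("workflow {workflow_id}: {status}");
    Ok(())
}
```
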
## Advantages

- **Declarative**: Define what to do, not how
- **Scalable**: Parallel execution across multiple supervisors
- **Resilient**: Automatic retry and error handling
- **Observable**: Real-time status and logging
- **Composable**: Reuse workflows as steps in larger workflows

docs/getting-started.md (new file, 186 lines)

# Getting Started with Horus

Quick start guide to running your first Horus job.

## Prerequisites

- Redis server running
- Rust toolchain installed
- Horus repository cloned

## Installation

### Build from Source

```bash
# Clone repository
git clone https://git.ourworld.tf/herocode/horus
cd horus

# Build all components
cargo build --release

# Binaries will be in target/release/
```

## Quick Start

### 1. Start Redis

```bash
# Using Docker
docker run -d -p 6379:6379 redis:latest

# Or install locally
redis-server
```

### 2. Start a Runner

```bash
# Start Hero runner
./target/release/herorunner my-runner

# Or SAL runner
./target/release/runner_sal my-sal-runner

# Or Osiris runner
./target/release/runner_osiris my-osiris-runner
```

### 3. Start the Supervisor

```bash
./target/release/supervisor --port 8080
```

### 4. Submit a Job

Using the Supervisor client:

```rust
use hero_supervisor_client::SupervisorClient;
use hero_job::Job;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = SupervisorClient::new("http://localhost:8080")?;

    let job = Job::new(
        "my-runner",
        "print('Hello from Horus!')".to_string(),
    );

    let result = client.create_job(job).await?;
    println!("Job ID: {}", result.id);

    Ok(())
}
```

## Example Workflows

### Simple Heroscript Execution

```heroscript
print("Hello World")
!!git.list
```

### SAL System Operation

```rhai
// List files in directory
let files = os.list_dir("/tmp");
for file in files {
    print(file);
}
```

### Osiris Data Storage

```rhai
// Store user data
let users = osiris.model("users");
let user = users.create(#{
    name: "Alice",
    email: "alice@example.com"
});
print(`Created user: ${user.id}`);
```

## Architecture Overview

```
┌──────────────┐
│ Coordinator  │  (Optional: for workflows)
└──────┬───────┘
       │
┌──────▼───────┐
│  Supervisor  │  (Job dispatcher)
└──────┬───────┘
       │
       │ Redis
       │
┌──────▼───────┐
│   Runners    │  (Job executors)
│   - Hero     │
│   - SAL      │
│   - Osiris   │
└──────────────┘
```

## Next Steps

- [Architecture Details](./architecture.md)
- [Runner Documentation](./runner/overview.md)
- [Supervisor API](./supervisor/overview.md)
- [Coordinator Workflows](./coordinator/overview.md)
- [Authentication](./supervisor/auth.md)

## Common Issues

### Runner Not Receiving Jobs

1. Check Redis connection
2. Verify runner ID matches job target
3. Check supervisor logs

### Job Signature Verification Failed

1. Ensure job is properly signed
2. Verify public key is registered
3. Check signature format

### Timeout Errors

1. Increase job timeout value
2. Check runner resource availability
3. Optimize job payload

## Development

### Running Tests

```bash
# All tests
cargo test

# Specific component
cargo test -p hero-supervisor
cargo test -p runner-hero
```

### Debug Mode

```bash
# Enable debug logging
RUST_LOG=debug ./target/release/supervisor --port 8080
```

## Support

- Documentation: [docs.ourworld.tf/horus](https://docs.ourworld.tf/horus)
- Repository: [git.ourworld.tf/herocode/horus](https://git.ourworld.tf/herocode/horus)
- Issues: Report on the repository

docs/job-format.md (new file, 179 lines)

# Job Format

Jobs are the fundamental unit of work in Horus.

## Structure

```rust
pub struct Job {
    pub id: String,                        // Unique job identifier
    pub runner_id: String,                 // Target runner ID
    pub payload: String,                   // Job payload (script/command)
    pub timeout: Option<u64>,              // Timeout in seconds
    pub env_vars: HashMap<String, String>, // Environment variables
    pub signatures: Vec<Signature>,        // Cryptographic signatures
    pub created_at: i64,                   // Creation timestamp
    pub status: JobStatus,                 // Current status
}
```

## Job Status

```rust
pub enum JobStatus {
    Pending,   // Queued, not yet started
    Running,   // Currently executing
    Completed, // Finished successfully
    Failed,    // Execution failed
    Timeout,   // Exceeded timeout
    Cancelled, // Manually cancelled
}
```

## Signature Format

```rust
pub struct Signature {
    pub public_key: String, // Signer's public key
    pub signature: String,  // Cryptographic signature
    pub algorithm: String,  // Signature algorithm (e.g., "ed25519")
}
```

## Creating a Job

### Minimal Job

```rust
use hero_job::Job;

let job = Job::new(
    "my-runner",
    "print('Hello World')".to_string(),
);
```

### With Timeout

```rust
let job = Job::builder()
    .runner_id("my-runner")
    .payload("long_running_task()")
    .timeout(300) // 5 minutes
    .build();
```

### With Environment Variables

```rust
use std::collections::HashMap;

let mut env_vars = HashMap::new();
env_vars.insert("API_KEY".to_string(), "secret".to_string());
env_vars.insert("ENV".to_string(), "production".to_string());

let job = Job::builder()
    .runner_id("my-runner")
    .payload("deploy_app()")
    .env_vars(env_vars)
    .build();
```

### With Signature

```rust
use hero_job::{Job, Signature};

let job = Job::builder()
    .runner_id("my-runner")
    .payload("important_task()")
    .signature(Signature {
        public_key: "ed25519:abc123...".to_string(),
        signature: "sig:xyz789...".to_string(),
        algorithm: "ed25519".to_string(),
    })
    .build();
```

## Payload Format

The payload format depends on the target runner:

### Hero Runner
Heroscript content:
```heroscript
!!git.list
print("Repositories listed")
!!docker.ps
```

### SAL Runner
Rhai script with SAL modules:
```rhai
let files = os.list_dir("/tmp");
for file in files {
    print(file);
}
```

### Osiris Runner
Rhai script with Osiris database:
```rhai
let users = osiris.model("users");
let user = users.create(#{
    name: "Alice",
    email: "alice@example.com"
});
```

## Job Result

```rust
pub struct JobResult {
    pub job_id: String,
    pub status: JobStatus,
    pub output: String,        // Stdout
    pub error: Option<String>, // Stderr or error message
    pub exit_code: Option<i32>,
    pub started_at: Option<i64>,
    pub completed_at: Option<i64>,
}
```

## Best Practices

### Timeouts
- Always set timeouts for jobs
- Default: 60 seconds
- Long-running jobs: Set appropriate timeout
- Infinite jobs: Use separate monitoring

### Environment Variables
- Don't store secrets in env vars in production
- Use vault/secret management instead
- Keep env vars minimal
- Document required variables

### Signatures
- Always sign jobs in production (see the sketch below)
- Use strong algorithms (ed25519)
- Rotate keys regularly
- Store private keys securely
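
For concreteness, here is a hedged sketch of producing such a signature with the `ed25519-dalek` crate (v2 API). Which bytes of the job are canonically signed, and how keys are registered with the supervisor, are assumptions this documentation does not specify.

```rust
// Illustrative only. Assumptions: the payload bytes are what gets signed,
// and keys/signatures are hex-encoded; the real canonical form may differ.
use ed25519_dalek::{Signer, SigningKey};
use rand::rngs::OsRng;

fn main() {
    // Generate a keypair (in practice, load a securely stored private key).
    let signing_key = SigningKey::generate(&mut OsRng);

    let payload = b"important_task()";
    let signature = signing_key.sign(payload);

    // These strings would populate the `Signature` struct above.
    println!("public_key: ed25519:{}", hex::encode(signing_key.verifying_key().to_bytes()));
    println!("signature:  sig:{}", hex::encode(signature.to_bytes()));
}
```
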
### Payloads
- Keep payloads concise
- Validate input data
- Handle errors gracefully
- Log important operations

## Validation

Jobs are validated before execution (a sketch of these checks follows the list):

1. **Structure**: All required fields present
2. **Signature**: Valid cryptographic signature
3. **Runner**: Target runner exists and is available
4. **Payload**: Non-empty payload
5. **Timeout**: Reasonable timeout value

Invalid jobs are rejected before execution.
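
A minimal, hypothetical rendering of that checklist, using the `Job` and `Signature` structs defined above; the error type, timeout bounds, and the two helper stubs are illustrative, not the actual validation code.

```rust
// Hypothetical validation pass mirroring the five rules above.
fn validate(job: &Job) -> Result<(), String> {
    // 1. Structure: required identifiers must be present.
    if job.id.is_empty() || job.runner_id.is_empty() {
        return Err("missing required fields".into());
    }
    // 2. Signature: at least one signature must verify.
    if !job.signatures.iter().any(|s| verify_signature(s, job.payload.as_bytes())) {
        return Err("invalid or missing signature".into());
    }
    // 3. Runner: the target runner must be registered and available.
    if !runner_is_available(&job.runner_id) {
        return Err(format!("runner {} unavailable", job.runner_id));
    }
    // 4. Payload: must be non-empty.
    if job.payload.trim().is_empty() {
        return Err("empty payload".into());
    }
    // 5. Timeout: reject absurd values (bounds here are illustrative).
    if let Some(t) = job.timeout {
        if t == 0 || t > 86_400 {
            return Err("unreasonable timeout".into());
        }
    }
    Ok(())
}

// Stubs standing in for the real signature check and runner registry.
fn verify_signature(_s: &Signature, _msg: &[u8]) -> bool { true }
fn runner_is_available(_id: &str) -> bool { true }
```
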

docs/runner/hero.md (new file, 71 lines)

# Hero Runner

Executes heroscripts using the Hero CLI tool.

## Overview

The Hero runner pipes job payloads directly to `hero run -s` via stdin, making it ideal for executing Hero automation tasks and heroscripts.
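
Conceptually, the execution step looks like the following sketch (using `tokio::process`); the queue and signature plumbing around it is omitted, and this is not the runner's actual code.

```rust
// Illustrative sketch: pipe a heroscript payload to `hero run -s` via stdin
// and enforce the job timeout. Not the runner's actual implementation.
use std::process::Stdio;
use std::time::Duration;
use tokio::io::AsyncWriteExt;
use tokio::process::Command;

async fn run_heroscript(payload: &str, timeout_secs: u64) -> Result<String, Box<dyn std::error::Error>> {
    let mut child = Command::new("hero")
        .args(["run", "-s"])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .kill_on_drop(true) // ensure the process dies if we time out
        .spawn()?;

    // Write the payload to stdin; no temp files touch the filesystem.
    child.stdin.take().unwrap().write_all(payload.as_bytes()).await?;

    // Respect the job timeout; on timeout the dropped child is killed.
    let output = tokio::time::timeout(
        Duration::from_secs(timeout_secs),
        child.wait_with_output(),
    )
    .await??;

    if !output.status.success() {
        return Err(String::from_utf8_lossy(&output.stderr).into_owned().into());
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```
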
## Features

- **Heroscript Execution**: Direct stdin piping to `hero run -s`
- **No Temp Files**: Secure execution without filesystem artifacts
- **Environment Variables**: Full environment variable support
- **Timeout Support**: Respects job timeout settings
- **Signature Verification**: Cryptographic job verification

## Usage

```bash
# Start the runner
herorunner my-hero-runner

# With custom Redis
herorunner my-hero-runner --redis-url redis://custom:6379
```

## Job Payload

The payload should contain the heroscript content:

```heroscript
!!git.list
print("Repositories listed")
!!docker.ps
```

## Examples

### Simple Print
```heroscript
print("Hello from heroscript!")
```

### Hero Actions
```heroscript
!!git.list
!!docker.start name:"myapp"
```

### With Environment Variables
```json
{
  "payload": "print(env.MY_VAR)",
  "env_vars": {
    "MY_VAR": "Hello World"
  }
}
```

## Requirements

- `hero` CLI must be installed and in PATH
- Redis server accessible
- Valid job signatures

## Error Handling

- **Hero CLI Not Found**: Returns error if `hero` command unavailable
- **Timeout**: Kills process if timeout exceeded
- **Non-zero Exit**: Returns error with hero CLI output
- **Invalid Signature**: Rejects job before execution

docs/runner/osiris.md (new file, 142 lines)

# Osiris Runner

Database-backed runner for structured data storage and retrieval.

## Overview

The Osiris runner executes Rhai scripts with access to a model-based database system, enabling structured data operations and persistence.

## Features

- **Rhai Scripting**: Execute Rhai scripts with Osiris database access
- **Model-Based Storage**: Define and use data models
- **CRUD Operations**: Create, read, update, delete records
- **Query Support**: Search and filter data
- **Schema Validation**: Type-safe data operations
- **Transaction Support**: Atomic database operations

## Usage

```bash
# Start the runner
runner_osiris my-osiris-runner

# With custom Redis
runner_osiris my-osiris-runner --redis-url redis://custom:6379
```

## Job Payload

The payload should contain a Rhai script using Osiris operations:

```rhai
// Example: Store data
let model = osiris.model("users");
let user = model.create(#{
    name: "Alice",
    email: "alice@example.com",
    age: 30
});
print(user.id);

// Example: Retrieve data
let found = model.get(user.id);
print(found.name);
```

## Examples

### Create Model and Store Data
```rhai
// Define model
let posts = osiris.model("posts");

// Create record
let post = posts.create(#{
    title: "Hello World",
    content: "First post",
    author: "Alice",
    published: true
});

print(`Created post with ID: ${post.id}`);
```

### Query Data
```rhai
let posts = osiris.model("posts");

// Find by field
let published = posts.find(#{
    published: true
});

for post in published {
    print(post.title);
}
```

### Update Records
```rhai
let posts = osiris.model("posts");

// Get record
let post = posts.get("post-123");

// Update fields
post.content = "Updated content";
posts.update(post);
```

### Delete Records
```rhai
let posts = osiris.model("posts");

// Delete by ID
posts.delete("post-123");
```

### Transactions
```rhai
osiris.transaction(|| {
    let users = osiris.model("users");
    let posts = osiris.model("posts");

    let user = users.create(#{ name: "Bob" });
    let post = posts.create(#{
        title: "Bob's Post",
        author_id: user.id
    });

    // Both operations commit together
});
```

## Data Models

Models are defined dynamically through Rhai scripts:

```rhai
let model = osiris.model("products");

// Model automatically handles:
// - ID generation
// - Timestamps (created_at, updated_at)
// - Schema validation
// - Indexing
```

## Requirements

- Redis server accessible
- Osiris database configured
- Valid job signatures
- Sufficient storage for data operations

## Use Cases

- **Configuration Storage**: Store application configs
- **User Data**: Manage user profiles and preferences
- **Workflow State**: Persist workflow execution state
- **Metrics & Logs**: Store structured logs and metrics
- **Cache Management**: Persistent caching layer

docs/runner/overview.md (new file, 96 lines)

# Runners Overview

Runners are the execution layer in the Horus architecture. They receive jobs from the Supervisor via Redis queues and execute the actual workload.

## Architecture

```
Supervisor → Redis Queue → Runner → Execute Job → Return Result
```

## Available Runners

Horus provides three specialized runners:

### 1. **Hero Runner**
Executes heroscripts using the Hero CLI ecosystem.

**Use Cases:**
- Running Hero automation tasks
- Executing heroscripts from job payloads
- Integration with Hero CLI tools

**Binary:** `herorunner`

[→ Hero Runner Documentation](./hero.md)

### 2. **SAL Runner**
System Abstraction Layer runner for system-level operations.

**Use Cases:**
- OS operations (file, process, network)
- Infrastructure management (Kubernetes, VMs)
- Cloud provider operations (Hetzner)
- Database operations (Redis, Postgres)

**Binary:** `runner_sal`

[→ SAL Runner Documentation](./sal.md)

### 3. **Osiris Runner**
Database-backed runner for data storage and retrieval using Rhai scripts.

**Use Cases:**
- Structured data storage
- Model-based data operations
- Rhai script execution with database access

**Binary:** `runner_osiris`

[→ Osiris Runner Documentation](./osiris.md)

## Common Features

All runners implement the `Runner` trait and provide:

- **Job Execution**: Process jobs from Redis queues
- **Signature Verification**: Verify job signatures before execution
- **Timeout Support**: Respect job timeout settings
- **Environment Variables**: Pass environment variables to jobs
- **Error Handling**: Comprehensive error reporting
- **Logging**: Structured logging for debugging

## Runner Protocol

Runners communicate with the Supervisor using a Redis-based protocol (sketched after this list):

1. **Job Queue**: Supervisor pushes jobs to `runner:{runner_id}:jobs`
2. **Job Processing**: Runner pops job, validates signature, executes
3. **Result Storage**: Runner stores result in `job:{job_id}:result`
4. **Status Updates**: Runner updates job status throughout execution
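
Here is a hedged sketch of the runner side of this protocol, using the `redis` crate with the queue keys named above. The job wire format, and the ID extraction and execution steps, are assumptions represented by stubs.

```rust
// Illustrative runner loop: block-pop jobs, execute, store the result.
// Key names follow the protocol above; the JSON shape is an assumption.
use redis::Commands;

fn run_loop(runner_id: &str) -> redis::RedisResult<()> {
    let client = redis::Client::open("redis://127.0.0.1:6379")?;
    let mut conn = client.get_connection()?;
    let queue = format!("runner:{}:jobs", runner_id);

    loop {
        // BRPOP blocks until a job arrives (timeout 0 = wait forever).
        let (_key, job_json): (String, String) =
            redis::cmd("BRPOP").arg(&queue).arg(0).query(&mut conn)?;

        // Hypothetical steps: verify the signature, then run the payload.
        let job_id = extract_job_id(&job_json);
        let result = execute(&job_json);

        // Store the result where the supervisor expects it.
        let _: () = conn.set(format!("job:{}:result", job_id), result)?;
    }
}

// Stubs standing in for parsing, signature checks, and real execution.
fn extract_job_id(_job_json: &str) -> String { "job-1".into() }
fn execute(_job_json: &str) -> String { "ok".into() }
```
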
## Starting a Runner

```bash
# Hero Runner
herorunner <runner_id> [--redis-url <url>]

# SAL Runner
runner_sal <runner_id> [--redis-url <url>]

# Osiris Runner
runner_osiris <runner_id> [--redis-url <url>]
```

## Configuration

All runners accept:
- `runner_id`: Unique identifier for the runner (required)
- `--redis-url`: Redis connection URL (default: `redis://localhost:6379`)

## Security

- Jobs must be cryptographically signed
- Runners verify signatures before execution
- Untrusted jobs are rejected
- Environment variables should not contain sensitive data in production

docs/runner/sal.md (new file, 123 lines)

# SAL Runner

System Abstraction Layer runner for system-level operations.

## Overview

The SAL runner executes Rhai scripts with access to system abstraction modules for OS operations, infrastructure management, and cloud provider interactions.

## Features

- **Rhai Scripting**: Execute Rhai scripts with SAL modules
- **System Operations**: File, process, and network management
- **Infrastructure**: Kubernetes, VM, and container operations
- **Cloud Providers**: Hetzner and other cloud integrations
- **Database Access**: Redis and Postgres client operations
- **Networking**: Mycelium and network configuration

## Available SAL Modules

### Core Modules
- **sal-os**: Operating system operations
- **sal-process**: Process management
- **sal-text**: Text processing utilities
- **sal-net**: Network operations

### Infrastructure
- **sal-virt**: Virtualization management
- **sal-kubernetes**: Kubernetes cluster operations
- **sal-zinit-client**: Zinit process manager

### Storage & Data
- **sal-redisclient**: Redis operations
- **sal-postgresclient**: PostgreSQL operations
- **sal-vault**: Secret management

### Networking
- **sal-mycelium**: Mycelium network integration

### Cloud Providers
- **sal-hetzner**: Hetzner cloud operations

### Version Control
- **sal-git**: Git repository operations

## Usage

```bash
# Start the runner
runner_sal my-sal-runner

# With custom Redis
runner_sal my-sal-runner --redis-url redis://custom:6379
```

## Job Payload

The payload should contain a Rhai script using SAL modules:

```rhai
// Example: List files
let files = os.list_dir("/tmp");
print(files);

// Example: Process management
let pid = process.spawn("ls", ["-la"]);
let output = process.wait(pid);
print(output);
```

## Examples

### File Operations
```rhai
// Read file
let content = os.read_file("/path/to/file");
print(content);

// Write file
os.write_file("/path/to/output", "Hello World");
```

### Kubernetes Operations
```rhai
// List pods
let pods = k8s.list_pods("default");
for pod in pods {
    print(pod.name);
}
```

### Redis Operations
```rhai
// Set value
redis.set("key", "value");

// Get value
let val = redis.get("key");
print(val);
```

### Git Operations
```rhai
// Clone repository
git.clone("https://github.com/user/repo", "/tmp/repo");

// Get status
let status = git.status("/tmp/repo");
print(status);
```

## Requirements

- Redis server accessible
- System permissions for requested operations
- Valid job signatures
- SAL modules available in runtime

## Security Considerations

- SAL operations have system-level access
- Jobs must be from trusted sources
- Signature verification is mandatory
- Limit runner permissions in production

docs/supervisor/overview.md (new file, 88 lines)

# Supervisor Overview

The Supervisor is the job dispatcher layer in Horus. It receives jobs, verifies signatures, and routes them to appropriate runners.

## Architecture

```
Client → Supervisor → Redis Queue → Runner
```

## Responsibilities

### 1. **Job Admission**
- Receive jobs via OpenRPC interface
- Validate job structure and required fields
- Verify cryptographic signatures

### 2. **Authentication & Authorization**
- Verify job signatures using public keys
- Ensure jobs are from authorized sources
- Reject unsigned or invalid jobs

### 3. **Job Routing**
- Route jobs to appropriate runner queues
- Maintain runner registry
- Load balance across available runners

### 4. **Job Management**
- Track job status and lifecycle
- Provide job query and listing APIs
- Store job results and logs

### 5. **Runner Management**
- Register and track available runners
- Monitor runner health and availability
- Handle runner disconnections

## OpenRPC Interface

The Supervisor exposes an OpenRPC API for job management (a client sketch follows the method lists):

### Job Operations
- `create_job`: Submit a new job
- `get_job`: Retrieve job details
- `list_jobs`: List all jobs
- `delete_job`: Remove a job
- `get_job_logs`: Retrieve job execution logs

### Runner Operations
- `register_runner`: Register a new runner
- `list_runners`: List available runners
- `get_runner_status`: Check runner health
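
As an illustration, here is a sketch driving the job operations through the `hero_supervisor_client` crate used in [Getting Started](../getting-started.md). Only `SupervisorClient::new` and `create_job` appear in that guide; the other client methods here are assumed to mirror the RPC names above and may not exist in this form.

```rust
// Illustrative client usage. `get_job` and `get_job_logs` are assumptions
// mirroring the RPC method names listed above, not confirmed client APIs.
use hero_supervisor_client::SupervisorClient;
use hero_job::Job;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = SupervisorClient::new("http://localhost:8080")?;

    // Grounded in the Getting Started guide.
    let job = Job::new("my-runner", "print('status check')".to_string());
    let created = client.create_job(job).await?;

    // Hypothetical follow-up calls mirroring the RPC names.
    let details = client.get_job(&created.id).await?;
    let logs = client.get_job_logs(&created.id).await?;
    println!("{:?}\n{}", details, logs);
    Ok(())
}
```
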
## Job Lifecycle

1. **Submission**: Client submits job via OpenRPC
2. **Validation**: Supervisor validates structure and signature
3. **Queueing**: Job pushed to runner's Redis queue
4. **Execution**: Runner processes job
5. **Completion**: Result stored in Redis
6. **Retrieval**: Client retrieves result via OpenRPC

## Transport Options

The Supervisor supports multiple transport layers:

- **HTTP**: Standard HTTP/HTTPS transport
- **Mycelium**: Peer-to-peer encrypted transport

## Configuration

```bash
# Start supervisor
supervisor --port 8080 --redis-url redis://localhost:6379

# With Mycelium
supervisor --port 8080 --mycelium --redis-url redis://localhost:6379
```

## Security

- All jobs must be cryptographically signed
- Signatures verified before job admission
- Public key infrastructure for identity
- Optional TLS for HTTP transport
- End-to-end encryption via Mycelium

[→ Authentication Documentation](./auth.md)