Actor Lifecycle Management
The Hero Supervisor manages the full lifecycle of its actors, using Zinit as the process manager. Through it, the supervisor can start and stop actor processes, monitor their health, and scale the number of running instances.
Overview
The lifecycle management system provides:
- Actor Process Management: Start, stop, restart, and monitor actor binaries
- Health Monitoring: Automatic ping jobs every 10 minutes for idle actors
- Graceful Shutdown: Clean termination of actor processes
Architecture
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Supervisor    │    │  ActorLifecycle  │    │      Zinit      │
│                 │◄──►│     Manager      │◄──►│    (Process     │
│ (Job Dispatch)  │    │                  │    │    Manager)     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                      │
         │                       │                      │
         ▼                       ▼                      ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│      Redis      │    │  Health Monitor  │    │ Actor Binaries  │
│   (Job Queue)   │    │   (Ping Jobs)    │    │  (OSIS/SAL/V)   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
Components
ActorConfig
Defines the configuration for an actor binary:
use hero_supervisor::{ActorConfig, ScriptType};
use std::collections::HashMap;
use std::path::PathBuf;

let config = ActorConfig::new(
    "osis_actor_0".to_string(),
    PathBuf::from("/usr/local/bin/osis_actor"),
    ScriptType::OSIS,
)
.with_args(vec![
    "--redis-url".to_string(),
    "redis://localhost:6379".to_string(),
    "--actor-id".to_string(),
    "osis_actor_0".to_string(),
])
.with_env({
    let mut env = HashMap::new();
    env.insert("RUST_LOG".to_string(), "info".to_string());
    env.insert("ACTOR_TYPE".to_string(), "osis".to_string());
    env
})
.with_health_check("/usr/local/bin/osis_actor --health-check".to_string())
.with_dependencies(vec!["redis".to_string()]);
ActorLifecycleManager
Main component for managing actor lifecycles:
use hero_supervisor::{ActorLifecycleManagerBuilder, SupervisorBuilder};

let supervisor = SupervisorBuilder::new()
    .redis_url("redis://localhost:6379")
    .caller_id("my_supervisor")
    .context_id("production")
    .build()?;

// osis_actor_config, sal_actor_config and v_actor_config are ActorConfig
// values built as shown in the ActorConfig section.
let mut lifecycle_manager = ActorLifecycleManagerBuilder::new("/var/run/zinit.sock".to_string())
    .with_supervisor(supervisor.clone())
    .add_actor(osis_actor_config)
    .add_actor(sal_actor_config)
    .add_actor(v_actor_config)
    .build();
Supported Script Types
The lifecycle manager supports all Hero script types:
- OSIS: Rhai/HeroScript execution actors
- SAL: System Abstraction Layer actors
- V: HeroScript execution in V language
- Python: HeroScript execution in Python
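As a rough illustration of how a script type could be tied to its actor binary, the sketch below uses a local stand-in enum rather than the real `hero_supervisor::ScriptType`, whose exact definition is not shown in this document. The `default_binary` helper is hypothetical; the paths are the ones listed under Prerequisites.

```rust
use std::path::PathBuf;

/// Local stand-in for `hero_supervisor::ScriptType` (assumption: the real
/// enum may differ in variants and derives).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScriptType {
    OSIS,
    SAL,
    V,
    Python,
}

/// Hypothetical helper: default binary location for each script type,
/// using the paths listed under Prerequisites.
fn default_binary(script_type: ScriptType) -> PathBuf {
    let name = match script_type {
        ScriptType::OSIS => "osis_actor",
        ScriptType::SAL => "sal_actor",
        ScriptType::V => "v_actor",
        ScriptType::Python => "python_actor",
    };
    PathBuf::from("/usr/local/bin").join(name)
}
```

Because the `match` is exhaustive, adding a new variant to the enum forces the mapping to be updated at compile time.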
Key Features
1. Actor Management
// Start all configured actors
lifecycle_manager.start_all_actors().await?;
// Stop all actors
lifecycle_manager.stop_all_actors().await?;
// Restart specific actor
lifecycle_manager.restart_actor("osis_actor_0").await?;
// Get actor status
let status = lifecycle_manager.get_actor_status("osis_actor_0").await?;
println!("Actor state: {:?}, PID: {}", status.state, status.pid);
2. Health Monitoring
The system automatically monitors actor health:
- Tracks last job execution time for each actor
- Sends ping jobs to actors idle for 10+ minutes
- Restarts actors that fail ping checks 3 times
- Updates job times when actors receive tasks
// Manual health check
lifecycle_manager.monitor_actor_health().await?;
// Update job time (called automatically by supervisor)
lifecycle_manager.update_actor_job_time("osis_actor_0");
// Start continuous health monitoring
lifecycle_manager.start_health_monitoring().await; // Runs forever
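The monitoring rules listed above (ping after 10 idle minutes, restart after 3 failed pings, reset when a job arrives) can be sketched as a small decision function. This is an illustration only, not the crate's implementation: `HealthTracker`, `ActorHealth`, and `HealthAction` are hypothetical names, and the sketch assumes the three failures are counted consecutively.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Idle time after which an actor is pinged (10 minutes, per the text above).
const IDLE_THRESHOLD: Duration = Duration::from_secs(10 * 60);
/// Failed pings after which an actor is restarted.
const MAX_FAILED_PINGS: u32 = 3;

/// What the monitor decides to do for one actor on one pass.
#[derive(Debug, PartialEq, Eq)]
enum HealthAction {
    Nothing,
    Ping,
    Restart,
}

/// Hypothetical per-actor bookkeeping; not the crate's actual type.
struct ActorHealth {
    last_job: Instant,
    failed_pings: u32,
}

#[derive(Default)]
struct HealthTracker {
    actors: HashMap<String, ActorHealth>,
}

impl HealthTracker {
    /// Record that an actor received a job at `now`; this also clears failures.
    fn record_job(&mut self, actor_id: &str, now: Instant) {
        self.actors.insert(
            actor_id.to_string(),
            ActorHealth { last_job: now, failed_pings: 0 },
        );
    }

    /// Record a failed ping for an actor that is already tracked.
    fn record_ping_failure(&mut self, actor_id: &str) {
        if let Some(health) = self.actors.get_mut(actor_id) {
            health.failed_pings += 1;
        }
    }

    /// Decide the next step for an actor at time `now`.
    /// Unknown actors yield `HealthAction::Nothing`.
    fn decide(&self, actor_id: &str, now: Instant) -> HealthAction {
        match self.actors.get(actor_id) {
            Some(h) if h.failed_pings >= MAX_FAILED_PINGS => HealthAction::Restart,
            Some(h) if now.duration_since(h.last_job) >= IDLE_THRESHOLD => HealthAction::Ping,
            _ => HealthAction::Nothing,
        }
    }
}
```

Passing `now` in as a parameter, rather than calling `Instant::now()` inside `decide`, keeps the policy testable without waiting for real time to pass.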
3. Dynamic Scaling
Scale actors up or down based on demand:
// Scale OSIS actors to 5 instances
lifecycle_manager.scale_actors(&ScriptType::OSIS, 5).await?;
// Scale down SAL actors to 1 instance
lifecycle_manager.scale_actors(&ScriptType::SAL, 1).await?;
// Check current running count
let count = lifecycle_manager.get_running_actor_count(&ScriptType::V).await;
println!("Running V actors: {}", count);
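To make the effect of a scaling request concrete, the following sketch computes which instances would need to be started or stopped to reach a target count. It is a hypothetical helper, not part of `hero_supervisor`, and it assumes instances are named `<prefix>_<index>` (as in `osis_actor_0`) and indexed from 0 without gaps.

```rust
use std::cmp::Ordering;

/// What has to change to reach a target instance count.
#[derive(Debug, PartialEq, Eq)]
enum ScalePlan {
    Start(Vec<String>),
    Stop(Vec<String>),
    Unchanged,
}

/// Hypothetical helper: plan the change from `current` to `target` instances.
/// Assumes instances are named `<prefix>_<index>` and indexed from 0.
fn plan_scaling(prefix: &str, current: usize, target: usize) -> ScalePlan {
    match target.cmp(&current) {
        // Scale up: start the missing indices in ascending order.
        Ordering::Greater => ScalePlan::Start(
            (current..target).map(|i| format!("{}_{}", prefix, i)).collect(),
        ),
        // Scale down: stop the highest indices first.
        Ordering::Less => ScalePlan::Stop(
            (target..current).rev().map(|i| format!("{}_{}", prefix, i)).collect(),
        ),
        Ordering::Equal => ScalePlan::Unchanged,
    }
}
```

Stopping the highest indices first keeps the remaining instance names contiguous, which is one simple way to make later scale-ups predictable.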
4. Service Dependencies
Actors can depend on other services:
let config = ActorConfig::new(name, binary, script_type)
    .with_dependencies(vec![
        "redis".to_string(),
        "database".to_string(),
        "auth_service".to_string(),
    ]);
Zinit ensures dependencies start before the actor.
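Dependency-ordered startup amounts to a topological sort of the service graph. The sketch below shows one way to compute such an order with a depth-first traversal; it illustrates the idea only and is not Zinit's actual algorithm. The `start_order` function and its input shape are assumptions made for this example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Return service names in an order where every service comes after its
/// dependencies, or `Err` with the name at which a cycle was detected.
/// Services that appear only as dependencies are included in the result.
fn start_order(deps: &BTreeMap<&str, Vec<&str>>) -> Result<Vec<String>, String> {
    fn visit(
        name: &str,
        deps: &BTreeMap<&str, Vec<&str>>,
        done: &mut BTreeSet<String>,
        in_progress: &mut BTreeSet<String>,
        order: &mut Vec<String>,
    ) -> Result<(), String> {
        if done.contains(name) {
            return Ok(());
        }
        if !in_progress.insert(name.to_string()) {
            // The service is already on the current path: a cycle.
            return Err(name.to_string());
        }
        for dep in deps.get(name).into_iter().flatten() {
            visit(dep, deps, done, in_progress, order)?;
        }
        in_progress.remove(name);
        done.insert(name.to_string());
        order.push(name.to_string());
        Ok(())
    }

    let mut done = BTreeSet::new();
    let mut in_progress = BTreeSet::new();
    let mut order = Vec::new();
    for name in deps.keys() {
        visit(name, deps, &mut done, &mut in_progress, &mut order)?;
    }
    Ok(order)
}
```

A cycle is reported as an error rather than silently broken, since a service that transitively depends on itself can never satisfy a "start after" constraint.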
Integration with Supervisor
The lifecycle manager is built with a handle to the supervisor, which then routes jobs to the actors the manager has started:
use hero_supervisor::{ActorLifecycleManagerBuilder, ScriptType, SupervisorBuilder};

// Create supervisor and lifecycle manager
let zinit_socket = "/var/run/zinit.sock".to_string();
let supervisor = SupervisorBuilder::new().build()?;
let mut lifecycle_manager = ActorLifecycleManagerBuilder::new(zinit_socket)
    .with_supervisor(supervisor.clone())
    .build();

// Start actors
lifecycle_manager.start_all_actors().await?;

// Create and execute jobs (supervisor automatically routes to actors)
let job = supervisor
    .new_job()
    .script_type(ScriptType::OSIS)
    .script_content("println!(\"Hello World!\");".to_string())
    .build()?;
let result = supervisor.run_job_and_await_result(&job).await?;
println!("Job result: {}", result);
Zinit Service Configuration
The lifecycle manager automatically creates Zinit service configurations:
# Generated service config for osis_actor_0
exec: "/usr/local/bin/osis_actor --redis-url redis://localhost:6379 --actor-id osis_actor_0"
test: "/usr/local/bin/osis_actor --health-check"
oneshot: false # Restart on exit
after:
  - redis
env:
  RUST_LOG: "info"
  ACTOR_TYPE: "osis"
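For readers who want to see how such a file could be produced from an actor's settings, here is a minimal sketch that renders the same YAML layout as the generated config above. `ServiceSpec` and `render_service_yaml` are hypothetical names, not the lifecycle manager's real types, and the sketch assumes that values contain no characters needing YAML escaping.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of an actor's service settings.
struct ServiceSpec {
    exec: String,
    test: Option<String>,
    after: Vec<String>,
    env: BTreeMap<String, String>,
}

/// Render the spec in the YAML layout of the generated config above.
/// Assumption: no value contains quotes or other characters that would
/// need escaping; a real implementation would use a YAML serializer.
fn render_service_yaml(spec: &ServiceSpec) -> String {
    let mut out = format!("exec: \"{}\"\n", spec.exec);
    if let Some(test) = &spec.test {
        out.push_str(&format!("test: \"{}\"\n", test));
    }
    out.push_str("oneshot: false\n");
    if !spec.after.is_empty() {
        out.push_str("after:\n");
        for dep in &spec.after {
            out.push_str(&format!("  - {}\n", dep));
        }
    }
    if !spec.env.is_empty() {
        out.push_str("env:\n");
        for (key, value) in &spec.env {
            out.push_str(&format!("  {}: \"{}\"\n", key, value));
        }
    }
    out
}
```

A `BTreeMap` is used for the environment so that the rendered keys come out in a stable, sorted order, which keeps generated files diff-friendly.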
Error Handling
Lifecycle operations return a Result, and failures surface as SupervisorError variants that can be matched individually:
use hero_supervisor::SupervisorError;

match lifecycle_manager.start_actor(&config).await {
    Ok(_) => println!("Actor started successfully"),
    Err(SupervisorError::ActorStartFailed(actor, reason)) => {
        eprintln!("Failed to start {}: {}", actor, reason);
    }
    Err(e) => eprintln!("Other error: {}", e),
}
Example Usage
See examples/lifecycle_demo.rs for a comprehensive demonstration:
# Run the lifecycle demo
cargo run --example lifecycle_demo
# Run with custom Redis URL
REDIS_URL=redis://localhost:6379 cargo run --example lifecycle_demo
Prerequisites
- Zinit: Install and run Zinit process manager
  curl https://raw.githubusercontent.com/threefoldtech/zinit/refs/heads/master/install.sh | bash
  zinit init --config /etc/zinit/ --socket /var/run/zinit.sock
- Redis: Running Redis instance for job queues
  redis-server
- Actor Binaries: Compiled actor binaries for each script type
  /usr/local/bin/osis_actor
  /usr/local/bin/sal_actor
  /usr/local/bin/v_actor
  /usr/local/bin/python_actor
Configuration Best Practices
- Resource Limits: Configure appropriate resource limits in Zinit
- Health Checks: Implement meaningful health check commands
- Dependencies: Define proper service dependencies
- Environment: Set appropriate environment variables
- Logging: Configure structured logging for debugging
- Monitoring: Use health monitoring for production deployments
Troubleshooting
Common Issues
- Zinit Connection Failed
  - Ensure Zinit is running: ps aux | grep zinit
  - Check socket permissions: ls -la /var/run/zinit.sock
  - Verify socket path in configuration
- Actor Start Failed
  - Check binary exists and is executable
  - Verify dependencies are running
  - Review Zinit logs: zinit logs <service-name>
- Health Check Failures
  - Implement proper health check endpoint in actors
  - Verify health check command syntax
  - Check actor responsiveness
- Redis Connection Issues
  - Ensure Redis is running and accessible
  - Verify Redis URL configuration
  - Check network connectivity
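The "binary exists and is executable" check under Actor Start Failed can also be done programmatically before handing a config to the lifecycle manager. The sketch below is a hypothetical, Unix-only pre-flight helper built on the standard library; it only inspects permission bits and does not verify that the current user is the one allowed to execute the file.

```rust
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Hypothetical pre-flight check (Unix only): the path must resolve to a
/// regular file with at least one execute bit set. Symlinks are followed.
fn is_executable_file(path: &Path) -> bool {
    match fs::metadata(path) {
        Ok(meta) => meta.is_file() && (meta.permissions().mode() & 0o111) != 0,
        Err(_) => false, // missing file, broken symlink, or no access
    }
}
```

Calling it with a path such as `/usr/local/bin/osis_actor` before starting an actor turns a vague start failure into an explicit "binary missing or not executable" message.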
Debug Commands
# Check Zinit status
zinit list
# View service logs
zinit logs osis_actor_0
# Check service status
zinit status osis_actor_0
# Monitor Redis queues
redis-cli keys "hero:job:*"
Performance Considerations
- Scaling: Start with minimal actors and scale based on queue depth
- Health Monitoring: Adjust ping intervals based on workload patterns
- Resource Usage: Monitor CPU/memory usage of actor processes
- Queue Depth: Monitor Redis queue lengths for scaling decisions
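One simple way to turn queue depth into a scaling decision is a clamped ratio: one actor per fixed number of queued jobs, bounded by a minimum and a maximum. The sketch below is an assumed policy for illustration, not something the lifecycle manager provides; `target_actor_count` and its parameters are hypothetical, and the result would be passed to a call such as `scale_actors`.

```rust
/// Hypothetical policy: one actor per `jobs_per_actor` queued jobs,
/// clamped to the range `min_actors..=max_actors`.
///
/// Panics if `jobs_per_actor` is zero or `min_actors > max_actors`.
fn target_actor_count(
    queue_depth: usize,
    jobs_per_actor: usize,
    min_actors: usize,
    max_actors: usize,
) -> usize {
    assert!(jobs_per_actor > 0, "jobs_per_actor must be positive");
    assert!(min_actors <= max_actors, "min_actors must not exceed max_actors");
    // Ceiling division: 25 queued jobs at 10 per actor needs 3 actors.
    let wanted = (queue_depth + jobs_per_actor - 1) / jobs_per_actor;
    wanted.clamp(min_actors, max_actors)
}
```

The lower bound matches the advice to start with minimal actors, and the upper bound caps resource usage when the queue spikes.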
Security
- Process Isolation: Zinit provides process isolation
- User Permissions: Run actors with appropriate user permissions
- Network Security: Secure Redis and Zinit socket access
- Binary Validation: Verify actor binary integrity before deployment
Future
- Load Balancing: Dynamic scaling of actors based on demand
- Service Dependencies: Proper startup ordering with dependency management