Lightweight process supervisor with dependency management, simpler than systemd.

hero_proc

A lightweight process supervisor with dependency management, similar to systemd but simpler.

Quick Start

Install and Run

service proc start --update --reset

Use the CLI

hero_proc service list
hero_proc service status my-service
hero_proc service start my-service
hero_proc service stop my-service

The web admin dashboard is available via hero_proc_admin on the admin socket.


Documentation


Features

  • Dependency Graph: Services declare dependencies (requires, after, wants, conflicts)
  • State Machine: Explicit states (Inactive, Blocked, Starting, Running, Stopping, Success, Exited, Failed)
  • Process Groups: Signals sent to process groups, handling sh -c child processes correctly
  • Health Checks: TCP, HTTP, and exec-based health checks with retries
  • Ordered Shutdown: Dependents stop before their dependencies
  • Hot Reload: Reload configuration without full restart
  • Secrets Management: Encrypted secret storage with Forgejo sync (init, pull, push)
  • Scheduled Actions: Cron-based scheduling for recurring tasks
  • PTY Attach: Live terminal attach to running processes via WebSocket
  • Web Admin Dashboard: Real-time service management UI with charts, logs, events, and bulk operations
  • TUI Dashboard: Interactive terminal UI for service management (ratatui-based)
  • Fully Embedded UI: All assets (Bootstrap, Chart.js, icons) compiled into the binary — no CDN or network required
  • OpenRPC API: 92 JSON-RPC 2.0 methods over Unix socket
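As an illustration of the "health checks with retries" idea in the list above, a TCP check could look roughly like the sketch below. This is not hero_proc's implementation; the function name `tcp_healthy` and its parameters are hypothetical, and only the Rust standard library is used.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical TCP health check: try to open a connection, retrying a few
/// times before giving up. `retries` is the number of extra attempts after
/// the first one, so the total number of attempts is `retries + 1`.
pub fn tcp_healthy(addr: SocketAddr, retries: u32, timeout: Duration, pause: Duration) -> bool {
    for attempt in 0..=retries {
        // A successful connect is treated as "healthy"; the stream is dropped
        // (and therefore closed) immediately.
        if TcpStream::connect_timeout(&addr, timeout).is_ok() {
            return true;
        }
        // Wait between attempts, but not after the last one.
        if attempt < retries {
            sleep(pause);
        }
    }
    false
}
```

A caller would pass the service's listen address, for example `tcp_healthy("127.0.0.1:8080".parse().unwrap(), 3, Duration::from_secs(1), Duration::from_millis(500))`.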

Architecture

hero_proc_server (daemon)
    |  unix socket (IPC + JSON-RPC 2.0)
    v
hero_proc (CLI/TUI)    hero_proc_admin (web admin dashboard)
                           |  unix socket (admin.sock)

Crate Structure

crates/
  hero_proc_sdk/       # OpenRPC client SDK: generated client + builders + factory
  hero_proc_server/    # Process supervisor daemon (JSON-RPC 2.0 via Unix socket)
  hero_proc/           # Command-line interface + TUI
  hero_proc_admin/     # Web admin dashboard (Axum + Askama + Bootstrap)
  hero_proc_lib/       # SQLite persistence layer (jobs, runs, secrets, logging, services)
  hero_proc_examples/  # Runnable SDK usage examples
  hero_proc_test/      # Integration test suite + stress tests

Dependency Graph

      hero_proc_sdk (no internal deps)
       ^      ^     ^     ^
       |      |     |     |
    server   CLI    UI   lib

All crates depend on hero_proc_sdk. No cross-dependencies between server, CLI, UI, or lib.

Ports and Sockets

| Component | Binding | Default |
|---|---|---|
| hero_proc_server | Unix socket (IPC) | $PATH_SOCKET/hero_proc/rpc.sock |
| hero_proc_admin | Unix socket (admin) | $PATH_SOCKET/hero_proc/admin.sock |

Core Concepts

| Concept | Role | Lifetime |
|---|---|---|
| Action | Executable template (script + interpreter + config) | Stored, reusable |
| Service | Supervision unit: desired state + auto-restart | Ongoing, supervisor-managed |
| Job | Single execution of an action | Transient |
| Run | Universal grouping unit: groups jobs under a single lifecycle | Transient |

Service

A service is a supervision unit (like a systemd unit). It declares a desired state and references one or more actions. The supervisor continuously reconciles reality with the desired state:

  • start — supervisor ensures the service is running; restarts on crash
  • stop — supervisor ensures the service is stopped
  • ignore — supervisor does not manage this service
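The reconciliation described above can be sketched as a pure decision function. This is a minimal illustration, not the daemon's actual code: the three desired states come from the list above, while the `Reconcile` enum and the `reconcile` function are hypothetical names introduced for this example.

```rust
/// Desired states a service can declare (taken from the list above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DesiredState {
    Start,
    Stop,
    Ignore,
}

/// What the supervisor should do next. Hypothetical names for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Reconcile {
    Spawn,
    Terminate,
    Nothing,
}

/// Compare the desired state with whether the process is currently alive.
pub fn reconcile(desired: DesiredState, is_running: bool) -> Reconcile {
    match (desired, is_running) {
        // Should be running but is not (never started, or crashed): start it.
        (DesiredState::Start, false) => Reconcile::Spawn,
        // Should be stopped but is still alive: stop it.
        (DesiredState::Stop, true) => Reconcile::Terminate,
        // Already converged, or the service is not managed at all.
        _ => Reconcile::Nothing,
    }
}
```

Calling `reconcile` on every supervision tick gives the "restarts on crash" behaviour for free: a crashed service with desired state `Start` simply shows up as not running on the next pass.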

Action

An action is a reusable executable template: a script, its interpreter, environment, timeout, retry policy, and dependency edges. An action can list other actions in depends_on to order execution within a service.

Job

A job is a single execution of an action. Jobs can be one-shot (run and exit) or long-running processes (is_process = true), where exiting is treated as failure. Each job tracks phase (pending -> running -> succeeded/failed), PID, exit code, and logs.
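To make the one-shot versus long-running distinction concrete, the sketch below maps a process exit to a job phase. The phase names and the is_process flag come from the paragraph above; the function `phase_after_exit` is hypothetical, and treating exit code 0 as success for one-shot jobs is an assumption of this example.

```rust
/// Job phases named in the text above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Pending,
    Running,
    Succeeded,
    Failed,
}

/// Decide the phase of a job once its process has exited.
pub fn phase_after_exit(is_process: bool, exit_code: i32) -> Phase {
    if is_process {
        // A long-running process is not supposed to exit at all,
        // so any exit (even with code 0) counts as a failure.
        Phase::Failed
    } else if exit_code == 0 {
        // Assumption: a one-shot job succeeds when it exits with code 0.
        Phase::Succeeded
    } else {
        Phase::Failed
    }
}
```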

Run

A run is the universal execution grouping unit. It serves two roles:

  1. Service run — created automatically when a service is started. Named service_{name}, with service_id pointing back to the owning service. If a service has 3 actions, starting it creates 1 run with 3 jobs.
  2. Ad-hoc run — standalone execution of a set of actions (e.g., build pipelines, one-off tasks). Name is required. service_id is None.

A run can depend on other runs by ID — the supervisor will not start it until all dependency runs have reached "ok". Status progression: created → waiting_deps → starting → running → ok | error | halted.
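The status names and the dependency gate above can be modelled as follows. This is only a sketch of the rule as stated here; the enum and helper names are hypothetical and the real SDK types may differ.

```rust
/// Run statuses in the order given above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunStatus {
    Created,
    WaitingDeps,
    Starting,
    Running,
    Ok,
    Error,
    Halted,
}

impl RunStatus {
    /// ok, error, and halted end the progression.
    pub fn is_terminal(self) -> bool {
        matches!(self, RunStatus::Ok | RunStatus::Error | RunStatus::Halted)
    }
}

/// A run may leave waiting_deps only when every dependency run is ok.
/// A run without dependencies passes trivially.
pub fn deps_satisfied(dependency_statuses: &[RunStatus]) -> bool {
    dependency_statuses.iter().all(|s| *s == RunStatus::Ok)
}
```

Note that a dependency ending in error or halted never satisfies the gate under this reading; how the supervisor then treats the waiting run is not specified in this document.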

Principles

  • Run is the universal grouping unit: both ad-hoc executions and service starts create a Run. The service_id field distinguishes them.
  • Cascade delete: deleting a Run or Service deletes all associated Jobs. A Job belongs to exactly one Run.
  • Clean restart: when a Service is started, previous Jobs for that Service are removed from the database by default (can be overridden).
  • Provenance tracking: each Job records its service_id and action_id so the origin is always traceable.

For the full data model specification, see docs/README.md.

CLI Commands

All CLI commands are organized into subcommand groups:

Service Management

hero_proc service list # List all services
hero_proc service status <name> # Show service status
hero_proc service start <name> # Start a service
hero_proc service stop <name> # Stop (cascades to dependents)
hero_proc service restart <name> # Restart a service
hero_proc service kill <name> # Send signal to service
hero_proc service add <name> # Add a service at runtime
hero_proc service add-job <svc> ... # Add a job to a service
hero_proc service remove <name> # Remove a service
hero_proc service logs <name> # View service logs
hero_proc service why <name> # Show why service is blocked
hero_proc service tree # Show dependency tree

Job Management

hero_proc job list # List jobs
hero_proc job get <id> # Get job details
hero_proc job create ... # Create a job
hero_proc job delete <id> # Delete a job
hero_proc job status <id> # Job status
hero_proc job logs <id> # Job logs
hero_proc job retry <id> # Retry a failed job
hero_proc job cancel <id> # Cancel a running job

Run Tracking

hero_proc run list # List runs
hero_proc run get <id> # Get run details
hero_proc run logs <id> # Run logs
hero_proc run stats # Run statistics

Log Management

hero_proc log query # Query logs
hero_proc log filter # Filter logs
hero_proc log prune # Prune old logs
hero_proc log export # Export logs

Secrets Management

hero_proc secret set <key> <val> # Set a secret
hero_proc secret get <key> # Get a secret
hero_proc secret list # List secrets
hero_proc secret delete <key> # Delete a secret
hero_proc secret init # Initialize secrets store
hero_proc secret pull # Pull secrets from Forgejo
hero_proc secret push # Push secrets to Forgejo

Actions

hero_proc action list # List actions
hero_proc action get <name> # Get action details
hero_proc action set ... # Register an action
hero_proc action delete <name> # Delete an action

Scripts

hero_proc script scan # Scan for scripts
hero_proc script list # List registered scripts
hero_proc script get <name> # Get script details
hero_proc script set ... # Register a script
hero_proc script delete <name> # Delete a script
hero_proc script run <name> # Run a script

System

hero_proc system ping # Check daemon connectivity
hero_proc system health # Server health check
hero_proc system stats # System statistics
hero_proc system shutdown [--force] # Shutdown daemon
hero_proc system reset [--force] # Stop all, delete all configs
hero_proc system wipe # Wipe all data
hero_proc system demo # Create demo services
hero_proc system schedules # List scheduled actions

Debug

hero_proc debug state # Full graph state dump
hero_proc debug procs # Process tree dump

Other

hero_proc attach <name> # Attach to PTY of running process
hero_proc tui # Launch interactive TUI dashboard

Web Admin Dashboard

The hero_proc_admin crate provides a real-time web admin dashboard with tabs for:

  • Actions: Registered actions with interpreter, timeout, and tags
  • Jobs: Job instances with phase, status, and logs; includes statistics
  • Runs: Execution runs with status and job counts
  • Services: Service management, dependencies, and action mappings
  • Secrets: Encrypted configuration values
  • Logs: Query and filter system logs by source, level, and timestamp

All UI assets (Bootstrap 5.3.3, Bootstrap Icons) are embedded in the binary via rust-embed.

# Start server + admin dashboard
service proc start --update --reset

SDK Usage

hero_proc_sdk is builder-first. The four fluent builders (RetryPolicyBuilder, ActionBuilder, ServiceBuilder, RunBuilder) cover everything you do against hero_proc, and a single HeroProcFactory handle (hp) exposes both convenience helpers and the full RPC surface via Deref.

For the full reference see crates/hero_proc_sdk/README.md and BUILDERS.md.

Connect

use hero_proc_sdk::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let hp = hero_proc_factory().await?; // local Unix socket
    let pong = hp.system_ping(SystemPingInput {}).await?;
    println!("server: {}", pong.version);
    Ok(())
}

Remote:

let hp = HeroProcFactory::builder().http("http://10.0.0.1:8080").connect().await?;

Run a long-running daemon — ServiceBuilder

let svc = ServiceBuilder::new("api")
    .action(
        ActionBuilder::new("api", "node server.js")
            .env("PORT", "8080")
            .retry_builder(|b| b.max_attempts(10).delay_ms(2_000).backoff(true))
            .build(),
    )
    .requires(&["postgres"])
    .build();

hp.start_service("api", svc, 60).await?; // register + start + wait

Presets when the full builder is overkill: simple_service, oneshot_service, system_service, sleep_service.
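The retry settings in the ServiceBuilder example (max_attempts, delay_ms, backoff) are not specified further in this document. One common reading, assumed here purely for illustration, is that backoff doubles the delay after each failed attempt. The function `retry_delays_ms` below is hypothetical and is not part of hero_proc_sdk.

```rust
/// Compute the waits (in milliseconds) between attempts.
/// With `max_attempts` attempts there are at most `max_attempts - 1` waits.
/// Assumption: `backoff = true` means the delay doubles each time.
pub fn retry_delays_ms(max_attempts: u32, delay_ms: u64, backoff: bool) -> Vec<u64> {
    (0..max_attempts.saturating_sub(1))
        .map(|i| {
            if backoff {
                // Cap the shift and saturate so large inputs cannot overflow.
                delay_ms.saturating_mul(1u64 << i.min(32))
            } else {
                delay_ms
            }
        })
        .collect()
}
```

Under that assumption, the values from the example (`max_attempts(10)`, `delay_ms(2_000)`, `backoff(true)`) would wait 2 s, 4 s, 8 s, and so on between attempts.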

Submit a one-shot or batch — RunBuilder

// Trivial one-liner
let handle = hp.submit_oneshot("backup", "rsync -av /data /backup").await?;
hp.wait_run(handle.run_id, 300).await?;

// Multi-step batch with concurrency cap, mixed interpreters, auto-cleanup
let handle = RunBuilder::new("daily-checks")
    .max_concurrency(3)
    .add_inline_script_with(
        "row_count",
        "import sqlite3; print(sqlite3.connect('/data/app.db').execute('SELECT COUNT(*) FROM users').fetchone()[0])",
        Interpreter::Python3,
    )
    .add_inline_script_with("big_files", "ls /var/log | where size > 100mb", Interpreter::Nushell)
    .add_inline_script("notify", "curl -X POST https://hooks.example.com/done")
    .submit(&hp)
    .await?;

The supervisor honours max_concurrency (1..=100) per run, walks the actions array in submission order, skips past dependency-blocked jobs, and auto-cleans inline actions when the run reaches ok. Defaults applied automatically when any inline action is present: cap=5, cleanup_on_success=true.
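The scheduling rule in the paragraph above (submission order, skip blocked jobs, respect the cap) can be sketched with plain data structures. The `PendingJob` type and `pick_startable` function are hypothetical and only illustrate the rule as described; they are not the supervisor's code.

```rust
use std::collections::HashSet;

/// A job that has not started yet. Hypothetical type for this sketch.
pub struct PendingJob {
    pub name: &'static str,
    pub depends_on: Vec<&'static str>,
}

/// Pick the jobs that may start right now.
/// `pending` is in submission order, `finished` holds the names of jobs that
/// already completed, and `running` is the number of jobs currently executing.
pub fn pick_startable(
    pending: &[PendingJob],
    finished: &HashSet<&str>,
    running: usize,
    max_concurrency: usize,
) -> Vec<&'static str> {
    // Slots still available under the concurrency cap.
    let free_slots = max_concurrency.saturating_sub(running);
    pending
        .iter()
        // Skip past jobs whose dependencies have not all finished.
        .filter(|job| job.depends_on.iter().all(|dep| finished.contains(*dep)))
        // Never start more jobs than there are free slots.
        .take(free_slots)
        .map(|job| job.name)
        .collect()
}
```

Because blocked jobs are skipped rather than waited on, a later independent job can use a free slot while an earlier one is still waiting for its dependencies.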

Interpreters: Bash (default), Sh, Python3, Node, Bun, Nushell, Exec, Ai, Mcp.

Convenience helpers on hp

hp.wait_run(run_id, secs).await?; // poll to terminal state
hp.wait_job(job_id, secs).await?;
hp.wait_service_running("api", secs).await?;

hp.tail("api", 50).await?; // structured logs by service
hp.job_tail(job_id, 100).await?; // one job's stdout/stderr
hp.search("api.*", 200).await?; // wildcard search
hp.recent_errors(Some("api.*"), 50).await?; // loglevel >= 3

Every generated RPC method is also available directly on hp (~109 methods, type-safe Input/Output structs).

Environment Variables

Required

| Variable | Description |
|---|---|
| WEBROOT | Base URL of the hero_proc admin dashboard (e.g. http://127.0.0.1:9998/). |

Optional

| Variable | Default | Description |
|---|---|---|
| HERO_PROC_LOG_LEVEL | info | Log level: trace, debug, info, warn, error |
| HERO_PROC_CONFIG_DIR | ~/hero/cfg/hero_proc | Service config directory |
| HERO_PROC_SOCKET | $PATH_SOCKET/hero_proc/rpc.sock | Unix socket path |
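The socket default above implies a simple lookup order, sketched below. The function `rpc_socket_path` is hypothetical, and treating an empty variable as unset is an assumption of this example; hero_proc's own resolution logic may differ.

```rust
use std::path::PathBuf;

/// Resolve the RPC socket path from already-read environment values.
/// `hero_proc_socket` is the value of HERO_PROC_SOCKET, `path_socket` the
/// value of PATH_SOCKET; `None` means the variable is not set.
pub fn rpc_socket_path(hero_proc_socket: Option<&str>, path_socket: Option<&str>) -> Option<PathBuf> {
    // An explicit HERO_PROC_SOCKET wins.
    if let Some(explicit) = hero_proc_socket.filter(|s| !s.is_empty()) {
        return Some(PathBuf::from(explicit));
    }
    // Otherwise fall back to the documented default under PATH_SOCKET.
    path_socket
        .filter(|s| !s.is_empty())
        .map(|base| PathBuf::from(base).join("hero_proc").join("rpc.sock"))
}
```

In a real program the arguments would come from the environment, for example `rpc_socket_path(std::env::var("HERO_PROC_SOCKET").ok().as_deref(), std::env::var("PATH_SOCKET").ok().as_deref())`.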

Shutdown Ordering

Services are stopped in reverse dependency order:

Example: database <- app <- worker

Startup order: database -> app -> worker
Shutdown order: worker -> app -> database

When stopping a single service, dependents are stopped first:

  • hero_proc service stop database stops worker, then app, then database
  • Dependencies are NOT auto-stopped (other services may need them)
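The ordering rules above amount to a topological sort and its reverse. The sketch below reproduces the database / app / worker example using only the standard library. All names are hypothetical, and the graph is assumed to be acyclic; this is an illustration of the rule, not the supervisor's implementation.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Maps each service to the services it depends on.
pub type Deps = BTreeMap<&'static str, Vec<&'static str>>;

/// Depth-first post-order: a service is emitted after all its dependencies.
fn visit(name: &'static str, deps: &Deps, seen: &mut BTreeSet<&'static str>, out: &mut Vec<&'static str>) {
    if !seen.insert(name) {
        return; // already placed in the order
    }
    for &dep in deps.get(name).into_iter().flatten() {
        visit(dep, deps, seen, out);
    }
    out.push(name);
}

/// Dependencies first.
pub fn startup_order(deps: &Deps) -> Vec<&'static str> {
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for &name in deps.keys() {
        visit(name, deps, &mut seen, &mut out);
    }
    out
}

/// Reverse of the startup order: dependents first.
pub fn shutdown_order(deps: &Deps) -> Vec<&'static str> {
    let mut order = startup_order(deps);
    order.reverse();
    order
}

/// True if `service` is `target` or depends on it, directly or transitively.
/// Assumes the graph has no cycles.
fn depends_on(deps: &Deps, service: &str, target: &str) -> bool {
    service == target
        || deps.get(service).into_iter().flatten().any(|d| depends_on(deps, d, target))
}

/// Stopping one service: its dependents stop first, then the service itself.
/// Its own dependencies are left running.
pub fn stop_order(deps: &Deps, target: &'static str) -> Vec<&'static str> {
    shutdown_order(deps)
        .into_iter()
        .filter(|s| depends_on(deps, s, target))
        .collect()
}
```

With `app` depending on `database` and `worker` depending on `app`, `stop_order(&deps, "app")` yields `worker` then `app` and leaves `database` untouched, which matches the second bullet above.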

Development

Start / Stop

service proc start --update --reset # Install, build, and start
service proc stop # Graceful shutdown
service proc start --clear # Wipe state and restart fresh

Integration Tests

cargo test --test shutdown -- --nocapture --test-threads=1 # Shutdown tests