hero_proc

A lightweight process supervisor with dependency management, similar to systemd but simpler.

Quick Start

Install and Run

service proc start --update --reset

Use the CLI

hero_proc service list
hero_proc service status my-service
hero_proc service start my-service
hero_proc service stop my-service

The web admin dashboard is available via hero_proc_admin on the admin socket.

Documentation

Data Model Reference: Core concepts, schemas, and data model specification
OpenRPC API: JSON-RPC 2.0 method reference
Example Configurations: Ready-to-use examples
SDK README: Rust SDK — connect, build services, submit runs, tail logs
SDK Builders Reference: ServiceBuilder / ActionBuilder / RunBuilder / RetryPolicyBuilder — every method, every default

Features

Dependency Graph: Services declare dependencies (requires, after, wants, conflicts)
State Machine: Explicit states (Inactive, Blocked, Starting, Running, Stopping, Success, Exited, Failed)
Process Groups: Signals sent to process groups, handling sh -c child processes correctly
Health Checks: TCP, HTTP, and exec-based health checks with retries
Ordered Shutdown: Dependents stop before their dependencies
Hot Reload: Reload configuration without full restart
Secrets Management: Encrypted secret storage with Forgejo sync (init, pull, push)
Scheduled Actions: Cron-based scheduling for recurring tasks
PTY Attach: Live terminal attach to running processes via WebSocket
Web Admin Dashboard: Real-time service management UI with charts, logs, events, and bulk operations
TUI Dashboard: Interactive terminal UI for service management (ratatui-based)
Fully Embedded UI: All assets (Bootstrap, Chart.js, icons) compiled into the binary — no CDN or network required
OpenRPC API: 92 JSON-RPC 2.0 methods over Unix socket

Architecture

hero_proc_server (daemon)
    |
    |  unix socket (IPC + JSON-RPC 2.0)
    v
hero_proc (CLI/TUI)        hero_proc_admin (web admin dashboard)
                               |
                               |  unix socket (admin.sock)

Crate Structure

crates/
 hero_proc_sdk/ # OpenRPC client SDK — generated client + builders + factory
 hero_proc_server/ # Process supervisor daemon (JSON-RPC 2.0 via Unix socket)
 hero_proc/ # Command-line interface + TUI
 hero_proc_admin/ # Web admin dashboard (Axum + Askama + Bootstrap)
 hero_proc_lib/ # SQLite persistence layer (jobs, runs, secrets, logging, services)
 hero_proc_examples/ # Runnable SDK usage examples
 hero_proc_test/ # Integration test suite + stress tests

Dependency Graph

       hero_proc_sdk (no internal deps)
       ^        ^       ^       ^
       |        |       |       |
    server     CLI      UI     lib

All crates depend on hero_proc_sdk. No cross-dependencies between server, CLI, UI, or lib.

Ports and Sockets

Component	Binding	Default
hero_proc_server	Unix socket (IPC)	`$PATH_SOCKET/hero_proc/rpc.sock`
hero_proc_admin	Unix socket (admin)	`$PATH_SOCKET/hero_proc/admin.sock`

Core Concepts

Concept	Role	Lifetime
Action	Executable template (script + interpreter + config)	Stored, reusable
Service	Supervision unit — desired state + auto-restart	Ongoing, supervisor-managed
Job	Single execution of an action	Transient
Run	Universal grouping unit — groups jobs under a single lifecycle	Transient

Service

A service is a supervision unit (like a systemd unit). It declares a desired state and references one or more actions. The supervisor continuously reconciles reality with the desired state:

start — supervisor ensures the service is running; restarts on crash
stop — supervisor ensures the service is stopped
ignore — supervisor does not manage this service
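
The three desired states above can be read as a small decision table. The following is a minimal sketch of one reconciliation step, not the supervisor's actual code: the `Step` enum and the `reconcile` function are hypothetical names introduced for illustration.

```rust
/// Desired states a service can declare, as listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DesiredState {
    Start,
    Stop,
    Ignore,
}

/// Hypothetical result of one reconciliation pass (illustrative names only).
#[derive(Debug, PartialEq, Eq)]
enum Step {
    Launch,
    Terminate,
    Nothing,
}

/// Compare the desired state with what is actually observed and decide
/// what a supervisor would have to do next.
fn reconcile(desired: DesiredState, is_running: bool) -> Step {
    match (desired, is_running) {
        // Should be running but is not (never started, or crashed): launch it.
        (DesiredState::Start, false) => Step::Launch,
        // Should be stopped but is still running: terminate it.
        (DesiredState::Stop, true) => Step::Terminate,
        // Already in the desired state, or explicitly unmanaged.
        _ => Step::Nothing,
    }
}
```

Under this reading, a crashed service with desired state start yields `Step::Launch` on the next pass, which is how "restarts on crash" falls out of reconciliation rather than needing a separate rule.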

Action

An action is a reusable executable template: a script, its interpreter, environment, timeout, retry policy, and dependency edges. An action can list other actions in depends_on to order execution within a service.

Job

A job is a single execution of an action. Jobs can be one-shot (run and exit) or long-running processes (is_process = true), where exiting is treated as failure. Each job tracks phase (pending -> running -> succeeded/failed), PID, exit code, and logs.
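
The difference between one-shot jobs and long-running processes shows up when the process exits. A minimal sketch of that classification follows; the function name `phase_after_exit` is hypothetical, and treating exit code 0 as success for one-shot jobs is an assumption this README does not state explicitly.

```rust
/// Job phases named in the paragraph above.
#[derive(Debug, PartialEq, Eq)]
enum JobPhase {
    Pending,
    Running,
    Succeeded,
    Failed,
}

/// Decide the final phase of a job whose process has exited.
fn phase_after_exit(is_process: bool, exit_code: i32) -> JobPhase {
    if is_process {
        // A long-running process is not supposed to exit at all,
        // so any exit counts as failure, whatever the exit code.
        JobPhase::Failed
    } else if exit_code == 0 {
        // Assumption: a one-shot job succeeds on exit code 0.
        JobPhase::Succeeded
    } else {
        JobPhase::Failed
    }
}
```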

Run

A run is the universal execution grouping unit. It serves two roles:

Service run — created automatically when a service is started. Named service_{name}, with service_id pointing back to the owning service. If a service has 3 actions, starting it creates 1 run with 3 jobs.
Ad-hoc run — standalone execution of a set of actions (e.g., build pipelines, one-off tasks). Name is required. service_id is None.

A run can depend on other runs by ID — the supervisor will not start it until all dependency runs have reached "ok". Status progression: created → waiting_deps → starting → running → ok | error | halted.
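
The status progression and the dependency gate can be sketched as a small state machine. This is an illustration under stated assumptions, not the supervisor's implementation: it assumes the non-terminal statuses are visited strictly in the order listed and that error and halted are reachable from any non-terminal status. The method and function names are hypothetical.

```rust
/// Run statuses from the progression described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RunStatus {
    Created,
    WaitingDeps,
    Starting,
    Running,
    Ok,
    Error,
    Halted,
}

impl RunStatus {
    /// ok, error and halted end the lifecycle.
    fn is_terminal(self) -> bool {
        matches!(self, RunStatus::Ok | RunStatus::Error | RunStatus::Halted)
    }

    /// Assumed transition rules: one step forward along the happy path,
    /// or a jump to error/halted from any status that is not yet terminal.
    fn can_transition_to(self, next: RunStatus) -> bool {
        use RunStatus as S;
        match (self, next) {
            (S::Created, S::WaitingDeps)
            | (S::WaitingDeps, S::Starting)
            | (S::Starting, S::Running)
            | (S::Running, S::Ok) => true,
            (from, S::Error | S::Halted) => !from.is_terminal(),
            _ => false,
        }
    }
}

/// A run may leave waiting_deps only when every dependency run is ok.
fn deps_satisfied(dependency_statuses: &[RunStatus]) -> bool {
    dependency_statuses.iter().all(|s| *s == RunStatus::Ok)
}
```

A run with no dependencies passes the gate immediately, since `deps_satisfied(&[])` is true.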

Principles

Run is the universal grouping unit: both ad-hoc executions and service starts create a Run. The service_id field distinguishes them.
Cascade delete: deleting a Run or Service deletes all associated Jobs. A Job belongs to exactly one Run.
Clean restart: when a Service is started, previous Jobs for that Service are removed from the database by default (can be overridden).
Provenance tracking: each Job records its service_id and action_id so the origin is always traceable.
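
The cascade-delete and provenance principles can be illustrated with an in-memory stand-in. The real persistence layer is SQLite (hero_proc_lib); the `Store` and `Job` types below are hypothetical and exist only to show the ownership rules.

```rust
use std::collections::HashMap;

/// Hypothetical job record: belongs to exactly one run and records its origin.
struct Job {
    run_id: u64,
    service_id: Option<u64>, // None for jobs of ad-hoc runs
    action_id: u64,
}

/// Hypothetical in-memory store, keyed by ID.
#[derive(Default)]
struct Store {
    /// run ID -> owning service (None for ad-hoc runs)
    runs: HashMap<u64, Option<u64>>,
    /// job ID -> job record
    jobs: HashMap<u64, Job>,
}

impl Store {
    /// Delete a run and every job that belongs to it.
    /// Returns the number of jobs removed.
    fn delete_run(&mut self, run_id: u64) -> usize {
        self.runs.remove(&run_id);
        let before = self.jobs.len();
        self.jobs.retain(|_, job| job.run_id != run_id);
        before - self.jobs.len()
    }

    /// Delete everything a service owns: its runs and all of its jobs.
    /// Returns the number of jobs removed.
    fn delete_service(&mut self, service_id: u64) -> usize {
        self.runs.retain(|_, owner| *owner != Some(service_id));
        let before = self.jobs.len();
        self.jobs.retain(|_, job| job.service_id != Some(service_id));
        before - self.jobs.len()
    }
}
```

Because each job carries both `run_id` and `service_id`, either deletion path can find its jobs without walking the other relation.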

For the full data model specification, see docs/README.md.

CLI Commands

All CLI commands are organized into subcommand groups:

Service Management

hero_proc service list # List all services
hero_proc service status <name> # Show service status
hero_proc service start <name> # Start a service
hero_proc service stop <name> # Stop (cascades to dependents)
hero_proc service restart <name> # Restart a service
hero_proc service kill <name> # Send signal to service
hero_proc service add <name> # Add a service at runtime
hero_proc service add-job <svc> ... # Add a job to a service
hero_proc service remove <name> # Remove a service
hero_proc service logs <name> # View service logs
hero_proc service why <name> # Show why service is blocked
hero_proc service tree # Show dependency tree

Job Management

hero_proc job list # List jobs
hero_proc job get <id> # Get job details
hero_proc job create ... # Create a job
hero_proc job delete <id> # Delete a job
hero_proc job status <id> # Job status
hero_proc job logs <id> # Job logs
hero_proc job retry <id> # Retry a failed job
hero_proc job cancel <id> # Cancel a running job

Run Tracking

hero_proc run list # List runs
hero_proc run get <id> # Get run details
hero_proc run logs <id> # Run logs
hero_proc run stats # Run statistics

Log Management

hero_proc log query # Query logs
hero_proc log filter # Filter logs
hero_proc log prune # Prune old logs
hero_proc log export # Export logs

Secrets Management

hero_proc secret set <key> <val> # Set a secret
hero_proc secret get <key> # Get a secret
hero_proc secret list # List secrets
hero_proc secret delete <key> # Delete a secret
hero_proc secret init # Initialize secrets store
hero_proc secret pull # Pull secrets from Forgejo
hero_proc secret push # Push secrets to Forgejo

Actions

hero_proc action list # List actions
hero_proc action get <name> # Get action details
hero_proc action set ... # Register an action
hero_proc action delete <name> # Delete an action

Scripts

hero_proc script scan # Scan for scripts
hero_proc script list # List registered scripts
hero_proc script get <name> # Get script details
hero_proc script set ... # Register a script
hero_proc script delete <name> # Delete a script
hero_proc script run <name> # Run a script

System

hero_proc system ping # Check daemon connectivity
hero_proc system health # Server health check
hero_proc system stats # System statistics
hero_proc system shutdown [--force] # Shut down daemon
hero_proc system reset [--force] # Stop all, delete all configs
hero_proc system wipe # Wipe all data
hero_proc system demo # Create demo services
hero_proc system schedules # List scheduled actions

Debug

hero_proc debug state # Full graph state dump
hero_proc debug procs # Process tree dump

Other

hero_proc attach <name> # Attach to PTY of running process
hero_proc tui # Launch interactive TUI dashboard

Web Admin Dashboard

The hero_proc_admin crate provides a real-time web admin dashboard with tabs for:

Actions: Registered actions with interpreter, timeout, and tags
Jobs: Job instances with phase, status, and logs; includes statistics
Runs: Execution runs with status and job counts
Services: Service management, dependencies, and action mappings
Secrets: Encrypted configuration values
Logs: Query and filter system logs by source, level, and timestamp

All UI assets (Bootstrap 5.3.3, Bootstrap Icons) are embedded in the binary via rust-embed.

# Start server + admin dashboard
service proc start --update --reset

SDK Usage

hero_proc_sdk is builder-first. The four fluent builders (RetryPolicyBuilder, ActionBuilder, ServiceBuilder, RunBuilder) cover everything you do against hero_proc, and a single HeroProcFactory handle (hp) exposes both convenience helpers and the full RPC surface via Deref.

For the full reference see crates/hero_proc_sdk/README.md and BUILDERS.md.

Connect

use hero_proc_sdk::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
 let hp = hero_proc_factory().await?; // local Unix socket
 let pong = hp.system_ping(SystemPingInput {}).await?;
 println!("server: {}", pong.version);
 Ok(())
}

Remote:

let hp = HeroProcFactory::builder().http("http://10.0.0.1:8080").connect().await?;

Run a long-running daemon — `ServiceBuilder`

let svc = ServiceBuilder::new("api")
 .action(ActionBuilder::new("api", "node server.js")
 .env("PORT", "8080")
 .retry_builder(|b| b.max_attempts(10).delay_ms(2_000).backoff(true))
 .build())
 .requires(&["postgres"])
 .build();

hp.start_service("api", svc, 60).await?; // register + start + wait

Presets when the full builder is overkill: simple_service, oneshot_service, system_service, sleep_service.

Submit a one-shot or batch — `RunBuilder`

// Trivial one-liner
let handle = hp.submit_oneshot("backup", "rsync -av /data /backup").await?;
hp.wait_run(handle.run_id, 300).await?;

// Multi-step batch with concurrency cap, mixed interpreters, auto-cleanup
let handle = RunBuilder::new("daily-checks")
 .max_concurrency(3)
 .add_inline_script_with(
 "row_count",
 "import sqlite3; print(sqlite3.connect('/data/app.db').execute('SELECT COUNT(*) FROM users').fetchone()[0])",
 Interpreter::Python3,
 )
 .add_inline_script_with("big_files", "ls /var/log | where size > 100mb", Interpreter::Nushell)
 .add_inline_script("notify", "curl -X POST https://hooks.example.com/done")
 .submit(&hp).await?;

The supervisor honours max_concurrency (1..=100) per run, walks the actions array in submission order, skips past dependency-blocked jobs, and auto-cleans inline actions when the run reaches ok. When any inline action is present, two defaults are applied automatically: max_concurrency=5 and cleanup_on_success=true.
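
The scheduling rule described above (submission order, skip blocked jobs, respect the cap) can be sketched as a single selection step. This is an assumption-laden illustration rather than the supervisor's code: `PendingJob` and `pick_startable` are hypothetical names, and a job is treated as blocked until all of its dependencies have finished.

```rust
/// Hypothetical view of a job that has not started yet.
struct PendingJob {
    name: &'static str,
    deps: Vec<&'static str>,
}

/// Pick the jobs to start now: walk `pending` in submission order, skip any
/// job with an unfinished dependency, and stop once the free slots are used.
fn pick_startable<'a>(
    pending: &'a [PendingJob],
    finished: &[&str],
    running: usize,
    max_concurrency: usize,
) -> Vec<&'a str> {
    // Slots still available under the per-run concurrency cap.
    let free_slots = max_concurrency.saturating_sub(running);
    pending
        .iter()
        // Dependency-blocked jobs are skipped, not waited on.
        .filter(|job| job.deps.iter().all(|dep| finished.contains(dep)))
        .take(free_slots)
        .map(|job| job.name)
        .collect()
}
```

With jobs a, b (depends on a), c and d submitted in that order and a cap of 2, the first pass starts a and c: b is skipped because a has not finished, and d has to wait for a free slot.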

Interpreters: Bash (default), Sh, Python3, Node, Bun, Nushell, Exec, Ai, Mcp.

Convenience helpers on `hp`

hp.wait_run(run_id, secs).await?; // poll to terminal state
hp.wait_job(job_id, secs).await?;
hp.wait_service_running("api", secs).await?;

hp.tail("api", 50).await?; // structured logs by service
hp.job_tail(job_id, 100).await?; // one job's stdout/stderr
hp.search("api.*", 200).await?; // wildcard search
hp.recent_errors(Some("api.*"), 50).await?; // loglevel >= 3

Every generated RPC method is also available directly on hp (~109 methods, type-safe Input/Output structs).

Environment Variables

Required

Variable	Description
`WEBROOT`	Base URL of the hero_proc admin dashboard (e.g. `http://127.0.0.1:9998/`).

Optional

Variable	Default	Description
`HERO_PROC_LOG_LEVEL`	`info`	Log level: `trace`, `debug`, `info`, `warn`, `error`
`HERO_PROC_CONFIG_DIR`	`~/hero/cfg/hero_proc`	Service config directory
`HERO_PROC_SOCKET`	`$PATH_SOCKET/hero_proc/rpc.sock`	Unix socket path
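
The default in the last row implies a simple lookup order: an explicit HERO_PROC_SOCKET wins, otherwise the path is derived from PATH_SOCKET. A minimal sketch of that resolution follows; the function name is hypothetical, and returning `None` when neither variable is set is an assumption, since this README does not say what the tools do in that case.

```rust
use std::path::PathBuf;

/// Resolve the RPC socket path from the two variable values.
/// Values are passed in explicitly so the logic stays easy to test.
fn resolve_rpc_socket(
    hero_proc_socket: Option<&str>,
    path_socket: Option<&str>,
) -> Option<PathBuf> {
    // An explicit, non-empty HERO_PROC_SOCKET takes priority.
    if let Some(explicit) = hero_proc_socket.filter(|s| !s.is_empty()) {
        return Some(PathBuf::from(explicit));
    }
    // Otherwise fall back to $PATH_SOCKET/hero_proc/rpc.sock.
    path_socket
        .filter(|s| !s.is_empty())
        .map(|base| PathBuf::from(base).join("hero_proc").join("rpc.sock"))
}
```

In a real program the arguments would come from the environment, for example `resolve_rpc_socket(std::env::var("HERO_PROC_SOCKET").ok().as_deref(), std::env::var("PATH_SOCKET").ok().as_deref())`.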

Shutdown Ordering

Services are stopped in reverse dependency order:

Example: database <- app <- worker

Startup order: database -> app -> worker
Shutdown order: worker -> app -> database

When stopping a single service, dependents are stopped first:

Running hero_proc service stop database stops worker first, then app, then database.
Dependencies are not stopped automatically, because other services may still need them.
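
The stop order for a single service can be computed with a post-order walk over its dependents. The sketch below assumes an acyclic dependency graph expressed as a map from each service to the services it requires; `stop_order` is a hypothetical helper, not part of the hero_proc API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Return the order in which services must stop when `target` is stopped:
/// every (transitive) dependent first, `target` itself last.
/// `requires` maps a service to the services it depends on.
fn stop_order(requires: &BTreeMap<&str, Vec<&str>>, target: &str) -> Vec<String> {
    fn visit(
        requires: &BTreeMap<&str, Vec<&str>>,
        name: &str,
        seen: &mut BTreeSet<String>,
        out: &mut Vec<String>,
    ) {
        // Visit each service only once.
        if !seen.insert(name.to_string()) {
            return;
        }
        // Any service that lists `name` as a requirement is a dependent
        // and has to be stopped before `name`.
        for (svc, deps) in requires {
            if deps.iter().any(|d| *d == name) {
                visit(requires, svc, seen, out);
            }
        }
        // Post-order: emitted only after all dependents were emitted.
        out.push(name.to_string());
    }

    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    visit(requires, target, &mut seen, &mut out);
    out
}
```

For the database <- app <- worker example, stopping database yields worker, app, database, while stopping app yields worker, app and leaves database untouched, matching the rule that dependencies are not stopped automatically.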

Development

Start / Stop

service proc start --update --reset # Install, build, and start
service proc stop # Graceful shutdown
service proc start --clear # Wipe state and restart fresh

Integration Tests

cargo test --test shutdown -- --nocapture --test-threads=1 # Shutdown tests