zinit UI: admin section on zinit + graphs for memory usage... #37

Open
opened 2026-03-09 05:42:30 +00:00 by despiegk · 4 comments
Owner

We used to have a nice section on the left side of the screen showing basic stats:

  • memory usage
  • number of processes

Use a graph to show these.

Restore this; if needed, check the development branch to see how it was done.

Make sure the hero_ui_dashboard skill is followed.

then make an admin tab

shutdown

  • clean shutdown of zinit
  • check previous implementation in zinit (development branch) how this was done
  • make sure there is a function for this in server and then sdk
  • make sure all services are cleanly stopped
  • then, before leaving the UI, check that no zinit processes remain and show this

stopall

  • stop gracefully all jobs & services
  • make sure there is a function for this in server and then sdk
  • make sure they are stopped
  • the actions & services remain defined

clean

  • clean all old jobs & logs for jobs no longer running
  • make sure there is a function for this in server and then sdk

resetall

  • stop gracefully all jobs & services
  • make sure there is a function for this in server and then sdk
  • make sure they are stopped
  • remove all the actions and services
  • basically this gives us a clean state

service & job management

  • when we click on a service we can stop, restart, ... (see old implementation)
  • when we click on a job we can kill, ... (see old implementation)

we need full management of the jobs

make a detailed page per job/service

  • where we see the job details
  • cpu/memory usage, tracked in real time for as long as the tab is open (make sure the server and sdk can report memory and CPU)
  • we see children
  • we can do actions on it
  • be creative about what else is important for managing a job
  • see a graph of the jobs we depend on and of the jobs that depend on us

now same for service

use router feature

  • make sure to show router paths on top of ui
  • $server:8888/logs_run/$run (show logs from all jobs related to the run)
  • $server:8888/logs_job/$jobid
  • $server:8888/job/$jobid
  • $server:8888/service/...
  • $server:8888/action/...
  • $server:8888/run/...
  • ...

for all relevant elements

do proper tests with mcp browser

Author
Owner

Implementation Spec for Issue #37

Objective

Extend the Zinit web admin dashboard with: (1) a dashboard stats panel showing memory/CPU usage graphs, (2) an Admin tab with system-level operations (shutdown, stop-all, clean, reset-all), (3) enhanced service and job detail panels with lifecycle management controls, (4) dedicated detail pages per job/service with real-time resource tracking, (5) URL-based routing for deep-linking, and (6) MCP browser tests.

Requirements

Dashboard Stats Panel

  • Left-side or top-bar panel showing aggregate system stats: total memory, total CPU, process count, service count, job counts by phase
  • Chart.js graphs for memory usage over time (already bundled but not loaded)
  • Auto-refreshing on the existing 5-second interval

Admin Tab

  • New "Admin" tab with four operations:
    • Shutdown: Calls system.shutdown RPC; polls to verify processes empty
    • Stop All: Calls service.stop_all RPC; shows progress and stopped count
    • Clean: Calls job.purge and logs.delete_older_than RPC
    • Reset All: Calls existing /api/services/reset-all REST endpoint

Service & Job Lifecycle Controls

  • Service: Start/Stop/Restart/Kill buttons, live status, real-time stats
  • Job: Cancel/Retry/Kill, dependency graph, why-waiting info, elapsed time

Detailed Pages per Job/Service

  • Real-time CPU/memory from service.stats, children processes, dependency info
  • Job logs, elapsed time, attempt history, dependency graph

URL Router

  • Path-based routing: /, /job/{id}, /service/{name}, /admin, /run/{id}, etc.
  • history.pushState for clean URLs, popstate handling

Testing

  • MCP browser-based tests for UI verification

Implementation Plan (7 Steps)

Step 1: Implement real service.stats and add system.stats RPC

  • Fix service.stats handler to use sysinfo crate for real CPU/memory (currently returns zeros)
  • Add system.stats handler returning system-wide memory, CPU, process count
  • Update openrpc.json with system.stats method definition
  • Files: zinit_server/src/rpc/service.rs, zinit_server/src/rpc/system.rs, zinit_server/src/rpc/mod.rs, zinit_server/openrpc.json

Step 2: Add Chart.js loading and Dashboard Stats UI

  • Load bundled Chart.js in HTML template
  • Add dashboard stats bar with stat cards and memory chart
  • Poll system.stats every 5 seconds, maintain rolling 60-point history
  • Files: zinit_ui/templates/base.html, zinit_ui/static/js/dashboard.js, zinit_ui/static/css/dashboard.css
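
The rolling 60-point history from Step 2 can be sketched as a small bounded buffer. The dashboard itself is JavaScript; this is written in Rust only to stay consistent with the server code, and the `RollingHistory` type and its method names are hypothetical, not taken from the repository.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded history: keeps only the newest `cap` samples.
pub struct RollingHistory {
    cap: usize,
    points: VecDeque<f64>,
}

impl RollingHistory {
    pub fn new(cap: usize) -> Self {
        Self { cap, points: VecDeque::with_capacity(cap) }
    }

    /// Appends a sample, dropping the oldest one once the buffer is full.
    pub fn push(&mut self, value: f64) {
        if self.cap == 0 {
            return; // a zero-capacity history stores nothing
        }
        if self.points.len() == self.cap {
            self.points.pop_front();
        }
        self.points.push_back(value);
    }

    pub fn len(&self) -> usize {
        self.points.len()
    }

    pub fn oldest(&self) -> Option<f64> {
        self.points.front().copied()
    }

    pub fn latest(&self) -> Option<f64> {
        self.points.back().copied()
    }
}
```

With a 5-second poll and a capacity of 60, such a buffer would cover roughly the last five minutes of samples.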

Step 3: Add Admin Tab with Bulk Operations

  • Add Admin tab button and pane with four operation cards
  • Create admin.js with shutdown, stop-all, clean, reset-all functions
  • Confirmation dialogs before destructive operations
  • Files: zinit_ui/templates/base.html, zinit_ui/templates/index.html, zinit_ui/static/js/admin.js, zinit_ui/static/css/dashboard.css

Step 4: Enhance Service Detail Panel

  • Add Start/Stop/Restart/Kill buttons to service detail
  • Show real-time PID, memory, CPU, child processes
  • Files: zinit_ui/static/js/dashboard.js

Step 5: Enhance Job Detail Panel

  • Add dependency graph, why-waiting info, elapsed time, attempt count
  • Kill button for stuck jobs
  • Files: zinit_ui/static/js/dashboard.js

Step 6: Implement Path-Based URL Router

  • Add catch-all routes on server side
  • Replace hash-based routing with history.pushState
  • Handle popstate and initial URL parsing
  • Files: zinit_ui/src/routes.rs, zinit_ui/static/js/dashboard.js
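
As a rough illustration of path-based routing, the paths listed in the issue could be mapped onto an enum like the one below. This is a sketch only: the `Route` enum and `parse_route` function are hypothetical names, and the real `routes.rs` may be organised quite differently.

```rust
/// Hypothetical set of UI routes, based on the paths named in the issue.
#[derive(Debug, PartialEq)]
pub enum Route {
    Home,
    Admin,
    Job(String),
    Service(String),
    Action(String),
    Run(String),
    LogsJob(String),
    LogsRun(String),
    NotFound,
}

/// Splits a URL path into segments and matches it against the known routes.
/// Empty segments are ignored, so a trailing slash does not matter.
pub fn parse_route(path: &str) -> Route {
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    match segments.as_slice() {
        [] => Route::Home,
        ["admin"] => Route::Admin,
        ["job", id] => Route::Job((*id).to_string()),
        ["service", name] => Route::Service((*name).to_string()),
        ["action", name] => Route::Action((*name).to_string()),
        ["run", id] => Route::Run((*id).to_string()),
        ["logs_job", id] => Route::LogsJob((*id).to_string()),
        ["logs_run", id] => Route::LogsRun((*id).to_string()),
        _ => Route::NotFound,
    }
}
```

A server-side catch-all would typically serve the same HTML shell for every path that parses to something other than `NotFound`, leaving the client to render the matching view.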

Step 7: MCP Browser Tests

  • Automated tests covering: page load, tab navigation, admin tab, service detail, URL routing
  • Files: tests/ui/ (new directory)

Key Notes

  • Chart.js is already bundled at static/js/chart.umd.min.js — just needs a script tag
  • The reset-all endpoint already exists with full implementation
  • sysinfo crate is already a dependency — no new deps needed
  • SDK is auto-generated from openrpc.json via openrpc_client! macro
  • All RPC methods are automatically proxied via openrpc_proxy! macro

Acceptance Criteria

  • system.stats returns real memory, CPU, and process count
  • service.stats returns real memory/CPU (not zeros)
  • Dashboard stats bar visible on all tabs with live data
  • Chart.js memory graph renders and updates every 5s
  • Admin tab with Shutdown/Stop All/Clean/Reset All operations
  • Service detail panel with lifecycle buttons and real-time stats
  • Job detail with elapsed time, attempts, dependency graph, why-waiting
  • URL routing works for all entity types with browser back/forward
  • MCP browser tests pass
  • No JS console errors, light/dark themes work
Author
Owner

Updated Implementation Spec for Issue #37 — Server-First Approach

RPC Audit Results

67 RPC methods already exist with 100% handler ↔ OpenRPC ↔ SDK coverage. Almost all required server-side functionality for the UI is already implemented:

Already working:

  • system.shutdown, system.reboot, system.ping, system.prepare_restart
  • service.start, service.stop, service.restart, service.kill
  • service.start_all, service.stop_all
  • service.status, service.status_full, service.is_running, service.why, service.tree
  • job.cancel, job.retry, job.cancel_bulk, job.purge
  • job.graph, job.graph_for, job.why_waiting
  • job.logs, job.elapsed_ms, job.attempts, job.stats
  • logs.delete_older_than, logs.delete_by_src
  • debug.state, debug.process_tree
  • Full CRUD for actions, services, jobs, runs, secrets, logs

Broken / missing (needs implementation):

| # | Issue | Current State | Fix Required |
|---|-------|---------------|--------------|
| 1 | system.stats doesn't exist | No handler, no OpenRPC entry | Add handler + OpenRPC schema returning system memory, CPU, process count, service/job counts |
| 2 | service.stats returns fake data | Returns memory_bytes: 0, cpu_percent: 0.0 always | Use sysinfo crate (already a dep) to read real process metrics from PID |
| 3 | service.children returns fake memory | Returns memory_bytes: 0 per child | Same fix: use sysinfo to read child process memory |
| 4 | Restart count hardcoded | service.status returns restarts: 0 always | Track actual restart count in service state |
| 5 | Job log archive doesn't separate by attempt | Wraps all logs into single attempt: 1 entry | Use attempt markers/tags to properly separate log entries per attempt |

Implementation Plan

Step 1: Implement real service.stats with sysinfo

Files: crates/zinit_server/src/rpc/service.rs

  • Fix handle_stats to use sysinfo::System to look up process by PID
  • Read real memory() and cpu_usage() from the process
  • Return actual values instead of zeros
  • Dependencies: none

Step 2: Fix service.children memory reporting

Files: crates/zinit_server/src/rpc/service.rs

  • Fix handle_children to use sysinfo for child process memory
  • Same pattern as Step 1
  • Dependencies: Step 1 (shares sysinfo pattern)

Step 3: Add system.stats RPC method

Files: crates/zinit_server/src/rpc/system.rs, crates/zinit_server/src/rpc/mod.rs, crates/zinit_server/openrpc.json

  • Add handle_stats function using sysinfo::System for system-wide metrics
  • Return: memory_total_bytes, memory_used_bytes, cpu_percent, process_count, service_count, job_stats (from db.jobs.stats())
  • Add dispatch entry in mod.rs
  • Add method + schema in openrpc.json (SDK auto-generates from this)
  • Dependencies: none

Step 4: Track actual restart count

Files: crates/zinit_server/src/rpc/service.rs (and possibly service state/DB)

  • Investigate how restarts are triggered (via service.restart handler)
  • Add restart counter to service state or derive from job history
  • Return real count in service.status and service.status_full
  • Dependencies: none

Step 5: Fix job log archive attempt separation

Files: crates/zinit_server/src/rpc/job.rs

  • Fix handle_log_archive to properly separate logs by attempt number
  • Use job attempt count and log timestamps/tags to partition
  • Dependencies: none

Step 6: Verify OpenRPC + SDK integration

  • Ensure all changes are reflected in openrpc.json
  • Run cargo build to verify SDK auto-generation works
  • Run cargo test to verify nothing broke
  • Dependencies: Steps 1-5

Acceptance Criteria

  • service.stats returns real memory_bytes and cpu_percent from sysinfo
  • service.children returns real memory_bytes per child process
  • system.stats RPC exists, returns real system memory/CPU/process/service/job counts
  • system.stats appears in openrpc.json and is auto-generated in SDK
  • service.status returns actual restart count (not always 0)
  • job.log_archive properly separates logs by attempt number
  • cargo test passes
  • cargo build succeeds (SDK generation works)

Notes

  • sysinfo crate is already a dependency (used in process.rs) — no new deps needed
  • SDK is auto-generated from openrpc.json via openrpc_client! macro — adding to openrpc.json is sufficient
  • All RPC methods are auto-proxied to the UI via openrpc_proxy! — no UI route changes needed for new methods
  • Steps 1, 3, 4, 5 are independent and could run in parallel
  • This spec covers only server-side work; UI implementation will follow in a separate phase
Author
Owner

Implementation Complete: Server-Side Stats & Fixes

Commit: 1708f5e on branch development_kristof

What was implemented

New: system.stats RPC method

Returns real system-wide resource usage:

  • memory_total_bytes, memory_used_bytes — real system memory via sysinfo
  • cpu_percent — accurate CPU with delta tracking across calls
  • process_count — total OS processes
  • service_count — registered zinit services
  • job_stats — aggregate job counts by phase
  • Added to openrpc.json → SDK auto-generates system_stats() client method

Fixed: service.stats — real data (was returning zeros)

  • Uses shared sysinfo System instance for accurate CPU deltas
  • Returns real memory_bytes and cpu_percent for the service's process
  • Verified: sleep process shows 1,196,032 bytes memory, 0.0% CPU (correct)

Fixed: service.children — real memory per child

  • Batch-fetches process stats for all child PIDs in one sysinfo refresh
  • Each child now shows real memory_bytes

Fixed: service.status / service.status_full — real restart count

  • Derived from job history (counts jobs with attempt > 1)
  • Was hardcoded to 0

Fixed: job.log_archive — attempt separation

  • Partitions logs by attempt:N tags when present
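
A minimal sketch of partitioning by `attempt:N` tags might look like the following. The `LogEntry` shape and the function names are hypothetical, and sending untagged entries to attempt 1 is an assumption made for the example, not something stated in the commit.

```rust
use std::collections::BTreeMap;

/// Hypothetical log entry: free-form tags plus the message text.
pub struct LogEntry {
    pub tags: Vec<String>,
    pub message: String,
}

/// Extracts N from the first well-formed `attempt:N` tag, if any.
fn attempt_of(entry: &LogEntry) -> Option<u32> {
    entry
        .tags
        .iter()
        .find_map(|tag| tag.strip_prefix("attempt:")?.parse().ok())
}

/// Groups messages by attempt number, preserving input order within each group.
/// Assumption: entries without an attempt tag are filed under attempt 1.
pub fn partition_by_attempt(entries: &[LogEntry]) -> BTreeMap<u32, Vec<&str>> {
    let mut groups: BTreeMap<u32, Vec<&str>> = BTreeMap::new();
    for entry in entries {
        let attempt = attempt_of(entry).unwrap_or(1);
        groups.entry(attempt).or_default().push(entry.message.as_str());
    }
    groups
}
```

Using a `BTreeMap` keeps the attempts ordered by number, which is convenient when rendering an attempt history.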

New: sysmon module — shared system monitor

  • Persistent System instance behind LazyLock<Mutex<System>>
  • CPU deltas work correctly because the System persists across calls
  • Functions: system_stats(), process_stats(pid), processes_stats(pids)
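
The reason a persistent instance matters can be shown with a standard-library-only sketch. Here `Monitor` is a hypothetical stand-in for `sysinfo::System`: it remembers the previous cumulative counters, so each call can report usage over the interval since the last call. The counter inputs and the `sample_cpu` helper are assumptions for illustration, not the actual `sysmon` API.

```rust
use std::sync::{LazyLock, Mutex};

/// Hypothetical stand-in for `sysinfo::System`: stores the previous sample.
#[derive(Default)]
pub struct Monitor {
    last_busy: u64,
    last_total: u64,
}

impl Monitor {
    /// Takes cumulative busy/total tick counters and returns the CPU
    /// percentage consumed since the previous call.
    pub fn cpu_percent(&mut self, busy: u64, total: u64) -> f64 {
        let d_busy = busy.saturating_sub(self.last_busy);
        let d_total = total.saturating_sub(self.last_total);
        self.last_busy = busy;
        self.last_total = total;
        if d_total == 0 {
            0.0 // no time elapsed between samples
        } else {
            100.0 * d_busy as f64 / d_total as f64
        }
    }
}

/// One shared instance for the whole process, created on first use.
static MONITOR: LazyLock<Mutex<Monitor>> =
    LazyLock::new(|| Mutex::new(Monitor::default()));

/// Samples through the shared monitor so deltas span successive calls.
pub fn sample_cpu(busy: u64, total: u64) -> f64 {
    MONITOR
        .lock()
        .expect("monitor mutex poisoned")
        .cpu_percent(busy, total)
}
```

If a fresh monitor were created per request, every call would be a first sample and there would be no earlier reading to diff against, which is consistent with the zeros seen before the fix.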

Bugs found and fixed during implementation

Critical: JobFilter::default() had limit: 0

  • The #[derive(Default)] gave limit: 0 (u32 default), not 100
  • serde(default = "default_list_limit") only applies during JSON deserialization
  • This caused service_running_jobs() to always return empty via LIMIT 0 SQL
  • All service status/stats queries were broken — services always appeared "inactive"
  • Fixed with manual Default impl setting limit: 100
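
The pitfall is easy to reproduce in isolation. The sketch below uses two hypothetical, trimmed-down structs (the real `JobFilter` has more fields and serde attributes) to contrast the derived default with a manual impl.

```rust
/// Hypothetical filter using the derived Default: every field gets its
/// type's default, so `limit` becomes 0 regardless of any serde attribute.
#[derive(Debug, Default)]
pub struct DerivedFilter {
    pub action: Option<String>,
    pub limit: u32,
}

/// Same shape, but with a hand-written Default.
#[derive(Debug)]
pub struct ManualFilter {
    pub action: Option<String>,
    pub limit: u32,
}

/// Assumed page size, matching the value of 100 quoted above.
pub const DEFAULT_LIST_LIMIT: u32 = 100;

impl Default for ManualFilter {
    fn default() -> Self {
        Self {
            action: None,
            limit: DEFAULT_LIST_LIMIT,
        }
    }
}
```

A field-level serde default only takes effect when the value is deserialized, so code paths that construct the filter with `..Default::default()` never see it.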

Bug: service_running_jobs didn't match by action name

  • Services reference actions by name (e.g., service "test_svc" has action "test_sleep")
  • But service_running_jobs only matched job.action == service_name or prefix service_name:
  • Jobs with action="test_sleep" were never matched to service "test_svc"
  • Fixed to also check the service's registered action names list
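
The three matching rules described above can be sketched as a single predicate. The function name and signature are hypothetical; the actual `service_running_jobs` works against the job database rather than plain strings.

```rust
/// Returns true when a job's action should be attributed to the service.
/// Rules, as described above:
///   1. the action equals the service name,
///   2. the action starts with the prefix `service_name:`,
///   3. the action is one of the service's registered action names.
pub fn job_belongs_to_service(
    job_action: &str,
    service_name: &str,
    service_actions: &[&str],
) -> bool {
    if job_action == service_name {
        return true;
    }
    let has_prefix = job_action
        .strip_prefix(service_name)
        .is_some_and(|rest| rest.starts_with(':'));
    if has_prefix {
        return true;
    }
    service_actions.iter().any(|name| *name == job_action)
}
```

With the example from the bug report, `job_belongs_to_service("test_sleep", "test_svc", &["test_sleep"])` matches only because of the third rule.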

Verified with live testing

All endpoints tested against a running zinit_server with real processes:

system.stats  → memory: 48GB total/33GB used, CPU: 21.4%, 1088 processes
service.stats → memory_bytes: 1196032, cpu_percent: 0.0 (sleep process)
service.children → 2 children with real memory_bytes
service.status → state: "running", pid: 65020

Test results

All 189 tests pass (136 zinit_lib + 37 zinit_server + 16 zinit_sdk)


Author
Owner

UI Integration Check Results

Current State

I checked the zinit UI at http://localhost:9999 and found the following:

Tabs Currently Implemented:

  • Actions (16 items)
  • Jobs (48 items) - includes stats bar showing total/running/pending/waiting/succeeded/failed/cancelled
  • Runs (0 items)
  • Services (8 items) - shows service name, status, class, actions (edit/delete), dependencies
  • Secrets (0 items)
  • Logs

Missing / NOT Integrated

Admin Tab - NOT VISIBLE

  • No shutdown button
  • No stop-all functionality
  • No clean functionality
  • No reset-all functionality
  • This is the core requirement from this issue

Stats/Graphs Section - NOT VISIBLE

  • No memory usage graph/display
  • No process count graph/display
  • Jobs tab has basic stats bar (counts by phase) but not the system memory/process stats mentioned in the issue

Detailed Job/Service Pages - PARTIALLY IMPLEMENTED

  • UI has detail panels in the code (services-detail, jobs-detail) but detail panel content is not visible/populated
  • No CPU/memory tracking graphs for individual jobs/services
  • The dependency graph visualization requested in the issue is missing

Router Paths - NOT VISIBLE

  • No navigation via router paths like:
    • /logs_run/$run
    • /logs_job/$jobid
    • /job/$jobid
    • /service/...
    • /action/...
    • /run/...

Summary

The admin section is NOT integrated into the UI. The current implementation focuses on Actions, Jobs, Runs, Services, Secrets, and Logs management, but lacks the critical admin controls (shutdown, stop-all, clean, reset-all) and system monitoring features (memory/CPU graphs) that are specified in this issue.

The code for /api/services/reset-all exists in the backend routes, but there is no UI button or admin tab to access it.

despiegk added this to the now milestone 2026-03-09 10:22:55 +00:00
Commenting is not possible because the repository is archived.
Reference
geomind_code/zinit_archive2#37