zosbuilder/docs/NOTES.md
Jan De Landtsheer 721e26a855 build: remove testing.sh in favor of runit.sh; add claude.md reference
Replace inline boot testing with standalone runit.sh runner for clarity:
- Remove scripts/lib/testing.sh source and boot_tests stage from build.sh
- Remove --skip-tests option from build.sh and rebuild-after-zinit.sh
- Update all docs to reference runit.sh for QEMU/cloud-hypervisor testing
- Add comprehensive claude.md as AI assistant entry point with guidelines

Testing is now fully decoupled from build pipeline; use ./runit.sh for
QEMU/cloud-hypervisor validation after builds complete.
2025-11-04 13:47:24 +01:00

Zero-OS Builder Working Notes and Repository Map

Purpose

  • This document captures operational knowledge of this repository: build flow, key files, flags, and recent behavior decisions (e.g., passwordless root).
  • Links below point to exact functions and files for fast triage, using code navigation-friendly anchors.

Repository Overview

  • Build entrypoint: scripts/build.sh
    • Orchestrates incremental stages using stage markers.
    • Runs inside a container defined by Dockerfile for reproducibility.
  • Common utilities and config loading: bash.common.sh
    • Loads config/build.conf, normalizes directory paths, provides logging and safe execution wrappers.
  • Initramfs assembly and finalization: bash.initramfs_* functions
    • Copies components, sets up zinit configs, finalizes branding, creates CPIO archive, validates contents.
  • Kernel integration (optional embedded initramfs): bash.kernel_* functions
    • Downloads/configures/builds kernel and modules, embeds initramfs, runs depmod.
  • zinit configuration: config/zinit/
    • YAML service definitions and init scripts used by zinit inside the initramfs rootfs.
  • RFS tooling (modules/firmware flists): scripts/rfs/
    • Packs module/firmware flists and embeds them into initramfs at /etc/rfs.

Container Tooling (dev-container)

  • Base image: Alpine 3.22 in Dockerfile
  • Tools:
    • shadow (passwd/chpasswd): required for root password management in initramfs.
    • openssl, openssl-dev: kept for other build steps and potential hashing utilities.
    • build-base, rustup, kmod, upx, etc.: required by various build stages.
  • Removed: perl (no longer required for password handling now that the passwd/chpasswd workflow is used).

Configuration build.conf

  • File: config/build.conf
  • Key variables:
    • Versions: ALPINE_VERSION, KERNEL_VERSION
    • Directories (relative in config, normalized to absolute during runtime):
      • INSTALL_DIR="initramfs"
      • COMPONENTS_DIR="components"
      • KERNEL_DIR="kernel"
      • DIST_DIR="dist"
    • Flags:
      • ZEROOS_BRANDING="true"
      • ZEROOS_REBRANDING="true"
    • Branding behavior:
      • ZEROOS_PASSWORDLESS_ROOT="true" (default for branded builds in current policy)
      • ZEROOS_ROOT_PASSWORD_HASH / ROOT_PASSWORD_HASH (not used in current policy)
      • ZEROOS_ROOT_PASSWORD / ROOT_PASSWORD (not used in current policy)
    • FIRMWARE_TAG: optional; set it for reproducible firmware flist naming.

Absolute Path Normalization

  • Location: bash.common.sh
  • After sourcing build.conf, the following variables are normalized to absolute paths anchored at PROJECT_ROOT:
    • INSTALL_DIR, COMPONENTS_DIR, KERNEL_DIR, DIST_DIR
  • Rationale: prevents path resolution errors when the CWD changes. For example, while the kernel build operates in /workspace/kernel/current, validation now resolves INSTALL_DIR to /workspace/initramfs instead of /workspace/kernel/current/initramfs.
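
The normalization described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: normalize_dir_vars is a hypothetical helper name, and it assumes bash (it relies on indirect expansion and printf -v).

```bash
#!/usr/bin/env bash
# Sketch only: normalize_dir_vars is a hypothetical helper, not the
# function actually used in the common library.

# Rewrite each named variable to an absolute path anchored at the given root.
# Values that are already absolute, and unset or empty variables, are left alone.
normalize_dir_vars() {
    local root="$1"; shift
    local name value
    for name in "$@"; do
        value="${!name:-}"                 # indirect expansion: value of the variable called $name
        [ -n "$value" ] || continue        # skip unset/empty variables
        case "$value" in
            /*) ;;                         # already absolute: keep as-is
            *)  printf -v "$name" '%s/%s' "$root" "$value" ;;
        esac
    done
}
```

Called as normalize_dir_vars "$PROJECT_ROOT" INSTALL_DIR COMPONENTS_DIR KERNEL_DIR DIST_DIR right after sourcing build.conf, a value such as INSTALL_DIR="initramfs" with PROJECT_ROOT=/workspace becomes /workspace/initramfs regardless of the later CWD.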

Build Pipeline High Level

  • Orchestrator: bash.main_build_process()
    • Stage list:
      • alpine_extract
      • alpine_configure
      • alpine_packages
      • alpine_firmware
      • components_build
      • components_verify
      • kernel_modules
      • init_script
      • components_copy
      • zinit_setup
      • modules_setup
      • modules_copy
      • cleanup
      • rfs_flists
      • validation
      • initramfs_create
      • initramfs_test
      • kernel_build
      • boot_tests (removed from build.sh; boot testing now runs separately via ./runit.sh after the build completes)
    • Each stage wrapped with bash.stage_run() and tracked under .build-stages/
  • Container use:
    • Always run in container for stable toolchain (podman/docker auto-detected).
    • Inside container, CWD normalized to PROJECT_ROOT.
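
As an illustration of how a marker-based stage wrapper can behave, here is a minimal sketch. The names run_stage and STAGES_DIR are hypothetical, the real stage_run may differ in logging and options, and the .done suffix is taken from the marker names mentioned in this document.

```bash
#!/usr/bin/env bash
# Sketch only: run_stage and STAGES_DIR are hypothetical names.
STAGES_DIR="${STAGES_DIR:-.build-stages}"

# Run a stage command unless its completion marker already exists.
# The marker is written only when the command succeeds.
run_stage() {
    local stage="$1"; shift
    local marker="$STAGES_DIR/$stage.done"
    if [ -e "$marker" ]; then
        echo "skip: $stage (marker present)"
        return 0
    fi
    echo "run: $stage"
    "$@" || { echo "fail: $stage" >&2; return 1; }
    mkdir -p "$STAGES_DIR"
    date -u +%Y-%m-%dT%H:%M:%SZ > "$marker"   # record completion time (UTC)
}
```

With this shape, a failed stage leaves no marker behind, so the next invocation retries it while completed stages are skipped.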

Initramfs Assembly Key Functions

Kernel Integration

RFS Flists (modules/firmware)

  • Packing scripts:
  • Firmware policy:
    • For initramfs: config/firmware.conf is the single source of truth for preinstalled firmware; modules.conf hints are ignored.
    • For RFS: install all Alpine linux-firmware* packages into the build container and pack from /lib/firmware (full set for runtime).
  • Integrated in stage_rfs_flists:
  • Runtime mount/readiness:
    • Firmware flist mounts over /lib/firmware (overmount hides any initramfs firmware).
    • Modules flist mounts at /lib/modules/$(uname -r).
    • Init scripts probe BASE_URL reachability (accepting either FLISTS_BASE_URL or FLIST_BASE_URL) and wait for an HTTP(S) response before fetching.
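
The readiness wait can be pictured along these lines. This is a hedged sketch rather than the actual init scripts: resolve_base_url, probe_url, and wait_for_http are hypothetical names, the use of curl is an assumption (a busybox wget probe would serve the same purpose), and the retry count and delay are illustrative defaults.

```bash
# Sketch only: resolve_base_url, probe_url and wait_for_http are hypothetical names.

# Prefer FLISTS_BASE_URL and fall back to FLIST_BASE_URL; fail if neither is set.
resolve_base_url() {
    local url="${FLISTS_BASE_URL:-${FLIST_BASE_URL:-}}"
    [ -n "$url" ] || return 1
    printf '%s\n' "${url%/}"               # strip one trailing slash
}

# One reachability probe. Without --fail, curl exits 0 for any HTTP response,
# so this only checks that the endpoint answers at all (assumption: curl exists).
probe_url() {
    curl --silent --output /dev/null --max-time 5 "$1"
}

# Retry the probe up to <attempts> times, sleeping <delay> seconds in between.
wait_for_http() {
    local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
    while [ "$i" -le "$attempts" ]; do
        probe_url "$url" && return 0
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

A caller would typically chain the two, for example url="$(resolve_base_url)" && wait_for_http "$url" 30 2, and only then fetch and mount the flists.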

Branding Behavior (Passwordless Root, motd/issue)

  • Finalization hook: bash.initramfs_finalize_customization()
  • Behavior (current policy):
    • Passwordless root is enforced with passwd -d -R (shadow-aware deletion), avoiding direct edits of the passwd/shadow files.
    • Branding toggles: ZEROOS_BRANDING and ZEROOS_REBRANDING (branding guard printed in logs).
    • Branding also updates /etc/motd and /etc/issue to Zero-OS.
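
For orientation, the passwordless-root step could look roughly like the sketch below. It assumes the shadow-suite passwd (which provides -d to delete a password and -R to operate on a chroot directory) and root privileges inside the build container; enforce_passwordless_root is a hypothetical name, not the finalization hook itself.

```bash
# Sketch only: enforce_passwordless_root is a hypothetical wrapper.

# Delete root's password inside the given rootfs using the shadow-aware tool,
# instead of editing etc/passwd or etc/shadow by hand.
enforce_passwordless_root() {
    local rootfs="$1"
    if [ ! -f "$rootfs/etc/shadow" ]; then
        echo "no shadow file under $rootfs" >&2
        return 1
    fi
    passwd -d -R "$rootfs" root
}
```

In the build this would be pointed at the initramfs tree, for example enforce_passwordless_root "$INSTALL_DIR".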

Console and getty

  • Early keyboard and debug:
    • config/init preloads input/HID and USB HCD modules (i8042, atkbd, usbhid, hid, hid_generic, evdev, xhci/ehci/ohci/uhci) so console input works before zinit/rfs.
    • Kernel cmdline initdebug=true opens an early interactive shell; if /init-debug exists and is executable, it runs preferentially.
  • Serial and console getty configs (zinit service YAML):
  • Optional ash login loop (not enabled unless referenced):
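
Regarding the initdebug=true behavior described under "Early keyboard and debug", a cmdline check of this kind can be sketched as follows. cmdline_flag_true is a hypothetical helper and config/init may parse the command line differently; the /bin/sh fallback in the usage comment is likewise an assumption.

```bash
# Sketch only: cmdline_flag_true is a hypothetical helper.

# Succeed when the kernel command line contains <key>=true as a whole word.
# Padding with spaces avoids matching substrings such as "noinitdebug=true".
cmdline_flag_true() {
    local key="$1" cmdline="$2"
    case " $cmdline " in
        *" $key=true "*) return 0 ;;
    esac
    return 1
}

# Possible use in an init script (not executed here):
# if cmdline_flag_true initdebug "$(cat /proc/cmdline)"; then
#     if [ -x /init-debug ]; then /init-debug; else /bin/sh; fi
# fi
```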

Validation Diagnostics and Triage

  • Common error previously observed:
    • “Initramfs directory not found: initramfs (resolved: /workspace/kernel/current/initramfs)”
  • Root cause:
    • INSTALL_DIR was re-sourced from a different CWD and interpreted as a relative path.
  • Fix:
    • Absolute path normalization of INSTALL_DIR/COMPONENTS_DIR/KERNEL_DIR/DIST_DIR after sourcing build.conf in bash.common.sh.
    • Additional “Validation debug” prints added in bash.initramfs_validate().
  • Expected logs now:
    • “Validation debug: input='initramfs' PWD=/workspace PROJECT_ROOT=/workspace INSTALL_DIR=/workspace/initramfs”
    • Resolves correctly even if called from a different stage CWD.

How to Verify Passwordless Root

  • After build, check archive:
    • mkdir -p dist/_inspect && cd dist/_inspect
    • xz -dc ../initramfs.cpio.xz | cpio -idmv
    • grep '^root:' ./etc/shadow
    • Expect root:: (empty field) indicating passwordless root.
  • At runtime on console:
    • When prompted for root's password, press Enter.

Stage System and Incremental Rebuilds

  • Stage markers stored in .build-stages/ (one file per stage).
  • Minimal rebuild helper (host or container):
    • scripts/rebuild-after-zinit.sh clears only: modules_setup, modules_copy, init_script, zinit_setup, validation, initramfs_create, initramfs_test (kernel_build only with --with-kernel; kernel_modules only with --refresh-container-mods).
    • Flags:
      • --with-kernel (also rebuild kernel; ensures cpio is recreated right before embedding)
      • --refresh-container-mods (rebuild container /lib/modules for fresh containers)
      • --verify-only (report changed files and stage status; no rebuild)
    • Shows stage status before/after marker removal; no --rebuild-from is passed by default (relies on markers only).
  • Manual minimal rebuild:
    • Remove relevant .done files, e.g.: initramfs_create.done initramfs_test.done validation.done
    • Rerun: DEBUG=1 ./scripts/build.sh
  • Show status:
    • ./scripts/build.sh --show-stages
  • Test built kernel:
    • ./runit.sh --hypervisor qemu
    • ./runit.sh --hypervisor ch --disks 5 --reset

Key Decisions (current)

  • Firmware selection for initramfs comes exclusively from config/firmware.conf; firmware hints in modules.conf are ignored to avoid duplication/mismatch.
  • Runtime firmware flist overmounts /lib/firmware after network readiness; init scripts wait for FLISTS_BASE_URL/FLIST_BASE_URL HTTP reachability before fetching.
  • Early keyboard and debug shell added to config/init as described above.
  • Branding enforces passwordless root via passwd -d -R inside initramfs finalization, avoiding direct edits of passwd/shadow files.
  • Directory paths normalized to absolute after loading config to avoid CWD-sensitive behavior.
  • Container image contains shadow suite to ensure passwd/chpasswd availability; perl removed.

File Pointers (quick jump)

Change Log

Roadmap / TODO (tracked in tool todo list)

  • Zosception (zinit service graph and ordering)

  • Add zosstorage to initramfs

  • RFS blob store backends (design + docs; http and s3 exist)

    • Current S3 store URI construction: bash.rfs_common_build_s3_store_uri()
    • Flist manifest store patching: bash.rfs_common_patch_flist_stores()
    • Route URL patching: bash.rfs_common_patch_flist_route_url()
    • Packers entrypoints:
    • Proposed additional backend: RESP/DB-style store
      • Goal: Allow rfs to push/fetch content-addressed blobs via a RESP-compatible endpoint (e.g., Redis/KeyDB/Dragonfly-like), or a thin HTTP/RESP adapter.
      • Draft URI scheme examples:
        • resp://host:port/db?tls=0&prefix=blobs
        • resp+tls://host:port/db?prefix=blobs&ca=/etc/ssl/certs/ca.pem
        • resp+sentinel://sentinelHost:26379/mymaster?prefix=blobs
      • Minimum operations:
        • PUT blob: SETEX prefix/ab/cd/hash ttl file-bytes or HSET prefix/hash data file-bytes
        • GET blob: GET or HGET
        • HEAD/exists: EXISTS
        • Optional: pipelined/mget for batch prefetch
      • Client integration layers:
        • Pack-time: extend rfs CLI store resolver (design doc first; scripts/rfs/common.sh can map scheme→uploader if CLI not ready).
        • Manifest post-process: still supported; stores table may include multiple URIs (s3 + resp) for redundancy.
      • Caching and retries:
        • Local on-disk cache under dist/.rfs-cache keyed by hash with LRU GC.
        • Exponential backoff on GET failures; fall back across stores in order.
      • Auth:
        • RESP: optional username/password in URI; TLS with cert pinning parameters.
        • Keep secrets in config/rfs.conf or env; do not embed write creds in manifests (read-credential routes only).
      • Deliverables:
        • Design section in docs/rfs-flists.md (to be added)
        • Config keys in config/rfs.conf.example for RESP endpoints
        • Optional shim uploader script if CLI support lags.
  • Documentation refresh tasks
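
To make the proposed RESP backend more concrete, the key layout and the retry/fallback policy from the roadmap item can be sketched as below. Everything here is illustrative: blob_key, backoff_delay, and fetch_with_fallback are hypothetical names, reading "ab/cd" as the first two hex byte pairs of the hash is an assumption, and fetch_one stands in for whatever per-store fetcher the eventual design provides.

```bash
#!/usr/bin/env bash
# Sketch only: all function names are hypothetical; fetch_one <store> <key>
# must be supplied by the caller.

# Build a sharded key of the form prefix/ab/cd/hash (assumed layout).
blob_key() {
    local prefix="$1" hash="$2"
    printf '%s/%s/%s/%s\n' "$prefix" "${hash:0:2}" "${hash:2:2}" "$hash"
}

# Delay in seconds before retry N (0-based): base * 2^N, capped at max.
backoff_delay() {
    local attempt="$1" base="${2:-1}" max="${3:-30}"
    local delay=$(( base << attempt ))
    if [ "$delay" -gt "$max" ]; then delay="$max"; fi
    echo "$delay"
}

# Try each store in order; retry a store up to RETRIES times with
# exponential backoff before falling back to the next one.
fetch_with_fallback() {
    local key="$1"; shift
    local retries="${RETRIES:-3}" store attempt
    for store in "$@"; do
        for ((attempt = 0; attempt < retries; attempt++)); do
            fetch_one "$store" "$key" && return 0
            if (( attempt < retries - 1 )); then
                sleep "$(backoff_delay "$attempt" "${BACKOFF_BASE:-1}")"
            fi
        done
    done
    return 1
}
```

Listing stores in priority order (for example an s3 URI followed by a resp URI) gives the redundancy described for the manifest stores table.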

Diagnostics-first reminder