upgrade: cell grace period + poll error budget + resilient reconcile #30
Summary
Rollout cells can no longer pin a rollout open indefinitely on a wedged child or transient hero_proc unreachability. Daemon startup also survives a single corrupt upgrade record instead of bailing on the whole reconcile pass.
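For reviewers, here is a rough sketch of how a per-cell grace period and a poll error budget might interact. It is not the code in this PR: `CellPoller`, `grace_period_s`, `error_budget`, the status strings, and the use of `ConnectionError` to stand in for hero_proc being unreachable are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable

TERMINAL = ("succeeded", "failed")


@dataclass
class CellPoller:
    """Hypothetical per-cell poll state (illustrative names, not from the PR)."""
    grace_period_s: float        # longest a cell may stay non-terminal
    error_budget: int            # consecutive poll failures tolerated
    started_at: float            # timestamp when the cell was started
    consecutive_errors: int = 0

    def step(self, now: float, poll: Callable[[], str]) -> str:
        """Run one poll tick and return 'succeeded', 'failed', or 'running'."""
        status = None
        try:
            status = poll()
        except ConnectionError:
            # Assumed stand-in for the process supervisor being unreachable.
            self.consecutive_errors += 1
            if self.consecutive_errors > self.error_budget:
                return "failed"  # error budget exhausted
        else:
            self.consecutive_errors = 0  # any successful poll resets the budget

        if status in TERMINAL:
            return status
        if now - self.started_at > self.grace_period_s:
            return "failed"  # wedged child: auto-fail after the grace period
        return "running"
```

Under these assumptions a wedged child is failed once `grace_period_s` elapses, and a cell whose polls keep erroring is failed after `error_budget + 1` consecutive failures, so neither case can hold the rollout open forever.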
Related Issue
Partially addresses #28 (polling drift + startup reconciliation robustness + missing grace-period auto-fail).
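The startup reconciliation robustness mentioned above can be pictured with a minimal sketch like the one below. It assumes upgrade records are stored as JSON strings keyed by ID, which this description does not state; `reconcile_records` and the `status` field are hypothetical names.

```python
import json
from typing import Dict, Iterable, List, Tuple


def reconcile_records(
    raw_records: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, dict], List[str]]:
    """Load every parseable record; collect the IDs of corrupt ones.

    A corrupt record is skipped and reported rather than aborting the pass.
    """
    loaded: Dict[str, dict] = {}
    skipped: List[str] = []
    for record_id, raw in raw_records:
        try:
            record = json.loads(raw)
            if not isinstance(record, dict) or "status" not in record:
                raise ValueError("record is missing a status field")
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError.
            skipped.append(record_id)
            continue
        loaded[record_id] = record
    return loaded, skipped
```

The point of the design is that one bad record costs only that record: the caller can log the skipped IDs and still reconcile everything else.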
Changes
Test Results
Full rollout upg_1779696877140_81382f54 reached status=succeeded with 91/91 children in succeeded phase in ~47s on the new binary.