StateStore and ImageCache Lack Synchronization #12

Closed
opened 2026-02-22 08:41:29 +00:00 by rawan · 1 comment
Member

Location:

  • crates/chvm-lib/src/vm/manager.rs:25-128 - StateStore
  • crates/chvm-lib/src/oci/cache.rs:35-182 - ImageCache

Issue:
Both StateStore and ImageCache keep an in-memory HashMap that is read, modified, and then persisted to disk with no internal locking. Concurrent operations (e.g., vm start and vm create in different terminals) can cause:

  1. Lost updates to the state/index
  2. Corrupted JSON files
  3. Duplicate or missing VM entries

Example Scenario:

Thread 1: Load state.json (contains VM1, VM2)
Thread 2: Load state.json (contains VM1, VM2)
Thread 1: Add VM3, save state.json (now has VM1, VM2, VM3)
Thread 2: Add VM4, save state.json (now has VM1, VM2, VM4 - VM3 is lost!)
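
The interleaving above can be replayed deterministically. This is only a minimal sketch of the read-modify-write pattern, not the project's code: `Disk`, `add_and_save`, and `lost_update_demo` are hypothetical names, and an in-memory set stands in for the JSON file on disk.

```rust
use std::collections::BTreeSet;

/// Stand-in for the on-disk file (assumption: the real store persists JSON).
type Disk = BTreeSet<String>;

/// Simulates one writer: it loaded `snapshot` earlier, adds `vm`, and then
/// overwrites the whole file with its own in-memory copy.
fn add_and_save(disk: &mut Disk, mut snapshot: Disk, vm: &str) {
    snapshot.insert(vm.to_string());
    *disk = snapshot; // whole-file overwrite: no lock, no reload
}

/// Replays the scenario and returns the final file contents in sorted order.
fn lost_update_demo() -> Vec<String> {
    let mut disk: Disk = ["VM1", "VM2"].iter().map(|s| s.to_string()).collect();

    let snapshot_1 = disk.clone(); // writer 1 loads (VM1, VM2)
    let snapshot_2 = disk.clone(); // writer 2 loads (VM1, VM2)

    add_and_save(&mut disk, snapshot_1, "VM3"); // file: VM1, VM2, VM3
    add_and_save(&mut disk, snapshot_2, "VM4"); // file: VM1, VM2, VM4

    disk.into_iter().collect()
}
```

Calling `lost_update_demo()` yields VM1, VM2, VM4: the second writer saves a snapshot taken before VM3 was added, so VM3 disappears.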
Member

After investigation:

StateStore issue

  • The example scenario is not valid: each VM has its own state.json, and the in-memory HashMap is per-process, so two separate chvm create CLI invocations each have their own VmManager instance with their own HashMap. There is no shared mutable state between processes.
  • The only real issue is two processes operating on the same VM concurrently (e.g. chvm stop vm1 and chvm start vm1 at the same time).

ImageCache issue

  • The ImageCache uses a single shared index.json file. Two concurrent chvm pull or chvm create commands that both trigger image pulls could genuinely corrupt or lose entries in index.json.

Work completed in PR:

  • ImageCache: added exclusive lock + reload-before-modify on all mutations (add, remove, ensure_rootfs), shared lock on reads (get), and atomic writes for index.json
  • StateStore: added the same protections as a defensive measure. Also added reload_state() so resolve_clone() and update_status() re-read from disk before modifying, which guards against the edge case of concurrent operations on the same VM
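
The lock, reload-before-modify, and atomic-write steps can be sketched roughly as follows. This is not the code from the PR. To stay within the standard library it swaps in a lock file created with `create_new` instead of a real OS file lock, and it stores one entry per line instead of JSON. `LockFile`, `load`, `save_atomic`, and `add_entry` are hypothetical names, and the shared lock on reads is not shown.

```rust
use std::collections::BTreeSet;
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};
use std::thread;
use std::time::Duration;

/// Hypothetical guard: owns the lock file and removes it when dropped.
struct LockFile(PathBuf);

impl LockFile {
    fn acquire(path: PathBuf) -> io::Result<Self> {
        loop {
            // create_new fails with AlreadyExists while another holder exists.
            match OpenOptions::new().write(true).create_new(true).open(&path) {
                Ok(_) => return Ok(LockFile(path)),
                Err(e) if e.kind() == io::ErrorKind::AlreadyExists => {
                    thread::sleep(Duration::from_millis(5));
                }
                Err(e) => return Err(e),
            }
        }
    }
}

impl Drop for LockFile {
    fn drop(&mut self) {
        let _ = fs::remove_file(&self.0);
    }
}

/// Reads the index from disk; a missing file counts as an empty index.
fn load(index: &Path) -> io::Result<BTreeSet<String>> {
    match fs::read_to_string(index) {
        Ok(text) => Ok(text.lines().map(str::to_owned).collect()),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(BTreeSet::new()),
        Err(e) => Err(e),
    }
}

/// Writes to a temporary sibling file, then renames it over the index so
/// readers never observe a half-written file.
fn save_atomic(index: &Path, entries: &BTreeSet<String>) -> io::Result<()> {
    let tmp = index.with_extension("tmp");
    let body: String = entries.iter().map(|e| format!("{e}\n")).collect();
    fs::write(&tmp, body)?;
    fs::rename(&tmp, index)
}

/// Mutation path: take the lock, reload from disk, modify, write atomically.
fn add_entry(index: &Path, entry: &str) -> io::Result<()> {
    let _guard = LockFile::acquire(index.with_extension("lock"))?;
    let mut entries = load(index)?; // reload-before-modify
    entries.insert(entry.to_owned());
    save_atomic(index, &entries)
}
```

One trade-off of the lock-file stand-in: if a process crashes while holding it, the stale file blocks later writers until it is removed by hand, whereas an OS-level file lock is released when the process exits.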
thabeta added this to the ACTIVE project 2026-03-12 10:53:31 +00:00
Reference
geomind_code/my_hypervisor#12