# OCI Cache: Garbage collection / Layer sharing #7
Reference
geomind_code/my_hypervisor#7
The current OCI implementation in `chvm-lib` stores layers in VM-specific directories. When multiple VMs use the same image, layers are duplicated on disk, increasing storage overhead.

Proposed changes:

- Add a `prune` subcommand to remove unreferenced layers and images from the CAS.

After investigation:
**Current behavior:**

The current pull logic (`pull.rs:57,72`) stores layer blobs under per-manifest directories. This means each image downloads all of its layers independently, even if the same layer blob already exists from another image.
For images sharing a common base (for example `node:20-bookworm` and `node:22-bookworm`, both built on `debian:bookworm`), the shared base layers are downloaded and stored multiple times.

Additionally, `ensure_rootfs()` (`cache.rs:189-194`) flattens all layers into a single `rootfs/` directory per image, so even the extracted filesystem is duplicated across images that share layers.

#### Proposed approach:
1. **Content-addressable blob storage** indexed by layer digest.

2. **Skip already-downloaded layers during pull:** before downloading a layer, check whether `~/.chvm/blobs//` already exists. If it does, skip the download entirely.

3. **Update `index.json`:** replace `layer_paths` with `layer_digests`, an ordered list of layer digest references into the CAS:
   ```json
   {
     "layer_digests": ["sha256:xxx", "sha256:yyy"]
   }
   ```
4. **Overlay stacking instead of a flattened rootfs:** at VM boot, construct the overlay mount from the layer directories directly:

   ```
   lowerdir=blobs/sha256-yyy:blobs/sha256-xxx,upperdir=vm/upper,workdir=vm/work
   ```

5. **Prune:** since `index.json` tracks which digests each image uses, pruning reduces to finding blobs not referenced by any image's `layer_digests` and deleting them.
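The "skip already-downloaded layers" step could look roughly like the sketch below. The `blob_path`/`layer_cached` names, the `sha256-` directory naming, and the blobs-root argument are illustrative assumptions, not the actual chvm-lib API:

```rust
use std::path::{Path, PathBuf};

/// Map a layer digest like "sha256:abc..." to its location in the CAS.
/// The "digest with ':' replaced by '-'" naming is an assumption made
/// so the digest is filesystem-safe.
fn blob_path(blobs_root: &Path, digest: &str) -> PathBuf {
    blobs_root.join(digest.replace(':', "-"))
}

/// Returns true when the layer blob is already present on disk and the
/// download can be skipped entirely.
fn layer_cached(blobs_root: &Path, digest: &str) -> bool {
    blob_path(blobs_root, digest).exists()
}

fn main() {
    let root = PathBuf::from("/tmp/chvm-blobs");
    let digest = "sha256:deadbeef";
    if layer_cached(&root, digest) {
        println!("skip {digest}");
    } else {
        println!("download {digest}");
    }
}
```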
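The `lowerdir` line in step 4 hides an ordering detail worth making explicit: OCI manifests list layers base-first, while overlayfs treats the leftmost `lowerdir` entry as the topmost layer, so the digest list must be reversed when building the mount options. A minimal sketch (the `blobs/` path prefix and function name are assumptions):

```rust
/// Build an overlayfs lowerdir string from the ordered `layer_digests`
/// list in index.json. OCI manifests order layers base-first; overlayfs
/// wants topmost-first, hence the `.rev()`.
fn lowerdir(layer_digests: &[&str]) -> String {
    layer_digests
        .iter()
        .rev() // base-first -> topmost-first
        .map(|d| format!("blobs/{}", d.replace(':', "-")))
        .collect::<Vec<_>>()
        .join(":")
}

fn main() {
    let opts = format!(
        "lowerdir={},upperdir=vm/upper,workdir=vm/work",
        lowerdir(&["sha256:xxx", "sha256:yyy"])
    );
    println!("{opts}");
    // -> lowerdir=blobs/sha256-yyy:blobs/sha256-xxx,upperdir=vm/upper,workdir=vm/work
}
```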
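Prune as described in step 5 is a mark-and-sweep over digests. A minimal in-memory sketch, assuming the `index.json` parsing and actual file deletion happen elsewhere (the types here are illustrative, not the real chvm-lib structures):

```rust
use std::collections::HashSet;

/// Given each image's ordered layer_digests list and the set of blob
/// digests present on disk, return the digests referenced by no image.
/// These are the prune candidates.
fn unreferenced(images: &[Vec<String>], on_disk: &[String]) -> Vec<String> {
    // Mark: collect every digest referenced by any image.
    let referenced: HashSet<&String> = images.iter().flatten().collect();
    // Sweep: anything on disk but not referenced can be deleted.
    on_disk
        .iter()
        .filter(|d| !referenced.contains(d))
        .cloned()
        .collect()
}

fn main() {
    let images = vec![
        vec!["sha256:base".to_string(), "sha256:app1".to_string()],
        vec!["sha256:base".to_string(), "sha256:app2".to_string()],
    ];
    let on_disk = vec![
        "sha256:base".to_string(),
        "sha256:app1".to_string(),
        "sha256:old".to_string(),
    ];
    // "sha256:old" is referenced by no image, so it is a prune candidate.
    for digest in unreferenced(&images, &on_disk) {
        println!("would delete {digest}");
    }
}
```

Because `referenced` is rebuilt from `index.json` on every run, the sweep stays correct even if images were added or removed since the last prune.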
**Work completed:** work in progress.