Replace hardcoded init.krun with generic virtual file overlay#673
Draft
mtjhrc wants to merge 13 commits into
Move the init binary build script and include_bytes!() from the devices crate into a new init-blob crate. The passthrough modules reference the binary as init_blob::INIT_BINARY instead of using include_bytes! directly. build.rs based on code from containers#593. Co-authored-by: Geoffrey Goodman <geoff@goodman.dev> Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
Replace the private next_inode AtomicU64 inside PassthroughFs with a shared InodeAllocator that is passed in at construction. This lets multiple layers (e.g. a future virtual-inode overlay) allocate from the same counter without implicit coordination via reserved ranges. PassthroughFs::new() and PassthroughFsRo::new() now take an Arc<InodeAllocator> parameter. FsWorker::new() creates the allocator and passes it through. Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
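A minimal sketch of the shared-counter idea, assuming the allocator is a thin wrapper around an AtomicU64. The method names (new, alloc) and the two stand-in layer types are hypothetical and are not taken from the PR; only the Arc<InodeAllocator> constructor parameter comes from the commit message.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical shape of the shared allocator; the real InodeAllocator may differ.
pub struct InodeAllocator {
    next: AtomicU64,
}

impl InodeAllocator {
    /// `first` is the first inode number handed out (assumed to skip the FUSE root, 1).
    pub fn new(first: u64) -> Self {
        Self { next: AtomicU64::new(first) }
    }

    /// Returns a fresh inode number; safe to call from several layers concurrently.
    pub fn alloc(&self) -> u64 {
        self.next.fetch_add(1, Ordering::Relaxed)
    }
}

/// Stand-in for a passthrough layer that receives the allocator at construction.
pub struct Passthrough {
    inodes: Arc<InodeAllocator>,
}

/// Stand-in for an overlay layer sharing the same counter.
pub struct Overlay {
    inodes: Arc<InodeAllocator>,
}

impl Passthrough {
    pub fn new(inodes: Arc<InodeAllocator>) -> Self {
        Self { inodes }
    }
    pub fn new_inode(&self) -> u64 {
        self.inodes.alloc()
    }
}

impl Overlay {
    pub fn new(inodes: Arc<InodeAllocator>) -> Self {
        Self { inodes }
    }
    pub fn new_inode(&self) -> u64 {
        self.inodes.alloc()
    }
}
```

Because both layers draw from one counter, no inode ranges need to be reserved up front: interleaved allocations from either layer simply yield consecutive, distinct numbers.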
Introduce AugmentFs<T>, a generic overlay that wraps any FileSystem implementation and intercepts FUSE operations for virtual inodes — synthetic read-only files backed by static data. One-shot files can only be looked up once. The overlay uses the shared InodeAllocator to assign inode numbers, so virtual and passthrough inodes never collide. Remove all init.krun special-case code (init_inode, init_handle, INIT_CSTR, init_payload) from both the Linux and macOS passthrough implementations. The init.krun virtual file is now configured via VirtualEntry in the krun API layer and handled generically by the overlay. FsDeviceConfig carries a Vec<VirtualEntry> and FsWorker wraps AugmentFs<PassthroughFs> / AugmentFs<PassthroughFsRo>. Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
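The one-shot lookup behaviour can be illustrated with a heavily simplified model. This is a sketch under stated assumptions, not the PR's implementation: the SimpleFs trait stands in for the real FUSE FileSystem trait, and add_virtual, data, and EmptyFs are hypothetical names. The sketch assumes a one-shot entry is dropped from the name table on its first lookup while its data stays reachable by inode.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Greatly simplified stand-in for the FUSE FileSystem trait (hypothetical).
pub trait SimpleFs {
    fn lookup(&self, name: &str) -> Option<u64>;
}

/// Inner filesystem with no entries, used only for demonstration.
pub struct EmptyFs;

impl SimpleFs for EmptyFs {
    fn lookup(&self, _name: &str) -> Option<u64> {
        None
    }
}

struct VirtualFile {
    data: &'static [u8],
    one_shot: bool,
}

/// Overlay that answers lookups for virtual names before delegating to `inner`.
pub struct AugmentFs<T: SimpleFs> {
    inner: T,
    by_name: Mutex<HashMap<String, u64>>,
    by_inode: HashMap<u64, VirtualFile>,
}

impl<T: SimpleFs> AugmentFs<T> {
    pub fn new(inner: T) -> Self {
        Self {
            inner,
            by_name: Mutex::new(HashMap::new()),
            by_inode: HashMap::new(),
        }
    }

    /// Registers a virtual read-only file under `name` with a pre-allocated inode.
    pub fn add_virtual(&mut self, name: &str, inode: u64, data: &'static [u8], one_shot: bool) {
        self.by_name.get_mut().unwrap().insert(name.to_string(), inode);
        self.by_inode.insert(inode, VirtualFile { data, one_shot });
    }

    /// Virtual names win over the inner filesystem; one-shot names vanish after one hit.
    pub fn lookup(&self, name: &str) -> Option<u64> {
        let mut names = self.by_name.lock().unwrap();
        let found = names.get(name).copied();
        if let Some(inode) = found {
            if self.by_inode[&inode].one_shot {
                names.remove(name);
            }
            return Some(inode);
        }
        drop(names);
        self.inner.lookup(name)
    }

    /// Backing bytes of a virtual inode, still available after a one-shot lookup.
    pub fn data(&self, inode: u64) -> Option<&'static [u8]> {
        self.by_inode.get(&inode).map(|f| f.data)
    }
}
```

In this model a second lookup of a one-shot name falls through to the inner filesystem, which is why init.krun appears to be gone once it has been consumed.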
Add API to prevent the default init binary (/init.krun) from being
injected into the root filesystem. Follows the existing
krun_disable_implicit_{console,vsock} pattern.
Must be called before krun_set_root().
Assisted-by: OpenCode:claude-opus-4.6
Signed-off-by: Matej Hrica <mhrica@redhat.com>
Add C API to inject arbitrary virtual files into a virtiofs device. The file appears in the root directory of the specified mount and is backed entirely by host memory. Supports one-shot semantics (the file can only be looked up once). The data pointer follows the same lifetime contract as other krun APIs: the caller must keep the memory valid until krun_start_enter() returns. Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
Add API to retrieve the built-in default init binary. Callers that use krun_disable_implicit_init() can use this to obtain the init binary and inject it themselves via krun_fs_add_overlay_file(). Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
NullFs implements the FileSystem trait with just an empty root directory. It can be wrapped with AugmentFs to serve virtual files without any host directory involvement. Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
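As a rough illustration of what "just an empty root directory" means, here is a sketch against a hypothetical two-method trait (MiniFs). The real FileSystem trait has many more operations and different signatures; the errno constant and the root inode value follow the usual Linux/FUSE conventions.

```rust
/// FUSE reserves inode 1 for the root directory (FUSE_ROOT_ID).
pub const ROOT_INODE: u64 = 1;
/// errno value for "no such file or directory".
pub const ENOENT: i32 = 2;

/// Hypothetical, drastically reduced filesystem trait used only for this sketch.
pub trait MiniFs {
    fn lookup(&self, parent: u64, name: &str) -> Result<u64, i32>;
    fn readdir(&self, inode: u64) -> Result<Vec<String>, i32>;
}

/// A filesystem that contains nothing except its root directory.
pub struct NullFs;

impl MiniFs for NullFs {
    /// No name ever resolves: the root directory has no children.
    fn lookup(&self, _parent: u64, _name: &str) -> Result<u64, i32> {
        Err(ENOENT)
    }

    /// Listing the root yields an empty directory; any other inode does not exist.
    fn readdir(&self, inode: u64) -> Result<Vec<String>, i32> {
        if inode == ROOT_INODE {
            Ok(Vec::new())
        } else {
            Err(ENOENT)
        }
    }
}
```

Wrapped in an overlay, such a filesystem contributes only the root inode, and every visible file comes from the virtual entries.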
krun_set_root_disk_remount no longer creates a temporary empty host directory. Instead it configures a NullFs-backed virtiofs device (shared_dir: None) with init.krun overlaid via AugmentFs. Fs::new() now accepts Option<String> for shared_dir — None selects NullFs. FsDeviceConfig and FsServer gain the corresponding variants. Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
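The Option-based selection might look roughly like the following. The Backend enum and select_backend function are hypothetical names invented for this sketch; the PR only states that Fs::new() takes Option<String> and that None selects NullFs.

```rust
/// Hypothetical description of which filesystem backs a virtiofs device.
#[derive(Debug, PartialEq)]
pub enum Backend {
    /// Serve files from a host directory.
    Passthrough { shared_dir: String },
    /// No host directory: only an empty root plus any overlaid virtual files.
    Null,
}

/// Maps the optional shared directory to a backend, mirroring "None selects NullFs".
pub fn select_backend(shared_dir: Option<String>) -> Backend {
    match shared_dir {
        Some(dir) => Backend::Passthrough { shared_dir: dir },
        None => Backend::Null,
    }
}
```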
The temporary root directory hack is gone (replaced by NullFs), so the ioctl that cleaned it up and the config flag that gated it are no longer needed. Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
The exit-code ioctl is a krun mechanism, not a filesystem operation. Move it to AugmentFs, where it is handled before any delegation to the inner filesystem. The Linux passthrough retains only EXPORT_FD (which needs access to passthrough-internal handle and export tables). The macOS passthrough no longer implements ioctl at all (the trait default returns ENOSYS for any cmd that reaches it). Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
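The layering described here can be sketched as follows. The trait, the command numbers, and the type names are all hypothetical and chosen for illustration; only the overall shape (overlay handles the exit code first, inner filesystem sees everything else, trait default is ENOSYS) comes from the commit message.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

/// ENOSYS as defined on Linux; used here purely as an example errno.
pub const ENOSYS: i32 = 38;
/// Hypothetical command numbers; the real ioctl encodings are not shown in the PR text.
pub const CMD_EXIT_CODE: u32 = 0x1;
pub const CMD_EXPORT_FD: u32 = 0x2;

/// Reduced stand-in for the filesystem trait: ioctl defaults to ENOSYS.
pub trait IoctlFs {
    fn ioctl(&self, _cmd: u32, _arg: i32) -> Result<(), i32> {
        Err(ENOSYS)
    }
}

/// Stand-in for the macOS passthrough: relies entirely on the trait default.
pub struct MacPassthrough;
impl IoctlFs for MacPassthrough {}

/// Stand-in for the Linux passthrough: only the export command is implemented.
pub struct LinuxPassthrough;
impl IoctlFs for LinuxPassthrough {
    fn ioctl(&self, cmd: u32, _arg: i32) -> Result<(), i32> {
        match cmd {
            CMD_EXPORT_FD => Ok(()),
            _ => Err(ENOSYS),
        }
    }
}

/// Overlay that consumes the exit-code command before delegating anything else.
pub struct Augment<T> {
    inner: T,
    exit_code: AtomicI32,
}

impl<T> Augment<T> {
    pub fn new(inner: T) -> Self {
        Self { inner, exit_code: AtomicI32::new(0) }
    }
    pub fn exit_code(&self) -> i32 {
        self.exit_code.load(Ordering::Relaxed)
    }
}

impl<T: IoctlFs> IoctlFs for Augment<T> {
    fn ioctl(&self, cmd: u32, arg: i32) -> Result<(), i32> {
        if cmd == CMD_EXIT_CODE {
            // Handled in the overlay; the inner filesystem never sees this command.
            self.exit_code.store(arg, Ordering::Relaxed);
            return Ok(());
        }
        self.inner.ioctl(cmd, arg)
    }
}
```

With this arrangement the exit code works identically on top of any inner filesystem, including one that implements no ioctl at all.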
Boot a VM with a pure NullFs root — no host directory at all. Every
file in the root (init.krun, guest-agent, .krun_config.json, test
data) is injected as a virtual overlay, and /dev, /proc, /sys are
virtual empty directories used as mount points.
The guest verifies:
- One-shot files (init.krun, guest-agent, .krun_config.json) are
gone after being consumed
- Persistent files (marker.txt, testdata.bin) survive and are
re-readable
- Write access to virtual files is denied (EACCES)
- stat reports correct sizes
- Range reads at various offsets return correct data
- Read past EOF returns zero bytes
Assisted-by: OpenCode:claude-opus-4.6
Signed-off-by: Matej Hrica <mhrica@redhat.com>
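The range-read and read-past-EOF checks correspond to clamping the requested window to the backing buffer. A minimal sketch, assuming the virtual file is a plain byte slice; read_at is a hypothetical helper, not a function from the PR.

```rust
/// Returns the bytes of `data` in `[offset, offset + size)`, clamped to the buffer.
/// An offset at or beyond the end yields an empty slice (a zero-byte read).
pub fn read_at(data: &[u8], offset: u64, size: usize) -> &[u8] {
    let len = data.len();
    // Offsets that do not fit in usize are treated as past the end.
    let start = usize::try_from(offset).map_or(len, |o| o.min(len));
    // saturating_add avoids overflow for very large requested sizes.
    let end = start.saturating_add(size).min(len);
    &data[start..end]
}
```

A short read near the end of the file returns only the bytes that exist, and a read starting at or after EOF returns nothing rather than an error.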
Boot from an ext4 block device via krun_set_root_disk_remount. The virtiofs root uses NullFs with init.krun and virtual mount-point directories overlaid. The guest verifies it pivoted to the block device root successfully. Assisted-by: OpenCode:claude-opus-4.6 Signed-off-by: Matej Hrica <mhrica@redhat.com>
No comments on the code but I really love the direction!
This PR replaces the hardcoded init.krun handling in the virtiofs passthrough backends with a generic virtual-files overlay (AugmentFs).

This introduces 2 new filesystem trait implementations:
- AugmentFs<T>, a wrapper that intercepts FUSE operations for virtual inodes: synthetic read-only files/directories backed by static data. It also handles our custom ioctls.
- NullFs, a minimal FileSystem impl with just an empty root directory, used when no host directory is needed.

init.krun is now registered as just another virtual file from the API layer. As a bonus, you can even inject .krun_config.json as a virtual file.

krun_set_root_disk_remount() is reimplemented via NullFs + AugmentFs (see #551 (comment)).

The public API is still mostly compatible. There are minor differences, such as init.krun disappearing after it has been looked up once.

API-breaking changes, such as applying krun_disable_implicit_init() and the other disable_implicit_* calls by default, will come in a follow-up PR.

The init binary is now in its own init-blob crate. The direction for #634 (2.0 API) is to invert the dependency: init-blob would depend on libkrun's overlay APIs to inject itself, rather than libkrun depending on a specific init.
This supersedes #593 by @ggoodman, which tackled the same problem of decoupling init from the fs backends. This PR takes that idea further by removing awareness of init from the filesystem layer entirely - it's just another virtual file. #593 also introduced InitPolicy startup validation - how that fits into the 2.0 API (#634) with different payload types is still an open question.
Known limitations / future work: