whp: implement safe wrappers for VM, vCPU, and instruction emulation #665

Open
lstocchi wants to merge 1 commit into containers:main from lstocchi:whp_crate

Conversation

@lstocchi
Contributor

@lstocchi lstocchi commented May 5, 2026

This commit introduces the foundational Rust abstractions for the Windows Hypervisor Platform (WHP) backend. I took inspiration from the hvf crate (you'll find WhpVm, WhpVcpu, and WhpEmulator).

Key features include:

  • Partition & vCPU Management: Safe wrappers for partition configuration, CPUID masking, memory mapping, and WHvRunVirtualProcessor exit routing.
  • Register I/O: Introduces const-generic helpers (get_registers<N> and set_registers<N>) to guarantee stack-allocated, zero-overhead register reads/writes on the VM-exit hot path.
  • Robust TSC Calibration: Implements accurate host TSC frequency detection via CPUID 0x15/0x16, with a reliable QueryPerformanceCounter (QPC) calibration fallback for AMD/older hardware.
  • Emulation: Integrates WHP's built-in x86 emulator for MMIO and PIO handling.
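
To illustrate the register I/O bullet above, here is a minimal sketch of how const-generic helpers can keep register batches on the stack. It does not reproduce the PR's code: `RegName`, `RegisterBackend`, and `FakeVcpu` are hypothetical stand-ins for the WHP register names and FFI calls, and only the helper names `get_registers` and `set_registers` come from the description.

```rust
/// Hypothetical stand-in for a WHP register name (the real crate would use
/// the register-name values from its FFI bindings).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RegName {
    Rax,
    Rbx,
    Rip,
}

/// Hypothetical backend trait; in the real code this role would be played
/// by the raw WHP get/set register calls.
pub trait RegisterBackend {
    fn read_raw(&self, names: &[RegName], out: &mut [u64]);
    fn write_raw(&mut self, names: &[RegName], values: &[u64]);
}

/// Reads `N` registers into a fixed-size array. Because `N` is a const
/// generic, the result is a stack array and no `Vec` is allocated.
pub fn get_registers<B: RegisterBackend, const N: usize>(
    backend: &B,
    names: &[RegName; N],
) -> [u64; N] {
    let mut values = [0u64; N];
    backend.read_raw(names, &mut values);
    values
}

/// Writes `N` registers; the array types force names and values to have
/// the same length at compile time.
pub fn set_registers<B: RegisterBackend, const N: usize>(
    backend: &mut B,
    names: &[RegName; N],
    values: &[u64; N],
) {
    backend.write_raw(names, values);
}

/// Toy in-memory backend used only to exercise the helpers.
#[derive(Default)]
pub struct FakeVcpu {
    regs: [u64; 3],
}

impl RegisterBackend for FakeVcpu {
    fn read_raw(&self, names: &[RegName], out: &mut [u64]) {
        for (slot, name) in out.iter_mut().zip(names) {
            *slot = self.regs[*name as usize];
        }
    }

    fn write_raw(&mut self, names: &[RegName], values: &[u64]) {
        for (name, value) in names.iter().zip(values) {
            self.regs[*name as usize] = *value;
        }
    }
}
```

A caller can destructure the result directly, for example `let [rax, rip] = get_registers(&vcpu, &[RegName::Rax, RegName::Rip]);`, and a mismatch between the number of names and values is rejected by the compiler rather than at run time.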

In a follow-up PR I'll add the Hyper-V enlightenments, as this one is already big.

@mtjhrc
Collaborator

mtjhrc commented May 6, 2026

/gemini review


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a Windows Hypervisor Platform (WHP) backend for libkrun, implementing core functionality for VM partition management, vCPU execution, memory mapping, and instruction emulation. It also includes a mechanism for detecting host TSC frequency using CPUID or the Windows High-Resolution Performance Counter. Feedback was provided to improve the robustness of the CPUID-based frequency detection by verifying the maximum supported leaf before accessing specific leaves, ensuring better compatibility across various CPU architectures.
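
The max-leaf check suggested in this review could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: the function name `tsc_hz_from_cpuid` and the `CpuidRegs` alias are hypothetical, and the CPUID query is passed in as a closure so the logic can be exercised without real hardware. The leaf 0x16 value is the processor base frequency, which is only an approximation of the TSC frequency.

```rust
/// Result of one CPUID query: [EAX, EBX, ECX, EDX].
pub type CpuidRegs = [u32; 4];

/// Tries to derive the TSC frequency in Hz from CPUID leaves 0x15 and 0x16.
/// `cpuid` is injected (hypothetical design choice) so the logic is testable.
pub fn tsc_hz_from_cpuid(cpuid: impl Fn(u32) -> CpuidRegs) -> Option<u64> {
    // Leaf 0 reports the highest supported basic leaf in EAX.
    let max_leaf = cpuid(0)[0];

    if max_leaf >= 0x15 {
        // Leaf 0x15: EAX = denominator, EBX = numerator, ECX = crystal Hz.
        let [denominator, numerator, crystal_hz, _] = cpuid(0x15);
        // A zero field means the leaf does not fully enumerate the ratio.
        if denominator != 0 && numerator != 0 && crystal_hz != 0 {
            return Some(crystal_hz as u64 * numerator as u64 / denominator as u64);
        }
    }

    if max_leaf >= 0x16 {
        // Leaf 0x16: EAX[15:0] is the processor base frequency in MHz.
        let base_mhz = (cpuid(0x16)[0] & 0xffff) as u64;
        if base_mhz != 0 {
            return Some(base_mhz * 1_000_000);
        }
    }

    // Neither leaf is usable; the caller falls back to a measured calibration.
    None
}
```

On an x86_64 host the closure would wrap the `__cpuid` intrinsic from `core::arch::x86_64`; a `None` result is the point where a QPC-based measurement would take over.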

Comment thread src/whp/src/lib.rs Outdated
@lstocchi lstocchi force-pushed the whp_crate branch 2 times, most recently from dd5b990 to 08ec97b Compare May 6, 2026 12:25
@mtjhrc
Collaborator

mtjhrc commented May 6, 2026

GitHub is broken so I can't tag the specific line in the code ("internal error"). Anyway, here it is:

// CPUID 0x15 — TSC / Core Crystal Clock (Intel SDM)
// TSC frequency in Hz = ECX * (EBX / EAX).
// We use a 1 Hz crystal with EBX = tsc_freq_hz to avoid rounding.
cpuid_results.push(WHV_X64_CPUID_RESULT {
    Function: 0x15,
    Reserved: [0; 3],
    Eax: 1,
    Ebx: tsc_freq_hz as u32,
    Ecx: 1,
    Edx: 0,
});

I ran this PR through Claude Opus 4.6 and GPT-5.4 — both independently flagged tsc_freq_hz as u32 as a silent truncation on CPUs with base clocks above ~4.295 GHz. I am not really familiar with this but the problem checks out. Is this a real issue?


detect_tsc_frequency() returns a u64 in Hz (via CPUID 0x15, 0x16, or QPC measurement), but the synthetic CPUID 0x15 response stuffs the entire value into EBX which is a u32. Any host TSC frequency above u32::MAX (~4.295 GHz) silently wraps.
This is most likely to cause an issue on AMD Zen, where the TSC ticks at the P0 (base clock) frequency [1].
For example, the Ryzen 9 9900X has a 4.4 GHz base clock, which given the above means a TSC frequency of ~4,400,000,000 Hz. That overflows u32, so the guest would see ~105 MHz instead - breaking timekeeping.

Footnotes

  1. confirmed by the Linux kernel checking TscFreqSel (bit 24 of MSR_K7_HWCR)
    https://github.com/torvalds/linux/blob/adc1e5c6203cf13fe05a1ead08edcb3d3a3baae8/arch/x86/kernel/cpu/amd.c#L425-L433
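
One possible way to avoid the truncation described above is sketched here. This is an assumption about how such a fix could look, not necessarily what the PR ended up doing: it reports a synthetic 1 kHz crystal in ECX so that EBX only has to hold the frequency in kHz. The names `encode_cpuid_15`, `decode_cpuid_15`, and `CRYSTAL_HZ` are hypothetical.

```rust
use std::convert::TryFrom;

/// Assumed synthetic crystal frequency reported in ECX (1 kHz).
const CRYSTAL_HZ: u32 = 1_000;

/// Returns (EAX, EBX, ECX) for a synthetic CPUID 0x15 leaf, where the guest
/// computes TSC Hz = ECX * EBX / EAX. Returns `None` if the frequency is
/// below 1 kHz or too large to express, instead of silently wrapping.
pub fn encode_cpuid_15(tsc_freq_hz: u64) -> Option<(u32, u32, u32)> {
    // The sub-kHz remainder is dropped here, trading precision for range.
    let ratio = u32::try_from(tsc_freq_hz / CRYSTAL_HZ as u64).ok()?;
    if ratio == 0 {
        return None;
    }
    Some((1, ratio, CRYSTAL_HZ))
}

/// What a guest following the leaf 0x15 formula would compute.
pub fn decode_cpuid_15(eax: u32, ebx: u32, ecx: u32) -> u64 {
    ecx as u64 * ebx as u64 / eax as u64
}
```

With the 4.4 GHz example, `tsc_freq_hz as u32` yields 105,032,704, while `encode_cpuid_15(4_400_000_000)` yields `(1, 4_400_000, 1_000)`, which decodes back to 4,400,000,000 Hz. The cost is that any sub-kHz part of the measured frequency is lost.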

@lstocchi lstocchi force-pushed the whp_crate branch 2 times, most recently from bce6905 to 4d31a48 Compare May 7, 2026 16:07
@lstocchi
Contributor Author

lstocchi commented May 7, 2026

@mtjhrc you're right, and yes, it could potentially be a problem. Fixed it.

In my mind we would never use this code because I am going to enable Hyper-V enlightenments in the next PRs and I expect modern Linux to detect and rely on them. Hyper-V enlightenments are features that, once detected, allow the guest to say "hey, I'm a virtual machine and I'm talking to a Windows hypervisor"... and one of these features is the Reference TSC Page.

Instead of the guest trying to "guess" the hardware frequency or worrying about whether the physical TSC is stable across CPU cores, the hypervisor provides a shared memory page. This page contains a scale and offset that the guest uses to calculate time mathematically: Time = (TSC * Scale) + Offset. Because this paravirtualized clock is more reliable and faster (it avoids MSR reads), modern kernels will prioritize it and effectively ignore the manual frequency detection logic we just fixed.
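
As a rough sketch of that calculation: in the Hyper-V reference TSC scheme the scale is a 64.64 fixed-point factor, so the product is shifted right by 64 bits before the offset is added, and the result is in 100 ns units. The struct and function names below are hypothetical, and the page's sequence field (which the guest checks before trusting the values) is left out for brevity.

```rust
use std::convert::TryFrom;

/// Hypothetical view of the two fields the guest reads from the page.
pub struct ReferenceTscPage {
    pub tsc_scale: u64,
    pub tsc_offset: i64,
}

/// Reference time in 100 ns units: ((tsc * scale) >> 64) + offset.
pub fn reference_time_100ns(page: &ReferenceTscPage, tsc: u64) -> u64 {
    // Widen to 128 bits so the multiplication cannot overflow.
    let scaled = ((tsc as u128 * page.tsc_scale as u128) >> 64) as u64;
    // Two's-complement add applies a negative offset correctly.
    scaled.wrapping_add(page.tsc_offset as u64)
}

/// Scale a hypervisor could publish for a given TSC frequency:
/// (10^7 << 64) / tsc_freq_hz, i.e. 100 ns ticks per TSC tick in 64.64
/// fixed point. Returns `None` if the frequency is zero or so low that
/// the scale does not fit in 64 bits.
pub fn scale_for_frequency(tsc_freq_hz: u64) -> Option<u64> {
    if tsc_freq_hz == 0 {
        return None;
    }
    let scale = (10_000_000u128 << 64) / tsc_freq_hz as u128;
    u64::try_from(scale).ok()
}
```

For a 2.56 GHz TSC the scale works out to exactly 2^56, so one second's worth of ticks (2,560,000,000) maps to 10,000,000 units of 100 ns. The guest needs only a TSC read, a multiply, a shift, and an add.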

However, if a user runs a kernel without the Hyper-V modules, or some older guest, we still need the fallback code to work properly. So nice catch. 👍

@mediouni-m

mediouni-m commented May 10, 2026

Hi, QEMU whpx maintainer here adding some comments

> In my mind we would never use this code because I am going to enable Hyper-V enlightenments in next PRs

Good. Supporting Windows 11 onwards (and leaving behind Windows 10, which doesn't support a bunch of this) cuts down on the pain a lot.

> Robust TSC Calibration: Implements accurate host TSC frequency detection via CPUID 0x15/0x16, with a reliable QueryPerformanceCounter (QPC) calibration fallback for AMD/older hardware.

The CPUID invariant TSC stuff isn't parsed by Linux for AMD processors unfortunately and I don't think it's worth having given that Hyper-V enlightenments will cover this anyway.

> Undocumented WHP Workarounds: Implements the clear_halt_suspend workaround using WHvRegisterInternalActivityState. This ports a known QEMU/crosvm fix that manually clears the HaltSuspend bit, preventing vCPUs from freezing when interrupts are injected via WHvRequestInterrupt while in a HLT state.

You don't need this workaround, at least for Linux use cases. It is for manually injected interrupts (i.e. not using WHvRequestInterrupt), which in practice means non-APIC interrupts. For libkrun, you'll (hopefully, or something is going wrong) have only an APIC without a legacy PIC, and Hyper-V enlightenments will then do the trick instead of the kernel trying to find a timer calibration source.

> Emulation: Integrates WHP's built-in x86 emulator for MMIO and PIO handling.

Depending on the Windows release this one is a bit variable; we migrated away from it in the QEMU 11.0 cycle. Newer Windows releases ship a winhvemulation that is backed by this codebase: https://github.com/microsoft/openvmm/tree/e42e9614f35496e28171ee5785416e1904606552/vm/x86/x86emu/src which might be worthwhile to incorporate directly.

@lstocchi
Contributor Author

> Hi, QEMU whpx maintainer here adding some comments

Hey @mediouni-m , thanks for your comments. Really appreciated

> The CPUID invariant TSC stuff isn't parsed by Linux for AMD processors unfortunately and I don't think it's worth having given that Hyper-V enlightenments will cover this anyway.

Yeah, you're right. I added a fallback way to calculate it, but I just stuff it into CPUID 0x15, which is Intel-centric. I did not have the time to look at how to expose it for AMD, to be honest. I was hoping to use another CPUID leaf, but as I'm trying to get a first version for Windows out as soon as possible, I focused on the hardware I have and plan to iterate from there.
I am never sure whether we need to add fallback functions for users running a generic Linux VM without the Hyper-V modules. I'm learning along the way, but if you say it's not worth much, I'll worry about it less.

> You don't need this workaround at least for Linux use cases. This workaround is for manually injected interrupts (ie not using WHvRequestInterrupt) which in practice means non-APIC interrupts. For libkrun, you'll (hopefully, or something is going wrong) have only an APIC without a legacy PIC. And then have Hyper-V enlightenments do the trick instead of the kernel trying to find a timer calibration source.

Thanks for the explanation 👍 I'll remove it then

> Depending on Windows release this one is a bit variable, we migrated away from it in the QEMU 11.0 cycle. Newer Windows releases ship a winhvemulation that is backed by this codebase: https://github.com/microsoft/openvmm/tree/e42e9614f35496e28171ee5785416e1904606552/vm/x86/x86emu/src which might be worthwhile to directly incorporate.

I'll take a look and update it in a follow-up PR, thanks!

This commit introduces the foundational Rust abstractions for the Windows
Hypervisor Platform (WHP) backend. It encapsulates the raw C FFI bindings
into safe, idiomatic Rust structs (`WhpVm`, `WhpVcpu`, and `WhpEmulator`).

Key features include:
- Partition & vCPU Management: Safe wrappers for partition configuration,
  CPUID masking, memory mapping, and `WHvRunVirtualProcessor` exit routing.
- High-Performance Register I/O: Introduces const-generic helpers
  (`get_registers<N>` and `set_registers<N>`) to guarantee stack-allocated,
  zero-overhead register reads/writes on the VM-exit hot path.
- Robust TSC Calibration: Implements accurate host TSC frequency detection
  via CPUID 0x15/0x16, with a reliable `QueryPerformanceCounter` (QPC)
  calibration fallback for AMD/older hardware.
- Emulation: Integrates WHP's built-in x86 emulator for MMIO and PIO handling.

Signed-off-by: lstocchi <lstocchi@redhat.com>

3 participants