From: Xiaoyao Li <xiaoyao.li@intel.com>
To: Mathias Krause <minipli@grsecurity.net>, qemu-devel@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
kvm@vger.kernel.org, Oliver Upton <oliver.upton@linux.dev>,
Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH] i386/kvm: Disable hypercall patching quirk by default
Date: Tue, 22 Jul 2025 11:45:13 +0800 [thread overview]
Message-ID: <41a5767e-42d7-4877-9bc8-aa8eca6dd3e3@intel.com> (raw)
In-Reply-To: <20250619194204.1089048-1-minipli@grsecurity.net>
On 6/20/2025 3:42 AM, Mathias Krause wrote:
> KVM has a weird behaviour when a guest executes VMCALL on an AMD system
> or VMMCALL on an Intel CPU. Both naturally generate an invalid opcode
> exception (#UD) as they are just the wrong instruction for the CPU
> given. But instead of forwarding the exception to the guest, KVM tries
> to patch the guest instruction to match the host's actual hypercall
> instruction. That is doomed to fail as read-only code is rather the
> standard these days. But, instead of letting go the patching attempt and
> falling back to #UD injection, KVM injects the page fault instead.
>
> That's wrong on multiple levels. Not only isn't that a valid exception
> to be generated by these instructions, confusing attempts to handle
> them. It also destroys guest state by doing so, namely the value of CR2.
>
> Sean attempted to fix that in KVM[1] but the patch was never applied.
>
> Later, Oliver added a quirk bit in [2] so the behaviour can, at least,
> conceptually be disabled. Paolo even called out to add this very
> functionality to disable the quirk in QEMU[3]. So lets just do it.
>
> A new property 'hypercall-patching=on|off' is added, for the very
> unlikely case that there are setups that really need the patching.
> However, these would be vulnerable to memory corruption attacks freely
> overwriting code as they please. So, my guess is, there are exactly 0
> systems out there requiring this quirk.
The default behavior is patching the hypercall for many years.
If you desire to change the default behavior, please at least keep it
unchanged for old machine version. i.e., introduce compat_property,
which sets KVMState->hypercall_patching_enabled to true.
> [1] https://lore.kernel.org/kvm/20211210222903.3417968-1-seanjc@google.com/
> [2] https://lore.kernel.org/kvm/20220316005538.2282772-2-oupton@google.com/
> [3] https://lore.kernel.org/kvm/80e1f1d2-2d79-22b7-6665-c00e4fe9cb9c@redhat.com/
>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Mathias Krause <minipli@grsecurity.net>
> ---
> include/system/kvm_int.h | 1 +
> qemu-options.hx | 10 ++++++++++
> target/i386/kvm/kvm.c | 38 ++++++++++++++++++++++++++++++++++++++
> 3 files changed, 49 insertions(+)
>
> diff --git a/include/system/kvm_int.h b/include/system/kvm_int.h
> index 756a3c0a250e..fd7129824429 100644
> --- a/include/system/kvm_int.h
> +++ b/include/system/kvm_int.h
> @@ -159,6 +159,7 @@ struct KVMState
> uint64_t kvm_eager_split_size; /* Eager Page Splitting chunk size */
> struct KVMDirtyRingReaper reaper;
> struct KVMMsrEnergy msr_energy;
> + bool hypercall_patching_enabled;
IMHO, we can just name it "hypercall_patching".
Since it's a boolean type, true means enabled and false means disabled.
> NotifyVmexitOption notify_vmexit;
> uint32_t notify_window;
> uint32_t xen_version;
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 1f862b19a676..c2e232649c19 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -231,6 +231,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
> " dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
> " eager-split-size=n (KVM Eager Page Split chunk size, default 0, disabled. ARM only)\n"
> " notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM exit and set notify window, x86 only)\n"
> + " hypercall-patching=on|off (enable KVM's VMCALL/VMMCALL hypercall patching quirk, x86 only)\n"
> " thread=single|multi (enable multi-threaded TCG)\n"
> " device=path (KVM device path, default /dev/kvm)\n", QEMU_ARCH_ALL)
> SRST
> @@ -313,6 +314,15 @@ SRST
> open up for a specified of time (i.e. notify-window).
> Default: notify-vmexit=run,notify-window=0.
>
> + ``hypercall-patching=on|off``
> + KVM tries to recover from the wrong hypercall instruction being used by
> + a guest by attempting to rewrite it to the one supported natively by
> + the host CPU (VMCALL on Intel, VMMCALL for AMD systems). However, this
> + patching may fail if the guest memory is write protected, leading to a
> + page fault getting propagated to the guest instead of an illegal
> + instruction exception. As this may confuse guests, it gets disabled by
> + default (x86 only).
> +
> ``device=path``
> Sets the path to the KVM device node. Defaults to ``/dev/kvm``. This
> option can be used to pass the KVM device to use via a file descriptor
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 56a6b9b6381a..6f5f3b95e553 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -3224,6 +3224,19 @@ static int kvm_vm_enable_energy_msrs(KVMState *s)
> return 0;
> }
>
> +static int kvm_vm_disable_hypercall_patching(KVMState *s)
> +{
> + int valid_quirks = kvm_vm_check_extension(s, KVM_CAP_DISABLE_QUIRKS2);
> +
> + if (valid_quirks & KVM_X86_QUIRK_FIX_HYPERCALL_INSN) {
> + return kvm_vm_enable_cap(s, KVM_CAP_DISABLE_QUIRKS2, 0,
> + KVM_X86_QUIRK_FIX_HYPERCALL_INSN);
> + }
> +
> + warn_report("kvm: disabling hypercall patching not supported");
It's not clear it's 1) KVM doesn't support/has FIX_HYPERCALL_INSN quirk
or 2) KVM has FIX_HYPERCALL_INSN quirk but doesn't allow it to be
disabled, when KVM_X86_QUIRK_FIX_HYPERCALL_INSN is not returned in
KVM_CAP_DISABLE_QUIRKS2.
If it's case 1), it can be treated as hypercall patching is disabled
thus no warning is expected.
So, I think it requires a new cap in KVM to return the enabled quirks.
> + return 0;
I think return 0 here is to avoid the warn_report() in the caller. But
for the correct semantics, we need to return -1 to indicate that it
fails to disable the hypercall patching?
> +}
> +
> int kvm_arch_init(MachineState *ms, KVMState *s)
> {
> int ret;
> @@ -3363,6 +3376,12 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> }
> }
>
> + if (s->hypercall_patching_enabled == false) {
> + if (kvm_vm_disable_hypercall_patching(s)) {
> + warn_report("kvm: failed to disable hypercall patching quirk");
> + }
> + }
> +
> return 0;
> }
>
> @@ -6456,6 +6475,19 @@ void kvm_request_xsave_components(X86CPU *cpu, uint64_t mask)
> }
> }
>
> +static bool kvm_arch_get_hypercall_patching(Object *obj, Error **errp)
> +{
> + KVMState *s = KVM_STATE(obj);
> + return s->hypercall_patching_enabled;
> +}
> +
> +static void kvm_arch_set_hypercall_patching(Object *obj, bool value,
> + Error **errp)
> +{
> + KVMState *s = KVM_STATE(obj);
> + s->hypercall_patching_enabled = value;
> +}
> +
> static int kvm_arch_get_notify_vmexit(Object *obj, Error **errp)
> {
> KVMState *s = KVM_STATE(obj);
> @@ -6589,6 +6621,12 @@ static void kvm_arch_set_xen_evtchn_max_pirq(Object *obj, Visitor *v,
>
> void kvm_arch_accel_class_init(ObjectClass *oc)
> {
> + object_class_property_add_bool(oc, "hypercall-patching",
> + kvm_arch_get_hypercall_patching,
> + kvm_arch_set_hypercall_patching);
> + object_class_property_set_description(oc, "hypercall-patching",
> + "Enable hypercall patching quirk");
> +
> object_class_property_add_enum(oc, "notify-vmexit", "NotifyVMexitOption",
> &NotifyVmexitOption_lookup,
> kvm_arch_get_notify_vmexit,
next prev parent reply other threads:[~2025-07-22 3:47 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-19 19:42 [PATCH] i386/kvm: Disable hypercall patching quirk by default Mathias Krause
2025-07-21 10:22 ` Mathias Krause
2025-07-22 3:45 ` Xiaoyao Li [this message]
2025-07-22 9:21 ` Mathias Krause
2025-07-22 10:27 ` Xiaoyao Li
2025-07-22 10:35 ` Mohamed Mediouni
2025-07-22 11:06 ` Xiaoyao Li
2025-07-22 11:16 ` Mohamed Mediouni
2025-07-22 11:15 ` Daniel P. Berrangé
2025-07-22 12:21 ` Xiaoyao Li
2025-07-22 20:13 ` Mathias Krause
2025-07-22 20:08 ` Mathias Krause
2025-07-23 9:26 ` David Woodhouse
2025-08-04 9:35 ` Mathias Krause
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=41a5767e-42d7-4877-9bc8-aa8eca6dd3e3@intel.com \
--to=xiaoyao.li@intel.com \
--cc=kvm@vger.kernel.org \
--cc=minipli@grsecurity.net \
--cc=mtosatti@redhat.com \
--cc=oliver.upton@linux.dev \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=seanjc@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).