From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Mathias Krause <minipli@grsecurity.net>,
qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
kvm@vger.kernel.org, Oliver Upton <oliver.upton@linux.dev>,
Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH] i386/kvm: Disable hypercall patching quirk by default
Date: Tue, 22 Jul 2025 12:15:05 +0100 [thread overview]
Message-ID: <aH9yuVcUJQc4_-vP@redhat.com> (raw)
In-Reply-To: <3f58125c-183f-49e0-813e-d4cb1be724e8@intel.com>
On Tue, Jul 22, 2025 at 06:27:45PM +0800, Xiaoyao Li wrote:
> On 7/22/2025 5:21 PM, Mathias Krause wrote:
> > On 22.07.25 05:45, Xiaoyao Li wrote:
> > > On 6/20/2025 3:42 AM, Mathias Krause wrote:
> > > > KVM has a weird behaviour when a guest executes VMCALL on an AMD system
> > > > or VMMCALL on an Intel CPU. Both naturally generate an invalid opcode
> > > > exception (#UD) as they are just the wrong instruction for the CPU
> > > > given. But instead of forwarding the exception to the guest, KVM tries
> > > > to patch the guest instruction to match the host's actual hypercall
> > > > instruction. That is doomed to fail as read-only code is rather the
> > > > standard these days. But, instead of letting go the patching attempt and
> > > > falling back to #UD injection, KVM injects the page fault instead.
> > > >
> > > > That's wrong on multiple levels. Not only isn't that a valid exception
> > > > to be generated by these instructions, confusing attempts to handle
> > > > them. It also destroys guest state by doing so, namely the value of CR2.
> > > >
> > > > Sean attempted to fix that in KVM[1] but the patch was never applied.
> > > >
> > > > Later, Oliver added a quirk bit in [2] so the behaviour can, at least,
> > > > conceptually be disabled. Paolo even called out to add this very
> > > > functionality to disable the quirk in QEMU[3]. So lets just do it.
> > > >
> > > > A new property 'hypercall-patching=on|off' is added, for the very
> > > > unlikely case that there are setups that really need the patching.
> > > > However, these would be vulnerable to memory corruption attacks freely
> > > > overwriting code as they please. So, my guess is, there are exactly 0
> > > > systems out there requiring this quirk.
> > >
> > > The default behavior is patching the hypercall for many years.
> > >
> > > If you desire to change the default behavior, please at least keep it
> > > unchanged for old machine version. i.e., introduce compat_property,
> > > which sets KVMState->hypercall_patching_enabled to true.
> >
> > Well, the thing is, KVM's patching is done with the effective
> > permissions of the guest which means, if the code in question isn't
> > writable from the guest's point of view, KVM's attempt to modify it will
> > fail. This failure isn't transparent for the guest as it sees a #PF
> > instead of a #UD, and that's what I'm trying to fix by disabling the quirk.
> >
> > The hypercall patching was introduced in Linux commit 7aa81cc04781
> > ("KVM: Refactor hypercall infrastructure (v3)") in v2.6.25. Until then
> > it was based on a dedicated hypercall page that was handled by KVM to
> > use the proper instruction of the KVM module in use (VMX or SVM).
> >
> > Patching code was fine back then, but the introduction of DEBUG_RO_DATA
> > made the patching attempts fail and, ultimately, lead to Paolo handle
> > this with commit c1118b3602c2 ("x86: kvm: use alternatives for VMCALL
> > vs. VMMCALL if kernel text is read-only").
> >
> > However, his change still doesn't account for the cross-vendor live
> > migration case (Intel<->AMD), which will still be broken, causing the
> > before mentioned bogus #PF, which will just lead to misleading Oops
> > reports, confusing the poor souls, trying to make sense of it.
> >
> > IMHO, there is no valid reason for still having the patching in place as
> > the .text of non-ancient kernel's will be write-protected, making
> > patching attempts fail. And, as they fail with a #PF instead of #UD, the
> > guest cannot even handle them appropriately, as there was no memory
> > write attempt from its point of view. Therefore the default should be to
> > disable it, IMO. This won't prevent guests making use of the wrong
> > instruction from trapping, but, at least, now they'll get the correct
> > exception vector and can handle it appropriately.
>
> But you don't accout for the case that guest kernel is built without
> CONFIG_STRICT_KERNEL_RWX enabled, or without CONFIG_DEBUG_RO_DATA, or for
> whatever reason the guest's text is not readonly, and the VM needs to be
> migrated among different vendors (Intel <-> AMD).
>
> Before this patch, the above usecase works well. But with this patch, the
> guest will gets #UD after migrated to different vendors.
>
> I heard from some small CSPs that they do want to the ability to live
> migrate VMs among Intel and AMD host.
Usually CSPs don't have full control over what their customers
are running as a guest. If their customers are running mainstream
modern guest OS, CONFIG_STRICT_KERNEL_RWX is pretty likely to be
set, so presumably migration between Intel & AMD will not work
and this isn't making it worse ?
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
next prev parent reply other threads:[~2025-07-22 11:16 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-19 19:42 [PATCH] i386/kvm: Disable hypercall patching quirk by default Mathias Krause
2025-07-21 10:22 ` Mathias Krause
2025-07-22 3:45 ` Xiaoyao Li
2025-07-22 9:21 ` Mathias Krause
2025-07-22 10:27 ` Xiaoyao Li
2025-07-22 10:35 ` Mohamed Mediouni
2025-07-22 11:06 ` Xiaoyao Li
2025-07-22 11:16 ` Mohamed Mediouni
2025-07-22 11:15 ` Daniel P. Berrangé [this message]
2025-07-22 12:21 ` Xiaoyao Li
2025-07-22 20:13 ` Mathias Krause
2025-07-22 20:08 ` Mathias Krause
2025-07-23 9:26 ` David Woodhouse
2025-08-04 9:35 ` Mathias Krause
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aH9yuVcUJQc4_-vP@redhat.com \
--to=berrange@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=minipli@grsecurity.net \
--cc=mtosatti@redhat.com \
--cc=oliver.upton@linux.dev \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=seanjc@google.com \
--cc=xiaoyao.li@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).