Re: [PATCH] i386/kvm: Disable hypercall patching quirk by default

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Mathias Krause <minipli@grsecurity.net>,
	qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	kvm@vger.kernel.org, Oliver Upton <oliver.upton@linux.dev>,
	Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH] i386/kvm: Disable hypercall patching quirk by default
Date: Tue, 22 Jul 2025 12:15:05 +0100	[thread overview]
Message-ID: <aH9yuVcUJQc4_-vP@redhat.com> (raw)
In-Reply-To: <3f58125c-183f-49e0-813e-d4cb1be724e8@intel.com>

On Tue, Jul 22, 2025 at 06:27:45PM +0800, Xiaoyao Li wrote:
> On 7/22/2025 5:21 PM, Mathias Krause wrote:
> > On 22.07.25 05:45, Xiaoyao Li wrote:
> > > On 6/20/2025 3:42 AM, Mathias Krause wrote:
> > > > KVM has a weird behaviour when a guest executes VMCALL on an AMD system
> > > > or VMMCALL on an Intel CPU. Both naturally generate an invalid opcode
> > > > exception (#UD) as they are just the wrong instruction for the CPU
> > > > given. But instead of forwarding the exception to the guest, KVM tries
> > > > to patch the guest instruction to match the host's actual hypercall
> > > > instruction. That is doomed to fail as read-only code is rather the
> > > > standard these days. But, instead of letting go the patching attempt and
> > > > falling back to #UD injection, KVM injects the page fault instead.
> > > > 
> > > > That's wrong on multiple levels. Not only isn't that a valid exception
> > > > to be generated by these instructions, confusing attempts to handle
> > > > them. It also destroys guest state by doing so, namely the value of CR2.
> > > > 
> > > > Sean attempted to fix that in KVM[1] but the patch was never applied.
> > > > 
> > > > Later, Oliver added a quirk bit in [2] so the behaviour can, at least,
> > > > conceptually be disabled. Paolo even called out to add this very
> > > > functionality to disable the quirk in QEMU[3]. So lets just do it.
> > > > 
> > > > A new property 'hypercall-patching=on|off' is added, for the very
> > > > unlikely case that there are setups that really need the patching.
> > > > However, these would be vulnerable to memory corruption attacks freely
> > > > overwriting code as they please. So, my guess is, there are exactly 0
> > > > systems out there requiring this quirk.
> > > 
> > > The default behavior is patching the hypercall for many years.
> > > 
> > > If you desire to change the default behavior, please at least keep it
> > > unchanged for old machine version. i.e., introduce compat_property,
> > > which sets KVMState->hypercall_patching_enabled to true.
> > 
> > Well, the thing is, KVM's patching is done with the effective
> > permissions of the guest which means, if the code in question isn't
> > writable from the guest's point of view, KVM's attempt to modify it will
> > fail. This failure isn't transparent for the guest as it sees a #PF
> > instead of a #UD, and that's what I'm trying to fix by disabling the quirk.
> > 
> > The hypercall patching was introduced in Linux commit 7aa81cc04781
> > ("KVM: Refactor hypercall infrastructure (v3)") in v2.6.25. Until then
> > it was based on a dedicated hypercall page that was handled by KVM to
> > use the proper instruction of the KVM module in use (VMX or SVM).
> > 
> > Patching code was fine back then, but the introduction of DEBUG_RO_DATA
> > made the patching attempts fail and, ultimately, lead to Paolo handle
> > this with commit c1118b3602c2 ("x86: kvm: use alternatives for VMCALL
> > vs. VMMCALL if kernel text is read-only").
> > 
> > However, his change still doesn't account for the cross-vendor live
> > migration case (Intel<->AMD), which will still be broken, causing the
> > before mentioned bogus #PF, which will just lead to misleading Oops
> > reports, confusing the poor souls, trying to make sense of it.
> > 
> > IMHO, there is no valid reason for still having the patching in place as
> > the .text of non-ancient kernel's  will be write-protected, making
> > patching attempts fail. And, as they fail with a #PF instead of #UD, the
> > guest cannot even handle them appropriately, as there was no memory
> > write attempt from its point of view. Therefore the default should be to
> > disable it, IMO. This won't prevent guests making use of the wrong
> > instruction from trapping, but, at least, now they'll get the correct
> > exception vector and can handle it appropriately.
> 
> But you don't accout for the case that guest kernel is built without
> CONFIG_STRICT_KERNEL_RWX enabled, or without CONFIG_DEBUG_RO_DATA, or for
> whatever reason the guest's text is not readonly, and the VM needs to be
> migrated among different vendors (Intel <-> AMD).
> 
> Before this patch, the above usecase works well. But with this patch, the
> guest will gets #UD after migrated to different vendors.
> 
> I heard from some small CSPs that they do want to the ability to live
> migrate VMs among Intel and AMD host.

Usually CSPs don't have full control over what their customers
are running as a guest. If their customers are running mainstream
modern guest OS, CONFIG_STRICT_KERNEL_RWX is pretty likely to be
set, so presumably migration between Intel & AMD will not work
and this isn't making it worse ?

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

next prev parent reply	other threads:[~2025-07-22 11:15 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-19 19:42 [PATCH] i386/kvm: Disable hypercall patching quirk by default Mathias Krause
2025-07-21 10:22 ` Mathias Krause
2025-07-22  3:45 ` Xiaoyao Li
2025-07-22  9:21   ` Mathias Krause
2025-07-22 10:27     ` Xiaoyao Li
2025-07-22 10:35       ` Mohamed Mediouni
2025-07-22 11:06         ` Xiaoyao Li
2025-07-22 11:16           ` Mohamed Mediouni
2025-07-22 11:15       ` Daniel P. Berrangé [this message]
2025-07-22 12:21         ` Xiaoyao Li
2025-07-22 20:13           ` Mathias Krause
2025-07-22 20:08       ` Mathias Krause
2025-07-23  9:26 ` David Woodhouse
2025-08-04  9:35   ` Mathias Krause

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aH9yuVcUJQc4_-vP@redhat.com \
    --to=berrange@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=minipli@grsecurity.net \
    --cc=mtosatti@redhat.com \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=seanjc@google.com \
    --cc=xiaoyao.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.