From: Sean Christopherson <seanjc@google.com>
To: Hou Wenlong <houwenlong.hwl@antgroup.com>
Cc: Chao Gao <chao.gao@intel.com>,
kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: x86: Remove outdated comments and code in kvm_on_user_return()
Date: Mon, 15 Sep 2025 17:13:50 -0700 [thread overview]
Message-ID: <aMirvo9Xly5fVmbY@google.com> (raw)
In-Reply-To: <20250913050643.GA50691@k08j02272.eu95sqa>
On Sat, Sep 13, 2025, Hou Wenlong wrote:
> On Fri, Sep 12, 2025 at 07:40:43AM -0700, Sean Christopherson wrote:
> > On Fri, Sep 12, 2025, Hou Wenlong wrote:
> > > On Fri, Sep 12, 2025 at 05:38:22PM +0800, Hou Wenlong wrote:
> > > > On Fri, Sep 12, 2025 at 04:35:00PM +0800, Chao Gao wrote:
> > > > > On Fri, Sep 12, 2025 at 03:35:29PM +0800, Hou Wenlong wrote:
> > > > > >The commit a377ac1cd9d7b ("x86/entry: Move user return notifier out of
> > > > > >loop") moved fire_user_return_notifiers() into the section with
> > > > > >interrupts disabled, so the callback kvm_on_user_return() cannot be
> > > > > >interrupted by kvm_arch_disable_virtualization_cpu() now. Therefore,
> > > > > >remove the outdated comments and local_irq_save()/local_irq_restore()
> > > > > >code in kvm_on_user_return().
> > > > > >
> > > > > >Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > > > > >---
> > > > > > arch/x86/kvm/x86.c | 16 +++++-----------
> > > > > > 1 file changed, 5 insertions(+), 11 deletions(-)
> > > > > >
> > > > > >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > >index 33fba801b205..10afbacb1851 100644
> > > > > >--- a/arch/x86/kvm/x86.c
> > > > > >+++ b/arch/x86/kvm/x86.c
> > > > > >@@ -568,18 +568,12 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
> > > > > > struct kvm_user_return_msrs *msrs
> > > > > > = container_of(urn, struct kvm_user_return_msrs, urn);
> > > > > > struct kvm_user_return_msr_values *values;
> > > > > >- unsigned long flags;
> > > > > >
> > > > > >- /*
> > > > > >- * Disabling irqs at this point since the following code could be
> > > > > >- * interrupted and executed through kvm_arch_disable_virtualization_cpu()
> > > > > >- */
> > > > > >- local_irq_save(flags);
> > > > > >- if (msrs->registered) {
> > > > > >- msrs->registered = false;
> > > > > >- user_return_notifier_unregister(urn);
> > > > > >- }
> > > > > >- local_irq_restore(flags);
> > > > > >+ lockdep_assert_irqs_disabled();
> > > > >
> > > > > kvm_offline_cpu() may call into this function. But I am not sure if interrupts
> > > > > are disabled in that path.
> > > > >
> > > > Thanks for pointing that out. I see that interrupts are enabled in the
> > > > callback during the CPU offline test. I'll remove the
> > > > lockdep_assert_irqs_disabled() here.
> > > >
> > >
> > > Upon a second look, can we just disable interrupts in kvm_cpu_offline()?
> > > The other paths that call kvm_disable_virtualization_cpu() are all in an
> > > interrupt-disabled state, although it seems that
> > > kvm_disable_virtualization_cpu() cannot be reentered.
> >
> > Why do we care? I.e. what is the motivation for changing this code? I'm hesitant
> > to touch this code without good reason given its fragility and subtlety.
> Hi, Sean.
>
> I'm just reworking the shared MSRs part in our inner multi-KVM. First, I
> noticed that the comment mentions that kvm_on_user_return() can be
> interrupted or reentered, which is a little confusing to me. Then, I
> found that the comment is outdated, so I decided to remove it and also
> make changes to the code. I agree that this code is fragile, maybe
> just change the comment?
I'm not opposed to making changes, I just don't want to do so without reason.
"This is ridiculously confusing" is a good enough reason :-) Hmm, and disabling
IRQs in that path might technically be a bug fix.
I think the main reason this got especially confusing is that commit a377ac1cd9d7
("x86/entry: Move user return notifier out of loop") somewhat inadvertantly fixed
the underlying issue that was papered over by 1650b4ebc99d ("KVM: Disable irq while
unregistering user notifier").
And then commit 9a798b1337af ("KVM: Register cpuhp and syscore callbacks when
enabling hardware") removed KVM's "normal" path use of IPIs to enable/disable
virtualization.
As a result, KVM is left with a rather uncommon corner case of reboot being the
only way for kvm_on_user_return() to be interrupted. For the life of me, I can't
tell whether or not CPU (un)hotplug paths run with IRQs disabled. I know at
least some run in task context, but I've no idea if that applies to CPUHP_AP_KVM_ONLINE.
So, looking at this (yet) again, I'm in favor of doing as you suggest and saving
IRQs in both kvm_online_cpu() and kvm_offline_cpu(). If IRQs aren't guarnateed
to be disabled, I _think_ that's technically a bug fix, because virtualization_enabled
could be stale (with respect to the actual state of hardware) when read from IRQ
context.
Something like this?
---
arch/x86/kvm/x86.c | 12 +++++++-----
virt/kvm/kvm_main.c | 10 ++++++++--
2 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c78acab2ff3f..067cb66e9c18 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -582,18 +582,20 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
struct kvm_user_return_msrs *msrs
= container_of(urn, struct kvm_user_return_msrs, urn);
struct kvm_user_return_msr_values *values;
- unsigned long flags;
/*
- * Disabling irqs at this point since the following code could be
- * interrupted and executed through kvm_arch_disable_virtualization_cpu()
+ * Assert that IRQs are disabled. KVM disables virtualization via IPI
+ * callback on reboot, and this code isn't safe for re-entrancy, e.g.
+ * receiving the IRQ after checking "registered" would lead to double
+ * deletion of KVM's notifier.
*/
- local_irq_save(flags);
+ lockdep_assert_irqs_disabled();
+
if (msrs->registered) {
msrs->registered = false;
user_return_notifier_unregister(urn);
}
- local_irq_restore(flags);
+
for (slot = 0; slot < kvm_nr_uret_msrs; ++slot) {
values = &msrs->values[slot];
if (values->host != values->curr) {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index fee108988028..1b7d59adc390 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5581,6 +5581,8 @@ __weak void kvm_arch_disable_virtualization(void)
static int kvm_enable_virtualization_cpu(void)
{
+ lockdep_assert_irqs_disabled();
+
if (__this_cpu_read(virtualization_enabled))
return 0;
@@ -5596,6 +5598,8 @@ static int kvm_enable_virtualization_cpu(void)
static int kvm_online_cpu(unsigned int cpu)
{
+ guard(irqsave)();
+
/*
* Abort the CPU online process if hardware virtualization cannot
* be enabled. Otherwise running VMs would encounter unrecoverable
@@ -5606,6 +5610,8 @@ static int kvm_online_cpu(unsigned int cpu)
static void kvm_disable_virtualization_cpu(void *ign)
{
+ lockdep_assert_irqs_disabled();
+
if (!__this_cpu_read(virtualization_enabled))
return;
@@ -5616,6 +5622,8 @@ static void kvm_disable_virtualization_cpu(void *ign)
static int kvm_offline_cpu(unsigned int cpu)
{
+ guard(irqsave)();
+
kvm_disable_virtualization_cpu(NULL);
return 0;
}
@@ -5649,7 +5657,6 @@ static int kvm_suspend(void)
* dropped all locks (userspace tasks are frozen via a fake signal).
*/
lockdep_assert_not_held(&kvm_usage_lock);
- lockdep_assert_irqs_disabled();
kvm_disable_virtualization_cpu(NULL);
return 0;
@@ -5658,7 +5665,6 @@ static int kvm_suspend(void)
static void kvm_resume(void)
{
lockdep_assert_not_held(&kvm_usage_lock);
- lockdep_assert_irqs_disabled();
WARN_ON_ONCE(kvm_enable_virtualization_cpu());
}
base-commit: 14298d819d5a6b7180a4089e7d2121ca3551dc6c
--
prev parent reply other threads:[~2025-09-16 0:13 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-12 7:35 [PATCH] KVM: x86: Remove outdated comments and code in kvm_on_user_return() Hou Wenlong
2025-09-12 8:35 ` Chao Gao
2025-09-12 9:38 ` Hou Wenlong
2025-09-12 14:11 ` Hou Wenlong
2025-09-12 14:40 ` Sean Christopherson
2025-09-13 5:06 ` Hou Wenlong
2025-09-16 0:13 ` Sean Christopherson [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aMirvo9Xly5fVmbY@google.com \
--to=seanjc@google.com \
--cc=bp@alien8.de \
--cc=chao.gao@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=houwenlong.hwl@antgroup.com \
--cc=hpa@zytor.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox