linux-coco.lists.linux.dev archive mirror
From: Sean Christopherson <seanjc@google.com>
To: Chao Gao <chao.gao@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	"Kirill A. Shutemov" <kas@kernel.org>,
	kvm@vger.kernel.org,  x86@kernel.org, linux-coco@lists.linux.dev,
	linux-kernel@vger.kernel.org,  Yan Zhao <yan.y.zhao@intel.com>,
	Xiaoyao Li <xiaoyao.li@intel.com>,
	 Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Hou Wenlong <houwenlong.hwl@antgroup.com>
Subject: Re: [PATCH v4 2/4] KVM: x86: Leave user-return notifier registered on reboot/shutdown
Date: Fri, 17 Oct 2025 08:27:22 -0700
Message-ID: <aPJgWhywMXZdiyU5@google.com>
In-Reply-To: <aPHU+RZKwCK0BK7t@intel.com>

On Fri, Oct 17, 2025, Chao Gao wrote:
> > bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
> >@@ -14363,6 +14377,11 @@ module_init(kvm_x86_init);
> > 
> > static void __exit kvm_x86_exit(void)
> > {
> >+	int cpu;
> >+
> >+	for_each_possible_cpu(cpu)
> >+		WARN_ON_ONCE(per_cpu_ptr(user_return_msrs, cpu)->registered);
> 
> Is it OK to reference user_return_msrs during kvm.ko unloading? IIUC,
> user_return_msrs has already been freed during kvm-{intel,amd}.ko unloading.
> See:
> 
> vmx_exit/svm_exit()
>   -> kvm_x86_vendor_exit()
>        -> free_percpu(user_return_msrs);

Ouch.  Guess who didn't run with KASAN...
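
To make the ordering problem concrete, here is a rough userspace model (not
kernel code) of why the check has to live next to the free.  Everything in it
is a stand-in invented for illustration: calloc()/free() replace
alloc_percpu()/free_percpu(), and NR_CPUS_MODEL, struct uret_msrs_model,
model_init() and model_free() are hypothetical names.  The only point is that
doing the "still registered?" walk and the free in one helper makes it
impossible for the walk to run after the vendor module has already freed the
allocation.

```c
/*
 * Userspace model only.  calloc()/free() stand in for
 * alloc_percpu()/free_percpu(); all names here are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 4

struct uret_msrs_model {
	bool registered;	/* models the per-CPU "registered" flag */
};

/* Stand-in for the per-CPU user_return_msrs allocation. */
static struct uret_msrs_model *uret_msrs;

static int model_init(void)
{
	uret_msrs = calloc(NR_CPUS_MODEL, sizeof(*uret_msrs));
	return uret_msrs ? 0 : -1;
}

/*
 * Check and free in one place, so the check can never run after the free.
 * Returns the number of CPUs that still had a notifier registered.
 */
static int model_free(void)
{
	int cpu, still_registered = 0;

	/* Calling this twice would be the use-after-free being modeled. */
	assert(uret_msrs != NULL);

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (uret_msrs[cpu].registered) {
			fprintf(stderr, "WARN: CPU %d still registered\n", cpu);
			still_registered++;
		}
	}

	free(uret_msrs);
	uret_msrs = NULL;
	return still_registered;
}
```

In this model, a second walk over uret_msrs from a later "module exit" step
would dereference freed memory, which is the situation described above for
kvm_x86_exit() running after kvm_x86_vendor_exit().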

And rather than squeezing the WARN into this patch, I'm strongly leaning toward
adding it in a prep patch, as the WARN is valuable irrespective of how KVM handles
reboot.

Not yet tested...

--
From: Sean Christopherson <seanjc@google.com>
Date: Fri, 17 Oct 2025 06:10:30 -0700
Subject: [PATCH 2/5] KVM: x86: WARN if user-return MSR notifier is registered
 on exit

When freeing the per-CPU user-return MSR structures, WARN if any CPU has
a registered notifier to help detect and/or debug potential use-after-free
issues.  The lifecycle of the notifiers is rather convoluted, and has
several non-obvious paths where notifiers are unregistered, i.e. the code
isn't exactly the most robust possible.

The notifiers are registered on-demand in KVM, on the first WRMSR to
a tracked register.  _Usually_ the notifier is unregistered whenever the
CPU returns to userspace.  But because any given CPU isn't guaranteed to
return to userspace, e.g. the CPU could be offlined before doing so, KVM
also "drops", a.k.a. unregisters, the notifiers when virtualization is
disabled on the CPU.

Further complicating the unregister path is the fact that the calls to
disable virtualization come from common KVM, and the per-CPU calls are
guarded by a per-CPU flag (to harden _that_ code against bugs, e.g. due to
mishandling reboot).  Reboot/shutdown in particular is problematic, as KVM
disables virtualization via IPI function call, i.e. from IRQ context,
instead of using the cpuhp framework, which runs in task context.  I.e. on
reboot/shutdown, drop_user_return_notifiers() is called asynchronously.

Forced reboot/shutdown is the most problematic scenario, as userspace tasks
are not frozen before kvm_shutdown() is invoked, i.e. KVM could be actively
manipulating the user-return MSR lists and/or notifiers when the IPI
arrives.  To a certain extent, all bets are off when userspace forces a
reboot/shutdown, but KVM should at least avoid a use-after-free, e.g. to
avoid crashing the kernel when trying to reboot.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4b5d2d09634..334a911b36c5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -575,6 +575,27 @@ static inline void kvm_async_pf_hash_reset(struct kvm_vcpu *vcpu)
 		vcpu->arch.apf.gfns[i] = ~0;
 }
 
+static int kvm_init_user_return_msrs(void)
+{
+	user_return_msrs = alloc_percpu(struct kvm_user_return_msrs);
+	if (!user_return_msrs) {
+		pr_err("failed to allocate percpu user_return_msrs\n");
+		return -ENOMEM;
+	}
+	kvm_nr_uret_msrs = 0;
+	return 0;
+}
+
+static void kvm_free_user_return_msrs(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		WARN_ON_ONCE(per_cpu_ptr(user_return_msrs, cpu)->registered);
+
+	free_percpu(user_return_msrs);
+}
+
 static void kvm_on_user_return(struct user_return_notifier *urn)
 {
 	unsigned slot;
@@ -10032,13 +10053,9 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 		return -ENOMEM;
 	}
 
-	user_return_msrs = alloc_percpu(struct kvm_user_return_msrs);
-	if (!user_return_msrs) {
-		pr_err("failed to allocate percpu kvm_user_return_msrs\n");
-		r = -ENOMEM;
+	r = kvm_init_user_return_msrs();
+	if (r)
 		goto out_free_x86_emulator_cache;
-	}
-	kvm_nr_uret_msrs = 0;
 
 	r = kvm_mmu_vendor_module_init();
 	if (r)
@@ -10141,7 +10158,7 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 out_mmu_exit:
 	kvm_mmu_vendor_module_exit();
 out_free_percpu:
-	free_percpu(user_return_msrs);
+	kvm_free_user_return_msrs();
 out_free_x86_emulator_cache:
 	kmem_cache_destroy(x86_emulator_cache);
 	return r;
@@ -10170,7 +10187,7 @@ void kvm_x86_vendor_exit(void)
 #endif
 	kvm_x86_call(hardware_unsetup)();
 	kvm_mmu_vendor_module_exit();
-	free_percpu(user_return_msrs);
+	kvm_free_user_return_msrs();
 	kmem_cache_destroy(x86_emulator_cache);
 #ifdef CONFIG_KVM_XEN
 	static_key_deferred_flush(&kvm_xen_enabled);
-- 

Thread overview: 17+ messages
2025-10-16 22:28 [PATCH v4 0/4] KVM: x86: User-return MSR cleanups Sean Christopherson
2025-10-16 22:28 ` [PATCH v4 1/4] KVM: TDX: Synchronize user-return MSRs immediately after VP.ENTER Sean Christopherson
2025-10-20 22:55   ` Edgecombe, Rick P
2025-10-21 13:37     ` Adrian Hunter
2025-10-21 15:06       ` Sean Christopherson
2025-10-21 16:36         ` Adrian Hunter
2025-10-21 16:46           ` Sean Christopherson
2025-10-21 18:54         ` Edgecombe, Rick P
2025-10-21 19:33           ` Sean Christopherson
2025-10-21 20:49             ` Edgecombe, Rick P
2025-10-23  5:59             ` Xiaoyao Li
2025-10-16 22:28 ` [PATCH v4 2/4] KVM: x86: Leave user-return notifier registered on reboot/shutdown Sean Christopherson
2025-10-17  5:32   ` Chao Gao
2025-10-17 15:27     ` Sean Christopherson [this message]
2025-10-16 22:28 ` [PATCH v4 3/4] KVM: x86: Don't disable IRQs when unregistering user-return notifier Sean Christopherson
2025-10-16 22:28 ` [PATCH v4 4/4] KVM: x86: Drop "cache" from user return MSR setter that skips WRMSR Sean Christopherson
2025-10-17  2:52   ` Chao Gao
