public inbox for linux-kernel@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,  Will Deacon <will@kernel.org>,
	Boqun Feng <boqun@kernel.org>, Waiman Long <longman@redhat.com>,
	 linux-kernel@vger.kernel.org, Marco Elver <elver@google.com>,
	 Christoph Hellwig <hch@lst.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	 Nick Desaulniers <ndesaulniers@google.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Kees Cook <kees@kernel.org>,  Jann Horn <jannh@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org
Subject: Re: [PATCH 01/62] kvm: Make pi_enable_wakeup_handler() easier to analyze
Date: Thu, 26 Feb 2026 09:47:55 -0800	[thread overview]
Message-ID: <aaCHS5ZRuW-QJkK7@google.com> (raw)
In-Reply-To: <7a22294b-1150-4c55-a95a-ea918cfb9b76@acm.org>

On Tue, Feb 24, 2026, Bart Van Assche wrote:
> On 2/24/26 10:20 AM, Sean Christopherson wrote:
> > For the scope, please use:
> > 
> >     KVM: VMX:
> > 
> > On Mon, Feb 23, 2026, Bart Van Assche wrote:
> > > The Clang thread-safety analyzer does not support comparing expressions
> > > that use per_cpu(). Hence introduce a new local variable to capture the
> > > address of a per-cpu spinlock. This patch prepares for enabling the
> > > Clang thread-safety analyzer.
> > > 
> > > Cc: Sean Christopherson <seanjc@google.com>
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: kvm@vger.kernel.org
> > > Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> > > ---
> > >   arch/x86/kvm/vmx/posted_intr.c | 7 ++++---
> > >   1 file changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> > > index 4a6d9a17da23..f8711b7b85a8 100644
> > > --- a/arch/x86/kvm/vmx/posted_intr.c
> > > +++ b/arch/x86/kvm/vmx/posted_intr.c
> > > @@ -164,6 +164,7 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
> > >   	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> > >   	struct vcpu_vt *vt = to_vt(vcpu);
> > >   	struct pi_desc old, new;
> > > +	raw_spinlock_t *wakeup_lock;
> > >   	lockdep_assert_irqs_disabled();
> > > @@ -179,11 +180,11 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
> > >   	 * entirety of the sched_out critical section, i.e. the wakeup handler
> > >   	 * can't run while the scheduler locks are held.
> > >   	 */
> > > -	raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu),
> > > -			     PI_LOCK_SCHED_OUT);
> > > +	wakeup_lock = &per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu);
> > 
> > Addressing this piecemeal doesn't seem maintainable in the long term.  The odds
> > of unintentionally regressing the coverage with a cleanup are rather high.  Or
> > we'll end up with confused and/or grumpy developers because they're required to
> > write code in a very specific way because of what are effectively shortcomings
> > in the compiler.
> 
> I think it's worth mentioning that the number of patches similar to the
> above is small. If I remember correctly, I only encountered two similar
> cases in the entire kernel tree.

Yeah, it's definitely not a deal-breaker to work around this in KVM, especially
if this is one of the few things blocking -Wthread-safety.

> Regarding why the above patch is necessary, I don't think that it is
> fair to blame the compiler in this case. The macros that implement
> per_cpu() make it impossible for the compiler to conclude that the
> pointers passed to the raw_spin_lock_nested() and raw_spin_unlock()
> calls are identical:

Well rats, that pretty much makes it infeasible to solve the underlying problem.

> /*
>  * Add an offset to a pointer.  Use RELOC_HIDE() to prevent the compiler
>  * from making incorrect assumptions about the pointer value.
>  */
> #define SHIFT_PERCPU_PTR(__p, __offset)				\
> 	RELOC_HIDE(PERCPU_PTR(__p), (__offset))
> 
> #define RELOC_HIDE(ptr, off)					\
> ({								\
> 	unsigned long __ptr;					\
> 	__asm__ ("" : "=r"(__ptr) : "0"(ptr));			\
> 	(typeof(ptr)) (__ptr + (off));				\
> })
> 
> By the way, the above patch is not the only possible solution for
> addressing the thread-safety warning Clang reports for this function.
> Another possibility is adding __no_context_analysis to the function
> definition. Is the latter perhaps what you prefer?

Hmm, I'd prefer to keep the analysis, even though it's a bit of a pain.  We already
went through quite some effort to preserve lockdep for this lock; compared to that,
forcing use of local variables is hardly anything.

My only concern is lack of enforcement and documentation.  I fiddled with a bunch
of ideas, but most of them flamed out because of the aforementioned lockdep
shenanigans.  E.g. forcing use of guard() or scoped_guard() doesn't Just Work.

The best idea I came up with is to rename the global variable to something scary,
and then define a CLASS() so that it's syntactically all but impossible to feed
the result of per_cpu() directly into lock() or unlock().

What's your timeline for enabling -Wthread-safety?  E.g. are you trying to land
it in 7.1?  7.2+?  I'd be happy to formally post the below and get it landed in
the N-1 kernel (assuming Paolo is also comfortable landing the patch in 7.0 if
you're targeting 7.1).

---
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 26 Feb 2026 07:21:52 -0800
Subject: [PATCH] KVM: VMX: Force wakeup_vcpus_on_cpu_lock to be captured in
 local variable

Wrap wakeup_vcpus_on_cpu_lock in a CLASS() and append "__do_not_touch" to the
per-CPU symbol to effectively force lock()+unlock() paths to capture the
per-CPU lock in a local variable.  Clang's thread-safety analyzer doesn't
support comparing lock() vs. unlock() expressions that use separate
per_cpu() invocations (-Wthread-safety generates false positives), as the
kernel's per_cpu() implementation deliberately hides the resolved address
from the compiler, specifically to prevent the compiler from reasoning
about the symbol.  I.e. per_cpu() is a victim of its own success.

Link: https://lore.kernel.org/all/a2ebde260608230500o3407b108hc03debb9da6e62c@mail.gmail.com
Link: https://news.ycombinator.com/item?id=18050983
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/posted_intr.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 4a6d9a17da23..e08faaeab12f 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -31,7 +31,21 @@ static DEFINE_PER_CPU(struct list_head, wakeup_vcpus_on_cpu);
  * CPU.  IRQs must be disabled when taking this lock, otherwise deadlock will
  * occur if a wakeup IRQ arrives and attempts to acquire the lock.
  */
-static DEFINE_PER_CPU(raw_spinlock_t, wakeup_vcpus_on_cpu_lock);
+static DEFINE_PER_CPU(raw_spinlock_t, wakeup_vcpus_on_cpu_lock__do_not_touch);
+
+/*
+ * Route accesses to the lock through a CLASS() to effectively force users to
+ * capture the lock in a local variable.  The kernel's per_cpu() implementation
+ * deliberately obfuscates the address of the data to prevent the compiler from
+ * making incorrect assumptions about the symbol.  However, hiding the address
+ * triggers false-positive thread-safety warnings if lock() vs. unlock() are
+ * called with different per_cpu() invocations, because the compiler can't tell
+ * it's the same lock under the hood.
+ */
+DEFINE_CLASS(pi_wakeup_vcpus_lock, raw_spinlock_t *,
+	     lockdep_assert_not_held(_T),
+	     &per_cpu(wakeup_vcpus_on_cpu_lock__do_not_touch, cpu),
+	     int cpu);
 
 #define PI_LOCK_SCHED_OUT SINGLE_DEPTH_NESTING
 
@@ -90,7 +104,7 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
 	 * current pCPU if the task was migrated.
 	 */
 	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR) {
-		raw_spinlock_t *spinlock = &per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu);
+		CLASS(pi_wakeup_vcpus_lock, spinlock)(vcpu->cpu);
 
 		/*
 		 * In addition to taking the wakeup lock for the regular/IRQ
@@ -165,6 +179,8 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
 	struct vcpu_vt *vt = to_vt(vcpu);
 	struct pi_desc old, new;
 
+	CLASS(pi_wakeup_vcpus_lock, spinlock)(vcpu->cpu);
+
 	lockdep_assert_irqs_disabled();
 
 	/*
@@ -179,11 +195,10 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
 	 * entirety of the sched_out critical section, i.e. the wakeup handler
 	 * can't run while the scheduler locks are held.
 	 */
-	raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu),
-			     PI_LOCK_SCHED_OUT);
+	raw_spin_lock_nested(spinlock, PI_LOCK_SCHED_OUT);
 	list_add_tail(&vt->pi_wakeup_list,
 		      &per_cpu(wakeup_vcpus_on_cpu, vcpu->cpu));
-	raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
+	raw_spin_unlock(spinlock);
 
 	WARN(pi_test_sn(pi_desc), "PI descriptor SN field set before blocking");
 
@@ -254,9 +269,10 @@ void pi_wakeup_handler(void)
 {
 	int cpu = smp_processor_id();
 	struct list_head *wakeup_list = &per_cpu(wakeup_vcpus_on_cpu, cpu);
-	raw_spinlock_t *spinlock = &per_cpu(wakeup_vcpus_on_cpu_lock, cpu);
 	struct vcpu_vt *vt;
 
+	CLASS(pi_wakeup_vcpus_lock, spinlock)(cpu);
+
 	raw_spin_lock(spinlock);
 	list_for_each_entry(vt, wakeup_list, pi_wakeup_list) {
 
@@ -269,7 +285,7 @@ void pi_wakeup_handler(void)
 void __init pi_init_cpu(int cpu)
 {
 	INIT_LIST_HEAD(&per_cpu(wakeup_vcpus_on_cpu, cpu));
-	raw_spin_lock_init(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
+	raw_spin_lock_init(&per_cpu(wakeup_vcpus_on_cpu_lock__do_not_touch, cpu));
 }
 
 void pi_apicv_pre_state_restore(struct kvm_vcpu *vcpu)

base-commit: 183bb0ce8c77b0fd1fb25874112bc8751a461e49
--

Thread overview: 37+ messages
2026-02-23 21:50 [PATCH 00/62] Bug fixes and refactoring patches related to locking Bart Van Assche
2026-02-23 21:50 ` [PATCH 01/62] kvm: Make pi_enable_wakeup_handler() easier to analyze Bart Van Assche
2026-02-24 18:20   ` Sean Christopherson
2026-02-24 19:25     ` Bart Van Assche
2026-02-26 17:47       ` Sean Christopherson [this message]
2026-02-26 20:13         ` Marco Elver
2026-02-27  0:19           ` Bart Van Assche
2026-03-18 23:31             ` Marco Elver
2026-03-19 14:43               ` Marco Elver
2026-02-26 22:36         ` Bart Van Assche
2026-02-26 22:41           ` Sean Christopherson
2026-02-23 21:50 ` [PATCH 02/62] blk-ioc: Prepare for enabling thread-safety analysis Bart Van Assche
2026-02-23 21:50 ` [PATCH 03/62] drbd: Balance RCU calls in drbd_adm_dump_devices() Bart Van Assche
2026-02-23 21:50 ` [PATCH 04/62] dax/bus.c: Fix a locking bug Bart Van Assche
2026-02-23 21:50 ` [PATCH 05/62] dma-buf: Convert dma_buf_import_sync_file() to the early-return style Bart Van Assche
2026-02-23 21:50 ` [PATCH 06/62] dma-buf: Handle all dma_resv_lock() errors Bart Van Assche
2026-02-23 21:50 ` [PATCH 07/62] drm/amdgpu: Unlock a mutex before destroying it Bart Van Assche
2026-02-24  8:26   ` Christian König
2026-02-23 21:50 ` [PATCH 08/62] drm/amdgpu: Fix locking bugs in error paths Bart Van Assche
2026-02-24  8:28   ` Christian König
2026-02-24 14:32     ` Alex Deucher
2026-02-23 21:50 ` [PATCH 09/62] drm: bridge: cdns-mhdp8546: Fix a locking bug in an error path Bart Van Assche
2026-02-23 21:50 ` [PATCH 10/62] drm: Make drm_read() easier to analyze Bart Van Assche
2026-02-23 21:50 ` [PATCH 11/62] drm/pagemap: Unlock cache->lock before freeing it Bart Van Assche
2026-02-23 21:50 ` [PATCH 12/62] drm/gpusvm.c: Fix a locking bug in an error path Bart Van Assche
2026-02-23 21:50 ` [PATCH 13/62] drm/qxl: Fix a buffer leak " Bart Van Assche
2026-02-23 21:50 ` [PATCH 14/62] hwmon: (it87) Check the it87_lock() return value Bart Van Assche
2026-02-23 21:50 ` [PATCH 15/62] Input: synaptics-rmi4 - fix a locking bug in an error path Bart Van Assche
2026-02-23 21:58   ` Dmitry Torokhov
2026-02-23 22:05     ` Bart Van Assche
2026-02-23 21:50 ` [PATCH 16/62] md: Make mddev_suspend() easier to analyze Bart Van Assche
2026-02-23 21:50 ` [PATCH 17/62] bnxt_en: Make bnxt_resume() " Bart Van Assche
2026-02-23 21:50 ` [PATCH 18/62] bnxt_en: Fix bnxt_dl_reload_up() Bart Van Assche
2026-02-23 21:50 ` [PATCH 19/62] ice: Fix a locking bug in an error path Bart Van Assche
2026-02-23 22:01 ` [PATCH 00/62] Bug fixes and refactoring patches related to locking Peter Zijlstra
2026-02-23 22:13   ` Bart Van Assche
