All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,  Will Deacon <will@kernel.org>,
	Boqun Feng <boqun@kernel.org>, Waiman Long <longman@redhat.com>,
	 linux-kernel@vger.kernel.org, Marco Elver <elver@google.com>,
	 Christoph Hellwig <hch@lst.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	 Nick Desaulniers <ndesaulniers@google.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Kees Cook <kees@kernel.org>,  Jann Horn <jannh@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org
Subject: Re: [PATCH 01/62] kvm: Make pi_enable_wakeup_handler() easier to analyze
Date: Thu, 26 Feb 2026 09:47:55 -0800	[thread overview]
Message-ID: <aaCHS5ZRuW-QJkK7@google.com> (raw)
In-Reply-To: <7a22294b-1150-4c55-a95a-ea918cfb9b76@acm.org>

On Tue, Feb 24, 2026, Bart Van Assche wrote:
> On 2/24/26 10:20 AM, Sean Christopherson wrote:
> > For the scope, please use:
> > 
> >     KVM: VMX:
> > 
> > On Mon, Feb 23, 2026, Bart Van Assche wrote:
> > > The Clang thread-safety analyzer does not support comparing expressions
> > > that use per_cpu(). Hence introduce a new local variable to capture the
> > > address of a per-cpu spinlock. This patch prepares for enabling the
> > > Clang thread-safety analyzer.
> > > 
> > > Cc: Sean Christopherson <seanjc@google.com>
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: kvm@vger.kernel.org
> > > Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> > > ---
> > >   arch/x86/kvm/vmx/posted_intr.c | 7 ++++---
> > >   1 file changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> > > index 4a6d9a17da23..f8711b7b85a8 100644
> > > --- a/arch/x86/kvm/vmx/posted_intr.c
> > > +++ b/arch/x86/kvm/vmx/posted_intr.c
> > > @@ -164,6 +164,7 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
> > >   	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> > >   	struct vcpu_vt *vt = to_vt(vcpu);
> > >   	struct pi_desc old, new;
> > > +	raw_spinlock_t *wakeup_lock;
> > >   	lockdep_assert_irqs_disabled();
> > > @@ -179,11 +180,11 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
> > >   	 * entirety of the sched_out critical section, i.e. the wakeup handler
> > >   	 * can't run while the scheduler locks are held.
> > >   	 */
> > > -	raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu),
> > > -			     PI_LOCK_SCHED_OUT);
> > > +	wakeup_lock = &per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu);
> > 
> > Addressing this piecemeal doesn't seem maintainable in the long term.  The odds
> > of unintentionally regressing the coverage with a cleanup are rather high.  Or
> > we'll end up with confused and/or grumpy developers because they're required to
> > write code in a very specific way because of what are effectively shortcomings
> > in the compiler.
> 
> I think it's worth mentioning that the number of patches similar to the
> above is small. If I remember correctly, I only encountered two similar
> cases in the entire kernel tree.

Yeah, it's definitely not a deal-breaker to work around this in KVM, especially
if this is one of the few things blocking -Wthread-safety.

> Regarding why the above patch is necessary, I don't think that it is
> fair to blame the compiler in this case. The macros that implement
> per_cpu() make it impossible for the compiler to conclude that the
> pointers passed to the raw_spin_lock_nested() and raw_spin_unlock()
> calls are identical:

Well rats, that pretty much makes it infeasible to solve the underlying problem.

> /*
>  * Add an offset to a pointer.  Use RELOC_HIDE() to prevent the compiler
>  * from making incorrect assumptions about the pointer value.
>  */
> #define SHIFT_PERCPU_PTR(__p, __offset)				\
> 	RELOC_HIDE(PERCPU_PTR(__p), (__offset))
> 
> #define RELOC_HIDE(ptr, off)					\
> ({								\
> 	unsigned long __ptr;					\
> 	__asm__ ("" : "=r"(__ptr) : "0"(ptr));			\
> 	(typeof(ptr)) (__ptr + (off));				\
> })
> 
> By the way, the above patch is not the only possible solution for
> addressing the thread-safety warning Clang reports for this function.
> Another possibility is adding __no_context_analysis to the function
> definition. Is the latter perhaps what you prefer?

Hmm, I'd prefer to keep the analysis, even though it's a bit of a pain.  We already
went through quite some effort to preserve lockdep for this lock; compared to that,
forcing use of local variables is hardly anything.

My only concern is lack of enforcement and documentation.  I fiddled with a bunch
of ideas, but mostly of them flamed out because of the aformentioned lockdep
shenanigans.  E.g. forcing use of guard() or scoped_guard() doesn't Just Work.

The best idea I came up with is to rename the global variable to something scary,
and then define a CLASS() so that it's syntactically all but impossible to feed
the the result of per_cpu() directly into lock() or unlock().

What's your timeline for enabling -Wthread-safety?  E.g. are you trying to land
it in 7.1?  7.2+?  I'd be happy to formally post the below and get it landed in
the N-1 kernel (assuming Paolo is also comfortable landing the patch in 7.0 if
you're targeting 7.1).

---
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 26 Feb 2026 07:21:52 -0800
Subject: [PATCH] KVM: VMX: Force wakeup_vcpus_on_cpu_lock to be captured in
 local variable

Wrap wakeup_vcpus_on_cpu_lock in a CLASS() and append "do_not_use" to the
per-CPU symbol to effectively force lock()+unlock() paths to capture the
per-CPU lock in a local variable.  Clang's thread-safety analyzer doesn't
support comparing lock() vs. unlock() expressions that use separate
per_cpu() invocations (-Wthread-safety generates false-positves), as the
kernel's per_cpu() implementation deliberately hides the resolved address
from the compiler, specifically to prevent the compiler from reasoning
about the symbol.  I.e. per_cpu() is a victim of its own success.

Link: https://lore.kernel.org/all/a2ebde260608230500o3407b108hc03debb9da6e62c@mail.gmail.com
Link: https://news.ycombinator.com/item?id=18050983
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/posted_intr.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 4a6d9a17da23..e08faaeab12f 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -31,7 +31,21 @@ static DEFINE_PER_CPU(struct list_head, wakeup_vcpus_on_cpu);
  * CPU.  IRQs must be disabled when taking this lock, otherwise deadlock will
  * occur if a wakeup IRQ arrives and attempts to acquire the lock.
  */
-static DEFINE_PER_CPU(raw_spinlock_t, wakeup_vcpus_on_cpu_lock);
+static DEFINE_PER_CPU(raw_spinlock_t, wakeup_vcpus_on_cpu_lock__do_not_touch);
+
+/*
+ * Route accesses to the lock through a CLASS() to effectively force users to
+ * capture the lock in a local variable.  The kernel's per_cpu() implementation
+ * deliberately obfuscates the address of the data to prevent the compiler from
+ * making incorrect assumptions about the symbol.  However, hiding the address
+ * triggers false-positive thread-safety warnings if lock() vs. unlock() are
+ * called with different per_cpu() invocations, because the compiler can't tell
+ * its the same lock under the hood.
+ */
+DEFINE_CLASS(pi_wakeup_vcpus_lock, raw_spinlock_t *,
+	     lockdep_assert_not_held(_T),
+	     &per_cpu(wakeup_vcpus_on_cpu_lock__do_not_touch, cpu),
+	     int cpu);
 
 #define PI_LOCK_SCHED_OUT SINGLE_DEPTH_NESTING
 
@@ -90,7 +104,7 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
 	 * current pCPU if the task was migrated.
 	 */
 	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR) {
-		raw_spinlock_t *spinlock = &per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu);
+		CLASS(pi_wakeup_vcpus_lock, spinlock)(cpu);
 
 		/*
 		 * In addition to taking the wakeup lock for the regular/IRQ
@@ -165,6 +179,8 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
 	struct vcpu_vt *vt = to_vt(vcpu);
 	struct pi_desc old, new;
 
+	CLASS(pi_wakeup_vcpus_lock, spinlock)(vcpu->cpu);
+
 	lockdep_assert_irqs_disabled();
 
 	/*
@@ -179,11 +195,10 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
 	 * entirety of the sched_out critical section, i.e. the wakeup handler
 	 * can't run while the scheduler locks are held.
 	 */
-	raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu),
-			     PI_LOCK_SCHED_OUT);
+	raw_spin_lock_nested(spinlock, PI_LOCK_SCHED_OUT);
 	list_add_tail(&vt->pi_wakeup_list,
 		      &per_cpu(wakeup_vcpus_on_cpu, vcpu->cpu));
-	raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
+	raw_spin_unlock(spinlock);
 
 	WARN(pi_test_sn(pi_desc), "PI descriptor SN field set before blocking");
 
@@ -254,9 +269,10 @@ void pi_wakeup_handler(void)
 {
 	int cpu = smp_processor_id();
 	struct list_head *wakeup_list = &per_cpu(wakeup_vcpus_on_cpu, cpu);
-	raw_spinlock_t *spinlock = &per_cpu(wakeup_vcpus_on_cpu_lock, cpu);
 	struct vcpu_vt *vt;
 
+	CLASS(pi_wakeup_vcpus_lock, spinlock)(cpu);
+
 	raw_spin_lock(spinlock);
 	list_for_each_entry(vt, wakeup_list, pi_wakeup_list) {
 
@@ -269,7 +285,7 @@ void pi_wakeup_handler(void)
 void __init pi_init_cpu(int cpu)
 {
 	INIT_LIST_HEAD(&per_cpu(wakeup_vcpus_on_cpu, cpu));
-	raw_spin_lock_init(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
+	raw_spin_lock_init(&per_cpu(wakeup_vcpus_on_cpu_lock__do_not_touch, cpu));
 }
 
 void pi_apicv_pre_state_restore(struct kvm_vcpu *vcpu)

base-commit: 183bb0ce8c77b0fd1fb25874112bc8751a461e49
--

  reply	other threads:[~2026-02-26 17:48 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-23 21:50 [PATCH 00/62] Bug fixes and refactoring patches related to locking Bart Van Assche
2026-02-23 21:50 ` [PATCH 01/62] kvm: Make pi_enable_wakeup_handler() easier to analyze Bart Van Assche
2026-02-24 18:20   ` Sean Christopherson
2026-02-24 19:25     ` Bart Van Assche
2026-02-26 17:47       ` Sean Christopherson [this message]
2026-02-26 20:13         ` Marco Elver
2026-02-27  0:19           ` Bart Van Assche
2026-03-18 23:31             ` Marco Elver
2026-03-19 14:43               ` Marco Elver
2026-02-26 22:36         ` Bart Van Assche
2026-02-26 22:41           ` Sean Christopherson
2026-02-23 21:50 ` [PATCH 02/62] blk-ioc: Prepare for enabling thread-safety analysis Bart Van Assche
2026-02-23 21:50 ` [PATCH 03/62] drbd: Balance RCU calls in drbd_adm_dump_devices() Bart Van Assche
2026-02-23 21:50 ` [PATCH 04/62] dax/bus.c: Fix a locking bug Bart Van Assche
2026-02-23 21:50 ` [PATCH 05/62] dma-buf: Convert dma_buf_import_sync_file() to the early-return style Bart Van Assche
2026-02-23 21:50 ` [PATCH 06/62] dma-buf: Handle all dma_resv_lock() errors Bart Van Assche
2026-02-23 21:50 ` [PATCH 07/62] drm/amdgpu: Unlock a mutex before destroying it Bart Van Assche
2026-02-24  8:26   ` Christian König
2026-02-23 21:50 ` [PATCH 08/62] drm/amdgpu: Fix locking bugs in error paths Bart Van Assche
2026-02-24  8:28   ` Christian König
2026-02-24 14:32     ` Alex Deucher
2026-02-23 21:50 ` [PATCH 09/62] drm: bridge: cdns-mhdp8546: Fix a locking bug in an error path Bart Van Assche
2026-02-23 21:50 ` [PATCH 10/62] drm: Make drm_read() easier to analyze Bart Van Assche
2026-02-23 21:50 ` [PATCH 11/62] drm/pagemap: Unlock cache->lock before freeing it Bart Van Assche
2026-02-23 21:50 ` [PATCH 12/62] drm/gpusvm.c: Fix a locking bug in an error path Bart Van Assche
2026-02-23 21:50 ` [PATCH 13/62] drm/qxl: Fix a buffer leak " Bart Van Assche
2026-02-23 21:50 ` [PATCH 14/62] hwmon: (it87) Check the it87_lock() return value Bart Van Assche
2026-02-23 21:50 ` [PATCH 15/62] Input: synaptics-rmi4 - fix a locking bug in an error path Bart Van Assche
2026-02-23 21:58   ` Dmitry Torokhov
2026-02-23 22:05     ` Bart Van Assche
2026-02-23 21:50 ` [PATCH 16/62] md: Make mddev_suspend() easier to analyze Bart Van Assche
2026-02-23 21:50 ` [PATCH 17/62] bnxt_en: Make bnxt_resume() " Bart Van Assche
2026-02-23 21:50 ` [PATCH 18/62] bnxt_en: Fix bnxt_dl_reload_up() Bart Van Assche
2026-02-23 21:50 ` [Intel-wired-lan] [PATCH 19/62] ice: Fix a locking bug in an error path Bart Van Assche via Intel-wired-lan
2026-02-23 21:50   ` Bart Van Assche
2026-02-23 22:01 ` [PATCH 00/62] Bug fixes and refactoring patches related to locking Peter Zijlstra
2026-02-23 22:13   ` Bart Van Assche
  -- strict thread matches above, loose matches on Subject: below --
2026-02-23 22:00 Bart Van Assche
2026-02-23 22:00 ` [PATCH 01/62] kvm: Make pi_enable_wakeup_handler() easier to analyze Bart Van Assche
     [not found] <20260223214950.2153735-1-bvanassche@acm.org>
2026-02-23 21:48 ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aaCHS5ZRuW-QJkK7@google.com \
    --to=seanjc@google.com \
    --cc=boqun@kernel.org \
    --cc=bvanassche@acm.org \
    --cc=elver@google.com \
    --cc=hch@lst.de \
    --cc=jannh@google.com \
    --cc=kees@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.