Re: [RFC] Potential problem in qspinlock due to mixed-size accesses

public inbox for linux-arch@vger.kernel.org
 help / color / mirror / Atom feed

From: Andrea Parri <parri.andrea@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Haas <t.haas@tu-bs.de>,
	Alan Stern <stern@rowland.harvard.edu>,
	Will Deacon <will@kernel.org>, Boqun Feng <boqun.feng@gmail.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	David Howells <dhowells@redhat.com>,
	Jade Alglave <j.alglave@ucl.ac.uk>,
	Luc Maranget <luc.maranget@inria.fr>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Akira Yokosawa <akiyks@gmail.com>,
	Daniel Lustig <dlustig@nvidia.com>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	lkmm@lists.linux.dev, hernan.poncedeleon@huaweicloud.com,
	jonas.oberhauser@huaweicloud.com,
	"r.maseli@tu-bs.de" <r.maseli@tu-bs.de>
Subject: Re: [RFC] Potential problem in qspinlock due to mixed-size accesses
Date: Fri, 13 Jun 2025 13:17:37 +0200	[thread overview]
Message-ID: <aEwHufdehlQnBX7g@andrea> (raw)
In-Reply-To: <20250613075501.GI2273038@noisy.programming.kicks-ass.net>

> (snip the excellent details)

Indeed, joining in praising this report -  Great work, Thomas!


> > ### Solutions
> > 
> > The problematic executions rely on the fact that T2 can move half of its
> > load operation (1) to before the xchg_tail (3).
> > Preventing this reordering solves all issues. Possible solutions are:
> >     - make the xchg_tail full-sized (i.e, also touch lock/pending bits).
> >       Note that if the kernel is configured with >= 16k cpus, then the tail
> > becomes larger than 16 bits and needs to be encoded in parts of the pending
> > byte as well.
> >       In this case, the kernel makes a full-sized (32-bit) access for the
> > xchg. So the above bugs are only present in the < 16k cpus setting.
> 
> Right, but that is the more expensive option for some.
> 
> >     - make the xchg_tail an acquire operation.
> >     - make the xchg_tail a release operation (this is an odd solution by
> > itself but works for aarch64 because it preserves REL->ACQ ordering). In
> > this case, maybe the preceding "smp_wmb()" can be removed.
> 
> I think I prefer this one, it move a barrier, not really adding
> additional overhead. Will?
> 
> >     - put some other read-read barrier between the xchg_tail and the load.
> > 
> > 
> > ### Implications for qspinlock executed on non-ARM architectures.
> > 
> > Unfortunately, there are no MSA extensions for other hardware memory models,
> > so we have to speculate based on whether the problematic reordering is
> > permitted if the problematic load was treated as two individual
> > instructions.
> > It seems Power and RISCV would have no problem reordering the instructions,
> > so qspinlock might also break on those architectures.
> 
> Power (and RiscV without ZABHA) 'emulate' the short XCHG using a full
> word LL/SC and should be good.
> 
> But yes, ZABHA might be equally broken.

RISC-V forbids store-forwarding from AMOs or SCs, certain (non-normative)
commentary in the spec clarifies that the same ordering rule applies when
the memory accesses in question only overlap partially.

I am not aware of any "RISC-V implementation" manifesting the load-load
re-ordering in question.  IAC, notice that making xchg_tail() a release
operation might not suffice to fix such an implementation given that the
arch has no plain load-acquire instruction yet and relies on the generic
(fence-based) code for atomic_cond_read_acquire().

  Andrea

next prev parent reply	other threads:[~2025-06-13 11:17 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-12 14:55 [RFC] Potential problem in qspinlock due to mixed-size accesses Thomas Haas
2025-06-13  7:55 ` Peter Zijlstra
2025-06-13 11:17   ` Andrea Parri [this message]
     [not found]     ` <9264df13-36db-4b25-b2c4-7a9701df2f4d@tu-bs.de>
2025-06-16  6:21       ` Andrea Parri
2025-06-16 14:11         ` Alan Stern
2025-06-17 14:09           ` Andrea Parri
2025-06-19 14:27             ` Thomas Haas
2025-06-19 14:32               ` Alan Stern
2025-06-19 14:59                 ` Thomas Haas
2025-06-19 17:56                   ` Alan Stern
2025-06-19 18:21                     ` Thomas Haas
2025-06-16 14:23   ` Will Deacon
2025-06-17  6:19     ` Thomas Haas
2025-06-17  8:42       ` Hernan Ponce de Leon
2025-06-17 14:17         ` Will Deacon
2025-06-17 14:23           ` Thomas Haas
2025-06-17 19:00             ` Hernan Ponce de Leon
2025-06-19 12:30               ` Will Deacon
2025-06-19 14:11                 ` Thomas Haas
2025-06-18  6:51   ` Paul E. McKenney
2025-06-18 12:11     ` Paul E. McKenney
2025-12-17 19:05 ` Thomas Haas
2025-12-18 22:02   ` Andrea Parri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aEwHufdehlQnBX7g@andrea \
    --to=parri.andrea@gmail.com \
    --cc=akiyks@gmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=dlustig@nvidia.com \
    --cc=hernan.poncedeleon@huaweicloud.com \
    --cc=j.alglave@ucl.ac.uk \
    --cc=joelagnelf@nvidia.com \
    --cc=jonas.oberhauser@huaweicloud.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkmm@lists.linux.dev \
    --cc=luc.maranget@inria.fr \
    --cc=npiggin@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=r.maseli@tu-bs.de \
    --cc=stern@rowland.harvard.edu \
    --cc=t.haas@tu-bs.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox