Re: [PATCH v3 next 5/5] Avoid writing to node->next in the osq_lock() fast path

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: David Laight <david.laight.linux@gmail.com>
To: Waiman Long <longman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Boqun Feng <boqun@kernel.org>,
	linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Yafang Shao <laoar.shao@gmail.com>
Subject: Re: [PATCH v3 next 5/5] Avoid writing to node->next in the osq_lock() fast path
Date: Wed, 11 Mar 2026 21:50:45 +0000	[thread overview]
Message-ID: <20260311215045.5b6a2c65@pumpkin> (raw)
In-Reply-To: <3afd7e58-8e31-4dba-860c-2055ba372423@redhat.com>

On Wed, 11 Mar 2026 15:27:08 -0400
Waiman Long <longman@redhat.com> wrote:

> On 3/6/26 5:51 PM, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> >
> > When osq_lock() returns false or osq_unlock() returns static
> > analysis shows that node->next should always be NULL.
> > This means that it isn't necessary to explicitly set it to NULL
> > prior to atomic_xchg(&lock->tail, curr) on entry to osq_lock().
> >
> > Defer determining the address of the CPU's 'node' until after the
> > atomic_exchange() so that it isn't done in the uncontented path.
> >
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> >   kernel/locking/osq_lock.c | 6 ++----
> >   1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > index 0619691e2756..3f0cfdf1cd0f 100644
> > --- a/kernel/locking/osq_lock.c
> > +++ b/kernel/locking/osq_lock.c
> > @@ -92,13 +92,10 @@ osq_wait_next(struct optimistic_spin_queue *lock,
> >   
> >   bool osq_lock(struct optimistic_spin_queue *lock)
> >   {
> > -	struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
> > -	struct optimistic_spin_node *prev, *next;
> > +	struct optimistic_spin_node *node, *prev, *next;
> >   	unsigned int curr = encode_cpu(smp_processor_id());
> >   	unsigned int prev_cpu;
> >   
> > -	node->next = NULL;  
> 
> Although it does look like node->next should always be NULL when 
> entering osq_lock(), any future change may invalidate this assumption. I 
> know you want to not touch the osq_node cacheline on fast path, but we 
> will need a big comment here to explicitly spell out this assumption to 
> make sure that we won't break it in the future.

I've been looking at the code some more.
The first new patch adds a load of comments.
Unfortunately they are rather wrong (or in the wrong place) by the time
I finished. So I need to go around again at least once.
The main extra changes:
- Set node->prev to zero instead of node->locked = 1.
  This makes the loop after the smp_cond_load_relaxed() better.
  And the write to next->prev can move into osq_wait_next().
- Just call osq_wait_next() from osq_unlock() - it is the same code.
  Removes the unconditional cmpxchg on lock->tail (which needs to be
  _release in both places).
- Change node->next to be the cpu number - it is only used as a pointer
  once (I think Linus suggested that once).
- Stop 'node' being cache-line aligned - it only contains two cpu numbers.
  Just 8 byte align so it is all in one cache line.
  The data for the 'wrong' cpu is hardly ever accessed.

> 
> BTW, how much performance gain have you measured with this change? Can 
> we just leave it there to be safe.

You think I've been brave enough to run my changes :-)
Actually I've written a little test harness that lets me compile and run
the actual kernel source in usespace and 'prod' it with requests.
That let me add delays at particular points to check the unusual paths
(but not the effects of acquire/release).

I will run the next changes, hopefully there won't be anything nasty in
the rc kernel to catch me out.
But I've no real idea of how to exercise the code.

	David

> 
> > -
> >   	/*
> >   	 * We need both ACQUIRE (pairs with corresponding RELEASE in
> >   	 * unlock() uncontended, or fastpath) and RELEASE (to publish
> > @@ -109,6 +106,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> >   	if (prev_cpu == OSQ_UNLOCKED_VAL)
> >   		return true;
> >   
> > +	node = this_cpu_ptr(&osq_node);
> >   	WRITE_ONCE(node->prev_cpu, prev_cpu);
> >   	prev = decode_cpu(prev_cpu);
> >   	node->locked = 0;  
> 
> I am fine with moving the initialization here. The other patches also 
> look good to me.
> 
> Cheers,
> Longman
>

next prev parent reply	other threads:[~2026-03-11 21:50 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-06 22:51 [PATCH v3 next 0/5] locking/osq_lock: Optimisations to osq_lock code david.laight.linux
2026-03-06 22:51 ` [PATCH v3 next 1/5] Only clear node->locked in the slow osq_lock() path david.laight.linux
2026-03-06 23:01   ` David Laight
2026-03-06 22:51 ` [PATCH v3 next 2/5] Optimise vcpu_is_preempted() check david.laight.linux
2026-03-06 23:01   ` David Laight
2026-03-06 23:03   ` David Laight
2026-03-06 22:51 ` [PATCH v3 next 3/5] Use node->prev_cpu instead of saving node->prev david.laight.linux
2026-03-06 23:01   ` David Laight
2026-03-06 23:03   ` David Laight
2026-03-06 22:51 ` [PATCH v3 next 4/5] Optimise decode_cpu() and per_cpu_ptr() david.laight.linux
2026-03-06 23:01   ` David Laight
2026-03-06 23:03   ` David Laight
2026-03-06 22:51 ` [PATCH v3 next 5/5] Avoid writing to node->next in the osq_lock() fast path david.laight.linux
2026-03-06 23:04   ` David Laight
2026-03-07  0:06   ` Linus Torvalds
2026-03-07 11:32     ` David Laight
2026-03-11 19:27   ` Waiman Long
2026-03-11 19:40     ` Waiman Long
2026-03-11 21:50     ` David Laight [this message]
2026-03-06 22:59 ` [PATCH v3 next 0/5] locking/osq_lock: Optimisations to osq_lock code David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260311215045.5b6a2c65@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=boqun@kernel.org \
    --cc=laoar.shao@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox