From: Peter Zijlstra <peterz@infradead.org>
To: Waiman Long <longman@redhat.com>
Cc: "Ingo Molnar" <mingo@redhat.com>, "Will Deacon" <will@kernel.org>,
"Boqun Feng" <boqun.feng@gmail.com>,
linux-kernel@vger.kernel.org, john.p.donnelly@oracle.com,
"Hillf Danton" <hdanton@sina.com>,
"Mukesh Ojha" <quic_mojha@quicinc.com>,
"Ting11 Wang 王婷" <wangting11@xiaomi.com>
Subject: Re: [PATCH v6 5/6] locking/rwsem: Enable direct rwsem lock handoff
Date: Mon, 23 Jan 2023 15:59:26 +0100 [thread overview]
Message-ID: <Y86gzkVHlsOTY8QL@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20221118022016.462070-6-longman@redhat.com>
On Thu, Nov 17, 2022 at 09:20:15PM -0500, Waiman Long wrote:
> The lock handoff provided in rwsem isn't a true handoff like that in
> the mutex. Instead, it is more like a quiescent state where optimistic
> spinning and lock stealing are disabled to make it easier for the first
> waiter to acquire the lock.
>
> Reworking the code to enable a true lock handoff is more complex due to
> the following facts:
> 1) The RWSEM_FLAG_HANDOFF bit is protected by the wait_lock and it
> is too expensive to always take the wait_lock in the unlock path
> to prevent racing.
Specifically, the issue is that we'd need to turn the
atomic_long_add_return_release() into an atomic_try_cmpxchg_release()
loop, like:
tmp = atomic_long_read(&sem->count);
do {
if (tmp & (WAITERS|HANDOFF))
return slow_unock();
} while (atomic_long_try_cmpxchg_release(&sem->count, &tmp,
tmp - RWSEM_{READER_BIAS,WRITE_LOCKED});
in order to not race with a concurrent setting of the HANDOFF bit,
right? And we don't like to turn unlock into a cmpxchg loop.
(OTOH we sorta do this for mutex, unconteded mutex has cmpxchg lock and
unlock, any fail and we go to the slow path -- I suppose the distinct
difference is that we sorta expect some contention on the read side)
> 2) The reader lock fast path may add a RWSEM_READER_BIAS at the wrong
> time to prevent a proper lock handoff from a reader owned rwsem.
This would be much the same, right? We'd have to turn
rwsem_read_trylock() into a cmpxchg-loop and we don't like that.
Therefore we do that speculative add and fix up later.
Now, I'm not enturely sure what issue you allude to here; is the problem
that you can't quite tell when the last reader is gone?
> A lock handoff can only be initiated when the following conditions are
> true:
> 1) The RWSEM_FLAG_HANDOFF bit is set.
d'uh ;-)
> 2) The task to do the handoff don't see any other active lock
> excluding the lock that it might have held.
2) here is the 2) above, right?
> The new handoff mechanism performs handoff in rwsem_wakeup() to minimize
> overhead. The rwsem count will be known at that point to determine if
> handoff should be done. However, there is a small time gap between the
> rwsem becomes free and the wait_lock is taken
Right, that's between atomic_long_fetch_add_release() and calling the
slow path because WAITERS bit is set.
> where a reader can come in and add a RWSEM_READER_BIAS to the count or
Both 2s above.
> the current first waiter can take the rwsem and clear
> RWSEM_FLAG_HANDOFF in the interim.
The actual intended action.
> That will fail the handoff operation.
I would not list that latter as a failure, it's exactly what we want to
happen, no?
> To handle the former case, a secondary handoff will also be done in
> the rwsem_down_read_slowpath() to catch it.
Right. In short:
Having HANDOVER set:
- implies WAITERS set
- disables all fastpaths (spinning or otherwise)
- dis-allows anybody except first waiter to obtain lock
Therefore, having the window between clearing owner and prodding first
waiter is 'harmless'.
> With true lock handoff, there is no need to do a NULL owner spinning
> anymore as wakeup will be performed if handoff is possible. So it
> is likely that the first waiter won't actually go to sleep even when
> schedule() is called in this case.
Right, removing that NULL spinning was the whole purpose -- except I
seem to have forgotten why it was a problem :-)
OK, lemme go read the actual patch.
Hmmm... you made it a wee bit more complicated, instead of my 3rd clause
above, you added a whole intermediate GRANTED state. Why?
Since we fundamentally must deal with the release->wait_lock hole, why
do we need to do the whole rwsem_wake()->GRANTED->*_slowpath() dance?
Why can't we skip the whole rwsem_wake()->GRANTED part and only do
handoff in the slowpath?
next prev parent reply other threads:[~2023-01-23 15:00 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-18 2:20 [PATCH v6 0/6] lockinig/rwsem: Fix rwsem bugs & enable true lock handoff Waiman Long
2022-11-18 2:20 ` [PATCH v6 1/6] locking/rwsem: Prevent non-first waiter from spinning in down_write() slowpath Waiman Long
2022-12-16 15:02 ` Jiri Wiesner
2023-01-20 22:58 ` Waiman Long
2022-11-18 2:20 ` [PATCH v6 2/6] locking/rwsem: Disable preemption at all down_read*() and up_read() code paths Waiman Long
2022-12-16 15:03 ` Jiri Wiesner
2022-11-18 2:20 ` [PATCH v6 3/6] locking/rwsem: Disable preemption at all down_write*() and up_write() " Waiman Long
2022-12-16 15:03 ` Jiri Wiesner
2022-11-18 2:20 ` [PATCH v6 4/6] locking/rwsem: Change waiter->hanodff_set to a handoff_state enum Waiman Long
2022-11-18 2:20 ` [PATCH v6 5/6] locking/rwsem: Enable direct rwsem lock handoff Waiman Long
2023-01-23 14:59 ` Peter Zijlstra [this message]
2023-01-23 17:30 ` Waiman Long
2023-01-23 22:07 ` Waiman Long
2023-01-24 12:58 ` Peter Zijlstra
2023-01-24 12:29 ` Peter Zijlstra
2023-01-25 1:53 ` Waiman Long
2022-11-18 2:20 ` [PATCH v6 6/6] locking/rwsem: Update handoff lock events tracking Waiman Long
2023-01-17 20:53 ` [PATCH v6 0/6] lockinig/rwsem: Fix rwsem bugs & enable true lock handoff Waiman Long
2023-01-22 13:46 ` Peter Zijlstra
2023-01-23 3:40 ` Waiman Long
2023-01-23 21:10 ` Waiman Long
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y86gzkVHlsOTY8QL@hirez.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=boqun.feng@gmail.com \
--cc=hdanton@sina.com \
--cc=john.p.donnelly@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mingo@redhat.com \
--cc=quic_mojha@quicinc.com \
--cc=wangting11@xiaomi.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox