All of lore.kernel.org
 help / color / mirror / Atom feed
From: Waiman Long <longman@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>
Cc: linux-kernel@vger.kernel.org, john.p.donnelly@oracle.com,
	"Hillf Danton" <hdanton@sina.com>,
	"Mukesh Ojha" <quic_mojha@quicinc.com>,
	"Ting11 Wang 王婷" <wangting11@xiaomi.com>,
	"Waiman Long" <longman@redhat.com>
Subject: [PATCH v3 2/5] locking/rwsem: Limit # of null owner retries for handoff writer
Date: Mon, 17 Oct 2022 17:13:53 -0400	[thread overview]
Message-ID: <20221017211356.333862-3-longman@redhat.com> (raw)
In-Reply-To: <20221017211356.333862-1-longman@redhat.com>

Commit 91d2a812dfb9 ("locking/rwsem: Make handoff writer optimistically
spin on owner") assumes that when the owner field is changed to NULL,
the lock will become free soon.  That assumption may not be correct
especially if the handoff writer doing the spinning is a RT task which
may preempt another task from completing its action of either freeing
the rwsem or properly setting up owner.

To prevent this live lock scenario, we have to limit the number of
trylock attempts without sleeping. The current limit is now set to 8
to allow enough time for the other task to hopefully complete its action.

By adding new lock events to track the number of NULL owner retries with
handoff flag set before a successful trylock when running a 96 threads
locking microbenchmark with equal number of readers and writers running
on a 2-core 96-thread system for 15 seconds, the following stats are
obtained. Note that none of locking threads are RT tasks.

  Retries of successful trylock    Count
  -----------------------------    -----
             1                     1738
             2                       19
             3                       11
             4                        2
             5                        1
             6                        1
             7                        1
             8                        0
             X                        1

The last row is the one failed attempt that needs more than 8 retries.
So a retry count maximum of 8 should capture most of them if no RT task
is in the mix.

Fixes: 91d2a812dfb9 ("locking/rwsem: Make handoff writer optimistically spin on owner")
Reported-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-and-tested-by: Mukesh Ojha <quic_mojha@quicinc.com>
---
 kernel/locking/rwsem.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index be2df9ea7c30..c68d76fc8c68 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -1115,6 +1115,7 @@ static struct rw_semaphore __sched *
 rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 {
 	struct rwsem_waiter waiter;
+	int null_owner_retries;
 	DEFINE_WAKE_Q(wake_q);
 
 	/* do optimistic spinning and steal lock if possible */
@@ -1156,7 +1157,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 	set_current_state(state);
 	trace_contention_begin(sem, LCB_F_WRITE);
 
-	for (;;) {
+	for (null_owner_retries = 0;;) {
 		if (rwsem_try_write_lock(sem, &waiter)) {
 			/* rwsem_try_write_lock() implies ACQUIRE on success */
 			break;
@@ -1182,8 +1183,21 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 			owner_state = rwsem_spin_on_owner(sem);
 			preempt_enable();
 
-			if (owner_state == OWNER_NULL)
+			/*
+			 * owner is NULL doesn't guarantee the lock is free.
+			 * An incoming reader will temporarily increment the
+			 * reader count without changing owner and the
+			 * rwsem_try_write_lock() will fails if the reader
+			 * is not able to decrement it in time. Allow 8
+			 * trylock attempts when hitting a NULL owner before
+			 * going to sleep.
+			 */
+			if ((owner_state == OWNER_NULL) &&
+			    (null_owner_retries < 8)) {
+				null_owner_retries++;
 				goto trylock_again;
+			}
+			null_owner_retries = 0;
 		}
 
 		schedule();
-- 
2.31.1


  parent reply	other threads:[~2022-10-17 21:15 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-17 21:13 [PATCH v3 0/5] lockinig/rwsem: Fix rwsem bugs & enable true lock handoff Waiman Long
2022-10-17 21:13 ` [PATCH v3 1/5] locking/rwsem: Prevent non-first waiter from spinning in down_write() slowpath Waiman Long
2022-10-24 13:17   ` Peter Zijlstra
2022-10-24 13:50     ` Waiman Long
2022-10-17 21:13 ` Waiman Long [this message]
2022-10-24 13:31   ` [PATCH v3 2/5] locking/rwsem: Limit # of null owner retries for handoff writer Peter Zijlstra
2022-10-24 15:55     ` Waiman Long
2022-10-25 11:22       ` Peter Zijlstra
2022-10-25 11:48         ` Peter Zijlstra
2022-10-25 19:55           ` Waiman Long
2022-10-25 20:14             ` Peter Zijlstra
2022-10-26  1:44               ` Waiman Long
     [not found]         ` <20221025145843.2953-1-hdanton@sina.com>
2022-10-25 19:00           ` Waiman Long
2022-10-17 21:13 ` [PATCH v3 3/5] locking/rwsem: Change waiter->hanodff_set to a handoff_state enum Waiman Long
2022-10-17 21:13 ` [PATCH v3 4/5] locking/rwsem: Enable direct rwsem lock handoff Waiman Long
2022-10-17 21:13 ` [PATCH v3 5/5] locking/rwsem: Update handoff lock events tracking Waiman Long
     [not found] ` <20221018111424.1007-1-hdanton@sina.com>
2022-10-18 14:13   ` [PATCH v3 4/5] locking/rwsem: Enable direct rwsem lock handoff Mukesh Ojha
2022-10-18 17:37     ` Waiman Long
     [not found]     ` <20221018235138.1088-1-hdanton@sina.com>
2022-10-19  0:39       ` Waiman Long
     [not found]       ` <20221019022934.1166-1-hdanton@sina.com>
2022-10-19  2:49         ` Waiman Long
     [not found]         ` <20221019070559.1220-1-hdanton@sina.com>
2022-10-19 15:02           ` Waiman Long
2022-10-24 16:18       ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221017211356.333862-3-longman@redhat.com \
    --to=longman@redhat.com \
    --cc=boqun.feng@gmail.com \
    --cc=hdanton@sina.com \
    --cc=john.p.donnelly@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=quic_mojha@quicinc.com \
    --cc=wangting11@xiaomi.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.