From: Waiman Long <longman@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will.deacon@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	Davidlohr Bueso <dave@stgolabs.net>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	huang ying <huang.ying.caritas@gmail.com>,
	Waiman Long <longman@redhat.com>
Subject: [PATCH v5 10/18] locking/rwsem: Wake up almost all readers in wait queue
Date: Thu, 18 Apr 2019 19:46:20 -0400
Message-ID: <20190418234628.3675-11-longman@redhat.com>
In-Reply-To: <20190418234628.3675-1-longman@redhat.com>

When the front of the wait queue is a reader, other readers
immediately following the first reader will also be woken up at the
same time. However, if there is a writer in between, the readers
behind that writer will not be woken up.

Because of optimistic spinning, the lock acquisition order is not FIFO
anyway. The lock handoff mechanism will ensure that lock starvation
will not happen.

Assuming that the lock hold times of the other readers still in the
queue will be about the same as the readers that are being woken up,
there is really not much additional cost other than the additional
latency due to the wakeup of additional tasks by the waker. Therefore
all the readers up to a maximum of 256 in the queue are woken up when
the first waiter is a reader to improve reader throughput. This is
somewhat similar in concept to a phase-fair R/W lock.
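
As a rough illustration of the scan described above, here is a minimal
user-space sketch. It is not the kernel code: the array-based queue, the
mark_readers() helper and the WAITING_FOR_* names are hypothetical and
exist only for this example, and it models only the case where the first
waiter is a reader. Only the 0x100 limit is taken from this patch.

```c
#include <stddef.h>

/* Hypothetical user-space model; these names are not the kernel's. */
enum waiter_type { WAITING_FOR_READ, WAITING_FOR_WRITE };

/* Same value as the limit introduced by this patch. */
#define MAX_READERS_WAKEUP 0x100

/*
 * Scan the queue from front to back and record which readers would be
 * woken. Writers are skipped instead of ending the scan, and the scan
 * stops once MAX_READERS_WAKEUP readers have been collected.
 *
 * woken_idx must have room for min(n, MAX_READERS_WAKEUP) entries.
 * Returns the number of readers recorded.
 */
static size_t mark_readers(const enum waiter_type *queue, size_t n,
			   size_t *woken_idx)
{
	size_t woken = 0;

	for (size_t i = 0; i < n; i++) {
		if (queue[i] == WAITING_FOR_WRITE)
			continue;	/* writers keep their queue position */

		woken_idx[woken++] = i;

		if (woken >= MAX_READERS_WAKEUP)
			break;		/* bound the work done by the waker */
	}
	return woken;
}
```

For a queue of R W R R W, this sketch records positions 0, 2 and 3,
whereas stopping at the first writer would record only position 0.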

With a locking microbenchmark running on a 5.1-based kernel, the total
locking rates (in kops/s) on an 8-socket IvyBridge-EX system with
equal numbers of readers and writers before and after this patch were
as follows:

   # of Threads  Pre-patch   Post-patch
   ------------  ---------   ----------
        4          1,641        1,674
        8            731        1,062
       16            564          924
       32             78          300
       64             38          195
      240             50          149

There is essentially no performance gain at the lowest contention level
(about 2% with 4 threads). At higher contention levels, however, this
patch raises the locking rate by roughly 1.5x (8 threads) to 5x
(64 threads).

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/rwsem.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 97c4e92482be..76c380b63b0c 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -254,6 +254,14 @@ enum writer_wait_state {
  */
 #define RWSEM_WAIT_TIMEOUT	DIV_ROUND_UP(HZ, 250)
 
+/*
+ * Magic number to batch-wakeup waiting readers, even when writers are
+ * also present in the queue. This both limits the amount of work the
+ * waking thread must do and also prevents any potential counter overflow,
+ * however unlikely.
+ */
+#define MAX_READERS_WAKEUP	0x100
+
 /*
  * handle the lock release when processes blocked on it that can now run
  * - if we come here from up_xxxx(), then the RWSEM_FLAG_WAITERS bit must
@@ -328,16 +336,22 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
 	}
 
 	/*
-	 * Grant an infinite number of read locks to the readers at the front
-	 * of the queue. We know that woken will be at least 1 as we accounted
+	 * Grant up to MAX_READERS_WAKEUP read locks to all the readers in the
+	 * queue. We know that woken will be at least 1 as we accounted
 	 * for above. Note we increment the 'active part' of the count by the
 	 * number of readers before waking any processes up.
+	 *
+	 * This is an adaptation of the phase-fair R/W locks where at the
+	 * reader phase (first waiter is a reader), all readers are eligible
+	 * to acquire the lock at the same time irrespective of their order
+	 * in the queue. The writers acquire the lock according to their
+	 * order in the queue.
 	 */
 	list_for_each_entry_safe(waiter, tmp, &sem->wait_list, list) {
 		struct task_struct *tsk;
 
 		if (waiter->type == RWSEM_WAITING_FOR_WRITE)
-			break;
+			continue;
 
 		woken++;
 		tsk = waiter->task;
@@ -356,6 +370,12 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
 		 * after setting the reader waiter to nil.
 		 */
 		wake_q_add_safe(wake_q, tsk);
+
+		/*
+		 * Limit # of readers that can be woken up per wakeup call.
+		 */
+		if (woken >= MAX_READERS_WAKEUP)
+			break;
 	}
 
 	adjustment = woken * RWSEM_READER_BIAS - adjustment;
-- 
2.18.1

