From: Waiman Long <longman@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Will Deacon <will.deacon@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
Davidlohr Bueso <dave@stgolabs.net>,
Linus Torvalds <torvalds@linux-foundation.org>,
Tim Chen <tim.c.chen@linux.intel.com>,
huang ying <huang.ying.caritas@gmail.com>,
Waiman Long <longman@redhat.com>
Subject: [PATCH v8 08/19] locking/rwsem: Always release wait_lock before waking up tasks
Date: Mon, 20 May 2019 16:59:07 -0400
Message-ID: <20190520205918.22251-9-longman@redhat.com>
In-Reply-To: <20190520205918.22251-1-longman@redhat.com>
With the use of wake_q, we can do task wakeups without holding the
wait_lock. There is one exception in the rwsem code, though: when the
writer in the slowpath detects that there are waiters ahead but the
rwsem is not held by a writer, it wakes them up while still holding
the wait_lock. This can lead to a long wait_lock hold time, especially
when a large number of readers are to be woken up.

Remediate this situation by releasing the wait_lock before waking
up tasks and re-acquiring it afterward. The rwsem_try_write_lock()
function is also modified to read the rwsem count directly to avoid
using a stale count value.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
---
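For readers outside the kernel tree, the unlock/wake/relock pattern described in the changelog can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not the kernel implementation: every `demo_*` name, `struct waiter`, and the fixed-size `pending` array are hypothetical stand-ins, a pthread mutex plays the role of `wait_lock`, and "waking" a waiter is reduced to setting a flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's wake_q; not the kernel API. */
struct waiter {
	struct waiter *next;
	bool woken;
};

#define DEMO_WAKE_Q_TAIL ((struct waiter *)0x1)
#define DEMO_MAX_PENDING 8

struct demo_wake_q {
	struct waiter *first;
	struct waiter **lastp;
};

static void demo_wake_q_init(struct demo_wake_q *q)
{
	q->first = DEMO_WAKE_Q_TAIL;
	q->lastp = &q->first;
}

/* Same emptiness test as the wake_q_empty() helper added by the patch. */
static bool demo_wake_q_empty(const struct demo_wake_q *q)
{
	return q->first == DEMO_WAKE_Q_TAIL;
}

static void demo_wake_q_add(struct demo_wake_q *q, struct waiter *w)
{
	w->next = DEMO_WAKE_Q_TAIL;
	*q->lastp = w;
	q->lastp = &w->next;
}

/* "Wake" every queued waiter; meant to run without the lock held. */
static int demo_wake_up_q(struct demo_wake_q *q)
{
	struct waiter *w = q->first;
	int n = 0;

	while (w != DEMO_WAKE_Q_TAIL) {
		struct waiter *next = w->next;

		w->next = NULL;
		w->woken = true;	/* assumption: a flag models the wakeup */
		n++;
		w = next;
	}
	return n;
}

struct demo_sem {
	pthread_mutex_t wait_lock;
	struct waiter *pending[DEMO_MAX_PENDING];
	int npending;
};

/*
 * Called with sem->wait_lock held and returns with it held.
 * Waiters are only queued under the lock; the wakeups themselves
 * run after the lock has been dropped. Returns the number woken.
 */
int demo_wake_pending(struct demo_sem *sem)
{
	struct demo_wake_q q;
	int woken = 0;

	assert(sem->npending >= 0 && sem->npending <= DEMO_MAX_PENDING);

	demo_wake_q_init(&q);
	for (int i = 0; i < sem->npending; i++)
		demo_wake_q_add(&q, sem->pending[i]);
	sem->npending = 0;

	if (!demo_wake_q_empty(&q)) {
		pthread_mutex_unlock(&sem->wait_lock);
		woken = demo_wake_up_q(&q);
		demo_wake_q_init(&q);	/* reinit before any reuse */
		pthread_mutex_lock(&sem->wait_lock);
	}
	return woken;
}
```

As in the patch, the lock is only dropped when there is something to wake, so an empty queue costs no extra unlock/lock pair. Any state read before the unlock has to be treated as stale once the lock is retaken, which is why the patch also makes rwsem_try_write_lock() re-read the count itself.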
 include/linux/sched/wake_q.h |  5 +++++
 kernel/locking/rwsem.c       | 31 +++++++++++++++----------------
 2 files changed, 20 insertions(+), 16 deletions(-)
diff --git a/include/linux/sched/wake_q.h b/include/linux/sched/wake_q.h
index ad826d2a4557..26a2013ac39c 100644
--- a/include/linux/sched/wake_q.h
+++ b/include/linux/sched/wake_q.h
@@ -51,6 +51,11 @@ static inline void wake_q_init(struct wake_q_head *head)
 	head->lastp = &head->first;
 }
 
+static inline bool wake_q_empty(struct wake_q_head *head)
+{
+	return head->first == WAKE_Q_TAIL;
+}
+
 extern void wake_q_add(struct wake_q_head *head, struct task_struct *task);
 extern void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task);
 extern void wake_up_q(struct wake_q_head *head);
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 0c8aef065acb..36aed5236bd2 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -400,13 +400,14 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
  * If wstate is WRITER_HANDOFF, it will make sure that either the handoff
  * bit is set or the lock is acquired with handoff bit cleared.
  */
-static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem,
+static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 					enum writer_wait_state wstate)
 {
-	long new;
+	long count, new;
 
 	lockdep_assert_held(&sem->wait_lock);
 
+	count = atomic_long_read(&sem->count);
 	do {
 		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
 
@@ -751,26 +752,25 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 					? RWSEM_WAKE_READERS
 					: RWSEM_WAKE_ANY, &wake_q);
 
-		/*
-		 * The wakeup is normally called _after_ the wait_lock
-		 * is released, but given that we are proactively waking
-		 * readers we can deal with the wake_q overhead as it is
-		 * similar to releasing and taking the wait_lock again
-		 * for attempting rwsem_try_write_lock().
-		 */
-		wake_up_q(&wake_q);
-
-		/* We need wake_q again below, reinitialize */
-		wake_q_init(&wake_q);
+		if (!wake_q_empty(&wake_q)) {
+			/*
+			 * We want to minimize wait_lock hold time especially
+			 * when a large number of readers are to be woken up.
+			 */
+			raw_spin_unlock_irq(&sem->wait_lock);
+			wake_up_q(&wake_q);
+			wake_q_init(&wake_q);	/* Used again, reinit */
+			raw_spin_lock_irq(&sem->wait_lock);
+		}
 	} else {
-		count = atomic_long_add_return(RWSEM_FLAG_WAITERS, &sem->count);
+		atomic_long_or(RWSEM_FLAG_WAITERS, &sem->count);
 	}
 
 wait:
 	/* wait until we successfully acquire the lock */
 	set_current_state(state);
 	while (true) {
-		if (rwsem_try_write_lock(count, sem, wstate))
+		if (rwsem_try_write_lock(sem, wstate))
 			break;
 
 		raw_spin_unlock_irq(&sem->wait_lock);
@@ -811,7 +811,6 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 		}
 
 		raw_spin_lock_irq(&sem->wait_lock);
-		count = atomic_long_read(&sem->count);
 	}
 	__set_current_state(TASK_RUNNING);
 	list_del(&waiter.list);
--
2.18.1