From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Darren Hart <darren@dvhart.com>,
Steven Rostedt <rostedt@goodmis.org>,
fredrik.markstrom@windriver.com,
Davidlohr Bueso <dave@stgolabs.net>,
Manfred Spraul <manfred@colorfullife.com>
Subject: [PATCH 2/3 v2] futex: avoid double wake up in futex_wake() on -RT
Date: Fri, 10 Apr 2015 18:11:35 +0200
Message-ID: <20150410161135.GF3057@linutronix.de>
In-Reply-To: <alpine.DEB.2.11.1504072131400.3845@nanos>
futex_wake() wakes the waiter while holding the hb->lock. The waiter
does not take the hb->lock and can leave the kernel. However, the next
operation on the same futex will map to the same hb->lock, and we will
see a small dance around the lock, including prio-boosting and a
context switch:
low prio task FUTEX_WAKE on high prio
| ft-1489 [000] ....1.. 81.167501: sys_enter: sys_futex (8049f60, 1, 1, 0, 0, 0)
| ft-1489 [000] dN..311 81.167504: sched_wakeup: pid=1490 prio=94
| ft-1489 [000] d...311 81.167520: sched_switch: prev_pid=1489 prev_prio=120 prev_state=R+ ==> next_pid=1490 next_prio=94
| ft-1490 [000] ....1.. 81.167522: sys_exit: sys_futex = 0
high prio task FUTEX_WAKE on low prio
| ft-1490 [000] ....1.. 81.167528: sys_enter: sys_futex (8049f5c, 1, 1, 0, 0, 0)
| ft-1490 [000] ....1.. 81.167530: sys_exit: sys_futex = 0
high prio task waits FUTEX_WAIT, hb->lock still owned by low prio task
| ft-1490 [000] ....1.. 81.167534: sys_enter: sys_futex (8049f60, 0, 1, 0, 0, 0)
| ft-1490 [000] d...411 81.167895: sched_pi_setprio: pid=1489 oldprio=120 newprio=94
| ft-1490 [000] d...311 81.167901: sched_switch: prev_pid=1490 prev_prio=94 prev_state=D ==> next_pid=1489 next_prio=94
| ft-1489 [000] d...411 81.167910: sched_wakeup: pid=1490 prio=94
| ft-1489 [000] d...311 81.167912: sched_pi_setprio: pid=1489 oldprio=94 newprio=120
| ft-1489 [000] d...311 81.167915: sched_switch: prev_pid=1489 prev_prio=120 prev_state=R+ ==> next_pid=1490 next_prio=94
| ft-1490 [000] d...3.. 81.167922: sched_switch: prev_pid=1490 prev_prio=94 prev_state=S ==> next_pid=1489 next_prio=120
| ft-1489 [000] ....1.. 81.167924: sys_exit: sys_futex = 1
This patch delays the wakeup of the process until the hb->lock is
dropped, to avoid the boosting + context switch needed to obtain the lock.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
v1…v2:
- update patch description
- move the comment to __wake_futex()
- move the wakeup before the out_put_key label in futex_wake()
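For readers who want to experiment with the idea outside the kernel, here is
a minimal userspace sketch of the same pattern: pick the waiter while holding
the lock, but issue the wakeup only after the lock has been dropped. It is an
analogy under stated assumptions, not the kernel code. A pthread mutex stands
in for hb->lock, a POSIX semaphore stands in for wake_up_state(), and every
name in it (struct bucket, struct waiter, dequeue_locked(), wake_one(),
run_demo()) is hypothetical.

```c
/*
 * Userspace analogy only. pthread/semaphore primitives stand in for
 * hb->lock and wake_up_state(); all names are hypothetical.
 */
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

struct waiter {
	sem_t wakeup;		/* stands in for the sleeping task */
	struct waiter *next;
};

struct bucket {
	pthread_mutex_t lock;	/* stands in for hb->lock */
	struct waiter *head;	/* queued waiters */
};

/* Dequeue one waiter without waking it. Caller must hold b->lock. */
static struct waiter *dequeue_locked(struct bucket *b)
{
	struct waiter *w = b->head;

	if (w)
		b->head = w->next;
	return w;
}

/* Wake at most one waiter; the wakeup is issued after the unlock. */
static int wake_one(struct bucket *b)
{
	struct waiter *w;

	pthread_mutex_lock(&b->lock);
	w = dequeue_locked(b);
	pthread_mutex_unlock(&b->lock);

	if (!w)
		return 0;
	/* The woken thread can take b->lock right away; we no longer hold it. */
	sem_post(&w->wakeup);
	return 1;
}

static void *waiter_thread(void *arg)
{
	struct waiter *w = arg;

	sem_wait(&w->wakeup);
	return NULL;
}

/* Queue one waiter, wake it, and return the number of waiters woken. */
static int run_demo(void)
{
	struct bucket b = { .head = NULL };
	struct waiter w = { .next = NULL };
	pthread_t t;
	int woken, rc;

	pthread_mutex_init(&b.lock, NULL);
	sem_init(&w.wakeup, 0, 0);

	pthread_mutex_lock(&b.lock);
	w.next = b.head;
	b.head = &w;
	pthread_mutex_unlock(&b.lock);

	rc = pthread_create(&t, NULL, waiter_thread, &w);
	assert(rc == 0);
	(void)rc;

	woken = wake_one(&b);	/* dequeues and wakes the single waiter */
	pthread_join(t, NULL);
	woken += wake_one(&b);	/* queue is empty now, adds 0 */

	sem_destroy(&w.wakeup);
	pthread_mutex_destroy(&b.lock);
	return woken;
}
```

Calling run_demo() from a main() and building with -pthread should be enough
to try it. The sketch keeps the waiter alive by joining the thread before the
stack object goes away; the kernel code holds a task reference across the
wakeup for the same reason.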
kernel/futex.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/kernel/futex.c b/kernel/futex.c
index b38abe3573a8..658f4d05cd6f 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1092,12 +1092,12 @@ static void __unqueue_futex(struct futex_q *q)
  * The hash bucket lock must be held when this is called.
  * Afterwards, the futex_q must not be accessed.
  */
-static void wake_futex(struct futex_q *q)
+static struct task_struct *__wake_futex(struct futex_q *q)
 {
 	struct task_struct *p = q->task;
 
 	if (WARN(q->pi_state || q->rt_waiter, "refusing to wake PI futex\n"))
-		return;
+		return NULL;
 
 	/*
 	 * We set q->lock_ptr = NULL _before_ we wake up the task. If
@@ -1117,6 +1117,15 @@ static void wake_futex(struct futex_q *q)
 	 */
 	smp_wmb();
 	q->lock_ptr = NULL;
+	return p;
+}
+
+static void wake_futex(struct futex_q *q)
+{
+	struct task_struct *p = __wake_futex(q);
+
+	if (!p)
+		return;
 
 	wake_up_state(p, TASK_NORMAL);
 	put_task_struct(p);
@@ -1228,6 +1237,7 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
 	struct futex_hash_bucket *hb;
 	struct futex_q *this, *next;
 	union futex_key key = FUTEX_KEY_INIT;
+	struct task_struct *waiter = NULL;
 	int ret;
 
 	if (!bitset)
@@ -1256,13 +1266,21 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
 			if (!(this->bitset & bitset))
 				continue;
 
-			wake_futex(this);
+			if (nr_wake == 1)
+				waiter = __wake_futex(this);
+			else
+				wake_futex(this);
 			if (++ret >= nr_wake)
 				break;
 		}
 	}
 
 	spin_unlock(&hb->lock);
+	if (waiter) {
+		wake_up_state(waiter, TASK_NORMAL);
+		put_task_struct(waiter);
+	}
+
 out_put_key:
 	put_futex_key(&key);
 out:
--
2.1.4