From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Darren Hart <darren@dvhart.com>,
Steven Rostedt <rostedt@goodmis.org>,
fredrik.markstrom@windriver.com,
Davidlohr Bueso <dave@stgolabs.net>,
Manfred Spraul <manfred@colorfullife.com>
Subject: [PATCH 2/3 v2] futex: avoid double wake up in futex_wake() on -RT
Date: Fri, 10 Apr 2015 18:11:35 +0200 [thread overview]
Message-ID: <20150410161135.GF3057@linutronix.de> (raw)
In-Reply-To: <alpine.DEB.2.11.1504072131400.3845@nanos>
futex_wake() wakes the waiter while holding the hb->lock. The waiter
does not take the hb->lock and can leave the kernel. However the next
operation the same futex operation will point to the same hb->lock and
we will see a small dance around the lock including prio-boosting and
context switch:
low prio task FUTEX_WAKE on high prio
| ft-1489 [000] ....1.. 81.167501: sys_enter: sys_futex (8049f60, 1, 1, 0, 0, 0)
| ft-1489 [000] dN..311 81.167504: sched_wakeup: pid=1490 prio=94
| ft-1489 [000] d...311 81.167520: sched_switch: prev_pid=1489 prev_prio=120 prev_state=R+ ==> next_pid=1490 next_prio=94
| ft-1490 [000] ....1.. 81.167522: sys_exit: sys_futex = 0
prio task FUTEX_WAKE on low prio
| ft-1490 [000] ....1.. 81.167528: sys_enter: sys_futex (8049f5c, 1, 1, 0, 0, 0)
| ft-1490 [000] ....1.. 81.167530: sys_exit: sys_futex = 0
prio task waits FUTEX_WAIT, hb->lock still owned by low prio task
| ft-1490 [000] ....1.. 81.167534: sys_enter: sys_futex (8049f60, 0, 1, 0, 0, 0)
| ft-1490 [000] d...411 81.167895: sched_pi_setprio: pid=1489 oldprio=120 newprio=94
| ft-1490 [000] d...311 81.167901: sched_switch: prev_pid=1490 prev_prio=94 prev_state=D ==> next_pid=1489 next_prio=94
| ft-1489 [000] d...411 81.167910: sched_wakeup: pid=1490 prio=94
| ft-1489 [000] d...311 81.167912: sched_pi_setprio: pid=1489 oldprio=94 newprio=120
| ft-1489 [000] d...311 81.167915: sched_switch: prev_pid=1489 prev_prio=120 prev_state=R+ ==> next_pid=1490 next_prio=94
| ft-1490 [000] d...3.. 81.167922: sched_switch: prev_pid=1490 prev_prio=94 prev_state=S ==> next_pid=1489 next_prio=120
| ft-1489 [000] ....1.. 81.167924: sys_exit: sys_futex = 1
This patch delays the wakeup of the process untill the hb->lock is
dropped to avoid boosting + context switch to obtain the lock.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
v1…v2:
- update patch description
- move the comment to __wake_futex()
- move the wakeup before the out_put_key label in futex_wake()
kernel/futex.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/kernel/futex.c b/kernel/futex.c
index b38abe3573a8..658f4d05cd6f 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1092,12 +1092,12 @@ static void __unqueue_futex(struct futex_q *q)
* The hash bucket lock must be held when this is called.
* Afterwards, the futex_q must not be accessed.
*/
-static void wake_futex(struct futex_q *q)
+static struct task_struct *__wake_futex(struct futex_q *q)
{
struct task_struct *p = q->task;
if (WARN(q->pi_state || q->rt_waiter, "refusing to wake PI futex\n"))
- return;
+ return NULL;
/*
* We set q->lock_ptr = NULL _before_ we wake up the task. If
@@ -1117,6 +1117,15 @@ static void wake_futex(struct futex_q *q)
*/
smp_wmb();
q->lock_ptr = NULL;
+ return p;
+}
+
+static void wake_futex(struct futex_q *q)
+{
+ struct task_struct *p = __wake_futex(q);
+
+ if (!p)
+ return;
wake_up_state(p, TASK_NORMAL);
put_task_struct(p);
@@ -1228,6 +1237,7 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
struct futex_hash_bucket *hb;
struct futex_q *this, *next;
union futex_key key = FUTEX_KEY_INIT;
+ struct task_struct *waiter = NULL;
int ret;
if (!bitset)
@@ -1256,13 +1266,21 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
if (!(this->bitset & bitset))
continue;
- wake_futex(this);
+ if (nr_wake == 1)
+ waiter = __wake_futex(this);
+ else
+ wake_futex(this);
if (++ret >= nr_wake)
break;
}
}
spin_unlock(&hb->lock);
+ if (waiter) {
+ wake_up_state(waiter, TASK_NORMAL);
+ put_task_struct(waiter);
+ }
+
out_put_key:
put_futex_key(&key);
out:
--
2.1.4
next prev parent reply other threads:[~2015-04-10 16:11 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-07 15:03 improve futex on -RT by avoiding the double wake-up Sebastian Andrzej Siewior
2015-04-07 15:03 ` [PATCH 1/3] futex: avoid double wake up in PI futex wait / wake on -RT Sebastian Andrzej Siewior
2015-04-07 18:41 ` Thomas Gleixner
2015-04-10 14:42 ` [PATCH 1/3 v2] " Sebastian Andrzej Siewior
2015-04-07 15:03 ` [PATCH 2/3] futex: avoid double wake up in futex_wake() " Sebastian Andrzej Siewior
2015-04-07 19:47 ` Thomas Gleixner
2015-04-10 16:11 ` Sebastian Andrzej Siewior [this message]
2015-04-13 3:02 ` [PATCH 2/3 v2] " Davidlohr Bueso
2015-04-16 5:09 ` Davidlohr Bueso
2015-04-16 9:19 ` Thomas Gleixner
2015-04-16 10:16 ` Peter Zijlstra
2015-04-16 10:49 ` Thomas Gleixner
2015-04-16 14:42 ` Davidlohr Bueso
2015-04-16 15:54 ` Peter Zijlstra
2015-04-16 16:22 ` Davidlohr Bueso
2015-04-07 15:03 ` [PATCH 3/3] ipc/mqueue: remove STATE_PENDING Sebastian Andrzej Siewior
2015-04-07 17:48 ` Manfred Spraul
2015-04-07 18:28 ` Thomas Gleixner
2015-04-10 14:37 ` [PATCH v2] " Sebastian Andrzej Siewior
2015-04-23 22:18 ` Thomas Gleixner
2015-04-28 3:24 ` Davidlohr Bueso
2015-04-28 12:37 ` Peter Zijlstra
2015-04-28 16:36 ` Davidlohr Bueso
2015-04-28 16:43 ` Peter Zijlstra
2015-04-28 16:59 ` Davidlohr Bueso
2015-04-29 19:44 ` Manfred Spraul
2015-04-30 18:46 ` Davidlohr Bueso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150410161135.GF3057@linutronix.de \
--to=bigeasy@linutronix.de \
--cc=darren@dvhart.com \
--cc=dave@stgolabs.net \
--cc=fredrik.markstrom@windriver.com \
--cc=linux-kernel@vger.kernel.org \
--cc=manfred@colorfullife.com \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.