From: Peter Zijlstra <peterz@infradead.org>
To: tglx@linutronix.de
Cc: mingo@kernel.org, juri.lelli@arm.com, rostedt@goodmis.org,
xlpang@redhat.com, bigeasy@linutronix.de,
linux-kernel@vger.kernel.org, mathieu.desnoyers@efficios.com,
jdesfossez@efficios.com, bristot@redhat.com,
dvhart@infradead.org
Subject: Re: [PATCH -v4 00/10] FUTEX_UNLOCK_PI wobbles
Date: Tue, 13 Dec 2016 17:07:14 +0100 [thread overview]
Message-ID: <20161213160714.GF3061@worktop.programming.kicks-ass.net> (raw)
In-Reply-To: <20161213083638.938898295@infradead.org>
On Tue, Dec 13, 2016 at 09:36:38AM +0100, Peter Zijlstra wrote:
> The basic idea is to, like requeue PI, break the rt_mutex_lock() function into
> pieces, such that we can enqueue the waiter while holding hb->lock, wait for
> acquisition without hb->lock and can remove the waiter, on failure, while
> holding hb->lock again.
>
> That way, when we drop hb->lock to wait, futex and rt_mutex wait state is
> consistent.
And of course, there's a hole in...
There is a point in futex_unlock_pi() where we hold neither hb->lock nor
wait_lock, at that point a futex_lock_pi() that had failed its
rt_mutex_wait_proxy_lock() can sneak in and remove itself, even though
we saw its waiter, recreating a vraiant of the initial problem.
The below plugs the hole, but its rather fragile in that it relies on
overlapping critical sections and the specific detail that we call
rt_mutex_cleanup_proxy_lock() immediately after (re)acquiring hb->lock.
There is another solution, but that's more involved and uglier still.
I'll give it a bit more thought.
---
kernel/futex.c | 36 +++++++++++++++++++++++++-----------
kernel/locking/rtmutex.c | 21 ++++++++++++++++++---
kernel/locking/rtmutex_common.h | 2 +-
3 files changed, 44 insertions(+), 15 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1384,6 +1384,7 @@ static void mark_wake_futex(struct wake_
}
static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_state)
+ __releases(&pi_state->pi_mutex.wait_lock)
{
u32 uninitialized_var(curval), newval;
struct task_struct *new_owner;
@@ -1391,7 +1392,8 @@ static int wake_futex_pi(u32 __user *uad
DEFINE_WAKE_Q(wake_q);
int ret = 0;
- raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
+ lockdep_assert_held(&pi_state->pi_mutex.wait_lock);
+
new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);
BUG_ON(!new_owner);
@@ -2655,8 +2657,8 @@ static int futex_lock_pi(u32 __user *uad
* rt_mutex waitqueue, such that we can keep the hb and rt_mutex
* wait lists consistent.
*/
- if (ret)
- rt_mutex_cleanup_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter);
+ if (ret && !rt_mutex_cleanup_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter))
+ ret = 0;
did_trylock:
/*
@@ -2763,15 +2765,26 @@ static int futex_unlock_pi(u32 __user *u
if (pi_state->owner != current)
goto out_unlock;
+ get_pi_state(pi_state);
+
/*
- * Grab a reference on the pi_state and drop hb->lock.
+ * We must grab wait_lock _before_ dropping hb->lock, such that
+ * the critical sections overlap. Without this there is a hole
+ * in which futex_lock_pi()'s rt_mutex_wait_proxy_lock() can
+ * fail, re-acquire the hb->lock and wait_lock and have our
+ * top_waiter dissapear.
+ */
+ raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
+ /*
+ * Now that we have a reference on pi_state and hole wait_lock
+ * we can drop hb->lock without risk of a waiter dissapearing
+ * on us.
*
- * The reference ensures pi_state lives, dropping the hb->lock
- * is tricky.. wake_futex_pi() will take rt_mutex::wait_lock to
- * close the races against futex_lock_pi(), but in case of
- * _any_ fail we'll abort and retry the whole deal.
+ * Even if rt_mutex_wait_proxy_lock() fails, us holding
+ * wait_lock ensures it cannot be removed and the
+ * rt_mutex_cleanup_proxy_lock() call will find it owns the
+ * lock anyway.
*/
- get_pi_state(pi_state);
spin_unlock(&hb->lock);
ret = wake_futex_pi(uaddr, uval, pi_state);
@@ -3041,8 +3054,9 @@ static int futex_wait_requeue_pi(u32 __u
debug_rt_mutex_free_waiter(&rt_waiter);
spin_lock(q.lock_ptr);
- if (ret)
- rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter);
+ if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter))
+ ret = 0;
+
/*
* Fixup the pi_state owner and possibly acquire the lock if we
* haven't already.
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1779,16 +1779,31 @@ int rt_mutex_wait_proxy_lock(struct rt_m
*
* Clean up the failed lock acquisition as per rt_mutex_wait_proxy_lock().
*
+ * Returns:
+ * true - did cleanup, we done.
+ * false - we acquired the lock anyway, after rt_mutex_wait_proxy_lock(),
+ * caller should disregard its return value.
+ *
* Special API call for PI-futex support
*/
-void rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
+bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter)
{
+ bool cleanup = false;
+
raw_spin_lock_irq(&lock->wait_lock);
- remove_waiter(lock, waiter);
- fixup_rt_mutex_waiters(lock);
+ /*
+ * Check if we got the lock anyway...
+ */
+ if (rt_mutex_owner(lock) != current) {
+ remove_waiter(lock, waiter);
+ fixup_rt_mutex_waiters(lock);
+ cleanup = true;
+ }
raw_spin_unlock_irq(&lock->wait_lock);
+
+ return cleanup;
}
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -109,7 +109,7 @@ extern int rt_mutex_start_proxy_lock(str
extern int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
struct hrtimer_sleeper *to,
struct rt_mutex_waiter *waiter);
-extern void rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
+extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter);
extern int rt_mutex_futex_trylock(struct rt_mutex *l);
next prev parent reply other threads:[~2016-12-13 16:07 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-12-13 8:36 [PATCH -v4 00/10] FUTEX_UNLOCK_PI wobbles Peter Zijlstra
2016-12-13 8:36 ` [PATCH -v4 01/10] futex: Fix potential use-after-free in FUTEX_REQUEUE_PI Peter Zijlstra
2016-12-16 23:58 ` Darren Hart
2016-12-13 8:36 ` [PATCH -v4 02/10] futex: Add missing error handling to FUTEX_REQUEUE_PI Peter Zijlstra
2016-12-17 0:06 ` Darren Hart
2016-12-17 13:54 ` Peter Zijlstra
2016-12-18 23:31 ` Darren Hart
2016-12-13 8:36 ` [PATCH -v4 03/10] futex: Cleanup variable names for futex_top_waiter() Peter Zijlstra
2016-12-17 0:13 ` Darren Hart
2017-02-22 14:07 ` Peter Zijlstra
2016-12-13 8:36 ` [PATCH -v4 04/10] futex: Use smp_store_release() in mark_wake_futex() Peter Zijlstra
2016-12-17 0:50 ` Darren Hart
2017-02-22 14:03 ` Peter Zijlstra
2017-03-17 0:35 ` Darren Hart
2016-12-13 8:36 ` [PATCH -v4 05/10] futex: Remove rt_mutex_deadlock_account_*() Peter Zijlstra
2016-12-13 8:36 ` [PATCH -v4 06/10] futex,rt_mutex: Provide futex specific rt_mutex API Peter Zijlstra
2016-12-13 8:36 ` [PATCH -v4 07/10] futex: Change locking Peter Zijlstra
2016-12-13 8:36 ` [PATCH -v4 08/10] futex: Rework futex_lock_pi() vs rt_mutex_timed_futex_lock() Peter Zijlstra
2016-12-13 8:36 ` [PATCH -v4 09/10] futex: Remove inconsistent hb/rt_mutex state magic Peter Zijlstra
2016-12-13 8:36 ` [PATCH -v4 10/10] futex: Pull rt_mutex_futex_unlock() out from under hb->lock Peter Zijlstra
2016-12-13 16:07 ` Peter Zijlstra [this message]
2017-02-22 11:02 ` [PATCH -v4 00/10] FUTEX_UNLOCK_PI wobbles Peter Zijlstra
2017-02-22 15:36 ` Peter Zijlstra
2017-03-01 9:05 ` Thomas Gleixner
2017-03-03 9:35 ` Peter Zijlstra
2016-12-16 23:31 ` Darren Hart
2016-12-17 13:52 ` Peter Zijlstra
2016-12-18 22:39 ` Darren Hart
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161213160714.GF3061@worktop.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=bigeasy@linutronix.de \
--cc=bristot@redhat.com \
--cc=dvhart@infradead.org \
--cc=jdesfossez@efficios.com \
--cc=juri.lelli@arm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=xlpang@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.