From: Darren Hart <dvhltc@us.ibm.com>
To: Mike Galbraith <mgalbraith@suse.de>
Cc: linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>,
Eric Dumazet <eric.dumazet@gmail.com>,
John Kacur <jkacur@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
linux-rt-users@vger.kernel.org
Subject: Re: [PATCH 4/4] futex: convert hash_bucket locks to raw_spinlock_t
Date: Sun, 11 Jul 2010 08:10:14 -0700 [thread overview]
Message-ID: <4C39DED6.10502@us.ibm.com> (raw)
In-Reply-To: <1278855208.15197.6.camel@marge.simson.net>
On 07/11/2010 06:33 AM, Mike Galbraith wrote:
> On Sat, 2010-07-10 at 21:41 +0200, Mike Galbraith wrote:
>> On Fri, 2010-07-09 at 15:33 -0700, Darren Hart wrote:
>
>>> If we can't move the unlock above before set_owner, then we may need a:
>>>
>>> retry:
>>> cur->lock()
>>> top_waiter = get_top_waiter()
>>> cur->unlock()
>>>
>>> double_lock(cur, topwaiter)
>>> if top_waiter != get_top_waiter()
>>> double_unlock(cur, topwaiter)
>>> goto retry
>>>
>>> Not ideal, but I think I prefer that to making all the hb locks raw.
>
> Another option: only scratch the itchy spot.
>
> futex: non-blocking synchronization point for futex_wait_requeue_pi() and futex_requeue().
>
> Problem analysis by Darren Hart;
> The requeue_pi mechanism introduced proxy locking of the rtmutex. This creates
> a scenario where a task can wake-up, not knowing it has been enqueued on an
> rtmutex. In order to detect this, the task would have to be able to take either
> task->pi_blocked_on->lock->wait_lock and/or the hb->lock. Unfortunately,
> without already holding one of these, the pi_blocked_on variable can change
> from NULL to valid or from valid to NULL. Therefor, the task cannot be allowed
> to take a sleeping lock after wakeup or it could end up trying to block on two
> locks, the second overwriting a valid pi_blocked_on value. This obviously
> breaks the pi mechanism.
>
> Rather than convert the bh-lock to a raw spinlock, do so only in the spot where
> blocking cannot be allowed, ie before we know that lock handoff has completed.
I like it. I especially like the change is only evident if you are using
the code path that introduced the problem in the first place. If you're
doing a lot of requeue_pi operations, then the waking waiters have an
advantage over new pending waiters or other tasks with futex keyed on
the same hash-bucket... but that seems acceptable to me.
I'd like to confirm that holding the pendowner->pi-lock across the
wakeup in wakeup_next_waiter() isn't feasible first. If it can work, I
think the impact would be lower. I'll have a look tomorrow.
Nice work Mike.
--
Darrem
> Signed-off-by: Mike Galbraith<efault@gmx.de>
> Cc: Darren Hart<dvhltc@us.ibm.com>
> Cc: Thomas Gleixner<tglx@linutronix.de>
> Cc: Peter Zijlstra<peterz@infradead.org>
> Cc: Ingo Molnar<mingo@elte.hu>
> Cc: Eric Dumazet<eric.dumazet@gmail.com>
> Cc: John Kacur<jkacur@redhat.com>
> Cc: Steven Rostedt<rostedt@goodmis.org>
>
> diff --git a/kernel/futex.c b/kernel/futex.c
> index a6cec32..ef489f3 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -2255,7 +2255,14 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, int fshared,
> /* Queue the futex_q, drop the hb lock, wait for wakeup. */
> futex_wait_queue_me(hb,&q, to);
>
> - spin_lock(&hb->lock);
> + /*
> + * Non-blocking synchronization point with futex_requeue().
> + *
> + * We dare not block here because this will alter PI state, possibly
> + * before our waker finishes modifying same in wakeup_next_waiter().
> + */
> + while(!spin_trylock(&hb->lock))
> + cpu_relax();
> ret = handle_early_requeue_pi_wakeup(hb,&q,&key2, to);
> spin_unlock(&hb->lock);
> if (ret)
>
>
--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team
next prev parent reply other threads:[~2010-07-11 15:10 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-07-09 22:32 [PATCH 0/4][RT] futex: fix tasks blocking on two rt_mutex locks Darren Hart
2010-07-09 22:32 ` [PATCH 1/4] rtmutex: avoid null derefence in WARN_ON Darren Hart
2010-07-10 0:29 ` Steven Rostedt
2010-07-10 14:42 ` Darren Hart
2010-07-09 22:32 ` [PATCH 2/4] rtmutex: add BUG_ON if a task attempts to block on two locks Darren Hart
2010-07-10 0:30 ` Steven Rostedt
2010-07-10 17:30 ` [PATCH 2/4 V2] " Darren Hart
2010-07-09 22:32 ` [PATCH 3/4] futex: free_pi_state outside of hb->lock sections Darren Hart
2010-07-09 22:55 ` [PATCH 3/4 V2] " Darren Hart
2010-07-09 22:55 ` Darren Hart
2010-07-10 0:32 ` Steven Rostedt
2010-07-10 14:41 ` Darren Hart
2010-07-12 10:35 ` [PATCH 3/4] " Thomas Gleixner
2010-07-12 10:46 ` Steven Rostedt
2010-07-09 22:33 ` [PATCH 4/4] futex: convert hash_bucket locks to raw_spinlock_t Darren Hart
2010-07-09 22:57 ` [PATCH 4/4 V2] " Darren Hart
2010-07-10 0:34 ` Steven Rostedt
2010-07-10 19:41 ` [PATCH 4/4] " Mike Galbraith
2010-07-11 13:33 ` Mike Galbraith
2010-07-11 15:10 ` Darren Hart [this message]
2010-07-12 11:45 ` Steven Rostedt
2010-07-12 12:12 ` Mike Galbraith
2010-07-12 19:10 ` Darren Hart
2010-07-12 20:40 ` Thomas Gleixner
2010-07-12 20:43 ` Thomas Gleixner
2010-07-13 3:09 ` Mike Galbraith
2010-07-13 7:12 ` Darren Hart
2010-07-12 13:05 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4C39DED6.10502@us.ibm.com \
--to=dvhltc@us.ibm.com \
--cc=eric.dumazet@gmail.com \
--cc=jkacur@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rt-users@vger.kernel.org \
--cc=mgalbraith@suse.de \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.