linux-rt-users.vger.kernel.org archive mirror
From: Darren Hart <dvhltc@us.ibm.com>
To: Mike Galbraith <mgalbraith@suse.de>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	John Kacur <jkacur@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-rt-users@vger.kernel.org
Subject: Re: [PATCH 4/4] futex: convert hash_bucket locks to raw_spinlock_t
Date: Sun, 11 Jul 2010 08:10:14 -0700
Message-ID: <4C39DED6.10502@us.ibm.com>
In-Reply-To: <1278855208.15197.6.camel@marge.simson.net>

On 07/11/2010 06:33 AM, Mike Galbraith wrote:
> On Sat, 2010-07-10 at 21:41 +0200, Mike Galbraith wrote:
>> On Fri, 2010-07-09 at 15:33 -0700, Darren Hart wrote:
>
>>> If we can't move the unlock to before set_owner, then we may need a retry loop:
>>>
>>> retry:
>>> cur->lock()
>>> top_waiter = get_top_waiter()
>>> cur->unlock()
>>>
>>> double_lock(cur, top_waiter)
>>> if top_waiter != get_top_waiter()
>>> 	double_unlock(cur, top_waiter)
>>> 	goto retry
>>>
>>> Not ideal, but I think I prefer that to making all the hb locks raw.
>
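
For reference, the lock/validate/retry pattern quoted above can be sketched
in userspace C. This is only an illustration under stated assumptions:
struct node, lock_with_top_waiter() and the pthread mutexes are hypothetical
stand-ins, not the kernel's hash bucket or rtmutex code. Nodes are assumed
never to be freed, and top_waiter is assumed to be non-NULL and different
from cur. double_lock() orders the two locks by address, which is one common
way to avoid an ABBA deadlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a lock with a "top waiter"; not a kernel type. */
struct node {
	pthread_mutex_t lock;
	struct node *top_waiter;	/* protected by lock; may change while unlocked */
};

/* Take both locks in a fixed (address) order so two callers cannot deadlock. */
static void double_lock(struct node *a, struct node *b)
{
	assert(a != b);
	if ((uintptr_t)a < (uintptr_t)b) {
		pthread_mutex_lock(&a->lock);
		pthread_mutex_lock(&b->lock);
	} else {
		pthread_mutex_lock(&b->lock);
		pthread_mutex_lock(&a->lock);
	}
}

static void double_unlock(struct node *a, struct node *b)
{
	pthread_mutex_unlock(&a->lock);
	pthread_mutex_unlock(&b->lock);
}

/*
 * Returns the top waiter of cur with both cur->lock and the top waiter's
 * lock held. The top waiter is sampled under cur->lock, the lock is dropped,
 * both locks are taken, and the sample is re-checked; if it went stale in
 * the unlocked window, everything is released and the loop starts over.
 */
static struct node *lock_with_top_waiter(struct node *cur)
{
	struct node *top;

	for (;;) {
		pthread_mutex_lock(&cur->lock);
		top = cur->top_waiter;
		pthread_mutex_unlock(&cur->lock);

		double_lock(cur, top);
		if (top == cur->top_waiter)
			return top;	/* sample still valid, both locks held */

		double_unlock(cur, top);
	}
}
```

The caller is expected to release both locks with double_unlock(cur, top)
once it is done.
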
> Another option: only scratch the itchy spot.
>
> futex: non-blocking synchronization point for futex_wait_requeue_pi() and futex_requeue().
>
> Problem analysis by Darren Hart:
> The requeue_pi mechanism introduced proxy locking of the rtmutex.  This creates
> a scenario where a task can wake up without knowing it has been enqueued on an
> rtmutex. In order to detect this, the task would have to be able to take
> task->pi_blocked_on->lock->wait_lock and/or the hb->lock.  Unfortunately,
> without already holding one of these, the pi_blocked_on variable can change
> from NULL to valid or from valid to NULL. Therefore, the task cannot be allowed
> to take a sleeping lock after wakeup or it could end up trying to block on two
> locks, the second overwriting a valid pi_blocked_on value. This obviously
> breaks the pi mechanism.
>
> Rather than convert the hb->lock to a raw spinlock, take it without blocking
> only in the spot where blocking cannot be allowed, i.e. before we know that
> lock handoff has completed.

I like it. I especially like that the change is only evident if you are using
the code path that introduced the problem in the first place. If you're
doing a lot of requeue_pi operations, then the waking waiters have an
advantage over new pending waiters or other tasks with futexes keyed on
the same hash bucket... but that seems acceptable to me.
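
As a rough userspace analogue of the trylock loop in the patch below, here is
a minimal sketch, assuming a pthread mutex in place of hb->lock. The names
relax() and lock_nonblocking() are hypothetical, and sched_yield() is only a
stand-in for cpu_relax(), which in the kernel is a CPU pause hint rather than
a scheduler call. The point is that the caller never sleeps on the lock
itself; it keeps retrying until the trylock succeeds.

```c
#include <pthread.h>
#include <sched.h>

/* Userspace stand-in for cpu_relax(); an assumption, not the kernel helper. */
static inline void relax(void)
{
	sched_yield();
}

/*
 * Acquire m without ever blocking on it. Returns the number of failed
 * attempts, which is 0 when the lock was free on the first try.
 * Note: any nonzero return from pthread_mutex_trylock() (normally EBUSY)
 * is treated as "try again", so m must be a valid, initialized mutex.
 */
static unsigned long lock_nonblocking(pthread_mutex_t *m)
{
	unsigned long spins = 0;

	while (pthread_mutex_trylock(m) != 0) {
		spins++;
		relax();
	}
	return spins;
}
```

A spinning waiter of this kind keeps retrying for the lock while the other
contenders sleep, which is the advantage mentioned above.
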

I'd like to confirm first that holding the pendowner->pi_lock across the
wakeup in wakeup_next_waiter() isn't feasible. If it can work, I
think the impact would be lower. I'll have a look tomorrow.

Nice work Mike.

--
Darren

> Signed-off-by: Mike Galbraith <efault@gmx.de>
> Cc: Darren Hart <dvhltc@us.ibm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: John Kacur <jkacur@redhat.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
>
> diff --git a/kernel/futex.c b/kernel/futex.c
> index a6cec32..ef489f3 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -2255,7 +2255,14 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, int fshared,
>   	/* Queue the futex_q, drop the hb lock, wait for wakeup. */
>   	futex_wait_queue_me(hb, &q, to);
>
> -	spin_lock(&hb->lock);
> +	/*
> +	 * Non-blocking synchronization point with futex_requeue().
> +	 *
> +	 * We dare not block here because this will alter PI state, possibly
> +	 * before our waker finishes modifying same in wakeup_next_waiter().
> +	 */
> +	while (!spin_trylock(&hb->lock))
> +		cpu_relax();
>   	ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
>   	spin_unlock(&hb->lock);
>   	if (ret)
>
>


-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

Thread overview: 27+ messages
2010-07-09 22:32 [PATCH 0/4][RT] futex: fix tasks blocking on two rt_mutex locks Darren Hart
2010-07-09 22:32 ` [PATCH 1/4] rtmutex: avoid null derefence in WARN_ON Darren Hart
2010-07-10  0:29   ` Steven Rostedt
2010-07-10 14:42     ` Darren Hart
2010-07-09 22:32 ` [PATCH 2/4] rtmutex: add BUG_ON if a task attempts to block on two locks Darren Hart
2010-07-10  0:30   ` Steven Rostedt
2010-07-10 17:30     ` [PATCH 2/4 V2] " Darren Hart
2010-07-09 22:32 ` [PATCH 3/4] futex: free_pi_state outside of hb->lock sections Darren Hart
2010-07-09 22:55   ` [PATCH 3/4 V2] " Darren Hart
2010-07-10  0:32     ` Steven Rostedt
2010-07-10 14:41       ` Darren Hart
2010-07-12 10:35   ` [PATCH 3/4] " Thomas Gleixner
2010-07-12 10:46     ` Steven Rostedt
2010-07-09 22:33 ` [PATCH 4/4] futex: convert hash_bucket locks to raw_spinlock_t Darren Hart
2010-07-09 22:57   ` [PATCH 4/4 V2] " Darren Hart
2010-07-10  0:34     ` Steven Rostedt
2010-07-10 19:41   ` [PATCH 4/4] " Mike Galbraith
2010-07-11 13:33     ` Mike Galbraith
2010-07-11 15:10       ` Darren Hart [this message]
2010-07-12 11:45       ` Steven Rostedt
2010-07-12 12:12         ` Mike Galbraith
2010-07-12 19:10     ` Darren Hart
2010-07-12 20:40       ` Thomas Gleixner
2010-07-12 20:43         ` Thomas Gleixner
2010-07-13  3:09         ` Mike Galbraith
2010-07-13  7:12           ` Darren Hart
2010-07-12 13:05   ` Thomas Gleixner
