From: Thomas Gleixner <tglx@linutronix.de>
To: Darren Hart <dvhltc@us.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>,
linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>,
Eric Dumazet <eric.dumazet@gmail.com>,
John Kacur <jkacur@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
linux-rt-users@vger.kernel.org
Subject: Re: [PATCH 4/4] futex: convert hash_bucket locks to raw_spinlock_t
Date: Mon, 12 Jul 2010 22:43:12 +0200 (CEST) [thread overview]
Message-ID: <alpine.LFD.2.00.1007122241540.3321@localhost.localdomain> (raw)
In-Reply-To: <alpine.LFD.2.00.1007122226230.3321@localhost.localdomain>
On Mon, 12 Jul 2010, Thomas Gleixner wrote:
> On Mon, 12 Jul 2010, Darren Hart wrote:
> > On 07/10/2010 12:41 PM, Mike Galbraith wrote:
> > > On Fri, 2010-07-09 at 15:33 -0700, Darren Hart wrote:
> > > > > Out of curiosity, what's wrong with holding his pi_lock across the
> > > > > wakeup? He can _try_ to block, but can't until pi state is stable.
> > > > >
> > > > > I presume there's a big fat gotcha that's just not obvious to futex
> > > > > locking newbie :)
> >
> > Nor to some of us that have been engrossed in futexes for the last couple
> > years! I discussed the pi_lock across the wakeup issue with Thomas. While this
> > fixes the problem for this particular failure case, it doesn't protect
> > against:
> >
> > <tglx> assume the following:
> > <tglx> t1 is on the condvar
> > <tglx> t2 does the requeue dance and t1 is now blocked on the outer futex
> > <tglx> t3 takes hb->lock for a futex in the same bucket
> > <tglx> t2 wakes due to signal/timeout
> > <tglx> t2 blocks on hb->lock
> >
> > You are likely to have not hit the above scenario because you only had one
> > condvar, so the hash_buckets were not heavily shared and you weren't likely to
> > hit:
> >
> > <tglx> t3 takes hb->lock for a futex in the same bucket
> >
> >
> > I'm going to roll up a patchset with your (Mike) spin_trylock patch and run it
> > through some tests. I'd still prefer a way to detect early wakeup without
> > having to grab the hb->lock(), but I haven't found it yet.
> >
> > + while(!spin_trylock(&hb->lock))
> > + cpu_relax();
> > ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
> > spin_unlock(&hb->lock);
>
> And this is nasty as it will create unbound priority inversion :(
>
> We discussed another solution on IRC in meantime:
>
> in futex_wait_requeue_pi()
>
> futex_wait_queue_me(hb, &q, to);
>
> raw_spin_lock(current->pi_lock);
> if (current->pi_blocked_on) {
> /*
> * We know that we can only be blocked on the outer futex
> * so we can skip the early wakeup check
> */
> raw_spin_unlock(current->pi_lock);
> ret = 0;
> } else {
> current->pi_blocked_on = PI_WAKEUP_INPROGRESS;
> raw_spin_unlock(current->pi_lock);
>
> spin_lock(&hb->lock);
> ret = handle_early_requeue_pi_wakeup();
> ....
> spin_lock(&hb->lock);
> }
>
> Now in the rtmutex magic we need in task_blocks_on_rt_mutex():
>
> raw_spin_lock(task->pi_lock);
>
> /*
> * Add big fat comment why this is only relevant to futex
> * requeue_pi
> */
>
> if (task != current && task->pi_blocked_on == PI_WAKEUP_INPROGRESS) {
> raw_spin_lock(task->pi_lock);
>
> /*
> * Returning 0 here is fine. the requeue code is just going to
> * move the futex_q to the other bucket, but that'll be fixed
> * up in handle_early_requeue_pi_wakeup()
> */
>
> return 0;
We might also return a sensible error code here and just remove the
waiter from all queues, which needs to be handled in
handle_early_requeue_pi_wakeup() after acquiring hb->lock then.
Thanks,
tglx
next prev parent reply other threads:[~2010-07-12 20:43 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-07-09 22:32 [PATCH 0/4][RT] futex: fix tasks blocking on two rt_mutex locks Darren Hart
2010-07-09 22:32 ` [PATCH 1/4] rtmutex: avoid null derefence in WARN_ON Darren Hart
2010-07-10 0:29 ` Steven Rostedt
2010-07-10 14:42 ` Darren Hart
2010-07-09 22:32 ` [PATCH 2/4] rtmutex: add BUG_ON if a task attempts to block on two locks Darren Hart
2010-07-10 0:30 ` Steven Rostedt
2010-07-10 17:30 ` [PATCH 2/4 V2] " Darren Hart
2010-07-09 22:32 ` [PATCH 3/4] futex: free_pi_state outside of hb->lock sections Darren Hart
2010-07-09 22:55 ` [PATCH 3/4 V2] " Darren Hart
2010-07-10 0:32 ` Steven Rostedt
2010-07-10 14:41 ` Darren Hart
2010-07-12 10:35 ` [PATCH 3/4] " Thomas Gleixner
2010-07-12 10:46 ` Steven Rostedt
2010-07-09 22:33 ` [PATCH 4/4] futex: convert hash_bucket locks to raw_spinlock_t Darren Hart
2010-07-09 22:57 ` [PATCH 4/4 V2] " Darren Hart
2010-07-10 0:34 ` Steven Rostedt
2010-07-10 19:41 ` [PATCH 4/4] " Mike Galbraith
2010-07-11 13:33 ` Mike Galbraith
2010-07-11 15:10 ` Darren Hart
2010-07-12 11:45 ` Steven Rostedt
2010-07-12 12:12 ` Mike Galbraith
2010-07-12 19:10 ` Darren Hart
2010-07-12 20:40 ` Thomas Gleixner
2010-07-12 20:43 ` Thomas Gleixner [this message]
2010-07-13 3:09 ` Mike Galbraith
2010-07-13 7:12 ` Darren Hart
2010-07-12 13:05 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.00.1007122241540.3321@localhost.localdomain \
--to=tglx@linutronix.de \
--cc=dvhltc@us.ibm.com \
--cc=efault@gmx.de \
--cc=eric.dumazet@gmail.com \
--cc=jkacur@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rt-users@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox