From: Mike Galbraith <efault@gmx.de>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Darren Hart <dvhltc@us.ibm.com>,
linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>,
Eric Dumazet <eric.dumazet@gmail.com>,
John Kacur <jkacur@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
linux-rt-users@vger.kernel.org
Subject: Re: [PATCH 4/4] futex: convert hash_bucket locks to raw_spinlock_t
Date: Tue, 13 Jul 2010 05:09:20 +0200 [thread overview]
Message-ID: <1278990560.7516.50.camel@marge.simson.net> (raw)
In-Reply-To: <alpine.LFD.2.00.1007122226230.3321@localhost.localdomain>
On Mon, 2010-07-12 at 22:40 +0200, Thomas Gleixner wrote:
> On Mon, 12 Jul 2010, Darren Hart wrote:
> > On 07/10/2010 12:41 PM, Mike Galbraith wrote:
> > > On Fri, 2010-07-09 at 15:33 -0700, Darren Hart wrote:
> > > > > Out of curiosity, what's wrong with holding his pi_lock across the
> > > > > wakeup? He can _try_ to block, but can't until pi state is stable.
> > > > >
> > > > > I presume there's a big fat gotcha that's just not obvious to futex
> > > > > locking newbie :)
> >
> > Nor to some of us that have been engrossed in futexes for the last couple
> > years! I discussed the pi_lock across the wakeup issue with Thomas. While this
> > fixes the problem for this particular failure case, it doesn't protect
> > against:
> >
> > <tglx> assume the following:
> > <tglx> t1 is on the condvar
> > <tglx> t2 does the requeue dance and t1 is now blocked on the outer futex
> > <tglx> t3 takes hb->lock for a futex in the same bucket
> > <tglx> t2 wakes due to signal/timeout
> > <tglx> t2 blocks on hb->lock
> >
> > You are likely to have not hit the above scenario because you only had one
> > condvar, so the hash_buckets were not heavily shared and you weren't likely to
> > hit:
> >
> > <tglx> t3 takes hb->lock for a futex in the same bucket
> >
> >
> > I'm going to roll up a patchset with your (Mike) spin_trylock patch and run it
> > through some tests. I'd still prefer a way to detect early wakeup without
> > having to grab the hb->lock(), but I haven't found it yet.
> >
> > + while(!spin_trylock(&hb->lock))
> > + cpu_relax();
> > ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
> > spin_unlock(&hb->lock);
>
> And this is nasty as it will create unbound priority inversion :(
Oh ma gawd, _it's a train_ :>
> We discussed another solution on IRC in meantime:
>
> in futex_wait_requeue_pi()
>
> futex_wait_queue_me(hb, &q, to);
>
> raw_spin_lock(current->pi_lock);
> if (current->pi_blocked_on) {
> /*
> * We know that we can only be blocked on the outer futex
> * so we can skip the early wakeup check
> */
> raw_spin_unlock(current->pi_lock);
> ret = 0;
> } else {
> current->pi_blocked_on = PI_WAKEUP_INPROGRESS;
> raw_spin_unlock(current->pi_lock);
>
> spin_lock(&hb->lock);
> ret = handle_early_requeue_pi_wakeup();
> ....
> spin_lock(&hb->lock);
> }
>
> Now in the rtmutex magic we need in task_blocks_on_rt_mutex():
>
> raw_spin_lock(task->pi_lock);
>
> /*
> * Add big fat comment why this is only relevant to futex
> * requeue_pi
> */
>
> if (task != current && task->pi_blocked_on == PI_WAKEUP_INPROGRESS) {
> raw_spin_lock(task->pi_lock);
>
> /*
> * Returning 0 here is fine. the requeue code is just going to
> * move the futex_q to the other bucket, but that'll be fixed
> * up in handle_early_requeue_pi_wakeup()
> */
>
> return 0;
> }
>
> Thanks,
>
> tglx
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2010-07-13 3:09 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-07-09 22:32 [PATCH 0/4][RT] futex: fix tasks blocking on two rt_mutex locks Darren Hart
2010-07-09 22:32 ` [PATCH 1/4] rtmutex: avoid null derefence in WARN_ON Darren Hart
2010-07-10 0:29 ` Steven Rostedt
2010-07-10 14:42 ` Darren Hart
2010-07-09 22:32 ` [PATCH 2/4] rtmutex: add BUG_ON if a task attempts to block on two locks Darren Hart
2010-07-10 0:30 ` Steven Rostedt
2010-07-10 17:30 ` [PATCH 2/4 V2] " Darren Hart
2010-07-09 22:32 ` [PATCH 3/4] futex: free_pi_state outside of hb->lock sections Darren Hart
2010-07-09 22:55 ` [PATCH 3/4 V2] " Darren Hart
2010-07-09 22:55 ` Darren Hart
2010-07-10 0:32 ` Steven Rostedt
2010-07-10 14:41 ` Darren Hart
2010-07-12 10:35 ` [PATCH 3/4] " Thomas Gleixner
2010-07-12 10:46 ` Steven Rostedt
2010-07-09 22:33 ` [PATCH 4/4] futex: convert hash_bucket locks to raw_spinlock_t Darren Hart
2010-07-09 22:57 ` [PATCH 4/4 V2] " Darren Hart
2010-07-10 0:34 ` Steven Rostedt
2010-07-10 19:41 ` [PATCH 4/4] " Mike Galbraith
2010-07-11 13:33 ` Mike Galbraith
2010-07-11 15:10 ` Darren Hart
2010-07-12 11:45 ` Steven Rostedt
2010-07-12 12:12 ` Mike Galbraith
2010-07-12 19:10 ` Darren Hart
2010-07-12 20:40 ` Thomas Gleixner
2010-07-12 20:43 ` Thomas Gleixner
2010-07-13 3:09 ` Mike Galbraith [this message]
2010-07-13 7:12 ` Darren Hart
2010-07-12 13:05 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1278990560.7516.50.camel@marge.simson.net \
--to=efault@gmx.de \
--cc=dvhltc@us.ibm.com \
--cc=eric.dumazet@gmail.com \
--cc=jkacur@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rt-users@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.