public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Darren Hart <dvhltc@us.ibm.com>
To: Blaise Gassend <blaise@willowgarage.com>
Cc: Jeremy Leibs <leibs@willowgarage.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	"lkml," <linux-kernel@vger.kernel.org>
Subject: Re: ERESTARTSYS escaping from sem_wait with RTLinux patch
Date: Tue, 13 Oct 2009 08:13:44 -0700	[thread overview]
Message-ID: <4AD49928.40600@us.ibm.com> (raw)
In-Reply-To: <1255424204.12737.233.camel@doodleydee>

Blaise Gassend wrote:
>> When 3495 finally get's to run and complete it's futex_wake() call, the task
>> still needs to be woken, so we wake it - but now it's enqueued with a different
>> futex_q, which now has a valid lock_ptr, so upon wake-up we expect a signal!
>>
>> OK, I believe this establishes root cause.  Now to come up with a fix...
> 
> Wow, good work Darren! You definitely have more experience than I do at
> tracking down these in-kernel races, and I'm glad we have you looking at
> this. I'm snarfing up useful techniques from your progress emails.

Great, I learn a lot from reading other people's status-type email as 
well.  Glad I can be on the contributing end once and a while :)

> 
> So if I understand correctly, there is a race between wake_futex and a
> timeout (or a signal, but AFAIK when my python example is running
> steady-state there are no signals). The problem occurs when wake_futex
> gets preempted half-way through, after it has NULLed lock_ptr, but
> before it has woken up the waiter. If a timeout/signal wakes the waiter,
> and the waiter runs and starts waiting again before the waker resumes,
> then the waker will end up waking the waiter a second time, without the
> lock_ptr for the second wait being NULLified. This causes the waiter to
> mis-interpret what woke it and leads to the fault we observed.
> 
> Now I am wondering if the workaround described here
> http://markmail.org/message/at2s337yz3wclaif
> for what seems like this same problem isn't actually a legitimate fix.
> It ends up looking something like this: (lines 3 and 4 are new)
> 
>   ret = -ERESTARTSYS;
>   if (!abs_time) {
>     if (!signal_pending(current))
>       set_tsk_thread_flag(current,TIF_SIGPENDING);
>     goto out_put_key;
>   }

The trouble with this is it is a bandaid to a fundamentally broken 
wake-up path. I tried flagging the waiters on the wake-list as already 
woken and then skipping them in the wake_futex_list(), but this got ugly 
really fast.

Talking with Thomas a bit more we're not sure the patch that introduced 
this lockless waking actually does any good, as the normal wakeup path 
doesn't take the hb->lock anyway, it's more likely the contention was 
due to an app like this that wakes a task and almost immediately puts it 
back to sleep on a futex before the waker has a chance to drop the hb->lock.

The futex wake-up path is complicated enough as it is, in my personal 
opinion, we are better off dropping the "lockless wake-up" patch and 
removing the race and simplifying the wake-up path at the same time.

-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

  reply	other threads:[~2009-10-13 15:14 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-10  9:09 ERESTARTSYS escaping from sem_wait with RTLinux patch Blaise Gassend
2009-10-10 16:40 ` ERESTARTSYS escaping from sem_wait with Preempt-RT Blaise Gassend
2009-10-10 17:59 ` ERESTARTSYS escaping from sem_wait with RTLinux patch Thomas Gleixner
2009-10-10 19:08   ` Jeremy Leibs
2009-10-11  2:07     ` Jeremy Leibs
2009-10-11  5:48   ` Jeremy Leibs
2009-10-12 14:16     ` Darren Hart
     [not found]       ` <1255384010.10236.123.camel@lts.willowgarage.com>
     [not found]         ` <4AD3BD57.6080703@us.ibm.com>
     [not found]           ` <4AD3D6AE.2050609@us.ibm.com>
     [not found]             ` <4AD3FFB0.5030405@us.ibm.com>
2009-10-13  4:54               ` Darren Hart
2009-10-13  8:56                 ` Blaise Gassend
2009-10-13 15:13                   ` Darren Hart [this message]
2009-10-13 18:45                     ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AD49928.40600@us.ibm.com \
    --to=dvhltc@us.ibm.com \
    --cc=blaise@willowgarage.com \
    --cc=leibs@willowgarage.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox