Re: check *uaddr==val after queueing - without faulting

From: Darren Hart <dvhltc@us.ibm.com>
To: "lkml, " <linux-kernel@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>,
	John Stultz <johnstul@linux.vnet.ibm.com>,
	Jakub Jelinek <jakub@redhat.com>,
	Ulrich Drepper <drepper@redhat.com>,
	Eric Dumazet <dada1@cosmosbay.com>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: check *uaddr==val after queueing - without faulting
Date: Thu, 19 Mar 2009 15:01:58 -0700
Message-ID: <49C2C0D6.5080700@us.ibm.com>
In-Reply-To: <49C2BCF4.50908@us.ibm.com>

Adding a few key folks to the Cc, apologies for the short initial Cc list.

Darren Hart wrote:
> The current futex_wait() code (I'm looking at tip/core/futexes)
> conflicts with a warning in the comments about checking *uaddr==val
> before the futex_q is queued on the hb list.  While userspace is able to
> alter *uaddr at will and should expect to hang in the kernel forever
> should it do so haphazardly, there are legitimate scenarios where the
> futex value might change between the call to futex_wait() and when the
> futex_q gets on the hb list.
> 
> For example, glibc protects access to the value of cond.__data.__futex
> via the cond.__data.__lock.  However, before it can issue the syscall it
> has to drop the cond.__data.__lock, leaving a small race window where
> userspace might issue a signal or broadcast, which will modify the value
> of cond.__data.__futex.  As I understand it, this will result in the
> waiter having changed the value of the futex prior to entering the
> kernel, but not enqueuing itself on the hb list until after the waker
> issues the broadcast that was intended to wake it up.
> 
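To make the window concrete, here is a minimal single-threaded sketch of the case where the futex word changes after userspace has sampled it but before the syscall is issued. The names futex_wait() (a thin userspace wrapper, not the kernel function) and wait_on_stale_value() are made up for illustration, and the increment only stands in for what a signal or broadcast would do to cond.__data.__futex. This is an illustration of the FUTEX_WAIT value comparison only; it does not reproduce the later window between the kernel's check and the enqueue.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical userspace wrapper: block only if *uaddr still equals val. */
static long futex_wait(uint32_t *uaddr, uint32_t val)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAIT, val, NULL, NULL, 0);
}

/*
 * Sample the futex word, change it (as a signaller would once the
 * userspace lock is dropped), then wait on the stale sample.
 * Returns the errno reported by the kernel, or 0 if the wait succeeded.
 */
static int wait_on_stale_value(void)
{
    uint32_t futex_word = 0;
    uint32_t seen = futex_word;   /* value read while "holding the lock" */

    futex_word++;                 /* change made in the race window */

    errno = 0;
    long ret = futex_wait(&futex_word, seen);
    return ret == -1 ? errno : 0;
}
```

Calling wait_on_stale_value() returns EAGAIN rather than blocking, because the kernel sees that the word no longer matches the value passed in. A change that lands before the kernel's comparison is therefore caught; the open question is a change that lands after it.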
> I was working up a patch to move the test to after the call to
> queue_me(), but in order to do the test we also have to perform the
> get_user() after the queue_me(), which might sleep if we still hold the
> hb->lock.  If we let queue_me() drop the hb->lock before we call
> get_user() then we may see a legitimate change in *uaddr that occurred
> after the queue_me() and before the get_user().
> 
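The ordering question can be explored with a toy model outside the kernel. The sketch below is an assumption-laden simplification, not the futex code: struct model, waker() and lost_wakeup() are invented names, the "queue" is a single flag, and the waker is allowed to run at exactly one of three points relative to the waiter's two steps (check the value, put itself on the list).

```c
#include <stdbool.h>

enum order { CHECK_THEN_QUEUE, QUEUE_THEN_CHECK };

struct model {
    int  value;   /* stands in for *uaddr */
    bool queued;  /* waiter is on the (hypothetical) hash-bucket list */
    bool woken;   /* a wake reached the waiter */
};

/* Waker: change the value, then wake the waiter if it is queued right now. */
static void waker(struct model *m)
{
    m->value++;
    if (m->queued) {
        m->queued = false;
        m->woken = true;
    }
}

/*
 * Run the waiter's two steps in the given order, with the waker running
 * at slot 0 (before both steps), 1 (between them) or 2 (after both).
 * Returns true if the waiter is left queued without having been woken,
 * i.e. the wakeup was lost.
 */
static bool lost_wakeup(enum order o, int waker_slot)
{
    struct model m = { .value = 0, .queued = false, .woken = false };
    const int expected = 0;
    bool value_ok = true;

    for (int step = 0; step < 3; step++) {
        if (step == waker_slot)
            waker(&m);
        if (step == 2)
            break;

        bool do_check = (o == CHECK_THEN_QUEUE) ? (step == 0) : (step == 1);
        if (do_check) {
            value_ok = (m.value == expected);
            if (!value_ok)
                m.queued = false;   /* bail out, dequeueing if needed */
        } else if (value_ok) {
            m.queued = true;
        }
    }
    return m.queued && !m.woken;
}
```

In this model the only losing combination is lost_wakeup(CHECK_THEN_QUEUE, 1): the check passes, the waker finds nobody queued, and the waiter then queues itself with no wake coming. If the waker has to take the same lock that the waiter holds across both steps, slot 1 cannot occur, which is why the check and the enqueue need to be covered by one lock or the check needs to come after the enqueue.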
> I'm at a loss for how to resolve the race without causing the false
> positive inside the kernel.  It might be resolvable in glibc by looking
> at the return code from futex_requeue and checking whether the number
> woken or requeued agrees with the number it expected to be sleeping, but
> this likely leaves gaps for other waking calls, like FUTEX_WAKE.
> 
> Any thoughts?  Am I missing something that guards against this race?
> 


-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team


Thread overview: 3+ messages
2009-03-19 21:45 check *uaddr==val after queueing - without faulting Darren Hart
2009-03-19 22:01 ` Darren Hart [this message]
2009-03-20  8:15   ` Peter Zijlstra
