From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: mingo@elte.hu, Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, rusty@rustcorp.com.au, ahu@ds9a.nl
Subject: Re: Futex queue_me/get_user ordering
Date: Mon, 15 Nov 2004 12:06:53 +0900	[thread overview]
Message-ID: <41981D4D.9030505@jp.fujitsu.com> (raw)
In-Reply-To: <20041115020148.GA17979@mail.shareable.org>

Jamie Lokier wrote:
> Third possibility: your test is buggy.  Do you actually use a mutex in
> your test when you call pthread_cond_wait, and does the waker hold it
> when it calls pthread_cond_signal?
> 
> If you don't use a mutex as you are supposed to with condvars, then it
> might not be a kernel or NPTL bug.  I'm not sure if POSIX-specified
> behaviour is defined when you use condvars without a mutex.
> 
> If you do use a mutex (and you just didn't mention it), then the code
> above is not enough to decide if there's an NPTL bug.  We need to look
> at pthread_cond_wait as well, to see how it does the "atomic" wait and
> mutex release.
> 
> -- Jamie

Now I'm asking our test team about that.
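
For reference, the usage that POSIX does define is roughly the
following (a minimal sketch with illustrative names; whether our test
actually does this is exactly what I'm checking with the team):

  #include <pthread.h>

  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
  static int ready;                /* the predicate, protected by "lock" */

  void *waiter (void *arg)
  {
    pthread_mutex_lock (&lock);
    while (!ready)                 /* always recheck the predicate */
      pthread_cond_wait (&cond, &lock);  /* unlocks + sleeps atomically */
    pthread_mutex_unlock (&lock);
    return NULL;
  }

  void *waker (void *arg)
  {
    pthread_mutex_lock (&lock);    /* hold the mutex while signaling */
    ready = 1;
    pthread_cond_signal (&cond);
    pthread_mutex_unlock (&lock);
    return NULL;
  }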

Again, from glibc-2.3.3 (RHEL4b2):

[nptl/sysdeps/pthread/pthread_cond_wait.c]
int
__pthread_cond_wait (cond, mutex)
     pthread_cond_t *cond;
     pthread_mutex_t *mutex;
{
  struct _pthread_cleanup_buffer buffer;
  struct _condvar_cleanup_buffer cbuffer;
  int err;

  /* Make sure we are alone.  */
  lll_mutex_lock (cond->__data.__lock);

  /* Now we can release the mutex.  */
  err = __pthread_mutex_unlock_usercnt (mutex, 0);
  if (__builtin_expect (err, 0))
    {
      lll_mutex_unlock (cond->__data.__lock);
      return err;
    }

  /* We have one new user of the condvar.  */
  ++cond->__data.__total_seq;
  ++cond->__data.__futex;
  cond->__data.__nwaiters += 1 << COND_CLOCK_BITS;

  /* Remember the mutex we are using here.  If there is already a
     different address store this is a bad user bug.  Do not store
     anything for pshared condvars.  */
  if (cond->__data.__mutex != (void *) ~0l)
    cond->__data.__mutex = mutex;

  /* Prepare structure passed to cancellation handler.  */
  cbuffer.cond = cond;
  cbuffer.mutex = mutex;

  /* Before we block we enable cancellation.  Therefore we have to
     install a cancellation handler.  */
  __pthread_cleanup_push (&buffer, __condvar_cleanup, &cbuffer);

  /* The current values of the wakeup counter.  The "woken" counter
     must exceed this value.  */
  unsigned long long int val;
  unsigned long long int seq;
  val = seq = cond->__data.__wakeup_seq;
  /* Remember the broadcast counter.  */
  cbuffer.bc_seq = cond->__data.__broadcast_seq;

  do
    {
      unsigned int futex_val = cond->__data.__futex;

      /* Prepare to wait.  Release the condvar futex.  */
      lll_mutex_unlock (cond->__data.__lock);

      /* Enable asynchronous cancellation.  Required by the standard.  */
      cbuffer.oldtype = __pthread_enable_asynccancel ();

      /* Wait until woken by signal or broadcast.  */
      lll_futex_wait (&cond->__data.__futex, futex_val);

      /* Disable asynchronous cancellation.  */
      __pthread_disable_asynccancel (cbuffer.oldtype);

      /* We are going to look at shared data again, so get the lock.  */
      lll_mutex_lock (cond->__data.__lock);

      /* If a broadcast happened, we are done.  */
      if (cbuffer.bc_seq != cond->__data.__broadcast_seq)
        goto bc_out;

      /* Check whether we are eligible for wakeup.  */
      val = cond->__data.__wakeup_seq;
    }
  while (val == seq || cond->__data.__woken_seq == val);

  /* Another thread woken up.  */
  ++cond->__data.__woken_seq;

 bc_out:

  cond->__data.__nwaiters -= 1 << COND_CLOCK_BITS;

  /* If pthread_cond_destroy was called on this variable already,
     notify the pthread_cond_destroy caller all waiters have left
     and it can be successfully destroyed.  */
  if (cond->__data.__total_seq == -1ULL
      && cond->__data.__nwaiters < (1 << COND_CLOCK_BITS))
    lll_futex_wake (&cond->__data.__nwaiters, 1);

  /* We are done with the condvar.  */
  lll_mutex_unlock (cond->__data.__lock);

  /* The cancellation handling is back to normal, remove the handler.  */
  __pthread_cleanup_pop (&buffer, 0);

  /* Get the mutex before returning.  */
  return __pthread_mutex_cond_lock (mutex);
}
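
(For context: lll_futex_wait() above is an arch-specific macro, but as
far as I can tell it boils down to the futex syscall with FUTEX_WAIT,
something like the sketch below -- "futex_wait" is my name for it, not
glibc's:)

  #include <linux/futex.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  static int futex_wait (unsigned int *addr, unsigned int val)
  {
    /* The kernel queues the caller and sleeps only if *addr still
       equals val at that point; otherwise it returns EWOULDBLOCK
       immediately instead of sleeping.  */
    return syscall (SYS_futex, addr, FUTEX_WAIT, val, NULL);
  }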

I'm not sure, but it seems the pseudo-code could be:

(mutex must be locked before calling pthread_cond_wait.)
-A01 pthread_cond_wait {
+A01 pthread_cond_wait (futex,mutex) {
+A0*   mutex_unlock(mutex);
  A02   timeout = 0;
  A03   lock(counters);
  A04     total++;
  A05     val = get_from(futex);
  A06   unlock(counters);
  A07
  A08   sys_futex(futex, FUTEX_WAIT, val, timeout);
  A09
  A10   lock(counters);
  A11     woken++;
  A12   unlock(counters);
+A1*   mutex_lock(mutex);
  A13 }

(and it would be better to rename the variable "futex" to "cond".)

Is it possible for NPTL to close the window between mutex_unlock()
and the actual queueing in futex_wait?
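
The interleaving I have in mind is something like this (a hypothetical
timeline; the kernel's revalidation of the futex value is what should
close the window, provided the queueing happens before the re-read):

  waiter                                waker
  ------                                -----
  mutex_unlock(mutex);
  val = *futex;          /* reads 0 */
                                        *futex = 1;
                                        sys_futex(futex, FUTEX_WAKE, 1);
                                        /* no one queued yet, wakes nobody */
  sys_futex(futex, FUTEX_WAIT, 0, 0);
    /* kernel queues us, re-reads *futex:
       1 != 0, so it returns EWOULDBLOCK
       instead of sleeping -- no lost wakeup */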


Thanks,
H.Seto

