All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: mingo@elte.hu, Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, rusty@rustcorp.com.au, ahu@ds9a.nl
Subject: Re: Futex queue_me/get_user ordering
Date: Mon, 15 Nov 2004 12:06:53 +0900	[thread overview]
Message-ID: <41981D4D.9030505@jp.fujitsu.com> (raw)
In-Reply-To: <20041115020148.GA17979@mail.shareable.org>

Jamie Lokier wrote:
> Third possibility: your test is buggy.  Do you actually use a mutex in
> your test when you call pthread_cond_wait, and does the waker hold it
> when it calls pthread_cond_signal?
> 
> If you don't use a mutex as you are supposed to with condvars, then it
> might not be a kernel or NPTL bug.  I'm not sure if POSIX-specified
> behaviour is defined when you use condvars without a mutex.
> 
> If you do use a mutex (and you just didn't mention it), then the code
> above is not enough to decide if there's an NPTL bug.  We need to look
> at pthread_cond_wait as well, to see how it does the "atomic" wait and
> mutex release.
> 
> -- Jamie

Now I'm asking our test team about that.

Again, from glibc-2.3.3(RHEL4b2):

[nptl/sysdeps/pthread/pthread_cond_wait.c]
   85 int
   86 __pthread_cond_wait (cond, mutex)
   87      pthread_cond_t *cond;
   88      pthread_mutex_t *mutex;
   89 {
   90   struct _pthread_cleanup_buffer buffer;
   91   struct _condvar_cleanup_buffer cbuffer;
   92   int err;
   93
   94   /* Make sure we are along.  */
   95   lll_mutex_lock (cond->__data.__lock);
   96
   97   /* Now we can release the mutex.  */
   98   err = __pthread_mutex_unlock_usercnt (mutex, 0);
   99   if (__builtin_expect (err, 0))
  100     {
  101       lll_mutex_unlock (cond->__data.__lock);
  102       return err;
  103     }
  104
  105   /* We have one new user of the condvar.  */
  106   ++cond->__data.__total_seq;
  107   ++cond->__data.__futex;
  108   cond->__data.__nwaiters += 1 << COND_CLOCK_BITS;
  109
  110   /* Remember the mutex we are using here.  If there is already a
  111      different address store this is a bad user bug.  Do not store
  112      anything for pshared condvars.  */
  113   if (cond->__data.__mutex != (void *) ~0l)
  114     cond->__data.__mutex = mutex;
  115
  116   /* Prepare structure passed to cancellation handler.  */
  117   cbuffer.cond = cond;
  118   cbuffer.mutex = mutex;
  119
  120   /* Before we block we enable cancellation.  Therefore we have to
  121      install a cancellation handler.  */
  122   __pthread_cleanup_push (&buffer, __condvar_cleanup, &cbuffer);
  123
  124   /* The current values of the wakeup counter.  The "woken" counter
  125      must exceed this value.  */
  126   unsigned long long int val;
  127   unsigned long long int seq;
  128   val = seq = cond->__data.__wakeup_seq;
  129   /* Remember the broadcast counter.  */
  130   cbuffer.bc_seq = cond->__data.__broadcast_seq;
  131
  132   do
  133     {
  134       unsigned int futex_val = cond->__data.__futex;
  135
  136       /* Prepare to wait.  Release the condvar futex.  */
  137       lll_mutex_unlock (cond->__data.__lock);
  138
  139       /* Enable asynchronous cancellation.  Required by the standard.  */
  140       cbuffer.oldtype = __pthread_enable_asynccancel ();
  141
  142       /* Wait until woken by signal or broadcast.  */
  143       lll_futex_wait (&cond->__data.__futex, futex_val);
  144
  145       /* Disable asynchronous cancellation.  */
  146       __pthread_disable_asynccancel (cbuffer.oldtype);
  147
  148       /* We are going to look at shared data again, so get the lock.  */
  149       lll_mutex_lock (cond->__data.__lock);
  150
  151       /* If a broadcast happened, we are done.  */
  152       if (cbuffer.bc_seq != cond->__data.__broadcast_seq)
  153         goto bc_out;
  154
  155       /* Check whether we are eligible for wakeup.  */
  156       val = cond->__data.__wakeup_seq;
  157     }
  158   while (val == seq || cond->__data.__woken_seq == val);
  159
  160   /* Another thread woken up.  */
  161   ++cond->__data.__woken_seq;
  162
  163  bc_out:
  164
  165   cond->__data.__nwaiters -= 1 << COND_CLOCK_BITS;
  166
  167   /* If pthread_cond_destroy was called on this varaible already,
  168      notify the pthread_cond_destroy caller all waiters have left
  169      and it can be successfully destroyed.  */
  170   if (cond->__data.__total_seq == -1ULL
  171       && cond->__data.__nwaiters < (1 << COND_CLOCK_BITS))
  172     lll_futex_wake (&cond->__data.__nwaiters, 1);
  173
  174   /* We are done with the condvar.  */
  175   lll_mutex_unlock (cond->__data.__lock);
  176
  177   /* The cancellation handling is back to normal, remove the handler.  */
  178   __pthread_cleanup_pop (&buffer, 0);
  179
  180   /* Get the mutex before returning.  */
  181   return __pthread_mutex_cond_lock (mutex);
  182 }

I'm not sure but it seems that the pseudo-code could be:

(mutex must be locked before calling pthread_cond_wait.)
-A01 pthread_cond_wait {
+A01 pthread_cond_wait (futex,mutex) {
+A0*   mutex_unlock(mutex);
  A02   timeout = 0;
  A03   lock(counters);
  A04     total++;
  A05     val = get_from(futex);
  A06   unlock(counters);
  A07
  A08   sys_futex(futex, FUTEX_WAIT, val, timeout);
  A09
  A10   lock(counters);
  A11     woken++;
  A12   unlock(counters);
+A1*   mutex_lock(mutex);
  A13 }

(and it's better to replace var "futex" to "cond".)

Is it possible that NPTL shut the window between mutex_unlock()
and actual queueing in futex_wait?


Thanks,
H.Seto


  reply	other threads:[~2004-11-15  3:13 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20041113164048.2f31a8dd.akpm@osdl.org>
2004-11-14  9:00 ` Futex queue_me/get_user ordering (was: 2.6.10-rc1-mm5 [u]) Emergency Services Jamie Lokier
2004-11-14  9:09   ` Andrew Morton
2004-11-14  9:23     ` Jamie Lokier
2004-11-14  9:50       ` bert hubert
2004-11-15 14:12         ` Jamie Lokier
2004-11-16  8:30           ` Futex queue_me/get_user ordering Hidetoshi Seto
2004-11-16 14:58             ` Jamie Lokier
2004-11-18  1:29               ` Hidetoshi Seto
2004-11-15  0:58       ` Hidetoshi Seto
2004-11-15  2:01         ` Jamie Lokier
2004-11-15  3:06           ` Hidetoshi Seto [this message]
2004-11-15 13:22             ` Jamie Lokier
2004-11-17  8:47               ` Jakub Jelinek
2004-11-18  2:10                 ` Hidetoshi Seto
2004-11-18  7:20                 ` Jamie Lokier
2004-11-18 19:47                   ` Jakub Jelinek
2005-03-17 10:26                     ` Jakub Jelinek
2005-03-17 15:20                       ` Jamie Lokier
2005-03-17 15:55                         ` Jakub Jelinek
2005-03-18 17:00                           ` Ingo Molnar
2005-03-21  2:55                             ` Jamie Lokier
2005-03-18 16:53                         ` Jakub Jelinek
2004-11-26 17:06                 ` Jamie Lokier
2004-11-28 17:36                   ` Joe Seigh
2004-11-29 11:24                   ` Jakub Jelinek
2004-11-29 21:50                     ` Jamie Lokier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=41981D4D.9030505@jp.fujitsu.com \
    --to=seto.hidetoshi@jp.fujitsu.com \
    --cc=ahu@ds9a.nl \
    --cc=akpm@osdl.org \
    --cc=jamie@shareable.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=rusty@rustcorp.com.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.