public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* RE: Fix unlock_buffer() to work the same way as bit_unlock()
@ 2006-03-29 22:56 Boehm, Hans
  2006-03-29 23:33 ` Christoph Lameter
  0 siblings, 1 reply; 37+ messages in thread
From: Boehm, Hans @ 2006-03-29 22:56 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Grundler, Grant G, Chen, Kenneth W, Nick Piggin, Zoltan Menyhart,
	akpm, linux-kernel, linux-ia64

The slides and paper are really discussing a different issue, namely the
extent to which it is safe to use a compiler like gcc which was
primarily designed for sequential code on concurrent code, using a model
similar to what pthreads (or I believe the kernel) uses.  The short
answer is that it isn't.  There is some low probablity that the compiler
will introduce a data race.  And that is very hard to anticipate based
on the source, and seems to be similarly hard to avoid by any known
programming guidelines.  The solution strategy is to fix language
standards and compilers.  We're working on the first part for now.

You can find the paper corresponding to those slides at
http://portal.acm.org/citation.cfm?doid=1065010.1065042 or
http://www.hpl.hp.com/techreports/2004/HPL-2004-209.html .

Returning to the original topic, I don't think I'm the one to design the
bitops API, since I'm not sufficiently familiar with the kernel issues.
I did design a vaguely comparable user-level interface that addresses
atomic operations in general, not specifically bit vector operations.
That's described at http://www.hpl.hp.com/research/linux/atomic_ops/, or
more specifically at
http://www.hpl.hp.com/research/linux/atomic_ops/README.txt .  I'm making
another pass over the C++ proposal version as we speak, but it's mostly
similar in spirit.  Design decisions that have turned out to be dubious
are:

1. Including all ordering types for simple load and store operations.
Some don't make sense.

2. The set of ordering constraints there is probably too large.  None,
acquire, release, and full really seem to be the important ones.
dd_acquire_read is nice, but probably nonsensical.  Hardware tends to
give you data dependent ordering for free, but the compiler doesn't
preserve it.

Hans

> -----Original Message-----
> From: Christoph Lameter 
> [mailto:christoph@schroedinger.engr.sgi.com] On Behalf Of 
> Christoph Lameter
> Sent: Wednesday, March 29, 2006 2:18 PM
> To: Boehm, Hans
> Cc: Grundler, Grant G; Chen, Kenneth W; Nick Piggin; Zoltan 
> Menyhart; akpm@osdl.org; linux-kernel@vger.kernel.org; 
> linux-ia64@vger.kernel.org
> Subject: RE: Fix unlock_buffer() to work the same way as bit_unlock()
> 
> On Wed, 29 Mar 2006, Boehm, Hans wrote:
> 
> > Somewhat improved slides for the talk Grant is referring to are at 
> > 
> http://www.hpl.hp.com/personal/Hans_Boehm/misc_slides/pldi05_threads.p
> > df
> 
> Hmm.. A paper on that subject would be better. Cannot get 
> much from the slides.
> 
> > It's hard to get this stuff right.  But we knew that.
> 
> Could you come up with a proposal how to correctly define and 
> API to bit ops in such a way that they work for all 
> architectures and allow us to utilize all the features that 
> processors may have?
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread
* RE: Fix unlock_buffer() to work the same way as bit_unlock()
@ 2006-03-30 22:26 Boehm, Hans
  0 siblings, 0 replies; 37+ messages in thread
From: Boehm, Hans @ 2006-03-30 22:26 UTC (permalink / raw)
  To: Christoph Lameter, Nick Piggin
  Cc: Zoltan Menyhart, Grundler, Grant G, Chen, Kenneth W, akpm,
	linux-kernel, linux-ia64

> From: Christoph Lameter 
> On Thu, 30 Mar 2006, Nick Piggin wrote:
> 
> > Zoltan Menyhart wrote:
> > 
> > > However, I do not think your implementation would be 
> efficient due 
> > > to selecting the ordering mode at run time:
> > > 
> > > > +    switch (mode) {
> > > > +    case MODE_NONE :
> > > > +    case MODE_ACQUIRE :
> > > > +        return cmpxchg_acq(m, old, new);
> > > > +    case MODE_FENCE :
> > > > +        smp_mb();
> > > > +        /* Fall through */
> > > > +    case MODE_RELEASE :
> > > > +        return cmpxchg_rel(m, old, new);
> > > 
> > 
> > BTW. Isn't MODE_FENCE wrong? Seems like a read or write 
> could be moved 
> > above cmpxchg_rel?
> 
> Hmmm.... We should call this MODE_BARRIER I guess...
>  
I arrived at the conclusion that "fence" is a better term, at least in
user-level code.  "Barrier" seems to generate confusion with
"pthread_barrier_wait" and similar constructs, which are a different
kind of beast.

Hans

^ permalink raw reply	[flat|nested] 37+ messages in thread
* RE: Fix unlock_buffer() to work the same way as bit_unlock()
@ 2006-03-30 17:57 Boehm, Hans
  2006-03-30 18:17 ` Christoph Lameter
  0 siblings, 1 reply; 37+ messages in thread
From: Boehm, Hans @ 2006-03-30 17:57 UTC (permalink / raw)
  To: Christoph Lameter, Zoltan Menyhart
  Cc: Nick Piggin, Grundler, Grant G, Chen, Kenneth W, akpm,
	linux-kernel, linux-ia64

> From: Christoph Lameter 
> 
> On Thu, 30 Mar 2006, Zoltan Menyhart wrote:
> 
> > Form semantical point of view, the forms:
> > 
> > 	bit_foo(..., mode)
> > and
> > 	bit_foo_mode(...)
> > 
> > are equivalent.
> 
> Correct but the above form leads to less macro definitions.
>  
> > However, I do not think your implementation would be 
> efficient due to 
> > selecting the ordering mode at run time:
> 
> The compiler will select that at compile time. One has the 
> option of also generating run time seletion by specifying a 
> variable instead of a constant when callig these functions.
I would view the latter as a disadvantage, since I can't think of a case
in which you wouldn't want it reported as an error instead, at least if
you care about performance.  If you know of one, I'd be very interested.

The first form does have the advantage that it's possible to build up
more complicated primitives from simpler ones without repeating the
definition four times.

I'm not sure there's a clear winner here.

In the C++ case, I currently expect we will go with template arguments,
which are guaranteed to be static, but are an option you don't have ...

Hans

^ permalink raw reply	[flat|nested] 37+ messages in thread
* RE: Fix unlock_buffer() to work the same way as bit_unlock()
@ 2006-03-29 19:31 Boehm, Hans
  2006-03-29 22:17 ` Christoph Lameter
  0 siblings, 1 reply; 37+ messages in thread
From: Boehm, Hans @ 2006-03-29 19:31 UTC (permalink / raw)
  To: Grundler, Grant G
  Cc: Christoph Lameter, Chen, Kenneth W, Nick Piggin, Zoltan Menyhart,
	akpm, linux-kernel, linux-ia64

That was actually based on a different, earlier paper.

Somewhat improved slides for the talk Grant is referring to are at
http://www.hpl.hp.com/personal/Hans_Boehm/misc_slides/pldi05_threads.pdf
.  The problem is also relevant for kernel development, though the title
doesn't fit, and it clearly needs to be addressed at the language spec
and compiler level.  (Note that the claim about gcc on slide 14 is
actually incorrect as it stands (I misread the .s file), but the claim
is correct if you add a conditional to the body of the example loop.
Thus you won't be led far astray.)  The PLDI paper on which the talk is
based contained a conjecture about required ordering for Posix locks,
which is disproved by the TR below.

It's hard to get this stuff right.  But we knew that.

Hans

> -----Original Message-----
> From: Grundler, Grant G 
> Sent: Wednesday, March 29, 2006 11:12 AM
> To: Boehm, Hans
> Cc: Christoph Lameter; Chen, Kenneth W; Nick Piggin; Zoltan 
> Menyhart; akpm@osdl.org; linux-kernel@vger.kernel.org; 
> linux-ia64@vger.kernel.org
> Subject: Re: Fix unlock_buffer() to work the same way as bit_unlock()
> 
> On Wed, Mar 29, 2006 at 10:33:57AM -0800, Boehm, Hans wrote:
> ...
> > - At user level, the ordering semantics required for something like
> > pthread_mutex_lock() are unfortunately unclear.  If you try to 
> > interpret the current standard, you arrive at the conclusion that
> > pthread_mutex_lock() basically needs a full barrier, though
> > pthread_mutex_unlock() doesn't.  (See
> > http://www.hpl.hp.com/techreports/2005/HPL-2005-217.html .)
> 
> Was the talk you presented at the May 2005 Gelato meeting in 
> Cupertino based on an earlier version of this paper?
> 
> That was a very good presentation that exposed the 
> deficiencies in the programming models and languages.  If the 
> slides and/or a recording are available, that might be 
> helpful here too.
> 
> thanks,
> grant
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread
* RE: Fix unlock_buffer() to work the same way as bit_unlock()
@ 2006-03-29 18:33 Boehm, Hans
  2006-03-29 19:11 ` Grant Grundler
  0 siblings, 1 reply; 37+ messages in thread
From: Boehm, Hans @ 2006-03-29 18:33 UTC (permalink / raw)
  To: Christoph Lameter, Chen, Kenneth W
  Cc: Nick Piggin, Zoltan Menyhart, akpm, linux-kernel, linux-ia64

For what it's worth, I've been involved in some recent work whose goals
currently include properly specifying such things at the user level,
currently primarily in the context of C++, with the hope that this will
eventually migrate into C.  (See
http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/ for details.)

Some potentially relevant observations:

- Java uses the terms "acquire" and "release" to express approximately
what "lock" and "unlock" mean here.  The major difference is that
performing a "release" operation on a particular object only ensures
visibility to threads that later perform an "acquire" operation on the
same object.  I'm not sure that there are any architectures on which
that difference would currently result in a different implementation,
though it may matter for software DSM implementations.  I think this
terminology has been in fairly widespread use in the architecture
community since at least 1990 or so.  (Google "release consistency".)

- C# uses similar terminology, as we have been doing so far for C++.
I'm not convinced these are Itanium specific terms.

- At user level, the ordering semantics required for something like
pthread_mutex_lock() are unfortunately unclear.  If you try to interpret
the current standard, you arrive at the conclusion that
pthread_mutex_lock() basically needs a full barrier, though
pthread_mutex_unlock() doesn't.  (See
http://www.hpl.hp.com/techreports/2005/HPL-2005-217.html .)  There is
some evidence that this is unintentional, inconsistently implemented,
and probably undesired. 

- The C++ equivalent of this is not very far along and still
controversial.  But I think the current majority sentiment is in favor
of expressing the ordering constraints in a way that ensures they cannot
accidentally become dynamically computed arguments. i.e. in favor of
making them either part of the operation name or template arguments, but
not regular arguments.

Hans

> -----Original Message-----
> From: linux-ia64-owner@vger.kernel.org 
> [mailto:linux-ia64-owner@vger.kernel.org] On Behalf Of 
> Christoph Lameter
> Sent: Tuesday, March 28, 2006 11:11 PM
> To: Chen, Kenneth W
> Cc: 'Nick Piggin'; Zoltan Menyhart; akpm@osdl.org; 
> linux-kernel@vger.kernel.org; linux-ia64@vger.kernel.org
> Subject: RE: Fix unlock_buffer() to work the same way as bit_unlock()
> 
> On Tue, 28 Mar 2006, Chen, Kenneth W wrote:
> 
> > Nick Piggin wrote on Tuesday, March 28, 2006 6:36 PM
> > > Hmm, not sure. Maybe a few new bitops with _lock / 
> _unlock postfixes?
> > > For page lock and buffer lock we'd just need 
> test_and_set_bit_lock, 
> > > clear_bit_unlock, smp_mb__after_clear_bit_unlock.
> > > 
> > > I don't know, _for_lock might be a better name. But it's 
> getting long.
> > 
> > I think kernel needs all 4 variants:
> > 
> > clear_bit
> > clear_bit_lock
> > clear_bit_unlock
> > clear_bit_fence
> > 
> > And the variant need to permutated on all other bit ops ... 
>  I think 
> > it would be indeed a better API and be more explicit about 
> the ordering.
> 
> How about clear_bit(why, bit, address) in order to keep the 
> variants down? Get rid of the smp_mb__*_xxxx stuff.
> 
> 
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-ia64" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread
* Fix unlock_buffer() to work the same way as bit_unlock()
@ 2006-03-28  3:59 Christoph Lameter
  2006-03-28  8:10 ` Nick Piggin
       [not found] ` <442AA13B.3050104@bull.net>
  0 siblings, 2 replies; 37+ messages in thread
From: Christoph Lameter @ 2006-03-28  3:59 UTC (permalink / raw)
  To: akpm; +Cc: nickpiggin, Zoltan.Menyhart, linux-kernel, linux-ia64

Currently unlock_buffer() contains a smb_mb__after_clear_bit() which is 
weird because bit_spin_unlock() uses smb_mb__before_clear_bit():

>From include/linux/bit_spinlock.h:

static inline void bit_spin_unlock(int bitnum, unsigned long *addr)
{
        smp_mb__before_clear_bit();
        clear_bit(bitnum, addr);
        preempt_enable();
        __release(bitlock);
}

For most architectures there is no difference because both
smp_mb__after_clear_bit() and smp_mb__before_clear_bit() are both
memory barriers and clear_buffer_locked() is an atomic operation.
However, they differ under IA64.

Note that this potential race has never been seen under IA64. It was 
discovered by inspection by Zoltan Menyhart <Zoltan.Menyhart@free.fr>. 

Regardless if this is a true race or not, I think the unlock sequence 
needs to be the same for bit locks and unlock_buffer(). Maybe 
unlock_buffer and lock_buffer better use bit spinlock operations?

Change unlock_buffer() to work the same way as bit_spin_unlock.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6/fs/buffer.c
===================================================================
--- linux-2.6.orig/fs/buffer.c	2006-03-27 14:09:54.000000000 -0800
+++ linux-2.6/fs/buffer.c	2006-03-27 19:40:32.000000000 -0800
@@ -78,8 +78,8 @@ EXPORT_SYMBOL(__lock_buffer);
 
 void fastcall unlock_buffer(struct buffer_head *bh)
 {
+	smp_mb__before_clear_bit();
 	clear_buffer_locked(bh);
-	smp_mb__after_clear_bit();
 	wake_up_bit(&bh->b_state, BH_Lock);
 }
 

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2006-03-30 22:26 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-03-29 22:56 Fix unlock_buffer() to work the same way as bit_unlock() Boehm, Hans
2006-03-29 23:33 ` Christoph Lameter
2006-03-29 23:49   ` Chen, Kenneth W
2006-03-29 23:50     ` Christoph Lameter
2006-03-29 23:50     ` Grant Grundler
2006-03-30  8:43   ` Zoltan Menyhart
2006-03-30  8:55     ` Nick Piggin
2006-03-30 19:11       ` Christoph Lameter
2006-03-30 17:17     ` Christoph Lameter
  -- strict thread matches above, loose matches on Subject: below --
2006-03-30 22:26 Boehm, Hans
2006-03-30 17:57 Boehm, Hans
2006-03-30 18:17 ` Christoph Lameter
2006-03-29 19:31 Boehm, Hans
2006-03-29 22:17 ` Christoph Lameter
2006-03-29 18:33 Boehm, Hans
2006-03-29 19:11 ` Grant Grundler
2006-03-28  3:59 Christoph Lameter
2006-03-28  8:10 ` Nick Piggin
2006-03-28 18:53   ` Chen, Kenneth W
2006-03-28 21:42     ` Zoltan Menyhart
2006-03-28 23:48       ` Christoph Lameter
2006-03-29  0:07         ` Nick Piggin
2006-03-29  2:23           ` Christoph Lameter
2006-03-29  2:35             ` Nick Piggin
2006-03-29  6:46               ` Chen, Kenneth W
2006-03-29  7:11                 ` Christoph Lameter
2006-03-30  1:34                 ` Nick Piggin
2006-03-29  0:12         ` Chen, Kenneth W
2006-03-29  0:27         ` Chen, Kenneth W
2006-03-29  0:47           ` Christoph Lameter
2006-03-29  1:39             ` Chen, Kenneth W
2006-03-29 12:16               ` Zoltan Menyhart
2006-03-30  1:56                 ` Nick Piggin
2006-03-29 10:57         ` Zoltan Menyhart
2006-03-29  6:50   ` Chen, Kenneth W
2006-03-30  1:36     ` Nick Piggin
     [not found] ` <442AA13B.3050104@bull.net>
2006-03-30  1:57   ` Nick Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox