* Re: [PATCH] Document Linux's memory barriers [try #5]
2006-03-23 18:34 ` [PATCH] Document Linux's memory barriers [try #5] David Howells
@ 2006-03-23 19:28 ` Linus Torvalds
2006-03-23 22:26 ` Paul E. McKenney
1 sibling, 0 replies; 3+ messages in thread
From: Linus Torvalds @ 2006-03-23 19:28 UTC
To: David Howells
Cc: akpm, linux-arch, linux-kernel, paulmck, davem, linuxppc64-dev
On Thu, 23 Mar 2006, David Howells wrote:
>
> > Some architectures have a more expansive definition of data dependency,
> > including then- and else-clauses being data-dependent on the if-condition,
> > but this is probably too much detail.
>
> Linus calls that a "control dependency" and doesn't seem to think that's a
> problem as it's sorted out by branch prediction. What you said makes me
> wonder about conditional instructions (such as conditional move).
I'd put it the other way: a control dependency is not "sorted out" by
branch prediction, it is effectively _nullified_ by branch prediction.
Basically, control dependencies aren't dependencies at all. There is
absolutely _zero_ dependency between the following two loads:
	if (load a)
		load b;
because the "load b" can happen before the "load a" because of control
prediction.
So if you need a read barrier where there is a _control_ dependency in
between loading a and loading b, you need to make it a full "smp_rmb()".
It is _not_ sufficient to make this a "read_barrier_depends", because the
load of b really doesn't depend on the load of a at all.
So data dependencies that result in control dependencies aren't
dependencies at all. Not even if the address depends on the control
dependency.
So
	int *address_of_b;

	address_of_b = load(&a);
	smp_read_barrier_depends();
	b = load(address_of_b);
is correct, but
	int *address_of_b = default_address_of_b;

	if (load(&a))
		address_of_b = another_address_of_b;
	smp_read_barrier_depends();
	b = load(address_of_b);
is NOT correct, because there is no data dependency on the load of b, just
a control dependency that the CPU may short-circuit with prediction, and
that second case thus needs a real smp_rmb().
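
A minimal userspace sketch of the corrected second case, assuming C11
<stdatomic.h> fences as rough stand-ins for the kernel barriers (an acquire
fence for smp_rmb(), a release fence for a writer-side smp_wmb()). The names
default_b, another_b, publish(), read_b() and example_usage() are invented
for illustration; kernel code would simply use smp_rmb() where the example
above has smp_read_barrier_depends().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared state; none of these names come from the patch. */
static int default_b = 1;
static int another_b = 2;
static atomic_int a;		/* flag, zero until the writer publishes */

/* Writer: fill in the data first, then set the flag. */
static void publish(int value)
{
	another_b = value;
	/* Assumed stand-in for smp_wmb(): order the data before the flag. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&a, 1, memory_order_relaxed);
}

/* Reader: the address is chosen by a branch, so only a control
 * dependency links the load of a to the load through address_of_b. */
static int read_b(void)
{
	int *address_of_b = &default_b;

	if (atomic_load_explicit(&a, memory_order_relaxed))
		address_of_b = &another_b;
	/* Assumed stand-in for smp_rmb(): a dependency barrier would not do. */
	atomic_thread_fence(memory_order_acquire);
	return *address_of_b;
}

/* Single-threaded walk-through of the two paths. */
static void example_usage(void)
{
	assert(read_b() == 1);		/* flag clear: default value */
	publish(42);
	assert(read_b() == 42);		/* flag set: published value */
}
```

The point of the sketch is only where the barrier sits: after the branch that
picks the address and before the load through it.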
And yes, if we ever hit a CPU that does actual data prediction, not just
control prediction, that will force smp_read_barrier_depends() to be the
same as smp_rmb() on such an architecture.
Linus
* Re: [PATCH] Document Linux's memory barriers [try #5]
From: Paul E. McKenney @ 2006-03-23 22:26 UTC
To: David Howells
Cc: akpm, linux-arch, linux-kernel, torvalds, davem, linuxppc64-dev
On Thu, Mar 23, 2006 at 06:34:27PM +0000, David Howells wrote:
> Paul E. McKenney <paulmck@us.ibm.com> wrote:
>
> > smp_mb__before_atomic_dec() and friends as well?
>
> These seem to be something Sparc64 related; or, at least, Sparc64 seems to do
> something weird with them.
>
> What are these meant to achieve anyway? They seem to just be barrier() on a
> lot of systems, even SMP ones.
On architectures such as x86 where atomic_dec() implies an smp_mb(),
they do nothing. On other architectures, they supply whatever memory
barrier is required.
So, on x86:
	smp_mb();
	atomic_dec(&my_atomic_counter);
would result in -two- atomic instructions, but the smp_mb() would be
absolutely required on CPUs with weaker memory-consistency models.
So your choice is to (1) be inefficient on x86 or (2) be unsafe on
weak-memory-consistency systems. What we can do instead is:
	smp_mb__before_atomic_dec();
	atomic_dec(&my_atomic_counter);
This allows x86 to generate efficient code -and- allows weak-memory
machines (e.g., Alpha, MIPS, PA-RISC(!), ppc, s390, SPARC64) to generate
safe code.
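
A rough userspace sketch of that idea, assuming C11 <stdatomic.h> and a
hypothetical macro my_mb_before_atomic_dec() standing in for
smp_mb__before_atomic_dec(). The macro, payload, release_ref() and
example_usage() are made up for illustration and are not the kernel's
definitions; the architecture test is likewise only an assumption about how
one might pick between the two cases.

```c
#include <assert.h>
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__)
/* The lock-prefixed decrement already acts as a full barrier on x86,
 * so only the compiler needs to be stopped from reordering. */
#define my_mb_before_atomic_dec() atomic_signal_fence(memory_order_seq_cst)
#else
/* Elsewhere, assume a real full fence is needed before the decrement. */
#define my_mb_before_atomic_dec() atomic_thread_fence(memory_order_seq_cst)
#endif

static atomic_int my_atomic_counter = 2;	/* hypothetical reference count */
static int payload;				/* data written before the drop */

/* Store to payload, then drop one reference; returns the new count. */
static int release_ref(int value)
{
	payload = value;
	my_mb_before_atomic_dec();
	/* atomic_fetch_sub_explicit() returns the old value, hence the - 1. */
	return atomic_fetch_sub_explicit(&my_atomic_counter, 1,
					 memory_order_relaxed) - 1;
}

/* Single-threaded walk-through: two references dropped in turn. */
static void example_usage(void)
{
	assert(release_ref(7) == 1);
	assert(release_ref(8) == 0);
}
```

Callers write the same two lines everywhere; only the macro's expansion
differs by architecture.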
Thanx, Paul