* [PATCH] x86 bitops.h commentary on instruction reordering
@ 2004-08-05 20:06 Marcelo Tosatti
2004-08-06 14:17 ` Vladislav Bolkhovitin
0 siblings, 1 reply; 19+ messages in thread
From: Marcelo Tosatti @ 2004-08-05 20:06 UTC (permalink / raw)
To: linux-kernel; +Cc: andrea, akpm
Hi,
Back when we were discussing the need for a memory barrier in sync_page(),
it came to me (thanks Andrea!) that the bit operations can be perfectly
reordered on architectures other than x86.
I think the commentary on i386 bitops.h is misleading, its worth
to note that that these operations are not guaranteed not to be
reordered on different architectures.
clear_bit() already does that:
* clear_bit() is atomic and may not be reordered. However, it does
* not contain a memory barrier, so if it is used for locking purposes,
* you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit()
* in order to ensure changes are visible on other processors.
What you think of the following
--- a/include/asm-i386/bitops.h.orig 2004-08-05 17:35:40.639336160 -0300
+++ b/include/asm-i386/bitops.h 2004-08-05 17:35:12.486616024 -0300
@@ -30,7 +30,12 @@
* @addr: the address to start counting from
*
* This function is atomic and may not be reordered. See __set_bit()
- * if you do not require the atomic guarantees.
+ * if you do not require the atomic guarantees.
+ *
+ * Note: there are no guarantees that this function will not be reordered
+ * on non x86 architectures, so if you are writting portable code,
+ * make sure not to rely on its reordering guarantees.
+ *
* Note that @nr may be almost arbitrarily large; this function is not
* restricted to acting on a single-word quantity.
*/
@@ -109,7 +114,8 @@ static inline void __change_bit(int nr,
* @nr: Bit to change
* @addr: Address to start counting from
*
- * change_bit() is atomic and may not be reordered.
+ * change_bit() is atomic and may not be reordered. It may be
+ * reordered on other architectures than x86.
* Note that @nr may be almost arbitrarily large; this function is not
* restricted to acting on a single-word quantity.
*/
@@ -127,6 +133,7 @@ static inline void change_bit(int nr, vo
* @addr: Address to count from
*
* This operation is atomic and cannot be reordered.
+ * It may be reordered on other architectures than x86.
* It also implies a memory barrier.
*/
static inline int test_and_set_bit(int nr, volatile unsigned long * addr)
@@ -165,7 +172,8 @@ static inline int __test_and_set_bit(int
* @nr: Bit to clear
* @addr: Address to count from
*
- * This operation is atomic and cannot be reordered.
+ * This operation is atomic and cannot be reordered.
+ * It can be reorderdered on other architectures other than x86.
* It also implies a memory barrier.
*/
static inline int test_and_clear_bit(int nr, volatile unsigned long * addr)
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-05 20:06 [PATCH] x86 bitops.h commentary on instruction reordering Marcelo Tosatti
@ 2004-08-06 14:17 ` Vladislav Bolkhovitin
2004-08-06 14:33 ` Marcelo Tosatti
0 siblings, 1 reply; 19+ messages in thread
From: Vladislav Bolkhovitin @ 2004-08-06 14:17 UTC (permalink / raw)
To: linux-kernel
So, is there any way to workaround this problem, i.e. prevent bit
operations reordering on non-x86 architectures? Some kinds of memory
barriers?
Vlad
Marcelo Tosatti wrote:
> Hi,
>
> Back when we were discussing the need for a memory barrier in sync_page(),
> it came to me (thanks Andrea!) that the bit operations can be perfectly
> reordered on architectures other than x86.
>
> I think the commentary on i386 bitops.h is misleading, its worth
> to note that that these operations are not guaranteed not to be
> reordered on different architectures.
>
> clear_bit() already does that:
>
> * clear_bit() is atomic and may not be reordered. However, it does
> * not contain a memory barrier, so if it is used for locking purposes,
> * you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit()
> * in order to ensure changes are visible on other processors.
>
> What you think of the following
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 14:17 ` Vladislav Bolkhovitin
@ 2004-08-06 14:33 ` Marcelo Tosatti
2004-08-06 15:36 ` Vladislav Bolkhovitin
0 siblings, 1 reply; 19+ messages in thread
From: Marcelo Tosatti @ 2004-08-06 14:33 UTC (permalink / raw)
To: Vladislav Bolkhovitin; +Cc: linux-kernel
On Fri, Aug 06, 2004 at 06:17:04PM +0400, Vladislav Bolkhovitin wrote:
> So, is there any way to workaround this problem, i.e. prevent bit
> operations reordering on non-x86 architectures? Some kinds of memory
> barriers?
Memory barriers, yes, smp_mb(), rmb, wmb and friends.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 14:33 ` Marcelo Tosatti
@ 2004-08-06 15:36 ` Vladislav Bolkhovitin
2004-08-06 15:53 ` Marcelo Tosatti
0 siblings, 1 reply; 19+ messages in thread
From: Vladislav Bolkhovitin @ 2004-08-06 15:36 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-kernel
Thanks.
One more question, if you don't object. How after some variable
assigment to make other CPUs *immediatelly* see the assigned value, i.e.
to make current CPU immediately flush its write cache in memory? *mb()
seems deal with reordering, barrier() with the compiler optimization (am
I right?). The similar memory barrier spin_lock() does, but it's not
easy to uderstand its internal magic.
Vlad
Marcelo Tosatti wrote:
> On Fri, Aug 06, 2004 at 06:17:04PM +0400, Vladislav Bolkhovitin wrote:
>
>>So, is there any way to workaround this problem, i.e. prevent bit
>>operations reordering on non-x86 architectures? Some kinds of memory
>>barriers?
>
>
> Memory barriers, yes, smp_mb(), rmb, wmb and friends.
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 15:36 ` Vladislav Bolkhovitin
@ 2004-08-06 15:53 ` Marcelo Tosatti
2004-08-06 16:52 ` Vladislav Bolkhovitin
2004-08-09 11:58 ` Maciej W. Rozycki
0 siblings, 2 replies; 19+ messages in thread
From: Marcelo Tosatti @ 2004-08-06 15:53 UTC (permalink / raw)
To: Vladislav Bolkhovitin; +Cc: linux-kernel
On Fri, Aug 06, 2004 at 07:36:25PM +0400, Vladislav Bolkhovitin wrote:
> Thanks.
>
> One more question, if you don't object. How after some variable
> assigment to make other CPUs *immediatelly* see the assigned value, i.e.
> to make current CPU immediately flush its write cache in memory? *mb()
> seems deal with reordering, barrier() with the compiler optimization (am
> I right?).
Yes correct. *mb() usually imply barrier().
About the flush, each architecture defines its own instruction for doing so,
PowerPC has "sync" and "isync" instructions (to flush the whole cache and instruction
cache respectively), MIPS has "sync" and so on..
> The similar memory barrier spin_lock() does, but it's not
> easy to uderstand its internal magic.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 15:53 ` Marcelo Tosatti
@ 2004-08-06 16:52 ` Vladislav Bolkhovitin
2004-08-06 17:09 ` Marcelo Tosatti
2004-08-09 11:58 ` Maciej W. Rozycki
1 sibling, 1 reply; 19+ messages in thread
From: Vladislav Bolkhovitin @ 2004-08-06 16:52 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-kernel
Marcelo Tosatti wrote:
> On Fri, Aug 06, 2004 at 07:36:25PM +0400, Vladislav Bolkhovitin wrote:
>
>>Thanks.
>>
>>One more question, if you don't object. How after some variable
>>assigment to make other CPUs *immediatelly* see the assigned value, i.e.
>>to make current CPU immediately flush its write cache in memory? *mb()
>>seems deal with reordering, barrier() with the compiler optimization (am
>>I right?).
>
>
> Yes correct. *mb() usually imply barrier().
>
> About the flush, each architecture defines its own instruction for doing so,
> PowerPC has "sync" and "isync" instructions (to flush the whole cache and instruction
> cache respectively), MIPS has "sync" and so on..
So, there is no platform independent way for doing that in the kernel?
>>The similar memory barrier spin_lock() does, but it's not
>>easy to uderstand its internal magic.
>
>
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 16:52 ` Vladislav Bolkhovitin
@ 2004-08-06 17:09 ` Marcelo Tosatti
2004-08-06 22:33 ` H. Peter Anvin
2004-08-09 15:14 ` Vladislav Bolkhovitin
0 siblings, 2 replies; 19+ messages in thread
From: Marcelo Tosatti @ 2004-08-06 17:09 UTC (permalink / raw)
To: Vladislav Bolkhovitin; +Cc: linux-kernel
On Fri, Aug 06, 2004 at 08:52:34PM +0400, Vladislav Bolkhovitin wrote:
> Marcelo Tosatti wrote:
> >On Fri, Aug 06, 2004 at 07:36:25PM +0400, Vladislav Bolkhovitin wrote:
> >
> >>Thanks.
> >>
> >>One more question, if you don't object. How after some variable
> >>assigment to make other CPUs *immediatelly* see the assigned value, i.e.
> >>to make current CPU immediately flush its write cache in memory? *mb()
> >>seems deal with reordering, barrier() with the compiler optimization (am
> >>I right?).
> >
> >
> >Yes correct. *mb() usually imply barrier().
> >
> >About the flush, each architecture defines its own instruction for doing
> >so,
> > PowerPC has "sync" and "isync" instructions (to flush the whole cache
> > and instruction cache respectively), MIPS has "sync" and so on..
>
> So, there is no platform independent way for doing that in the kernel?
Not really. x86 doesnt have such an instruction.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 17:09 ` Marcelo Tosatti
@ 2004-08-06 22:33 ` H. Peter Anvin
2004-08-07 1:01 ` Zwane Mwaikambo
2004-08-07 22:16 ` Alan Cox
2004-08-09 15:14 ` Vladislav Bolkhovitin
1 sibling, 2 replies; 19+ messages in thread
From: H. Peter Anvin @ 2004-08-06 22:33 UTC (permalink / raw)
To: linux-kernel
Followup to: <20040806170931.GA21683@logos.cnet>
By author: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
In newsgroup: linux.dev.kernel
> > >
> > >Yes correct. *mb() usually imply barrier().
> > >
> > >About the flush, each architecture defines its own instruction for doing
> > >so,
> > > PowerPC has "sync" and "isync" instructions (to flush the whole cache
> > > and instruction cache respectively), MIPS has "sync" and so on..
> >
> > So, there is no platform independent way for doing that in the kernel?
>
> Not really. x86 doesnt have such an instruction.
>
Actually it does (sfence, lfence, mfence); they only apply to SSE
loads and stores since all other x86 operations are guaranteed to be
strictly ordered.
-hpa
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 22:33 ` H. Peter Anvin
@ 2004-08-07 1:01 ` Zwane Mwaikambo
2004-08-07 22:16 ` Alan Cox
1 sibling, 0 replies; 19+ messages in thread
From: Zwane Mwaikambo @ 2004-08-07 1:01 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: linux-kernel
On Fri, 6 Aug 2004, H. Peter Anvin wrote:
> Followup to: <20040806170931.GA21683@logos.cnet>
> By author: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
> In newsgroup: linux.dev.kernel
> > > >
> > > >Yes correct. *mb() usually imply barrier().
> > > >
> > > >About the flush, each architecture defines its own instruction for doing
> > > >so,
> > > > PowerPC has "sync" and "isync" instructions (to flush the whole cache
> > > > and instruction cache respectively), MIPS has "sync" and so on..
> > >
> > > So, there is no platform independent way for doing that in the kernel?
> >
> > Not really. x86 doesnt have such an instruction.
> >
>
> Actually it does (sfence, lfence, mfence); they only apply to SSE
> loads and stores since all other x86 operations are guaranteed to be
> strictly ordered.
How about the, rather brutal, wbinvd?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 22:33 ` H. Peter Anvin
2004-08-07 1:01 ` Zwane Mwaikambo
@ 2004-08-07 22:16 ` Alan Cox
1 sibling, 0 replies; 19+ messages in thread
From: Alan Cox @ 2004-08-07 22:16 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Linux Kernel Mailing List
On Gwe, 2004-08-06 at 23:33, H. Peter Anvin wrote:
> > > So, there is no platform independent way for doing that in the kernel?
> > Not really. x86 doesnt have such an instruction.
> Actually it does (sfence, lfence, mfence); they only apply to SSE
> loads and stores since all other x86 operations are guaranteed to be
> strictly ordered.
The ordering guarantees on x86 are not completely strict and they refer
to the observed order of store on the bus rather than execution order.
So while code will generate the same observable results as if it was in
order you can get burned on timing suprises in some odd cases. Some of
the devices will even do weakly ordered stores to memory and fix up the
bus transactions by snooping them and spotting the impending problem.
Secondly on PPro and some others there are various chip errata where not
using locking operations _does_ cause observable out of order stores.
Thats why our unlock path for PPro uses lock movb.
Also for pedants only we support IDT Winchips (ancestor of the VIA C3)
which we do run with weak store ordering from an external view (not an
internal one). On those x86 variants the barrier instructions do what
you would expect.
Alan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 15:53 ` Marcelo Tosatti
2004-08-06 16:52 ` Vladislav Bolkhovitin
@ 2004-08-09 11:58 ` Maciej W. Rozycki
1 sibling, 0 replies; 19+ messages in thread
From: Maciej W. Rozycki @ 2004-08-09 11:58 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Vladislav Bolkhovitin, linux-kernel
On Fri, 6 Aug 2004, Marcelo Tosatti wrote:
> Yes correct. *mb() usually imply barrier().
>
> About the flush, each architecture defines its own instruction for doing so,
> PowerPC has "sync" and "isync" instructions (to flush the whole cache and instruction
> cache respectively), MIPS has "sync" and so on..
JFTR, in the absence of an external write-back buffer (which some
processors have) the MIPS "sync" instruction has exactly the semantics of
mb(). There is no MIPS instruction to perform writeback, invalidation,
etc. operations on whole caches -- such operations can only be performed
in the kernel mode on a line-by-line basis and are model-specific, though
some standardisation exists.
Maciej
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-06 17:09 ` Marcelo Tosatti
2004-08-06 22:33 ` H. Peter Anvin
@ 2004-08-09 15:14 ` Vladislav Bolkhovitin
2004-08-09 15:50 ` Marcelo Tosatti
1 sibling, 1 reply; 19+ messages in thread
From: Vladislav Bolkhovitin @ 2004-08-09 15:14 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-kernel
Marcelo Tosatti wrote:
>>>Yes correct. *mb() usually imply barrier().
>>>
>>>About the flush, each architecture defines its own instruction for doing
>>>so,
>>>PowerPC has "sync" and "isync" instructions (to flush the whole cache
>>>and instruction cache respectively), MIPS has "sync" and so on..
>>
>>So, there is no platform independent way for doing that in the kernel?
>
>
> Not really. x86 doesnt have such an instruction.
But how then spin_lock() works? It guarantees memory sync between CPUs,
doesn't it? Otherwise how can it prevent possible races with concurrent
data modifications?
Vlad
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-09 15:14 ` Vladislav Bolkhovitin
@ 2004-08-09 15:50 ` Marcelo Tosatti
2004-08-09 17:35 ` Vladislav Bolkhovitin
0 siblings, 1 reply; 19+ messages in thread
From: Marcelo Tosatti @ 2004-08-09 15:50 UTC (permalink / raw)
To: Vladislav Bolkhovitin; +Cc: linux-kernel
On Mon, Aug 09, 2004 at 07:14:48PM +0400, Vladislav Bolkhovitin wrote:
> Marcelo Tosatti wrote:
> >>>Yes correct. *mb() usually imply barrier().
> >>>
> >>>About the flush, each architecture defines its own instruction for doing
> >>>so,
> >>>PowerPC has "sync" and "isync" instructions (to flush the whole cache
> >>>and instruction cache respectively), MIPS has "sync" and so on..
> >>
> >>So, there is no platform independent way for doing that in the kernel?
> >
> >
> >Not really. x86 doesnt have such an instruction.
>
> But how then spin_lock() works? It guarantees memory sync between CPUs,
> doesn't it? Otherwise how can it prevent possible races with concurrent
> data modifications?
It makes use of the "lock" instruction to lock the bus before the "dec" instruction:
#define spin_lock_string \
"\n1:\t" \
"lock ; decb %0\n\t" \
"js 2f\n" \
That way it prevent races wrt other CPUs. atomic accesses which need to modify
(atomic_inc, atomic_dec, etc) data also use the "lock" to prevent other CPUs
from reading the data.
grep for "lock" in include/asm-i386/.
As hpa said, most x86 instructions (except SSE-related ones) are
strictly ordered (except the cases Alan pointed, which were not known
to me).
Thats why there is no "sync"-like instruction on x86 (again, except SSE-related
ones).
This is just a simple and short explanation of how this works. It gets more complex
you think about cache coherency between processors, etc. For more details
the best book probably is "UNIX Systems for Modern Architectures.
Symmetric Multiprocessing and Caching for Kernel Programming" - Curt Schimmel (which
has been suggested here over and over).
I hope that helps.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-09 15:50 ` Marcelo Tosatti
@ 2004-08-09 17:35 ` Vladislav Bolkhovitin
[not found] ` <20040809183437.GD6361@logos.cnet>
0 siblings, 1 reply; 19+ messages in thread
From: Vladislav Bolkhovitin @ 2004-08-09 17:35 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-kernel
Marcelo Tosatti wrote:
>>>Not really. x86 doesnt have such an instruction.
>>
>>But how then spin_lock() works? It guarantees memory sync between CPUs,
>>doesn't it? Otherwise how can it prevent possible races with concurrent
>>data modifications?
>
> It makes use of the "lock" instruction to lock the bus before the "dec" instruction:
>
> #define spin_lock_string \
> "\n1:\t" \
> "lock ; decb %0\n\t" \
> "js 2f\n" \
>
> That way it prevent races wrt other CPUs. atomic accesses which need to modify
> (atomic_inc, atomic_dec, etc) data also use the "lock" to prevent other CPUs
> from reading the data.
>
> grep for "lock" in include/asm-i386/.
>
> As hpa said, most x86 instructions (except SSE-related ones) are
> strictly ordered (except the cases Alan pointed, which were not known
> to me).
>
> Thats why there is no "sync"-like instruction on x86 (again, except SSE-related
> ones).
>
> This is just a simple and short explanation of how this works. It gets more complex
> you think about cache coherency between processors, etc. For more details
> the best book probably is "UNIX Systems for Modern Architectures.
> Symmetric Multiprocessing and Caching for Kernel Programming" - Curt Schimmel (which
> has been suggested here over and over).
I know basically, how spinlocks work to provide ordering, my question
was about the cache coherency. Thanks for the link to the book, I will
try to find it.
Vlad
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
[not found] ` <20040809183437.GD6361@logos.cnet>
@ 2004-08-09 20:12 ` Vladislav Bolkhovitin
2004-08-09 21:21 ` Alan Cox
0 siblings, 1 reply; 19+ messages in thread
From: Vladislav Bolkhovitin @ 2004-08-09 20:12 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-kernel
Marcelo Tosatti wrote:
> Vladislav,
>
> There is no cache coherency issues on x86, it handles the cache coherency
> on hardware.
Well, Marcelo, sorry if I'm getting too annoying, but we had a race with
cache coherency during SCST (SCSI target mid-level) development. We
discovered that on P4 Xeon after atomic_set() there is very small
window, when atomic_read() on another CPUs returns the old value. We had
to rewrite the code without using atomic_set(). Isn't it cache coherency
issue?
And, BTW, returning to the original topic, would it be better to make
set_bit() and friends guarantee not to be reordered on all
architectures, instead of just add the comment. Otherwise, what is the
difference with versions with `__` prefix (__set_bit(), for example)?
Just adding the comments will lead to creating different functions with
gurantees by everyone who need it in all over the kernel. Is it the
right thing? In some places in SCST we heavy rely on non-ordering
guarantees.
Thanks,
Vlad
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-09 20:12 ` Vladislav Bolkhovitin
@ 2004-08-09 21:21 ` Alan Cox
2004-08-10 12:28 ` Vladislav Bolkhovitin
0 siblings, 1 reply; 19+ messages in thread
From: Alan Cox @ 2004-08-09 21:21 UTC (permalink / raw)
To: Vladislav Bolkhovitin; +Cc: Marcelo Tosatti, Linux Kernel Mailing List
On Llu, 2004-08-09 at 21:12, Vladislav Bolkhovitin wrote:
> Well, Marcelo, sorry if I'm getting too annoying, but we had a race with
> cache coherency during SCST (SCSI target mid-level) development. We
> discovered that on P4 Xeon after atomic_set() there is very small
> window, when atomic_read() on another CPUs returns the old value. We had
> to rewrite the code without using atomic_set(). Isn't it cache coherency
> issue?
atomic_set/atomic_read are _atomic_ operations. Nothing is said about
ordering. You get old or new but not half and half. Two atomic_inc's
will both occur and so on.
If you want ordering you need locks otherwise there is nothing defining
the time order of both processors.
How can you even measure such a window without locking to know what the
state of the processors is ?
> And, BTW, returning to the original topic, would it be better to make
> set_bit() and friends guarantee not to be reordered on all
> architectures, instead of just add the comment. Otherwise, what is the
x86 and some other platforms have certain ordering guarantees. set_bit
doesn't guarantee them but it happens to unavoidably work for most
(ab)uses.
> right thing? In some places in SCST we heavy rely on non-ordering
> guarantees.
Then you will get burned on most hardware.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
[not found] ` <2roWw-40d-17@gated-at.bofh.it>
@ 2004-08-09 21:23 ` Andi Kleen
2004-08-10 12:26 ` Vladislav Bolkhovitin
0 siblings, 1 reply; 19+ messages in thread
From: Andi Kleen @ 2004-08-09 21:23 UTC (permalink / raw)
To: Vladislav Bolkhovitin; +Cc: linux-kernel, marcelo.tosatti
Vladislav Bolkhovitin <vst@vlnb.net> writes:
> Marcelo Tosatti wrote:
>> Vladislav, There is no cache coherency issues on x86, it handles the
>> cache coherency
>> on hardware.
>
> Well, Marcelo, sorry if I'm getting too annoying, but we had a race
> with cache coherency during SCST (SCSI target mid-level)
> development. We discovered that on P4 Xeon after atomic_set() there is
> very small window, when atomic_read() on another CPUs returns the old
> value. We had to rewrite the code without using atomic_set(). Isn't it
> cache coherency issue?
Add an rmb() after the atomic_set. atomic_set doesn't have one by itself
(it is non locked on Linux/x86)
> And, BTW, returning to the original topic, would it be better to make
> set_bit() and friends guarantee not to be reordered on all
> architectures, instead of just add the comment. Otherwise, what is the
That makes them a *lot* slower on some systems. And most of the
set_bits in the kernel don't need strong ordering.
> difference with versions with `__` prefix (__set_bit(), for example)?
> Just adding the comments will lead to creating different functions
> with gurantees by everyone who need it in all over the kernel. Is it
> the right thing? In some places in SCST we heavy rely on non-ordering
> guarantees.
Better add lots of memory barriers then.
-Andi
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-09 21:23 ` Andi Kleen
@ 2004-08-10 12:26 ` Vladislav Bolkhovitin
0 siblings, 0 replies; 19+ messages in thread
From: Vladislav Bolkhovitin @ 2004-08-10 12:26 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, marcelo.tosatti
Andi Kleen wrote:
> That makes them a *lot* slower on some systems. And most of the
> set_bits in the kernel don't need strong ordering.
>
>
>>difference with versions with `__` prefix (__set_bit(), for example)?
>>Just adding the comments will lead to creating different functions
>>with gurantees by everyone who need it in all over the kernel. Is it
>>the right thing? In some places in SCST we heavy rely on non-ordering
>>guarantees.
>
>
> Better add lots of memory barriers then.
It looks like we have to do that now.
Thanks,
Vlad
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86 bitops.h commentary on instruction reordering
2004-08-09 21:21 ` Alan Cox
@ 2004-08-10 12:28 ` Vladislav Bolkhovitin
0 siblings, 0 replies; 19+ messages in thread
From: Vladislav Bolkhovitin @ 2004-08-10 12:28 UTC (permalink / raw)
To: Alan Cox; +Cc: Marcelo Tosatti, Linux Kernel Mailing List
Alan Cox wrote:
> On Llu, 2004-08-09 at 21:12, Vladislav Bolkhovitin wrote:
>
>>Well, Marcelo, sorry if I'm getting too annoying, but we had a race with
>>cache coherency during SCST (SCSI target mid-level) development. We
>>discovered that on P4 Xeon after atomic_set() there is very small
>>window, when atomic_read() on another CPUs returns the old value. We had
>>to rewrite the code without using atomic_set(). Isn't it cache coherency
>>issue?
>
> atomic_set/atomic_read are _atomic_ operations. Nothing is said about
> ordering. You get old or new but not half and half. Two atomic_inc's
> will both occur and so on.
>
> If you want ordering you need locks otherwise there is nothing defining
> the time order of both processors.
>
> How can you even measure such a window without locking to know what the
> state of the processors is ?
We didn't measure it, we just had lockup. In our code in the commands
serialization we have two paths: the regular (fast) one, where the
atomic value is equal to current value and there are no locks used,
because only one command can be on the fast path, and the slow one,
where the atomic and the current values are different, so lock used to
perform all necessary job. After the fast path the atomic value gets
incremented together with some checks on it. We used to use atomic_set()
here and sometimes had lockup.
>>And, BTW, returning to the original topic, would it be better to make
>>set_bit() and friends guarantee not to be reordered on all
>>architectures, instead of just add the comment. Otherwise, what is the
>
>
> x86 and some other platforms have certain ordering guarantees. set_bit
> doesn't guarantee them but it happens to unavoidably work for most
> (ab)uses.
So, seems there is no difference between operations with and without
"__" prefix (like set_bit() and __set_bit()) now? Maybe, it's worth to
leave only one version of them?
>>right thing? In some places in SCST we heavy rely on non-ordering
>>guarantees.
>
>
> Then you will get burned on most hardware.
Designing that we also be guided by the comments, which turned out to be
misleading :).
Actually, I feel it would be perfect if someone wrote a document, where
he described various issues with using atomic variables and bit
operations on various architectures, like some FAQ. For example, this
thread has at least several good questions and answers. Also, I can add
one more question: for an user it could be non-obvious why
smp_mb__after_atomic_*() and friends needed. This document wouldn't be
big, so it didn't take too much time to write it, but it would help a
*lot* for people who wish to write portable code.
Thanks,
Vlad
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2004-08-10 12:31 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-08-05 20:06 [PATCH] x86 bitops.h commentary on instruction reordering Marcelo Tosatti
2004-08-06 14:17 ` Vladislav Bolkhovitin
2004-08-06 14:33 ` Marcelo Tosatti
2004-08-06 15:36 ` Vladislav Bolkhovitin
2004-08-06 15:53 ` Marcelo Tosatti
2004-08-06 16:52 ` Vladislav Bolkhovitin
2004-08-06 17:09 ` Marcelo Tosatti
2004-08-06 22:33 ` H. Peter Anvin
2004-08-07 1:01 ` Zwane Mwaikambo
2004-08-07 22:16 ` Alan Cox
2004-08-09 15:14 ` Vladislav Bolkhovitin
2004-08-09 15:50 ` Marcelo Tosatti
2004-08-09 17:35 ` Vladislav Bolkhovitin
[not found] ` <20040809183437.GD6361@logos.cnet>
2004-08-09 20:12 ` Vladislav Bolkhovitin
2004-08-09 21:21 ` Alan Cox
2004-08-10 12:28 ` Vladislav Bolkhovitin
2004-08-09 11:58 ` Maciej W. Rozycki
[not found] <2pY7Y-W1-7@gated-at.bofh.it>
[not found] ` <2qdJL-3sh-27@gated-at.bofh.it>
[not found] ` <2qePx-4eM-21@gated-at.bofh.it>
[not found] ` <2qf8S-4pI-31@gated-at.bofh.it>
[not found] ` <2qg4V-54c-17@gated-at.bofh.it>
[not found] ` <2qgoh-5og-29@gated-at.bofh.it>
[not found] ` <2qhkn-649-41@gated-at.bofh.it>
[not found] ` <2rkgh-zP-45@gated-at.bofh.it>
[not found] ` <2rlFk-1wC-27@gated-at.bofh.it>
[not found] ` <2rmhX-20I-17@gated-at.bofh.it>
[not found] ` <2roWw-40d-19@gated-at.bofh.it>
[not found] ` <2roWw-40d-17@gated-at.bofh.it>
2004-08-09 21:23 ` Andi Kleen
2004-08-10 12:26 ` Vladislav Bolkhovitin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox