* [patch] linux 2.4.17: An mb() rework
@ 2002-01-30 14:56 Maciej W. Rozycki
2002-01-30 20:02 ` Jason Gunthorpe
0 siblings, 1 reply; 17+ messages in thread
From: Maciej W. Rozycki @ 2002-01-30 14:56 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, linux-mips
Hello,
The current implementation of mb() and friends is broken. It depends on
a model-specific behaviour of the R4000/R4400 CPU and does not guarantee
preserving the desired semantics over the whole MIPS family. What's
worse, recently I've identified a case it doesn't work at all on an
R4400SC CPU -- values written to an i/o memory resource were not committed
to the device even after executing over 40 subsequent instructions.
Here is a patch that implements mb() in an "architecturally defined" way.
Since "sync" acts as a reordering barrier and an uncached load cannot be
prefetched or postponed, this implementation assures a correct
serialization on every MIPS processor, regardless of its implementation
details.
For MIPS I CPUs, which do not support the "sync" instruction, a lone
uncached load is defined to provide a serialization itself, thus the
implementation is correct for these CPUs as well.
The patch looks more complicated than it should, but it's a part of a
wbflush() clean-up and a few of these helper submacros are really needed
(and others are defined for consistency).
Tested successfully on an R3400 and an R4400SC CPU on DECstation systems.
If anyone has any comments, then please speak up. Otherwise, Ralf,
please apply.
Maciej
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: macro@ds2.pg.gda.pl, PGP key available +
patch-mips-2.4.17-20020111-mb-4
diff -up --recursive --new-file linux-mips-2.4.17-20020111.macro/arch/mips/config.in linux-mips-2.4.17-20020111/arch/mips/config.in
--- linux-mips-2.4.17-20020111.macro/arch/mips/config.in Mon Jan 7 05:27:13 2002
+++ linux-mips-2.4.17-20020111/arch/mips/config.in Sat Jan 26 02:35:35 2002
@@ -363,6 +363,12 @@ else
fi
fi
fi
+if [ "$CONFIG_CPU_R3000" = "y" -o \
+ "$CONFIG_CPU_TX39XX" = "y" ]; then
+ define_bool CONFIG_CPU_HAS_SYNC n
+else
+ define_bool CONFIG_CPU_HAS_SYNC y
+fi
endmenu
mainmenu_option next_comment
diff -up --recursive --new-file linux-mips-2.4.17-20020111.macro/include/asm-mips/system.h linux-mips-2.4.17-20020111/include/asm-mips/system.h
--- linux-mips-2.4.17-20020111.macro/include/asm-mips/system.h Thu Dec 13 05:28:09 2001
+++ linux-mips-2.4.17-20020111/include/asm-mips/system.h Wed Jan 30 02:17:18 2002
@@ -18,9 +18,12 @@
#include <linux/config.h>
#include <asm/sgidefs.h>
-#include <asm/ptrace.h>
+
#include <linux/kernel.h>
+#include <asm/addrspace.h>
+#include <asm/ptrace.h>
+
__asm__ (
".macro\t__sti\n\t"
".set\tpush\n\t"
@@ -170,27 +173,57 @@ extern void __global_restore_flags(unsig
/*
* These are probably defined overly paranoid ...
*/
+#ifdef CONFIG_CPU_HAS_SYNC
+#define __sync() __asm__ __volatile__("sync" : : : "memory")
+#else
+#define __sync() do { } while(0)
+#endif
+
+#define __fast_wmb() do { } while(0)
+#define __fast_rmb() \
+ __asm__ __volatile__( \
+ ".set push\n\t" \
+ ".set noreorder\n\t" \
+ "lw $0,%0\n\t" \
+ "nop\n\t" \
+ ".set pop" \
+ : /* no output */ \
+ : "m" (*(int *)KSEG1) \
+ : "memory")
+#define __fast_mb() __fast_rmb()
+
+#define fast_wmb() \
+ do { \
+ __sync(); \
+ __fast_wmb(); \
+ } while (0)
+#define fast_rmb() \
+ do { \
+ __sync(); \
+ __fast_rmb(); \
+ } while (0)
+#define fast_mb() \
+ do { \
+ __sync(); \
+ __fast_mb(); \
+ } while (0)
+
#ifdef CONFIG_CPU_HAS_WB
#include <asm/wbflush.h>
-#define rmb() do { } while(0)
-#define wmb() wbflush()
-#define mb() wbflush()
+#define wmb() fast_wmb()
+#define rmb() \
+ do { \
+ __sync(); \
+ wbflush(); \
+ } while (0)
+#define mb() rmb()
#else /* CONFIG_CPU_HAS_WB */
-#define mb() \
-__asm__ __volatile__( \
- "# prevent instructions being moved around\n\t" \
- ".set\tnoreorder\n\t" \
- "# 8 nops to fool the R4400 pipeline\n\t" \
- "nop;nop;nop;nop;nop;nop;nop;nop\n\t" \
- ".set\treorder" \
- : /* no output */ \
- : /* no input */ \
- : "memory")
-#define rmb() mb()
-#define wmb() mb()
+#define wmb() fast_wmb()
+#define rmb() fast_rmb()
+#define mb() fast_mb()
#endif /* CONFIG_CPU_HAS_WB */
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [patch] linux 2.4.17: An mb() rework 2002-01-30 14:56 [patch] linux 2.4.17: An mb() rework Maciej W. Rozycki @ 2002-01-30 20:02 ` Jason Gunthorpe 2002-01-31 12:17 ` Maciej W. Rozycki 0 siblings, 1 reply; 17+ messages in thread From: Jason Gunthorpe @ 2002-01-30 20:02 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: linux-mips, linux-mips On Wed, 30 Jan 2002, Maciej W. Rozycki wrote: > worse, recently I've identified a case it doesn't work at all on an > R4400SC CPU -- values written to an i/o memory resource were not committed > to the device even after executing over 40 subsequent instructions. Nice patch! I was under the impression that the mb() functions existed to maintain order of memory access and were not defined as flushes per say - so is the time delay a concern (perhaps sticking a sync before ERET is appropriate?) I spent some time talking with the Sandcraft people about memory barrier issues, and it turns out that at least on the SR71000 (and in most cases the RM7K) the order of SysAD transactions will always match the order of the instruction stream, but all writes are posted and all reads are split - that is the CPU can execute two back to back uncached loads and several back to back uncached stores without stalling the pipeline, or getting the IO's out of order. Adding sync's and uncached loads only slows things down for these chips. I understand this is because the CPUs have a single load/store unit and do not do out of order execution. Many older/embedded MIPS designs probably have a similar configuration, they could likely also run with out the syncs. So - could you add something like CONFIG_IN_ORDER_IO which would nullify the syncs for these processors? BTW, does anyone know what CONFIG_WB is ment to mean? The CPU has a write buffer that does not preserve order? Thanks, Jason ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-01-30 20:02 ` Jason Gunthorpe @ 2002-01-31 12:17 ` Maciej W. Rozycki 2002-01-31 20:35 ` Jason Gunthorpe 2002-02-04 17:02 ` Ralf Baechle 0 siblings, 2 replies; 17+ messages in thread From: Maciej W. Rozycki @ 2002-01-31 12:17 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: linux-mips, linux-mips On Wed, 30 Jan 2002, Jason Gunthorpe wrote: > I was under the impression that the mb() functions existed to maintain > order of memory access and were not defined as flushes per say - so is > the time delay a concern (perhaps sticking a sync before ERET is > appropriate?) Well, other architectures seem to define the macros such that wmb() is a pure ordering barrier (to assure strong ordering between writes) and rmb() is a flush (since reads are synchronous by their nature this implies all other transactions have to be finished before). Putting a "sync" before "eret" certainly doesn't work. The case I've identified is mask_and_ack() in an interrupt handler. It masks an active IRQ line in an external controller so that handle_IRQ_event() can unmask interrupts in the CPU (this implies mask_and_ack() is synchronous). But due to the current lack of proper synchronization, the write isn't executed until after the __sti() there. As a result for almost every IRQ routed through the external controler there is a spurious one signalled shortly afterwards. There is still a long path from here to an "eret". > I spent some time talking with the Sandcraft people about memory barrier > issues, and it turns out that at least on the SR71000 (and in most cases > the RM7K) the order of SysAD transactions will always match the order of > the instruction stream, but all writes are posted and all reads are split > - that is the CPU can execute two back to back uncached loads and several > back to back uncached stores without stalling the pipeline, or getting the > IO's out of order. Adding sync's and uncached loads only slows things down > for these chips. Ok, what about assuring a value written to an I/O memory resource has actually been commited? That's what rmb() is for. Since the CPU is strongly-orderes wmb() (a "sync") should be indistinguishable from a "nop". > I understand this is because the CPUs have a single load/store unit and do > not do out of order execution. Many older/embedded MIPS designs probably > have a similar configuration, they could likely also run with out the > syncs. Putting a wmb() or not should be the matter of requirements of specific drivers. Wasting a "nop" (effectively) is surely a justifiable price for system's consistency. > So - could you add something like CONFIG_IN_ORDER_IO which would nullify > the syncs for these processors? Hmm, the option seems to exist already, namely CONFIG_NONCOHERENT_IO, but is it really worth making the third variation to save a single "nop", especially as barriers are relatively rare? I'll do it, if it is. > BTW, does anyone know what CONFIG_WB is ment to mean? The CPU has a write > buffer that does not preserve order? Certain DECstation models have a write-back buffer that needs to be handled explicitly. For example rmb() is "1: bc0f 1b" for the R3220 WB chip. Wmb() is null, certainly, as the buffer is strongly-ordered. See arch/mips/dec/wbflush.c for details. Maciej -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-01-31 12:17 ` Maciej W. Rozycki @ 2002-01-31 20:35 ` Jason Gunthorpe 2002-01-31 21:50 ` Maciej W. Rozycki 2002-01-31 23:38 ` Alan Cox 2002-02-04 17:02 ` Ralf Baechle 1 sibling, 2 replies; 17+ messages in thread From: Jason Gunthorpe @ 2002-01-31 20:35 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: linux-mips, linux-mips On Thu, 31 Jan 2002, Maciej W. Rozycki wrote: > Well, other architectures seem to define the macros such that wmb() is a > pure ordering barrier (to assure strong ordering between writes) and rmb() > is a flush (since reads are synchronous by their nature this implies all > other transactions have to be finished before). Yes, I've noticed the wmb bit, not sure about rmb - it is hard to tell without knowing the gritty details of the unusual asm ops used :< sparc64 for instance seems to define them purely as ordering barriers. A little more qualification would be that: write(...,device); // Disable int wmb() enable_ints(); Is expected to have a potential spurious interrupt. But, perhaps this is OK: outl(...,device); wmb(); enable_ints(); This is consistant with how the PCI spec discusses ordering/etc and barriers are frequently used in the PCI drivers. Looking at the i386 code this is what I would expect to see. Anyone know for sure? > Putting a "sync" before "eret" certainly doesn't work. The case I've > identified is mask_and_ack() in an interrupt handler. It masks an active > IRQ line in an external controller so that handle_IRQ_event() can unmask Yeah, Ok, this is fairly normal. You have the unique case were using a sync will solve your problem. Sync only fixes this if the target address is in the CPU's system controller, it does not work for, say, PCI devices. IMHO it is better to manage this sort of flushing explicitly in all drivers, PCI drivers already do a read from device, drivers for system controller devices could do that, or use a special sync macro.. That way it is visible in the code and not relying on an arch/cpu-specific wmb() side effect. If you are using a UP mips that has strongly ordered blocking reads and writes I'd think that rmb/wmb should only be asm("":::"memory"). If your driver needs to do syncs for non-ordering reasons it should be doing syncs. > Ok, what about assuring a value written to an I/O memory resource has > actually been commited? That's what rmb() is for. Since the CPU is > strongly-orderes wmb() (a "sync") should be indistinguishable from a > "nop". No, a sync will still empty the write buffer. It may halt the pipeline for many (~80 perhaps) cycles while posted write data is dumped to the system controller. Regrettably, the SysAD bus that alot of mips chips use does not allow a non-posted write, so in that case 'sync' is not going to commit an I/O write as you would expect, it just moves it into a write buffer on the system controller or on a PCI bridge. (This is not what PCI defines for IO accesses, I don't know of any really elegent way to fix it though..) I don't think rmb is generaly defined to flush IO writes.. It isn't used very often at all in the drives/net and drivers/scsi stuff.. > Hmm, the option seems to exist already, namely CONFIG_NONCOHERENT_IO, but > is it really worth making the third variation to save a single "nop", > especially as barriers are relatively rare? I'll do it, if it is. Er, no, CONFIG_NONCOHERENT_IO is for caches that are not coherent with the IO subsystem. Barriers seem to be pretty common in the ISR's for PCI drivers, at least the ones I've looked at.. While we are on this topic, does anyone know what the correct linux way to turn off an interrupt at a PCI device is? I've seen this in a few of the drivers: writeb(0,base+INT_ENABLE); readb(base+INT_ENABLE); // Flush the write through the PCI bus wmb(); ints_on(); Or should it be: writeb(0,base+INT_ENABLE); wmb(); readb(base+INT_ENABLE); // Flush the write through the PCI bus rmb(); ints_on(); Which makes more sense.. If the latter is correct then your patch needs to define rmb as: 'lw $1,KSEG' 'addiu $0, $1, 0x0000' Otherwise MIPS chips that do not support blocking uncached memory reads will not work right. (SR71000 is such a chip) Thanks, Jason ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-01-31 20:35 ` Jason Gunthorpe @ 2002-01-31 21:50 ` Maciej W. Rozycki 2002-02-01 6:44 ` Jason Gunthorpe 2002-01-31 23:38 ` Alan Cox 1 sibling, 1 reply; 17+ messages in thread From: Maciej W. Rozycki @ 2002-01-31 21:50 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: linux-mips, linux-mips On Thu, 31 Jan 2002, Jason Gunthorpe wrote: > > Well, other architectures seem to define the macros such that wmb() is a > > pure ordering barrier (to assure strong ordering between writes) and rmb() > > is a flush (since reads are synchronous by their nature this implies all > > other transactions have to be finished before). > > Yes, I've noticed the wmb bit, not sure about rmb - it is hard to tell > without knowing the gritty details of the unusual asm ops used :< sparc64 > for instance seems to define them purely as ordering barriers. Weird. > A little more qualification would be that: > write(...,device); // Disable int > wmb() > enable_ints(); > Is expected to have a potential spurious interrupt. But, perhaps this is Of course -- this only assures any write within "enable_ints()" won't happen before "write(...,device)". Actual writes may happen much later, e.g. when conditions on the bus allow. > OK: > outl(...,device); > wmb(); > enable_ints(); > This is consistant with how the PCI spec discusses ordering/etc and > barriers are frequently used in the PCI drivers. Looking at the i386 code > this is what I would expect to see. Hmm, the DECstation is not a PCI machine. And the external interrupt controller can be treated as a part of the chipset -- writes to it doesn't go through any kind of an external bus. It's possible that the chipset acts as a write-back buffer extension for the CPU's internal buffer. > > Putting a "sync" before "eret" certainly doesn't work. The case I've > > identified is mask_and_ack() in an interrupt handler. It masks an active > > IRQ line in an external controller so that handle_IRQ_event() can unmask > > Yeah, Ok, this is fairly normal. You have the unique case were using a > sync will solve your problem. Sync only fixes this if the target address > is in the CPU's system controller, it does not work for, say, PCI devices. A "sync" is required since otherwise a write can bypass a read. For CPUs for which it doesn't happen, a "sync" is effectively a "nop", so it doesn't really matter. Note that I don't need a "sync" specifically (the code works for me either way), since for the R4400 a write cannot bypass a read, but it's still needed for correctness. > IMHO it is better to manage this sort of flushing explicitly in all > drivers, PCI drivers already do a read from device, drivers for system > controller devices could do that, or use a special sync macro.. That way > it is visible in the code and not relying on an arch/cpu-specific wmb() > side effect. Hmm, wmb() is pretty clearly defined. The current implementation does not enforce strict ordering and is thus incorrect. Note that the R4400 manual explicitly states a cached write can bypass an uncached one, hence the CPU may exploit weak ordering under certain circumstances. The "sync" instruction was specifically defined to avoid such a risk. > If you are using a UP mips that has strongly ordered blocking reads and > writes I'd think that rmb/wmb should only be asm("":::"memory"). If your > driver needs to do syncs for non-ordering reasons it should be doing > syncs. Nope, it's not always strongly ordered -- see the R4400 manual. Also note that rmb() is defined as asm("lock; addl $0,0(%%esp)":::"memory") for the i386, to make sure all transactions are completed even in the UP mode, even though the i386 is strongly ordered. > No, a sync will still empty the write buffer. It may halt the pipeline for > many (~80 perhaps) cycles while posted write data is dumped to the system > controller. Then it's an implementation bug. For a CPU in the UP mode "sync" is only defined to prevent write and read reordering -- there is nothing about flushing. > Regrettably, the SysAD bus that alot of mips chips use does not allow a > non-posted write, so in that case 'sync' is not going to commit an I/O > write as you would expect, it just moves it into a write buffer on the > system controller or on a PCI bridge. (This is not what PCI defines for IO > accesses, I don't know of any really elegent way to fix it though..) That's a correct implementation -- see the "sync" definition. > I don't think rmb is generaly defined to flush IO writes.. It isn't used > very often at all in the drives/net and drivers/scsi stuff.. If a driver doesn't need it then it doesn't use it. It may be buggy as well, e.g. due to being i386-centric. > Er, no, CONFIG_NONCOHERENT_IO is for caches that are not coherent with the > IO subsystem. I haven't investigated the option much, I admit. > While we are on this topic, does anyone know what the correct linux way > to turn off an interrupt at a PCI device is? I've seen this in a few > of the drivers: > > writeb(0,base+INT_ENABLE); > readb(base+INT_ENABLE); // Flush the write through the PCI bus > wmb(); > ints_on(); > > Or should it be: > > writeb(0,base+INT_ENABLE); > wmb(); > readb(base+INT_ENABLE); // Flush the write through the PCI bus > rmb(); > ints_on(); > > Which makes more sense.. I would do it as: writeb(0,base+INT_ENABLE); mb(); readb(base+INT_ENABLE); // Flush the write through the PCI bus rmb(); ints_on(); You need an "mb()", since you are changing the access type, so you need to synchronize both kinds. > If the latter is correct then your patch needs to define rmb as: > 'lw $1,KSEG' > 'addiu $0, $1, 0x0000' > Otherwise MIPS chips that do not support blocking uncached memory reads > will not work right. (SR71000 is such a chip) I don't understand what the purpose of the above code is, except that it wastes a cycle. Please elaborate. Maciej -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-01-31 21:50 ` Maciej W. Rozycki @ 2002-02-01 6:44 ` Jason Gunthorpe 2002-02-01 8:55 ` Dominic Sweetman ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: Jason Gunthorpe @ 2002-02-01 6:44 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: linux-mips, linux-mips On Thu, 31 Jan 2002, Maciej W. Rozycki wrote: > Hmm, wmb() is pretty clearly defined. The current implementation does > not enforce strict ordering and is thus incorrect. Note that the R4400 > manual explicitly states a cached write can bypass an uncached one, hence > the CPU may exploit weak ordering under certain circumstances. The "sync" > instruction was specifically defined to avoid such a risk. Yes, you are correct. I suppose *mb() must also work correctly with the cache flush macros - didn't think about that one! I'm afraid I don't have any manuals for any of the MIPS chips just 3rd party ones SB1, RM7K and SR71000 - which is why I have some many odd questions. :> > > No, a sync will still empty the write buffer. It may halt the pipeline for > > many (~80 perhaps) cycles while posted write data is dumped to the system > > controller. > > Then it's an implementation bug. For a CPU in the UP mode "sync" is only > defined to prevent write and read reordering -- there is nothing about > flushing. Did some more research + phoning.. RM7K is definately documented to dump the write buffer on 'sync'. The RM7K guide even has an example (7.8.5) where it implies that sync also forces a write back of any dirty cache lines - gah! (Hard to belive though..) Sandcraft tells me that sync only waits for outstanding reads on their chips - so it is very cheap. Writebacks from cached and uncached writes are never put out of program order. Sorry my viewpoint is skewed by this small sampling of processors :< > You need an "mb()", since you are changing the access type, so you need to > synchronize both kinds. OK. > I don't understand what the purpose of the above code is, except that it > wastes a cycle. Please elaborate. There was a miscommunication on my part, please ignore it. I hope Ralf accepts your patch, I think it will be good to have. Jason ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-02-01 6:44 ` Jason Gunthorpe @ 2002-02-01 8:55 ` Dominic Sweetman 2002-02-01 12:04 ` Maciej W. Rozycki 2002-02-01 9:52 ` Hartvig Ekner 2002-02-01 12:01 ` Maciej W. Rozycki 2 siblings, 1 reply; 17+ messages in thread From: Dominic Sweetman @ 2002-02-01 8:55 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: Maciej W. Rozycki, linux-mips, linux-mips > > > No, a sync will still empty the write buffer. It may halt the > > >pipeline for many (~80 perhaps) cycles while posted write data is > > >dumped to the system controller. > > > > Then it's an implementation bug. For a CPU in the UP mode "sync" > > is only defined to prevent write and read reordering -- there is > > nothing about flushing. > > Did some more research + phoning.. RM7K is definately documented to dump > the write buffer on 'sync'. The RM7000 sync does more than is required by the ISA. However, since the write-buffer is not part of the architecture, this is not a bug. Though it might well be held to be undesirable. And there has to be *some* way to force the write buffer to empty, and this is cheaper than the standard alternative of an uncached read. I believe the RM7000 write buffer can hold one pending cache-line writeback or up to four uncached writes. Emptying the write buffer can only occupy more than a handful of cycles when the system controller is already backed-up with pending writes; under those circumstances the system was probably about to stall anyway, so I wouldn't be too worried about the performance implications of a 'sync'. > The RM7K guide even has an example (7.8.5) > where it implies that sync also forces a write back of any dirty cache > lines - gah! (Hard to belive though..) There's not a word of truth in that (there just aren't the right wires in the hardware to make that possible). But I think what it means is that it stops the CPU until a pending cache line write back is complete. It's hard to see why that would be useful, but since it uses the same write buffer hardware it's easy to see why it would happen. Of course (as everyone has been piling in and pointing out) the real problems are: 1. The MIPS people who invented 'sync' had a much more sophisticated understanding of read/write behaviour than is common in those who support, document and program uniprocessor "embedded" CPUs; so there's lots of scope for misunderstanding 'sync'. 2. When write-posting bites you, it's usually not in the CPU but somewhere further down the system. You should never believe a write has actually arrived at the device you sent it to, until you see the device itself change state... And a plea: the word "flush" in Linux is already abused for caches (where it means invalidate, write back, or possibly both). That's enough trouble: if you also "flush" write buffers, your confusion will be complete. Maybe we should leave "flush" to plumbers and use more precise terminology for computers... -- Dominic Sweetman, Algorithmics Ltd The Fruit Farm, Ely Road, Chittering, CAMBS CB5 9PH, ENGLAND phone: +44 1223 706200 / fax: +44 1223 706250 / direct: +44 1223 706205 http://www.algor.co.uk ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-02-01 8:55 ` Dominic Sweetman @ 2002-02-01 12:04 ` Maciej W. Rozycki 0 siblings, 0 replies; 17+ messages in thread From: Maciej W. Rozycki @ 2002-02-01 12:04 UTC (permalink / raw) To: Dominic Sweetman; +Cc: Jason Gunthorpe, linux-mips, linux-mips On Fri, 1 Feb 2002, Dominic Sweetman wrote: > And a plea: the word "flush" in Linux is already abused for caches > (where it means invalidate, write back, or possibly both). That's > enough trouble: if you also "flush" write buffers, your confusion will > be complete. Maybe we should leave "flush" to plumbers and use more > precise terminology for computers... Too bad the original name for the function is wbflush() already... At least IDT uses it in their docs. -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-02-01 6:44 ` Jason Gunthorpe 2002-02-01 8:55 ` Dominic Sweetman @ 2002-02-01 9:52 ` Hartvig Ekner 2002-02-01 9:52 ` Hartvig Ekner 2002-02-01 12:01 ` Maciej W. Rozycki 2 siblings, 1 reply; 17+ messages in thread From: Hartvig Ekner @ 2002-02-01 9:52 UTC (permalink / raw) To: jgg; +Cc: Maciej W. Rozycki, linux-mips, linux-mips Jason Gunthorpe writes: > > > On Thu, 31 Jan 2002, Maciej W. Rozycki wrote: > > > Hmm, wmb() is pretty clearly defined. The current implementation does > > not enforce strict ordering and is thus incorrect. Note that the R4400 > > manual explicitly states a cached write can bypass an uncached one, hence > > the CPU may exploit weak ordering under certain circumstances. The "sync" > > instruction was specifically defined to avoid such a risk. > > Yes, you are correct. I suppose *mb() must also work correctly with the > cache flush macros - didn't think about that one! > > I'm afraid I don't have any manuals for any of the MIPS chips just 3rd > party ones SB1, RM7K and SR71000 - which is why I have some many > odd questions. :> > > > > No, a sync will still empty the write buffer. It may halt the pipeline for > > > many (~80 perhaps) cycles while posted write data is dumped to the system > > > controller. > > > > Then it's an implementation bug. For a CPU in the UP mode "sync" is only > > defined to prevent write and read reordering -- there is nothing about > > flushing. Not quite. Extract from the MIPS32 Architecture Reference Manual (available on www.mips.com, documentation section): SYNC Purpose: To order loads and stores. Description: * SYNC affects only uncached and cached coherent loads and stores. The loads and stores that occur before the SYNC must be completed before the loads and stores after the SYNC are allowed to start. * Loads are completed when the destination register is written. Stores are completed when the stored value is visible to every other processor in the system. * SYNC is required, potentially in conjunction with SSNOP, to guarantee that memory reference results are visible across operating mode changes. For example, a SYNC is required on some implementations on entry to and exit from Debug Mode to guarantee that memory affects are handled correctly. Detailed Description: * When the stype field has a value of zero, every synchronizable load and store that occurs in the instruction stream before the SYNC instruction must be globally performed before any synchronizable load or store that occurs after the SYNC can be performed, with respect to any other processor or coherent I/O module. * SYNC does not guarantee the order in which instruction fetches are performed. The stype values 1-31 are reserved; they produce the same result as the value zero. ----------- end of extract The interesting one is the second-last bullet, and where the synchronization point lies. All the cores from MIPS technologies provide the means through lines and/or signalling on the bus interface to move the synchronization point outside the processor. That is, without doing anything additional, the processor itself will empty all writebuffers and present the writes on the bus interface before any bus transaction of an instruction following the sync will appear on the bus interface. Through the use of the additional signalling mentioned above, the system logic can choose to delay even more, thus moving the synchronization point further out. /Hartvig -- _ _ _____ ____ Hartvig Ekner Mailto:hartvige@mips.com |\ /| | |____)(____ Direct: +45 4486 5503 | \/ | | | _____) MIPS Denmark Switch: +45 4486 5555 T E C H N O L O G I E S http://www.mips.com Fax...: +45 4486 5556 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-02-01 9:52 ` Hartvig Ekner @ 2002-02-01 9:52 ` Hartvig Ekner 0 siblings, 0 replies; 17+ messages in thread From: Hartvig Ekner @ 2002-02-01 9:52 UTC (permalink / raw) To: jgg; +Cc: Maciej W. Rozycki, linux-mips, linux-mips Jason Gunthorpe writes: > > > On Thu, 31 Jan 2002, Maciej W. Rozycki wrote: > > > Hmm, wmb() is pretty clearly defined. The current implementation does > > not enforce strict ordering and is thus incorrect. Note that the R4400 > > manual explicitly states a cached write can bypass an uncached one, hence > > the CPU may exploit weak ordering under certain circumstances. The "sync" > > instruction was specifically defined to avoid such a risk. > > Yes, you are correct. I suppose *mb() must also work correctly with the > cache flush macros - didn't think about that one! > > I'm afraid I don't have any manuals for any of the MIPS chips just 3rd > party ones SB1, RM7K and SR71000 - which is why I have some many > odd questions. :> > > > > No, a sync will still empty the write buffer. It may halt the pipeline for > > > many (~80 perhaps) cycles while posted write data is dumped to the system > > > controller. > > > > Then it's an implementation bug. For a CPU in the UP mode "sync" is only > > defined to prevent write and read reordering -- there is nothing about > > flushing. Not quite. Extract from the MIPS32 Architecture Reference Manual (available on www.mips.com, documentation section): SYNC Purpose: To order loads and stores. Description: * SYNC affects only uncached and cached coherent loads and stores. The loads and stores that occur before the SYNC must be completed before the loads and stores after the SYNC are allowed to start. * Loads are completed when the destination register is written. Stores are completed when the stored value is visible to every other processor in the system. * SYNC is required, potentially in conjunction with SSNOP, to guarantee that memory reference results are visible across operating mode changes. For example, a SYNC is required on some implementations on entry to and exit from Debug Mode to guarantee that memory affects are handled correctly. Detailed Description: * When the stype field has a value of zero, every synchronizable load and store that occurs in the instruction stream before the SYNC instruction must be globally performed before any synchronizable load or store that occurs after the SYNC can be performed, with respect to any other processor or coherent I/O module. * SYNC does not guarantee the order in which instruction fetches are performed. The stype values 1-31 are reserved; they produce the same result as the value zero. ----------- end of extract The interesting one is the second-last bullet, and where the synchronization point lies. All the cores from MIPS technologies provide the means through lines and/or signalling on the bus interface to move the synchronization point outside the processor. That is, without doing anything additional, the processor itself will empty all writebuffers and present the writes on the bus interface before any bus transaction of an instruction following the sync will appear on the bus interface. Through the use of the additional signalling mentioned above, the system logic can choose to delay even more, thus moving the synchronization point further out. /Hartvig -- _ _ _____ ____ Hartvig Ekner Mailto:hartvige@mips.com |\ /| | |____)(____ Direct: +45 4486 5503 | \/ | | | _____) MIPS Denmark Switch: +45 4486 5555 T E C H N O L O G I E S http://www.mips.com Fax...: +45 4486 5556 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-02-01 6:44 ` Jason Gunthorpe 2002-02-01 8:55 ` Dominic Sweetman 2002-02-01 9:52 ` Hartvig Ekner @ 2002-02-01 12:01 ` Maciej W. Rozycki 2 siblings, 0 replies; 17+ messages in thread From: Maciej W. Rozycki @ 2002-02-01 12:01 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: linux-mips, linux-mips On Thu, 31 Jan 2002, Jason Gunthorpe wrote: > I'm afraid I don't have any manuals for any of the MIPS chips just 3rd > party ones SB1, RM7K and SR71000 - which is why I have some many > odd questions. :> Please feel free to grab a few from 'http://www.mips.com/Documentation/' and 'http://www.mips.com/declassified/'. ;-) I have a few excellent IDT manuals on their original CD-ROM as well. If they can't be downloaded anymore, I may put them online at my site. > Did some more research + phoning.. RM7K is definately documented to dump > the write buffer on 'sync'. The RM7K guide even has an example (7.8.5) > where it implies that sync also forces a write back of any dirty cache > lines - gah! (Hard to belive though..) If RM7K is always strongly ordered, I may code a workaround. -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-01-31 20:35 ` Jason Gunthorpe 2002-01-31 21:50 ` Maciej W. Rozycki @ 2002-01-31 23:38 ` Alan Cox 2002-01-31 23:38 ` Alan Cox 2002-02-01 12:27 ` Maciej W. Rozycki 1 sibling, 2 replies; 17+ messages in thread From: Alan Cox @ 2002-01-31 23:38 UTC (permalink / raw) To: jgg; +Cc: Maciej W. Rozycki, linux-mips, linux-mips > A little more qualification would be that: > write(...,device); // Disable int > wmb() > enable_ints(); > Is expected to have a potential spurious interrupt. But, perhaps this is > OK: > outl(...,device); > wmb(); > enable_ints(); > This is consistant with how the PCI spec discusses ordering/etc and > barriers are frequently used in the PCI drivers. Looking at the i386 code > this is what I would expect to see. > > Anyone know for sure? The x86 behaviour forced as I understand it is barrier() - compiler level store barrier rmb() - read barrier to bus/DMA level [no operation] wmb() - write barrier to bus/DMA level [synchronizing instruction sequence of locked add of 0 to stack top] (mb and wmb as names come from Alpha so I guess its definitive 8)) It does not enforce PCI posting. Also your spurious interrupt case is wrong for other horrible reasons. Interrupt delivery must never be assumed to be synchronous in a portable driver. (In fact you'll see async irq delivery on an X86) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-01-31 23:38 ` Alan Cox @ 2002-01-31 23:38 ` Alan Cox 2002-02-01 12:27 ` Maciej W. Rozycki 1 sibling, 0 replies; 17+ messages in thread From: Alan Cox @ 2002-01-31 23:38 UTC (permalink / raw) To: jgg; +Cc: Maciej W. Rozycki, linux-mips, linux-mips > A little more qualification would be that: > write(...,device); // Disable int > wmb() > enable_ints(); > Is expected to have a potential spurious interrupt. But, perhaps this is > OK: > outl(...,device); > wmb(); > enable_ints(); > This is consistant with how the PCI spec discusses ordering/etc and > barriers are frequently used in the PCI drivers. Looking at the i386 code > this is what I would expect to see. > > Anyone know for sure? The x86 behaviour forced as I understand it is barrier() - compiler level store barrier rmb() - read barrier to bus/DMA level [no operation] wmb() - write barrier to bus/DMA level [synchronizing instruction sequence of locked add of 0 to stack top] (mb and wmb as names come from Alpha so I guess its definitive 8)) It does not enforce PCI posting. Also your spurious interrupt case is wrong for other horrible reasons. Interrupt delivery must never be assumed to be synchronous in a portable driver. (In fact you'll see async irq delivery on an X86) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-01-31 23:38 ` Alan Cox 2002-01-31 23:38 ` Alan Cox @ 2002-02-01 12:27 ` Maciej W. Rozycki 2002-02-02 12:09 ` Maciej W. Rozycki 1 sibling, 1 reply; 17+ messages in thread From: Maciej W. Rozycki @ 2002-02-01 12:27 UTC (permalink / raw) To: Alan Cox; +Cc: jgg, linux-mips, linux-mips On Thu, 31 Jan 2002, Alan Cox wrote: > The x86 behaviour forced as I understand it is > > barrier() - compiler level store barrier > rmb() - read barrier to bus/DMA level > [no operation] > wmb() - write barrier to bus/DMA level > [synchronizing instruction sequence > of locked add of 0 to stack top] > > (mb and wmb as names come from Alpha so I guess its definitive 8)) Well, after looking at the Alpha Architecture Handbook I see "mb" and "wmb" are pure ordering barriers -- any transactions at the CPU bus (pins) may still be deferred or prefetched (architecturally -- can't comment on specific chips). So after all, maybe all the macros should be purely "sync" for MIPS ("" for MIPS I, and mb() equal to wbflush() for R3220 and similar setups) and anything that wants to see all writes actually committed should use wbflush(), which would be defined as "mb(); uncached_read();" (or in a system-specific way, for R3220, etc.)? The i386 implementation seems stronger than it should be, but that's probably because of the limited choice available. Any thoughts? > It does not enforce PCI posting. Also your spurious interrupt case is > wrong for other horrible reasons. Interrupt delivery must never be > assumed to be synchronous in a portable driver. (In fact you'll see async > irq delivery on an X86) For interrupts arriving to an interrupt controller -- agreed. But we don't generally expect a spurious interrupt from a line that was already masked at the controller level. In other words mask_and_ack() must undertake any means possible, to assure the addressed controller received the new mask. If an interrupt passes by ocassionally anyway, then it's not fatal, i.e. we can handle it, but it shouldn't be a rule (i.e. receiving as many spurious interrupts as real ones). Am I right? Maciej -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-02-01 12:27 ` Maciej W. Rozycki @ 2002-02-02 12:09 ` Maciej W. Rozycki 0 siblings, 0 replies; 17+ messages in thread From: Maciej W. Rozycki @ 2002-02-02 12:09 UTC (permalink / raw) To: linux-mips, linux-mips; +Cc: jgg On Fri, 1 Feb 2002, Maciej W. Rozycki wrote: > Well, after looking at the Alpha Architecture Handbook I see "mb" and > "wmb" are pure ordering barriers -- any transactions at the CPU bus (pins) > may still be deferred or prefetched (architecturally -- can't comment on > specific chips). So after all, maybe all the macros should be purely > "sync" for MIPS ("" for MIPS I, and mb() equal to wbflush() for R3220 and > similar setups) and anything that wants to see all writes actually > committed should use wbflush(), which would be defined as "mb(); > uncached_read();" (or in a system-specific way, for R3220, etc.)? > > The i386 implementation seems stronger than it should be, but that's > probably because of the limited choice available. > > Any thoughts? Well, I've seen no thoughts so far. :-( Anyway here is a new patch, which takes my considerations about a synchronization vs a write commitment into account. This implementation only assures the absolute minimum specified in the operations. I am also tempted to add "#define iob() wbflush()" to system.h and attempt to define the macro for other architectures as well. The intended semantics is to assure all preceding transactions arrived at a system bus and none of following ones were started yet. The "system bus" here is defined: "the bus that connects the bus controller of a CPU (whether internal to the chip or not) to the rest of a system." -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + patch-mips-2.4.17-20020111-mb-wb-1 diff -up --recursive --new-file linux-mips-2.4.17-20020111.macro/arch/mips/config.in linux-mips-2.4.17-20020111/arch/mips/config.in --- linux-mips-2.4.17-20020111.macro/arch/mips/config.in Mon Jan 7 05:27:13 2002 +++ linux-mips-2.4.17-20020111/arch/mips/config.in Sat Jan 26 02:35:35 2002 @@ -363,6 +363,12 @@ else fi fi fi +if [ "$CONFIG_CPU_R3000" = "y" -o \ + "$CONFIG_CPU_TX39XX" = "y" ]; then + define_bool CONFIG_CPU_HAS_SYNC n +else + define_bool CONFIG_CPU_HAS_SYNC y +fi endmenu mainmenu_option next_comment diff -up --recursive --new-file linux-mips-2.4.17-20020111.macro/include/asm-mips/system.h linux-mips-2.4.17-20020111/include/asm-mips/system.h --- linux-mips-2.4.17-20020111.macro/include/asm-mips/system.h Thu Dec 13 05:28:09 2001 +++ linux-mips-2.4.17-20020111/include/asm-mips/system.h Sat Feb 2 00:05:19 2002 @@ -167,32 +167,31 @@ extern void __global_restore_flags(unsig #define local_irq_disable() __cli(); #define local_irq_enable() __sti(); -/* - * These are probably defined overly paranoid ... - */ +#ifdef CONFIG_CPU_HAS_SYNC +#define __sync() __asm__ __volatile__("sync" : : : "memory") +#else +#define __sync() do { } while(0) +#endif + +#define fast_wmb() __sync() +#define fast_rmb() __sync() +#define fast_mb() __sync() + #ifdef CONFIG_CPU_HAS_WB #include <asm/wbflush.h> -#define rmb() do { } while(0) -#define wmb() wbflush() -#define mb() wbflush() - -#else /* CONFIG_CPU_HAS_WB */ - -#define mb() \ -__asm__ __volatile__( \ - "# prevent instructions being moved around\n\t" \ - ".set\tnoreorder\n\t" \ - "# 8 nops to fool the R4400 pipeline\n\t" \ - "nop;nop;nop;nop;nop;nop;nop;nop\n\t" \ - ".set\treorder" \ - : /* no output */ \ - : /* no input */ \ - : "memory") -#define rmb() mb() -#define wmb() mb() -#endif /* CONFIG_CPU_HAS_WB */ +#define wmb() fast_wmb() +#define rmb() fast_rmb() +#define mb() wbflush(); + +#else /* !CONFIG_CPU_HAS_WB */ + +#define wmb() fast_wmb() +#define rmb() fast_rmb() +#define mb() fast_mb() + +#endif /* !CONFIG_CPU_HAS_WB */ #ifdef CONFIG_SMP #define smp_mb() mb() diff -up --recursive --new-file linux-mips-2.4.17-20020111.macro/include/asm-mips/wbflush.h linux-mips-2.4.17-20020111/include/asm-mips/wbflush.h --- linux-mips-2.4.17-20020111.macro/include/asm-mips/wbflush.h Fri Sep 7 04:26:33 2001 +++ linux-mips-2.4.17-20020111/include/asm-mips/wbflush.h Sat Feb 2 00:05:21 2002 @@ -6,28 +6,49 @@ * for more details. * * Copyright (c) 1998 Harald Koerfgen + * Copyright (C) 2002 Maciej W. Rozycki */ #ifndef __ASM_MIPS_WBFLUSH_H #define __ASM_MIPS_WBFLUSH_H #include <linux/config.h> -#if defined(CONFIG_CPU_HAS_WB) -/* - * R2000 or R3000 - */ -extern void (*__wbflush) (void); +#include <asm/addrspace.h> +#include <asm/system.h> -#define wbflush() __wbflush() +#define __fast_wbflush() \ + __asm__ __volatile__( \ + ".set push\n\t" \ + ".set noreorder\n\t" \ + "lw $0,%0\n\t" \ + "nop\n\t" \ + ".set pop" \ + : /* no output */ \ + : "m" (*(int *)KSEG1) \ + : "memory") + +#define fast_wbflush() \ + do { \ + fast_mb(); \ + __fast_wbflush(); \ + } while (0) + + +#ifdef CONFIG_CPU_HAS_WB + +extern void (*__wbflush)(void); + +#define wbflush() \ + do { \ + fast_mb(); \ + __wbflush(); \ + } while (0) -#else -/* - * we don't need no stinkin' wbflush - */ +#else /* !CONFIG_CPU_HAS_WB */ -#define wbflush() do { } while(0) +#define wbflush() fast_wbflush() -#endif +#endif /* !CONFIG_CPU_HAS_WB */ extern void wbflush_setup(void); ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-01-31 12:17 ` Maciej W. Rozycki 2002-01-31 20:35 ` Jason Gunthorpe @ 2002-02-04 17:02 ` Ralf Baechle 2002-02-05 10:33 ` Maciej W. Rozycki 1 sibling, 1 reply; 17+ messages in thread From: Ralf Baechle @ 2002-02-04 17:02 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: Jason Gunthorpe, linux-mips, linux-mips On Thu, Jan 31, 2002 at 01:17:39PM +0100, Maciej W. Rozycki wrote: > Certain DECstation models have a write-back buffer that needs to be > handled explicitly. For example rmb() is "1: bc0f 1b" for the R3220 WB > chip. Wmb() is null, certainly, as the buffer is strongly-ordered. See > arch/mips/dec/wbflush.c for details. Just as an aside that isn't directly relevant to DECstations. To date all MIPS _processors_ are strongly ordered. I now know of at least one processor that implements a weakly ordered memory model, so the assumption of a strongly ordered memory model has become void for large parts of the kernel. Even before that some systems had strongly ordered processors in system environments that may reorder requests. Bugs due to surprise memory reordering are entirely unfun to debug, so be paranoid. They're out there to get you ... Ralf ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] linux 2.4.17: An mb() rework 2002-02-04 17:02 ` Ralf Baechle @ 2002-02-05 10:33 ` Maciej W. Rozycki 0 siblings, 0 replies; 17+ messages in thread From: Maciej W. Rozycki @ 2002-02-05 10:33 UTC (permalink / raw) To: Ralf Baechle; +Cc: Jason Gunthorpe, linux-mips, linux-mips On Mon, 4 Feb 2002, Ralf Baechle wrote: > Bugs due to surprise memory reordering are entirely unfun to debug, so be > paranoid. They're out there to get you ... The rule of thumb is to use barriers whenever a *device* requires them and depend on the platform to define them appropriately. They will expand to nothing if unneeded, so there is no trade-off. -- + Maciej W. Rozycki, Technical University of Gdansk, Poland + +--------------------------------------------------------------+ + e-mail: macro@ds2.pg.gda.pl, PGP key available + ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2002-02-05 10:33 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2002-01-30 14:56 [patch] linux 2.4.17: An mb() rework Maciej W. Rozycki 2002-01-30 20:02 ` Jason Gunthorpe 2002-01-31 12:17 ` Maciej W. Rozycki 2002-01-31 20:35 ` Jason Gunthorpe 2002-01-31 21:50 ` Maciej W. Rozycki 2002-02-01 6:44 ` Jason Gunthorpe 2002-02-01 8:55 ` Dominic Sweetman 2002-02-01 12:04 ` Maciej W. Rozycki 2002-02-01 9:52 ` Hartvig Ekner 2002-02-01 9:52 ` Hartvig Ekner 2002-02-01 12:01 ` Maciej W. Rozycki 2002-01-31 23:38 ` Alan Cox 2002-01-31 23:38 ` Alan Cox 2002-02-01 12:27 ` Maciej W. Rozycki 2002-02-02 12:09 ` Maciej W. Rozycki 2002-02-04 17:02 ` Ralf Baechle 2002-02-05 10:33 ` Maciej W. Rozycki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox