linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Workaround for 745x data corruption bug
@ 2004-07-31 16:27 Adrian Cox
  2004-08-01  2:53 ` Benjamin Herrenschmidt
  2004-08-02 18:20 ` Mark A. Greer
  0 siblings, 2 replies; 14+ messages in thread
From: Adrian Cox @ 2004-07-31 16:27 UTC (permalink / raw)
  To: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 583 bytes --]

Recently released errata documents show a new bug in all 745x family
processors. This can cause data corruption when memory is mapped
non-coherent and one of these conditions is true:
1) L2 hardware prefetch is enabled (as it is in Linux)
2) instructions and data are fetched from the same or adjacent cache
lines.

The attached patch adds a workaround, by setting CPU_FTR_NEED_COHERENT
on all 745x processors.

The bug is 7447A errata #16, 7457 errata #26, 7455 errata #33, 7450
errata #69.

Signed-off-by: Adrian Cox <adrian@humboldt.co.uk>

- Adrian Cox
Humboldt Solutions Ltd.



[-- Attachment #2: Type: text/x-patch, Size: 3070 bytes --]

===== arch/ppc/kernel/cputable.c 1.32 vs edited =====
--- 1.32/arch/ppc/kernel/cputable.c	Sat Jun 19 03:39:25 2004
+++ edited/arch/ppc/kernel/cputable.c	Sat Jul 31 16:51:34 2004
@@ -263,7 +263,7 @@
 	CPU_FTR_COMMON |
     	CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB |
 	CPU_FTR_L2CR | CPU_FTR_ALTIVEC_COMP | CPU_FTR_L3CR |
-	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450,
+	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_NEED_COHERENT,
 	COMMON_PPC | PPC_FEATURE_ALTIVEC_COMP,
 	32, 32,
 	__setup_cpu_745x
@@ -274,7 +274,7 @@
     	CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_CAN_NAP |
 	CPU_FTR_L2CR | CPU_FTR_ALTIVEC_COMP | CPU_FTR_L3CR |
 	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_NAP_DISABLE_L2_PR |
-	CPU_FTR_L3_DISABLE_NAP,
+	CPU_FTR_L3_DISABLE_NAP | CPU_FTR_NEED_COHERENT,
 	COMMON_PPC | PPC_FEATURE_ALTIVEC_COMP,
 	32, 32,
 	__setup_cpu_745x
@@ -284,7 +284,8 @@
 	CPU_FTR_COMMON |
     	CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_CAN_NAP |
 	CPU_FTR_L2CR | CPU_FTR_ALTIVEC_COMP | CPU_FTR_L3CR |
-	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_NAP_DISABLE_L2_PR,
+	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_NAP_DISABLE_L2_PR |
+	CPU_FTR_NEED_COHERENT,
 	COMMON_PPC | PPC_FEATURE_ALTIVEC_COMP,
 	32, 32,
 	__setup_cpu_745x
@@ -294,7 +295,8 @@
 	CPU_FTR_COMMON |
     	CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB |
 	CPU_FTR_L2CR | CPU_FTR_ALTIVEC_COMP | CPU_FTR_L3CR |
-	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_HAS_HIGH_BATS,
+	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_HAS_HIGH_BATS |
+	CPU_FTR_NEED_COHERENT,
 	COMMON_PPC | PPC_FEATURE_ALTIVEC_COMP,
 	32, 32,
 	__setup_cpu_745x
@@ -305,7 +307,7 @@
     	CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_CAN_NAP |
 	CPU_FTR_L2CR | CPU_FTR_ALTIVEC_COMP | CPU_FTR_L3CR |
 	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_NAP_DISABLE_L2_PR |
-	CPU_FTR_L3_DISABLE_NAP | CPU_FTR_HAS_HIGH_BATS,
+	CPU_FTR_L3_DISABLE_NAP | CPU_FTR_HAS_HIGH_BATS | CPU_FTR_NEED_COHERENT,
 	COMMON_PPC | PPC_FEATURE_ALTIVEC_COMP,
 	32, 32,
 	__setup_cpu_745x
@@ -316,7 +318,7 @@
     	CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_CAN_NAP |
 	CPU_FTR_L2CR | CPU_FTR_ALTIVEC_COMP | CPU_FTR_L3CR |
 	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_NAP_DISABLE_L2_PR |
-	CPU_FTR_HAS_HIGH_BATS,
+	CPU_FTR_HAS_HIGH_BATS | CPU_FTR_NEED_COHERENT,
 	COMMON_PPC | PPC_FEATURE_ALTIVEC_COMP,
 	32, 32,
 	__setup_cpu_745x
@@ -327,7 +329,7 @@
     	CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_CAN_NAP |
 	CPU_FTR_L2CR | CPU_FTR_ALTIVEC_COMP | CPU_FTR_L3CR |
 	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_NAP_DISABLE_L2_PR |
-	CPU_FTR_HAS_HIGH_BATS,
+	CPU_FTR_HAS_HIGH_BATS | CPU_FTR_NEED_COHERENT,
 	COMMON_PPC | PPC_FEATURE_ALTIVEC_COMP,
 	32, 32,
 	__setup_cpu_745x
@@ -338,7 +340,7 @@
     	CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_CAN_NAP |
 	CPU_FTR_L2CR | CPU_FTR_ALTIVEC_COMP |
 	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_NAP_DISABLE_L2_PR |
-	CPU_FTR_HAS_HIGH_BATS,
+	CPU_FTR_HAS_HIGH_BATS | CPU_FTR_NEED_COHERENT,
 	COMMON_PPC | PPC_FEATURE_ALTIVEC_COMP,
 	32, 32,
 	__setup_cpu_745x

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-07-31 16:27 [PATCH] Workaround for 745x data corruption bug Adrian Cox
@ 2004-08-01  2:53 ` Benjamin Herrenschmidt
  2004-08-02 15:20   ` Kumar Gala
  2004-08-02 18:20 ` Mark A. Greer
  1 sibling, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2004-08-01  2:53 UTC (permalink / raw)
  To: Adrian Cox; +Cc: linuxppc-dev list


I have a more complete patch addressing another errata as well, will
be up rsn...

Ben.


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-01  2:53 ` Benjamin Herrenschmidt
@ 2004-08-02 15:20   ` Kumar Gala
  2004-08-02 21:40     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Kumar Gala @ 2004-08-02 15:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev list, Adrian Cox


Ben,

Do you plan on also back porting your patch to 2.4?

thanks

- kumar

On Jul 31, 2004, at 9:53 PM, Benjamin Herrenschmidt wrote:

>
> I have a more complete patch addressing another errata as well, will
> be up rsn...
>
> Ben.
>
>


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-07-31 16:27 [PATCH] Workaround for 745x data corruption bug Adrian Cox
  2004-08-01  2:53 ` Benjamin Herrenschmidt
@ 2004-08-02 18:20 ` Mark A. Greer
  2004-08-02 21:47   ` Sven Luther
  2004-08-03 13:17   ` Brian Waite
  1 sibling, 2 replies; 14+ messages in thread
From: Mark A. Greer @ 2004-08-02 18:20 UTC (permalink / raw)
  To: Adrian Cox; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 792 bytes --]

Adrian Cox wrote:

>Recently released errata documents show a new bug in all 745x family
>processors. This can cause data corruption when memory is mapped
>non-coherent and one of these conditions is true:
>1) L2 hardware prefetch is enabled (as it is in Linux)
>2) instructions and data are fetched from the same or adjacent cache
>lines.
>
>The attached patch adds a workaround, by setting CPU_FTR_NEED_COHERENT
>on all 745x processors.
>

Well that sucks (the bug, not the patch :).  Many people like to turn
off coherency when using Marvell host bridges b/c they struggle
performance-wise with coherency on (at least on some versions).

One change to the patch, though.  According to the 7447/7457 errata doc,
rev 1.2 doesn't have the bug.  The attached patch accounts for that.

Mark
--

[-- Attachment #2: coherent2.patch --]
[-- Type: text/plain, Size: 537 bytes --]

===== arch/ppc/kernel/cputable.c 1.25 vs edited =====
--- 1.25/arch/ppc/kernel/cputable.c	2004-08-01 17:30:29 -07:00
+++ edited/arch/ppc/kernel/cputable.c	2004-08-02 11:12:37 -07:00
@@ -352,7 +352,7 @@
     	CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_CAN_NAP |
 	CPU_FTR_L2CR | CPU_FTR_ALTIVEC_COMP | CPU_FTR_L3CR |
 	CPU_FTR_HPTE_TABLE | CPU_FTR_SPEC7450 | CPU_FTR_NAP_DISABLE_L2_PR |
-	CPU_FTR_HAS_HIGH_BATS | CPU_FTR_NEED_COHERENT,
+	CPU_FTR_HAS_HIGH_BATS,
 	COMMON_PPC | PPC_FEATURE_ALTIVEC_COMP,
 	32, 32,
 	__setup_cpu_745x

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-02 15:20   ` Kumar Gala
@ 2004-08-02 21:40     ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2004-08-02 21:40 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev list, Adrian Cox


On Tue, 2004-08-03 at 01:20, Kumar Gala wrote:
> Ben,
>
> Do you plan on also back porting your patch to 2.4?

Nope, but you are welcome to do it ;)

Ben.


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-02 18:20 ` Mark A. Greer
@ 2004-08-02 21:47   ` Sven Luther
  2004-08-03 13:17   ` Brian Waite
  1 sibling, 0 replies; 14+ messages in thread
From: Sven Luther @ 2004-08-02 21:47 UTC (permalink / raw)
  To: Mark A. Greer; +Cc: Adrian Cox, linuxppc-dev


On Mon, Aug 02, 2004 at 11:20:30AM -0700, Mark A. Greer wrote:
> Adrian Cox wrote:
>
> >Recently released errata documents show a new bug in all 745x family
> >processors. This can cause data corruption when memory is mapped
> >non-coherent and one of these conditions is true:
> >1) L2 hardware prefetch is enabled (as it is in Linux)
> >2) instructions and data are fetched from the same or adjacent cache
> >lines.
> >
> >The attached patch adds a workaround, by setting CPU_FTR_NEED_COHERENT
> >on all 745x processors.
> >
>
> Well that sucks (the bug, not the patch :).  Many people like to turn
> off coherency when using Marvell host bridges b/c they struggle
> performance-wise with coherency on (at least on some versions).
>
> One change to the patch, though.  According to the 7447/7457 errata doc,
> rev 1.2 doesn't have the bug.  The attached patch accounts for that.

Notice that there are rev 1.1 7447s around, and that benh's patch does include
support for both the 1.1 and the 1.2 ones.

Friendly,

Sven Luther

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-02 18:20 ` Mark A. Greer
  2004-08-02 21:47   ` Sven Luther
@ 2004-08-03 13:17   ` Brian Waite
  2004-08-03 21:55     ` Mark A. Greer
  2004-08-04  0:45     ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 14+ messages in thread
From: Brian Waite @ 2004-08-03 13:17 UTC (permalink / raw)
  To: Mark A. Greer; +Cc: Adrian Cox, linuxppc-dev


> Well that sucks (the bug, not the patch :).  Many people like to turn
> off coherency when using Marvell host bridges b/c they struggle
> performance-wise with coherency on (at least on some versions).
>
> One change to the patch, though.  According to the 7447/7457 errata doc,
> rev 1.2 doesn't have the bug.  The attached patch accounts for that.
>
> Mark
> --

Mark,

Looking at this errata, I don't think poeple using Marvell chips will
be affected by this errata.
I don't see this as impacting the IO subsystem only the internal MPX system. IE
using a 2 74xx system and not using the M bit. This race can also
occur on a single processor system with the M disabled, as it is in
Linux, but it has nothing to do with the memory controller mapping its
PCI<->mem windows as non cacheable. You should be able to boost the IO
throughput using non coherency without this errata impacting you any
more than it already is.

Anyone please correct me if I am wrong because I need to know.

Thanks
Brian

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-03 13:17   ` Brian Waite
@ 2004-08-03 21:55     ` Mark A. Greer
  2004-08-04 14:37       ` Brian Waite
  2004-08-04  0:45     ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 14+ messages in thread
From: Mark A. Greer @ 2004-08-03 21:55 UTC (permalink / raw)
  To: Brian Waite; +Cc: Adrian Cox, linuxppc-dev


Brian,

One of us must be seriously misunderstanding that erratum.  Below is my
thinking.

--
Brian Waite wrote:

>Mark,
>
>Looking at this errata, I don't think poeple using Marvell chips will
>be affected by this errata.
>I don't see this as impacting the IO subsystem only the internal MPX system.
>

Please define exactly what you mean by "MPX system".

> IE
>using a 2 74xx system and not using the M bit.
>

Doubtful.  Assuming you mean a dual *SMP* 745x system, you'll almost
certainly want "coherency required" (i.e., M=1) so the erratum doesn't
apply.

> This race can also
>occur on a single processor system with the M disabled,
>

Yep.

> as it is in
>Linux, but it has nothing to do with the memory controller mapping its
>PCI<->mem windows as non cacheable.
>

This statement is wrong.  Firstly (nit pick), its not the memory
controller part of the chip that does that mapping, its the "PCI bridge"
part of that chip that does.

Secondly, you can't define a pci mem->system mem window as "non
cacheable"; you can only specify what type of snooping you want (none,
WB or WT).  Caches are assumed to exist and be turned on.  If they
aren't, you can just specify "none" for your snooping option.

Thirdly, it has everything to do with pci mem->system mem windows
because that mapping points to normal old system memory that PCI devices
are going to DMA into and out of.  If CONFIG_NOT_COHERENT_CACHE is NOT
defined, the the processor will set M=1 for data mappings so the
processor will assume that when a PCI device DMAs into memory, the
bridge will follow whatever coherency protocol they've agreed upon
(i.e., all that MEI, MESI, MERSI stuff) so the processor can
flush/invalidate cachelines as necessary.  Everyone has to follow the
protocol or you will end up with stale cachelines and seemingly broken
I/O sooner or later.

So, if CONFIG_NOT_COHERENT_CACHE is not defined, we must set up the
snoop windows on the bridge to either WB or WT.  If we don't set the
snoop windows correctly, coherency will be broken.  Also remember that
when CONFIG_NOT_COHERENT_CACHE is not defined the pci_/dma_ calls that
drivers use are essentially null (i.e., no manual cache management).

If CONFIG_NOT_COHERENT_CACHE is defined, then M=0 and now we set the
snoop windows on the bridge to "none".  The pci_/dma_ calls used by
drivers manage the caches and, assuming the driver is correctly written,
I/O works.

The point is, the bridge snoop window setting have to match the M bit
the processor uses.

> You should be able to boost the IO
>throughput using non coherency without this errata impacting you any
>more than it already is.
>

Hopefully my comments above show that this statement is incorrect.

Mark


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-03 13:17   ` Brian Waite
  2004-08-03 21:55     ` Mark A. Greer
@ 2004-08-04  0:45     ` Benjamin Herrenschmidt
  2004-08-04 17:38       ` Brian Waite
  1 sibling, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2004-08-04  0:45 UTC (permalink / raw)
  To: Brian Waite; +Cc: Mark A. Greer, Adrian Cox, linuxppc-dev list


> Looking at this errata, I don't think poeple using Marvell chips will
> be affected by this errata.
> I don't see this as impacting the IO subsystem only the internal MPX system. IE
> using a 2 74xx system and not using the M bit. This race can also
> occur on a single processor system with the M disabled, as it is in
> Linux, but it has nothing to do with the memory controller mapping its
> PCI<->mem windows as non cacheable. You should be able to boost the IO
> throughput using non coherency without this errata impacting you any
> more than it already is.
>
> Anyone please correct me if I am wrong because I need to know.

Note that I feel obligated to re-state again that doing non-coherent
DMA on a 6xx/7xx/7xxx CPU is a big no brainer and is asking for all sorts
of cache aliasing issues because of the way the kernel linear mapping of
memory is BAT mapped... (Note that 2.6 should be able to deal with the
DBAT beeing disabled, at the expense of perfs, though I'm pretty sure we
need the IBAT still set).

Ben.


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-03 21:55     ` Mark A. Greer
@ 2004-08-04 14:37       ` Brian Waite
  2004-08-04 17:55         ` Mark A. Greer
  0 siblings, 1 reply; 14+ messages in thread
From: Brian Waite @ 2004-08-04 14:37 UTC (permalink / raw)
  To: Mark A. Greer; +Cc: Adrian Cox, linuxppc-dev


Mark,
      Pardon the poor descriptions. Lame excuses follow.
Thanks
Brian

On Tue, 03 Aug 2004 14:55:37 -0700, Mark A. Greer <mgreer@mvista.com> wrote:
> Brian,
>
> One of us must be seriously misunderstanding that erratum.  Below is my
> thinking.
>
> --
> Brian Waite wrote:
>
> >Mark,
> >
> >Looking at this errata, I don't think poeple using Marvell chips will
> >be affected by this errata.
> >I don't see this as impacting the IO subsystem only the internal MPX system.
> >
>
> Please define exactly what you mean by "MPX system".
Poor nomencalture. I was referring to M affecting the behaviour of
multi-CPU interfaces.
>
> > IE
> >using a 2 74xx system and not using the M bit.
> >
>
> Doubtful.  Assuming you mean a dual *SMP* 745x system, you'll almost
> certainly want "coherency required" (i.e., M=1) so the erratum doesn't
> apply.
It was bantered around my office that to improve the memory bus
bandwidth of a dual CPU system that we place horrendus restrictions on
user apps being locked to a single processor, not share data/code and
eliminate the memory coherence between them. We measured about a 30%
bandwidth reduction due to the M bit so the first reaction was to
eliminate it. Thankfully I was able to kill that quickly. I should not
have even brought the dual proc, non-coherent case up here.  Sorry.

>
> > This race can also
> >occur on a single processor system with the M disabled,
> >
>
> Yep.
>
> > as it is in
> >Linux, but it has nothing to do with the memory controller mapping its
> >PCI<->mem windows as non cacheable.
> >
>
> This statement is wrong.  Firstly (nit pick), its not the memory
> controller part of the chip that does that mapping, its the "PCI bridge"
> part of that chip that does.
Nit pick accepted. I should have been more descriptive here.
>
> Secondly, you can't define a pci mem->system mem window as "non
> cacheable"; you can only specify what type of snooping you want (none,
> WB or WT).  Caches are assumed to exist and be turned on.  If they
> aren't, you can just specify "none" for your snooping option.

I agree. What I really meant was non-coherent. Ie Snoop option of "none".

>
> Thirdly, it has everything to do with pci mem->system mem windows
> because that mapping points to normal old system memory that PCI devices
> are going to DMA into and out of.  If CONFIG_NOT_COHERENT_CACHE is NOT
> defined, the the processor will set M=1 for data mappings so the
> processor will assume that when a PCI device DMAs into memory, the
> bridge will follow whatever coherency protocol they've agreed upon
> (i.e., all that MEI, MESI, MERSI stuff) so the processor can
> flush/invalidate cachelines as necessary.  Everyone has to follow the
> protocol or you will end up with stale cachelines and seemingly broken
> I/O sooner or later.
>
> So, if CONFIG_NOT_COHERENT_CACHE is not defined, we must set up the
> snoop windows on the bridge to either WB or WT.  If we don't set the
> snoop windows correctly, coherency will be broken.  Also remember that
> when CONFIG_NOT_COHERENT_CACHE is not defined the pci_/dma_ calls that
> drivers use are essentially null (i.e., no manual cache management).
>
Agreed.

> If CONFIG_NOT_COHERENT_CACHE is defined, then M=0 and now we set the
> snoop windows on the bridge to "none".  The pci_/dma_ calls used by
> drivers manage the caches and, assuming the driver is correctly written,
> I/O works.
Here is where I get lost. I am searching for the point where we modify
the M bit reletive to CONFIG_NOT_COHERENT_CACHE and I can't find it.
Under 2.6 I see us using the CPU_FTR_NEED_COHERENT to determine the M
bit setting in mm/hashtable.S and mm/ppc_mmu.c but I can only find us
setting CPU_FTR_NEED_COHERENT in kernel cputable.c: #if
defined(CONFIG_SMP) || defined(CONFIG_MPC10X_BRIDGE)

Under 2.4, I see it only set depending on CONFIG_SMP

Looking at Programing Environments for 32 bit processors section
5.2.1.3 about the Memory Coherence Attribute I see the following
description:
"When the M attribute is set, and the access is performed to memory,
there is a hardware indication to the rest of the system that the
access is global. Other processors affected by the access must then
respond to this global access signal." Example then follows.
The M bit allows 2 processors to communicate address acess with each
other. I don't see where it has anything to do with PCI->memory
access.

M bit affects the processor asserting Global
I hope I am missing something here.
>
> The point is, the bridge snoop window setting have to match the M bit
> the processor uses.
I disagree here. In fact I know that this is not the case. M=1 on SMP
and M=0 on UP the bridge snoop settings will stay the same. But WB vs
WT settings must be the same between bridge and CPUs otherwise the
protocol will be broken.


>
> > You should be able to boost the IO
> >throughput using non coherency without this errata impacting you any
> >more than it already is.
> >
>
Long story short I don't think the M bit will affect anything in the
behaviour a PCI DMA.
The snoop registers in the bridge certainly will affect behaviour but
not the other way around.

If I am wrong please let me know :)

> Hopefully my comments above show that this statement is incorrect.
>
> Mark
>
>

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-04  0:45     ` Benjamin Herrenschmidt
@ 2004-08-04 17:38       ` Brian Waite
  2004-08-04 22:43         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Brian Waite @ 2004-08-04 17:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Mark A. Greer, Adrian Cox, linuxppc-dev list


Ben,

So what kind of aliasing issues do you forsee uner 2.4?
Thanks
Brian
On Wed, 04 Aug 2004 10:45:32 +1000, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>
> > Looking at this errata, I don't think poeple using Marvell chips will
> > be affected by this errata.
> > I don't see this as impacting the IO subsystem only the internal MPX system. IE
> > using a 2 74xx system and not using the M bit. This race can also
> > occur on a single processor system with the M disabled, as it is in
> > Linux, but it has nothing to do with the memory controller mapping its
> > PCI<->mem windows as non cacheable. You should be able to boost the IO
> > throughput using non coherency without this errata impacting you any
> > more than it already is.
> >
> > Anyone please correct me if I am wrong because I need to know.
>
> Note that I feel obligated to re-state again that doing non-coherent
> DMA on a 6xx/7xx/7xxx CPU is a big no brainer and is asking for all sorts
> of cache aliasing issues because of the way the kernel linear mapping of
> memory is BAT mapped... (Note that 2.6 should be able to deal with the
> DBAT beeing disabled, at the expense of perfs, though I'm pretty sure we
> need the IBAT still set).
>
> Ben.
>
>

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-04 14:37       ` Brian Waite
@ 2004-08-04 17:55         ` Mark A. Greer
  2004-08-04 20:39           ` Adrian Cox
  0 siblings, 1 reply; 14+ messages in thread
From: Mark A. Greer @ 2004-08-04 17:55 UTC (permalink / raw)
  To: Brian Waite; +Cc: Adrian Cox, linuxppc-dev


Brian Waite wrote:

>Mark,
>      Pardon the poor descriptions. Lame excuses follow.
>

No biggie.  I made a mistake in my description too.  More on that below.

>Thanks
>Brian
>
>On Tue, 03 Aug 2004 14:55:37 -0700, Mark A. Greer <mgreer@mvista.com> wrote:
>
>
>>Brian,
>>
>>One of us must be seriously misunderstanding that erratum.  Below is my
>>thinking.
>>
>>--
>>Brian Waite wrote:
>>
>>
>>
>>>Mark,
>>>
>>>Looking at this errata, I don't think poeple using Marvell chips will
>>>be affected by this errata.
>>>I don't see this as impacting the IO subsystem only the internal MPX system.
>>>
>>>
>>>
>>Please define exactly what you mean by "MPX system".
>>
>>
>Poor nomencalture. I was referring to M affecting the behaviour of
>multi-CPU interfaces.
>

Okay.

>
>
>>>IE
>>>using a 2 74xx system and not using the M bit.
>>>
>>>
>>>
>>Doubtful.  Assuming you mean a dual *SMP* 745x system, you'll almost
>>certainly want "coherency required" (i.e., M=1) so the erratum doesn't
>>apply.
>>
>>
>It was bantered around my office that to improve the memory bus
>bandwidth of a dual CPU system that we place horrendus restrictions on
>user apps being locked to a single processor, not share data/code and
>eliminate the memory coherence between them. We measured about a 30%
>bandwidth reduction due to the M bit so the first reaction was to
>eliminate it. Thankfully I was able to kill that quickly. I should not
>have even brought the dual proc, non-coherent case up here.  Sorry.
>

No problem.

>
>
>
>>>This race can also
>>>occur on a single processor system with the M disabled,
>>>
>>>
>>>
>>Yep.
>>
>>
>>
>>>as it is in
>>>Linux, but it has nothing to do with the memory controller mapping its
>>>PCI<->mem windows as non cacheable.
>>>
>>>
>>>
>>This statement is wrong.  Firstly (nit pick), its not the memory
>>controller part of the chip that does that mapping, its the "PCI bridge"
>>part of that chip that does.
>>
>>
>Nit pick accepted. I should have been more descriptive here.
>
>
>>Secondly, you can't define a pci mem->system mem window as "non
>>cacheable"; you can only specify what type of snooping you want (none,
>>WB or WT).  Caches are assumed to exist and be turned on.  If they
>>aren't, you can just specify "none" for your snooping option.
>>
>>
>
>I agree. What I really meant was non-coherent. Ie Snoop option of "none".
>

I figured you meant that but wasn't sure.  This one was a nit pick too.

>
>
>
>>Thirdly, it has everything to do with pci mem->system mem windows
>>because that mapping points to normal old system memory that PCI devices
>>are going to DMA into and out of.  If CONFIG_NOT_COHERENT_CACHE is NOT
>>defined, the the processor will set M=1 for data mappings so the
>>processor will assume that when a PCI device DMAs into memory, the
>>bridge will follow whatever coherency protocol they've agreed upon
>>(i.e., all that MEI, MESI, MERSI stuff) so the processor can
>>flush/invalidate cachelines as necessary.  Everyone has to follow the
>>protocol or you will end up with stale cachelines and seemingly broken
>>I/O sooner or later.
>>
>>So, if CONFIG_NOT_COHERENT_CACHE is not defined, we must set up the
>>snoop windows on the bridge to either WB or WT.  If we don't set the
>>snoop windows correctly, coherency will be broken.  Also remember that
>>when CONFIG_NOT_COHERENT_CACHE is not defined the pci_/dma_ calls that
>>drivers use are essentially null (i.e., no manual cache management).
>>
>>
>>
>Agreed.
>
>
>
>>If CONFIG_NOT_COHERENT_CACHE is defined, then M=0 and now we set the
>>snoop windows on the bridge to "none".  The pci_/dma_ calls used by
>>drivers manage the caches and, assuming the driver is correctly written,
>>I/O works.
>>
>>
>Here is where I get lost. I am searching for the point where we modify
>the M bit reletive to CONFIG_NOT_COHERENT_CACHE and I can't find it.
>Under 2.6 I see us using the CPU_FTR_NEED_COHERENT to determine the M
>bit setting in mm/hashtable.S and mm/ppc_mmu.c but I can only find us
>setting CPU_FTR_NEED_COHERENT in kernel cputable.c: #if
>defined(CONFIG_SMP) || defined(CONFIG_MPC10X_BRIDGE)
>
>Under 2.4, I see it only set depending on CONFIG_SMP
>
>Looking at Programing Environments for 32 bit processors section
>5.2.1.3 about the Memory Coherence Attribute I see the following
>description:
>"When the M attribute is set, and the access is performed to memory,
>there is a hardware indication to the rest of the system that the
>access is global. Other processors affected by the access must then
>respond to this global access signal." Example then follows.
>The M bit allows 2 processors to communicate address acess with each
>other. I don't see where it has anything to do with PCI->memory
>access.
>
>M bit affects the processor asserting Global
>I hope I am missing something here.
>

You're right.  Randy Vinson pointed out to me that what I said about the
M bit was wrong.  Part of it was that I was not understanding what you
were saying.  I was assuming that you were talking about turning cache
coherency on in the cpu but leaving it off on the bridge.  I then went
off into the weeds.  I now understand that you were talking specifically
about the M bit.  So you're right, the erratum shouldn't affect whether
we turn coherency on or off.  Good point.

What I should have said is that if CONFIG_NOT_COHERENT_CACHE is not
defined, we need to set the snoop windows to WB or WT (depending on what
the cpu is set to).  If it is defined, we set it to none (or just
disable the snoop window(s)).  Forget what I said about the M bit, I
agree that should only be an issue in an SMP system.

>
>
>>The point is, the bridge snoop window setting have to match the M bit
>>the processor uses.
>>
>>
>I disagree here. In fact I know that this is not the case. M=1 on SMP
>and M=0 on UP the bridge snoop settings will stay the same. But WB vs
>WT settings must be the same between bridge and CPUs otherwise the
>protocol will be broken.
>

I agree.

>
>
>>>You should be able to boost the IO
>>>throughput using non coherency without this errata impacting you any
>>>more than it already is.
>>>
>>>
>>>
>Long story short I don't think the M bit will affect anything in the
>behaviour a PCI DMA.
>The snoop registers in the bridge certainly will affect behaviour but
>not the other way around.
>
>If I am wrong please let me know :)
>

No, you're right.  What I thought you were saying was that even when you
wanted coherency, we didn't have to enable the snoop windows in the
bridge.  That was what I was taking issue with.  I just wasn't
understanding what you were saying.

I think we're on the same page after all.  Just to make sure, here's a
summary:

1) M=1 is only necessary in an SMP environment (or for the 745x erratum).
2) If CONFIG_NOT_COHERENT_CACHE is not set, we enable snoop windows on
the bridge.
3) If CONFIG_NOT_COHERENT_CACHE is set, we disable snoop windows (or set
to none) on the bridge.

Back to your original comment, I think you're right, we should be able
to turn off the snooping windows and still have M=1 with no problem.

Mark


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-04 17:55         ` Mark A. Greer
@ 2004-08-04 20:39           ` Adrian Cox
  0 siblings, 0 replies; 14+ messages in thread
From: Adrian Cox @ 2004-08-04 20:39 UTC (permalink / raw)
  To: Mark A. Greer; +Cc: Brian Waite, linuxppc-dev


On Wed, 2004-08-04 at 18:55, Mark A. Greer wrote:
> 1) M=1 is only necessary in an SMP environment (or for the 745x erratum).

And the reason we have to turn it on for the MPC10x bridges is that they
contain their own cache. Even though it only has two cache lines, it has
to be treated as a second processor for coherence purposes.

- Adrian Cox
Humboldt Solutions Ltd.


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Workaround for 745x data corruption bug
  2004-08-04 17:38       ` Brian Waite
@ 2004-08-04 22:43         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2004-08-04 22:43 UTC (permalink / raw)
  To: Brian Waite; +Cc: Mark A. Greer, Adrian Cox, linuxppc-dev list


On Thu, 2004-08-05 at 03:38, Brian Waite wrote:
> Ben,
>
> So what kind of aliasing issues do you forsee uner 2.4?

The problem is that the kernel maps all RAM (or as much as it can fit
into 2 BATs) using BATs for the linear mapping at c0000000. This mapping
is cacheable.

If you map, even temporarily, part of your RAM as uncacheable (for DMA
purposes), you end up having 2 different mappings of the same physical
addresses, one cacheable, one non cacheable. The problem is that you may
have

  1) stale data for these physical addresses from before the mapping
     (you have to take care of that by flushing the cache over the area
     when creating the non-cacheable mapping)

  2) CPU speculative load or HW prefetch (most likely speculative load)
     can bring some of those data into the cache via the cacheable
     mapping even though you are not actually reading from the affected
     pages, but only from one nearby (like the one just before).

In both cases, you end up with the possibility that you try to do a non
cacheable access to some space that is also aliased in your cache
somewhere.

This is an undefined behaviour as far as the CPU is concerned. For
example, I think a POWER4 or a G5 will checkstop. I'm not sure what will
happen with the various 74xx but it's definitely not something you can
rely on working properly.

You can of course try to completely avoid non-cacheable mappings and
only do explicit cache invalidate/flushes around DMA operations, but
that would prevent proper implementation of dma_alloc_coherent(), and
I wish you good luck trying to get something like a USB OHCI driver
working with such a scheme.

The solution would be to get rid of the BAT mapping, at least for the
D-BAT, at the expense of kernel performances (possibly significant here)
. The I-BAT must remain though, which can be an issue, I'm not sure how
things will work out at the L2 and L3 caches level...

Ben.


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2004-08-04 22:43 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-07-31 16:27 [PATCH] Workaround for 745x data corruption bug Adrian Cox
2004-08-01  2:53 ` Benjamin Herrenschmidt
2004-08-02 15:20   ` Kumar Gala
2004-08-02 21:40     ` Benjamin Herrenschmidt
2004-08-02 18:20 ` Mark A. Greer
2004-08-02 21:47   ` Sven Luther
2004-08-03 13:17   ` Brian Waite
2004-08-03 21:55     ` Mark A. Greer
2004-08-04 14:37       ` Brian Waite
2004-08-04 17:55         ` Mark A. Greer
2004-08-04 20:39           ` Adrian Cox
2004-08-04  0:45     ` Benjamin Herrenschmidt
2004-08-04 17:38       ` Brian Waite
2004-08-04 22:43         ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).