tg3 support broken on PPC, a workaround

* tg3 support broken on PPC, a workaround
@ 2005-05-10  9:33 Manuel Perez Ayala
  2005-05-10 16:52 ` Michael Chan
  0 siblings, 1 reply; 18+ messages in thread
From: Manuel Perez Ayala @ 2005-05-10  9:33 UTC (permalink / raw)
  To: netdev

Hello,

Since Linux kernel 2.6.8, tg3 support has been broken on my PPC machines (Power
Mac G4 silver), causing data corruption.

After a while I finally found the code responsible for my problems:

In the Debian 2.6.8 kernel, at line 7356 of tg3.c (inside the tg3_test_dma
function), there is a conditional piece of code that is compiled only if the
platform is NOT x86:

#ifndef CONFIG_X86
        {
                u8 byte;
                int cacheline_size;
                pci_read_config_byte(tp->pdev, PCI_CACHE_LINE_SIZE, &byte);

                if (byte == 0)
                        cacheline_size = 1024;
                else
                        cacheline_size = (int) byte * 4;

                switch (cacheline_size) {
                case 16:
                case 32:
                case 64:
                case 128:
                        if ((tp->tg3_flags & TG3_FLAG_PCIX_MODE) &&
                            !(tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS)) {
                                tp->dma_rwctrl |=
                                        DMA_RWCTRL_WRITE_BNDRY_384_PCIX;
                                break;
                        } else if (tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS) {
                                tp->dma_rwctrl &=
                                        ~(DMA_RWCTRL_PCI_WRITE_CMD);
                                tp->dma_rwctrl |=
                                        DMA_RWCTRL_WRITE_BNDRY_128_PCIE;
                                break;
                        }
                        /* fallthrough */
                case 256:
                        if (!(tp->tg3_flags & TG3_FLAG_PCIX_MODE) &&
                            !(tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS))
                                tp->dma_rwctrl |=
                                        DMA_RWCTRL_WRITE_BNDRY_256;
                        else if (!(tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS))
                                tp->dma_rwctrl |=
                                        DMA_RWCTRL_WRITE_BNDRY_256_PCIX;
                };
        }
#endif


This changed from version 2.6.7 (also the Debian distribution). There, at line
6964 (inside the same function), the code was:

#ifndef CONFIG_X86
        {
                u8 byte;
                int cacheline_size;
                pci_read_config_byte(tp->pdev, PCI_CACHE_LINE_SIZE, &byte);

                if (byte == 0)
                        cacheline_size = 1024;
                else
                        cacheline_size = (int) byte * 4;

                tp->dma_rwctrl &= ~(DMA_RWCTRL_READ_BNDRY_MASK |
                                    DMA_RWCTRL_WRITE_BNDRY_MASK);

                switch (cacheline_size) {
                case 16:
                        tp->dma_rwctrl |=
                                (DMA_RWCTRL_READ_BNDRY_16 |
                                 DMA_RWCTRL_WRITE_BNDRY_16);
                        break;

                case 32:
                        tp->dma_rwctrl |=
                                (DMA_RWCTRL_READ_BNDRY_32 |
                                 DMA_RWCTRL_WRITE_BNDRY_32);
                        break;

                case 64:
                        tp->dma_rwctrl |=
                                (DMA_RWCTRL_READ_BNDRY_64 |
                                 DMA_RWCTRL_WRITE_BNDRY_64);
                        break;

                case 128:
                        tp->dma_rwctrl |=
                                (DMA_RWCTRL_READ_BNDRY_128 |
                                 DMA_RWCTRL_WRITE_BNDRY_128);
                        break;

                case 256:
                        tp->dma_rwctrl |=
                                (DMA_RWCTRL_READ_BNDRY_256 |
                                 DMA_RWCTRL_WRITE_BNDRY_256);
                        break;

                case 512:
                        tp->dma_rwctrl |=
                                (DMA_RWCTRL_READ_BNDRY_512 |
                                 DMA_RWCTRL_WRITE_BNDRY_512);
                        break;

                case 1024:
                        tp->dma_rwctrl |=
                                (DMA_RWCTRL_READ_BNDRY_1024 |
                                 DMA_RWCTRL_WRITE_BNDRY_1024);
                        break;
                };
        }
#endif


If I replace the 2.6.8 piece of code with the 2.6.7 one and compile the code, it
seems to work without problems of data corruption.

Perhaps there is a better solution that adds or modifies only certain parts of
the 2.6.8 code to make things work, but this works for me.

If anybody can suggest anything, I will be very happy to try it and report
back.

Thanks.


----------
Manuel Perez Ayala
mperaya@alcazaba.unex.es
Facultad de Biblioteconomía y Documentación
Universidad de Extremadura

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10  9:33 tg3 support broken on PPC, a workaround Manuel Perez Ayala
@ 2005-05-10 16:52 ` Michael Chan
  2005-05-10 19:12   ` David S. Miller
  2005-05-11  6:04   ` Manuel Perez Ayala
  0 siblings, 2 replies; 18+ messages in thread
From: Michael Chan @ 2005-05-10 16:52 UTC (permalink / raw)
  To: Manuel Perez Ayala; +Cc: netdev

On Tue, 2005-05-10 at 11:33 +0200, Manuel Perez Ayala wrote:

> 
> If I replace the 2.6.8 piece of code with the 2.6.7 one and compile the code, it
> seems to work without problems of data corruption.
> 
Can you print out the value of tp->dma_rwctrl in hex just before it is
written to the register in the line:

tw32(TG3PCI_DMA_RW_CTRL, tp->dma_rwctrl);

Please do this for the working and non-working driver versions.

I assume you have a 5700 or 5701 as this code that controls the DMA
boundaries only affects those devices. Please confirm with the lspci
output or tg3's probing output.

In the new code, the DMA write bursts will disconnect at multiples of
cache lines instead of 1 cache line. And DMA read bursts will not
disconnect at cache line boundaries.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 16:52 ` Michael Chan
@ 2005-05-10 19:12   ` David S. Miller
  2005-05-10 19:43     ` Michael Chan
                       ` (2 more replies)
  2005-05-11  6:04   ` Manuel Perez Ayala
  1 sibling, 3 replies; 18+ messages in thread
From: David S. Miller @ 2005-05-10 19:12 UTC (permalink / raw)
  To: mchan; +Cc: mperaya, netdev

From: "Michael Chan" <mchan@broadcom.com>
Subject: Re: tg3 support broken on PPC, a workaround
Date: Tue, 10 May 2005 09:52:46 -0700

> In the new code, the DMA write bursts will disconnect at multiples of
> cache lines instead of 1 cache line. And DMA read bursts will not
> disconnect at cache line boundaries.

We really should be disconnecting at single cacheline boundaries
on RISC systems.  The PCI controllers on RISC machines are
going to disconnect the tg3 when it crosses a cache line
boundary, so all these settings do is waste PCI bandwidth.

From the sparc64 PCI controller programmer's manual:

"When a DMA burst transfer attempts to go past a cache line (64B)
 boundary, U2P generates a disconnect.  This should cause the
 master device to attempt the transaction again beginning at the
 address of the next untransferred data."

Most other RISC systems have PCI controllers which
behave similarly if not identically, although there
are probably some exceptions.

Anyways, it is clear this code needs to change. :-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 19:12   ` David S. Miller
@ 2005-05-10 19:43     ` Michael Chan
  2005-05-10 21:15       ` David S. Miller
  2005-05-10 20:26     ` David S. Miller
  2005-05-10 22:13     ` Rick Jones
  2 siblings, 1 reply; 18+ messages in thread
From: Michael Chan @ 2005-05-10 19:43 UTC (permalink / raw)
  To: David S.Miller; +Cc: mperaya, netdev

On Tue, 2005-05-10 at 12:12 -0700, David S.Miller wrote:
> From: "Michael Chan" <mchan@broadcom.com>
> Subject: Re: tg3 support broken on PPC, a workaround
> Date: Tue, 10 May 2005 09:52:46 -0700
> 
> > In the new code, the DMA write bursts will disconnect at multiples of
> > cache lines instead of 1 cache line. And DMA read bursts will not
> > disconnect at cache line boundaries.
> 
> We really should be disconnecting at single cacheline boundaries
> on RISC systems.  The PCI controllers on RISC machines are
> going to disconnect the tg3 when it crosses a cache line
> boundary, so all these settings do is waste PCI bandwidth.
> 
> From the sparc64 PCI controller programmer's manual:
> 
> "When a DMA burst transfer attempts to go past a cache line (64B)
>  boundary, U2P generates a disconnect.  This should cause the
>  master device to attempt the transaction again beginning at the
>  address of the next untransferred data."
> 

This should be fine. If the bridge requires termination at every cache
line, the bridge (target) will initiate disconnect and data will
terminate.

There is clear benefit in doing longer bursts when the bridge can handle
it.

It was explained to me that on some risc systems such as ppc, and
assuming the bridge can handle long DMA bursts, it is still best to
disconnect at page or cache line boundaries. The reason is that if the
burst stops at any arbitrary address, the bridge has to refetch the
cache line and often the mapping information. Disconnecting at multiple
cache lines is to address this problem while still achieving longer DMA
bursts.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 20:26     ` David S. Miller
@ 2005-05-10 20:14       ` Michael Chan
  2005-05-10 21:23         ` David S. Miller
  2005-05-10 20:32       ` David S. Miller
  1 sibling, 1 reply; 18+ messages in thread
From: Michael Chan @ 2005-05-10 20:14 UTC (permalink / raw)
  To: David S.Miller; +Cc: mperaya, netdev

On Tue, 2005-05-10 at 13:26 -0700, David S.Miller wrote:
> From: "David S. Miller" <davem@davemloft.net>
> Subject: Re: tg3 support broken on PPC, a workaround
> Date: Tue, 10 May 2005 12:12:14 -0700 (PDT)
> 
> > Anyways, it is clear this code needs to change. :-)
> 
> I propose something like the patch below.  I unfortunately
> discovered that the PCI-X boundary controls are limited, and
> even worse PCI-E only allows controlling the write side and
> not the read side at all. :-(
> 
> I think this should really be considered to be fixed
> in future chip revisions, as performance will suffer
> unnecessarily without proper boundary controls.  Even
> just a single bit in the DMA RW control register which
> says "do not cross PCI_CACHELINE_SIZE boundary" would
> work just fine as that is essentially what the code below
> is trying to convince the Tigon3 chip to do. :)
> 
> I am running this patch now on my sparc64 SunBlade1500
> workstation's onboard 5703.
> 
> Comments?

DMA boundary control bits are only valid on 5700 and 5701. On all other
PCI/PCIX chips, these bits are no longer defined.

On PCI Express systems, the cache line size register is not used and is
set to zero on most systems.

> 
> [TG3]: Do not burst across cache line boundary on non-X86
> 
> PCI controllers on these systems will disconnect the tg3
> when it crosses a cache line boundary anyways, wasting
> precious PCI bandwidth.

I don't think target-initiated disconnects will waste PCI bandwidth
compared to master-initiated terminations. In both cases, you see the
same DMA bursts across the bus, only the termination of each burst is
different.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 19:12   ` David S. Miller
  2005-05-10 19:43     ` Michael Chan
@ 2005-05-10 20:26     ` David S. Miller
  2005-05-10 20:14       ` Michael Chan
  2005-05-10 20:32       ` David S. Miller
  2005-05-10 22:13     ` Rick Jones
  2 siblings, 2 replies; 18+ messages in thread
From: David S. Miller @ 2005-05-10 20:26 UTC (permalink / raw)
  To: mchan; +Cc: mperaya, netdev

From: "David S. Miller" <davem@davemloft.net>
Subject: Re: tg3 support broken on PPC, a workaround
Date: Tue, 10 May 2005 12:12:14 -0700 (PDT)

> Anyways, it is clear this code needs to change. :-)

I propose something like the patch below.  I unfortunately
discovered that the PCI-X boundary controls are limited, and
even worse PCI-E only allows controlling the write side and
not the read side at all. :-(

I think this should really be considered to be fixed
in future chip revisions, as performance will suffer
unnecessarily without proper boundary controls.  Even
just a single bit in the DMA RW control register which
says "do not cross PCI_CACHELINE_SIZE boundary" would
work just fine as that is essentially what the code below
is trying to convince the Tigon3 chip to do. :)

I am running this patch now on my sparc64 SunBlade1500
workstation's onboard 5703.

Comments?

[TG3]: Do not burst across cache line boundary on non-X86

PCI controllers on these systems will disconnect the tg3
when it crosses a cache line boundary anyways, wasting
precious PCI bandwidth.

Signed-off-by: David S. Miller <davem@davemloft.net>

--- drivers/net/tg3.c.~3~	2005-05-09 15:52:32.000000000 -0700
+++ drivers/net/tg3.c	2005-05-10 13:16:51.000000000 -0700
@@ -8865,33 +8865,93 @@
 		else
 			cacheline_size = (int) byte * 4;
 
-		switch (cacheline_size) {
-		case 16:
-		case 32:
-		case 64:
-		case 128:
-			if ((tp->tg3_flags & TG3_FLAG_PCIX_MODE) &&
-			    !(tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS)) {
-				tp->dma_rwctrl |=
-					DMA_RWCTRL_WRITE_BNDRY_384_PCIX;
-				break;
-			} else if (tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS) {
-				tp->dma_rwctrl &=
-					~(DMA_RWCTRL_PCI_WRITE_CMD);
+		/* PCI controllers on most RISC systems tend to disconnect
+		 * when a device tries to burst across a cache-line boundary.
+		 * Therefore, letting tg3 do so just wastes PCI bandwidth.
+		 *
+		 * Unfortunately, for PCI-E there are only limited
+		 * write-side controls for this, and thus for reads
+		 * we will still get the disconnects.
+		 */
+		if ((tp->tg3_flags & TG3_FLAG_PCIX_MODE) &&
+		    !(tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS)) {
+			switch (cacheline_size) {
+			case 16:
+			case 32:
+			case 64:
+			case 128:
 				tp->dma_rwctrl |=
-					DMA_RWCTRL_WRITE_BNDRY_128_PCIE;
+					(DMA_RWCTRL_READ_BNDRY_128_PCIX |
+					 DMA_RWCTRL_WRITE_BNDRY_128_PCIX);
 				break;
-			}
-			/* fallthrough */
-		case 256:
-			if (!(tp->tg3_flags & TG3_FLAG_PCIX_MODE) &&
-			    !(tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS))
+
+			case 256:
+				tp->dma_rwctrl |=
+					(DMA_RWCTRL_READ_BNDRY_256_PCIX |
+					 DMA_RWCTRL_WRITE_BNDRY_256_PCIX);
+				break;
+
+			default:
+				tp->dma_rwctrl |=
+					(DMA_RWCTRL_READ_BNDRY_384_PCIX |
+					 DMA_RWCTRL_WRITE_BNDRY_384_PCIX);
+				break;
+			};
+		} else if (tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS) {
+			switch (cacheline_size) {
+			case 16:
+			case 32:
+			case 64:
+				tp->dma_rwctrl |=
+					 DMA_RWCTRL_WRITE_BNDRY_64_PCIE;
+				break;
+
+			case 128:
+			default:
+				tp->dma_rwctrl |=
+					 DMA_RWCTRL_WRITE_BNDRY_128_PCIE;
+				break;
+			};
+		} else {
+			switch (cacheline_size) {
+			case 16:
+				tp->dma_rwctrl |=
+					(DMA_RWCTRL_READ_BNDRY_16 |
+					 DMA_RWCTRL_WRITE_BNDRY_16);
+				break;
+			case 32:
+				tp->dma_rwctrl |=
+					(DMA_RWCTRL_READ_BNDRY_32 |
+					 DMA_RWCTRL_WRITE_BNDRY_32);
+				break;
+			case 64:
+				tp->dma_rwctrl |=
+					(DMA_RWCTRL_READ_BNDRY_64 |
+					 DMA_RWCTRL_WRITE_BNDRY_64);
+				break;
+			case 128:
 				tp->dma_rwctrl |=
-					DMA_RWCTRL_WRITE_BNDRY_256;
-			else if (!(tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS))
+					(DMA_RWCTRL_READ_BNDRY_128 |
+					 DMA_RWCTRL_WRITE_BNDRY_128);
+				break;
+			case 256:
+				tp->dma_rwctrl |=
+					(DMA_RWCTRL_READ_BNDRY_256 |
+					 DMA_RWCTRL_WRITE_BNDRY_256);
+				break;
+			case 512:
 				tp->dma_rwctrl |=
-					DMA_RWCTRL_WRITE_BNDRY_256_PCIX;
-		};
+					(DMA_RWCTRL_READ_BNDRY_512 |
+					 DMA_RWCTRL_WRITE_BNDRY_512);
+				break;
+			case 1024:
+			default:
+				tp->dma_rwctrl |=
+					(DMA_RWCTRL_READ_BNDRY_512 |
+					 DMA_RWCTRL_WRITE_BNDRY_512);
+				break;
+			};
+		}
 	}
 #endif
 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 20:26     ` David S. Miller
  2005-05-10 20:14       ` Michael Chan
@ 2005-05-10 20:32       ` David S. Miller
  1 sibling, 0 replies; 18+ messages in thread
From: David S. Miller @ 2005-05-10 20:32 UTC (permalink / raw)
  To: mchan; +Cc: mperaya, netdev

From: "David S. Miller" <davem@davemloft.net>
Subject: Re: tg3 support broken on PPC, a workaround
Date: Tue, 10 May 2005 13:26:42 -0700 (PDT)

> +			case 1024:
> +			default:
> +				tp->dma_rwctrl |=
> +					(DMA_RWCTRL_READ_BNDRY_512 |
> +					 DMA_RWCTRL_WRITE_BNDRY_512);
> +				break;

This should use the _1024 bit values, of course.
I've fixed that in my local copy.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 21:23         ` David S. Miller
@ 2005-05-10 20:49           ` Michael Chan
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Chan @ 2005-05-10 20:49 UTC (permalink / raw)
  To: David S.Miller; +Cc: mperaya, netdev

On Tue, 2005-05-10 at 14:23 -0700, David S.Miller wrote:
> From: "Michael Chan" <mchan@broadcom.com>
> Subject: Re: tg3 support broken on PPC, a workaround
> Date: Tue, 10 May 2005 13:14:30 -0700
> 
> > I don't think target-initiated disconnects will waste PCI bandwidth
> > compared to master-initiated terminations. In both cases, you see the
> > same DMA bursts across the bus, only the termination of each burst is
> > different.
> 
> I think it does Michael.  Performance on sparc64 went non-trivially up
> when I added the read/write boundary settings initially long ago.
> 
> You have the extra phase where the tg3 tries to start the DMA of the
> next cacheline, and that is where unnecessary time is lost.  I think
> it's about 2 clocks you lose if the PCI controller disconnects instead
> of tg3.
> 
> Tigon3 will drive the data of the next cacheline for 1 cycle and this
> is when the PCI controller will disconnect.  Tigon3 will drop the data
> and respond to the disconnect sometime in the next cycle or so.
> 

You're right. This is Disconnect Without Data and it does cost a few
clock cycles compared to Disconnect With Data or master initiated
termination.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 19:43     ` Michael Chan
@ 2005-05-10 21:15       ` David S. Miller
  0 siblings, 0 replies; 18+ messages in thread
From: David S. Miller @ 2005-05-10 21:15 UTC (permalink / raw)
  To: mchan; +Cc: mperaya, netdev

From: "Michael Chan" <mchan@broadcom.com>
Subject: Re: tg3 support broken on PPC, a workaround
Date: Tue, 10 May 2005 12:43:30 -0700

> There is clear benefit in doing longer bursts when the bridge can handle
> it.

No question.

> It was explained to me that on some risc systems such as ppc, and
> assuming the bridge can handle long DMA bursts, it is still best to
> disconnect at page or cache line boundaries. The reason is that if the
> burst stops at any arbitrary address, the bridge has to refetch the
> cache line and often the mapping information. Disconnecting at multiple
> cache lines is to address this problem while still achieving longer DMA
> bursts.

Ok, I see.

It is apparently causing some kind of trouble for this person
on his PPC system though.  I think we should back down to
single-cacheline on non-X86 until we can really get a grasp
on which machines it is both:

1) beneficial
2) does not corrupt data

And that is what my patch aims to do for the time being.

Even once the problem case is resolved, that setting should
be bracketed in some test that specifically only enables
the longer bursting where it actually helps rather than hinders.
It definitely should not be done unconditionally for all
non-X86 systems.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 20:14       ` Michael Chan
@ 2005-05-10 21:23         ` David S. Miller
  2005-05-10 20:49           ` Michael Chan
  0 siblings, 1 reply; 18+ messages in thread
From: David S. Miller @ 2005-05-10 21:23 UTC (permalink / raw)
  To: mchan; +Cc: mperaya, netdev

From: "Michael Chan" <mchan@broadcom.com>
Subject: Re: tg3 support broken on PPC, a workaround
Date: Tue, 10 May 2005 13:14:30 -0700

> I don't think target-initiated disconnects will waste PCI bandwidth
> compared to master-initiated terminations. In both cases, you see the
> same DMA bursts across the bus, only the termination of each burst is
> different.

I think it does Michael.  Performance on sparc64 went non-trivially up
when I added the read/write boundary settings initially long ago.

You have the extra phase where the tg3 tries to start the DMA of the
next cacheline, and that is where unnecessary time is lost.  I think
it's about 2 clocks you lose if the PCI controller disconnects instead
of tg3.

Tigon3 will drive the data of the next cacheline for 1 cycle and this
is when the PCI controller will disconnect.  Tigon3 will drop the data
and respond to the disconnect sometime in the next cycle or so.

All of this activity will not occur if Tigon3 just ends the data phase
itself when the cacheline boundary is hit.

Or are you talking about PCI reads as opposed to writes?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 19:12   ` David S. Miller
  2005-05-10 19:43     ` Michael Chan
  2005-05-10 20:26     ` David S. Miller
@ 2005-05-10 22:13     ` Rick Jones
  2005-05-10 23:21       ` Grant Grundler
  2 siblings, 1 reply; 18+ messages in thread
From: Rick Jones @ 2005-05-10 22:13 UTC (permalink / raw)
  To: netdev; +Cc: Grant Grundler

David S. Miller wrote:
> We really should be disconnecting at single cacheline boundaries
> on RISC systems.  The PCI controllers on RISC machines are
> going to disconnect the tg3 when it crosses a cache line
> boundary, so all these settings do is waste PCI bandwidth.

It is my understanding that PA-RISC and IA64 controllers behave differently. 
For confirmation one way or the other, I've cc'd someone who could talk about it 
much more cogently than I.

rick jones

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 22:13     ` Rick Jones
@ 2005-05-10 23:21       ` Grant Grundler
  2005-05-10 23:36         ` David S. Miller
  0 siblings, 1 reply; 18+ messages in thread
From: Grant Grundler @ 2005-05-10 23:21 UTC (permalink / raw)
  To: Rick Jones; +Cc: netdev, Grant Grundler

On Tue, May 10, 2005 at 03:13:25PM -0700, Rick Jones wrote:
> David S. Miller wrote:
> >We really should be disconnecting at single cacheline boundaries
> >on RISC systems.  The PCI controllers on RISC machines are
> >going to disconnect the tg3 when it crosses a cache line
> >boundary, so all these settings do is waste PCI bandwidth.
> 
> It is my understanding that PA-RISC and IA64 controllers behave 
> differently. For confirmation one way or the other, I've cc'd someone who 
> could talk about it much more cogently than I.

Yup, thanks rick.

Dave,
HP PCI bus controllers don't disconnect after a cacheline.
The latest "LBA" (aka Mercury) will disconnect on 4k page
boundaries. Alex Williamson confirmed.

Has anyone confirmed PPC, PPC64 and Alpha PCI/PCI-X bus
controllers do the same?

ISTR MMRBC (PCI-X only) allows one to specify
shorter blocks. I'd have to look that up again.

thanks,
grant

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 23:21       ` Grant Grundler
@ 2005-05-10 23:36         ` David S. Miller
  0 siblings, 0 replies; 18+ messages in thread
From: David S. Miller @ 2005-05-10 23:36 UTC (permalink / raw)
  To: iod00d; +Cc: rick.jones2, netdev

From: Grant Grundler <iod00d@hp.com>
Subject: Re: tg3 support broken on PPC, a workaround
Date: Tue, 10 May 2005 16:21:32 -0700

> HP PCI bus controllers don't disconnect after a cacheline.
> The latest "LBA" (aka Mercury) will disconnect on 4k page
> boundaries. Alex Williamson confirmed.

Ok, good data point.

> Has anyone confirmed PPC, PPC64 and Alpha PCI/PCI-X bus
> controllers do the same?
> 
> ISTR MMRBC (PCI-X only) allows one to specify
> shorter blocks. I'd have to look that up again.

BTW, I just noticed this in the NetBSD tigon3 driver:

#ifdef __brokenalpha__
	/*
	 * Must insure that we do not cross an 8K (bytes) boundary
	 * for DMA reads.  Our highest limit is 1K bytes.  This is a
	 * restriction on some ALPHA platforms with early revision
	 * 21174 PCI chipsets, such as the AlphaPC 164lx
	 */
	PCI_SETBIT(sc, BGE_PCI_DMA_RW_CTL, BGE_PCI_READ_BNDRY_1024, 4);
#endif

This whole DMA boundary issue is turning into a very non-trivial one.

And if it is really true that plain PCI boundary crossing cannot be
controlled on non 5700/5701 chips, the tigon3 is certainly not going
to work reliably on systems such as the Alpha mentioned above.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-10 16:52 ` Michael Chan
  2005-05-10 19:12   ` David S. Miller
@ 2005-05-11  6:04   ` Manuel Perez Ayala
  2005-05-11 15:24     ` Michael Chan
  1 sibling, 1 reply; 18+ messages in thread
From: Manuel Perez Ayala @ 2005-05-11  6:04 UTC (permalink / raw)
  To: Michael Chan; +Cc: netdev

>
> I assume you have a 5700 or 5701 as this code that controls the DMA
> boundaries only affects those devices. Please confirm with the lspci
> output or tg3's probing output.
>

This is the lspci output

0000:00:0b.0 Host bridge: Apple Computer Inc. UniNorth 1.5 AGP
0000:00:10.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400] (rev b2)
0001:10:0b.0 Host bridge: Apple Computer Inc. UniNorth 1.5 PCI
0001:10:13.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5700 Gigabit Ethernet (rev 12)
0001:10:17.0 ff00: Apple Computer Inc. KeyLargo Mac I/O (rev 03)
0001:10:18.0 USB Controller: Apple Computer Inc. KeyLargo USB
0001:10:19.0 USB Controller: Apple Computer Inc. KeyLargo USB
0002:20:0b.0 Host bridge: Apple Computer Inc. UniNorth 1.5 Internal PCI
0002:20:0e.0 ffff: Lucent Microelectronics FW323 (rev ff)
0002:20:0f.0 Ethernet controller: Apple Computer Inc. UniNorth GMAC (Sun GEM) (rev 01)

> Can you print out the value of tp->dma_rwctrl in hex just before it is
> written to the register in the line:
>
> tw32(TG3PCI_DMA_RW_CTRL, tp->dma_rwctrl);
>
> Please do this for the working and non-working driver versions.

Value of tp->dma_rwctrl

non-working driver:

On init of driver

tg3.c:v3.10 (September 14, 2004)
PCI: Enabling device 0001:10:13.0 (0014 -> 0016)

tg3: tg3_test_dma #1: 76FF280F

eth1: Tigon3 [partno(BCM95700A6) rev 7102 PHY(5401)] (PCI:33MHz:64-bit)
10/100/1000BaseT Ethernet 00:04:76:3b:51:ae
eth1: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[0] TSOcap[0]

I've added the tp->dma_rwctrl value, in hex, to the kernel messages:
tg3: tg3_test_dma #1: 76FF280F

On init of network interface:

tg3: tg3_reset_hw: 76FF280F

tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.

I've also added the tp->dma_rwctrl value, in hex, in the tg3_reset_hw function:
tg3: tg3_reset_hw: 76FF280F

Working driver:

On init of driver:

tg3.c:v3.10 (September 14, 2004)
PCI: Enabling device 0001:10:13.0 (0014 -> 0016)

tg3: tg3_test_dma #1: 76FF120F

eth1: Tigon3 [partno(BCM95700A6) rev 7102 PHY(5401)] (PCI:33MHz:64-bit)
10/100/1000BaseT Ethernet 00:04:76:3b:51:ae
eth1: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[0] TSOcap[0]

And on init of the network interface:

tg3: tg3_reset_hw: 76FF120F

tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.

Hope this will be useful

Thanks

----------
Manuel Perez Ayala
mperaya@alcazaba.unex.es
Facultad de Biblioteconomía y Documentación
Universidad de Extremadura

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-11  6:04   ` Manuel Perez Ayala
@ 2005-05-11 15:24     ` Michael Chan
  2005-05-12  9:28       ` Manuel Perez Ayala
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Chan @ 2005-05-11 15:24 UTC (permalink / raw)
  To: Manuel Perez Ayala; +Cc: netdev

On Wed, 2005-05-11 at 08:04 +0200, Manuel Perez Ayala wrote:

> Working driver:
> 
> On init of driver:
> 
> tg3.c:v3.10 (September 14, 2004)
> PCI: Enabling device 0001:10:13.0 (0014 -> 0016)
> 
> tg3: tg3_test_dma #1: 76FF120F
> 

This tells me that your cache line size is 32 bytes.

Let's try some experiments and see what works and what doesn't. Please
hardcode the following values in tp->dma_rwctrl before it is written to
the register:

1. DMA read/write boundaries set to 256:

tp->dma_rwctrl = 0x76ff2d0f;

2. DMA read boundary 256, write boundary 32:

tp->dma_rwctrl = 0x76ff150f;

3. DMA read boundary 32, write boundary 256

tp->dma_rwctrl = 0x76ff2a0f;

4. Let's also try without asserting all byte enables:

tp->dma_rwctrl = 0x763f2d0f;

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: tg3 support broken on PPC, a workaround
  2005-05-11 15:24     ` Michael Chan
@ 2005-05-12  9:28       ` Manuel Perez Ayala
  2005-05-12 16:33         ` Rick Jones
  0 siblings, 1 reply; 18+ messages in thread
From: Manuel Perez Ayala @ 2005-05-12  9:28 UTC (permalink / raw)
  To: Michael Chan

Michael Chan <mchan@broadcom.com> wrote:

>
> This tells me that your cache line size is 32 bytes.
>
> Let's try some experiments and see what works and what doesn't. Please
> hardcode the following values in tp->dma_rwctrl before it is written to
> the register:
>

Net performance of the working driver:

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384  16384    10.00     415.08


Tryouts:

> 1. DMA read/write boundaries set to 256:
>
> tp->dma_rwctrl = 0x76ff2d0f;
>

tg3.c:v3.10 (September 14, 2004)
PCI: Enabling device 0001:10:13.0 (0014 -> 0016)
tg3: tg3_test_dma DMA read/write boundaries set to 256: 76FF2D0F
eth1: Tigon3 [partno(BCM95700A6) rev 7102 PHY(5401)] (PCI:33MHz:64-bit)
10/100/1000BaseT Ethernet 00:04:76:3b:51:ae
eth1: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[0] TSOcap[0]

--> DATA CORRUPTION

> 2. DMA read boundary 256, write boundary 32:
>
> tp->dma_rwctrl = 0x76ff150f;
>

tg3.c:v3.10 (September 14, 2004)
PCI: Enabling device 0001:10:13.0 (0014 -> 0016)
tg3: tg3_test_dma DMA read boundary 256, write boundary 32: 76FF150F
eth1: Tigon3 [partno(BCM95700A6) rev 7102 PHY(5401)] (PCI:33MHz:64-bit)
10/100/1000BaseT Ethernet 00:04:76:3b:51:ae
eth1: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[0] TSOcap[0]

--> DATA OK

Net Performance:

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384  16384    10.00     647.19

Net performance has increased by over 50% compared with the working driver's
default setting.

> 3. DMA read boundary 32, write boundary 256
>
> tp->dma_rwctrl = 0x76ff2a0f;
>

tg3.c:v3.10 (September 14, 2004)
PCI: Enabling device 0001:10:13.0 (0014 -> 0016)
tg3: tg3_test_dma DMA read boundary 32, write boundary 256: 76FF2A0F
eth1: Tigon3 [partno(BCM95700A6) rev 7102 PHY(5401)] (PCI:33MHz:64-bit)
10/100/1000BaseT Ethernet 00:04:76:3b:51:ae
eth1: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[0] TSOcap[0]

--> DATA CORRUPTION

No net performance impact.

> 4. Let's also try without asserting all byte enables:
>
> tp->dma_rwctrl = 0x763f2d0f;
>

tg3.c:v3.10 (September 14, 2004)
PCI: Enabling device 0001:10:13.0 (0014 -> 0016)
tg3: tg3_test_dma without asserting all byte enables: 763F2D0F
eth1: Tigon3 [partno(BCM95700A6) rev 7102 PHY(5401)] (PCI:33MHz:64-bit)
10/100/1000BaseT Ethernet 00:04:76:3b:51:ae
eth1: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[0] TSOcap[0]

--> DATA CORRUPTION

Net performance is like case 2 (DMA read boundary 256, write boundary 32): more
than 50% higher, but with data corruption.
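The four hardcoded values can be compared by pulling out the fields that change between them. The sketch below is not taken from tg3.h: the macro names are hypothetical, and the field positions (read boundary in bits 8-10, write boundary in bits 11-13) and the code-to-bytes mapping are assumptions inferred from the four values and their descriptions in this message. Only codes 2 (32 bytes) and 5 (256 bytes) are confirmed by the thread; the other codes are extrapolated, and PCI-X mode may interpret the codes differently.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED field layout, inferred from the values tested in this thread. */
#define RWCTRL_READ_BNDRY_MASK   0x00000700u
#define RWCTRL_READ_BNDRY_SHIFT  8
#define RWCTRL_WRITE_BNDRY_MASK  0x00003800u
#define RWCTRL_WRITE_BNDRY_SHIFT 11

/*
 * Map a 3-bit boundary code to bytes, assuming 0 = disabled and
 * codes 1..7 = 16, 32, ..., 1024 (plain PCI, not PCI-X).
 */
unsigned int bndry_bytes(unsigned int code)
{
    assert(code <= 7);
    return code == 0 ? 0u : 16u << (code - 1);
}

/* DMA read boundary, in bytes, encoded in a dma_rwctrl value. */
unsigned int read_bndry_bytes(uint32_t rwctrl)
{
    return bndry_bytes((rwctrl & RWCTRL_READ_BNDRY_MASK)
                       >> RWCTRL_READ_BNDRY_SHIFT);
}

/* DMA write boundary, in bytes, encoded in a dma_rwctrl value. */
unsigned int write_bndry_bytes(uint32_t rwctrl)
{
    return bndry_bytes((rwctrl & RWCTRL_WRITE_BNDRY_MASK)
                       >> RWCTRL_WRITE_BNDRY_SHIFT);
}

/* Bits that differ between two dma_rwctrl values. */
uint32_t rwctrl_diff(uint32_t a, uint32_t b)
{
    return a ^ b;
}
```

Under these assumptions, 0x76ff150f (the only value that gave correct data) decodes to a read boundary of 256 and a write boundary of 32, and cases 1 and 4 differ only in the bits 0x00c00000.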



I've only tried this with the working driver (2.6.8 with the DMA code from
2.6.7). Do you need me to try the original 2.6.8 as well?

Before my initial report, I tried the Broadcom driver 8.1.55 with no success: it
showed the same kind of data corruption. Then I focused on the tg3 driver.

----------
Manuel Perez Ayala
mperaya@alcazaba.unex.es
Facultad de Biblioteconomía y Documentación
Universidad de Extremadura


* Re: tg3 support broken on PPC, a workaround
  2005-05-12  9:28       ` Manuel Perez Ayala
@ 2005-05-12 16:33         ` Rick Jones
  2005-05-12 18:06           ` Manuel Perez Ayala
  0 siblings, 1 reply; 18+ messages in thread
From: Rick Jones @ 2005-05-12 16:33 UTC (permalink / raw)
  To: Manuel Perez Ayala; +Cc: Michael Chan, netdev

Since both read and write boundaries are being changed, doesn't netperf need to 
be run in each direction?

rick jones


* Re: tg3 support broken on PPC, a workaround
  2005-05-12 16:33         ` Rick Jones
@ 2005-05-12 18:06           ` Manuel Perez Ayala
  0 siblings, 0 replies; 18+ messages in thread
From: Manuel Perez Ayala @ 2005-05-12 18:06 UTC (permalink / raw)
  To: Rick Jones; +Cc: mchan, netdev

Rick Jones <rick.jones2@hp.com> wrote:

> Since both read and write boundaries are being changed, doesn't 
> netperf need to be run in each direction?

I have two twin systems: both are PowerMac G4s with a 3Com 3c996T connected to a
3Com gigabit switch, with the same memory, same disk and same OS. I have compiled
and installed the driver with the hardcoded best DMA configuration on both.

These are the netperf results from one side:

afs02-i:/usr/share/doc/netperf# netperf -H afs01-i
TCP STREAM TEST to afs01-i
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384  16384    10.00     671.71

And these are the netperf results from the other side:

afs01-i:~# netperf -H afs02-i
TCP STREAM TEST to afs02-i
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384  16384    10.00     667.96

And on the local device:

afs01-i:~# netperf -H afs01-i
TCP STREAM TEST to afs01-i
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384  16384    10.00    1834.33

----------
Manuel Perez Ayala
mperaya@alcazaba.unex.es
Facultad de Biblioteconomía y Documentación
Universidad de Extremadura


end of thread, other threads:[~2005-05-12 18:06 UTC | newest]

Thread overview: 18+ messages
2005-05-10  9:33 tg3 support broken on PPC, a workaround Manuel Perez Ayala
2005-05-10 16:52 ` Michael Chan
2005-05-10 19:12   ` David S. Miller
2005-05-10 19:43     ` Michael Chan
2005-05-10 21:15       ` David S. Miller
2005-05-10 20:26     ` David S. Miller
2005-05-10 20:14       ` Michael Chan
2005-05-10 21:23         ` David S. Miller
2005-05-10 20:49           ` Michael Chan
2005-05-10 20:32       ` David S. Miller
2005-05-10 22:13     ` Rick Jones
2005-05-10 23:21       ` Grant Grundler
2005-05-10 23:36         ` David S. Miller
2005-05-11  6:04   ` Manuel Perez Ayala
2005-05-11 15:24     ` Michael Chan
2005-05-12  9:28       ` Manuel Perez Ayala
2005-05-12 16:33         ` Rick Jones
2005-05-12 18:06           ` Manuel Perez Ayala
