linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration

netdev.vger.kernel.org archive mirror

* linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
@ 2012-03-14 17:01 Timo Teras
  2012-03-14 17:15 ` Eric Dumazet
  2012-03-14 21:16 ` Francois Romieu
  0 siblings, 2 replies; 23+ messages in thread
From: Timo Teras @ 2012-03-14 17:01 UTC (permalink / raw)
  To: netdev; +Cc: Francois Romieu

Hi,

I have a router box running linux-3.0.18 (with grsec patches).

with the NIC hardware:
r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
r8169 0000:00:09.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
r8169 0000:00:09.0: (unregistered net_device): no PCI Express capability
r8169 0000:00:09.0: eth0: RTL8169sc/8110sc at 0xf82f8000, 00:30:18:ab:6b:54, XID 18000000 IRQ 18
r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
r8169 0000:00:0b.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
r8169 0000:00:0b.0: (unregistered net_device): no PCI Express capability
r8169 0000:00:0b.0: eth1: RTL8169sc/8110sc at 0xf82fa000, 00:30:18:ab:6b:55, XID 18000000 IRQ 19
r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
r8169 0000:00:0c.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
r8169 0000:00:0c.0: (unregistered net_device): no PCI Express capability
r8169 0000:00:0c.0: eth2: RTL8169sc/8110sc at 0xf82fc000, 00:30:18:ab:6b:56, XID 18000000 IRQ 16

This box is working just as a plain IPv4 router (internal RFC1918
address space) forwarding packets.

It routes basically from eth2 to multiple vlans over bond0 consisting of eth0 and eth1.

I have most hw accel stuff turned off, and "ethtool -k eth0" says:
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

The same applies for all interfaces (except lo).

However, tcpdump on this box indicates that I'm receiving very
long (tcp length more than mtu) incoming packets on eth2, implying that
gso/tso got turned on somehow. eth2 is connected with a cross-over cable
to a similar box running a slightly older kernel, but gso/tso is turned
off there too. Dumping simultaneously on the other side shows
that all packets sent are normal length (they fit mtu 1500), and no
merging was performed earlier.

So it would appear that the router box somehow insists on doing gso/tso,
and sadly it will also mess up on the send path (the incoming merged
packet is forwarded, but sent out short) causing lost segments and
serious performance degradation.

Any pointers on how to debug, fix, or work around this issue?

-Timo

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 17:01 linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration Timo Teras
@ 2012-03-14 17:15 ` Eric Dumazet
  2012-03-14 17:29   ` Timo Teras
  2012-03-14 21:16 ` Francois Romieu
  1 sibling, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2012-03-14 17:15 UTC (permalink / raw)
  To: Timo Teras; +Cc: netdev, Francois Romieu

On Wed, 2012-03-14 at 19:01 +0200, Timo Teras wrote:
> Hi,
> 
> I have a router box running linux-3.0.18 (with grsec patches).
> 
> with the NIC hardware:
> r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> r8169 0000:00:09.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
> r8169 0000:00:09.0: (unregistered net_device): no PCI Express capability
> r8169 0000:00:09.0: eth0: RTL8169sc/8110sc at 0xf82f8000, 00:30:18:ab:6b:54, XID 18000000 IRQ 18
> r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> r8169 0000:00:0b.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
> r8169 0000:00:0b.0: (unregistered net_device): no PCI Express capability
> r8169 0000:00:0b.0: eth1: RTL8169sc/8110sc at 0xf82fa000, 00:30:18:ab:6b:55, XID 18000000 IRQ 19
> r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> r8169 0000:00:0c.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> r8169 0000:00:0c.0: (unregistered net_device): no PCI Express capability
> r8169 0000:00:0c.0: eth2: RTL8169sc/8110sc at 0xf82fc000, 00:30:18:ab:6b:56, XID 18000000 IRQ 16
> 
> This box is working just as a plain IPv4 router (internal RFC1918
> address space) forwarding packets.
> 
> It routes basically from eth2 to multiple vlans over bond0 consisting of eth0 and eth1.
> 
> I have most hw accel stuff turned off, and "ethtool -k eth0" says:
> Offload parameters for eth0:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: off
> tcp segmentation offload: off
> udp fragmentation offload: off
> generic segmentation offload: off
> 
> The same applies for all interfaces (except lo).
> 
> However, tcpdump on this box indicates that I'm receiving very
> long (tcp length more than mtu) incoming packets on eth2 implying that
> gso/tso got turned on somehow. eth2 is connected with cross-over cable
> to similar box running a bit older linux box; but gso/tso is turned off
> there too. When dumping simultaneously on the other side, it indicates
> that all packets sent are normal length, and no merging was performed
> earlier (fits mtu 1500).
> 
> So it would appear that the router box somehow insists on doing gso/tso,
> and sadly it will also mess up on the send path (the incoming merged
> packet is forwarded, but sent out short) causing lost segments and
> serious performance degradation.
> 
> Any pointers how to next debug/fix/workaround this issue?
> 

You are fighting the wrong side ;)

Here, it's GRO doing the aggregation on the receiver.

What kind of problems do you experience because of this?

ethtool -k eth2

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 17:15 ` Eric Dumazet
@ 2012-03-14 17:29   ` Timo Teras
  2012-03-14 18:25     ` Eric Dumazet
  2012-03-14 19:29     ` Ben Hutchings
  0 siblings, 2 replies; 23+ messages in thread
From: Timo Teras @ 2012-03-14 17:29 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Francois Romieu

On Wed, 14 Mar 2012 10:15:14 -0700 Eric Dumazet
<eric.dumazet@gmail.com> wrote:

> On Wed, 2012-03-14 at 19:01 +0200, Timo Teras wrote:
> > Hi,
> > 
> > I have a router box running linux-3.0.18 (with grsec patches).
> > 
> > with the NIC hardware:
> > r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> > r8169 0000:00:09.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
> > r8169 0000:00:09.0: (unregistered net_device): no PCI Express
> > capability r8169 0000:00:09.0: eth0: RTL8169sc/8110sc at
> > 0xf82f8000, 00:30:18:ab:6b:54, XID 18000000 IRQ 18 r8169 Gigabit
> > Ethernet driver 2.3LK-NAPI loaded r8169 0000:00:0b.0: PCI INT A ->
> > GSI 19 (level, low) -> IRQ 19 r8169 0000:00:0b.0: (unregistered
> > net_device): no PCI Express capability r8169 0000:00:0b.0: eth1:
> > RTL8169sc/8110sc at 0xf82fa000, 00:30:18:ab:6b:55, XID 18000000 IRQ
> > 19 r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded r8169
> > 0000:00:0c.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 r8169
> > 0000:00:0c.0: (unregistered net_device): no PCI Express capability
> > r8169 0000:00:0c.0: eth2: RTL8169sc/8110sc at 0xf82fc000,
> > 00:30:18:ab:6b:56, XID 18000000 IRQ 16
> > 
> > This box is working just as a plain IPv4 router (internal RFC1918
> > address space) forwarding packets.
> > 
> > It routes basically from eth2 to multiple vlans over bond0
> > consisting of eth0 and eth1.
> > 
> > I have most hw accel stuff turned off, and "ethtool -k eth0" says:
> > Offload parameters for eth0:
> > rx-checksumming: on
> > tx-checksumming: on
> > scatter-gather: off
> > tcp segmentation offload: off
> > udp fragmentation offload: off
> > generic segmentation offload: off
> > 
> > The same applies for all interfaces (except lo).
> > 
> > However, tcpdump on this box indicates that I'm receiving very
> > long (tcp length more than mtu) incoming packets on eth2 implying
> > that gso/tso got turned on somehow. eth2 is connected with
> > cross-over cable to similar box running a bit older linux box; but
> > gso/tso is turned off there too. When dumping simultaneously on the
> > other side, it indicates that all packets sent are normal length,
> > and no merging was performed earlier (fits mtu 1500).
> > 
> > So it would appear that the router box somehow insists on doing
> > gso/tso, and sadly it will also mess up on the send path (the
> > incoming merged packet is forwarded, but sent out short) causing
> > lost segments and serious performance degradation.
> > 
> > Any pointers how to next debug/fix/workaround this issue?
> > 
> 
> You are fighting the wrong side ;)
> 
> Here, it's GRO doing the aggregation on the receiver.

Yes, I figured this much. But I have explicitly turned GRO off and it's
still happening.

> What kind of problems do you experience because of this?

I'm getting lost packets (the non-first TCP segments off the GRO merged
packet). This causes serious TCP speed degradation (should get 10MB/s
through 100mbit/s link; but I'm getting only 2-3MB/s). Doing the same
transfer on the next hop router gives full speed, so the problem is
definitely on this router and due to GRO badness.
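
As a rough sanity check of the 10MB/s figure, the upper bound on TCP
goodput for a 100 Mbit/s link can be estimated from per-frame overhead.
This is a minimal sketch, assuming untagged Ethernet frames, IPv4 and TCP
headers without options, and a 1500 byte MTU; the function name is made up
for illustration, and VLAN tags or TCP timestamps would lower the result
slightly.

```python
# Back-of-the-envelope TCP goodput bound (illustrative assumptions only).
LINK_BPS = 100_000_000           # 100 Mbit/s link, as in the message above
MTU = 1500                       # IP MTU mentioned in the thread
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP headers, assuming no options
WIRE_OVERHEAD = 14 + 4 + 8 + 12  # Ethernet header, FCS, preamble/SFD, inter-frame gap


def max_goodput_bytes_per_s(link_bps=LINK_BPS, mtu=MTU):
    """Return the TCP payload rate in bytes/s for back-to-back full-size frames."""
    payload = mtu - IP_TCP_HEADERS   # TCP payload carried per frame
    on_wire = mtu + WIRE_OVERHEAD    # bytes of link time each frame occupies
    return link_bps / 8 * payload / on_wire


print(f"{max_goodput_bytes_per_s() / 1e6:.2f} MB/s")
```

Under these assumptions the bound comes out a little under 12 MB/s, so
2-3MB/s is well below what the link itself allows.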

I also remember this working before, so this seems to be a regression
introduced by upgrading from a 2.6.35.x kernel or something like that.

> ethtool -k eth2

gro off. I am even trying now with:

Offload parameters for eth2:
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

Additionally, I'm looking at my other router boxes with the same hardware
but different kernel versions. It looks like all of them are acting as if
GRO is enabled, even though it's turned off by ethtool.

I can verify that 2.6.35.8, 2.6.38.8, and 3.0.18 (all of these with
grsec patch) are doing GRO for this r8169 hardware, even though it's
configured OFF on all boxes.

There seems to be no performance issues in 2.6.35.8 kernel. This would
indicate that the incoming GRO packets are properly handled and
segmented (likely by software) on the path out. However, I'm also
having issues with the 2.6.38.8 box, and badness on the GRO send path
seems to be the cause. Not to mention that GRO is happening
even though it's turned off.

Additionally, it seems that the 2.6.38.8 and 3.0.18 kernels
have the performance issues even with a locally terminated TCP
connection. So it's not limited to the forward path. The latest good
kernel I can verify is 2.6.35.x.

- Timo

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 17:29   ` Timo Teras
@ 2012-03-14 18:25     ` Eric Dumazet
  2012-03-14 19:29     ` Ben Hutchings
  1 sibling, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2012-03-14 18:25 UTC (permalink / raw)
  To: Timo Teras; +Cc: netdev, Francois Romieu

On Wed, 2012-03-14 at 19:29 +0200, Timo Teras wrote:
> On Wed, 14 Mar 2012 10:15:14 -0700 Eric Dumazet
> <eric.dumazet@gmail.com> wrote:
> 
> > On Wed, 2012-03-14 at 19:01 +0200, Timo Teras wrote:
> > > Hi,
> > > 
> > > I have a router box running linux-3.0.18 (with grsec patches).
> > > 
> > > with the NIC hardware:
> > > r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> > > r8169 0000:00:09.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
> > > r8169 0000:00:09.0: (unregistered net_device): no PCI Express
> > > capability r8169 0000:00:09.0: eth0: RTL8169sc/8110sc at
> > > 0xf82f8000, 00:30:18:ab:6b:54, XID 18000000 IRQ 18 r8169 Gigabit
> > > Ethernet driver 2.3LK-NAPI loaded r8169 0000:00:0b.0: PCI INT A ->
> > > GSI 19 (level, low) -> IRQ 19 r8169 0000:00:0b.0: (unregistered
> > > net_device): no PCI Express capability r8169 0000:00:0b.0: eth1:
> > > RTL8169sc/8110sc at 0xf82fa000, 00:30:18:ab:6b:55, XID 18000000 IRQ
> > > 19 r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded r8169
> > > 0000:00:0c.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 r8169
> > > 0000:00:0c.0: (unregistered net_device): no PCI Express capability
> > > r8169 0000:00:0c.0: eth2: RTL8169sc/8110sc at 0xf82fc000,
> > > 00:30:18:ab:6b:56, XID 18000000 IRQ 16
> > > 
> > > This box is working just as a plain IPv4 router (internal RFC1918
> > > address space) forwarding packets.
> > > 
> > > It routes basically from eth2 to multiple vlans over bond0
> > > consisting of eth0 and eth1.
> > > 
> > > I have most hw accel stuff turned off, and "ethtool -k eth0" says:
> > > Offload parameters for eth0:
> > > rx-checksumming: on
> > > tx-checksumming: on
> > > scatter-gather: off
> > > tcp segmentation offload: off
> > > udp fragmentation offload: off
> > > generic segmentation offload: off
> > > 
> > > The same applies for all interfaces (except lo).
> > > 
> > > However, tcpdump on this box indicates that I'm receiving very
> > > long (tcp length more than mtu) incoming packets on eth2 implying
> > > that gso/tso got turned on somehow. eth2 is connected with
> > > cross-over cable to similar box running a bit older linux box; but
> > > gso/tso is turned off there too. When dumping simultaneously on the
> > > other side, it indicates that all packets sent are normal length,
> > > and no merging was performed earlier (fits mtu 1500).
> > > 
> > > So it would appear that the router box somehow insists on doing
> > > gso/tso, and sadly it will also mess up on the send path (the
> > > incoming merged packet is forwarded, but sent out short) causing
> > > lost segments and serious performance degradation.
> > > 
> > > Any pointers how to next debug/fix/workaround this issue?
> > > 
> > 
> > You are fighting the wrong side ;)
> > 
> > Here, it's GRO doing the aggregation on the receiver.
> 
> Yes, I figured this much. But I have explicitly turned GRO off and it's
> still happening.
> 
> > What kind of problems do you experience because of this?
> 
> I'm getting lost packets (the non-first TCP segments off the GRO merged
> packet). This causes serious TCP speed degradation (should get 10MB/s
> through 100mbit/s link; but I'm getting only 2-3MB/s). Doing the same
> transfer on the next hop router gives full speed, so the problem is
> definitely on this router and due to GRO badness.

Then this is something completely unrelated to GRO. 2-3 MB/s sounds more
like a tcp issue.

> 
> I also remember this working before, so this seems a regression from
> upgrading 2.6.35.x kernel or something like that.
> 
> > ethtool -k eth2
> 
> gro off. I am even trying now with:
> 
> Offload parameters for eth2:
> rx-checksumming: off
> tx-checksumming: off
> scatter-gather: off
> tcp segmentation offload: off
> udp fragmentation offload: off
> generic segmentation offload: off
> 

I can't see how you can then receive tcp frames bigger than MTU.

> Additionally, I'm looking at my other router boxes with same hardware
> but different kernel versions. Looks that all of them are acting as GRO
> is enabled, even though it's turned off by ethtool.
> 
> I can verify that 2.6.35.8, 2.6.38.8, and 3.0.18 (all of these with
> grsec patch) are doing GRO for this r8169 hardware, even though it's
> configured OFF on all boxes.
> 
> There seems to be no performance issues in 2.6.35.8 kernel. This would
> indicate that the incoming GRO packets are properly handled and
> segmented (likely by software) on the path out. However, I'm also
> having issues with the 2.6.38.8 box, and badness on GRO send path
> seems to be the cause. And of course to mention that GRO is happening
> even though it's turned off.
> 
> Additionally, it seems that at the 2.6.38.8 and 3.0.18 kernels are
> having the performance issues even if it's locally terminated TCP
> connection. So it's not limited to the forward path. The latest good
> kernel I can verify is 2.6.35.x.
> 
> - Timo

If traffic is locally terminated:

netstat -s 

should give us some input.

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 17:29   ` Timo Teras
  2012-03-14 18:25     ` Eric Dumazet
@ 2012-03-14 19:29     ` Ben Hutchings
  2012-03-14 19:51       ` Timo Teras
  1 sibling, 1 reply; 23+ messages in thread
From: Ben Hutchings @ 2012-03-14 19:29 UTC (permalink / raw)
  To: Timo Teras; +Cc: Eric Dumazet, netdev, Francois Romieu

On Wed, 2012-03-14 at 19:29 +0200, Timo Teras wrote:
[...]
> gro off. I am even trying now with:
> 
> Offload parameters for eth2:
> rx-checksumming: off
> tx-checksumming: off
> scatter-gather: off
> tcp segmentation offload: off
> udp fragmentation offload: off
> generic segmentation offload: off
[...]

GRO isn't even reported there!  Apparently you need a newer version of
ethtool.
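
One way to catch this kind of blind spot is to check whether the tool
lists GRO at all before trusting its output. The following is a minimal
sketch, assuming that newer ethtool releases print a line named
`generic-receive-offload` (that name is an assumption here; it does not
appear in the output quoted above). The function names are hypothetical.

```python
def parse_offloads(text):
    """Parse `ethtool -k` style output into {normalized-name: bool}."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "Offload parameters for ..." header and anything that is
        # not an on/off setting. Tolerate trailing notes after the state.
        if not sep or value.split(" ")[0] not in ("on", "off"):
            continue
        # Normalize "generic segmentation offload" and
        # "generic-segmentation-offload" to the same key.
        key = name.strip().lower().replace(" ", "-")
        features[key] = value.startswith("on")
    return features


def gro_state(text):
    """Return True/False if GRO is reported, or None if it is not listed."""
    return parse_offloads(text).get("generic-receive-offload")


# Output quoted above from the old ethtool: no GRO line at all.
SAMPLE = """Offload parameters for eth2:
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off"""

print(gro_state(SAMPLE))
```

A result of `None` means the tool did not report GRO, which is different
from GRO being off.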

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 19:29     ` Ben Hutchings
@ 2012-03-14 19:51       ` Timo Teras
  2012-03-14 20:12         ` Eric Dumazet
  0 siblings, 1 reply; 23+ messages in thread
From: Timo Teras @ 2012-03-14 19:51 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Eric Dumazet, netdev, Francois Romieu

On Wed, 14 Mar 2012 19:29:14 +0000 Ben Hutchings
<bhutchings@solarflare.com> wrote:

> On Wed, 2012-03-14 at 19:29 +0200, Timo Teras wrote:
> [...]
> > gro off. I am even trying now with:
> > 
> > Offload parameters for eth2:
> > rx-checksumming: off
> > tx-checksumming: off
> > scatter-gather: off
> > tcp segmentation offload: off
> > udp fragmentation offload: off
> > generic segmentation offload: off
> [...]
> 
> GRO isn't even reported there!  Apparently you need a newer version of
> ethtool.

Very good point. I thought gso also enabled gro, but seems that my
ethtool was old.

And GRO was enabled along with some other stuff. Turning GRO off made
my tcp performance immediately a lot better; jumped from 2MB/s to 8MB/s
or so (not ideal yet, though; but the remainder of the difference could
be related to other issue).

So something is definitely broken in 3.0.x with GRO enabled, but GSO off.

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 19:51       ` Timo Teras
@ 2012-03-14 20:12         ` Eric Dumazet
  2012-03-14 20:33           ` Timo Teras
  0 siblings, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2012-03-14 20:12 UTC (permalink / raw)
  To: Timo Teras; +Cc: Ben Hutchings, netdev, Francois Romieu

On Wed, 2012-03-14 at 21:51 +0200, Timo Teras wrote:

> Very good point. I thought gso also enabled gro, but seems that my
> ethtool was old.
> 
> And GRO was enabled along with some other stuff. Turning GRO off made
> my tcp performance immediately a lot better; jumped from 2MB/s to 8MB/s
> or so (not ideal yet, though; but the remainder of the difference could
> be related to other issue).
> 
> So something is definitely broken in 3.0.x with GRO enabled, but GSO off.

"ifconfig eth2 ; netstat -s" can really help, I suspect tcp stack drops

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 20:12         ` Eric Dumazet
@ 2012-03-14 20:33           ` Timo Teras
  2012-03-14 20:52             ` Eric Dumazet
  2012-03-14 20:53             ` Francois Romieu
  0 siblings, 2 replies; 23+ messages in thread
From: Timo Teras @ 2012-03-14 20:33 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Ben Hutchings, netdev, Francois Romieu

On Wed, 14 Mar 2012 13:12:45 -0700 Eric Dumazet
<eric.dumazet@gmail.com> wrote:

> On Wed, 2012-03-14 at 21:51 +0200, Timo Teras wrote:
> 
> > Very good point. I thought gso also enabled gro, but seems that my
> > ethtool was old.
> > 
> > And GRO was enabled along with some other stuff. Turning GRO off
> > made my tcp performance immediately a lot better; jumped from 2MB/s
> > to 8MB/s or so (not ideal yet, though; but the remainder of the
> > difference could be related to other issue).
> > 
> > So something is definitely broken in 3.0.x with GRO enabled, but GSO
> > off.
> 
> "ifconfig eth2 ; netstat -s" can really help, I suspect tcp stack
> drops

Output after doing several wgets that show "bad performance":

# ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:30:18:AB:6B:56  
          inet addr:10.26.0.2  Bcast:0.0.0.0  Mask:255.255.255.252
          inet6 addr: fe80::230:18ff:feab:6b56/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:32334060 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18520452 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1775070027 (1.6 GiB)  TX bytes:1962364861 (1.8 GiB)
          Interrupt:16 Base address:0xc000 

# ethtool -S eth2
NIC statistics:
     tx_packets: 2069391193
     rx_packets: 3245815642
     tx_errors: 0
     rx_errors: 645238
     rx_missed: 31414
     align_errors: 0
     tx_single_collisions: 0
     tx_multi_collisions: 0
     unicast: 3245815640
     broadcast: 2
     multicast: 0
     tx_aborted: 0
     tx_underrun: 0

netstat is busybox, so no -s. But /proc/net/netstat as parsed looks
like:
SyncookiesSent 0 
SyncookiesRecv 0 
SyncookiesFailed 3 
EmbryonicRsts 0 
PruneCalled 0 
RcvPruned 0 
OfoPruned 0 
OutOfWindowIcmps 0 
LockDroppedIcmps 0 
ArpFilter 0 
TW 15 
TWRecycled 0 
TWKilled 0 
PAWSPassive 0 
PAWSActive 0 
PAWSEstab 0 
DelayedACKs 564 
DelayedACKLocked 5 
DelayedACKLost 0 
ListenOverflows 0 
ListenDrops 0 
TCPPrequeued 73 
TCPDirectCopyFromBacklog 605264 
TCPDirectCopyFromPrequeue 15961 
TCPPrequeueDropped 0 
TCPHPHits 191774 
TCPHPHitsToUser 425 
TCPPureAcks 19228 
TCPHPAcks 112359 
TCPRenoRecovery 0 
TCPSackRecovery 2 
TCPSACKReneging 0 
TCPFACKReorder 0 
TCPSACKReorder 4 
TCPRenoReorder 0 
TCPTSReorder 2 
TCPFullUndo 2 
TCPPartialUndo 107 
TCPDSACKUndo 0 
TCPLossUndo 0 
TCPLoss 0 
TCPLostRetransmit 0 
TCPRenoFailures 0 
TCPSackFailures 0 
TCPLossFailures 0 
TCPFastRetrans 2 
TCPForwardRetrans 2 
TCPSlowStartRetrans 0 
TCPTimeouts 0 
TCPRenoRecoveryFail 0 
TCPSackRecoveryFail 0 
TCPSchedulerFailed 0 
TCPRcvCollapsed 0 
TCPDSACKOldSent 0 
TCPDSACKOfoSent 0 
TCPDSACKRecv 4 
TCPDSACKOfoRecv 0 
TCPAbortOnSyn 0 
TCPAbortOnData 6 
TCPAbortOnClose 4 
TCPAbortOnMemory 0 
TCPAbortOnTimeout 0 
TCPAbortOnLinger 0 
TCPAbortFailed 0 
TCPMemoryPressures 0 
TCPSACKDiscard 0 
TCPDSACKIgnoredOld 0 
TCPDSACKIgnoredNoUndo 0 
TCPSpuriousRTOs 0 
TCPMD5NotFound 0 
TCPMD5Unexpected 0 
TCPSackShifted 0 
TCPSackMerged 0 
TCPSackShiftFallback 4484 
TCPBacklogDrop 0 
TCPMinTTLDrop 0 
TCPDeferAcceptDrop 0 
IPReversePathFilter 4 
TCPTimeWaitOverflow 0 
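
For boxes without `netstat -s`, a name/value listing like the one above
can be produced from the raw file. /proc/net/netstat stores each group as
a header line of counter names followed by a line of values with the same
prefix. The following is a minimal sketch of such a parser; the function
names and the shortened sample are illustrative, not the script actually
used here.

```python
def parse_proc_netstat(text):
    """Turn /proc/net/netstat content into {"Prefix.Counter": int}."""
    counters = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Lines come in pairs: "TcpExt: name name ..." then "TcpExt: 1 2 ...".
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, numbers = values.split(":", 1)
        if prefix != value_prefix:
            raise ValueError(f"mismatched lines: {prefix!r} vs {value_prefix!r}")
        for name, number in zip(names.split(), numbers.split()):
            counters[f"{prefix}.{name}"] = int(number)
    return counters


def nonzero(counters):
    """Keep only counters that have moved, which is usually what matters."""
    return {name: value for name, value in counters.items() if value}


# Shortened sample in the kernel's two-line layout, using values from the
# listing above. On a live system, read the text from /proc/net/netstat.
SAMPLE = (
    "TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed TCPSackRecovery\n"
    "TcpExt: 0 0 3 2\n"
)

for name, value in nonzero(parse_proc_netstat(SAMPLE)).items():
    print(name, value)
```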

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 20:33           ` Timo Teras
@ 2012-03-14 20:52             ` Eric Dumazet
  2012-03-14 20:53             ` Francois Romieu
  1 sibling, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2012-03-14 20:52 UTC (permalink / raw)
  To: Timo Teras; +Cc: Ben Hutchings, netdev, Francois Romieu

On Wed, 2012-03-14 at 22:33 +0200, Timo Teras wrote:

> # ethtool -S eth2
> NIC statistics:
>      tx_packets: 2069391193
>      rx_packets: 3245815642
>      tx_errors: 0
>      rx_errors: 645238
>      rx_missed: 31414

So many packets in your tests. Did you freshly reboot your machine before
doing them?

Is your cpu fully loaded when doing your wget?

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 20:33           ` Timo Teras
  2012-03-14 20:52             ` Eric Dumazet
@ 2012-03-14 20:53             ` Francois Romieu
  2012-03-15  6:06               ` Timo Teras
  1 sibling, 1 reply; 23+ messages in thread
From: Francois Romieu @ 2012-03-14 20:53 UTC (permalink / raw)
  To: Timo Teras; +Cc: Eric Dumazet, Ben Hutchings, netdev

Timo Teras <timo.teras@iki.fi> :
[...]
> # ethtool -S eth2
> NIC statistics:
>      tx_packets: 2069391193
>      rx_packets: 3245815642
>      tx_errors: 0
>      rx_errors: 645238
>      rx_missed: 31414

It does not look like stuff for the higher layers guys.

Can you tshark -w foobar on the sender side and
'while : ; do sleep 1; ethtool -S eth2 >> glop; done' on the receiver
during a bad wget (a big zero filled file should compress well).
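
The once-per-second dumps collected in glop are easiest to read as
per-interval deltas. The following is a minimal sketch, assuming the file
is a plain concatenation of `ethtool -S` outputs that each start with
"NIC statistics:". The function names are hypothetical, and the second
sample below is invented purely for illustration.

```python
def parse_samples(text):
    """Split concatenated `ethtool -S` dumps into a list of counter dicts."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if line == "NIC statistics:":
            samples.append({})          # a new dump starts here
        elif ":" in line and samples:
            name, value = line.split(":", 1)
            samples[-1][name.strip()] = int(value)
    return samples


def deltas(samples, counter):
    """Return the change of one counter between consecutive samples."""
    values = [sample[counter] for sample in samples]
    return [after - before for before, after in zip(values, values[1:])]


# First dump uses numbers from the thread; the second one is made up.
GLOP = """NIC statistics:
     rx_errors: 645238
     rx_missed: 31414
NIC statistics:
     rx_errors: 645250
     rx_missed: 31414
"""

samples = parse_samples(GLOP)
print("rx_errors per interval:", deltas(samples, "rx_errors"))
print("rx_missed per interval:", deltas(samples, "rx_missed"))
```

Intervals where rx_errors jumps can then be lined up against the
timestamps in the sender-side capture.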

-- 
Ueimor

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 17:01 linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration Timo Teras
  2012-03-14 17:15 ` Eric Dumazet
@ 2012-03-14 21:16 ` Francois Romieu
  1 sibling, 0 replies; 23+ messages in thread
From: Francois Romieu @ 2012-03-14 21:16 UTC (permalink / raw)
  To: Timo Teras; +Cc: netdev

Timo Teras <timo.teras@iki.fi> :
> I have a router box running linux-3.0.18 (with grsec patches).
[...]
> Any pointers how to next debug/fix/workaround this issue?

No difference without grsec (test version) patch ?

-- 
Ueimor

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-14 20:53             ` Francois Romieu
@ 2012-03-15  6:06               ` Timo Teras
  2012-03-15 15:11                 ` Timo Teras
  0 siblings, 1 reply; 23+ messages in thread
From: Timo Teras @ 2012-03-15  6:06 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Eric Dumazet, Ben Hutchings, netdev

On Wed, 14 Mar 2012 21:53:19 +0100 Francois Romieu
<romieu@fr.zoreil.com> wrote:

> Timo Teras <timo.teras@iki.fi> :
> [...]
> > # ethtool -S eth2
> > NIC statistics:
> >      tx_packets: 2069391193
> >      rx_packets: 3245815642
> >      tx_errors: 0
> >      rx_errors: 645238
> >      rx_missed: 31414
> 
> It does not look like stuff for the higher layers guys.
> 
> Can you tshark -w foobar on the sender side and
> 'while : ; do sleep 1; ethtool -S eth2 >> glop; done' on the receiver
> during a bad wget (a big zero filled file should compress well).

Indeed.

It seems that my earlier tests of the "GRO off" effect were mistaken
(I accidentally used a proxy, and that gave the illusion that things were
working. Whoops.)

So far I changed the cross-over cable and it didn't help. However,
forcing the NIC to 100mbit/full-duplex mode fixes the rx_errors. It
seems that something bad is happening in the gigabit mode.

# ethtool eth2
Settings for eth2:
	Supported ports: [ TP MII ]
	Supported link modes:   10baseT/Half 10baseT/Full 
	                        100baseT/Half 100baseT/Full 
	                        1000baseT/Half 1000baseT/Full 
	Supported pause frame use: No
	Supports auto-negotiation: Yes
	Advertised link modes:  100baseT/Full 
	Advertised pause frame use: Symmetric Receive-only
	Advertised auto-negotiation: Yes
	Link partner advertised link modes:  10baseT/Half 10baseT/Full 
	                                     100baseT/Half 100baseT/Full 
	Link partner advertised pause frame use: Symmetric Receive-only
	Link partner advertised auto-negotiation: Yes
	Speed: 100Mb/s
	Duplex: Full
	Port: MII
	PHYAD: 0
	Transceiver: internal
	Auto-negotiation: on
	Supports Wake-on: pumbg
	Wake-on: g
	Current message level: 0x00000033 (51)
			       drv probe ifdown ifup
	Link detected: yes


I wonder if it's using pause frames and that's messing things up. Seems
that I can't turn it off, though.

I can also double check my cables, though it is factory made Cat-5E
cross-over cable; and happens with two different cables.

-Timo

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-15  6:06               ` Timo Teras
@ 2012-03-15 15:11                 ` Timo Teras
  2012-03-15 16:11                   ` Eric Dumazet
  2012-03-15 19:11                   ` Francois Romieu
  0 siblings, 2 replies; 23+ messages in thread
From: Timo Teras @ 2012-03-15 15:11 UTC (permalink / raw)
  To: Timo Teras; +Cc: Francois Romieu, Eric Dumazet, Ben Hutchings, netdev

On Thu, 15 Mar 2012 08:06:35 +0200 Timo Teras <timo.teras@iki.fi> wrote:

> On Wed, 14 Mar 2012 21:53:19 +0100 Francois Romieu
> <romieu@fr.zoreil.com> wrote:
> 
> > Timo Teras <timo.teras@iki.fi> :
> > [...]
> > > # ethtool -S eth2
> > > NIC statistics:
> > >      tx_packets: 2069391193
> > >      rx_packets: 3245815642
> > >      tx_errors: 0
> > >      rx_errors: 645238
> > >      rx_missed: 31414
> > 
> > It does not look like stuff for the higher layers guys.
> > 
> > Can you tshark -w foobar on the sender side and
> > 'while : ; do sleep 1; ethtool -S eth2 >> glop; done' on the
> > receiver during a bad wget (a big zero filled file should compress
> > well).
> 
> Indeed.
> 
> It seems that my earlier test about the "GRO off" effect were mistaken
> (I used accidentally proxy, and that gave the illusion that things are
> working. Whoops.)
> 
> So far I changed the cross-over cable and it didn't help. However,
> forcing the NIC to 100mbit/full-duplex mode fixes the rx_errors. It
> seems that something bad is happening in the gigabit mode.
> 
> I wonder if it's using pause frames and that's messing things up.
> Seems that I can't turn it off, though.
> 
> I can also double check my cables, though it is factory made Cat-5E
> cross-over cable; and happens with two different cables.

Ok. So far I have two of these boxes with the same r8169 hardware. Both
generate bad packets on transmit only; and on both 3 nic systems it's
the middle eth1 nic. The symptoms are identical: in 1 Gbit/s mode I have
minor packet loss, whereas 100 Mbit/s mode seems to work just fine.

The first box, the one I've been talking about so far, is connected to
another similar box, as mentioned. The r8169 there reports rx_errors.
The cable is ok; I've tried with two different ones.

The other broken box is connected to a HP ProCurve 4202vl-48G, and the
switch is reporting drops due to FCS Rx errors.

So I have two broken pieces of hardware, or there is a driver bug.

I'll try upgrading the kernel on the sender box to the 3.0.x series and
see if that fixes anything. Suggestions for further testing would be
appreciated.

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-15 15:11                 ` Timo Teras
@ 2012-03-15 16:11                   ` Eric Dumazet
  2012-03-15 18:47                     ` Timo Teras
  2012-03-15 19:11                   ` Francois Romieu
  1 sibling, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2012-03-15 16:11 UTC (permalink / raw)
  To: Timo Teras; +Cc: Francois Romieu, Ben Hutchings, netdev

On Thu, 2012-03-15 at 17:11 +0200, Timo Teras wrote:
> On Thu, 15 Mar 2012 08:06:35 +0200 Timo Teras <timo.teras@iki.fi> wrote:
> 
> > On Wed, 14 Mar 2012 21:53:19 +0100 Francois Romieu
> > <romieu@fr.zoreil.com> wrote:
> > 
> > > Timo Teras <timo.teras@iki.fi> :
> > > [...]
> > > > # ethtool -S eth2
> > > > NIC statistics:
> > > >      tx_packets: 2069391193
> > > >      rx_packets: 3245815642
> > > >      tx_errors: 0
> > > >      rx_errors: 645238
> > > >      rx_missed: 31414
> > > 
> > > It does not look like stuff for the higher layers guys.
> > > 
> > > Can you tshark -w foobar on the sender side and
> > > 'while : ; do sleep 1; ethtool -S eth2 >> glop; done' on the
> > > receiver during a bad wget (a big zero filled file should compress
> > > well).
> > 
> > Indeed.
> > 
> > It seems that my earlier test about the "GRO off" effect were mistaken
> > (I used accidentally proxy, and that gave the illusion that things are
> > working. Whoops.)
> > 
> > So far I changed the cross-over cable and it didn't help. However,
> > forcing the NIC to 100mbit/full-duplex mode fixes the rx_errors. It
> > seems that something bad is happening in the gigabit mode.
> > 
> > I wonder if it's using pause frames and that's messing things up.
> > Seems that I can't turn it off, though.
> > 
> > I can also double check my cables, though it is factory made Cat-5E
> > cross-over cable; and happens with two different cables.
> 
> Ok. So far I have two of these boxes with same r8169 hardware. Both
> generate bad packets on transmit only; and on both 3 nic systems it's
> the middle eth1 nic. The symptoms are identical: in 1GB mode I have
> minor packet loss, whereas 100Mbit/s mode seems to work just fine.
> 
> The first box, the one I've been talking about so far, is, as mentioned,
> connected to another similar box. The r8169 there reports rx_errors.
> The cable is ok; I've tried with two different ones.
> 
> The other broken box is connected to a HP ProCurve 4202vl-48G, and the
> switch is reporting drops due to FCS Rx errors.
> 
> So I have two broken pieces of hardware, or there is a driver bug.
> 
> I'll try upgrading my kernel to 3.0.x series on the sender box and see
> if it's fixing anything. Suggestions for further testing would be
> appreciated.

r8169 has to make an additional copy of incoming frames, because of
a hardware flaw and security requirements.

This was added in 2.6.37 or 2.6.38, I don't remember exactly.

So your CPU might be too slow to handle the load at 1Gb speed.

If you have one flow, there is nothing to do, but if your workload has
several flows and your machine is SMP, you can try RPS/RFS as documented
in Documentation/networking/scaling.txt

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-15 16:11                   ` Eric Dumazet
@ 2012-03-15 18:47                     ` Timo Teras
  0 siblings, 0 replies; 23+ messages in thread
From: Timo Teras @ 2012-03-15 18:47 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Francois Romieu, Ben Hutchings, netdev

On Thu, 15 Mar 2012 09:11:46 -0700 Eric Dumazet
<eric.dumazet@gmail.com> wrote:

> On Thu, 2012-03-15 at 17:11 +0200, Timo Teras wrote:
> > On Thu, 15 Mar 2012 08:06:35 +0200 Timo Teras <timo.teras@iki.fi>
> > wrote:
> > 
> > > On Wed, 14 Mar 2012 21:53:19 +0100 Francois Romieu
> > > <romieu@fr.zoreil.com> wrote:
> > > 
> > > > Timo Teras <timo.teras@iki.fi> :
> > > > [...]
> > > > > # ethtool -S eth2
> > > > > NIC statistics:
> > > > >      tx_packets: 2069391193
> > > > >      rx_packets: 3245815642
> > > > >      tx_errors: 0
> > > > >      rx_errors: 645238
> > > > >      rx_missed: 31414
> > > > 
> > > > It does not look like stuff for the higher layers guys.
> > > > 
> > > > Can you tshark -w foobar on the sender side and
> > > > 'while : ; do sleep 1; ethtool -S eth2 >> glop; done' on the
> > > > receiver during a bad wget (a big zero filled file should
> > > > compress well).
> > > 
> > > Indeed.
> > > 
> > > It seems that my earlier test about the "GRO off" effect were
> > > mistaken (I used accidentally proxy, and that gave the illusion
> > > that things are working. Whoops.)
> > > 
> > > So far I changed the cross-over cable and it didn't help. However,
> > > forcing the NIC to 100mbit/full-duplex mode fixes the rx_errors.
> > > It seems that something bad is happening in the gigabit mode.
> > > 
> > > I wonder if it's using pause frames and that's messing things up.
> > > Seems that I can't turn it off, though.
> > > 
> > > I can also double check my cables, though it is factory made
> > > Cat-5E cross-over cable; and happens with two different cables.
> > 
> > Ok. So far I have two of these boxes with same r8169 hardware. Both
> > generate bad packets on transmit only; and on both 3 nic systems
> > it's the middle eth1 nic. The symptoms are identical: in 1GB mode I
> > have minor packet loss, whereas 100Mbit/s mode seems to work just
> > fine.
> > 
> > The first box, the one I've been talking about so far, is, as
> > mentioned, connected to another similar box. The r8169 there reports rx_errors.
> > The cable is ok; I've tried with two different ones.
> > 
> > The other broken box is connected to a HP ProCurve 4202vl-48G, and
> > the switch is reporting drops due to FCS Rx errors.
> > 
> > So I have two broken pieces of hardware, or there is a driver bug.
> > 
> > I'll try upgrading my kernel to 3.0.x series on the sender box and
> > see if it's fixing anything. Suggestions for further testing would
> > be appreciated.
> 
> r8169 has to make an additional copy of incoming frames, because of
> a hardware flaw and security requirements.
> 
> This was added in 2.6.37 or 2.6.38, I don't remember exactly.
> 
> So your CPU might be too slow to handle the load at 1Gb speed.
> 
> If you have one flow, there is nothing to do, but if your workload has
> several flows and your machine is SMP, you can try RPS/RFS as
> documented in Documentation/networking/scaling.txt

No. It's exactly the same amount of traffic on the link: approx.
50-80mbit/s. If the link is in 100mbit/s mode, everything is perfect.
But if the link is in 1gbit/s mode (while still carrying only
50-80mbit/s on average), it gets packet loss, which kills TCP
performance.

There is definitely a hardware or a driver issue.

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-15 15:11                 ` Timo Teras
  2012-03-15 16:11                   ` Eric Dumazet
@ 2012-03-15 19:11                   ` Francois Romieu
  2012-03-16 20:15                     ` Timo Teras
  1 sibling, 1 reply; 23+ messages in thread
From: Francois Romieu @ 2012-03-15 19:11 UTC (permalink / raw)
  To: Timo Teras; +Cc: Eric Dumazet, Ben Hutchings, netdev

Timo Teras <timo.teras@iki.fi> :
[...]
> The other broken box is connected to a HP ProCurve 4202vl-48G, and the
> switch is reporting drops due to FCS Rx errors.
[...]
> So I have two broken pieces of hardware, or there is a driver bug.

I'll take blame for any bug in the driver. However many ethernet controllers
are and the PCI 8169 is no exception.

> I'll try upgrading my kernel to 3.0.x series on the sender box and see
> if it's fixing anything. Suggestions for further testing would be
> appreciated.

Please check you are using nothing but SLAB.

If you have not done so, you may then disable Tx checksumming.

If it does not change anything, you may consider using the r8169 from
David Miller's -next branch (backported ? no, no, the real thing). If it
still does not change anything and you are interested in new experiences,
please confirm you are above 18 and we may use Ben Greear's bad rx packets
capture (available in -next) and the port mirroring feature of your switch
to see what the corrupted tx frames look like. Before that, I would
welcome a short description of the router boxes (lspci, proc, etc) and
overall traffic / irq.

-- 
Ueimor

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-15 19:11                   ` Francois Romieu
@ 2012-03-16 20:15                     ` Timo Teras
  2012-03-17  9:56                       ` Timo Teras
  0 siblings, 1 reply; 23+ messages in thread
From: Timo Teras @ 2012-03-16 20:15 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Eric Dumazet, Ben Hutchings, netdev

On Thu, 15 Mar 2012 20:11:18 +0100 Francois Romieu
<romieu@fr.zoreil.com> wrote:

> Timo Teras <timo.teras@iki.fi> :
> [...]
> > The other broken box is connected to a HP ProCurve 4202vl-48G, and
> > the switch is reporting drops due to FCS Rx errors.
> [...]
> > So I have two broken pieces of hardware, or there is a driver bug.
> 
> I'll take blame for any bug in the driver. However many ethernet
> controllers are and the PCI 8169 is no exception.

Ok.

As a side thought, all these devices suffered from the bug I fixed
earlier. See commit 024a07bac (r8169: fix random mdio_write failures).
Also, all these devices probably got garbage written to their PHY. So
I'm wondering if it is possible that it caused some permanent damage?

Would it be possible to dump/compare the related things?

An additional pointer in this direction is that one of the "broken"
boxes has a different PCI ID for the "broken NIC" of the three. The
hardware is a Jetway daughter board with the three NICs on a single
board. So it sounds really weird that one of those NIC chips would be
from a different series. I wonder if the PCI ID and other stuff could
have got corrupted in EEPROM or something similar.

> > I'll try upgrading my kernel to 3.0.x series on the sender box and
> > see if it's fixing anything. Suggestions for further testing would
> > be appreciated.
> 
> Please check you are using nothing but SLAB.

Using SLUB, the current kernel default. Can retry with SLAB later.

> If you have not done so, you may then disable Tx checksumming.

Tx checksumming is off.
 
> If it does not change anything, you may consider using the r8169 from
> David Miller's -next branch (backported ? no, no, the real thing). If
> it still does not change anything and you are interested in new
> experiences, please confirm you are above 18 and we may use Ben
> Greear's bad rx packets capture (available in -next) and the port
> mirroring feature of your switch to see what the corrupted tx frames
> look like. Before that, I would welcome a short description of the
> router boxes (lspci, proc, etc) and overall traffic / irq.

Ah, I see the good stuff. Will try to do capture of the FCS on broken
link. And I'll try to relocate the broken hardware to lab environment
where this can be easier reproduced and debugged.

From the system with one NIC showing wrong PCI id (but XID and
boottime detection is identical for all these):

# lspci -nn 
00:00.0 Host bridge [0600]: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge [1106:0314]
00:00.1 Host bridge [0600]: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge [1106:1314]
00:00.2 Host bridge [0600]: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge [1106:2314]
00:00.3 Host bridge [0600]: VIA Technologies, Inc. PT890 Host Bridge [1106:3208]
00:00.4 Host bridge [0600]: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge [1106:4314]
00:00.7 Host bridge [0600]: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge [1106:7314]
00:01.0 PCI bridge [0604]: VIA Technologies, Inc. VT8237/VX700 PCI Bridge [1106:b198]
00:09.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL-8110SC/8169SC Gigabit Ethernet [10ec:8167] (rev 10)
00:0a.0 FireWire (IEEE 1394) [0c00]: VIA Technologies, Inc. VT6306 Fire II IEEE 1394 OHCI Link Layer Controller [1106:3044] (rev 80)
00:0b.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet [10ec:8169] (rev 10)
00:0c.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL-8110SC/8169SC Gigabit Ethernet [10ec:8167] (rev 10)
00:0f.0 IDE interface [0101]: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller [1106:3149] (rev 80)
00:0f.1 IDE interface [0101]: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE [1106:0571] (rev 06)
00:10.0 USB Controller [0c03]: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller [1106:3038] (rev 81)
00:10.1 USB Controller [0c03]: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller [1106:3038] (rev 81)
00:10.2 USB Controller [0c03]: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller [1106:3038] (rev 81)
00:10.3 USB Controller [0c03]: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller [1106:3038] (rev 81)
00:10.4 USB Controller [0c03]: VIA Technologies, Inc. USB 2.0 [1106:3104] (rev 86)
00:11.0 ISA bridge [0601]: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South] [1106:3227]
00:11.5 Multimedia audio controller [0401]: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller [1106:3059] (rev 60)
00:12.0 Ethernet controller [0200]: VIA Technologies, Inc. VT6102 [Rhine-II] [1106:3065] (rev 78)
01:00.0 VGA compatible controller [0300]: VIA Technologies, Inc. CN700/P4M800 Pro/P4M800 CE/VN800 [S3 UniChrome Pro] [1106:3344] (rev 01)

# grep eth /var/log/dmesg 
r8169 0000:00:09.0: eth0: RTL8169sc/8110sc at 0xf81fe000, 00:30:18:a8:14:ac, XID 18000000 IRQ 18
r8169 0000:00:0b.0: eth1: RTL8169sc/8110sc at 0xf8202000, 00:30:18:ab:69:4b, XID 18000000 IRQ 19
r8169 0000:00:0c.0: eth2: RTL8169sc/8110sc at 0xf8206000, 00:30:18:a8:14:ad, XID 18000000 IRQ 16
eth3: VIA Rhine II at 0x1e800, 00:30:18:a0:d5:53, IRQ 23.
eth3: MII PHY found at address 1, status 0x7849 advertising 05e1 Link 0000.

# cat /proc/cpuinfo 
processor	: 0
vendor_id	: CentaurHauls
cpu family	: 6
model		: 13
model name	: VIA Eden Processor 1200MHz
stepping	: 0
cpu MHz		: 1199.906
cache size	: 128 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce apic mtrr pge cmov pat clflush acpi mmx fxsr sse sse2 tm nx up pni est tm2 xtpr rng rng_en ace ace_en ace2 ace2_en phe phe_en pmm pmm_en
bogomips	: 2400.80
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 32 bits virtual
power management:

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-16 20:15                     ` Timo Teras
@ 2012-03-17  9:56                       ` Timo Teras
  2012-03-17 11:35                         ` Francois Romieu
  0 siblings, 1 reply; 23+ messages in thread
From: Timo Teras @ 2012-03-17  9:56 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Eric Dumazet, Ben Hutchings, netdev

On Fri, 16 Mar 2012 22:15:57 +0200 Timo Teras <timo.teras@iki.fi> wrote:

> On Thu, 15 Mar 2012 20:11:18 +0100 Francois Romieu
> <romieu@fr.zoreil.com> wrote:
> 
> > Timo Teras <timo.teras@iki.fi> :
> > [...]
> > > The other broken box is connected to a HP ProCurve 4202vl-48G, and
> > > the switch is reporting drops due to FCS Rx errors.
> > [...]
> > > So I have two broken pieces of hardware, or there is a driver bug.
> > 
> > I'll take blame for any bug in the driver. However many ethernet
> > controllers are and the PCI 8169 is no exception.
> 
> Ok.
> 
> As a side thought, all these devices suffered from the bug I fixed
> earlier. See commit 024a07bac (r8169: fix random mdio_write failures).
> Also, all these devices probably got garbage written to their PHY. So
> I'm wondering if it is possible that it caused some permanent damage?
> 
> Would it be possible to dump/compare the related things?
> 
> An additional pointer in this direction is that one of the "broken"
> boxes has a different PCI ID for the "broken NIC" of the three. The
> hardware is a Jetway daughter board with the three NICs on a single
> board. So it sounds really weird that one of those NIC chips would be
> from a different series. I wonder if the PCI ID and other stuff could
> have got corrupted in EEPROM or something similar.

It seems that we have working eeprom reading code in commit 6709fe9a27e4
"r8169: read MAC address from EEPROM on init (2nd attempt)" which later
got reverted due to problems. I'm now wondering if those problems were
actually caused by unrelated issues that got later fixed in 78f1cd02457
"r8169: fix broken register writes".

I wonder if it'd be worth to do the eeprom reading and expose it via
ethtool so I can compare those. Or as easy alternative, enabling the VPD
bit in Config1 should allow me to read the EEPROM contents using the PCI
/sys/.../vpd interface, right?

And maybe re-introduce the reading of the MAC from there on reboot. Or
we could just do:
-	Cfg9346_Lock    = 0x00,
+	Cfg9346_Lock    = 0x40,

The 0x40 apparently means "Auto-load: the EEPROM contents will be
reloaded when PCI RSTB signal is asserted, and will automatically
resume to normal 0x00 mode after the load".
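
The operating modes mentioned above can be summarised in a small
userspace sketch. It is only an illustration, not driver code: 0x00 and
0x40 are the values quoted above, 0xc0 is the driver's existing
Cfg9346_Unlock, 0x80 as the programming mode is assumed from the Realtek
datasheet, and the names cfg9346_mode and cfg9346_mode_name are made up
for the example.

```c
#include <assert.h>
#include <stdint.h>

/* The operating mode lives in the top two bits of the Cfg9346 register. */
#define CFG9346_MODE_MASK 0xc0

enum cfg9346_mode {
	CFG9346_NORMAL   = 0x00,	/* Cfg9346_Lock in r8169 */
	CFG9346_AUTOLOAD = 0x40,	/* reload EEPROM contents, then back to 0x00 */
	CFG9346_PROGRAM  = 0x80,	/* assumed: direct access to the EEPROM pins */
	CFG9346_UNLOCK   = 0xc0,	/* Cfg9346_Unlock: config registers writable */
};

/* Classify a raw register value, ignoring the low (pin) bits. */
enum cfg9346_mode cfg9346_mode(uint8_t reg)
{
	return (enum cfg9346_mode)(reg & CFG9346_MODE_MASK);
}

/* Human-readable name of the mode encoded in a raw register value. */
const char *cfg9346_mode_name(uint8_t reg)
{
	switch (cfg9346_mode(reg)) {
	case CFG9346_NORMAL:
		return "normal";
	case CFG9346_AUTOLOAD:
		return "auto-load";
	case CFG9346_PROGRAM:
		return "programming";
	case CFG9346_UNLOCK:
		return "config write enable";
	}
	/* The two-bit mask leaves only the four values handled above. */
	assert(0);
	return "unknown";
}
```

With this, cfg9346_mode(0x40) gives CFG9346_AUTOLOAD, and any low bits
are ignored when classifying the mode.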

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-17  9:56                       ` Timo Teras
@ 2012-03-17 11:35                         ` Francois Romieu
  2012-03-17 22:20                           ` Francois Romieu
  0 siblings, 1 reply; 23+ messages in thread
From: Francois Romieu @ 2012-03-17 11:35 UTC (permalink / raw)
  To: Timo Teras; +Cc: Eric Dumazet, Ben Hutchings, netdev

Timo Teras <timo.teras@iki.fi> :
> On Fri, 16 Mar 2012 22:15:57 +0200 Timo Teras <timo.teras@iki.fi> wrote:
[...]
> > An additional pointer in this direction is that one of the "broken"
> > boxes has a different PCI ID for the "broken NIC" of the three. The
> > hardware is a Jetway daughter board with the three NICs on a single
> > board. So it sounds really weird that one of those NIC chips would be
> > from a different series. I wonder if the PCI ID and other stuff could
> > have got corrupted in EEPROM or something similar.

Some of my old PCI 8169 show an unpleasant trend to lose config bits and
they can turn into unavailable devices (i.e. all 0xff registers) when
things go really wrong.

> It seems that we have working eeprom reading code in commit 6709fe9a27e4
> "r8169: read MAC address from EEPROM on init (2nd attempt)" which later
> got reverted due to problems. I'm now wondering if those problems were
> actually caused by unrelated issues that got later fixed in 78f1cd02457
> "r8169: fix broken register writes".

I have not tried working with the eeprom again since 024a07bac.

> I wonder if it'd be worth to do the eeprom reading and expose it via
> ethtool so I can compare those.

Yes.

> Or as easy alternative, enabling the VPD bit in Config1 should allow me
> to read the EEPROM contents using the PCI /sys/.../vpd interface, right?

In theory, yes. I have not tested it. Imho both access methods will be
useful.

I should have some unfinished VPD stuff somewhere. Will have to dig...

> And maybe re-introduce the reading of the MAC from there on reboot. Or
> we could just do:
> -	Cfg9346_Lock    = 0x00,
> +	Cfg9346_Lock    = 0x40,
> 
> The 0x40 apparently means "Auto-load: the EEPROM contents will be
> reloaded when PCI RSTB signal is asserted, and will automatically
> resume to normal 0x00 mode after the load".

It's a bit early to tell but I agree it may make some sense with
adequate conditions. I do not want to immediately break platforms
where the BIOS / firmware plays its own games with eeprom reload or such.

See some resurrected r8169 eeprom patch below. I have to leave for work so
it is done in a hurry. It does not seem to crash immediately though:

# for d in 8168d-vb-gr 8102e-vb-gr 8168b-lom netgear; do ethtool -e $d; done
Offset		Values
------		------
0x0000		29 81 ec 10 68 81 ec 10 68 81 04 01 9c 62 00 e0 
0x0010		4c 68 00 2c 05 cf c3 ff 04 02 c0 8c 80 02 00 00 
0x0020		11 3c 07 00 10 20 76 00 63 01 01 ff 00 27 aa 03 
0x0030		02 20 89 7a 80 02 00 20 04 40 20 00 04 40 20 20 
0x0040		00 00 20 e1 22 b5 60 00 0a 00 e0 00 68 4c 00 00 
0x0050		30 00 00 00 b2 73 75 ea 87 75 7a 39 ca 98 ff ff 
0x0060		ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
0x0070		ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
Offset		Values
------		------
0x0000		29 81 ec 10 36 81 ec 10 36 81 04 01 3c 62 00 e0 
0x0010		4c 36 00 07 05 0f c3 ff 02 14 c1 86 80 02 00 00 
0x0020		11 3c 07 00 10 20 76 00 63 01 01 ff 00 27 aa 03 
0x0030		02 20 4e 86 80 02 00 20 10 00 21 00 10 00 21 20 
0x0040		00 00 80 70 22 1d 80 00 20 00 e0 00 36 4c 00 00 
0x0050		07 00 00 00 af eb b5 35 00 00 00 00 00 00 00 00 
0x0060		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0070		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Offset		Values
------		------
0x0000		29 81 ec 10 68 81 49 18 68 81 04 01 00 20 00 13 
0x0010		8f ea b1 5d 05 df c2 f7 42 00 23 7f 00 10 04 03 
0x0020		68 81 ec 10 00 00 00 1a ff ff ff ff ff ff 1f 00 
0x0030		00 47 ee 79 10 f0 f0 01 bf 01 00 00 60 00 00 01 
0x0040		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0050		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0060		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0070		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Offset		Values
------		------
0x0000		29 81 ec 10 69 81 85 13 1a 31 20 40 00 a1 00 09 
0x0010		5b bd c1 a5 15 0d c2 f7 00 80 00 00 00 00 00 13 
0x0020		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0030		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20 
0x0040		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0050		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0060		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0070		00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

"netgear" is PCI but its XID is 04000000. I'll swap it this evening.

diff --git a/drivers/net/ethernet/realtek/Kconfig b/drivers/net/ethernet/realtek/Kconfig
index 5821966..039fcc6 100644
--- a/drivers/net/ethernet/realtek/Kconfig
+++ b/drivers/net/ethernet/realtek/Kconfig
@@ -109,6 +109,7 @@ config R8169
 	select CRC32
 	select NET_CORE
 	select MII
+	select EEPROM_93CX6
 	---help---
 	  Say Y here if you have a Realtek 8169 PCI Gigabit Ethernet adapter.
 
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 27c358c..b909475 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -28,6 +28,7 @@
 #include <linux/firmware.h>
 #include <linux/pci-aspm.h>
 #include <linux/prefetch.h>
+#include <linux/eeprom_93cx6.h>
 
 #include <asm/system.h>
 #include <asm/io.h>
@@ -310,6 +311,7 @@ enum rtl_registers {
 #define	RXCFG_DMA_SHIFT			8
 					/* Unlimited maximum PCI burst. */
 #define	RX_DMA_BURST			(7 << RXCFG_DMA_SHIFT)
+#define EEPROM_9356_SELECT		(1 << 6)
 
 	RxMissed	= 0x4c,
 	Cfg9346		= 0x50,
@@ -450,9 +452,16 @@ enum rtl_register_content {
 	NPQ		= 0x40,		/* Poll cmd on the low prio queue */
 	FSWInt		= 0x01,		/* Forced software interrupt */
 
-	/* Cfg9346Bits */
-	Cfg9346_Lock	= 0x00,
+	/* Cfg9346 operating mode register p.23 */
 	Cfg9346_Unlock	= 0xc0,
+	Cfg9346_Prog	= 0x80,
+	Cfg9346_Auto	= 0x40,
+	Cfg9346_Lock	= 0x00,
+	/* Sub-mode bits in Programming or Auto-load mode. */
+	Cfg9346_CS	= 0x08,		/* Chip Select */
+	Cfg9346_SK	= 0x04,		/* Serial Data Clock */
+	Cfg9346_DI	= 0x02,		/* Data In (going into the eeprom) */
+	Cfg9346_DO	= 0x01,		/* Data Out (coming from the eeprom) */
 
 	/* rx_mode_bits */
 	AcceptErr	= 0x20,
@@ -751,6 +760,8 @@ struct rtl8169_private {
 		} phy_action;
 	} *rtl_fw;
 #define RTL_FIRMWARE_UNKNOWN	ERR_PTR(-EAGAIN)
+
+	struct eeprom_93cx6 eeprom;
 };
 
 MODULE_AUTHOR("Realtek and the Linux r8169 crew <netdev@vger.kernel.org>");
@@ -1702,6 +1713,58 @@ static int rtl8169_gset_xmii(struct net_device *dev, struct ethtool_cmd *cmd)
 	return mii_ethtool_gset(&tp->mii, cmd);
 }
 
+static int rtl_get_eeprom_len(struct net_device *dev)
+{
+	struct rtl8169_private *tp = netdev_priv(dev);
+
+	return tp->eeprom.size;
+}
+
+static void eeprom_cmd_start(void __iomem *ioaddr)
+{
+	RTL_W8(Cfg9346, Cfg9346_Prog);
+}
+
+static void eeprom_cmd_end(void __iomem *ioaddr)
+{
+	RTL_W8(Cfg9346, Cfg9346_Lock);
+}
+
+static int rtl_get_eeprom(struct net_device *dev, struct ethtool_eeprom *ee,
+			  u8 *data)
+{
+	struct rtl8169_private *tp = netdev_priv(dev);
+	struct eeprom_93cx6 *eeprom = &tp->eeprom;
+	void __iomem *ioaddr = tp->mmio_addr;
+	u32 offset = ee->offset;
+	u32 len = ee->len;
+	u16 reg;
+
+	rtl_lock_work(tp);
+
+	eeprom_cmd_start(ioaddr);
+
+	if (offset & 0x1) {
+		eeprom_93cx6_read(eeprom, offset >> 1, &reg);
+		*data++ = reg >> 8;
+		offset++;
+		len--;
+	}
+
+	eeprom_93cx6_multiread(eeprom, offset >> 1,  (__le16 *)data, len >> 1);
+
+	if (len & 0x1) {
+		eeprom_93cx6_read(eeprom, (offset >> 1) + (len >> 1), &reg);
+		data[len - 1] = reg;
+	}
+
+	eeprom_cmd_end(ioaddr);
+
+	rtl_unlock_work(tp);
+
+	return 0;
+}
+
 static int rtl8169_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -1844,6 +1907,8 @@ static const struct ethtool_ops rtl8169_ethtool_ops = {
 	.get_drvinfo		= rtl8169_get_drvinfo,
 	.get_regs_len		= rtl8169_get_regs_len,
 	.get_link		= ethtool_op_get_link,
+	.get_eeprom_len		= rtl_get_eeprom_len,
+	.get_eeprom		= rtl_get_eeprom,
 	.get_settings		= rtl8169_get_settings,
 	.set_settings		= rtl8169_set_settings,
 	.get_msglevel		= rtl8169_get_msglevel,
@@ -5803,6 +5868,63 @@ static int rtl8169_suspend(struct device *device)
 	return 0;
 }
 
+static void rtl_93cx6_register_read(struct eeprom_93cx6 *eeprom)
+{
+	struct net_device *dev = eeprom->data;
+	struct rtl8169_private *tp = netdev_priv(dev);
+	void __iomem *ioaddr = tp->mmio_addr;
+	u8 reg;
+
+	reg = RTL_R8(Cfg9346);
+
+	eeprom->reg_data_in     = reg & Cfg9346_DI;
+	eeprom->reg_data_out    = reg & Cfg9346_DO;
+	eeprom->reg_data_clock  = reg & Cfg9346_SK;
+	eeprom->reg_chip_select = reg & Cfg9346_CS;
+}
+
+static void rtl_93cx6_register_write(struct eeprom_93cx6 *eeprom)
+{
+	struct net_device *dev = eeprom->data;
+	struct rtl8169_private *tp = netdev_priv(dev);
+	void __iomem *ioaddr = tp->mmio_addr;
+	u8 reg = Cfg9346_Prog;
+
+	if (eeprom->reg_data_in)
+		reg |= Cfg9346_DI;
+	if (eeprom->reg_data_out)
+		reg |= Cfg9346_DO;
+	if (eeprom->reg_data_clock)
+		reg |= Cfg9346_SK;
+	if (eeprom->reg_chip_select)
+		reg |= Cfg9346_CS;
+
+	RTL_W8(Cfg9346, reg);
+	/* PCI commit */
+	RTL_R8(ChipCmd);
+	/* This is not a posting bug band-aid: the eeprom wants ~250 ns. */
+	ndelay(250);
+}
+
+static void rtl_init_eeprom(struct net_device *dev, struct rtl8169_private *tp)
+{
+	struct eeprom_93cx6 *eeprom = &tp->eeprom;
+	void __iomem *ioaddr = tp->mmio_addr;
+
+	eeprom->data = dev;
+
+	eeprom->register_read  = rtl_93cx6_register_read;
+	eeprom->register_write = rtl_93cx6_register_write;
+
+	if (RTL_R32(RxConfig) & EEPROM_9356_SELECT) {
+		eeprom->width = PCI_EEPROM_WIDTH_93C56;
+		eeprom->size = 256;
+	} else {
+		eeprom->width = PCI_EEPROM_WIDTH_93C46;
+		eeprom->size = 128;
+	}
+}
+
 static void __rtl8169_resume(struct net_device *dev)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -6243,6 +6365,8 @@ rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	tp->opts1_mask = (tp->mac_version != RTL_GIGA_MAC_VER_01) ?
 		~(RxBOVF | RxFOVF) : ~0;
 
+	rtl_init_eeprom(dev, tp);
+
 	init_timer(&tp->timer);
 	tp->timer.data = (unsigned long) dev;
 	tp->timer.function = rtl8169_phy_timer;
diff --git a/include/linux/eeprom_93cx6.h b/include/linux/eeprom_93cx6.h
index e50f98b..d6e9cef 100644
--- a/include/linux/eeprom_93cx6.h
+++ b/include/linux/eeprom_93cx6.h
@@ -63,6 +63,7 @@ struct eeprom_93cx6 {
 	void (*register_write)(struct eeprom_93cx6 *eeprom);
 
 	int width;
+	int size;
 
 	char drive_data;
 	char reg_data_in;
-- 
Ueimor

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-17 11:35                         ` Francois Romieu
@ 2012-03-17 22:20                           ` Francois Romieu
  2012-03-18  7:00                             ` Timo Teras
  2012-03-20 15:31                             ` Timo Teras
  0 siblings, 2 replies; 23+ messages in thread
From: Francois Romieu @ 2012-03-17 22:20 UTC (permalink / raw)
  To: Timo Teras; +Cc: Eric Dumazet, Ben Hutchings, netdev

Francois Romieu <romieu@fr.zoreil.com> :
[...]
> > Or as easy alternative, enabling the VPD bit in Config1 should allow me
> > to read the EEPROM contents using the PCI /sys/.../vpd interface, right?
> 
> In theory, yes. I have not tested it. Imho both access methods will be
> useful.

I tried vpd and got the eeprom content, duplicated 256 times.

The eeprom content is fairly boring:

# ethtool -e 8169sc-1                                                                      
Offset          Values
------          ------
0x0000          29 81 ec 10 67 81 ec 10 67 81 20 40 01 a1 00 e0 
0x0010          4c 67 00 01 15 cd c2 f7 ff 80 ff ff ff ff ff 13 
0x0020          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
0x0030          ff ff fa d6 ff ff ff ff ff ff ff ff ff ff ff 20 
0x0040          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
0x0050          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
0x0060          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
0x0070          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
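
For readers who want to interpret such a dump, here is a minimal
userspace sketch that decodes the first bytes. The field layout
(signature at 0x00, vendor and device ID at 0x02 and 0x04, subsystem IDs
at 0x06 and 0x08, MAC address at 0x0e) is an assumption inferred from
the dumps in this thread and should be checked against the datasheet;
struct rtl_eeprom_header and rtl_eeprom_parse are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the first 0x14 EEPROM bytes; layout is assumed. */
struct rtl_eeprom_header {
	uint16_t signature;	/* 0x00: expected to be 0x8129 */
	uint16_t vendor_id;	/* 0x02 */
	uint16_t device_id;	/* 0x04 */
	uint16_t subsys_vendor;	/* 0x06 */
	uint16_t subsys_id;	/* 0x08 */
	uint8_t  mac[6];	/* 0x0e */
};

/* Read a little-endian 16-bit word at a byte offset. */
static uint16_t le16_at(const uint8_t *p, size_t off)
{
	return (uint16_t)(p[off] | (p[off + 1] << 8));
}

/* Returns 0 on success, -1 if the buffer is short or the signature is off. */
int rtl_eeprom_parse(const uint8_t *raw, size_t len,
		     struct rtl_eeprom_header *h)
{
	size_t i;

	assert(raw && h);
	if (len < 0x14)
		return -1;

	h->signature     = le16_at(raw, 0x00);
	h->vendor_id     = le16_at(raw, 0x02);
	h->device_id     = le16_at(raw, 0x04);
	h->subsys_vendor = le16_at(raw, 0x06);
	h->subsys_id     = le16_at(raw, 0x08);
	for (i = 0; i < 6; i++)
		h->mac[i] = raw[0x0e + i];

	return h->signature == 0x8129 ? 0 : -1;
}

/* First 0x14 bytes of the "8169sc-1" dump above. */
const uint8_t eeprom_8169sc_1[0x14] = {
	0x29, 0x81, 0xec, 0x10, 0x67, 0x81, 0xec, 0x10,
	0x67, 0x81, 0x20, 0x40, 0x01, 0xa1, 0x00, 0xe0,
	0x4c, 0x67, 0x00, 0x01,
};
```

Applied to the dump above, this yields vendor 0x10ec, device 0x8167 and
MAC 00:e0:4c:67:00:01 under the assumed layout.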

-- 
Ueimor

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-17 22:20                           ` Francois Romieu
@ 2012-03-18  7:00                             ` Timo Teras
  2012-03-20 15:31                             ` Timo Teras
  1 sibling, 0 replies; 23+ messages in thread
From: Timo Teras @ 2012-03-18  7:00 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Eric Dumazet, Ben Hutchings, netdev

On Sat, 17 Mar 2012 23:20:04 +0100 Francois Romieu
<romieu@fr.zoreil.com> wrote:

> Francois Romieu <romieu@fr.zoreil.com> :
> [...]
> > > Or as easy alternative, enabling the VPD bit in Config1 should
> > > allow me to read the EEPROM contents using the PCI /sys/.../vpd
> > > interface, right?
> > 
> > In theory, yes. I have not tested it. Imho both access methods will
> > be useful.
> 
> I tried vpd and got the eeprom content, duplicated 256 times.
> 
> The eeprom content is fairly boring:
> 
> # ethtool -e 8169sc-1
> Offset          Values
> ------          ------
> 0x0000          29 81 ec 10 67 81 ec 10 67 81 20 40 01 a1 00 e0 
> 0x0010          4c 67 00 01 15 cd c2 f7 ff 80 ff ff ff ff ff 13 
> 0x0020          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> 0x0030          ff ff fa d6 ff ff ff ff ff ff ff ff ff ff ff 20 
> 0x0040          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> 0x0050          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> 0x0060          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> 0x0070          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 

Ok. Thanks. I should be able to swap my broken box tomorrow or the day
after, and then start doing experiments and compare the eeprom.
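
Comparing two dumps byte by byte is simple enough to sketch in
userspace C. This is only an illustration: eeprom_diff is a made-up
helper name, and the sample data is the first row of the 8168d-vb-gr and
8102e-vb-gr dumps posted earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Report every offset where two dumps differ and return the number of
 * differing bytes. Pass out == NULL to count without printing.
 */
size_t eeprom_diff(const uint8_t *a, const uint8_t *b, size_t len, FILE *out)
{
	size_t i, n = 0;

	assert(a && b);
	for (i = 0; i < len; i++) {
		if (a[i] == b[i])
			continue;
		if (out)
			fprintf(out, "0x%04zx: %02x != %02x\n", i,
				(unsigned int)a[i], (unsigned int)b[i]);
		n++;
	}
	return n;
}

/* First row (offset 0x0000) of the 8168d-vb-gr dump. */
const uint8_t dump_8168d[16] = {
	0x29, 0x81, 0xec, 0x10, 0x68, 0x81, 0xec, 0x10,
	0x68, 0x81, 0x04, 0x01, 0x9c, 0x62, 0x00, 0xe0,
};

/* First row (offset 0x0000) of the 8102e-vb-gr dump. */
const uint8_t dump_8102e[16] = {
	0x29, 0x81, 0xec, 0x10, 0x36, 0x81, 0xec, 0x10,
	0x36, 0x81, 0x04, 0x01, 0x3c, 0x62, 0x00, 0xe0,
};
```

Calling eeprom_diff(dump_8168d, dump_8102e, 16, stdout) reports the
bytes at offsets 0x04, 0x08 and 0x0c as different.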

I also found one more weirdly broken box. One of my similar boxes fails
at LACP bonding with HP ProCurve switches. The switch just says LACP
failed, but linux thinks it's ok and starts aggregation, and this
results in broken traffic for all flows using the link that was not
accepted as part of the aggregation by the switch.

The only explanation I've found so far is: the switch reports one link
as MDI and the other as MDI-X. And this is enough to fail the
aggregation (links are not identical), but the linux bonding does not
notice this. I have no idea why the second link is MDI-X - it's a
straight cable. I wonder if it could be eeprom related too... or maybe
it's just another broken NIC.

Oh well - I'll get back to this after few days when I have the broken
box on my table for playing.

* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-17 22:20                           ` Francois Romieu
  2012-03-18  7:00                             ` Timo Teras
@ 2012-03-20 15:31                             ` Timo Teras
  2012-03-20 18:20                               ` Francois Romieu
  1 sibling, 1 reply; 23+ messages in thread
From: Timo Teras @ 2012-03-20 15:31 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Eric Dumazet, Ben Hutchings, netdev

On Sat, 17 Mar 2012 23:20:04 +0100 Francois Romieu
<romieu@fr.zoreil.com> wrote:

> Francois Romieu <romieu@fr.zoreil.com> :
> [...]
> > > Or as easy alternative, enabling the VPD bit in Config1 should
> > > allow me to read the EEPROM contents using the PCI /sys/.../vpd
> > > interface, right?
> > 
> > In theory, yes. I have not tested it. Imho both access methods will
> > be useful.
> 
> I tried vpd and got the eeprom content, duplicated 256 times.
> 
> The eeprom content is fairly boring:
> 
> # ethtool -e 8169sc-1
> Offset          Values
> ------          ------
> 0x0000          29 81 ec 10 67 81 ec 10 67 81 20 40 01 a1 00 e0 
> 0x0010          4c 67 00 01 15 cd c2 f7 ff 80 ff ff ff ff ff 13 
> 0x0020          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> 0x0030          ff ff fa d6 ff ff ff ff ff ff ff ff ff ff ff 20 
> 0x0040          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> 0x0050          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> 0x0060          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> 0x0070          ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 

Ok. I now have the box that was sending faulty packets and has the weird
PCI ID on my desk for playing. I swapped it for an identical box - the
software and configuration are identical - and now the 1 gig mode errors
are gone on the setup. So the problem is caused by hardware; either due
to a bad eeprom/firmware in the rtl8110sc or some other issue.

I also built net-next and took the ethtool -e dumps of the eeproms.

From a working eth0:

Offset          Values
------          ------
0x0000          29 81 ec 10 67 81 f3 16 ec 10 20 40 00 a1 00 30 
0x0010          18 a8 14 ac 15 0d c2 f7 00 80 00 00 00 00 00 13 
0x0020          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0030          00 00 82 c7 00 00 00 00 00 00 00 00 00 00 00 20 
0x0040          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0050          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0060          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0070          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

And the "broken" eth1:

Offset          Values
------          ------
0x0000          29 81 ec 10 69 81 f3 16 ec 10 20 40 00 a1 00 30 
0x0010          18 ab 69 4b 14 0d c2 f7 00 80 00 00 00 00 00 13 
0x0020          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0030          00 00 2c 25 00 00 00 00 00 00 00 00 00 00 00 20 
0x0040          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0050          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0060          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0070          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

The only differences seem to be the PCI ID field at byte offset 4-5,
Config 0 field at byte offset 0x14, and the checksum at 0x32-0x33.
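
For reference, the comparison can be reproduced mechanically. The
following is only a minimal sketch: the helper names parse_dump and
diff_dumps are made up for illustration, and the all-zero rows (0x0020
and 0x0040-0x0070, identical in both dumps) are left out for brevity.

```python
# Sketch: byte-wise diff of two "ethtool -e" hex dumps.
# parse_dump/diff_dumps are hypothetical helpers, not part of ethtool.

def parse_dump(text):
    """Map byte offset -> value for every '0x....  xx xx ...' row."""
    data = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("0x"):
            continue  # skip the header and separator lines
        base = int(parts[0], 16)
        for i, tok in enumerate(parts[1:]):
            data[base + i] = int(tok, 16)
    return data

def diff_dumps(a, b):
    """Return (offset, value_a, value_b) for every differing byte."""
    return [(off, a[off], b.get(off)) for off in sorted(a)
            if a[off] != b.get(off)]

# Rows copied from the two dumps above (all-zero rows omitted).
WORKING = """\
0x0000          29 81 ec 10 67 81 f3 16 ec 10 20 40 00 a1 00 30
0x0010          18 a8 14 ac 15 0d c2 f7 00 80 00 00 00 00 00 13
0x0030          00 00 82 c7 00 00 00 00 00 00 00 00 00 00 00 20
"""
BROKEN = """\
0x0000          29 81 ec 10 69 81 f3 16 ec 10 20 40 00 a1 00 30
0x0010          18 ab 69 4b 14 0d c2 f7 00 80 00 00 00 00 00 13
0x0030          00 00 2c 25 00 00 00 00 00 00 00 00 00 00 00 20
"""

if __name__ == "__main__":
    for off, good, bad in diff_dumps(parse_dump(WORKING), parse_dump(BROKEN)):
        print("0x%04x: working=%02x broken=%02x" % (off, good, bad))
```

Besides the three fields named above, this also flags bytes 0x11-0x13.
Those look like the tail of the per-NIC MAC address (0x0e-0x10 carry
the same 00:30:18 prefix seen in the boot log), so a difference there
would be expected; that reading of the layout is my assumption.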

If I understand correctly, the Config 0 bit 0 affects Boot ROM size. It
affects only the PXE boot sequence?
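
To double-check which Config 0 bit actually differs, the two bytes at
offset 0x14 (0x15 on the working NIC, 0x14 on the broken one) can be
XORed. A small sketch; changed_bits is a hypothetical helper name:

```python
# Sketch: list the bit positions that differ between two register bytes.

def changed_bits(a, b):
    """Return the bit numbers (0 = least significant) where a and b differ."""
    delta = a ^ b
    return [bit for bit in range(8) if (delta >> bit) & 1]

if __name__ == "__main__":
    # Values taken from offset 0x14 of the two dumps above.
    print(changed_bits(0x15, 0x14))
```

This only confirms that bit 0 is the single differing bit; what the bit
means for the chip is still the open question above.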

Additionally, I can verify that all the chips have "RTL8110SC 67233S1
G28B" on them. So the differing PCI IDs are an oddity.

I can do some additional tests, and test if the bad packets can be
reproduced against a switch and captured.

But other than that, I'm wondering if the failed mdio writing could have
caused permanent damage in the PHY.


* Re: linux-3.0.18+r8169+ipv4/tcp forwarding = tso/gso weirdness and performance degration
  2012-03-20 15:31                             ` Timo Teras
@ 2012-03-20 18:20                               ` Francois Romieu
  0 siblings, 0 replies; 23+ messages in thread
From: Francois Romieu @ 2012-03-20 18:20 UTC (permalink / raw)
  To: Timo Teras; +Cc: Eric Dumazet, Ben Hutchings, netdev

Timo Teras <timo.teras@iki.fi> :
[...]
> The only differences seem to be the PCI ID field at byte offset 4-5,
> Config 0 field at byte offset 0x14, and the checksum at 0x32-0x33.
> 
> If I understand correctly, the Config 0 bit 0 affects Boot ROM size. It
> affects only the PXE boot sequence?

I have never played with it. I can only tell that it has a different value
for the 8168b included on my motherboard (see previous messages).

[...]
> But other than that, I'm wondering if the failed mdio writing could have
> caused permanent damage in the PHY.

It's hard to tell and it wouldn't explain the corrupted eeprom.

-- 
Ueimor
