* Problems with dropped packets on bonded interface for 3.x kernels
From: Albert Chin @ 2011-11-21  5:16 UTC
  To: netdev

I'm running Ubuntu 11.10 on an Intel SR2625URLXR system with an Intel
S5520UR motherboard and an internal Intel E1G44HT (I340-T4) Quad Port
Server Adapter. I am seeing dropped packets on a bonded interface,
comprised of two GigE ports on the Intel E1G44HT Quad Port Server
Adapter. The following kernels exhibit this problem:
  3.0.0-12-server, 3.0.0-13-server, 3.1.0-2-server, 3.2.0-rc2
Installing Fedora 16 with a 3.1.1-1.fc16.x86_64 also showed dropped
packets.

I also tried RHEL6 with a 2.6.32-131.17.1.el6.x86_64 kernel and didn't
see any dropped packets. Testing an older 2.6.32-28.55-generic Ubuntu
kernel also didn't show any dropped packets.

So, with 2.6.32, I don't see dropped packets, but every kernel I tried
from 3.0 onward shows dropped packets.

# ifconfig bond0
bond0     Link encap:Ethernet  HWaddr 00:1b:21:d3:f6:0a  
          inet6 addr: fe80::21b:21ff:fed3:f60a/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:225 errors:0 dropped:186 overruns:0 frame:0
          TX packets:231 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:25450 (25.4 KB)  TX bytes:28368 (28.3 KB)

With lacp_rate=fast, I see higher packet loss than with
lacp_rate=slow. I've tried bonding t

This server has the following network controllers for the two internal
NICs:
  # lspci -vv
  01:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
  01:00.1 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)

And it has the following network controllers for the four NICs on the
I340-T4 PCI-E card:
  # lspci -vv
  0a:00.0 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
  0a:00.1 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
  0a:00.2 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
  0a:00.3 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)

I tried bonding the two 82575EB NICs rather than two NICs on the 82580
but see the same dropped packet issue.

I have replaced the cables and tested each port individually on the
switch without bonding, and I don't see any reason to suspect hardware
as the issue. The switch is an Extreme Networks Summit 400-48t.

I am using an 802.3ad configuration:
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 1
        Actor Key: 17
        Partner Key: 24
        Partner Mac Address: 00:04:96:18:54:d5

Slave Interface: eth4
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:d3:f6:0a
Aggregator ID: 1
Slave queue ID: 0

Slave Interface: eth5
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:d3:f6:0b
Aggregator ID: 2
Slave queue ID: 0
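
In the status output above, eth4 reports Aggregator ID 1 and eth5
reports Aggregator ID 2, while the active aggregator is ID 1 with one
port. A minimal Python sketch for pulling that out of
/proc/net/bonding/bond0 follows. The function names are illustrative,
the sample text is abridged from the output above, and nothing in this
thread establishes whether the aggregator split is related to the drop
counter.

```python
# Abridged copy of the /proc/net/bonding/bond0 output shown above.
SAMPLE = """\
802.3ad info
LACP rate: fast
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 1
        Partner Mac Address: 00:04:96:18:54:d5

Slave Interface: eth4
MII Status: up
Aggregator ID: 1

Slave Interface: eth5
MII Status: up
Aggregator ID: 2
"""


def parse_bonding(text):
    """Return (active_aggregator_id, {slave_name: aggregator_id})."""
    active = None
    slaves = {}
    current = None  # name of the slave section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Aggregator ID:"):
            agg = int(line.split(":", 1)[1])
            if current is None:
                active = agg  # seen before any slave section
            else:
                slaves[current] = agg
    return active, slaves


def outside_active(active, slaves):
    """List slaves whose aggregator differs from the active one."""
    return sorted(name for name, agg in slaves.items() if agg != active)


active, slaves = parse_bonding(SAMPLE)
print("active aggregator:", active)
print("slaves outside it:", outside_active(active, slaves))
```

On a live system the same functions can be fed the contents of
/proc/net/bonding/bond0 instead of SAMPLE.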

Anyone have any ideas?

-- 
albert chin (china@thewrittenword.com)

* Re: Problems with dropped packets on bonded interface for 3.x kernels
From: Eric Dumazet @ 2011-11-21  6:32 UTC
  To: netdev

On Sunday, 20 November 2011 at 23:16 -0600, Albert Chin wrote:
> I'm running Ubuntu 11.10 on an Intel SR2625URLXR system with an Intel
> S5520UR motherboard and an internal Intel E1G44HT (I340-T4) Quad Port
> Server Adapter. I am seeing dropped packets on a bonded interface,
> comprised of two GigE ports on the Intel E1G44HT Quad Port Server
> Adapter. The following kernels exhibit this problem:
>   3.0.0-12-server, 3.0.0-13-server, 3.1.0-2-server, 3.2.0-rc2
> Installing Fedora 16 with a 3.1.1-1.fc16.x86_64 also showed dropped
> packets.
> 
> I also tried RHEL6 with a 2.6.32-131.17.1.el6.x86_64 kernel and didn't
> see any dropped packets. Testing an older 2.6.32-28.55-generic Ubuntu
> kernel also didn't show any dropped packets.
> 
> So, with 2.6.32, I don't see dropped packets, but every kernel I tried
> from 3.0 onward shows dropped packets.
> 
> # ifconfig bond0
> bond0     Link encap:Ethernet  HWaddr 00:1b:21:d3:f6:0a  
>           inet6 addr: fe80::21b:21ff:fed3:f60a/64 Scope:Link
>           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>           RX packets:225 errors:0 dropped:186 overruns:0 frame:0
>           TX packets:231 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0 
>           RX bytes:25450 (25.4 KB)  TX bytes:28368 (28.3 KB)
> 
> [[ snip ]]
> 
> Anyone have any ideas?
> 

Old kernels were dropping some packets (unknown protocols...) without
counting them.

You could use tcpdump to identify what these dropped packets are :)

The following patch, added in 2.6.37, makes the kernel count them:

commit caf586e5f23cebb2a68cbaf288d59dbbf2d74052
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Thu Sep 30 21:06:55 2010 +0000

    net: add a core netdev->rx_dropped counter
    
    In various situations, a device provides a packet to our stack and we
    drop it before it enters protocol stack :
    - softnet backlog full (accounted in /proc/net/softnet_stat)
    - bad vlan tag (not accounted)
    - unknown/unregistered protocol (not accounted)
    
    We can handle a per-device counter of such dropped frames at core level,
    and automatically adds it to the device provided stats (rx_dropped), so
    that standard tools can be used (ifconfig, ip link, cat /proc/net/dev)
    
    This is a generalization of commit 8990f468a (net: rx_dropped
    accounting), thus reverting it.
    
    Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
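
The commit message above names /proc/net/dev as one place where the
rx_dropped counter shows up. A minimal Python sketch for reading the
receive "drop" column from that file follows. The function name is
illustrative, and the sample line reuses the bond0 counters from the
ifconfig output earlier in the thread; columns that ifconfig does not
show are filled with 0 as placeholders.

```python
# Two header lines, then one line per interface. The bond0 numbers come
# from the ifconfig output in this thread; other columns are placeholders.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
 bond0:   25450     225    0  186    0     0          0         0    28368     231    0    0    0     0       0          0
"""


def parse_rx_dropped(text):
    """Map interface name -> receive 'drop' counter from /proc/net/dev text."""
    dropped = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        # Receive columns: bytes packets errs drop fifo frame compressed multicast
        dropped[name.strip()] = int(fields[3])
    return dropped


for name, count in sorted(parse_rx_dropped(SAMPLE).items()):
    print(f"{name}: rx_dropped={count}")
```

On a live system, pass the contents of /proc/net/dev instead of SAMPLE.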

* Re: Problems with dropped packets on bonded interface for 3.x kernels
From: Albert Chin @ 2011-11-21  7:44 UTC
  To: netdev

On Mon, Nov 21, 2011 at 07:32:03AM +0100, Eric Dumazet wrote:
> On Sunday, 20 November 2011 at 23:16 -0600, Albert Chin wrote:
> > I'm running Ubuntu 11.10 on an Intel SR2625URLXR system with an Intel
> > S5520UR motherboard and an internal Intel E1G44HT (I340-T4) Quad Port
> > Server Adapter. I am seeing dropped packets on a bonded interface,
> > comprised of two GigE ports on the Intel E1G44HT Quad Port Server
> > Adapter. The following kernels exhibit this problem:
> >   3.0.0-12-server, 3.0.0-13-server, 3.1.0-2-server, 3.2.0-rc2
> > Installing Fedora 16 with a 3.1.1-1.fc16.x86_64 also showed dropped
> > packets.
> > 
> > I also tried RHEL6 with a 2.6.32-131.17.1.el6.x86_64 kernel and didn't
> > see any dropped packets. Testing an older 2.6.32-28.55-generic Ubuntu
> > kernel also didn't show any dropped packets.
> > 
> > So, with 2.6.32, I don't see dropped packets, but every kernel I tried
> > from 3.0 onward shows dropped packets.
> > 
> > # ifconfig bond0
> > bond0     Link encap:Ethernet  HWaddr 00:1b:21:d3:f6:0a  
> >           inet6 addr: fe80::21b:21ff:fed3:f60a/64 Scope:Link
> >           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
> >           RX packets:225 errors:0 dropped:186 overruns:0 frame:0
> >           TX packets:231 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0 
> >           RX bytes:25450 (25.4 KB)  TX bytes:28368 (28.3 KB)
> > 
> > [[ snip snip ]]
> 
> Old kernels were dropping some packets (unknown protocols...) without
> counting them.
> 
> So the following patch was added in 2.6.37:
> 
> You could use tcpdump to identify what these dropped packets are :)

So only "unknown" protocols are dropped? I just ran tcpdump for 96
packets and inspected the output. The only packets received are LACP,
ARP, STP (Spanning Tree Protocol), DTP (Dynamic Trunking Protocol).
How are these "unknown"?

> commit caf586e5f23cebb2a68cbaf288d59dbbf2d74052
> Author: Eric Dumazet <eric.dumazet@gmail.com>
> Date:   Thu Sep 30 21:06:55 2010 +0000
> 
>     net: add a core netdev->rx_dropped counter
>     
>     In various situations, a device provides a packet to our stack and we
>     drop it before it enters protocol stack :
>     - softnet backlog full (accounted in /proc/net/softnet_stat)
>     - bad vlan tag (not accounted)
>     - unknown/unregistered protocol (not accounted)
>     
>     We can handle a per-device counter of such dropped frames at core level,
>     and automatically adds it to the device provided stats (rx_dropped), so
>     that standard tools can be used (ifconfig, ip link, cat /proc/net/dev)
>     
>     This is a generalization of commit 8990f468a (net: rx_dropped
>     accounting), thus reverting it.
>     
>     Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>

-- 
albert chin (china@thewrittenword.com)

* Re: Problems with dropped packets on bonded interface for 3.x kernels
From: Eric Dumazet @ 2011-11-21  8:02 UTC
  To: netdev

On Monday, 21 November 2011 at 01:44 -0600, Albert Chin wrote:

> So only "unknown" protocols are dropped? I just ran tcpdump for 96
> packets and inspected the output. The only packets received are LACP,
> ARP, STP (Spanning Tree Protocol), DTP (Dynamic Trunking Protocol).
> How are these "unknown"?

No protocol handler is set up to analyze some of them on your box.

(ARP is handled of course)

It is like receiving IPv6 packets when IPv6 is not compiled or loaded
into your kernel.

Nothing you have to worry about. We receive plenty of unknown packets
these days...
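
To see which protocol field each captured frame carries, a frame's
Ethernet header can be classified as in the Python sketch below. This
is only an illustration: the function name and labels are made up, the
table lists just a few well-known EtherType values, VLAN-tagged frames
are not handled, and whether a given type has a handler depends on the
running kernel (on Linux, /proc/net/ptype lists the registered ones).

```python
import struct

# A few well-known EtherType values; anything else is reported numerically.
ETHERTYPES = {
    0x0800: "IPv4",
    0x0806: "ARP",
    0x86DD: "IPv6",
    0x8809: "Slow Protocols (LACP is one subtype)",
}


def classify(frame: bytes) -> str:
    """Name the protocol field of an untagged Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 12-13 hold either an EtherType or an 802.3 length.
    (field,) = struct.unpack("!H", frame[12:14])
    if field < 0x0600:
        # Below 0x0600 the field is a length and an LLC header follows.
        dsap = frame[14] if len(frame) > 14 else None
        if dsap == 0x42:
            return "802.3/LLC, DSAP 0x42 (spanning tree)"
        if dsap == 0xAA:
            return "802.3/LLC, SNAP"
        return "802.3/LLC"
    return ETHERTYPES.get(field, f"EtherType 0x{field:04x}")


# Header bytes only: destination MAC, source MAC, then the type field.
arp = bytes.fromhex("ffffffffffff" "0004961854d5" "0806")
print(classify(arp))
```

The frames to classify could come from a tcpdump capture file; reading
that file is left out of the sketch.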

* Re: Problems with dropped packets on bonded interface for 3.x kernels
From: Albert Chin @ 2011-11-21  8:08 UTC
  To: netdev

On Mon, Nov 21, 2011 at 09:02:29AM +0100, Eric Dumazet wrote:
> On Monday, 21 November 2011 at 01:44 -0600, Albert Chin wrote:
> 
> > So only "unknown" protocols are dropped? I just ran tcpdump for 96
> > packets and inspected the output. The only packets received are LACP,
> > ARP, STP (Spanning Tree Protocol), DTP (Dynamic Trunking Protocol).
> > How are these "unknown"?
> 
> No protocol handler is set up to analyze some of them on your box.
> 
> (ARP is handled of course)
> 
> It is like receiving IPv6 packets when IPv6 is not compiled or loaded
> into your kernel.
> 
> Nothing you have to worry about. We receive plenty of unknown packets
> these days...

Ok. If the above is the case though, I should see dropped packets on
all of my interfaces. But, I only see it on the bonded interface.
Kinda odd.
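
One way to check this observation is to compare how rx_dropped grows
on the bond and on each slave over the same interval. The Python
sketch below assumes the counters are exposed under
/sys/class/net/<iface>/statistics/rx_dropped; the function names and
the 10-second interval are illustrative, not something from this
thread.

```python
import os
import time


def read_rx_dropped(root="/sys/class/net"):
    """Map interface name -> rx_dropped counter read from sysfs."""
    counts = {}
    for iface in sorted(os.listdir(root)):
        path = os.path.join(root, iface, "statistics", "rx_dropped")
        try:
            with open(path) as f:
                counts[iface] = int(f.read())
        except (OSError, ValueError):
            # Entries without a readable counter are skipped.
            continue
    return counts


def drop_deltas(before, after):
    """Per-interface increase between two snapshots."""
    return {name: after[name] - before[name] for name in after if name in before}


def watch(seconds=10, root="/sys/class/net"):
    """Take two snapshots `seconds` apart and return the increases."""
    before = read_rx_dropped(root)
    time.sleep(seconds)
    return drop_deltas(before, read_rx_dropped(root))
```

Calling watch() on the affected machine and printing the result would
show whether the counter moves only on bond0 or on eth4 and eth5 as
well.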

Thanks.

-- 
albert chin (china@thewrittenword.com)
