* Difficulties to get 1Gbps on be2net ethernet card
@ 2012-05-29 14:46 Jean-Michel Hautbois
2012-05-30 6:28 ` Jean-Michel Hautbois
0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-05-29 14:46 UTC (permalink / raw)
To: netdev
Hi list,
I am using an NC553i ethernet card connected to an HP 10GbE Flex-10.
I am sending UDP multicast packets from one blade to another (HP
ProLiant BL460c G7) which has strictly the same HW.
I see a lot of packet loss from Tx to Rx, and I can't understand why.
I suspected TX interrupt coalescing, but since 3.4 I can't set this
parameter (and adaptive-tx is on by default).
I have tried the same test with a Debian lenny (2.6.26 kernel and HP
drivers) and it works very well (adaptive-tx is off).
Here is the netstat diff (from the Tx point of view):
$> netstat -s eth1 > before ; sleep 10 ; netstat -s eth1 > after
$> beforeafter before after
Ip:
280769 total packets received
4 with invalid addresses
0 forwarded
0 incoming packets discarded
275063 incoming packets delivered
305430 requests sent out
0 dropped because of missing route
Icmp:
0 ICMP messages received
0 input ICMP message failed.
ICMP input histogram:
destination unreachable: 0
echo requests: 0
0 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
destination unreachable: 0
echo replies: 0
IcmpMsg:
InType3: 0
InType8: 0
OutType0: 0
OutType3: 0
Tcp:
18 active connections openings
18 passive connection openings
0 failed connection attempts
0 connection resets received
0 connections established
3681 segments received
3650 segments send out
0 segments retransmited
0 bad segments received.
0 resets sent
Udp:
12626 packets received
0 packets to unknown port received.
0 packet receive errors
259025 packets sent
UdpLite:
TcpExt:
0 invalid SYN cookies received
0 packets pruned from receive queue because of socket buffer overrun
14 TCP sockets finished time wait in fast timer
0 packets rejects in established connections because of timestamp
61 delayed acks sent
0 delayed acks further delayed because of locked socket
Quick ack mode was activated 0 times
2924 packets directly queued to recvmsg prequeue.
32 bytes directly in process context from backlog
48684 bytes directly received in process context from prequeue
232 packet headers predicted
1991 packets header predicted and directly queued to user
132 acknowledgments not containing data payload received
2230 predicted acknowledgments
0 times recovered from packet loss by selective acknowledgements
0 congestion windows recovered without slow start after partial ack
0 TCP data loss events
0 timeouts after SACK recovery
0 fast retransmits
0 forward retransmits
0 retransmits in slow start
0 other TCP timeouts
1 times receiver scheduled too late for direct processing
0 packets collapsed in receive queue due to low socket buffer
0 DSACKs sent for old packets
0 DSACKs received
0 connections reset due to unexpected data
0 connections reset due to early user close
0 connections aborted due to timeout
0 times unabled to send RST due to no memory
TCPSackShifted: 0
TCPSackMerged: 0
TCPSackShiftFallback: 0
TCPBacklogDrop: 0
TCPDeferAcceptDrop: 0
IpExt:
InMcastPkts: -652745397
OutMcastPkts: 301498
InBcastPkts: 13
InOctets: -2004227752
OutOctets: -2096666083
InMcastOctets: 1058181285
OutMcastOctets: -1510963815
InBcastOctets: 1014
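
[Note on the negative IpExt values above: a negative delta from a
before/after subtraction usually means the underlying counter is 32-bit
and wrapped between the two snapshots; the usual first check is to
reinterpret the delta as unsigned 32-bit. That said, the recovered
values here do not all line up with the ethtool byte counts below, so
these IpExt counters deserve some suspicion either way. A minimal shell
sketch, using the OutMcastOctets delta as the example:]

$> echo $(( (-1510963815) & 0xffffffff ))   # reinterpret the wrapped delta as unsigned 32-bit
2784003481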
And the ethtool diff:
$> ethtool -S eth1 > before ; sleep 10 ; ethtool -S eth1 > after
$> beforeafter before after
NIC statistics:
rx_crc_errors: 0
rx_alignment_symbol_errors: 0
rx_pause_frames: 0
rx_control_frames: 0
rx_in_range_errors: 0
rx_out_range_errors: 0
rx_frame_too_long: 0
rx_address_mismatch_drops: 6
rx_dropped_too_small: 0
rx_dropped_too_short: 0
rx_dropped_header_too_small: 0
rx_dropped_tcp_length: 0
rx_dropped_runt: 0
rxpp_fifo_overflow_drop: 0
rx_input_fifo_overflow_drop: 0
rx_ip_checksum_errs: 0
rx_tcp_checksum_errs: 0
rx_udp_checksum_errs: 0
tx_pauseframes: 0
tx_controlframes: 0
rx_priority_pause_frames: 0
pmem_fifo_overflow_drop: 0
jabber_events: 0
rx_drops_no_pbuf: 0
rx_drops_no_erx_descr: 0
rx_drops_no_tpre_descr: 0
rx_drops_too_many_frags: 0
forwarded_packets: 0
rx_drops_mtu: 0
eth_red_drops: 0
be_on_die_temperature: 0
rxq0: rx_bytes: 0
rxq0: rx_pkts: 0
rxq0: rx_compl: 0
rxq0: rx_mcast_pkts: 0
rxq0: rx_post_fail: 0
rxq0: rx_drops_no_skbs: 0
rxq0: rx_drops_no_frags: 0
txq0: tx_compl: 257113
txq0: tx_bytes: 1038623935
txq0: tx_pkts: 257113
txq0: tx_reqs: 257113
txq0: tx_wrbs: 514226
txq0: tx_stops: 10
As you can see, there are 10 tx_stops in 10 seconds (it varies, anywhere from 3 to 15).
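
[Note: tx_stops presumably counts the times the driver had to stop the
TX queue because the hardware TX ring filled faster than completions
were reclaimed; a quick way to watch its rate, with the interface name
taken from above:]

$> while sleep 1; do ethtool -S eth1 | grep tx_stops; done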
Any thoughts?
Regards,
JM
^ permalink raw reply [flat|nested] 31+ messages in thread

* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-29 14:46 Difficulties to get 1Gbps on be2net ethernet card Jean-Michel Hautbois
@ 2012-05-30 6:28 ` Jean-Michel Hautbois
  2012-05-30 6:48   ` Eric Dumazet
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-05-30 6:28 UTC (permalink / raw)
To: netdev

2012/5/29 Jean-Michel Hautbois <jhautbois@gmail.com>:
> [full problem description and statistics quoted verbatim from the message above trimmed]

If this can help, setting the tx queue length to 5000 seems to make the
problem disappear.
I didn't specify it earlier: the MTU is 4096 and the UDP packets are
4000 bytes long.

JM

^ permalink raw reply [flat|nested] 31+ messages in thread
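
[Note: for anyone reproducing this, the tx queue length change
mentioned above would be made with either of the usual commands;
the interface name is the one used throughout the thread:]

$> ip link set dev eth1 txqueuelen 5000
$> ifconfig eth1 txqueuelen 5000    # older tool, same effect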
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 6:28 ` Jean-Michel Hautbois
@ 2012-05-30 6:48   ` Eric Dumazet
  2012-05-30 6:51     ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-05-30 6:48 UTC (permalink / raw)
To: Jean-Michel Hautbois; +Cc: netdev

On Wed, 2012-05-30 at 08:28 +0200, Jean-Michel Hautbois wrote:
> If this can help, setting the tx queue length to 5000 seems to make the
> problem disappear.

Then you should have drops at the Qdisc layer (before your change to 5000):

tc -s -d qdisc

> I didn't specify it earlier: the MTU is 4096 and the UDP packets are
> 4000 bytes long.

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 6:48   ` Eric Dumazet
@ 2012-05-30 6:51     ` Jean-Michel Hautbois
  2012-05-30 7:06       ` Eric Dumazet
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-05-30 6:51 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev

2012/5/30 Eric Dumazet <eric.dumazet@gmail.com>:
> Then you should have drops at the Qdisc layer (before your change to 5000):
>
> tc -s -d qdisc

Yes:

qdisc mq 0: dev eth1 root
 Sent 5710049154383 bytes 1413544639 pkt (dropped 73078, overlimits 0
requeues 281540)
 backlog 0b 0p requeues 281540

Why? With a 2.6.26 kernel it works well with a tx queue length of 1000.

^ permalink raw reply [flat|nested] 31+ messages in thread
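
[Note: quick arithmetic on what the default queue can absorb, using the
packet size from the thread and ignoring IP/UDP header overhead, so
these are rough figures: at the nominal 1Gbps, 4000-byte datagrams
reach the qdisc at roughly 31250 packets/s, so the default 1000-packet
queue (before the change to 5000) absorbs only ~32ms of a stalled
transmitter:]

$> echo $(( 1000000000 / 8 / 4000 ))   # ~31250 packets/s at 1Gbps with 4000-byte datagrams
$> echo $(( 1000 * 1000000 / 31250 ))  # microseconds of stall a 1000-packet queue absorbs: 32000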
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 6:51     ` Jean-Michel Hautbois
@ 2012-05-30 7:06       ` Eric Dumazet
  2012-05-30 7:25         ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-05-30 7:06 UTC (permalink / raw)
To: Jean-Michel Hautbois; +Cc: netdev

On Wed, 2012-05-30 at 08:51 +0200, Jean-Michel Hautbois wrote:
> Yes:
>
> qdisc mq 0: dev eth1 root
>  Sent 5710049154383 bytes 1413544639 pkt (dropped 73078, overlimits 0
> requeues 281540)
>  backlog 0b 0p requeues 281540
>
> Why? With a 2.6.26 kernel it works well with a tx queue length of 1000.

If you send big bursts of packets, then you need a large enough queue.

Maybe your kernel is now faster than before and the queue fills faster,
or the TX ring is smaller?

ethtool -g eth0

Note that everybody tries to reduce dumb queue sizes because of latencies.

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 7:06       ` Eric Dumazet
@ 2012-05-30 7:25         ` Jean-Michel Hautbois
  2012-05-30 9:40           ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-05-30 7:25 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev

2012/5/30 Eric Dumazet <eric.dumazet@gmail.com>:
> Maybe your kernel is now faster than before and the queue fills faster,
> or the TX ring is smaller?
>
> ethtool -g eth0

The TX ring is not the same.

On 3.2:
$> ethtool -g eth1
Ring parameters for eth1:
Pre-set maximums:
RX:             1024
RX Mini:        0
RX Jumbo:       0
TX:             2048
Current hardware settings:
RX:             1024
RX Mini:        0
RX Jumbo:       0
TX:             2048

On 2.6.26:
$> ethtool -g eth1
Ring parameters for eth1:
Pre-set maximums:
RX:             1024
RX Mini:        0
RX Jumbo:       0
TX:             2048
Current hardware settings:
RX:             1003
RX Mini:        0
RX Jumbo:       0
TX:             0

I can't set the TX ring: "ethtool -G eth1 tx N" returns "operation not supported".
I am not really impacted by latency, but the lower the better.

JM

^ permalink raw reply [flat|nested] 31+ messages in thread
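
[Note: in the same rough spirit as the earlier arithmetic, the
2048-entry TX ring reported above holds about 65ms of traffic at
~31250 packets/s, so once the ring and the qdisc queue are both full,
any longer completion stall shows up directly as qdisc drops:]

$> echo $(( 2048 * 1000000 / 31250 ))   # microseconds of traffic a 2048-slot TX ring holds: 65536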
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 7:25         ` Jean-Michel Hautbois
@ 2012-05-30 9:40           ` Jean-Michel Hautbois
  2012-05-30 9:56             ` Eric Dumazet
  2012-05-30 10:04            ` Sathya.Perla
  0 siblings, 2 replies; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-05-30 9:40 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev

2012/5/30 Jean-Michel Hautbois <jhautbois@gmail.com>:
> [ethtool -g comparison quoted from the message above trimmed]

I used vmstat in order to see the differences between the two kernels.
The main difference is the number of interrupts per second: I get an
average of 87500 on 3.2 versus 7500 on 2.6, more than 10 times fewer!
I suspect the be2net driver to be the main cause, and I checked
/proc/interrupts to be sure.

On 2.6.26 I see about 2200 interrupts per second for eth1-tx, versus
23000 on 3.2.
BTW, the queue is named eth1-q0 on 3.2 (tx and rx share the same IRQ)
whereas 2.6.26 has separate eth1-rx0 and eth1-tx vectors.

JM

^ permalink raw reply [flat|nested] 31+ messages in thread
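
[Note: the per-IRQ rates above can also be measured directly by diffing
/proc/interrupts; a minimal sketch, assuming the vector names contain
"eth1" as shown in the message (the awk simply sums the per-CPU
columns of each matching line):]

$> grep eth1 /proc/interrupts | \
   awk '{s=0; for (i=2; i<=NF; i++) if ($i ~ /^[0-9]+$/) s+=$i; print $NF, s}'
$> sleep 10   # then run the same pipeline again and divide the difference by 10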
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 9:40           ` Jean-Michel Hautbois
@ 2012-05-30 9:56             ` Eric Dumazet
  2012-05-30 10:06              ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-05-30 9:56 UTC (permalink / raw)
To: Jean-Michel Hautbois; +Cc: netdev

On Wed, 2012-05-30 at 11:40 +0200, Jean-Michel Hautbois wrote:
> I used vmstat in order to see the differences between the two kernels.
> The main difference is the number of interrupts per second: I get an
> average of 87500 on 3.2 versus 7500 on 2.6, more than 10 times fewer!

Might be different coalescing params:

ethtool -c eth1

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 9:56             ` Eric Dumazet
@ 2012-05-30 10:06              ` Jean-Michel Hautbois
  0 siblings, 0 replies; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-05-30 10:06 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev

2012/5/30 Eric Dumazet <eric.dumazet@gmail.com>:
> Might be different coalescing params:
>
> ethtool -c eth1

Yes, as stated in my first e-mail, this is different: on 2.6.26 the
adaptive-tx coalescing is off, while it is on for 3.4 (sorry, I said
3.2 before, but it is 3.4). But I can't change this setting since
commit 10ef9ab...

JM

^ permalink raw reply [flat|nested] 31+ messages in thread
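
[Note: the knob being discussed is the standard ethtool coalescing
interface; presumably the command that now fails is something like the
following (an assumption on the editor's part: the thread only says the
setting can no longer be changed since commit 10ef9ab...):]

$> ethtool -c eth1                   # show current coalescing parameters
$> ethtool -C eth1 adaptive-tx off   # presumably rejected by the newer driver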
* RE: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 9:40           ` Jean-Michel Hautbois
  2012-05-30 9:56             ` Eric Dumazet
@ 2012-05-30 10:04            ` Sathya.Perla
  2012-05-30 10:07              ` Jean-Michel Hautbois
  ` (2 more replies)
  1 sibling, 3 replies; 31+ messages in thread
From: Sathya.Perla @ 2012-05-30 10:04 UTC (permalink / raw)
To: jhautbois, eric.dumazet; +Cc: netdev

>-----Original Message-----
>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>Behalf Of Jean-Michel Hautbois
>
>On 2.6.26 I see about 2200 interrupts per second for eth1-tx, versus
>23000 on 3.2.
>BTW, the queue is named eth1-q0 on 3.2 (tx and rx share the same IRQ)
>whereas 2.6.26 has separate eth1-rx0 and eth1-tx vectors.

Yes, there is an issue with be2net interrupt mitigation in the recent
code with RX and TX on the same Evt-Q (commit 10ef9ab4). The high
interrupt rate happens when a TX blast is done while RX is relatively
silent on a queue pair. The interrupt rate due to TX completions is not
being mitigated.

I have a fix and will send it out soon.

thanks,
-Sathya

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 10:04            ` Sathya.Perla
@ 2012-05-30 10:07              ` Jean-Michel Hautbois
  2012-06-06 10:04               ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-05-30 10:07 UTC (permalink / raw)
To: Sathya.Perla; +Cc: eric.dumazet, netdev

2012/5/30 <Sathya.Perla@emulex.com>:
> Yes, there is an issue with be2net interrupt mitigation in the recent
> code with RX and TX on the same Evt-Q (commit 10ef9ab4). [...]
>
> I have a fix and will send it out soon.

Hi Sathya!
Thanks for this information! I had the correct diagnosis :).
I am waiting for your fix.

Regards,
JM

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-05-30 10:07              ` Jean-Michel Hautbois
@ 2012-06-06 10:04               ` Jean-Michel Hautbois
  2012-06-06 11:01                ` Eric Dumazet
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-06 10:04 UTC (permalink / raw)
To: Sathya.Perla; +Cc: eric.dumazet, netdev

2012/5/30 Jean-Michel Hautbois <jhautbois@gmail.com>:
> [earlier discussion of the be2net interrupt mitigation issue trimmed]

Well, well, well: after testing several configurations and several
drivers, I still see a big difference between an old 2.6.26 kernel and
a newer one (I tried 3.2 and 3.4).

Here is my stream: UDP packets (multicast), 4000 bytes long, MTU set
to 4096. I am sending packets only, nothing on RX.
I send from 1Gbps up to 2.4Gbps, and I see no drops in tc with the
2.6.26 kernel, but a lot of drops with a newer kernel.
I don't know if I missed something in my kernel configuration, but I
used the 2.6.26 one as a reference in order to set the same options
(DMA related, etc.).

I can reproduce this problem easily, and setting a bigger txqueuelen
solves it partially: 1Gbps requires a txqueuelen of 9000, and 2.4Gbps
requires more than 20000!

If you have any ideas, I am interested, as this is a big issue for my
use case.

JM

^ permalink raw reply [flat|nested] 31+ messages in thread
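
[Note: extending the earlier arithmetic, 2.4Gbps of 4000-byte datagrams
is roughly 75000 packets/s, so a 20000-packet queue buffers about a
quarter of a second of traffic, which points at long TX stalls rather
than ordinary burst smoothing (rough figures, header overhead ignored):]

$> echo $(( 2400000000 / 8 / 4000 ))   # ~75000 packets/s at 2.4Gbps
$> echo $(( 20000 * 1000 / 75000 ))    # ms of traffic held by a 20000-packet queue: ~266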
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-06 10:04               ` Jean-Michel Hautbois
@ 2012-06-06 11:01                ` Eric Dumazet
  2012-06-06 12:34                 ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-06-06 11:01 UTC (permalink / raw)
To: Jean-Michel Hautbois; +Cc: Sathya.Perla, netdev

On Wed, 2012-06-06 at 12:04 +0200, Jean-Michel Hautbois wrote:
> I can reproduce this problem easily, and setting a bigger txqueuelen
> solves it partially: 1Gbps requires a txqueuelen of 9000, and 2.4Gbps
> requires more than 20000!

Yep.

This driver wants to limit the number of tx completions, and that's just wrong.

Fix and dirty patch:

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index c5c4c0e..1e8f8a6 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -105,7 +105,7 @@ static inline char *nic_name(struct pci_dev *pdev)
 #define MAX_TX_QS		8
 #define MAX_ROCE_EQS		5
 #define MAX_MSIX_VECTORS	(MAX_RSS_QS + MAX_ROCE_EQS)	/* RSS qs + RoCE */
-#define BE_TX_BUDGET		256
+#define BE_TX_BUDGET		65535
 #define BE_NAPI_WEIGHT		64
 #define MAX_RX_POST		BE_NAPI_WEIGHT /* Frags posted at a time */
 #define RX_FRAGS_REFILL_WM	(RX_Q_LEN - MAX_RX_POST)

^ permalink raw reply related [flat|nested] 31+ messages in thread
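
[Note: a reading of the patch above, not wording from the thread.
BE_TX_BUDGET appears to cap how many TX completions the driver reclaims
per NAPI poll, so at 256 a sustained blast can refill the 2048-entry TX
ring faster than it is drained, leaving the queue stopped and pushing
the backlog (and drops) up into the qdisc; raising the budget lets each
poll reclaim the whole ring.]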
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-06 11:01                ` Eric Dumazet
@ 2012-06-06 12:34                 ` Jean-Michel Hautbois
  2012-06-06 13:07                  ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-06 12:34 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/6 Eric Dumazet <eric.dumazet@gmail.com>:
> This driver wants to limit the number of tx completions, and that's just wrong.
>
> Fix and dirty patch:
> [BE_TX_BUDGET patch trimmed]

I will try that in a few minutes.
I also have a mlx4 driver (mlx4_en) which shows a similar behaviour,
and a Broadcom one (bnx2x).

JM

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-06 12:34                 ` Jean-Michel Hautbois
@ 2012-06-06 13:07                  ` Jean-Michel Hautbois
  2012-06-06 14:36                   ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-06 13:07 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/6 Jean-Michel Hautbois <jhautbois@gmail.com>:
> I will try that in a few minutes.
> I also have a mlx4 driver (mlx4_en) which shows a similar behaviour,
> and a Broadcom one (bnx2x).

And it is not really better: I still need a txqueuelen of about 18000
at 2.4Gbps in order to avoid drops...
I really think there is something in the networking stack or in my
configuration (DMA? Something else?), as it doesn't seem to be driver
related, as I said, since several drivers show the same behaviour.

JM

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-06 13:07                  ` Jean-Michel Hautbois
@ 2012-06-06 14:36                   ` Jean-Michel Hautbois
  2012-06-07 12:27                    ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-06 14:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/6 Jean-Michel Hautbois <jhautbois@gmail.com>:
> And it is not really better: I still need a txqueuelen of about 18000
> at 2.4Gbps in order to avoid drops...

If it can help, on a 3.0 kernel a txqueuelen of 9000 is sufficient in
order to reach this bandwidth on TX.

JM

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-06 14:36                   ` Jean-Michel Hautbois
@ 2012-06-07 12:27                    ` Jean-Michel Hautbois
  2012-06-07 12:31                     ` Eric Dumazet
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-07 12:27 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/6 Jean-Michel Hautbois <jhautbois@gmail.com>:
> If it can help, on a 3.0 kernel a txqueuelen of 9000 is sufficient in
> order to reach this bandwidth on TX.

All,

I made some more tests, and I didn't mention one thing: I am using the
bonding driver on top of my ethernet drivers (be2net/mlx4, etc.).
When I use bonding, I need a big txqueuelen in order to send 2.4Gbps.
When I disable bonding and use the NIC directly, I don't see any drops
in the qdisc and it works well.
So I think something changed between 2.6.26 and 3.0 in the bonding
driver which causes this issue.

Any help would be appreciated :).

Regards,
JM

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-07 12:27                    ` Jean-Michel Hautbois
@ 2012-06-07 12:31                     ` Eric Dumazet
  2012-06-07 12:54                      ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-06-07 12:31 UTC (permalink / raw)
To: Jean-Michel Hautbois; +Cc: Sathya.Perla, netdev

On Thu, 2012-06-07 at 14:27 +0200, Jean-Michel Hautbois wrote:
> When I use bonding, I need a big txqueuelen in order to send 2.4Gbps.
> When I disable bonding and use the NIC directly, I don't see any drops
> in the qdisc and it works well.
> So I think something changed between 2.6.26 and 3.0 in the bonding
> driver which causes this issue.

What does your bond configuration look like?

cat /proc/net/bonding/bond0
ifconfig -a
tc -s -d qdisc

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-07 12:31                     ` Eric Dumazet
@ 2012-06-07 12:54                      ` Jean-Michel Hautbois
  2012-06-08 6:08                       ` Eric Dumazet
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-07 12:54 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/7 Eric Dumazet <eric.dumazet@gmail.com>:
> What does your bond configuration look like?
>
> cat /proc/net/bonding/bond0

cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 50
Up Delay (ms): 100
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Speed: 4000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 68:b5:99:b9:8d:d4
Slave queue ID: 0

Slave Interface: eth9
MII Status: up
Speed: 4000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 78:e7:d1:68:bb:38
Slave queue ID: 0

> ifconfig -a

bond1     Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
          inet addr:192.168.2.1  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::6ab5:99ff:feb9:8dd4/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:4096  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:61476524359 (57.2 GiB)

eth1      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:4096  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:61476524359 (57.2 GiB)

eth9      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:4096  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

[output for bond0, bond2-bond7, bond3.4, bond6.7, the remaining eth
interfaces and lo, all idle or carrying only management traffic,
trimmed]

> tc -s -d qdisc

tc -s -d qdisc
qdisc mq 0: dev eth1 root
 Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0
requeues 17480)
 backlog 0b 0p requeues 17480

[all-zero qdisc statistics for the other interfaces trimmed]

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-07 12:54                      ` Jean-Michel Hautbois
@ 2012-06-08 6:08                       ` Eric Dumazet
  2012-06-08 8:14                        ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-06-08 6:08 UTC (permalink / raw)
To: Jean-Michel Hautbois; +Cc: Sathya.Perla, netdev

On Thu, 2012-06-07 at 14:54 +0200, Jean-Michel Hautbois wrote:
> eth1      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:4096  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:0 (0.0 B)  TX bytes:61476524359 (57.2 GiB)

> qdisc mq 0: dev eth1 root
>  Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0
> requeues 17480)

OK, and "tc -s -d cl show dev eth1"?

(How many queues are really used)

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-08 6:08                       ` Eric Dumazet
@ 2012-06-08 8:14                        ` Jean-Michel Hautbois
  2012-06-08 8:22                         ` Eric Dumazet
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-08 8:14 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
> OK, and "tc -s -d cl show dev eth1"?
>
> (How many queues are really used)

tc -s -d cl show dev eth1
class mq :1 root
 Sent 9798071746 bytes 2425410 pkt (dropped 3442405, overlimits 0 requeues 2747)
 backlog 0b 0p requeues 2747
class mq :2 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :3 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :4 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :5 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :6 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :7 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq :8 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

^ permalink raw reply [flat|nested] 31+ messages in thread
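
[Note: with the mq scheduler, each hardware TX queue gets its own child
qdisc (pfifo_fast by default, limited to txqueuelen packets), so with
all traffic hashed to class :1 only a single 1000-packet buffer is in
play. The queue layout can be checked with, for example:]

$> ls /sys/class/net/eth1/queues/          # one tx-N directory per hardware TX queue
$> cat /sys/class/net/eth1/tx_queue_len    # the per-queue pfifo_fast limit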
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-08 8:14                        ` Jean-Michel Hautbois
@ 2012-06-08 8:22                         ` Eric Dumazet
  2012-06-08 14:53                          ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-06-08 8:22 UTC (permalink / raw)
To: Jean-Michel Hautbois; +Cc: Sathya.Perla, netdev

On Fri, 2012-06-08 at 10:14 +0200, Jean-Michel Hautbois wrote:
> tc -s -d cl show dev eth1
> class mq :1 root
>  Sent 9798071746 bytes 2425410 pkt (dropped 3442405, overlimits 0 requeues 2747)
>  backlog 0b 0p requeues 2747
> [all-zero classes :2 to :8 trimmed]

Do you have the same distribution on old kernels as well?
(ie only queue 0 is used)

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Difficulties to get 1Gbps on be2net ethernet card
  2012-06-08 8:22                         ` Eric Dumazet
@ 2012-06-08 14:53                          ` Jean-Michel Hautbois
  2012-06-12 8:24                           ` Jean-Michel Hautbois
  0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-08 14:53 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
> Do you have the same distribution on old kernels as well?
> (ie only queue 0 is used)

On the old kernel, this command returns nothing at all.

JM

^ permalink raw reply [flat|nested] 31+ messages in thread
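
[Note: an empty result on 2.6.26 is expected rather than surprising; if
memory serves, the mq scheduler that exposes these per-queue classes
was only merged around 2.6.32, so the old kernel has a single
device-level queue and no classes to list.]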
* Re: Difficulties to get 1Gbps on be2net ethernet card
2012-06-08 14:53 ` Jean-Michel Hautbois
@ 2012-06-12 8:24 ` Jean-Michel Hautbois
2012-06-12 8:55 ` Eric Dumazet
0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-12 8:24 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/8 Jean-Michel Hautbois <jhautbois@gmail.com>:
> 2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
>> On Fri, 2012-06-08 at 10:14 +0200, Jean-Michel Hautbois wrote:
>>> 2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
>>> > On Thu, 2012-06-07 at 14:54 +0200, Jean-Michel Hautbois wrote:
>>> >
>>> >> eth1 Link encap:Ethernet HWaddr 68:b5:99:b9:8d:d4
>>> >> UP BROADCAST RUNNING SLAVE MULTICAST MTU:4096 Metric:1
>>> >> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>> >> TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
>>> >> collisions:0 txqueuelen:1000
>>> >> RX bytes:0 (0.0 B) TX bytes:61476524359 (57.2 GiB)
>>> >
>>> >> qdisc mq 0: dev eth1 root
>>> >> Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0
>>> >> requeues 17480)
>>> >
>>> > OK, and "tc -s -d cl show dev eth1"
>>> >
>>> > (How many queues are really used)
>>> >
>>>
>>> tc -s -d cl show dev eth1
>>> class mq :1 root
>>> Sent 9798071746 bytes 2425410 pkt (dropped 3442405, overlimits 0 requeues 2747)
>>> backlog 0b 0p requeues 2747
>>> class mq :2 root
>>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>> backlog 0b 0p requeues 0
>>> class mq :3 root
>>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>> backlog 0b 0p requeues 0
>>> class mq :4 root
>>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>> backlog 0b 0p requeues 0
>>> class mq :5 root
>>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>> backlog 0b 0p requeues 0
>>> class mq :6 root
>>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>> backlog 0b 0p requeues 0
>>> class mq :7 root
>>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>> backlog 0b 0p requeues 0
>>> class mq :8 root
>>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>> backlog 0b 0p requeues 0
>>
>> Do you have the same distribution on old kernels as well ?
>> (ie only queue 0 is used)
>>
>
> On the old kernel, there is nothing returned by this command.
>
> JM

I used perf in order to get more information.
Here is the perf record -a sleep 10 result (I kept only the kernel entries):

6.93% ModuleTester [kernel.kallsyms] [k] copy_user_generic_string
2.99% swapper [kernel.kallsyms] [k] mwait_idle
2.60% kipmi0 [ipmi_si] [k] port_inb
1.75% swapper [kernel.kallsyms] [k] rb_prev
1.63% ModuleTester [kernel.kallsyms] [k] _raw_spin_lock
1.43% NodeManager [kernel.kallsyms] [k] delay_tsc
0.90% ModuleTester [kernel.kallsyms] [k] clear_page_c
0.88% ModuleTester [kernel.kallsyms] [k] dev_queue_xmit
0.80% ip [kernel.kallsyms] [k] snmp_fold_field
0.73% ModuleTester [kernel.kallsyms] [k] clflush_cache_range
0.69% grep [kernel.kallsyms] [k] page_fault
0.61% sh [kernel.kallsyms] [k] page_fault
0.59% ModuleTester [kernel.kallsyms] [k] udp_sendmsg
0.55% ModuleTester [kernel.kallsyms] [k] _raw_read_lock
0.53% sh [kernel.kallsyms] [k] unmap_vmas
0.52% ModuleTester [kernel.kallsyms] [k] rb_prev
0.51% ModuleTester [kernel.kallsyms] [k] find_busiest_group
0.49% ModuleTester [kernel.kallsyms] [k] __ip_make_skb
0.48% ModuleTester [kernel.kallsyms] [k] sock_alloc_send_pskb
0.48% ModuleTester libpthread-2.7.so [.] pthread_mutex_lock
0.47% ModuleTester [kernel.kallsyms] [k] __netif_receive_skb
0.44% ip [kernel.kallsyms] [k] find_next_bit
0.43% swapper [kernel.kallsyms] [k] clflush_cache_range
0.41% ps [kernel.kallsyms] [k] format_decode
0.41% ModuleTester [bonding] [k] bond_start_xmit
0.39% ModuleTester [be2net] [k] be_xmit
0.39% ModuleTester [kernel.kallsyms] [k] __ip_append_data
0.38% ModuleTester [kernel.kallsyms] [k] netif_rx
0.37% swapper [be2net] [k] be_poll
0.37% swapper [kernel.kallsyms] [k] ktime_get
0.37% sh [kernel.kallsyms] [k] copy_page_c
0.36% swapper [kernel.kallsyms] [k] irq_entries_start
0.36% ModuleTester [kernel.kallsyms] [k] __alloc_pages_nodemask
0.35% ModuleTester [kernel.kallsyms] [k] __slab_free
0.35% ModuleTester [kernel.kallsyms] [k] ip_mc_output
0.34% ModuleTester [kernel.kallsyms] [k] skb_release_data
0.33% ip [kernel.kallsyms] [k] page_fault
0.33% ModuleTester [kernel.kallsyms] [k] udp_send_skb

And here is the perf record -a result without bonding:

2.49% ModuleTester [kernel.kallsyms] [k] csum_partial_copy_generic
1.35% ModuleTester [kernel.kallsyms] [k] _raw_spin_lock
1.29% ModuleTester [kernel.kallsyms] [k] clflush_cache_range
1.16% jobprocess [kernel.kallsyms] [k] rb_prev
1.01% jobprocess [kernel.kallsyms] [k] clflush_cache_range
0.81% ModuleTester [be2net] [k] be_xmit
0.78% jobprocess [kernel.kallsyms] [k] __slab_free
0.77% swapper [kernel.kallsyms] [k] mwait_idle
0.72% ModuleTester [kernel.kallsyms] [k] __domain_mapping
0.66% jobprocess [kernel.kallsyms] [k] _raw_spin_lock
0.59% jobprocess [kernel.kallsyms] [k] _raw_spin_lock_irqsave
0.56% ModuleTester [kernel.kallsyms] [k] rb_prev
0.53% swapper [kernel.kallsyms] [k] rb_prev
0.49% ModuleTester [kernel.kallsyms] [k] sock_wmalloc
0.47% jobprocess [be2net] [k] be_poll
0.47% ModuleTester [kernel.kallsyms] [k] kmem_cache_alloc
0.47% swapper [kernel.kallsyms] [k] clflush_cache_range
0.45% kipmi0 [ipmi_si] [k] port_inb
0.42% swapper [kernel.kallsyms] [k] __slab_free
0.41% jobprocess [kernel.kallsyms] [k] try_to_wake_up
0.40% ModuleTester [kernel.kallsyms] [k] kmem_cache_alloc_node
0.40% jobprocess [kernel.kallsyms] [k] tg_load_down
0.39% jobprocess libodyssey.so.1.8.2 [.] y8_deblocking_luma_vert_edge_h264_sse2
0.38% jobprocess libodyssey.so.1.8.2 [.] y8_deblocking_luma_horz_edge_h264_ssse3
0.38% ModuleTester [kernel.kallsyms] [k] rb_insert_color
0.37% jobprocess [kernel.kallsyms] [k] find_iova
0.37% jobprocess [kernel.kallsyms] [k] find_busiest_group
0.36% jobprocess libpthread-2.7.so [.] pthread_mutex_lock
0.35% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.34% ModuleTester [kernel.kallsyms] [k] _raw_spin_lock_irqsave
0.33% ModuleTester [kernel.kallsyms] [k] pfifo_fast_dequeue
0.32% ModuleTester [kernel.kallsyms] [k] __kmalloc_node_track_caller
0.32% jobprocess [be2net] [k] be_tx_compl_process
0.31% ModuleTester [kernel.kallsyms] [k] ip_fragment
0.29% swapper [kernel.kallsyms] [k] __hrtimer_start_range_ns
0.29% jobprocess [kernel.kallsyms] [k] __schedule
0.29% ModuleTester [kernel.kallsyms] [k] dev_queue_xmit
0.28% swapper [kernel.kallsyms] [k] __schedule

The first thing I notice is the difference in copy_user_generic_string
(it is only 0.11% in the second measurement, so I didn't report it here).
I think perf can help in finding the issue I observe with bonding;
do you have suggestions on the parameters to use?
FYI: with bonding, TX tops out at 640 Mbps; without bonding, I can send
2.4 Gbps without any trouble...

JM
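[Note: to compare the bonded and non-bonded profiles more directly, perf's
call-graph recording and perf diff can attribute the extra cycles; a minimal
sketch, assuming the same workload runs in both configurations and the file
names are placeholders:

$> perf record -a -g -o perf.data.bond sleep 10     # during the bonded test
$> perf record -a -g -o perf.data.nobond sleep 10   # during the non-bonded test
$> perf report -i perf.data.bond --sort comm,dso,symbol
$> perf diff perf.data.bond perf.data.nobond        # per-symbol delta between runs

The -g call graphs show who calls copy_user_generic_string, and perf diff
highlights which symbols grew between the two runs.]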
* Re: Difficulties to get 1Gbps on be2net ethernet card
2012-06-12 8:24 ` Jean-Michel Hautbois
@ 2012-06-12 8:55 ` Eric Dumazet
2012-06-12 9:01 ` Jean-Michel Hautbois
0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-06-12 8:55 UTC (permalink / raw)
To: Jean-Michel Hautbois; +Cc: Sathya.Perla, netdev

On Tue, 2012-06-12 at 10:24 +0200, Jean-Michel Hautbois wrote:
> 2012/6/8 Jean-Michel Hautbois <jhautbois@gmail.com>:
> > 2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
> >> On Fri, 2012-06-08 at 10:14 +0200, Jean-Michel Hautbois wrote:
> >>> 2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
> >>> > On Thu, 2012-06-07 at 14:54 +0200, Jean-Michel Hautbois wrote:
> >>> >
> >>> >> eth1 Link encap:Ethernet HWaddr 68:b5:99:b9:8d:d4
> >>> >> UP BROADCAST RUNNING SLAVE MULTICAST MTU:4096 Metric:1
> >>> >> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>> >> TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
> >>> >> collisions:0 txqueuelen:1000
> >>> >> RX bytes:0 (0.0 B) TX bytes:61476524359 (57.2 GiB)
> >>> >
> >>> >> qdisc mq 0: dev eth1 root
> >>> >> Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0
> >>> >> requeues 17480)
> >>> >
> >>> > OK, and "tc -s -d cl show dev eth1"
> >>> >
> >>> > (How many queues are really used)
> >>> >
> >>>
> >>> tc -s -d cl show dev eth1
> >>> class mq :1 root
> >>> Sent 9798071746 bytes 2425410 pkt (dropped 3442405, overlimits 0 requeues 2747)
> >>> backlog 0b 0p requeues 2747
> >>> class mq :2 root
> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>> backlog 0b 0p requeues 0
> >>> class mq :3 root
> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>> backlog 0b 0p requeues 0
> >>> class mq :4 root
> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>> backlog 0b 0p requeues 0
> >>> class mq :5 root
> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>> backlog 0b 0p requeues 0
> >>> class mq :6 root
> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>> backlog 0b 0p requeues 0
> >>> class mq :7 root
> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>> backlog 0b 0p requeues 0
> >>> class mq :8 root
> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>> backlog 0b 0p requeues 0
> >>
> >> Do you have the same distribution on old kernels as well ?
> >> (ie only queue 0 is used)
> >>
> >
> > On the old kernel, there is nothing returned by this command.
> >
> > JM
>
> I used perf in order to get more information.

What happens if you force some traffic in the other way (say 50,000
(small) packets per second in RX) while doing your tests ?
* Re: Difficulties to get 1Gbps on be2net ethernet card
2012-06-12 8:55 ` Eric Dumazet
@ 2012-06-12 9:01 ` Jean-Michel Hautbois
2012-06-12 9:06 ` Eric Dumazet
0 siblings, 1 reply; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-12 9:01 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/12 Eric Dumazet <eric.dumazet@gmail.com>:
> On Tue, 2012-06-12 at 10:24 +0200, Jean-Michel Hautbois wrote:
>> 2012/6/8 Jean-Michel Hautbois <jhautbois@gmail.com>:
>> > 2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
>> >> On Fri, 2012-06-08 at 10:14 +0200, Jean-Michel Hautbois wrote:
>> >>> 2012/6/8 Eric Dumazet <eric.dumazet@gmail.com>:
>> >>> > On Thu, 2012-06-07 at 14:54 +0200, Jean-Michel Hautbois wrote:
>> >>> >
>> >>> >> eth1 Link encap:Ethernet HWaddr 68:b5:99:b9:8d:d4
>> >>> >> UP BROADCAST RUNNING SLAVE MULTICAST MTU:4096 Metric:1
>> >>> >> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> >>> >> TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
>> >>> >> collisions:0 txqueuelen:1000
>> >>> >> RX bytes:0 (0.0 B) TX bytes:61476524359 (57.2 GiB)
>> >>> >
>> >>> >> qdisc mq 0: dev eth1 root
>> >>> >> Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0
>> >>> >> requeues 17480)
>> >>> >
>> >>> > OK, and "tc -s -d cl show dev eth1"
>> >>> >
>> >>> > (How many queues are really used)
>> >>> >
>> >>>
>> >>> tc -s -d cl show dev eth1
>> >>> class mq :1 root
>> >>> Sent 9798071746 bytes 2425410 pkt (dropped 3442405, overlimits 0 requeues 2747)
>> >>> backlog 0b 0p requeues 2747
>> >>> class mq :2 root
>> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>> backlog 0b 0p requeues 0
>> >>> class mq :3 root
>> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>> backlog 0b 0p requeues 0
>> >>> class mq :4 root
>> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>> backlog 0b 0p requeues 0
>> >>> class mq :5 root
>> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>> backlog 0b 0p requeues 0
>> >>> class mq :6 root
>> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>> backlog 0b 0p requeues 0
>> >>> class mq :7 root
>> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>> backlog 0b 0p requeues 0
>> >>> class mq :8 root
>> >>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>> backlog 0b 0p requeues 0
>> >>
>> >> Do you have the same distribution on old kernels as well ?
>> >> (ie only queue 0 is used)
>> >>
>> >
>> > On the old kernel, there is nothing returned by this command.
>> >
>> > JM
>>
>> I used perf in order to get more information.
>
> What happens if you force some traffic in the other way (say 50,000
> (small) packets per second in RX) while doing your tests ?

Can I do that using netperf ?
* Re: Difficulties to get 1Gbps on be2net ethernet card
2012-06-12 9:01 ` Jean-Michel Hautbois
@ 2012-06-12 9:06 ` Eric Dumazet
2012-06-12 9:10 ` Jean-Michel Hautbois
0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-06-12 9:06 UTC (permalink / raw)
To: Jean-Michel Hautbois; +Cc: Sathya.Perla, netdev

On Tue, 2012-06-12 at 11:01 +0200, Jean-Michel Hautbois wrote:

> Can I do that using netperf ?

Sure, you could use netperf -t UDP_RR
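[Note: a sketch of such a test, assuming netserver is running on the peer;
the address below is a placeholder. A plain UDP_RR keeps a single transaction
in flight, so its packet rate is bounded by round-trip latency; sustaining
~50,000 packets/s may need several instances in parallel, or burst mode (-b),
which is only available when netperf was built with --enable-burst:

$> netserver                                                  # on the RX side
$> netperf -H 192.168.0.2 -t UDP_RR -l 30 -- -r 64,64         # on the TX side
$> netperf -H 192.168.0.2 -t UDP_RR -l 30 -- -r 64,64 -b 16   # burst-enabled build

-r sets the request,response sizes in bytes; -l is the test length in seconds.]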
* Re: Difficulties to get 1Gbps on be2net ethernet card
2012-06-12 9:06 ` Eric Dumazet
@ 2012-06-12 9:10 ` Jean-Michel Hautbois
0 siblings, 0 replies; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-06-12 9:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Sathya.Perla, netdev

2012/6/12 Eric Dumazet <eric.dumazet@gmail.com>:
> On Tue, 2012-06-12 at 11:01 +0200, Jean-Michel Hautbois wrote:
>
>> Can I do that using netperf ?
>
> Sure, you could use netperf -t UDP_RR

It sends, but no change on TX...
* RE: Difficulties to get 1Gbps on be2net ethernet card
2012-05-30 10:04 ` Sathya.Perla
2012-05-30 10:07 ` Jean-Michel Hautbois
@ 2012-05-30 10:30 ` Eric Dumazet
2012-05-30 11:10 ` Sathya.Perla
2012-05-31 6:54 ` Jean-Michel Hautbois
2 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2012-05-30 10:30 UTC (permalink / raw)
To: Sathya.Perla; +Cc: jhautbois, netdev

On Wed, 2012-05-30 at 03:04 -0700, Sathya.Perla@Emulex.Com wrote:
> >-----Original Message-----
> >From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> >Behalf Of Jean-Michel Hautbois
> >
> >2012/5/30 Jean-Michel Hautbois <jhautbois@gmail.com>:
> >
> >I used vmstat in order to see the differences between the two kernels.
> >The main difference is the number of interrupts per second.
> >I have an average of 87500 on 3.2 and 7500 on 2.6, 10 times lower !
> >I suspect the be2net driver to be the main cause, and I checked the
> >/proc/interrupts file in order to be sure.
> >
> >I have for eth1-tx on 2.6.26 about 2200 interrupts per second and 23000 on 3.2.
> >BTW, it is named eth1-q0 on 3.2 (and tx and rx are the same IRQ)
> >whereas there is eth1-rx0 and eth1-tx on 2.6.26.
>
> Yes, there is an issue with be2net interrupt mitigation in the recent code with
> RX and TX on the same Evt-Q (commit 10ef9ab4). The high interrupt rate happens
> when a TX blast is done while RX is relatively silent on a queue pair.
> Interrupt rate due to TX completions is not being mitigated.
>
> I have a fix and will send it out soon..

I also have a benet fix for non GRO :

Pulling 64 bytes in skb head is too much for TCP IPv4 with no
timestamps, as this makes splice() or TCP coalescing less effective.

(Having tcp payload in linear part of the skb disables various optims)

Could you please test it ?

Thanks

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 08efd30..f446b11 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1202,15 +1202,19 @@ static void skb_fill_rx_data(struct be_rx_obj *rxo, struct sk_buff *skb,
 	/* Copy data in the first descriptor of this completion */
 	curr_frag_len = min(rxcp->pkt_size, rx_frag_size);
 
-	/* Copy the header portion into skb_data */
-	hdr_len = min(BE_HDR_LEN, curr_frag_len);
+	/* If frame is small enough to fit in skb->head, pull it completely.
+	 * If not, only pull ethernet header so that splice() or TCP coalesce
+	 * are more efficient.
+	 */
+	hdr_len = (curr_frag_len <= skb_tailroom(skb)) ?
+		  curr_frag_len : ETH_HLEN;
+
 	memcpy(skb->data, start, hdr_len);
 	skb->len = curr_frag_len;
-	if (curr_frag_len <= BE_HDR_LEN) { /* tiny packet */
+	skb->tail += hdr_len;
+	if (hdr_len == curr_frag_len) { /* tiny packet */
 		/* Complete packet has now been moved to data */
 		put_page(page_info->page);
-		skb->data_len = 0;
-		skb->tail += curr_frag_len;
 	} else {
 		skb_shinfo(skb)->nr_frags = 1;
 		skb_frag_set_page(skb, 0, page_info->page);
@@ -1219,7 +1223,6 @@ static void skb_fill_rx_data(struct be_rx_obj *rxo, struct sk_buff *skb,
 		skb_frag_size_set(&skb_shinfo(skb)->frags[0],
 				  curr_frag_len - hdr_len);
 		skb->data_len = curr_frag_len - hdr_len;
 		skb->truesize += rx_frag_size;
-		skb->tail += hdr_len;
 	}
 	page_info->page = NULL;
* RE: Difficulties to get 1Gbps on be2net ethernet card
2012-05-30 10:30 ` Eric Dumazet
@ 2012-05-30 11:10 ` Sathya.Perla
0 siblings, 0 replies; 31+ messages in thread
From: Sathya.Perla @ 2012-05-30 11:10 UTC (permalink / raw)
To: eric.dumazet; +Cc: jhautbois, netdev

>-----Original Message-----
>From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
>
>I also have a benet fix for non GRO :
>
>Pulling 64 bytes in skb head is too much for TCP IPv4 with no
>timestamps, as this makes splice() or TCP coalescing less effective.
>
>(Having tcp payload in linear part of the skb disables various optims)
>
>Could you please test it ?
>
Sure! thanks...
* Re: Difficulties to get 1Gbps on be2net ethernet card
2012-05-30 10:04 ` Sathya.Perla
2012-05-30 10:07 ` Jean-Michel Hautbois
2012-05-30 10:30 ` Eric Dumazet
@ 2012-05-31 6:54 ` Jean-Michel Hautbois
2 siblings, 0 replies; 31+ messages in thread
From: Jean-Michel Hautbois @ 2012-05-31 6:54 UTC (permalink / raw)
To: Sathya.Perla; +Cc: eric.dumazet, netdev, yevgenyp

2012/5/30 <Sathya.Perla@emulex.com>:
>>-----Original Message-----
>>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>>Behalf Of Jean-Michel Hautbois
>>
>>2012/5/30 Jean-Michel Hautbois <jhautbois@gmail.com>:
>>
>>I used vmstat in order to see the differences between the two kernels.
>>The main difference is the number of interrupts per second.
>>I have an average of 87500 on 3.2 and 7500 on 2.6, 10 times lower !
>>I suspect the be2net driver to be the main cause, and I checked the
>>/proc/interrupts file in order to be sure.
>>
>>I have for eth1-tx on 2.6.26 about 2200 interrupts per second and 23000 on 3.2.
>>BTW, it is named eth1-q0 on 3.2 (and tx and rx are the same IRQ)
>>whereas there is eth1-rx0 and eth1-tx on 2.6.26.
>
> Yes, there is an issue with be2net interrupt mitigation in the recent code with
> RX and TX on the same Evt-Q (commit 10ef9ab4). The high interrupt rate happens
> when a TX blast is done while RX is relatively silent on a queue pair.
> Interrupt rate due to TX completions is not being mitigated.
>
> I have a fix and will send it out soon..
>
> thanks,
> -Sathya

It seems this issue exists with the Mellanox mlx4 too...
I have a bond between eth1 (be2net) and eth9 (mlx4_en) and I can
switch from one to the other using ifenslave.
Setting a queue length of 4000 on be2net works well in my use case; when
I switch to the mlx4-based NIC, which has a default queue length of 1000,
I get dropped packets too (fewer than with be2net, but about 30-50 per
second). Setting a queue length of 4000 on mlx4 works too, but the number
of interrupts is similar.
The use case is the same: lots of TX, no RX.

JM
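[Note: the transmit queue length changes described above can be applied at
runtime with either classic ifconfig or iproute2; a sketch using eth1 and the
value 4000 from this thread:

$> ifconfig eth1 txqueuelen 4000          # net-tools
$> ip link set dev eth1 txqueuelen 4000   # iproute2 equivalent

A longer qdisc queue only buffers bursts; it does not address the underlying
interrupt-mitigation issue.]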
end of thread, other threads:[~2012-06-12 9:11 UTC | newest]

Thread overview: 31+ messages
2012-05-29 14:46 Difficulties to get 1Gbps on be2net ethernet card Jean-Michel Hautbois
2012-05-30 6:28 ` Jean-Michel Hautbois
2012-05-30 6:48 ` Eric Dumazet
2012-05-30 6:51 ` Jean-Michel Hautbois
2012-05-30 7:06 ` Eric Dumazet
2012-05-30 7:25 ` Jean-Michel Hautbois
2012-05-30 9:40 ` Jean-Michel Hautbois
2012-05-30 9:56 ` Eric Dumazet
2012-05-30 10:06 ` Jean-Michel Hautbois
2012-05-30 10:04 ` Sathya.Perla
2012-05-30 10:07 ` Jean-Michel Hautbois
2012-06-06 10:04 ` Jean-Michel Hautbois
2012-06-06 11:01 ` Eric Dumazet
2012-06-06 12:34 ` Jean-Michel Hautbois
2012-06-06 13:07 ` Jean-Michel Hautbois
2012-06-06 14:36 ` Jean-Michel Hautbois
2012-06-07 12:27 ` Jean-Michel Hautbois
2012-06-07 12:31 ` Eric Dumazet
2012-06-07 12:54 ` Jean-Michel Hautbois
2012-06-08 6:08 ` Eric Dumazet
2012-06-08 8:14 ` Jean-Michel Hautbois
2012-06-08 8:22 ` Eric Dumazet
2012-06-08 14:53 ` Jean-Michel Hautbois
2012-06-12 8:24 ` Jean-Michel Hautbois
2012-06-12 8:55 ` Eric Dumazet
2012-06-12 9:01 ` Jean-Michel Hautbois
2012-06-12 9:06 ` Eric Dumazet
2012-06-12 9:10 ` Jean-Michel Hautbois
2012-05-30 10:30 ` Eric Dumazet
2012-05-30 11:10 ` Sathya.Perla
2012-05-31 6:54 ` Jean-Michel Hautbois