netdev.vger.kernel.org archive mirror
* Ixgbe question
@ 2008-03-10 21:27 Ben Greear
  2008-03-11  1:01 ` Brandeburg, Jesse
  0 siblings, 1 reply; 27+ messages in thread
From: Ben Greear @ 2008-03-10 21:27 UTC (permalink / raw)
  To: NetDev

I have a pair of ixgbe NICs in a system and have noticed a strange
case where the NIC is not always 'UP' after my programs finish
trying to configure it.  I haven't noticed this with any other
NICs, but I also just moved to 2.6.23.17 from 2.6.23.9 and 2.6.23.14.

I see this in the logs:

ADDRCONF(NETDEV_UP): eth0: link is not ready

It would seem to me that we should be able to set the admin
state to UP, even if the link is not up??

The kernel is 2.6.23.17 plus my hacks.  The ixgbe driver is version 1.3.7.8-lro.

The hardware is a quad-core Intel box with 2 built-in e1000 ports and a 2-port
ixgbe (CX4) NIC on a PCIe riser.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: Ixgbe question
  2008-03-10 21:27 Ixgbe question Ben Greear
@ 2008-03-11  1:01 ` Brandeburg, Jesse
  0 siblings, 0 replies; 27+ messages in thread
From: Brandeburg, Jesse @ 2008-03-11  1:01 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev

Ben Greear wrote:
> I have a pair of IXGBE NICs in a system, and notice a strange
> case where the NIC is not always 'UP' after my programs finish
> trying to configure it.  I haven't noticed this with any other
> NICs, but I also just moved to 2.6.23.17 from 23.9 and 23.14.
> 
> I see this in the logs:
> 
> ADDRCONF(NETDEV_UP): eth0: link is not ready
> 
> It would seem to me that we should be able to set the admin
> state to UP, even if the link is not up??
> 
> Kernel is 2.6.23.17 plus my hacks.   Ixgbe driver is version
> 1.3.7.8-lro. 
> 
> Hardware is quad-core Intel, 2 build-in e1000, 2-port ixgbe (CX4)
> chipset NIC 
> on a pcie riser.

addrconf_notify() is printing that message, and I see it whenever IPv6
is enabled on an interface.  I don't think it is ixgbe specific.

It doesn't depend on or preclude administrative UP; it comes from the
notifier handler receiving a NETDEV_UP event, regardless of link state.

You'll see another message like:

ADDRCONF(NETDEV_CHANGE): eth6: link becomes ready

when you plug in the cable, all courtesy of IPv6.
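
A quick way to convince yourself that the admin state really did go UP
independently of the link (the sysfs paths are the usual ones; adjust the
interface name to your setup):

   ip link show eth0                    # flags show <NO-CARRIER,...,UP> when admin up without link
   cat /sys/class/net/eth0/operstate    # typically "down" until carrier appears
   cat /sys/class/net/eth0/carrier      # 0 = no link, 1 = link detected (readable once the device is up)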

Jesse

^ permalink raw reply	[flat|nested] 27+ messages in thread

* ixgbe question
  2009-11-23  9:36   ` Peter P Waskiewicz Jr
@ 2009-11-23 10:21     ` Eric Dumazet
  2009-11-23 10:30       ` Badalian Vyacheslav
                         ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 10:21 UTC (permalink / raw)
  To: Peter P Waskiewicz Jr; +Cc: Linux Netdev List

Hi Peter

I tried a pktgen stress test on an 82599EB card and could not split the RX load across multiple CPUs.

The setup is:

One 82599 card with fiber0 looped to fiber1, in 10Gb link mode.
The machine is an HP DL380 G6 with dual quad-core E5530 @ 2.4GHz (16 logical CPUs).

I use one pktgen thread sending to fiber0 over many dst IPs, and checked that fiber1
was using many RX queues:

grep fiber1 /proc/interrupts 
117:       1301      13060          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-0
118:        601       1402          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-1
119:        634        832          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-2
120:        601       1303          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-3
121:        620       1246          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-4
122:       1287      13088          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-5
123:        606       1354          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-6
124:        653        827          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-7
125:        639        825          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-8
126:        596       1199          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-9
127:       2013      24800          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-10
128:        648       1353          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-11
129:        601       1123          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-12
130:        625        834          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-13
131:        665       1409          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-14
132:       2637      31699          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-15
133:          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1:lsc



But only one CPU (CPU1) had a softirq running at 100%, and many frames were dropped:

root@demodl380g6:/usr/src# ifconfig fiber0
fiber0    Link encap:Ethernet  HWaddr 00:1b:21:4a:fe:54  
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Packets reçus:4 erreurs:0 :0 overruns:0 frame:0
          TX packets:309291576 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 lg file transmission:1000 
          Octets reçus:1368 (1.3 KB) Octets transmis:18557495682 (18.5 GB)

root@demodl380g6:/usr/src# ifconfig fiber1
fiber1    Link encap:Ethernet  HWaddr 00:1b:21:4a:fe:55  
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Packets reçus:55122164 erreurs:0 :254169411 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 lg file transmission:1000 
          Octets reçus:3307330968 (3.3 GB) Octets transmis:1368 (1.3 KB)


How and when can multi-queue RX really start to use several CPUs?

Thanks
Eric


pktgen script :

pgset()
{
    local result

    echo $1 > $PGDEV

    result=`cat $PGDEV | fgrep "Result: OK:"`
    if [ "$result" = "" ]; then
         cat $PGDEV | fgrep Result:
    fi
}

pg()
{
    echo inject > $PGDEV
    cat $PGDEV
}


PGDEV=/proc/net/pktgen/kpktgend_4

 echo "Adding fiber0"
 pgset "add_device fiber0@0"


CLONE_SKB="clone_skb 15"

PKT_SIZE="pkt_size 60"


COUNT="count 100000000"
DELAY="delay 0"

PGDEV=/proc/net/pktgen/fiber0@0
  echo "Configuring $PGDEV"
 pgset "$COUNT"
 pgset "$CLONE_SKB"
 pgset "$PKT_SIZE"
 pgset "$DELAY"
 pgset "queue_map_min 0"
 pgset "queue_map_max 7"
 pgset "dst_min 192.168.0.2"
 pgset "dst_max 192.168.0.250"
 pgset "src_min 192.168.0.1"
 pgset "src_max 192.168.0.1"
 pgset "dst_mac  00:1b:21:4a:fe:55"


# Time to run
PGDEV=/proc/net/pktgen/pgctrl

 echo "Running... ctrl^C to stop"
 pgset "start" 
 echo "Done"

# Result can be viewed in /proc/net/pktgen/fiber0@0

for f in fiber0@0
do
 cat /proc/net/pktgen/$f
done



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 10:21     ` ixgbe question Eric Dumazet
@ 2009-11-23 10:30       ` Badalian Vyacheslav
  2009-11-23 10:34       ` Waskiewicz Jr, Peter P
  2009-11-23 14:10       ` Jesper Dangaard Brouer
  2 siblings, 0 replies; 27+ messages in thread
From: Badalian Vyacheslav @ 2009-11-23 10:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Peter P Waskiewicz Jr, Linux Netdev List

Hello Eric. I have been playing with this card for 3 weeks, so maybe this will help you :)

By default the Intel flow spreading uses only the first CPU, which is strange.
If we set the affinity of an interrupt to a single CPU core, it will use that core.
If we set the affinity to two or more CPUs, the setting is applied but does not work.
See the ixgbe driver README from intel.com; it has a parameter for RSS flows, and I think that is what controls this :)
The driver from intel.com also ships a script to split Rx/Tx queues across CPU cores, but you must replace "tx rx" in that script with "TxRx".
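
For example, to pin one queue's interrupt to a single core by hand (only a
sketch; take the IRQ number from your own /proc/interrupts, 117 is
fiber1-TxRx-0 in your dump, and mask 2 means CPU1):

   echo 2 > /proc/irq/117/smp_affinity   # deliver this vector to CPU1 only
   cat /proc/irq/117/smp_affinity        # verify the mask was accepted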

P.S. Please also look at this if you can and want to:
On e1000 with an x86 kernel + 2x dual-core Xeon, my TC rules load in 3 minutes.
On ixgbe with an x86_64 kernel + 4x six-core Xeon, my TC rules take more than 15 minutes to load!
Is this a 64-bit regression?

I can send you the TC rules if you ask for them! Thanks!

Slavon


> Hi Peter
> 
> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
> 
> Setup is :
> 
> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)
> 
> I use one pktgen thread sending to fiber0 one many dst IP, and checked that fiber1
> was using many RX queues :
> 
> grep fiber1 /proc/interrupts 
> 117:       1301      13060          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-0
> 118:        601       1402          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-1
> 119:        634        832          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-2
> 120:        601       1303          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-3
> 121:        620       1246          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-4
> 122:       1287      13088          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-5
> 123:        606       1354          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-6
> 124:        653        827          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-7
> 125:        639        825          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-8
> 126:        596       1199          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-9
> 127:       2013      24800          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-10
> 128:        648       1353          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-11
> 129:        601       1123          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-12
> 130:        625        834          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-13
> 131:        665       1409          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-14
> 132:       2637      31699          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-15
> 133:          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1:lsc
> 
> 
> 
> But only one CPU (CPU1) had a softirq running, 100%, and many frames were dropped
> 
> root@demodl380g6:/usr/src# ifconfig fiber0
> fiber0    Link encap:Ethernet  HWaddr 00:1b:21:4a:fe:54  
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           Packets reçus:4 erreurs:0 :0 overruns:0 frame:0
>           TX packets:309291576 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 lg file transmission:1000 
>           Octets reçus:1368 (1.3 KB) Octets transmis:18557495682 (18.5 GB)
> 
> root@demodl380g6:/usr/src# ifconfig fiber1
> fiber1    Link encap:Ethernet  HWaddr 00:1b:21:4a:fe:55  
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           Packets reçus:55122164 erreurs:0 :254169411 overruns:0 frame:0
>           TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 lg file transmission:1000 
>           Octets reçus:3307330968 (3.3 GB) Octets transmis:1368 (1.3 KB)
> 
> 
> How and when multi queue rx can really start to use several cpus ?
> 
> Thanks
> Eric
> 
> 
> pktgen script :
> 
> pgset()
> {
>     local result
> 
>     echo $1 > $PGDEV
> 
>     result=`cat $PGDEV | fgrep "Result: OK:"`
>     if [ "$result" = "" ]; then
>          cat $PGDEV | fgrep Result:
>     fi
> }
> 
> pg()
> {
>     echo inject > $PGDEV
>     cat $PGDEV
> }
> 
> 
> PGDEV=/proc/net/pktgen/kpktgend_4
> 
>  echo "Adding fiber0"
>  pgset "add_device fiber0@0"
> 
> 
> CLONE_SKB="clone_skb 15"
> 
> PKT_SIZE="pkt_size 60"
> 
> 
> COUNT="count 100000000"
> DELAY="delay 0"
> 
> PGDEV=/proc/net/pktgen/fiber0@0
>   echo "Configuring $PGDEV"
>  pgset "$COUNT"
>  pgset "$CLONE_SKB"
>  pgset "$PKT_SIZE"
>  pgset "$DELAY"
>  pgset "queue_map_min 0"
>  pgset "queue_map_max 7"
>  pgset "dst_min 192.168.0.2"
>  pgset "dst_max 192.168.0.250"
>  pgset "src_min 192.168.0.1"
>  pgset "src_max 192.168.0.1"
>  pgset "dst_mac  00:1b:21:4a:fe:55"
> 
> 
> # Time to run
> PGDEV=/proc/net/pktgen/pgctrl
> 
>  echo "Running... ctrl^C to stop"
>  pgset "start" 
>  echo "Done"
> 
> # Result can be vieved in /proc/net/pktgen/fiber0@0
> 
> for f in fiber0@0
> do
>  cat /proc/net/pktgen/$f
> done
> 
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 10:21     ` ixgbe question Eric Dumazet
  2009-11-23 10:30       ` Badalian Vyacheslav
@ 2009-11-23 10:34       ` Waskiewicz Jr, Peter P
  2009-11-23 10:37         ` Eric Dumazet
  2009-11-23 14:10       ` Jesper Dangaard Brouer
  2 siblings, 1 reply; 27+ messages in thread
From: Waskiewicz Jr, Peter P @ 2009-11-23 10:34 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Waskiewicz Jr, Peter P, Linux Netdev List


On Mon, 23 Nov 2009, Eric Dumazet wrote:

> Hi Peter
> 
> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
> 
> Setup is :
> 
> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)

Can you specify kernel version and driver version?

> 
> I use one pktgen thread sending to fiber0 one many dst IP, and checked that fiber1
> was using many RX queues :
> 
> grep fiber1 /proc/interrupts 
> 117:       1301      13060          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-0
> 118:        601       1402          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-1
> 119:        634        832          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-2
> 120:        601       1303          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-3
> 121:        620       1246          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-4
> 122:       1287      13088          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-5
> 123:        606       1354          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-6
> 124:        653        827          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-7
> 125:        639        825          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-8
> 126:        596       1199          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-9
> 127:       2013      24800          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-10
> 128:        648       1353          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-11
> 129:        601       1123          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-12
> 130:        625        834          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-13
> 131:        665       1409          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-14
> 132:       2637      31699          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-15
> 133:          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1:lsc
> 
> 
> 
> But only one CPU (CPU1) had a softirq running, 100%, and many frames were dropped
> 
> root@demodl380g6:/usr/src# ifconfig fiber0
> fiber0    Link encap:Ethernet  HWaddr 00:1b:21:4a:fe:54  
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           Packets reçus:4 erreurs:0 :0 overruns:0 frame:0
>           TX packets:309291576 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 lg file transmission:1000 
>           Octets reçus:1368 (1.3 KB) Octets transmis:18557495682 (18.5 GB)
> 
> root@demodl380g6:/usr/src# ifconfig fiber1
> fiber1    Link encap:Ethernet  HWaddr 00:1b:21:4a:fe:55  
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           Packets reçus:55122164 erreurs:0 :254169411 overruns:0 frame:0
>           TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 lg file transmission:1000 
>           Octets reçus:3307330968 (3.3 GB) Octets transmis:1368 (1.3 KB)

I stay in the States too much.  I love seeing net stats in French.  :-)

> 
> 
> How and when multi queue rx can really start to use several cpus ?

If you're sending one flow to many consumers, it's still one flow.  Even 
using RSS won't help, since it requires differing flows to spread load  
(5-tuple matches for flow distribution).
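
For instance (only a sketch; these are standard pktgen flags from
Documentation/networking/pktgen.txt), letting pktgen randomize the destination
address, and optionally the UDP source port if your RSS hash includes L4
ports, gives the hash distinct tuples to spread:

   pgset "flag IPDST_RND"          # random dst IP per packet within dst_min/dst_max
   pgset "flag UDPSRC_RND"         # random UDP source port per packet
   pgset "udp_src_min 1024"
   pgset "udp_src_max 65535"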

Cheers,
-PJ

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 10:34       ` Waskiewicz Jr, Peter P
@ 2009-11-23 10:37         ` Eric Dumazet
  2009-11-23 14:05           ` Eric Dumazet
  2009-11-23 21:26           ` David Miller
  0 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 10:37 UTC (permalink / raw)
  To: Waskiewicz Jr, Peter P; +Cc: Linux Netdev List

Waskiewicz Jr, Peter P a écrit :
> On Mon, 23 Nov 2009, Eric Dumazet wrote:
> 
>> Hi Peter
>>
>> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
>>
>> Setup is :
>>
>> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
>> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)
> 
> Can you specify kernel version and driver version?


Well, I forgot to mention that I am only working with the net-next-2.6 tree.

Ubuntu 9.10 kernel (Fedora Core 12 installer was not able to recognize disks on this machine :( )

ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44-k2


> 
>> I use one pktgen thread sending to fiber0 one many dst IP, and checked that fiber1
>> was using many RX queues :
>>
>> grep fiber1 /proc/interrupts 
>> 117:       1301      13060          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-0
>> 118:        601       1402          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-1
>> 119:        634        832          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-2
>> 120:        601       1303          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-3
>> 121:        620       1246          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-4
>> 122:       1287      13088          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-5
>> 123:        606       1354          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-6
>> 124:        653        827          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-7
>> 125:        639        825          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-8
>> 126:        596       1199          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-9
>> 127:       2013      24800          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-10
>> 128:        648       1353          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-11
>> 129:        601       1123          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-12
>> 130:        625        834          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-13
>> 131:        665       1409          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-14
>> 132:       2637      31699          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1-TxRx-15
>> 133:          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      fiber1:lsc
>>
>>
>>
>> But only one CPU (CPU1) had a softirq running, 100%, and many frames were dropped
>>
>> root@demodl380g6:/usr/src# ifconfig fiber0
>> fiber0    Link encap:Ethernet  HWaddr 00:1b:21:4a:fe:54  
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           Packets reçus:4 erreurs:0 :0 overruns:0 frame:0
>>           TX packets:309291576 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 lg file transmission:1000 
>>           Octets reçus:1368 (1.3 KB) Octets transmis:18557495682 (18.5 GB)
>>
>> root@demodl380g6:/usr/src# ifconfig fiber1
>> fiber1    Link encap:Ethernet  HWaddr 00:1b:21:4a:fe:55  
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           Packets reçus:55122164 erreurs:0 :254169411 overruns:0 frame:0
>>           TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 lg file transmission:1000 
>>           Octets reçus:3307330968 (3.3 GB) Octets transmis:1368 (1.3 KB)
> 
> I stay in the states too much.  I love seeing net stats in French.  :-)

Ok :)

> 
>>
>> How and when multi queue rx can really start to use several cpus ?
> 
> If you're sending one flow to many consumers, it's still one flow.  Even 
> using RSS won't help, since it requires differing flows to spread load  
> (5-tuple matches for flow distribution).

Hm... I can try varying both src and dst on my pktgen test.
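
(Something like this in the device config section, if I read the pktgen
documentation correctly; the address ranges are only example values:

   pgset "flag IPSRC_RND"          # randomize source IP within src_min/src_max
   pgset "flag IPDST_RND"          # randomize destination IP within dst_min/dst_max
   pgset "src_min 192.168.0.1"
   pgset "src_max 192.168.0.254"
   pgset "dst_min 192.168.0.2"
   pgset "dst_max 192.168.0.250"
)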

Thanks

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 10:37         ` Eric Dumazet
@ 2009-11-23 14:05           ` Eric Dumazet
  2009-11-23 21:26           ` David Miller
  1 sibling, 0 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 14:05 UTC (permalink / raw)
  To: Waskiewicz Jr, Peter P; +Cc: Linux Netdev List

Eric Dumazet a écrit :
> Waskiewicz Jr, Peter P a écrit :
>> On Mon, 23 Nov 2009, Eric Dumazet wrote:
>>
>>> Hi Peter
>>>
>>> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
>>>
>>> Setup is :
>>>
>>> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
>>> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)
>> Can you specify kernel version and driver version?
> 
> 
> Well, I forgot to mention I am only working with net-next-2.6 tree.
> 
> Ubuntu 9.10 kernel (Fedora Core 12 installer was not able to recognize disks on this machine :( )
> 
> ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44-k2
> 
> 

I tried with several pktgen threads, no success so far.

Only one CPU handles all the interrupts, and ksoftirqd enters
a mode with no escape back to split processing.

To get real multi-queue, uncontended handling, I had to force:

echo 1 >`echo /proc/irq/*/fiber1-TxRx-0/../smp_affinity` 
echo 2 >`echo /proc/irq/*/fiber1-TxRx-1/../smp_affinity`
echo 4 >`echo /proc/irq/*/fiber1-TxRx-2/../smp_affinity`
echo 8 >`echo /proc/irq/*/fiber1-TxRx-3/../smp_affinity`
echo 10 >`echo /proc/irq/*/fiber1-TxRx-4/../smp_affinity`
echo 20 >`echo /proc/irq/*/fiber1-TxRx-5/../smp_affinity`
echo 40 >`echo /proc/irq/*/fiber1-TxRx-6/../smp_affinity`
echo 80 >`echo /proc/irq/*/fiber1-TxRx-7/../smp_affinity`
echo 100 >`echo /proc/irq/*/fiber1-TxRx-8/../smp_affinity`
echo 200 >`echo /proc/irq/*/fiber1-TxRx-9/../smp_affinity`
echo 400 >`echo /proc/irq/*/fiber1-TxRx-10/../smp_affinity`
echo 800 >`echo /proc/irq/*/fiber1-TxRx-11/../smp_affinity`
echo 1000 >`echo /proc/irq/*/fiber1-TxRx-12/../smp_affinity`
echo 2000 >`echo /proc/irq/*/fiber1-TxRx-13/../smp_affinity`
echo 4000 >`echo /proc/irq/*/fiber1-TxRx-14/../smp_affinity`
echo 8000 >`echo /proc/irq/*/fiber1-TxRx-15/../smp_affinity`


The problem probably comes from the fact that when ksoftirqd runs and
the RX queues are never depleted, no hardware interrupt is sent,
so the NAPI contexts stay stuck on one CPU forever?
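
(One simple way to watch this while the test runs, using stock tools only;
mpstat comes from the sysstat package:

   watch -d -n1 'grep fiber1 /proc/interrupts'   # do the IRQ counters move on more than one CPU ?
   mpstat -P ALL 1                               # shows which CPUs actually spend time in %soft
)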


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 10:21     ` ixgbe question Eric Dumazet
  2009-11-23 10:30       ` Badalian Vyacheslav
  2009-11-23 10:34       ` Waskiewicz Jr, Peter P
@ 2009-11-23 14:10       ` Jesper Dangaard Brouer
  2009-11-23 14:38         ` Eric Dumazet
  2 siblings, 1 reply; 27+ messages in thread
From: Jesper Dangaard Brouer @ 2009-11-23 14:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Peter P Waskiewicz Jr, Linux Netdev List


On Mon, 23 Nov 2009, Eric Dumazet wrote:

> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
>
> Setup is :
>
> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)
>
> I use one pktgen thread sending to fiber0 one many dst IP, and checked that fiber1
> was using many RX queues :

How are your smp_affinity masks set?

grep . /proc/irq/*/fiber1-*/../smp_affinity


> But only one CPU (CPU1) had a softirq running, 100%, and many frames were dropped

Just a hint: I use 'ethtool -S fiber1' to see how the packets get
distributed across the RX and TX queues.



> CLONE_SKB="clone_skb 15"

Be careful with too high a clone_skb value, as in my experience it will send a burst of
clone_skb packets before the packet gets randomized again.


> pgset "dst_min 192.168.0.2"
> pgset "dst_max 192.168.0.250"
> pgset "src_min 192.168.0.1"
> pgset "src_max 192.168.0.1"
> pgset "dst_mac  00:1b:21:4a:fe:55"

To get packets randomized across RX queues, I used:

     echo "- Random UDP source port min:$min - max:$max"
     pgset "flag UDPSRC_RND"
     pgset "udp_src_min $min"
     pgset "udp_src_max $max"

Ahh.. I think you are missing:

  pgset "flag IPDST_RND"


Cheers,
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 14:10       ` Jesper Dangaard Brouer
@ 2009-11-23 14:38         ` Eric Dumazet
  2009-11-23 18:30           ` robert
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 14:38 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Peter P Waskiewicz Jr, Linux Netdev List

Jesper Dangaard Brouer a écrit :

> How is your smp_affinity mask's set?
> 
> grep . /proc/irq/*/fiber1-*/../smp_affinity

First, I tried the default affinities (ffff).

Then I tried irqbalance... no more success.

The driver seems to handle all queues on one CPU under low traffic,
and possibly switches dynamically to a multi-CPU mode, but since all
interrupts stay masked, we remain in a NAPI context handling all queues.

And that leaves one CPU in flood/drop mode.
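
(A quick way to confirm that only one CPU runs the NET_RX softirq, assuming
your kernel exposes /proc/softirqs:

   grep NET_RX /proc/softirqs      # per-CPU NET_RX counts; here only the CPU1 column increments
)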




> 
> 
>> But only one CPU (CPU1) had a softirq running, 100%, and many frames
>> were dropped
> 
> Just a hint, I use 'ethtool -S fiber1' to see how the packets gets
> distributed across the rx and tx queues.

They are correctly distributed:

     rx_queue_0_packets: 14119644
     rx_queue_0_bytes: 847178640
     rx_queue_1_packets: 14126315
     rx_queue_1_bytes: 847578900
     rx_queue_2_packets: 14115249
     rx_queue_2_bytes: 846914940
     rx_queue_3_packets: 14118146
     rx_queue_3_bytes: 847088760
     rx_queue_4_packets: 14130869
     rx_queue_4_bytes: 847853268
     rx_queue_5_packets: 14112239
     rx_queue_5_bytes: 846734340
     rx_queue_6_packets: 14128425
     rx_queue_6_bytes: 847705500
     rx_queue_7_packets: 14110587
     rx_queue_7_bytes: 846635220
     rx_queue_8_packets: 14117350
     rx_queue_8_bytes: 847041000
     rx_queue_9_packets: 14125992
     rx_queue_9_bytes: 847559520
     rx_queue_10_packets: 14121732
     rx_queue_10_bytes: 847303920
     rx_queue_11_packets: 14120997
     rx_queue_11_bytes: 847259820
     rx_queue_12_packets: 14125576
     rx_queue_12_bytes: 847535854
     rx_queue_13_packets: 14118512
     rx_queue_13_bytes: 847110720
     rx_queue_14_packets: 14118348
     rx_queue_14_bytes: 847100880
     rx_queue_15_packets: 14118647
     rx_queue_15_bytes: 847118820



> 
> 
> 
>> CLONE_SKB="clone_skb 15"
> 
> Be careful with to high clone, as my experience is it will send a burst
> of clone_skb packets before the packet gets randomized again.

Yes, but 15 should be OK on a 10Gb link  :)

Thanks

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 18:30           ` robert
@ 2009-11-23 16:59             ` Eric Dumazet
  2009-11-23 20:54               ` robert
  2009-11-23 23:28               ` Waskiewicz Jr, Peter P
  0 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 16:59 UTC (permalink / raw)
  To: robert; +Cc: Jesper Dangaard Brouer, Peter P Waskiewicz Jr, Linux Netdev List

robert@herjulf.net a écrit :
> Eric Dumazet writes:
> 
>  > Jesper Dangaard Brouer a écrit :
>  > 
>  > > How is your smp_affinity mask's set?
>  > > 
>  > > grep . /proc/irq/*/fiber1-*/../smp_affinity
>  > 
> 
>  Weird... set clone_skb to 1 to be sure and vary dst or something so 
>  the HW classifier selects different queues and with proper RX affinty. 
>  
>  You should see in /proc/net/softnet_stat something like:
> 
> 012a7bb9 00000000 000000ae 00000000 00000000 00000000 00000000 00000000 00000000
> 01288d4c 00000000 00000049 00000000 00000000 00000000 00000000 00000000 00000000
> 0128fe28 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
> 01295387 00000000 00000047 00000000 00000000 00000000 00000000 00000000 00000000
> 0129a722 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
> 0128c5e4 00000000 00000046 00000000 00000000 00000000 00000000 00000000 00000000
> 0128f718 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
> 012993e3 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
> 

With clone_skb set to 1, this changes nothing but slows down pktgen (obviously):

Result: OK: 117614452(c117608705+d5746) nsec, 100000000 (60byte,0frags)
  850235pps 408Mb/sec (408112800bps) errors: 0

All RX processing for the 16 RX queues is done by CPU 1 only.


# cat  /proc/net/softnet_stat  ; sleep 2 ; echo "--------------";cat  /proc/net/softnet_stat
0039f331 00000000 00002e10 00000000 00000000 00000000 00000000 00000000 00000000
03f2ed19 00000000 00037ca2 00000000 00000000 00000000 00000000 00000000 00000000
00000024 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000041 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000028 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000000b 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000000c5 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000010d 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000250 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000498 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000616 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000012c 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000000d2 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000025d 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000003c 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000127 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
--------------
0039f331 00000000 00002e10 00000000 00000000 00000000 00000000 00000000 00000000
03f66737 00000000 00038015 00000000 00000000 00000000 00000000 00000000 00000000
00000024 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000041 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000028 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000000b 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000000c5 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000110 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000250 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000499 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000616 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000012c 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000000d2 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000263 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000003c 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000129 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

ethtool -S fiber1  (to show that my traffic is equally distributed across the 16 RX queues)

     rx_queue_0_packets: 4867706
     rx_queue_0_bytes: 292062360
     rx_queue_1_packets: 4862472
     rx_queue_1_bytes: 291748320
     rx_queue_2_packets: 4867111
     rx_queue_2_bytes: 292026660
     rx_queue_3_packets: 4859897
     rx_queue_3_bytes: 291593820
     rx_queue_4_packets: 4862267
     rx_queue_4_bytes: 291740814
     rx_queue_5_packets: 4861517
     rx_queue_5_bytes: 291691020
     rx_queue_6_packets: 4862699
     rx_queue_6_bytes: 291761940
     rx_queue_7_packets: 4860523
     rx_queue_7_bytes: 291631380
     rx_queue_8_packets: 4856891
     rx_queue_8_bytes: 291413460
     rx_queue_9_packets: 4868794
     rx_queue_9_bytes: 292127640
     rx_queue_10_packets: 4859099
     rx_queue_10_bytes: 291545940
     rx_queue_11_packets: 4867599
     rx_queue_11_bytes: 292055940
     rx_queue_12_packets: 4861868
     rx_queue_12_bytes: 291713374
     rx_queue_13_packets: 4862655
     rx_queue_13_bytes: 291759300
     rx_queue_14_packets: 4860798
     rx_queue_14_bytes: 291647880
     rx_queue_15_packets: 4860951
     rx_queue_15_bytes: 291657060


perf top -C 1 -E 25
------------------------------------------------------------------------------
   PerfTop:   24419 irqs/sec  kernel:100.0% [100000 cycles],  (all, cpu: 1)
------------------------------------------------------------------------------

             samples    pcnt   kernel function
             _______   _____   _______________

            46234.00 - 24.3% : ixgbe_clean_tx_irq	[ixgbe]
            21134.00 - 11.1% : __slab_free
            17838.00 -  9.4% : _raw_spin_lock
            17086.00 -  9.0% : skb_release_head_state
             9410.00 -  5.0% : ixgbe_clean_rx_irq	[ixgbe]
             8639.00 -  4.5% : kmem_cache_free
             6910.00 -  3.6% : kfree
             5743.00 -  3.0% : __ip_route_output_key
             5321.00 -  2.8% : ip_route_input
             3138.00 -  1.7% : ip_rcv
             2179.00 -  1.1% : kmem_cache_alloc_node
             2002.00 -  1.1% : __kmalloc_node_track_caller
             1907.00 -  1.0% : skb_put
             1807.00 -  1.0% : __xfrm_lookup
             1742.00 -  0.9% : get_partial_node
             1727.00 -  0.9% : csum_partial_copy_generic
             1541.00 -  0.8% : add_partial
             1516.00 -  0.8% : __kfree_skb
             1465.00 -  0.8% : __netdev_alloc_skb
             1420.00 -  0.7% : icmp_send
             1222.00 -  0.6% : dev_gro_receive
             1159.00 -  0.6% : fib_table_lookup
             1155.00 -  0.6% : __phys_addr
             1050.00 -  0.6% : skb_release_data
              982.00 -  0.5% : _raw_spin_unlock


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 14:38         ` Eric Dumazet
@ 2009-11-23 18:30           ` robert
  2009-11-23 16:59             ` Eric Dumazet
  0 siblings, 1 reply; 27+ messages in thread
From: robert @ 2009-11-23 18:30 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesper Dangaard Brouer, Peter P Waskiewicz Jr, Linux Netdev List


Eric Dumazet writes:

 > Jesper Dangaard Brouer a écrit :
 > 
 > > How is your smp_affinity mask's set?
 > > 
 > > grep . /proc/irq/*/fiber1-*/../smp_affinity
 > 

 Weird... set clone_skb to 1 to be sure, and vary dst or something so
 the HW classifier selects different queues, and with proper RX affinity
 you should see something like this in /proc/net/softnet_stat:

012a7bb9 00000000 000000ae 00000000 00000000 00000000 00000000 00000000 00000000
01288d4c 00000000 00000049 00000000 00000000 00000000 00000000 00000000 00000000
0128fe28 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
01295387 00000000 00000047 00000000 00000000 00000000 00000000 00000000 00000000
0129a722 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
0128c5e4 00000000 00000046 00000000 00000000 00000000 00000000 00000000 00000000
0128f718 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
012993e3 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
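
 (The columns are hex: the first is packets processed per CPU, the second is
 dropped, the third is time_squeeze.  A rough way to decode them, assuming GNU
 awk is available:

 gawk '{ printf "cpu%-2d processed=%d dropped=%d squeezed=%d\n", NR-1, strtonum("0x" $1), strtonum("0x" $2), strtonum("0x" $3) }' /proc/net/softnet_stat

 one line per online CPU, in order.)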

Or something is...
 
Cheers

					--ro

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 16:59             ` Eric Dumazet
@ 2009-11-23 20:54               ` robert
  2009-11-23 21:28                 ` David Miller
  2009-11-23 23:28               ` Waskiewicz Jr, Peter P
  1 sibling, 1 reply; 27+ messages in thread
From: robert @ 2009-11-23 20:54 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: robert, Jesper Dangaard Brouer, Peter P Waskiewicz Jr,
	Linux Netdev List


Eric Dumazet writes:

 > slone_skb set to 1, this changes nothing but slows down pktgen (obviously)

 > All RX processing of 16 RX queues done by CPU 1 only.


 Well, I just pulled net-next-2.6 and ran with both 82598 and 82599 boards, and
 the packet load gets distributed among the CPU cores.

 Something mysterious or very obvious...

 You can even try the script below; it's a sort of Internet link traffic emulation,
 though you have to set up your routing.
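
 (For the routing part: the script below sends to random addresses in
 11.0.0.0/8, so the box the traffic is pointed at needs a route for that
 range.  Purely as an example, with made-up addresses:

 ip route add 11.0.0.0/8 via 192.168.2.1 dev eth3   # route the whole test range towards some next hop
 )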


 Cheers
					--ro

 


#!/bin/bash

#modprobe pktgen

function pgset() {
    local result

    echo $1 > $PGDEV

    result=`cat $PGDEV | fgrep "Result: OK:"`
    if [ "$result" = "" ]; then
         cat $PGDEV | fgrep Result:
    fi
}

function pg() {
    echo inject > $PGDEV
    cat $PGDEV
}

# Config Start Here -----------------------------------------------------------

remove_all()
{
 # thread config
 PGDEV=/proc/net/pktgen/kpktgend_0
 pgset "rem_device_all" 

 PGDEV=/proc/net/pktgen/kpktgend_1
 pgset "rem_device_all" 

 PGDEV=/proc/net/pktgen/kpktgend_2
 pgset "rem_device_all" 

 PGDEV=/proc/net/pktgen/kpktgend_3
 pgset "rem_device_all" 

 PGDEV=/proc/net/pktgen/kpktgend_4
 pgset "rem_device_all" 

 PGDEV=/proc/net/pktgen/kpktgend_5
 pgset "rem_device_all" 

 PGDEV=/proc/net/pktgen/kpktgend_6
 pgset "rem_device_all" 

 PGDEV=/proc/net/pktgen/kpktgend_7
 pgset "rem_device_all" 
}

remove_all

 PGDEV=/proc/net/pktgen/kpktgend_0
 pgset "add_device eth2@0" 

 PGDEV=/proc/net/pktgen/kpktgend_1
 pgset "add_device eth2@1" 

 PGDEV=/proc/net/pktgen/kpktgend_2
 pgset "add_device eth2@2" 

 PGDEV=/proc/net/pktgen/kpktgend_3
 pgset "add_device eth2@3" 


# device config
#
# Sending a mix of pkt sizes of 64, 576 and 1500
#

CLONE_SKB="clone_skb 1"
PKT_SIZE="pkt_size 60"
COUNT="count 000000"
DELAY="delay 0000"
#MAC="00:21:28:08:40:EE"
#MAC="00:21:28:08:40:EF"
#MAC="00:1B:21:17:C1:CD"
MAC="00:14:4F:DA:8C:66"
#MAC="00:14:4F:6B:CD:E8"


PGDEV=/proc/net/pktgen/eth2@0
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "pkt_size 1496"
pgset "$DELAY"
pgset "flag QUEUE_MAP_CPU"
pgset "flag IPDST_RND" 
pgset "flag FLOW_SEQ" 
pgset "dst_min 11.0.0.0" 
pgset "dst_max 11.255.255.255" 
pgset "flows 2048" 
pgset "flowlen 30" 
pgset  "dst_mac $MAC"

PGDEV=/proc/net/pktgen/eth2@1
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "pkt_size 576"
pgset "$DELAY"
pgset "flag QUEUE_MAP_CPU"
pgset "flag IPDST_RND" 
pgset "flag FLOW_SEQ" 
pgset "dst_min 11.0.0.0" 
pgset "dst_max 11.255.255.255" 
pgset "flows 2048" 
pgset "flowlen 30" 
pgset  "dst_mac $MAC"

PGDEV=/proc/net/pktgen/eth2@2
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "$DELAY"
pgset "pkt_size 60"
pgset "flag QUEUE_MAP_CPU"
pgset "flag IPDST_RND" 
pgset "flag FLOW_SEQ" 
pgset "dst_min 11.0.0.0" 
pgset "dst_max 11.255.255.255" 
pgset "flows 2048" 
pgset "flowlen 30" 
pgset  "dst_mac $MAC"

PGDEV=/proc/net/pktgen/eth2@3
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "pkt_size 1496"
pgset "$DELAY"
pgset "flag QUEUE_MAP_CPU"
pgset "flag IPDST_RND" 
pgset "flag FLOW_SEQ" 
pgset "dst_min 11.0.0.0" 
pgset "dst_max 11.255.255.255" 
pgset "flows 2048" 
pgset "flowlen 30" 
pgset  "dst_mac $MAC"

# Time to run
PGDEV=/proc/net/pktgen/pgctrl

echo "Running... ctrl^C to stop"
pgset "start" 
echo "Done"

grep pps /proc/net/pktgen/*

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 10:37         ` Eric Dumazet
  2009-11-23 14:05           ` Eric Dumazet
@ 2009-11-23 21:26           ` David Miller
  1 sibling, 0 replies; 27+ messages in thread
From: David Miller @ 2009-11-23 21:26 UTC (permalink / raw)
  To: eric.dumazet; +Cc: peter.p.waskiewicz.jr, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 23 Nov 2009 11:37:20 +0100

> (Fedora Core 12 installer was not able to recognize disks on this machine :( )

I ran into this problem too on my laptop, but only with the Live-CD images.

The DVD image recognized the disks and installed just fine.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 20:54               ` robert
@ 2009-11-23 21:28                 ` David Miller
  2009-11-23 22:14                   ` Robert Olsson
  0 siblings, 1 reply; 27+ messages in thread
From: David Miller @ 2009-11-23 21:28 UTC (permalink / raw)
  To: robert; +Cc: eric.dumazet, hawk, peter.p.waskiewicz.jr, netdev

From: robert@herjulf.net
Date: Mon, 23 Nov 2009 21:54:43 +0100

>  Something mysterious or very obvious...

It seems very obvious to me that, for whatever reason, the MSI-X vectors
are only being delivered to CPU 1 on Eric's system.

I also suspect, as a result, that it has nothing to do with the ixgbe
driver, but rather is some IRQ controller programming issue, or some bug or
limitation in the IRQ affinity mask handling in the kernel.
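
One quick cross-check (nothing ixgbe specific, just procfs) is to compare the
programmed affinity masks with where the counts actually land:

  for irq in $(awk -F: '/fiber1-TxRx/ { print $1 }' /proc/interrupts)
  do
      echo "irq $irq smp_affinity: $(cat /proc/irq/$irq/smp_affinity)"
  done
  grep fiber1 /proc/interrupts     # the per-CPU columns show where the vectors really fire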

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 21:28                 ` David Miller
@ 2009-11-23 22:14                   ` Robert Olsson
  0 siblings, 0 replies; 27+ messages in thread
From: Robert Olsson @ 2009-11-23 22:14 UTC (permalink / raw)
  To: David Miller; +Cc: robert, eric.dumazet, hawk, peter.p.waskiewicz.jr, netdev


David Miller writes:
 > It seem very obvious to me that, for whatever reason, the MSI-X vectors
 > are only being sent to cpu 1 on Eric's system.
 > 
 > I also suspect, as a result, that it has nothing to do with the IXGBE
 > driver but rather is some IRQ controller programming or some bug or
 > limitation in the IRQ affinity mask handling in the kernel.

 Probably so, yes. I guess Eric will dig into this.
 
 Cheers

					--ro
 
 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 16:59             ` Eric Dumazet
  2009-11-23 20:54               ` robert
@ 2009-11-23 23:28               ` Waskiewicz Jr, Peter P
  2009-11-23 23:44                 ` David Miller
  2009-11-24  7:46                 ` Eric Dumazet
  1 sibling, 2 replies; 27+ messages in thread
From: Waskiewicz Jr, Peter P @ 2009-11-23 23:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: robert@herjulf.net, Jesper Dangaard Brouer,
	Waskiewicz Jr, Peter P, Linux Netdev List


On Mon, 23 Nov 2009, Eric Dumazet wrote:

> robert@herjulf.net a écrit :
> > Eric Dumazet writes:
> > 
> >  > Jesper Dangaard Brouer a écrit :
> >  > 
> >  > > How is your smp_affinity mask's set?
> >  > > 
> >  > > grep . /proc/irq/*/fiber1-*/../smp_affinity
> >  > 
> > 
> >  Weird... set clone_skb to 1 to be sure and vary dst or something so 
> >  the HW classifier selects different queues and with proper RX affinty. 
> >  
> >  You should see in /proc/net/softnet_stat something like:
> > 
> > 012a7bb9 00000000 000000ae 00000000 00000000 00000000 00000000 00000000 00000000
> > 01288d4c 00000000 00000049 00000000 00000000 00000000 00000000 00000000 00000000
> > 0128fe28 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
> > 01295387 00000000 00000047 00000000 00000000 00000000 00000000 00000000 00000000
> > 0129a722 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
> > 0128c5e4 00000000 00000046 00000000 00000000 00000000 00000000 00000000 00000000
> > 0128f718 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
> > 012993e3 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
> > 
> 
> slone_skb set to 1, this changes nothing but slows down pktgen (obviously)
> 
> Result: OK: 117614452(c117608705+d5746) nsec, 100000000 (60byte,0frags)
>   850235pps 408Mb/sec (408112800bps) errors: 0
> 
> All RX processing of 16 RX queues done by CPU 1 only.

Ok, I was confused earlier.  I thought you were saying that all packets 
were headed into a single Rx queue.  This is different.

Do you know what version of irqbalance you're running, or if it's running 
at all?  We've seen issues with irqbalance where it won't recognize the 
ethernet device if the driver has been reloaded.  In that case, it won't 
balance the interrupts at all.  If the default affinity was set to one 
CPU, then well, you're screwed.

My suggestion in this case: after you reload ixgbe and start your tests,
see if it all goes to one CPU.  If it does, restart irqbalance
(service irqbalance restart, or just kill it and restart it by hand).  Then
start your test again, and within 10 seconds you should see the interrupts
move and spread out.
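
Roughly (the service name and init system vary by distro, so adjust as needed):

  pgrep irqbalance || echo "irqbalance is not running"
  service irqbalance restart          # or: killall irqbalance && irqbalance
  sleep 10
  grep fiber1 /proc/interrupts        # the counts should now be growing on several CPUs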

Let me know if this helps,
-PJ

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 23:28               ` Waskiewicz Jr, Peter P
@ 2009-11-23 23:44                 ` David Miller
  2009-11-24  7:46                 ` Eric Dumazet
  1 sibling, 0 replies; 27+ messages in thread
From: David Miller @ 2009-11-23 23:44 UTC (permalink / raw)
  To: peter.p.waskiewicz.jr; +Cc: eric.dumazet, robert, hawk, netdev

From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@intel.com>
Date: Mon, 23 Nov 2009 15:28:18 -0800 (Pacific Standard Time)

> Do you know what version of irqbalance you're running, or if it's running 
> at all?

Eric said he tried both with and without irqbalanced.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-23 23:28               ` Waskiewicz Jr, Peter P
  2009-11-23 23:44                 ` David Miller
@ 2009-11-24  7:46                 ` Eric Dumazet
  2009-11-24  8:46                   ` Badalian Vyacheslav
  2009-11-24  9:07                   ` Peter P Waskiewicz Jr
  1 sibling, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-24  7:46 UTC (permalink / raw)
  To: Waskiewicz Jr, Peter P
  Cc: robert@herjulf.net, Jesper Dangaard Brouer, Linux Netdev List

Waskiewicz Jr, Peter P a écrit :
> Ok, I was confused earlier.  I thought you were saying that all packets 
> were headed into a single Rx queue.  This is different.
> 
> Do you know what version of irqbalance you're running, or if it's running 
> at all?  We've seen issues with irqbalance where it won't recognize the 
> ethernet device if the driver has been reloaded.  In that case, it won't 
> balance the interrupts at all.  If the default affinity was set to one 
> CPU, then well, you're screwed.
> 
> My suggestion in this case is after you reload ixgbe and start your tests, 
> see if it all goes to one CPU.  If it does, then restart irqbalance 
> (service irqbalance restart - or just kill it and restart by hand).  Then 
> start running your test, and in 10 seconds you should see the interrupts 
> move and spread out.
> 
> Let me know if this helps,

Sure it helps!

I tried without irqbalance and with irqbalance (Ubuntu 9.10 ships irqbalance 0.55-4).
I can see irqbalance setting the smp_affinity masks to 5555 or AAAA, with no direct effect.

I do receive 16 different IRQs, but all are serviced on one CPU.

The only way to get the IRQs onto different CPUs is to manually force the IRQ affinities
to be exclusive (one bit set in the mask, not several), and that is not optimal for moderate loads.

echo 1 >`echo /proc/irq/*/fiber1-TxRx-0/../smp_affinity`
echo 1 >`echo /proc/irq/*/fiber1-TxRx-1/../smp_affinity`
echo 4 >`echo /proc/irq/*/fiber1-TxRx-2/../smp_affinity`
echo 4 >`echo /proc/irq/*/fiber1-TxRx-3/../smp_affinity`
echo 10 >`echo /proc/irq/*/fiber1-TxRx-4/../smp_affinity`
echo 10 >`echo /proc/irq/*/fiber1-TxRx-5/../smp_affinity`
echo 40 >`echo /proc/irq/*/fiber1-TxRx-6/../smp_affinity`
echo 40 >`echo /proc/irq/*/fiber1-TxRx-7/../smp_affinity`
echo 100 >`echo /proc/irq/*/fiber1-TxRx-8/../smp_affinity`
echo 100 >`echo /proc/irq/*/fiber1-TxRx-9/../smp_affinity`
echo 400 >`echo /proc/irq/*/fiber1-TxRx-10/../smp_affinity`
echo 400 >`echo /proc/irq/*/fiber1-TxRx-11/../smp_affinity`
echo 1000 >`echo /proc/irq/*/fiber1-TxRx-12/../smp_affinity`
echo 1000 >`echo /proc/irq/*/fiber1-TxRx-13/../smp_affinity`
echo 4000 >`echo /proc/irq/*/fiber1-TxRx-14/../smp_affinity`
echo 4000 >`echo /proc/irq/*/fiber1-TxRx-15/../smp_affinity`


One other problem is that after a reload of the ixgbe driver, the link comes up
at 1 Gbps 95% of the time, and I could not find an easy way to force it to 10 Gbps.

I run the following script many times and stop when 10 Gbps is reached.

ethtool -A fiber0 rx off tx off
ip link set fiber0 down
ip link set fiber1 down
sleep 2
ethtool fiber0
ethtool -s fiber0 speed 10000
ethtool -s fiber1 speed 10000
ethtool -r fiber0 &
ethtool -r fiber1 &
ethtool fiber0
ip link set fiber1 up &
ip link set fiber0 up &
ethtool fiber0

[   33.625689] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44-k2
[   33.625692] ixgbe: Copyright (c) 1999-2009 Intel Corporation.
[   33.625741] ixgbe 0000:07:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
[   33.625760] ixgbe 0000:07:00.0: setting latency timer to 64
[   33.735579] ixgbe 0000:07:00.0: irq 100 for MSI/MSI-X
[   33.735583] ixgbe 0000:07:00.0: irq 101 for MSI/MSI-X
[   33.735585] ixgbe 0000:07:00.0: irq 102 for MSI/MSI-X
[   33.735587] ixgbe 0000:07:00.0: irq 103 for MSI/MSI-X
[   33.735589] ixgbe 0000:07:00.0: irq 104 for MSI/MSI-X
[   33.735591] ixgbe 0000:07:00.0: irq 105 for MSI/MSI-X
[   33.735593] ixgbe 0000:07:00.0: irq 106 for MSI/MSI-X
[   33.735595] ixgbe 0000:07:00.0: irq 107 for MSI/MSI-X
[   33.735597] ixgbe 0000:07:00.0: irq 108 for MSI/MSI-X
[   33.735599] ixgbe 0000:07:00.0: irq 109 for MSI/MSI-X
[   33.735602] ixgbe 0000:07:00.0: irq 110 for MSI/MSI-X
[   33.735604] ixgbe 0000:07:00.0: irq 111 for MSI/MSI-X
[   33.735606] ixgbe 0000:07:00.0: irq 112 for MSI/MSI-X
[   33.735608] ixgbe 0000:07:00.0: irq 113 for MSI/MSI-X
[   33.735610] ixgbe 0000:07:00.0: irq 114 for MSI/MSI-X
[   33.735612] ixgbe 0000:07:00.0: irq 115 for MSI/MSI-X
[   33.735614] ixgbe 0000:07:00.0: irq 116 for MSI/MSI-X
[   33.735633] ixgbe: 0000:07:00.0: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16
[   33.735638] ixgbe 0000:07:00.0: (PCI Express:5.0Gb/s:Width x8) 00:1b:21:4a:fe:54
[   33.735722] ixgbe 0000:07:00.0: MAC: 2, PHY: 11, SFP+: 5, PBA No: e66562-003
[   33.738111] ixgbe 0000:07:00.0: Intel(R) 10 Gigabit Network Connection
[   33.738135] ixgbe 0000:07:00.1: PCI INT B -> GSI 42 (level, low) -> IRQ 42
[   33.738151] ixgbe 0000:07:00.1: setting latency timer to 64
[   33.853526] ixgbe 0000:07:00.1: irq 117 for MSI/MSI-X
[   33.853529] ixgbe 0000:07:00.1: irq 118 for MSI/MSI-X
[   33.853532] ixgbe 0000:07:00.1: irq 119 for MSI/MSI-X
[   33.853534] ixgbe 0000:07:00.1: irq 120 for MSI/MSI-X
[   33.853536] ixgbe 0000:07:00.1: irq 121 for MSI/MSI-X
[   33.853538] ixgbe 0000:07:00.1: irq 122 for MSI/MSI-X
[   33.853540] ixgbe 0000:07:00.1: irq 123 for MSI/MSI-X
[   33.853542] ixgbe 0000:07:00.1: irq 124 for MSI/MSI-X
[   33.853544] ixgbe 0000:07:00.1: irq 125 for MSI/MSI-X
[   33.853546] ixgbe 0000:07:00.1: irq 126 for MSI/MSI-X
[   33.853548] ixgbe 0000:07:00.1: irq 127 for MSI/MSI-X
[   33.853550] ixgbe 0000:07:00.1: irq 128 for MSI/MSI-X
[   33.853552] ixgbe 0000:07:00.1: irq 129 for MSI/MSI-X
[   33.853554] ixgbe 0000:07:00.1: irq 130 for MSI/MSI-X
[   33.853556] ixgbe 0000:07:00.1: irq 131 for MSI/MSI-X
[   33.853558] ixgbe 0000:07:00.1: irq 132 for MSI/MSI-X
[   33.853560] ixgbe 0000:07:00.1: irq 133 for MSI/MSI-X
[   33.853580] ixgbe: 0000:07:00.1: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16
[   33.853585] ixgbe 0000:07:00.1: (PCI Express:5.0Gb/s:Width x8) 00:1b:21:4a:fe:55
[   33.853669] ixgbe 0000:07:00.1: MAC: 2, PHY: 11, SFP+: 5, PBA No: e66562-003
[   33.855956] ixgbe 0000:07:00.1: Intel(R) 10 Gigabit Network Connection

[   85.208233] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: RX/TX
[   85.237453] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: RX/TX
[   96.080713] ixgbe: fiber1 NIC Link is Down
[  102.094610] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[  102.119572] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[  142.524691] ixgbe: fiber1 NIC Link is Down
[  148.421332] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[  148.449465] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[  160.728643] ixgbe: fiber1 NIC Link is Down
[  172.832301] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[  173.659038] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[  184.554501] ixgbe: fiber0 NIC Link is Down
[  185.376273] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[  186.493598] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[  190.564383] ixgbe: fiber0 NIC Link is Down
[  191.391149] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[  192.484492] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[  192.545424] ixgbe: fiber1 NIC Link is Down
[  205.858197] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[  206.684940] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[  211.991875] ixgbe: fiber1 NIC Link is Down
[  220.833478] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[  220.833630] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[  229.804853] ixgbe: fiber1 NIC Link is Down
[  248.395672] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[  249.222408] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[  484.631598] ixgbe: fiber1 NIC Link is Down
[  490.138931] ixgbe: fiber1 NIC Link is Up 10 Gbps, Flow Control: None
[  490.167880] ixgbe: fiber0 NIC Link is Up 10 Gbps, Flow Control: None

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-24  7:46                 ` Eric Dumazet
@ 2009-11-24  8:46                   ` Badalian Vyacheslav
  2009-11-24  9:07                   ` Peter P Waskiewicz Jr
  1 sibling, 0 replies; 27+ messages in thread
From: Badalian Vyacheslav @ 2009-11-24  8:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Waskiewicz Jr, Peter P, Linux Netdev List

Eric Dumazet wrote:
> Waskiewicz Jr, Peter P wrote:
>> Ok, I was confused earlier.  I thought you were saying that all packets 
>> were headed into a single Rx queue.  This is different.
>>
>> Do you know what version of irqbalance you're running, or if it's running 
>> at all?  We've seen issues with irqbalance where it won't recognize the 
>> ethernet device if the driver has been reloaded.  In that case, it won't 
>> balance the interrupts at all.  If the default affinity was set to one 
>> CPU, then well, you're screwed.
>>
>> My suggestion in this case is after you reload ixgbe and start your tests, 
>> see if it all goes to one CPU.  If it does, then restart irqbalance 
>> (service irqbalance restart - or just kill it and restart by hand).  Then 
>> start running your test, and in 10 seconds you should see the interrupts 
>> move and spread out.
>>
>> Let me know if this helps,
> 
> Sure it helps !
> 
> I tried without irqbalance and with irqbalance (Ubuntu 9.10 ships irqbalance 0.55-4)
> I can see irqbalance setting smp_affinities to 5555 or AAAA with no direct effect.
> 
> I do receive 16 different irqs, but all serviced on one cpu.
> 
> Only way to have irqs on different cpus is to manualy force irq affinities to be exclusive
> (one bit set in the mask, not several ones), and that is not optimal for moderate loads.
> 
> echo 1 >`echo /proc/irq/*/fiber1-TxRx-0/../smp_affinity`
> echo 1 >`echo /proc/irq/*/fiber1-TxRx-1/../smp_affinity`
> echo 4 >`echo /proc/irq/*/fiber1-TxRx-2/../smp_affinity`
> echo 4 >`echo /proc/irq/*/fiber1-TxRx-3/../smp_affinity`
> echo 10 >`echo /proc/irq/*/fiber1-TxRx-4/../smp_affinity`
> echo 10 >`echo /proc/irq/*/fiber1-TxRx-5/../smp_affinity`
> echo 40 >`echo /proc/irq/*/fiber1-TxRx-6/../smp_affinity`
> echo 40 >`echo /proc/irq/*/fiber1-TxRx-7/../smp_affinity`
> echo 100 >`echo /proc/irq/*/fiber1-TxRx-8/../smp_affinity`
> echo 100 >`echo /proc/irq/*/fiber1-TxRx-9/../smp_affinity`
> echo 400 >`echo /proc/irq/*/fiber1-TxRx-10/../smp_affinity`
> echo 400 >`echo /proc/irq/*/fiber1-TxRx-11/../smp_affinity`
> echo 1000 >`echo /proc/irq/*/fiber1-TxRx-12/../smp_affinity`
> echo 1000 >`echo /proc/irq/*/fiber1-TxRx-13/../smp_affinity`
> echo 4000 >`echo /proc/irq/*/fiber1-TxRx-14/../smp_affinity`
> echo 4000 >`echo /proc/irq/*/fiber1-TxRx-15/../smp_affinity`
> 
> 
> One other problem is that after reload of ixgbe driver, link is 95% of the time
> at 1 Gbps speed, and I could not find an easy way to force it being 10 Gbps
> 
> I run following script many times and stop it when 10 Gbps speed if reached.
> 
> ethtool -A fiber0 rx off tx off
> ip link set fiber0 down
> ip link set fiber1 down
> sleep 2
> ethtool fiber0
> ethtool -s fiber0 speed 10000
> ethtool -s fiber1 speed 10000
> ethtool -r fiber0 &
> ethtool -r fiber1 &
> ethtool fiber0
> ip link set fiber1 up &
> ip link set fiber0 up &
> ethtool fiber0
> 
> [   33.625689] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44-k2
> [   33.625692] ixgbe: Copyright (c) 1999-2009 Intel Corporation.
> [   33.625741] ixgbe 0000:07:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
> [   33.625760] ixgbe 0000:07:00.0: setting latency timer to 64
> [   33.735579] ixgbe 0000:07:00.0: irq 100 for MSI/MSI-X
> [   33.735583] ixgbe 0000:07:00.0: irq 101 for MSI/MSI-X
> [   33.735585] ixgbe 0000:07:00.0: irq 102 for MSI/MSI-X
> [   33.735587] ixgbe 0000:07:00.0: irq 103 for MSI/MSI-X
> [   33.735589] ixgbe 0000:07:00.0: irq 104 for MSI/MSI-X
> [   33.735591] ixgbe 0000:07:00.0: irq 105 for MSI/MSI-X
> [   33.735593] ixgbe 0000:07:00.0: irq 106 for MSI/MSI-X
> [   33.735595] ixgbe 0000:07:00.0: irq 107 for MSI/MSI-X
> [   33.735597] ixgbe 0000:07:00.0: irq 108 for MSI/MSI-X
> [   33.735599] ixgbe 0000:07:00.0: irq 109 for MSI/MSI-X
> [   33.735602] ixgbe 0000:07:00.0: irq 110 for MSI/MSI-X
> [   33.735604] ixgbe 0000:07:00.0: irq 111 for MSI/MSI-X
> [   33.735606] ixgbe 0000:07:00.0: irq 112 for MSI/MSI-X
> [   33.735608] ixgbe 0000:07:00.0: irq 113 for MSI/MSI-X
> [   33.735610] ixgbe 0000:07:00.0: irq 114 for MSI/MSI-X
> [   33.735612] ixgbe 0000:07:00.0: irq 115 for MSI/MSI-X
> [   33.735614] ixgbe 0000:07:00.0: irq 116 for MSI/MSI-X
> [   33.735633] ixgbe: 0000:07:00.0: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16
> [   33.735638] ixgbe 0000:07:00.0: (PCI Express:5.0Gb/s:Width x8) 00:1b:21:4a:fe:54
> [   33.735722] ixgbe 0000:07:00.0: MAC: 2, PHY: 11, SFP+: 5, PBA No: e66562-003
> [   33.738111] ixgbe 0000:07:00.0: Intel(R) 10 Gigabit Network Connection
> [   33.738135] ixgbe 0000:07:00.1: PCI INT B -> GSI 42 (level, low) -> IRQ 42
> [   33.738151] ixgbe 0000:07:00.1: setting latency timer to 64
> [   33.853526] ixgbe 0000:07:00.1: irq 117 for MSI/MSI-X
> [   33.853529] ixgbe 0000:07:00.1: irq 118 for MSI/MSI-X
> [   33.853532] ixgbe 0000:07:00.1: irq 119 for MSI/MSI-X
> [   33.853534] ixgbe 0000:07:00.1: irq 120 for MSI/MSI-X
> [   33.853536] ixgbe 0000:07:00.1: irq 121 for MSI/MSI-X
> [   33.853538] ixgbe 0000:07:00.1: irq 122 for MSI/MSI-X
> [   33.853540] ixgbe 0000:07:00.1: irq 123 for MSI/MSI-X
> [   33.853542] ixgbe 0000:07:00.1: irq 124 for MSI/MSI-X
> [   33.853544] ixgbe 0000:07:00.1: irq 125 for MSI/MSI-X
> [   33.853546] ixgbe 0000:07:00.1: irq 126 for MSI/MSI-X
> [   33.853548] ixgbe 0000:07:00.1: irq 127 for MSI/MSI-X
> [   33.853550] ixgbe 0000:07:00.1: irq 128 for MSI/MSI-X
> [   33.853552] ixgbe 0000:07:00.1: irq 129 for MSI/MSI-X
> [   33.853554] ixgbe 0000:07:00.1: irq 130 for MSI/MSI-X
> [   33.853556] ixgbe 0000:07:00.1: irq 131 for MSI/MSI-X
> [   33.853558] ixgbe 0000:07:00.1: irq 132 for MSI/MSI-X
> [   33.853560] ixgbe 0000:07:00.1: irq 133 for MSI/MSI-X
> [   33.853580] ixgbe: 0000:07:00.1: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16
> [   33.853585] ixgbe 0000:07:00.1: (PCI Express:5.0Gb/s:Width x8) 00:1b:21:4a:fe:55
> [   33.853669] ixgbe 0000:07:00.1: MAC: 2, PHY: 11, SFP+: 5, PBA No: e66562-003
> [   33.855956] ixgbe 0000:07:00.1: Intel(R) 10 Gigabit Network Connection
> 
> [   85.208233] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: RX/TX
> [   85.237453] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: RX/TX
> [   96.080713] ixgbe: fiber1 NIC Link is Down
> [  102.094610] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [  102.119572] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [  142.524691] ixgbe: fiber1 NIC Link is Down
> [  148.421332] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [  148.449465] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [  160.728643] ixgbe: fiber1 NIC Link is Down
> [  172.832301] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [  173.659038] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [  184.554501] ixgbe: fiber0 NIC Link is Down
> [  185.376273] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [  186.493598] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [  190.564383] ixgbe: fiber0 NIC Link is Down
> [  191.391149] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [  192.484492] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [  192.545424] ixgbe: fiber1 NIC Link is Down
> [  205.858197] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [  206.684940] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [  211.991875] ixgbe: fiber1 NIC Link is Down
> [  220.833478] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [  220.833630] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [  229.804853] ixgbe: fiber1 NIC Link is Down
> [  248.395672] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [  249.222408] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [  484.631598] ixgbe: fiber1 NIC Link is Down
> [  490.138931] ixgbe: fiber1 NIC Link is Up 10 Gbps, Flow Control: None
> [  490.167880] ixgbe: fiber0 NIC Link is Up 10 Gbps, Flow Control: None
> 
> 

Maybe it's Flow Director?
Multiqueue on this network card works only if you set 1 queue to 1 cpu core in smp_affinity :(
In README:


Intel(R) Ethernet Flow Director
-------------------------------
Supports advanced filters that direct receive packets by their flows to
different queues. Enables tight control on routing a flow in the platform.
Matches flows and CPU cores for flow affinity. Supports multiple parameters
for flexible flow classification and load balancing.

Flow director is enabled only if the kernel is multiple TX queue capable.

An included script (set_irq_affinity.sh) automates setting the IRQ to CPU
affinity.

You can verify that the driver is using Flow Director by looking at the counters
in ethtool: fdir_miss and fdir_match.

The following three parameters impact Flow Director.


FdirMode
--------
Valid Range: 0-2 (0=off, 1=ATR, 2=Perfect filter mode)
Default Value: 1

  Flow Director filtering modes.


FdirPballoc
-----------
Valid Range: 0-2 (0=64k, 1=128k, 2=256k)
Default Value: 0

  Flow Director allocated packet buffer size.


AtrSampleRate
--------------
Valid Range: 1-100
Default Value: 20

  Software ATR Tx packet sample rate. For example, when set to 20, every 20th
  packet is sampled to see if it will create a new flow.
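
For what it's worth, a minimal sketch of trying this out. It assumes the
out-of-tree ixgbe package that actually exposes these module parameters and
ships set_irq_affinity.sh (the in-kernel driver may not accept them), and the
script argument is assumed to be the interface name:

rmmod ixgbe
modprobe ixgbe FdirMode=1 FdirPballoc=0 AtrSampleRate=20
./set_irq_affinity.sh fiber1
ethtool -S fiber1 | grep fdir    # fdir_match should grow if flows are being steered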






^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-24  7:46                 ` Eric Dumazet
  2009-11-24  8:46                   ` Badalian Vyacheslav
@ 2009-11-24  9:07                   ` Peter P Waskiewicz Jr
  2009-11-24  9:55                     ` Eric Dumazet
  1 sibling, 1 reply; 27+ messages in thread
From: Peter P Waskiewicz Jr @ 2009-11-24  9:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: robert@herjulf.net, Jesper Dangaard Brouer, Linux Netdev List

On Tue, 2009-11-24 at 00:46 -0700, Eric Dumazet wrote:
> Waskiewicz Jr, Peter P wrote:
> > Ok, I was confused earlier.  I thought you were saying that all packets 
> > were headed into a single Rx queue.  This is different.
> > 
> > Do you know what version of irqbalance you're running, or if it's running 
> > at all?  We've seen issues with irqbalance where it won't recognize the 
> > ethernet device if the driver has been reloaded.  In that case, it won't 
> > balance the interrupts at all.  If the default affinity was set to one 
> > CPU, then well, you're screwed.
> > 
> > My suggestion in this case is after you reload ixgbe and start your tests, 
> > see if it all goes to one CPU.  If it does, then restart irqbalance 
> > (service irqbalance restart - or just kill it and restart by hand).  Then 
> > start running your test, and in 10 seconds you should see the interrupts 
> > move and spread out.
> > 
> > Let me know if this helps,
> 
> Sure it helps !
> 
> I tried without irqbalance and with irqbalance (Ubuntu 9.10 ships irqbalance 0.55-4)
> I can see irqbalance setting smp_affinities to 5555 or AAAA with no direct effect.
> 
> I do receive 16 different irqs, but all serviced on one cpu.
> 
> Only way to have irqs on different cpus is to manualy force irq affinities to be exclusive
> (one bit set in the mask, not several ones), and that is not optimal for moderate loads.
> 
> echo 1 >`echo /proc/irq/*/fiber1-TxRx-0/../smp_affinity`
> echo 1 >`echo /proc/irq/*/fiber1-TxRx-1/../smp_affinity`
> echo 4 >`echo /proc/irq/*/fiber1-TxRx-2/../smp_affinity`
> echo 4 >`echo /proc/irq/*/fiber1-TxRx-3/../smp_affinity`
> echo 10 >`echo /proc/irq/*/fiber1-TxRx-4/../smp_affinity`
> echo 10 >`echo /proc/irq/*/fiber1-TxRx-5/../smp_affinity`
> echo 40 >`echo /proc/irq/*/fiber1-TxRx-6/../smp_affinity`
> echo 40 >`echo /proc/irq/*/fiber1-TxRx-7/../smp_affinity`
> echo 100 >`echo /proc/irq/*/fiber1-TxRx-8/../smp_affinity`
> echo 100 >`echo /proc/irq/*/fiber1-TxRx-9/../smp_affinity`
> echo 400 >`echo /proc/irq/*/fiber1-TxRx-10/../smp_affinity`
> echo 400 >`echo /proc/irq/*/fiber1-TxRx-11/../smp_affinity`
> echo 1000 >`echo /proc/irq/*/fiber1-TxRx-12/../smp_affinity`
> echo 1000 >`echo /proc/irq/*/fiber1-TxRx-13/../smp_affinity`
> echo 4000 >`echo /proc/irq/*/fiber1-TxRx-14/../smp_affinity`
> echo 4000 >`echo /proc/irq/*/fiber1-TxRx-15/../smp_affinity`
> 
> 
> One other problem is that after reload of ixgbe driver, link is 95% of the time
> at 1 Gbps speed, and I could not find an easy way to force it being 10 Gbps
> 

You might have this elsewhere, but it sounds like you're connecting back
to back with another 82599 NIC.  Our optics in that NIC are dual-rate,
and the software mechanism that tries to "autoneg" link speed gets out
of sync easily in back-to-back setups.

If it's really annoying, and you're willing to run with a local patch to
disable the autotry mechanism, try this:

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index a5036f7..62c0915 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct work_struct *work)
        autoneg = hw->phy.autoneg_advertised;
        if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
                hw->mac.ops.get_link_capabilities(hw, &autoneg, &negotiation);
+
+       /* force 10G only */
+       autoneg = IXGBE_LINK_SPEED_10GB_FULL;
+
        if (hw->mac.ops.setup_link)
                hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
        adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;




Cheers,
-PJ


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-24  9:07                   ` Peter P Waskiewicz Jr
@ 2009-11-24  9:55                     ` Eric Dumazet
  2009-11-24 10:06                       ` Peter P Waskiewicz Jr
  2009-11-26 14:10                       ` Badalian Vyacheslav
  0 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-24  9:55 UTC (permalink / raw)
  To: Peter P Waskiewicz Jr
  Cc: robert@herjulf.net, Jesper Dangaard Brouer, Linux Netdev List

Peter P Waskiewicz Jr wrote:

> You might have this elsewhere, but it sounds like you're connecting back
> to back with another 82599 NIC.  Our optics in that NIC are dual-rate,
> and the software mechanism that tries to "autoneg" link speed gets out
> of sync easily in back-to-back setups.
> 
> If it's really annoying, and you're willing to run with a local patch to
> disable the autotry mechanism, try this:
> 
> diff --git a/drivers/net/ixgbe/ixgbe_main.c
> b/drivers/net/ixgbe/ixgbe_main.c
> index a5036f7..62c0915 100644
> --- a/drivers/net/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ixgbe/ixgbe_main.c
> @@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct
> work_struct *work)
>         autoneg = hw->phy.autoneg_advertised;
>         if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
>                 hw->mac.ops.get_link_capabilities(hw, &autoneg,
> &negotiation);
> +
> +       /* force 10G only */
> +       autoneg = IXGBE_LINK_SPEED_10GB_FULL;
> +
>         if (hw->mac.ops.setup_link)
>                 hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
>         adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;

Thanks ! This did the trick :)

If I am not mistaken, the number of TX queues should be capped by the number of possible cpus?

It's currently a fixed 128 value, allocating 128*128 = 16384 bytes,
and polluting the "tc -s -d class show dev fiber0" output.

[PATCH net-next-2.6] ixgbe: Do not allocate too many netdev txqueues

Instead of allocating 128 struct netdev_queue per device, use the minimum
of 128 and the number of possible cpus, to reduce RAM usage and shorten the
"tc -s -d class show dev ..." output.

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index ebcec30..ec2508d 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -5582,7 +5583,10 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 	pci_set_master(pdev);
 	pci_save_state(pdev);
 
-	netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter), MAX_TX_QUEUES);
+	netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter),
+				   min_t(unsigned int,
+					 MAX_TX_QUEUES,
+					 num_possible_cpus()));
 	if (!netdev) {
 		err = -ENOMEM;
 		goto err_alloc_etherdev;
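
A quick way to see the effect of the patch (just a sketch; with the default mq
qdisc every allocated tx queue shows up as one class):

tc -s -d class show dev fiber0 | grep -c ^class   # 128 before, 16 after on this 16-cpu box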

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-24  9:55                     ` Eric Dumazet
@ 2009-11-24 10:06                       ` Peter P Waskiewicz Jr
  2009-11-24 13:14                         ` John Fastabend
  2009-11-26 14:10                       ` Badalian Vyacheslav
  1 sibling, 1 reply; 27+ messages in thread
From: Peter P Waskiewicz Jr @ 2009-11-24 10:06 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: robert@herjulf.net, Jesper Dangaard Brouer, Linux Netdev List

On Tue, 2009-11-24 at 02:55 -0700, Eric Dumazet wrote:
> Peter P Waskiewicz Jr wrote:
> 
> > You might have this elsewhere, but it sounds like you're connecting back
> > to back with another 82599 NIC.  Our optics in that NIC are dual-rate,
> > and the software mechanism that tries to "autoneg" link speed gets out
> > of sync easily in back-to-back setups.
> > 
> > If it's really annoying, and you're willing to run with a local patch to
> > disable the autotry mechanism, try this:
> > 
> > diff --git a/drivers/net/ixgbe/ixgbe_main.c
> > b/drivers/net/ixgbe/ixgbe_main.c
> > index a5036f7..62c0915 100644
> > --- a/drivers/net/ixgbe/ixgbe_main.c
> > +++ b/drivers/net/ixgbe/ixgbe_main.c
> > @@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct
> > work_struct *work)
> >         autoneg = hw->phy.autoneg_advertised;
> >         if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
> >                 hw->mac.ops.get_link_capabilities(hw, &autoneg,
> > &negotiation);
> > +
> > +       /* force 10G only */
> > +       autoneg = IXGBE_LINK_SPEED_10GB_FULL;
> > +
> >         if (hw->mac.ops.setup_link)
> >                 hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
> >         adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
> 
> Thanks ! This did the trick :)
> 
> If I am not mistaken, number of TX queues should be capped by number of possible cpus ?
> 
> Its currently a fixed 128 value, allocating 128*128 = 16384 bytes,
> and polluting "tc -s -d class show dev fiber0" output.
> 

Yes, this is a stupid issue we haven't gotten around to fixing yet.
This looks fine to me.  Thanks for putting it together.

> [PATCH net-next-2.6] ixgbe: Do not allocate too many netdev txqueues
> 
> Instead of allocating 128 struct netdev_queue per device, use the minimum
> value between 128 and number of possible cpus, to reduce ram usage and
> "tc -s -d class show dev ..." output
> 
> diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
> index ebcec30..ec2508d 100644
> --- a/drivers/net/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ixgbe/ixgbe_main.c
> @@ -5582,7 +5583,10 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
>  	pci_set_master(pdev);
>  	pci_save_state(pdev);
>  
> -	netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter), MAX_TX_QUEUES);
> +	netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter),
> +				   min_t(unsigned int,
> +					 MAX_TX_QUEUES,
> +					 num_possible_cpus()));
>  	if (!netdev) {
>  		err = -ENOMEM;
>  		goto err_alloc_etherdev;


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-24 10:06                       ` Peter P Waskiewicz Jr
@ 2009-11-24 13:14                         ` John Fastabend
  2009-11-29  8:18                           ` David Miller
  0 siblings, 1 reply; 27+ messages in thread
From: John Fastabend @ 2009-11-24 13:14 UTC (permalink / raw)
  To: Peter P Waskiewicz Jr
  Cc: Eric Dumazet, robert@herjulf.net, Jesper Dangaard Brouer,
	Linux Netdev List

Peter P Waskiewicz Jr wrote:
> On Tue, 2009-11-24 at 02:55 -0700, Eric Dumazet wrote:
>   
>> Peter P Waskiewicz Jr wrote:
>>
>>     
>>> You might have this elsewhere, but it sounds like you're connecting back
>>> to back with another 82599 NIC.  Our optics in that NIC are dual-rate,
>>> and the software mechanism that tries to "autoneg" link speed gets out
>>> of sync easily in back-to-back setups.
>>>
>>> If it's really annoying, and you're willing to run with a local patch to
>>> disable the autotry mechanism, try this:
>>>
>>> diff --git a/drivers/net/ixgbe/ixgbe_main.c
>>> b/drivers/net/ixgbe/ixgbe_main.c
>>> index a5036f7..62c0915 100644
>>> --- a/drivers/net/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ixgbe/ixgbe_main.c
>>> @@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct
>>> work_struct *work)
>>>         autoneg = hw->phy.autoneg_advertised;
>>>         if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
>>>                 hw->mac.ops.get_link_capabilities(hw, &autoneg,
>>> &negotiation);
>>> +
>>> +       /* force 10G only */
>>> +       autoneg = IXGBE_LINK_SPEED_10GB_FULL;
>>> +
>>>         if (hw->mac.ops.setup_link)
>>>                 hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
>>>         adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
>>>       
>> Thanks ! This did the trick :)
>>
>> If I am not mistaken, number of TX queues should be capped by number of possible cpus ?
>>
>> Its currently a fixed 128 value, allocating 128*128 = 16384 bytes,
>> and polluting "tc -s -d class show dev fiber0" output.
>>
>>     
>
> Yes, this is a stupid issue we haven't gotten around to fixing yet.
> This looks fine to me.  Thanks for putting it together.
>
>   
I believe the patch below will break DCB and FCoE though; both features
have the potential to set real_num_tx_queues to greater than the number
of CPUs.  This could result in real_num_tx_queues > num_tx_queues.

The current solution isn't that great though; maybe we should set it to the
minimum of MAX_TX_QUEUES and num_possible_cpus() * 2 + 8 (rough arithmetic in
the sketch after the list below).

That should cover the maximum possible queues for DCB, FCoE and their 
combinations. 

general multiq = num_possible_cpus()
DCB = 8 tx queues
FCoE = 2*num_possible_cpus()
FCoE + DCB = 8 tx queues + num_possible_cpus
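
Rough arithmetic for that cap, as a sketch only (the CPU count visible to the
shell stands in for num_possible_cpus()):

ncpus=`getconf _NPROCESSORS_CONF`
want=$((ncpus * 2 + 8))            # FCoE (2 * cpus) plus DCB (8), worst case
[ $want -gt 128 ] && want=128      # never exceed MAX_TX_QUEUES (128)
echo "tx queues to allocate: $want"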

thanks,
john.



>> [PATCH net-next-2.6] ixgbe: Do not allocate too many netdev txqueues
>>
>> Instead of allocating 128 struct netdev_queue per device, use the minimum
>> value between 128 and number of possible cpus, to reduce ram usage and
>> "tc -s -d class show dev ..." output
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
>> index ebcec30..ec2508d 100644
>> --- a/drivers/net/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ixgbe/ixgbe_main.c
>> @@ -5582,7 +5583,10 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
>>  	pci_set_master(pdev);
>>  	pci_save_state(pdev);
>>  
>> -	netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter), MAX_TX_QUEUES);
>> +	netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter),
>> +				   min_t(unsigned int,
>> +					 MAX_TX_QUEUES,
>> +					 num_possible_cpus()));
>>  	if (!netdev) {
>>  		err = -ENOMEM;
>>  		goto err_alloc_etherdev;
>>     
>
>   



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-24  9:55                     ` Eric Dumazet
  2009-11-24 10:06                       ` Peter P Waskiewicz Jr
@ 2009-11-26 14:10                       ` Badalian Vyacheslav
  1 sibling, 0 replies; 27+ messages in thread
From: Badalian Vyacheslav @ 2009-11-26 14:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Peter P Waskiewicz Jr, robert@herjulf.net, Jesper Dangaard Brouer,
	Linux Netdev List

Eric Dumazet wrote:
> Peter P Waskiewicz Jr wrote:
> 
>> You might have this elsewhere, but it sounds like you're connecting back
>> to back with another 82599 NIC.  Our optics in that NIC are dual-rate,
>> and the software mechanism that tries to "autoneg" link speed gets out
>> of sync easily in back-to-back setups.
>>
>> If it's really annoying, and you're willing to run with a local patch to
>> disable the autotry mechanism, try this:
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_main.c
>> b/drivers/net/ixgbe/ixgbe_main.c
>> index a5036f7..62c0915 100644
>> --- a/drivers/net/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ixgbe/ixgbe_main.c
>> @@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct
>> work_struct *work)
>>         autoneg = hw->phy.autoneg_advertised;
>>         if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
>>                 hw->mac.ops.get_link_capabilities(hw, &autoneg,
>> &negotiation);
>> +
>> +       /* force 10G only */
>> +       autoneg = IXGBE_LINK_SPEED_10GB_FULL;
>> +
>>         if (hw->mac.ops.setup_link)
>>                 hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
>>         adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
> 
> Thanks ! This did the trick :)
> 
> If I am not mistaken, number of TX queues should be capped by number of possible cpus ?
> 
> Its currently a fixed 128 value, allocating 128*128 = 16384 bytes,
> and polluting "tc -s -d class show dev fiber0" output.
> 
> [PATCH net-next-2.6] ixgbe: Do not allocate too many netdev txqueues
> 
> Instead of allocating 128 struct netdev_queue per device, use the minimum
> value between 128 and number of possible cpus, to reduce ram usage and
> "tc -s -d class show dev ..." output
> 
> diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
> index ebcec30..ec2508d 100644
> --- a/drivers/net/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ixgbe/ixgbe_main.c
> @@ -5582,7 +5583,10 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
>  	pci_set_master(pdev);
>  	pci_save_state(pdev);
>  
> -	netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter), MAX_TX_QUEUES);
> +	netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter),
> +				   min_t(unsigned int,
> +					 MAX_TX_QUEUES,
> +					 num_possible_cpus()));
>  	if (!netdev) {
>  		err = -ENOMEM;
>  		goto err_alloc_etherdev;
> 
> 

This also fixes the long TC rule loading time for me.

Tested-by: Badalian Vyacheslav <slavon.net@gmail.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-24 13:14                         ` John Fastabend
@ 2009-11-29  8:18                           ` David Miller
  2009-11-30 13:02                             ` Eric Dumazet
  0 siblings, 1 reply; 27+ messages in thread
From: David Miller @ 2009-11-29  8:18 UTC (permalink / raw)
  To: john.r.fastabend
  Cc: peter.p.waskiewicz.jr, eric.dumazet, robert, hawk, netdev

From: John Fastabend <john.r.fastabend@intel.com>
Date: Tue, 24 Nov 2009 13:14:12 +0000

> Believe the below patch will break DCB and FCoE though, both features
> have the potential to set real_num_tx_queues to greater then the
> number of CPUs.  This could result in real_num_tx_queues >
> num_tx_queues.
> 
> The current solution isn't that great though, maybe we should set to
> the minimum of MAX_TX_QUEUES and num_possible_cpus() * 2 + 8.
> 
> That should cover the maximum possible queues for DCB, FCoE and their
> combinations.
> 
> general multiq = num_possible_cpus()
> DCB = 8 tx queue's
> FCoE = 2*num_possible_cpus()
> FCoE + DCB = 8 tx queues + num_possible_cpus

Eric, I'm tossing your patch because of this problem, just FYI.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-29  8:18                           ` David Miller
@ 2009-11-30 13:02                             ` Eric Dumazet
  2009-11-30 20:20                               ` John Fastabend
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Dumazet @ 2009-11-30 13:02 UTC (permalink / raw)
  To: David Miller
  Cc: john.r.fastabend, peter.p.waskiewicz.jr, robert, hawk, netdev

David Miller wrote:
> From: John Fastabend <john.r.fastabend@intel.com>
> Date: Tue, 24 Nov 2009 13:14:12 +0000
> 
>> Believe the below patch will break DCB and FCoE though, both features
>> have the potential to set real_num_tx_queues to greater then the
>> number of CPUs.  This could result in real_num_tx_queues >
>> num_tx_queues.
>>
>> The current solution isn't that great though, maybe we should set to
>> the minimum of MAX_TX_QUEUES and num_possible_cpus() * 2 + 8.
>>
>> That should cover the maximum possible queues for DCB, FCoE and their
>> combinations.
>>
>> general multiq = num_possible_cpus()
>> DCB = 8 tx queue's
>> FCoE = 2*num_possible_cpus()
>> FCoE + DCB = 8 tx queues + num_possible_cpus
> 
> Eric, I'm tossing your patch because of this problem, just FYI.

Sure, I guess we need a more generic way to handle this.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: ixgbe question
  2009-11-30 13:02                             ` Eric Dumazet
@ 2009-11-30 20:20                               ` John Fastabend
  0 siblings, 0 replies; 27+ messages in thread
From: John Fastabend @ 2009-11-30 20:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Waskiewicz Jr, Peter P, robert@herjulf.net,
	hawk@diku.dk, netdev@vger.kernel.org

Eric Dumazet wrote:
> David Miller wrote:
>   
>> From: John Fastabend <john.r.fastabend@intel.com>
>> Date: Tue, 24 Nov 2009 13:14:12 +0000
>>
>>     
>>> Believe the below patch will break DCB and FCoE though, both features
>>> have the potential to set real_num_tx_queues to greater then the
>>> number of CPUs.  This could result in real_num_tx_queues >
>>> num_tx_queues.
>>>
>>> The current solution isn't that great though, maybe we should set to
>>> the minimum of MAX_TX_QUEUES and num_possible_cpus() * 2 + 8.
>>>
>>> That should cover the maximum possible queues for DCB, FCoE and their
>>> combinations.
>>>
>>> general multiq = num_possible_cpus()
>>> DCB = 8 tx queue's
>>> FCoE = 2*num_possible_cpus()
>>> FCoE + DCB = 8 tx queues + num_possible_cpus
>>>       
>> Eric, I'm tossing your patch because of this problem, just FYI.
>>     
>
> Sure, I guess we need a more generic way to handle this.
>
>   
Eric,

I'll resubmit your patch with a small update to fix my concerns soon.

thanks,
john.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2009-11-30 20:20 UTC | newest]

Thread overview: 27+ messages
-- links below jump to the message on this page --
2008-03-10 21:27 Ixgbe question Ben Greear
2008-03-11  1:01 ` Brandeburg, Jesse
  -- strict thread matches above, loose matches on Subject: below --
2009-11-23  6:46 [PATCH] irq: Add node_affinity CPU masks for smarter irqbalance hints Peter P Waskiewicz Jr
2009-11-23  7:32 ` Yong Zhang
2009-11-23  9:36   ` Peter P Waskiewicz Jr
2009-11-23 10:21     ` ixgbe question Eric Dumazet
2009-11-23 10:30       ` Badalian Vyacheslav
2009-11-23 10:34       ` Waskiewicz Jr, Peter P
2009-11-23 10:37         ` Eric Dumazet
2009-11-23 14:05           ` Eric Dumazet
2009-11-23 21:26           ` David Miller
2009-11-23 14:10       ` Jesper Dangaard Brouer
2009-11-23 14:38         ` Eric Dumazet
2009-11-23 18:30           ` robert
2009-11-23 16:59             ` Eric Dumazet
2009-11-23 20:54               ` robert
2009-11-23 21:28                 ` David Miller
2009-11-23 22:14                   ` Robert Olsson
2009-11-23 23:28               ` Waskiewicz Jr, Peter P
2009-11-23 23:44                 ` David Miller
2009-11-24  7:46                 ` Eric Dumazet
2009-11-24  8:46                   ` Badalian Vyacheslav
2009-11-24  9:07                   ` Peter P Waskiewicz Jr
2009-11-24  9:55                     ` Eric Dumazet
2009-11-24 10:06                       ` Peter P Waskiewicz Jr
2009-11-24 13:14                         ` John Fastabend
2009-11-29  8:18                           ` David Miller
2009-11-30 13:02                             ` Eric Dumazet
2009-11-30 20:20                               ` John Fastabend
2009-11-26 14:10                       ` Badalian Vyacheslav
