* Ixgbe question
@ 2008-03-10 21:27 Ben Greear
2008-03-11 1:01 ` Brandeburg, Jesse
0 siblings, 1 reply; 27+ messages in thread
From: Ben Greear @ 2008-03-10 21:27 UTC (permalink / raw)
To: NetDev
I have a pair of IXGBE NICs in a system, and notice a strange
case where the NIC is not always 'UP' after my programs finish
trying to configure it. I haven't noticed this with any other
NICs, but I also just moved to 2.6.23.17 from 23.9 and 23.14.
I see this in the logs:
ADDRCONF(NETDEV_UP): eth0: link is not ready
It would seem to me that we should be able to set the admin
state to UP, even if the link is not up??
Kernel is 2.6.23.17 plus my hacks. Ixgbe driver is version 1.3.7.8-lro.
Hardware is a quad-core Intel box with two built-in e1000 NICs and a 2-port ixgbe (CX4)
NIC on a PCIe riser.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* RE: Ixgbe question
2008-03-10 21:27 Ixgbe question Ben Greear
@ 2008-03-11 1:01 ` Brandeburg, Jesse
0 siblings, 0 replies; 27+ messages in thread
From: Brandeburg, Jesse @ 2008-03-11 1:01 UTC (permalink / raw)
To: Ben Greear; +Cc: netdev
Ben Greear wrote:
> I have a pair of IXGBE NICs in a system, and notice a strange
> case where the NIC is not always 'UP' after my programs finish
> trying to configure it. I haven't noticed this with any other
> NICs, but I also just moved to 2.6.23.17 from 23.9 and 23.14.
>
> I see this in the logs:
>
> ADDRCONF(NETDEV_UP): eth0: link is not ready
>
> It would seem to me that we should be able to set the admin
> state to UP, even if the link is not up??
>
> Kernel is 2.6.23.17 plus my hacks. Ixgbe driver is version
> 1.3.7.8-lro.
>
> Hardware is quad-core Intel, 2 build-in e1000, 2-port ixgbe (CX4)
> chipset NIC
> on a pcie riser.
addrconf_notify() is printing that message, and I see it whenever IPv6 is enabled
on an interface. I don't think it is ixgbe specific.
It neither depends on nor precludes administrative UP; it comes from the notify
handler receiving a NETDEV_UP event, regardless of link state.
You'll see another message like:
ADDRCONF(NETDEV_CHANGE): eth6: link becomes ready
when you plug in the cable, all courtesy of IPv6.
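For example, you can see the two states separately from sysfs (a quick illustration,
the interface name is arbitrary):
ip link set eth0 up                 # admin state goes UP even with the cable unplugged
cat /sys/class/net/eth0/operstate   # typically "down" until carrier is detected
cat /sys/class/net/eth0/carrier     # 0 = no link, 1 = link up
ip link show eth0                   # flags show UP together with NO-CARRIER while unplugged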
Jesse
^ permalink raw reply [flat|nested] 27+ messages in thread
* ixgbe question
2009-11-23 9:36 ` Peter P Waskiewicz Jr
@ 2009-11-23 10:21 ` Eric Dumazet
2009-11-23 10:30 ` Badalian Vyacheslav
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 10:21 UTC (permalink / raw)
To: Peter P Waskiewicz Jr; +Cc: Linux Netdev List
Hi Peter
I tried a pktgen stress on an 82599EB card and could not split the RX load across multiple CPUs.
Setup is:
One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
The machine is an HP DL380 G6 with dual quad-core E5530 @ 2.4GHz (16 logical CPUs).
I use one pktgen thread sending to fiber0 with many dst IPs, and checked that fiber1
was using many RX queues:
grep fiber1 /proc/interrupts
117: 1301 13060 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-0
118: 601 1402 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-1
119: 634 832 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-2
120: 601 1303 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-3
121: 620 1246 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-4
122: 1287 13088 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-5
123: 606 1354 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-6
124: 653 827 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-7
125: 639 825 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-8
126: 596 1199 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-9
127: 2013 24800 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-10
128: 648 1353 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-11
129: 601 1123 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-12
130: 625 834 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-13
131: 665 1409 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-14
132: 2637 31699 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-15
133: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1:lsc
But only one CPU (CPU1) had a softirq running at 100%, and many frames were dropped:
root@demodl380g6:/usr/src# ifconfig fiber0
fiber0 Link encap:Ethernet HWaddr 00:1b:21:4a:fe:54
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Packets reçus:4 erreurs:0 :0 overruns:0 frame:0
TX packets:309291576 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 lg file transmission:1000
Octets reçus:1368 (1.3 KB) Octets transmis:18557495682 (18.5 GB)
root@demodl380g6:/usr/src# ifconfig fiber1
fiber1 Link encap:Ethernet HWaddr 00:1b:21:4a:fe:55
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Packets reçus:55122164 erreurs:0 :254169411 overruns:0 frame:0
TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 lg file transmission:1000
Octets reçus:3307330968 (3.3 GB) Octets transmis:1368 (1.3 KB)
How and when can multiqueue RX really start to use several CPUs?
Thanks
Eric
pktgen script:
pgset()
{
local result
echo $1 > $PGDEV
result=`cat $PGDEV | fgrep "Result: OK:"`
if [ "$result" = "" ]; then
cat $PGDEV | fgrep Result:
fi
}
pg()
{
echo inject > $PGDEV
cat $PGDEV
}
PGDEV=/proc/net/pktgen/kpktgend_4
echo "Adding fiber0"
pgset "add_device fiber0@0"
CLONE_SKB="clone_skb 15"
PKT_SIZE="pkt_size 60"
COUNT="count 100000000"
DELAY="delay 0"
PGDEV=/proc/net/pktgen/fiber0@0
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "$PKT_SIZE"
pgset "$DELAY"
pgset "queue_map_min 0"
pgset "queue_map_max 7"
pgset "dst_min 192.168.0.2"
pgset "dst_max 192.168.0.250"
pgset "src_min 192.168.0.1"
pgset "src_max 192.168.0.1"
pgset "dst_mac 00:1b:21:4a:fe:55"
# Time to run
PGDEV=/proc/net/pktgen/pgctrl
echo "Running... ctrl^C to stop"
pgset "start"
echo "Done"
# Result can be viewed in /proc/net/pktgen/fiber0@0
for f in fiber0@0
do
cat /proc/net/pktgen/$f
done
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 10:21 ` ixgbe question Eric Dumazet
@ 2009-11-23 10:30 ` Badalian Vyacheslav
2009-11-23 10:34 ` Waskiewicz Jr, Peter P
2009-11-23 14:10 ` Jesper Dangaard Brouer
2 siblings, 0 replies; 27+ messages in thread
From: Badalian Vyacheslav @ 2009-11-23 10:30 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Peter P Waskiewicz Jr, Linux Netdev List
Hello Eric. I have been playing with this card for 3 weeks, so maybe I can help you :)
By default the Intel flow distribution uses only the first CPU. It's strange.
If we set an interrupt's affinity to a single CPU core, it will use that core.
If we set the affinity to two or more CPUs, the mask is applied but it doesn't work.
See the ixgbe driver README from intel.com. It has a parameter for RSS flow distribution; I think that does this :)
Also, the driver from intel.com has a script to split the RxTx interrupts across CPU cores, but you must replace "tx rx" in the script with "TxRx".
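A hand-rolled equivalent might look roughly like this (untested sketch; it pins each
fiber1 TxRx vector to one of the first 16 CPUs, using the interrupt names shown in
/proc/interrupts above):
i=0
for irq in $(awk -F: '/fiber1-TxRx-/ { gsub(/ /, "", $1); print $1 }' /proc/interrupts); do
    mask=$(printf "%x" $((1 << i)))     # one CPU bit per queue
    echo $mask > /proc/irq/$irq/smp_affinity
    i=$((i + 1))
done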
P.S. Please also look at this if you can and want to:
On e1000 with an x86 kernel + 2x dual-core Xeon, my TC rules load in 3 minutes.
On ixgbe with an x86_64 kernel + 4x six-core Xeon, my TC rules take more than 15 minutes to load!
Is it a 64-bit regression?
I can send you the TC rules if you ask for them! Thanks!
Slavon
> Hi Peter
>
> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
>
> Setup is :
>
> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)
>
> I use one pktgen thread sending to fiber0 one many dst IP, and checked that fiber1
> was using many RX queues :
>
> grep fiber1 /proc/interrupts
> 117: 1301 13060 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-0
> 118: 601 1402 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-1
> 119: 634 832 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-2
> 120: 601 1303 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-3
> 121: 620 1246 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-4
> 122: 1287 13088 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-5
> 123: 606 1354 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-6
> 124: 653 827 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-7
> 125: 639 825 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-8
> 126: 596 1199 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-9
> 127: 2013 24800 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-10
> 128: 648 1353 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-11
> 129: 601 1123 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-12
> 130: 625 834 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-13
> 131: 665 1409 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-14
> 132: 2637 31699 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-15
> 133: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1:lsc
>
>
>
> But only one CPU (CPU1) had a softirq running, 100%, and many frames were dropped
>
> root@demodl380g6:/usr/src# ifconfig fiber0
> fiber0 Link encap:Ethernet HWaddr 00:1b:21:4a:fe:54
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> Packets reçus:4 erreurs:0 :0 overruns:0 frame:0
> TX packets:309291576 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 lg file transmission:1000
> Octets reçus:1368 (1.3 KB) Octets transmis:18557495682 (18.5 GB)
>
> root@demodl380g6:/usr/src# ifconfig fiber1
> fiber1 Link encap:Ethernet HWaddr 00:1b:21:4a:fe:55
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> Packets reçus:55122164 erreurs:0 :254169411 overruns:0 frame:0
> TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 lg file transmission:1000
> Octets reçus:3307330968 (3.3 GB) Octets transmis:1368 (1.3 KB)
>
>
> How and when multi queue rx can really start to use several cpus ?
>
> Thanks
> Eric
>
>
> pktgen script :
>
> pgset()
> {
> local result
>
> echo $1 > $PGDEV
>
> result=`cat $PGDEV | fgrep "Result: OK:"`
> if [ "$result" = "" ]; then
> cat $PGDEV | fgrep Result:
> fi
> }
>
> pg()
> {
> echo inject > $PGDEV
> cat $PGDEV
> }
>
>
> PGDEV=/proc/net/pktgen/kpktgend_4
>
> echo "Adding fiber0"
> pgset "add_device fiber0@0"
>
>
> CLONE_SKB="clone_skb 15"
>
> PKT_SIZE="pkt_size 60"
>
>
> COUNT="count 100000000"
> DELAY="delay 0"
>
> PGDEV=/proc/net/pktgen/fiber0@0
> echo "Configuring $PGDEV"
> pgset "$COUNT"
> pgset "$CLONE_SKB"
> pgset "$PKT_SIZE"
> pgset "$DELAY"
> pgset "queue_map_min 0"
> pgset "queue_map_max 7"
> pgset "dst_min 192.168.0.2"
> pgset "dst_max 192.168.0.250"
> pgset "src_min 192.168.0.1"
> pgset "src_max 192.168.0.1"
> pgset "dst_mac 00:1b:21:4a:fe:55"
>
>
> # Time to run
> PGDEV=/proc/net/pktgen/pgctrl
>
> echo "Running... ctrl^C to stop"
> pgset "start"
> echo "Done"
>
> # Result can be vieved in /proc/net/pktgen/fiber0@0
>
> for f in fiber0@0
> do
> cat /proc/net/pktgen/$f
> done
>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 10:21 ` ixgbe question Eric Dumazet
2009-11-23 10:30 ` Badalian Vyacheslav
@ 2009-11-23 10:34 ` Waskiewicz Jr, Peter P
2009-11-23 10:37 ` Eric Dumazet
2009-11-23 14:10 ` Jesper Dangaard Brouer
2 siblings, 1 reply; 27+ messages in thread
From: Waskiewicz Jr, Peter P @ 2009-11-23 10:34 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Waskiewicz Jr, Peter P, Linux Netdev List
On Mon, 23 Nov 2009, Eric Dumazet wrote:
> Hi Peter
>
> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
>
> Setup is :
>
> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)
Can you specify kernel version and driver version?
>
> I use one pktgen thread sending to fiber0 one many dst IP, and checked that fiber1
> was using many RX queues :
>
> grep fiber1 /proc/interrupts
> 117: 1301 13060 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-0
> 118: 601 1402 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-1
> 119: 634 832 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-2
> 120: 601 1303 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-3
> 121: 620 1246 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-4
> 122: 1287 13088 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-5
> 123: 606 1354 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-6
> 124: 653 827 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-7
> 125: 639 825 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-8
> 126: 596 1199 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-9
> 127: 2013 24800 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-10
> 128: 648 1353 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-11
> 129: 601 1123 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-12
> 130: 625 834 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-13
> 131: 665 1409 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-14
> 132: 2637 31699 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-15
> 133: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1:lsc
>
>
>
> But only one CPU (CPU1) had a softirq running, 100%, and many frames were dropped
>
> root@demodl380g6:/usr/src# ifconfig fiber0
> fiber0 Link encap:Ethernet HWaddr 00:1b:21:4a:fe:54
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> Packets reçus:4 erreurs:0 :0 overruns:0 frame:0
> TX packets:309291576 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 lg file transmission:1000
> Octets reçus:1368 (1.3 KB) Octets transmis:18557495682 (18.5 GB)
>
> root@demodl380g6:/usr/src# ifconfig fiber1
> fiber1 Link encap:Ethernet HWaddr 00:1b:21:4a:fe:55
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> Packets reçus:55122164 erreurs:0 :254169411 overruns:0 frame:0
> TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 lg file transmission:1000
> Octets reçus:3307330968 (3.3 GB) Octets transmis:1368 (1.3 KB)
I stay in the States too much. I love seeing net stats in French. :-)
>
>
> How and when multi queue rx can really start to use several cpus ?
If you're sending one flow to many consumers, it's still one flow. Even
using RSS won't help, since it requires differing flows to spread load
(5-tuple matches for flow distribution).
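Something like this in the pktgen config should give the hardware enough tuple
variety (rough sketch; IPDST_RND/UDPSRC_RND are standard pktgen flags, the port
range is only illustrative):
pgset "flag IPDST_RND"      # randomize destination IP within dst_min..dst_max
pgset "flag UDPSRC_RND"     # randomize the UDP source port
pgset "udp_src_min 1024"
pgset "udp_src_max 65000"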
Cheers,
-PJ
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 10:34 ` Waskiewicz Jr, Peter P
@ 2009-11-23 10:37 ` Eric Dumazet
2009-11-23 14:05 ` Eric Dumazet
2009-11-23 21:26 ` David Miller
0 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 10:37 UTC (permalink / raw)
To: Waskiewicz Jr, Peter P; +Cc: Linux Netdev List
Waskiewicz Jr, Peter P wrote:
> On Mon, 23 Nov 2009, Eric Dumazet wrote:
>
>> Hi Peter
>>
>> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
>>
>> Setup is :
>>
>> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
>> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)
>
> Can you specify kernel version and driver version?
Well, I forgot to mention that I am only working with the net-next-2.6 tree.
Ubuntu 9.10 kernel (the Fedora Core 12 installer was not able to recognize the disks on this machine :( )
ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44-k2
>
>> I use one pktgen thread sending to fiber0 one many dst IP, and checked that fiber1
>> was using many RX queues :
>>
>> grep fiber1 /proc/interrupts
>> 117: 1301 13060 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-0
>> 118: 601 1402 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-1
>> 119: 634 832 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-2
>> 120: 601 1303 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-3
>> 121: 620 1246 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-4
>> 122: 1287 13088 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-5
>> 123: 606 1354 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-6
>> 124: 653 827 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-7
>> 125: 639 825 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-8
>> 126: 596 1199 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-9
>> 127: 2013 24800 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-10
>> 128: 648 1353 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-11
>> 129: 601 1123 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-12
>> 130: 625 834 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-13
>> 131: 665 1409 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-14
>> 132: 2637 31699 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1-TxRx-15
>> 133: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge fiber1:lsc
>>
>>
>>
>> But only one CPU (CPU1) had a softirq running, 100%, and many frames were dropped
>>
>> root@demodl380g6:/usr/src# ifconfig fiber0
>> fiber0 Link encap:Ethernet HWaddr 00:1b:21:4a:fe:54
>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> Packets reçus:4 erreurs:0 :0 overruns:0 frame:0
>> TX packets:309291576 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 lg file transmission:1000
>> Octets reçus:1368 (1.3 KB) Octets transmis:18557495682 (18.5 GB)
>>
>> root@demodl380g6:/usr/src# ifconfig fiber1
>> fiber1 Link encap:Ethernet HWaddr 00:1b:21:4a:fe:55
>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> Packets reçus:55122164 erreurs:0 :254169411 overruns:0 frame:0
>> TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 lg file transmission:1000
>> Octets reçus:3307330968 (3.3 GB) Octets transmis:1368 (1.3 KB)
>
> I stay in the states too much. I love seeing net stats in French. :-)
Ok :)
>
>>
>> How and when multi queue rx can really start to use several cpus ?
>
> If you're sending one flow to many consumers, it's still one flow. Even
> using RSS won't help, since it requires differing flows to spread load
> (5-tuple matches for flow distribution).
Hm... I can try varying both src and dst on my pktgen test.
Thanks
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 10:37 ` Eric Dumazet
@ 2009-11-23 14:05 ` Eric Dumazet
2009-11-23 21:26 ` David Miller
1 sibling, 0 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 14:05 UTC (permalink / raw)
To: Waskiewicz Jr, Peter P; +Cc: Linux Netdev List
Eric Dumazet wrote:
> Waskiewicz Jr, Peter P a écrit :
>> On Mon, 23 Nov 2009, Eric Dumazet wrote:
>>
>>> Hi Peter
>>>
>>> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
>>>
>>> Setup is :
>>>
>>> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
>>> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)
>> Can you specify kernel version and driver version?
>
>
> Well, I forgot to mention I am only working with net-next-2.6 tree.
>
> Ubuntu 9.10 kernel (Fedora Core 12 installer was not able to recognize disks on this machine :( )
>
> ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44-k2
>
>
I tried with several pktgen threads, no success so far.
Only one CPU handles all the interrupts, and ksoftirqd enters
a mode with no escape back to split processing.
To get truly multiqueue, uncontended handling, I had to force:
echo 1 >`echo /proc/irq/*/fiber1-TxRx-0/../smp_affinity`
echo 2 >`echo /proc/irq/*/fiber1-TxRx-1/../smp_affinity`
echo 4 >`echo /proc/irq/*/fiber1-TxRx-2/../smp_affinity`
echo 8 >`echo /proc/irq/*/fiber1-TxRx-3/../smp_affinity`
echo 10 >`echo /proc/irq/*/fiber1-TxRx-4/../smp_affinity`
echo 20 >`echo /proc/irq/*/fiber1-TxRx-5/../smp_affinity`
echo 40 >`echo /proc/irq/*/fiber1-TxRx-6/../smp_affinity`
echo 80 >`echo /proc/irq/*/fiber1-TxRx-7/../smp_affinity`
echo 100 >`echo /proc/irq/*/fiber1-TxRx-8/../smp_affinity`
echo 200 >`echo /proc/irq/*/fiber1-TxRx-9/../smp_affinity`
echo 400 >`echo /proc/irq/*/fiber1-TxRx-10/../smp_affinity`
echo 800 >`echo /proc/irq/*/fiber1-TxRx-11/../smp_affinity`
echo 1000 >`echo /proc/irq/*/fiber1-TxRx-12/../smp_affinity`
echo 2000 >`echo /proc/irq/*/fiber1-TxRx-13/../smp_affinity`
echo 4000 >`echo /proc/irq/*/fiber1-TxRx-14/../smp_affinity`
echo 8000 >`echo /proc/irq/*/fiber1-TxRx-15/../smp_affinity`
The problem probably comes from the fact that when ksoftirqd runs and the
RX queues are never depleted, no hardware interrupt is sent,
and the NAPI contexts stay stuck on one CPU forever?
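(To watch where the NET_RX softirqs actually run, assuming a kernel recent enough
to have /proc/softirqs, something like:
watch -n1 'grep -E "CPU|NET_RX" /proc/softirqs'
or, with sysstat installed, mpstat -P ALL 2 and look at the %soft column.)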
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 10:21 ` ixgbe question Eric Dumazet
2009-11-23 10:30 ` Badalian Vyacheslav
2009-11-23 10:34 ` Waskiewicz Jr, Peter P
@ 2009-11-23 14:10 ` Jesper Dangaard Brouer
2009-11-23 14:38 ` Eric Dumazet
2 siblings, 1 reply; 27+ messages in thread
From: Jesper Dangaard Brouer @ 2009-11-23 14:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Peter P Waskiewicz Jr, Linux Netdev List
On Mon, 23 Nov 2009, Eric Dumazet wrote:
> I tried a pktgen stress on 82599EB card and could not split RX load on multiple cpus.
>
> Setup is :
>
> One 82599 card with fiber0 looped to fiber1, 10Gb link mode.
> machine is a HPDL380 G6 with dual quadcore E5530 @2.4GHz (16 logical cpus)
>
> I use one pktgen thread sending to fiber0 one many dst IP, and checked that fiber1
> was using many RX queues :
How are your smp_affinity masks set?
grep . /proc/irq/*/fiber1-*/../smp_affinity
> But only one CPU (CPU1) had a softirq running, 100%, and many frames were dropped
Just a hint: I use 'ethtool -S fiber1' to see how the packets get
distributed across the RX and TX queues.
> CLONE_SKB="clone_skb 15"
Be careful with too high a clone_skb, as in my experience it will send a burst of
clone_skb identical packets before the packet gets randomized again.
> pgset "dst_min 192.168.0.2"
> pgset "dst_max 192.168.0.250"
> pgset "src_min 192.168.0.1"
> pgset "src_max 192.168.0.1"
> pgset "dst_mac 00:1b:21:4a:fe:55"
To get packets randomized across RX queues, I used:
echo "- Random UDP source port min:$min - max:$max"
pgset "flag UDPSRC_RND"
pgset "udp_src_min $min"
pgset "udp_src_max $max"
Ahh.. I think you are missing:
pgset "flag IPDST_RND"
Cheers,
Jesper Brouer
--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 14:10 ` Jesper Dangaard Brouer
@ 2009-11-23 14:38 ` Eric Dumazet
2009-11-23 18:30 ` robert
0 siblings, 1 reply; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 14:38 UTC (permalink / raw)
To: Jesper Dangaard Brouer; +Cc: Peter P Waskiewicz Jr, Linux Netdev List
Jesper Dangaard Brouer wrote:
> How is your smp_affinity mask's set?
>
> grep . /proc/irq/*/fiber1-*/../smp_affinity
First, I tried the default affinities (ffff),
then I tried irqbalance... no more success.
The driver seems to handle all queues on one CPU under low traffic,
and possibly switches dynamically to a multi-CPU mode,
but as all interrupts are masked, we stay in
a NAPI context handling all the queues,
and we leave one CPU in flood/drop mode.
>
>
>> But only one CPU (CPU1) had a softirq running, 100%, and many frames
>> were dropped
>
> Just a hint, I use 'ethtool -S fiber1' to see how the packets gets
> distributed across the rx and tx queues.
They are correctly distributed
rx_queue_0_packets: 14119644
rx_queue_0_bytes: 847178640
rx_queue_1_packets: 14126315
rx_queue_1_bytes: 847578900
rx_queue_2_packets: 14115249
rx_queue_2_bytes: 846914940
rx_queue_3_packets: 14118146
rx_queue_3_bytes: 847088760
rx_queue_4_packets: 14130869
rx_queue_4_bytes: 847853268
rx_queue_5_packets: 14112239
rx_queue_5_bytes: 846734340
rx_queue_6_packets: 14128425
rx_queue_6_bytes: 847705500
rx_queue_7_packets: 14110587
rx_queue_7_bytes: 846635220
rx_queue_8_packets: 14117350
rx_queue_8_bytes: 847041000
rx_queue_9_packets: 14125992
rx_queue_9_bytes: 847559520
rx_queue_10_packets: 14121732
rx_queue_10_bytes: 847303920
rx_queue_11_packets: 14120997
rx_queue_11_bytes: 847259820
rx_queue_12_packets: 14125576
rx_queue_12_bytes: 847535854
rx_queue_13_packets: 14118512
rx_queue_13_bytes: 847110720
rx_queue_14_packets: 14118348
rx_queue_14_bytes: 847100880
rx_queue_15_packets: 14118647
rx_queue_15_bytes: 847118820
>
>
>
>> CLONE_SKB="clone_skb 15"
>
> Be careful with to high clone, as my experience is it will send a burst
> of clone_skb packets before the packet gets randomized again.
Yes, but 15 should be ok with 10Gb link :)
Thanks
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 18:30 ` robert
@ 2009-11-23 16:59 ` Eric Dumazet
2009-11-23 20:54 ` robert
2009-11-23 23:28 ` Waskiewicz Jr, Peter P
0 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-23 16:59 UTC (permalink / raw)
To: robert; +Cc: Jesper Dangaard Brouer, Peter P Waskiewicz Jr, Linux Netdev List
robert@herjulf.net wrote:
> Eric Dumazet writes:
>
> > Jesper Dangaard Brouer a écrit :
> >
> > > How is your smp_affinity mask's set?
> > >
> > > grep . /proc/irq/*/fiber1-*/../smp_affinity
> >
>
> Weird... set clone_skb to 1 to be sure and vary dst or something so
> the HW classifier selects different queues and with proper RX affinty.
>
> You should see in /proc/net/softnet_stat something like:
>
> 012a7bb9 00000000 000000ae 00000000 00000000 00000000 00000000 00000000 00000000
> 01288d4c 00000000 00000049 00000000 00000000 00000000 00000000 00000000 00000000
> 0128fe28 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
> 01295387 00000000 00000047 00000000 00000000 00000000 00000000 00000000 00000000
> 0129a722 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
> 0128c5e4 00000000 00000046 00000000 00000000 00000000 00000000 00000000 00000000
> 0128f718 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
> 012993e3 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
>
clone_skb set to 1: this changes nothing but slows down pktgen (obviously).
Result: OK: 117614452(c117608705+d5746) nsec, 100000000 (60byte,0frags)
850235pps 408Mb/sec (408112800bps) errors: 0
All RX processing of 16 RX queues done by CPU 1 only.
# cat /proc/net/softnet_stat ; sleep 2 ; echo "--------------";cat /proc/net/softnet_stat
0039f331 00000000 00002e10 00000000 00000000 00000000 00000000 00000000 00000000
03f2ed19 00000000 00037ca2 00000000 00000000 00000000 00000000 00000000 00000000
00000024 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000041 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000028 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000000b 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000000c5 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000010d 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000250 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000498 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000616 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000012c 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000000d2 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000025d 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000003c 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000127 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
--------------
0039f331 00000000 00002e10 00000000 00000000 00000000 00000000 00000000 00000000
03f66737 00000000 00038015 00000000 00000000 00000000 00000000 00000000 00000000
00000024 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000041 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000028 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000000b 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000000c5 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000110 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000250 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000499 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000616 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000012c 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000000d2 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000263 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0000003c 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000129 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
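(For reference, the first three columns are the per-CPU processed, dropped and
time_squeeze counters, in hex; a quick one-liner to decode them, assuming GNU awk:
awk '{ printf "cpu%-2d processed=%d dropped=%d time_squeeze=%d\n", NR-1, strtonum("0x" $1), strtonum("0x" $2), strtonum("0x" $3) }' /proc/net/softnet_stat
)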
ethtool -S fiber1 (to show how my traffic is equally distributed to the 16 RX queues):
rx_queue_0_packets: 4867706
rx_queue_0_bytes: 292062360
rx_queue_1_packets: 4862472
rx_queue_1_bytes: 291748320
rx_queue_2_packets: 4867111
rx_queue_2_bytes: 292026660
rx_queue_3_packets: 4859897
rx_queue_3_bytes: 291593820
rx_queue_4_packets: 4862267
rx_queue_4_bytes: 291740814
rx_queue_5_packets: 4861517
rx_queue_5_bytes: 291691020
rx_queue_6_packets: 4862699
rx_queue_6_bytes: 291761940
rx_queue_7_packets: 4860523
rx_queue_7_bytes: 291631380
rx_queue_8_packets: 4856891
rx_queue_8_bytes: 291413460
rx_queue_9_packets: 4868794
rx_queue_9_bytes: 292127640
rx_queue_10_packets: 4859099
rx_queue_10_bytes: 291545940
rx_queue_11_packets: 4867599
rx_queue_11_bytes: 292055940
rx_queue_12_packets: 4861868
rx_queue_12_bytes: 291713374
rx_queue_13_packets: 4862655
rx_queue_13_bytes: 291759300
rx_queue_14_packets: 4860798
rx_queue_14_bytes: 291647880
rx_queue_15_packets: 4860951
rx_queue_15_bytes: 291657060
perf top -C 1 -E 25
------------------------------------------------------------------------------
PerfTop: 24419 irqs/sec kernel:100.0% [100000 cycles], (all, cpu: 1)
------------------------------------------------------------------------------
samples pcnt kernel function
_______ _____ _______________
46234.00 - 24.3% : ixgbe_clean_tx_irq [ixgbe]
21134.00 - 11.1% : __slab_free
17838.00 - 9.4% : _raw_spin_lock
17086.00 - 9.0% : skb_release_head_state
9410.00 - 5.0% : ixgbe_clean_rx_irq [ixgbe]
8639.00 - 4.5% : kmem_cache_free
6910.00 - 3.6% : kfree
5743.00 - 3.0% : __ip_route_output_key
5321.00 - 2.8% : ip_route_input
3138.00 - 1.7% : ip_rcv
2179.00 - 1.1% : kmem_cache_alloc_node
2002.00 - 1.1% : __kmalloc_node_track_caller
1907.00 - 1.0% : skb_put
1807.00 - 1.0% : __xfrm_lookup
1742.00 - 0.9% : get_partial_node
1727.00 - 0.9% : csum_partial_copy_generic
1541.00 - 0.8% : add_partial
1516.00 - 0.8% : __kfree_skb
1465.00 - 0.8% : __netdev_alloc_skb
1420.00 - 0.7% : icmp_send
1222.00 - 0.6% : dev_gro_receive
1159.00 - 0.6% : fib_table_lookup
1155.00 - 0.6% : __phys_addr
1050.00 - 0.6% : skb_release_data
982.00 - 0.5% : _raw_spin_unlock
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 14:38 ` Eric Dumazet
@ 2009-11-23 18:30 ` robert
2009-11-23 16:59 ` Eric Dumazet
0 siblings, 1 reply; 27+ messages in thread
From: robert @ 2009-11-23 18:30 UTC (permalink / raw)
To: Eric Dumazet
Cc: Jesper Dangaard Brouer, Peter P Waskiewicz Jr, Linux Netdev List
Eric Dumazet writes:
> Jesper Dangaard Brouer a écrit :
>
> > How is your smp_affinity mask's set?
> >
> > grep . /proc/irq/*/fiber1-*/../smp_affinity
>
Weird... Set clone_skb to 1 to be sure, and vary dst or something so
the HW classifier selects different queues, with proper RX affinity.
You should see in /proc/net/softnet_stat something like:
012a7bb9 00000000 000000ae 00000000 00000000 00000000 00000000 00000000 00000000
01288d4c 00000000 00000049 00000000 00000000 00000000 00000000 00000000 00000000
0128fe28 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
01295387 00000000 00000047 00000000 00000000 00000000 00000000 00000000 00000000
0129a722 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
0128c5e4 00000000 00000046 00000000 00000000 00000000 00000000 00000000 00000000
0128f718 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
012993e3 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
Or something is...
Cheers
--ro
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 16:59 ` Eric Dumazet
@ 2009-11-23 20:54 ` robert
2009-11-23 21:28 ` David Miller
2009-11-23 23:28 ` Waskiewicz Jr, Peter P
1 sibling, 1 reply; 27+ messages in thread
From: robert @ 2009-11-23 20:54 UTC (permalink / raw)
To: Eric Dumazet
Cc: robert, Jesper Dangaard Brouer, Peter P Waskiewicz Jr,
Linux Netdev List
Eric Dumazet writes:
> slone_skb set to 1, this changes nothing but slows down pktgen (obviously)
> All RX processing of 16 RX queues done by CPU 1 only.
Well, I just pulled net-next-2.6 and ran with both 82598 and 82599 boards, and the
packet load gets distributed among the CPU cores.
Something mysterious or very obvious...
You can even try the script; it's a sort of Internet link traffic emulation,
but you have to set up your routing.
Cheers
--ro
#! /bin/sh
#modprobe pktgen
function pgset() {
local result
echo $1 > $PGDEV
result=`cat $PGDEV | fgrep "Result: OK:"`
if [ "$result" = "" ]; then
cat $PGDEV | fgrep Result:
fi
}
function pg() {
echo inject > $PGDEV
cat $PGDEV
}
# Config Start Here -----------------------------------------------------------
remove_all()
{
# thread config
PGDEV=/proc/net/pktgen/kpktgend_0
pgset "rem_device_all"
PGDEV=/proc/net/pktgen/kpktgend_1
pgset "rem_device_all"
PGDEV=/proc/net/pktgen/kpktgend_2
pgset "rem_device_all"
PGDEV=/proc/net/pktgen/kpktgend_3
pgset "rem_device_all"
PGDEV=/proc/net/pktgen/kpktgend_4
pgset "rem_device_all"
PGDEV=/proc/net/pktgen/kpktgend_5
pgset "rem_device_all"
PGDEV=/proc/net/pktgen/kpktgend_6
pgset "rem_device_all"
PGDEV=/proc/net/pktgen/kpktgend_7
pgset "rem_device_all"
}
remove_all
PGDEV=/proc/net/pktgen/kpktgend_0
pgset "add_device eth2@0"
PGDEV=/proc/net/pktgen/kpktgend_1
pgset "add_device eth2@1"
PGDEV=/proc/net/pktgen/kpktgend_2
pgset "add_device eth2@2"
PGDEV=/proc/net/pktgen/kpktgend_3
pgset "add_device eth2@3"
# device config
#
# Sending a mix of pkt sizes of 64, 576 and 1500
#
CLONE_SKB="clone_skb 1"
PKT_SIZE="pkt_size 60"
COUNT="count 000000"
DELAY="delay 0000"
#MAC="00:21:28:08:40:EE"
#MAC="00:21:28:08:40:EF"
#MAC="00:1B:21:17:C1:CD"
MAC="00:14:4F:DA:8C:66"
#MAC="00:14:4F:6B:CD:E8"
PGDEV=/proc/net/pktgen/eth2@0
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "pkt_size 1496"
pgset "$DELAY"
pgset "flag QUEUE_MAP_CPU"
pgset "flag IPDST_RND"
pgset "flag FLOW_SEQ"
pgset "dst_min 11.0.0.0"
pgset "dst_max 11.255.255.255"
pgset "flows 2048"
pgset "flowlen 30"
pgset "dst_mac $MAC"
PGDEV=/proc/net/pktgen/eth2@1
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "pkt_size 576"
pgset "$DELAY"
pgset "flag QUEUE_MAP_CPU"
pgset "flag IPDST_RND"
pgset "flag FLOW_SEQ"
pgset "dst_min 11.0.0.0"
pgset "dst_max 11.255.255.255"
pgset "flows 2048"
pgset "flowlen 30"
pgset "dst_mac $MAC"
PGDEV=/proc/net/pktgen/eth2@2
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "$DELAY"
pgset "pkt_size 60"
pgset "flag QUEUE_MAP_CPU"
pgset "flag IPDST_RND"
pgset "flag FLOW_SEQ"
pgset "dst_min 11.0.0.0"
pgset "dst_max 11.255.255.255"
pgset "flows 2048"
pgset "flowlen 30"
pgset "dst_mac $MAC"
PGDEV=/proc/net/pktgen/eth2@3
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "pkt_size 1496"
pgset "$DELAY"
pgset "flag QUEUE_MAP_CPU"
pgset "flag IPDST_RND"
pgset "flag FLOW_SEQ"
pgset "dst_min 11.0.0.0"
pgset "dst_max 11.255.255.255"
pgset "flows 2048"
pgset "flowlen 30"
pgset "dst_mac $MAC"
# Time to run
PGDEV=/proc/net/pktgen/pgctrl
echo "Running... ctrl^C to stop"
pgset "start"
echo "Done"
grep pps /proc/net/pktgen/*
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 10:37 ` Eric Dumazet
2009-11-23 14:05 ` Eric Dumazet
@ 2009-11-23 21:26 ` David Miller
1 sibling, 0 replies; 27+ messages in thread
From: David Miller @ 2009-11-23 21:26 UTC (permalink / raw)
To: eric.dumazet; +Cc: peter.p.waskiewicz.jr, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 23 Nov 2009 11:37:20 +0100
> (Fedora Core 12 installer was not able to recognize disks on this machine :( )
I ran into this problem too on my laptop, but only with the Live-CD images.
The DVD image recognized the disks and installed just fine.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 20:54 ` robert
@ 2009-11-23 21:28 ` David Miller
2009-11-23 22:14 ` Robert Olsson
0 siblings, 1 reply; 27+ messages in thread
From: David Miller @ 2009-11-23 21:28 UTC (permalink / raw)
To: robert; +Cc: eric.dumazet, hawk, peter.p.waskiewicz.jr, netdev
From: robert@herjulf.net
Date: Mon, 23 Nov 2009 21:54:43 +0100
> Something mysterious or very obvious...
It seems very obvious to me that, for whatever reason, the MSI-X vectors
are only being sent to cpu 1 on Eric's system.
I also suspect, as a result, that it has nothing to do with the IXGBE
driver but rather is some IRQ controller programming or some bug or
limitation in the IRQ affinity mask handling in the kernel.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 21:28 ` David Miller
@ 2009-11-23 22:14 ` Robert Olsson
0 siblings, 0 replies; 27+ messages in thread
From: Robert Olsson @ 2009-11-23 22:14 UTC (permalink / raw)
To: David Miller; +Cc: robert, eric.dumazet, hawk, peter.p.waskiewicz.jr, netdev
David Miller writes:
> It seem very obvious to me that, for whatever reason, the MSI-X vectors
> are only being sent to cpu 1 on Eric's system.
>
> I also suspect, as a result, that it has nothing to do with the IXGBE
> driver but rather is some IRQ controller programming or some bug or
> limitation in the IRQ affinity mask handling in the kernel.
Probably so, yes. I guess Eric will dig into this.
Cheers
--ro
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 16:59 ` Eric Dumazet
2009-11-23 20:54 ` robert
@ 2009-11-23 23:28 ` Waskiewicz Jr, Peter P
2009-11-23 23:44 ` David Miller
2009-11-24 7:46 ` Eric Dumazet
1 sibling, 2 replies; 27+ messages in thread
From: Waskiewicz Jr, Peter P @ 2009-11-23 23:28 UTC (permalink / raw)
To: Eric Dumazet
Cc: robert@herjulf.net, Jesper Dangaard Brouer,
Waskiewicz Jr, Peter P, Linux Netdev List
On Mon, 23 Nov 2009, Eric Dumazet wrote:
> robert@herjulf.net a écrit :
> > Eric Dumazet writes:
> >
> > > Jesper Dangaard Brouer a écrit :
> > >
> > > > How is your smp_affinity mask's set?
> > > >
> > > > grep . /proc/irq/*/fiber1-*/../smp_affinity
> > >
> >
> > Weird... set clone_skb to 1 to be sure and vary dst or something so
> > the HW classifier selects different queues and with proper RX affinty.
> >
> > You should see in /proc/net/softnet_stat something like:
> >
> > 012a7bb9 00000000 000000ae 00000000 00000000 00000000 00000000 00000000 00000000
> > 01288d4c 00000000 00000049 00000000 00000000 00000000 00000000 00000000 00000000
> > 0128fe28 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
> > 01295387 00000000 00000047 00000000 00000000 00000000 00000000 00000000 00000000
> > 0129a722 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
> > 0128c5e4 00000000 00000046 00000000 00000000 00000000 00000000 00000000 00000000
> > 0128f718 00000000 00000043 00000000 00000000 00000000 00000000 00000000 00000000
> > 012993e3 00000000 0000004a 00000000 00000000 00000000 00000000 00000000 00000000
> >
>
> slone_skb set to 1, this changes nothing but slows down pktgen (obviously)
>
> Result: OK: 117614452(c117608705+d5746) nsec, 100000000 (60byte,0frags)
> 850235pps 408Mb/sec (408112800bps) errors: 0
>
> All RX processing of 16 RX queues done by CPU 1 only.
Ok, I was confused earlier. I thought you were saying that all packets
were headed into a single Rx queue. This is different.
Do you know what version of irqbalance you're running, or if it's running
at all? We've seen issues with irqbalance where it won't recognize the
ethernet device if the driver has been reloaded. In that case, it won't
balance the interrupts at all. If the default affinity was set to one
CPU, then well, you're screwed.
My suggestion in this case is after you reload ixgbe and start your tests,
see if it all goes to one CPU. If it does, then restart irqbalance
(service irqbalance restart - or just kill it and restart by hand). Then
start running your test, and in 10 seconds you should see the interrupts
move and spread out.
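Roughly (the exact service name depends on the distro):
service irqbalance restart                  # or: killall irqbalance && irqbalance
watch -n1 'grep fiber1 /proc/interrupts'    # per-CPU counters should start moving on more than one CPU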
Let me know if this helps,
-PJ
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 23:28 ` Waskiewicz Jr, Peter P
@ 2009-11-23 23:44 ` David Miller
2009-11-24 7:46 ` Eric Dumazet
1 sibling, 0 replies; 27+ messages in thread
From: David Miller @ 2009-11-23 23:44 UTC (permalink / raw)
To: peter.p.waskiewicz.jr; +Cc: eric.dumazet, robert, hawk, netdev
From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@intel.com>
Date: Mon, 23 Nov 2009 15:28:18 -0800 (Pacific Standard Time)
> Do you know what version of irqbalance you're running, or if it's running
> at all?
Eric said he tried both with and without irqbalanced.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-23 23:28 ` Waskiewicz Jr, Peter P
2009-11-23 23:44 ` David Miller
@ 2009-11-24 7:46 ` Eric Dumazet
2009-11-24 8:46 ` Badalian Vyacheslav
2009-11-24 9:07 ` Peter P Waskiewicz Jr
1 sibling, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-24 7:46 UTC (permalink / raw)
To: Waskiewicz Jr, Peter P
Cc: robert@herjulf.net, Jesper Dangaard Brouer, Linux Netdev List
Waskiewicz Jr, Peter P wrote:
> Ok, I was confused earlier. I thought you were saying that all packets
> were headed into a single Rx queue. This is different.
>
> Do you know what version of irqbalance you're running, or if it's running
> at all? We've seen issues with irqbalance where it won't recognize the
> ethernet device if the driver has been reloaded. In that case, it won't
> balance the interrupts at all. If the default affinity was set to one
> CPU, then well, you're screwed.
>
> My suggestion in this case is after you reload ixgbe and start your tests,
> see if it all goes to one CPU. If it does, then restart irqbalance
> (service irqbalance restart - or just kill it and restart by hand). Then
> start running your test, and in 10 seconds you should see the interrupts
> move and spread out.
>
> Let me know if this helps,
Sure it helps!
I tried without irqbalance and with irqbalance (Ubuntu 9.10 ships irqbalance 0.55-4).
I can see irqbalance setting the smp_affinity masks to 5555 or AAAA, with no direct effect.
I do receive 16 different irqs, but all of them are serviced on one CPU.
The only way to get the irqs onto different CPUs is to manually force the irq affinities to be exclusive
(one bit set in the mask, not several), and that is not optimal for moderate loads.
echo 1 >`echo /proc/irq/*/fiber1-TxRx-0/../smp_affinity`
echo 1 >`echo /proc/irq/*/fiber1-TxRx-1/../smp_affinity`
echo 4 >`echo /proc/irq/*/fiber1-TxRx-2/../smp_affinity`
echo 4 >`echo /proc/irq/*/fiber1-TxRx-3/../smp_affinity`
echo 10 >`echo /proc/irq/*/fiber1-TxRx-4/../smp_affinity`
echo 10 >`echo /proc/irq/*/fiber1-TxRx-5/../smp_affinity`
echo 40 >`echo /proc/irq/*/fiber1-TxRx-6/../smp_affinity`
echo 40 >`echo /proc/irq/*/fiber1-TxRx-7/../smp_affinity`
echo 100 >`echo /proc/irq/*/fiber1-TxRx-8/../smp_affinity`
echo 100 >`echo /proc/irq/*/fiber1-TxRx-9/../smp_affinity`
echo 400 >`echo /proc/irq/*/fiber1-TxRx-10/../smp_affinity`
echo 400 >`echo /proc/irq/*/fiber1-TxRx-11/../smp_affinity`
echo 1000 >`echo /proc/irq/*/fiber1-TxRx-12/../smp_affinity`
echo 1000 >`echo /proc/irq/*/fiber1-TxRx-13/../smp_affinity`
echo 4000 >`echo /proc/irq/*/fiber1-TxRx-14/../smp_affinity`
echo 4000 >`echo /proc/irq/*/fiber1-TxRx-15/../smp_affinity`
One other problem is that after a reload of the ixgbe driver, the link comes up at
1 Gbps 95% of the time, and I could not find an easy way to force it to 10 Gbps.
I run the following script many times and stop when the 10 Gbps speed is reached.
ethtool -A fiber0 rx off tx off
ip link set fiber0 down
ip link set fiber1 down
sleep 2
ethtool fiber0
ethtool -s fiber0 speed 10000
ethtool -s fiber1 speed 10000
ethtool -r fiber0 &
ethtool -r fiber1 &
ethtool fiber0
ip link set fiber1 up &
ip link set fiber0 up &
ethtool fiber0
[ 33.625689] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44-k2
[ 33.625692] ixgbe: Copyright (c) 1999-2009 Intel Corporation.
[ 33.625741] ixgbe 0000:07:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
[ 33.625760] ixgbe 0000:07:00.0: setting latency timer to 64
[ 33.735579] ixgbe 0000:07:00.0: irq 100 for MSI/MSI-X
[ 33.735583] ixgbe 0000:07:00.0: irq 101 for MSI/MSI-X
[ 33.735585] ixgbe 0000:07:00.0: irq 102 for MSI/MSI-X
[ 33.735587] ixgbe 0000:07:00.0: irq 103 for MSI/MSI-X
[ 33.735589] ixgbe 0000:07:00.0: irq 104 for MSI/MSI-X
[ 33.735591] ixgbe 0000:07:00.0: irq 105 for MSI/MSI-X
[ 33.735593] ixgbe 0000:07:00.0: irq 106 for MSI/MSI-X
[ 33.735595] ixgbe 0000:07:00.0: irq 107 for MSI/MSI-X
[ 33.735597] ixgbe 0000:07:00.0: irq 108 for MSI/MSI-X
[ 33.735599] ixgbe 0000:07:00.0: irq 109 for MSI/MSI-X
[ 33.735602] ixgbe 0000:07:00.0: irq 110 for MSI/MSI-X
[ 33.735604] ixgbe 0000:07:00.0: irq 111 for MSI/MSI-X
[ 33.735606] ixgbe 0000:07:00.0: irq 112 for MSI/MSI-X
[ 33.735608] ixgbe 0000:07:00.0: irq 113 for MSI/MSI-X
[ 33.735610] ixgbe 0000:07:00.0: irq 114 for MSI/MSI-X
[ 33.735612] ixgbe 0000:07:00.0: irq 115 for MSI/MSI-X
[ 33.735614] ixgbe 0000:07:00.0: irq 116 for MSI/MSI-X
[ 33.735633] ixgbe: 0000:07:00.0: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16
[ 33.735638] ixgbe 0000:07:00.0: (PCI Express:5.0Gb/s:Width x8) 00:1b:21:4a:fe:54
[ 33.735722] ixgbe 0000:07:00.0: MAC: 2, PHY: 11, SFP+: 5, PBA No: e66562-003
[ 33.738111] ixgbe 0000:07:00.0: Intel(R) 10 Gigabit Network Connection
[ 33.738135] ixgbe 0000:07:00.1: PCI INT B -> GSI 42 (level, low) -> IRQ 42
[ 33.738151] ixgbe 0000:07:00.1: setting latency timer to 64
[ 33.853526] ixgbe 0000:07:00.1: irq 117 for MSI/MSI-X
[ 33.853529] ixgbe 0000:07:00.1: irq 118 for MSI/MSI-X
[ 33.853532] ixgbe 0000:07:00.1: irq 119 for MSI/MSI-X
[ 33.853534] ixgbe 0000:07:00.1: irq 120 for MSI/MSI-X
[ 33.853536] ixgbe 0000:07:00.1: irq 121 for MSI/MSI-X
[ 33.853538] ixgbe 0000:07:00.1: irq 122 for MSI/MSI-X
[ 33.853540] ixgbe 0000:07:00.1: irq 123 for MSI/MSI-X
[ 33.853542] ixgbe 0000:07:00.1: irq 124 for MSI/MSI-X
[ 33.853544] ixgbe 0000:07:00.1: irq 125 for MSI/MSI-X
[ 33.853546] ixgbe 0000:07:00.1: irq 126 for MSI/MSI-X
[ 33.853548] ixgbe 0000:07:00.1: irq 127 for MSI/MSI-X
[ 33.853550] ixgbe 0000:07:00.1: irq 128 for MSI/MSI-X
[ 33.853552] ixgbe 0000:07:00.1: irq 129 for MSI/MSI-X
[ 33.853554] ixgbe 0000:07:00.1: irq 130 for MSI/MSI-X
[ 33.853556] ixgbe 0000:07:00.1: irq 131 for MSI/MSI-X
[ 33.853558] ixgbe 0000:07:00.1: irq 132 for MSI/MSI-X
[ 33.853560] ixgbe 0000:07:00.1: irq 133 for MSI/MSI-X
[ 33.853580] ixgbe: 0000:07:00.1: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16
[ 33.853585] ixgbe 0000:07:00.1: (PCI Express:5.0Gb/s:Width x8) 00:1b:21:4a:fe:55
[ 33.853669] ixgbe 0000:07:00.1: MAC: 2, PHY: 11, SFP+: 5, PBA No: e66562-003
[ 33.855956] ixgbe 0000:07:00.1: Intel(R) 10 Gigabit Network Connection
[ 85.208233] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: RX/TX
[ 85.237453] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: RX/TX
[ 96.080713] ixgbe: fiber1 NIC Link is Down
[ 102.094610] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[ 102.119572] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[ 142.524691] ixgbe: fiber1 NIC Link is Down
[ 148.421332] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[ 148.449465] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[ 160.728643] ixgbe: fiber1 NIC Link is Down
[ 172.832301] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[ 173.659038] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[ 184.554501] ixgbe: fiber0 NIC Link is Down
[ 185.376273] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[ 186.493598] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[ 190.564383] ixgbe: fiber0 NIC Link is Down
[ 191.391149] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[ 192.484492] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[ 192.545424] ixgbe: fiber1 NIC Link is Down
[ 205.858197] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[ 206.684940] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[ 211.991875] ixgbe: fiber1 NIC Link is Down
[ 220.833478] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[ 220.833630] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[ 229.804853] ixgbe: fiber1 NIC Link is Down
[ 248.395672] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
[ 249.222408] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
[ 484.631598] ixgbe: fiber1 NIC Link is Down
[ 490.138931] ixgbe: fiber1 NIC Link is Up 10 Gbps, Flow Control: None
[ 490.167880] ixgbe: fiber0 NIC Link is Up 10 Gbps, Flow Control: None
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-24 7:46 ` Eric Dumazet
@ 2009-11-24 8:46 ` Badalian Vyacheslav
2009-11-24 9:07 ` Peter P Waskiewicz Jr
1 sibling, 0 replies; 27+ messages in thread
From: Badalian Vyacheslav @ 2009-11-24 8:46 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Waskiewicz Jr, Peter P, Linux Netdev List
Eric Dumazet wrote:
> Waskiewicz Jr, Peter P a écrit :
>> Ok, I was confused earlier. I thought you were saying that all packets
>> were headed into a single Rx queue. This is different.
>>
>> Do you know what version of irqbalance you're running, or if it's running
>> at all? We've seen issues with irqbalance where it won't recognize the
>> ethernet device if the driver has been reloaded. In that case, it won't
>> balance the interrupts at all. If the default affinity was set to one
>> CPU, then well, you're screwed.
>>
>> My suggestion in this case is after you reload ixgbe and start your tests,
>> see if it all goes to one CPU. If it does, then restart irqbalance
>> (service irqbalance restart - or just kill it and restart by hand). Then
>> start running your test, and in 10 seconds you should see the interrupts
>> move and spread out.
>>
>> Let me know if this helps,
>
> Sure it helps !
>
> I tried without irqbalance and with irqbalance (Ubuntu 9.10 ships irqbalance 0.55-4)
> I can see irqbalance setting smp_affinities to 5555 or AAAA with no direct effect.
>
> I do receive 16 different irqs, but all serviced on one cpu.
>
> Only way to have irqs on different cpus is to manualy force irq affinities to be exclusive
> (one bit set in the mask, not several ones), and that is not optimal for moderate loads.
>
> echo 1 >`echo /proc/irq/*/fiber1-TxRx-0/../smp_affinity`
> echo 1 >`echo /proc/irq/*/fiber1-TxRx-1/../smp_affinity`
> echo 4 >`echo /proc/irq/*/fiber1-TxRx-2/../smp_affinity`
> echo 4 >`echo /proc/irq/*/fiber1-TxRx-3/../smp_affinity`
> echo 10 >`echo /proc/irq/*/fiber1-TxRx-4/../smp_affinity`
> echo 10 >`echo /proc/irq/*/fiber1-TxRx-5/../smp_affinity`
> echo 40 >`echo /proc/irq/*/fiber1-TxRx-6/../smp_affinity`
> echo 40 >`echo /proc/irq/*/fiber1-TxRx-7/../smp_affinity`
> echo 100 >`echo /proc/irq/*/fiber1-TxRx-8/../smp_affinity`
> echo 100 >`echo /proc/irq/*/fiber1-TxRx-9/../smp_affinity`
> echo 400 >`echo /proc/irq/*/fiber1-TxRx-10/../smp_affinity`
> echo 400 >`echo /proc/irq/*/fiber1-TxRx-11/../smp_affinity`
> echo 1000 >`echo /proc/irq/*/fiber1-TxRx-12/../smp_affinity`
> echo 1000 >`echo /proc/irq/*/fiber1-TxRx-13/../smp_affinity`
> echo 4000 >`echo /proc/irq/*/fiber1-TxRx-14/../smp_affinity`
> echo 4000 >`echo /proc/irq/*/fiber1-TxRx-15/../smp_affinity`
>
>
> One other problem is that after reload of ixgbe driver, link is 95% of the time
> at 1 Gbps speed, and I could not find an easy way to force it being 10 Gbps
>
> I run following script many times and stop it when 10 Gbps speed if reached.
>
> ethtool -A fiber0 rx off tx off
> ip link set fiber0 down
> ip link set fiber1 down
> sleep 2
> ethtool fiber0
> ethtool -s fiber0 speed 10000
> ethtool -s fiber1 speed 10000
> ethtool -r fiber0 &
> ethtool -r fiber1 &
> ethtool fiber0
> ip link set fiber1 up &
> ip link set fiber0 up &
> ethtool fiber0
>
> [ 33.625689] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44-k2
> [ 33.625692] ixgbe: Copyright (c) 1999-2009 Intel Corporation.
> [ 33.625741] ixgbe 0000:07:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
> [ 33.625760] ixgbe 0000:07:00.0: setting latency timer to 64
> [ 33.735579] ixgbe 0000:07:00.0: irq 100 for MSI/MSI-X
> [ 33.735583] ixgbe 0000:07:00.0: irq 101 for MSI/MSI-X
> [ 33.735585] ixgbe 0000:07:00.0: irq 102 for MSI/MSI-X
> [ 33.735587] ixgbe 0000:07:00.0: irq 103 for MSI/MSI-X
> [ 33.735589] ixgbe 0000:07:00.0: irq 104 for MSI/MSI-X
> [ 33.735591] ixgbe 0000:07:00.0: irq 105 for MSI/MSI-X
> [ 33.735593] ixgbe 0000:07:00.0: irq 106 for MSI/MSI-X
> [ 33.735595] ixgbe 0000:07:00.0: irq 107 for MSI/MSI-X
> [ 33.735597] ixgbe 0000:07:00.0: irq 108 for MSI/MSI-X
> [ 33.735599] ixgbe 0000:07:00.0: irq 109 for MSI/MSI-X
> [ 33.735602] ixgbe 0000:07:00.0: irq 110 for MSI/MSI-X
> [ 33.735604] ixgbe 0000:07:00.0: irq 111 for MSI/MSI-X
> [ 33.735606] ixgbe 0000:07:00.0: irq 112 for MSI/MSI-X
> [ 33.735608] ixgbe 0000:07:00.0: irq 113 for MSI/MSI-X
> [ 33.735610] ixgbe 0000:07:00.0: irq 114 for MSI/MSI-X
> [ 33.735612] ixgbe 0000:07:00.0: irq 115 for MSI/MSI-X
> [ 33.735614] ixgbe 0000:07:00.0: irq 116 for MSI/MSI-X
> [ 33.735633] ixgbe: 0000:07:00.0: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16
> [ 33.735638] ixgbe 0000:07:00.0: (PCI Express:5.0Gb/s:Width x8) 00:1b:21:4a:fe:54
> [ 33.735722] ixgbe 0000:07:00.0: MAC: 2, PHY: 11, SFP+: 5, PBA No: e66562-003
> [ 33.738111] ixgbe 0000:07:00.0: Intel(R) 10 Gigabit Network Connection
> [ 33.738135] ixgbe 0000:07:00.1: PCI INT B -> GSI 42 (level, low) -> IRQ 42
> [ 33.738151] ixgbe 0000:07:00.1: setting latency timer to 64
> [ 33.853526] ixgbe 0000:07:00.1: irq 117 for MSI/MSI-X
> [ 33.853529] ixgbe 0000:07:00.1: irq 118 for MSI/MSI-X
> [ 33.853532] ixgbe 0000:07:00.1: irq 119 for MSI/MSI-X
> [ 33.853534] ixgbe 0000:07:00.1: irq 120 for MSI/MSI-X
> [ 33.853536] ixgbe 0000:07:00.1: irq 121 for MSI/MSI-X
> [ 33.853538] ixgbe 0000:07:00.1: irq 122 for MSI/MSI-X
> [ 33.853540] ixgbe 0000:07:00.1: irq 123 for MSI/MSI-X
> [ 33.853542] ixgbe 0000:07:00.1: irq 124 for MSI/MSI-X
> [ 33.853544] ixgbe 0000:07:00.1: irq 125 for MSI/MSI-X
> [ 33.853546] ixgbe 0000:07:00.1: irq 126 for MSI/MSI-X
> [ 33.853548] ixgbe 0000:07:00.1: irq 127 for MSI/MSI-X
> [ 33.853550] ixgbe 0000:07:00.1: irq 128 for MSI/MSI-X
> [ 33.853552] ixgbe 0000:07:00.1: irq 129 for MSI/MSI-X
> [ 33.853554] ixgbe 0000:07:00.1: irq 130 for MSI/MSI-X
> [ 33.853556] ixgbe 0000:07:00.1: irq 131 for MSI/MSI-X
> [ 33.853558] ixgbe 0000:07:00.1: irq 132 for MSI/MSI-X
> [ 33.853560] ixgbe 0000:07:00.1: irq 133 for MSI/MSI-X
> [ 33.853580] ixgbe: 0000:07:00.1: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16
> [ 33.853585] ixgbe 0000:07:00.1: (PCI Express:5.0Gb/s:Width x8) 00:1b:21:4a:fe:55
> [ 33.853669] ixgbe 0000:07:00.1: MAC: 2, PHY: 11, SFP+: 5, PBA No: e66562-003
> [ 33.855956] ixgbe 0000:07:00.1: Intel(R) 10 Gigabit Network Connection
>
> [ 85.208233] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: RX/TX
> [ 85.237453] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: RX/TX
> [ 96.080713] ixgbe: fiber1 NIC Link is Down
> [ 102.094610] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [ 102.119572] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [ 142.524691] ixgbe: fiber1 NIC Link is Down
> [ 148.421332] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [ 148.449465] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [ 160.728643] ixgbe: fiber1 NIC Link is Down
> [ 172.832301] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [ 173.659038] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [ 184.554501] ixgbe: fiber0 NIC Link is Down
> [ 185.376273] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [ 186.493598] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [ 190.564383] ixgbe: fiber0 NIC Link is Down
> [ 191.391149] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [ 192.484492] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [ 192.545424] ixgbe: fiber1 NIC Link is Down
> [ 205.858197] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [ 206.684940] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [ 211.991875] ixgbe: fiber1 NIC Link is Down
> [ 220.833478] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [ 220.833630] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [ 229.804853] ixgbe: fiber1 NIC Link is Down
> [ 248.395672] ixgbe: fiber0 NIC Link is Up 1 Gbps, Flow Control: None
> [ 249.222408] ixgbe: fiber1 NIC Link is Up 1 Gbps, Flow Control: None
> [ 484.631598] ixgbe: fiber1 NIC Link is Down
> [ 490.138931] ixgbe: fiber1 NIC Link is Up 10 Gbps, Flow Control: None
> [ 490.167880] ixgbe: fiber0 NIC Link is Up 10 Gbps, Flow Control: None
>
>
Maybe it's Flow Director?
Multiqueue on this network card works only if you set 1 queue to 1 CPU core in smp_affinity :(
From the README:
Intel(R) Ethernet Flow Director
-------------------------------
Supports advanced filters that direct receive packets by their flows to
different queues. Enables tight control on routing a flow in the platform.
Matches flows and CPU cores for flow affinity. Supports multiple parameters
for flexible flow classification and load balancing.
Flow director is enabled only if the kernel is multiple TX queue capable.
An included script (set_irq_affinity.sh) automates setting the IRQ to CPU
affinity.
You can verify that the driver is using Flow Director by looking at the counter
in ethtool: fdir_miss and fdir_match.
The following three parameters impact Flow Director.
FdirMode
--------
Valid Range: 0-2 (0=off, 1=ATR, 2=Perfect filter mode)
Default Value: 1
Flow Director filtering modes.
FdirPballoc
-----------
Valid Range: 0-2 (0=64k, 1=128k, 2=256k)
Default Value: 0
Flow Director allocated packet buffer size.
AtrSampleRate
--------------
Valid Range: 1-100
Default Value: 20
Software ATR Tx packet sample rate. For example, when set to 20, every 20th
packet is examined to see whether it will create a new flow.
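A quick way to see whether Flow Director/ATR is actually matching anything is to
watch the counters named above; a sketch, using the fiber1 interface from this
thread (the module parameters below are from the out-of-tree driver README and
may need one value per port, comma separated):

# Flow Director activity counters
ethtool -S fiber1 | grep fdir

# reload the out-of-tree driver with explicit Flow Director settings
rmmod ixgbe
modprobe ixgbe FdirMode=1 FdirPballoc=0 AtrSampleRate=20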
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-24 7:46 ` Eric Dumazet
2009-11-24 8:46 ` Badalian Vyacheslav
@ 2009-11-24 9:07 ` Peter P Waskiewicz Jr
2009-11-24 9:55 ` Eric Dumazet
1 sibling, 1 reply; 27+ messages in thread
From: Peter P Waskiewicz Jr @ 2009-11-24 9:07 UTC (permalink / raw)
To: Eric Dumazet
Cc: robert@herjulf.net, Jesper Dangaard Brouer, Linux Netdev List
On Tue, 2009-11-24 at 00:46 -0700, Eric Dumazet wrote:
> Waskiewicz Jr, Peter P wrote:
> > Ok, I was confused earlier. I thought you were saying that all packets
> > were headed into a single Rx queue. This is different.
> >
> > Do you know what version of irqbalance you're running, or if it's running
> > at all? We've seen issues with irqbalance where it won't recognize the
> > ethernet device if the driver has been reloaded. In that case, it won't
> > balance the interrupts at all. If the default affinity was set to one
> > CPU, then well, you're screwed.
> >
> > My suggestion in this case is after you reload ixgbe and start your tests,
> > see if it all goes to one CPU. If it does, then restart irqbalance
> > (service irqbalance restart - or just kill it and restart by hand). Then
> > start running your test, and in 10 seconds you should see the interrupts
> > move and spread out.
> >
> > Let me know if this helps,
>
> Sure it helps !
>
> I tried without irqbalance and with irqbalance (Ubuntu 9.10 ships irqbalance 0.55-4)
> I can see irqbalance setting smp_affinities to 5555 or AAAA with no direct effect.
>
> I do receive 16 different irqs, but all serviced on one cpu.
>
> The only way to get irqs serviced on different cpus is to manually force the irq affinities to be exclusive
> (one bit set in the mask, not several), and that is not optimal for moderate loads.
>
> echo 1 >`echo /proc/irq/*/fiber1-TxRx-0/../smp_affinity`
> echo 1 >`echo /proc/irq/*/fiber1-TxRx-1/../smp_affinity`
> echo 4 >`echo /proc/irq/*/fiber1-TxRx-2/../smp_affinity`
> echo 4 >`echo /proc/irq/*/fiber1-TxRx-3/../smp_affinity`
> echo 10 >`echo /proc/irq/*/fiber1-TxRx-4/../smp_affinity`
> echo 10 >`echo /proc/irq/*/fiber1-TxRx-5/../smp_affinity`
> echo 40 >`echo /proc/irq/*/fiber1-TxRx-6/../smp_affinity`
> echo 40 >`echo /proc/irq/*/fiber1-TxRx-7/../smp_affinity`
> echo 100 >`echo /proc/irq/*/fiber1-TxRx-8/../smp_affinity`
> echo 100 >`echo /proc/irq/*/fiber1-TxRx-9/../smp_affinity`
> echo 400 >`echo /proc/irq/*/fiber1-TxRx-10/../smp_affinity`
> echo 400 >`echo /proc/irq/*/fiber1-TxRx-11/../smp_affinity`
> echo 1000 >`echo /proc/irq/*/fiber1-TxRx-12/../smp_affinity`
> echo 1000 >`echo /proc/irq/*/fiber1-TxRx-13/../smp_affinity`
> echo 4000 >`echo /proc/irq/*/fiber1-TxRx-14/../smp_affinity`
> echo 4000 >`echo /proc/irq/*/fiber1-TxRx-15/../smp_affinity`
>
>
> One other problem is that after a reload of the ixgbe driver, the link comes up at 1 Gbps 95% of the time,
> and I could not find an easy way to force it to 10 Gbps
>
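For what it's worth, the manual affinity assignment quoted above can be scripted
instead of typed by hand; a minimal sketch, assuming the same fiber1-TxRx-N
interrupt names and the same two-queues-per-even-CPU pairing as the echo
commands above:

# pin queues 2N and 2N+1 of fiber1 to CPU 2N, mirroring the echoes above
for q in $(seq 0 15); do
        cpu=$(( 2 * (q / 2) ))
        mask=$(printf '%x' $(( 1 << cpu )))
        echo $mask > $(echo /proc/irq/*/fiber1-TxRx-$q/../smp_affinity)
done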
You might have this elsewhere, but it sounds like you're connecting back
to back with another 82599 NIC. Our optics in that NIC are dual-rate,
and the software mechanism that tries to "autoneg" link speed gets out
of sync easily in back-to-back setups.
If it's really annoying, and you're willing to run with a local patch to
disable the autotry mechanism, try this:
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index a5036f7..62c0915 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct work_struct *work)
 	autoneg = hw->phy.autoneg_advertised;
 	if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
 		hw->mac.ops.get_link_capabilities(hw, &autoneg, &negotiation);
+
+	/* force 10G only */
+	autoneg = IXGBE_LINK_SPEED_10GB_FULL;
+
 	if (hw->mac.ops.setup_link)
 		hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
 	adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
Cheers,
-PJ
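Whether or not such a patch is applied, the speed the link actually negotiated
after a driver reload is easy to confirm from user space; using the interface
names from this thread:

# current link speed as reported by the NIC
ethtool fiber0 | grep -i speed

# or check the most recent link-up messages
dmesg | grep 'NIC Link is Up' | tail -n 2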
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-24 9:07 ` Peter P Waskiewicz Jr
@ 2009-11-24 9:55 ` Eric Dumazet
2009-11-24 10:06 ` Peter P Waskiewicz Jr
2009-11-26 14:10 ` Badalian Vyacheslav
0 siblings, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-11-24 9:55 UTC (permalink / raw)
To: Peter P Waskiewicz Jr
Cc: robert@herjulf.net, Jesper Dangaard Brouer, Linux Netdev List
Peter P Waskiewicz Jr wrote:
> You might have this elsewhere, but it sounds like you're connecting back
> to back with another 82599 NIC. Our optics in that NIC are dual-rate,
> and the software mechanism that tries to "autoneg" link speed gets out
> of sync easily in back-to-back setups.
>
> If it's really annoying, and you're willing to run with a local patch to
> disable the autotry mechanism, try this:
>
> diff --git a/drivers/net/ixgbe/ixgbe_main.c
> b/drivers/net/ixgbe/ixgbe_main.c
> index a5036f7..62c0915 100644
> --- a/drivers/net/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ixgbe/ixgbe_main.c
> @@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct
> work_struct *work)
> autoneg = hw->phy.autoneg_advertised;
> if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
> hw->mac.ops.get_link_capabilities(hw, &autoneg,
> &negotiation);
> +
> + /* force 10G only */
> + autoneg = IXGBE_LINK_SPEED_10GB_FULL;
> +
> if (hw->mac.ops.setup_link)
> hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
> adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
Thanks ! This did the trick :)
If I am not mistaken, the number of TX queues should be capped by the number of possible cpus?
It's currently a fixed 128 value, allocating 128*128 = 16384 bytes,
and polluting "tc -s -d class show dev fiber0" output.
[PATCH net-next-2.6] ixgbe: Do not allocate too many netdev txqueues
Instead of allocating 128 struct netdev_queue per device, use the minimum
value between 128 and number of possible cpus, to reduce ram usage and
"tc -s -d class show dev ..." output
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index ebcec30..ec2508d 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -5582,7 +5583,10 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
pci_set_master(pdev);
pci_save_state(pdev);
- netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter), MAX_TX_QUEUES);
+ netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter),
+ min_t(unsigned int,
+ MAX_TX_QUEUES,
+ num_possible_cpus()));
if (!netdev) {
err = -ENOMEM;
goto err_alloc_etherdev;
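The tc output pollution mentioned above is easy to quantify; a rough check,
assuming the default mq root qdisc that multiqueue devices get on this kernel:

# count the per-queue classes the device exposes to tc
tc -s -d class show dev fiber0 | grep -c '^class'

Before the patch this should count all 128 allocated tx queues; with the patch
it should track the number of possible cpus instead.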
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-24 9:55 ` Eric Dumazet
@ 2009-11-24 10:06 ` Peter P Waskiewicz Jr
2009-11-24 13:14 ` John Fastabend
2009-11-26 14:10 ` Badalian Vyacheslav
1 sibling, 1 reply; 27+ messages in thread
From: Peter P Waskiewicz Jr @ 2009-11-24 10:06 UTC (permalink / raw)
To: Eric Dumazet
Cc: robert@herjulf.net, Jesper Dangaard Brouer, Linux Netdev List
On Tue, 2009-11-24 at 02:55 -0700, Eric Dumazet wrote:
> Peter P Waskiewicz Jr wrote:
>
> > You might have this elsewhere, but it sounds like you're connecting back
> > to back with another 82599 NIC. Our optics in that NIC are dual-rate,
> > and the software mechanism that tries to "autoneg" link speed gets out
> > of sync easily in back-to-back setups.
> >
> > If it's really annoying, and you're willing to run with a local patch to
> > disable the autotry mechanism, try this:
> >
> > diff --git a/drivers/net/ixgbe/ixgbe_main.c
> > b/drivers/net/ixgbe/ixgbe_main.c
> > index a5036f7..62c0915 100644
> > --- a/drivers/net/ixgbe/ixgbe_main.c
> > +++ b/drivers/net/ixgbe/ixgbe_main.c
> > @@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct
> > work_struct *work)
> > autoneg = hw->phy.autoneg_advertised;
> > if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
> > hw->mac.ops.get_link_capabilities(hw, &autoneg,
> > &negotiation);
> > +
> > + /* force 10G only */
> > + autoneg = IXGBE_LINK_SPEED_10GB_FULL;
> > +
> > if (hw->mac.ops.setup_link)
> > hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
> > adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
>
> Thanks ! This did the trick :)
>
> If I am not mistaken, number of TX queues should be capped by number of possible cpus ?
>
> Its currently a fixed 128 value, allocating 128*128 = 16384 bytes,
> and polluting "tc -s -d class show dev fiber0" output.
>
Yes, this is a stupid issue we haven't gotten around to fixing yet.
This looks fine to me. Thanks for putting it together.
> [PATCH net-next-2.6] ixgbe: Do not allocate too many netdev txqueues
>
> Instead of allocating 128 struct netdev_queue per device, use the minimum
> value between 128 and number of possible cpus, to reduce ram usage and
> "tc -s -d class show dev ..." output
>
> diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
> index ebcec30..ec2508d 100644
> --- a/drivers/net/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ixgbe/ixgbe_main.c
> @@ -5582,7 +5583,10 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
> pci_set_master(pdev);
> pci_save_state(pdev);
>
> - netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter), MAX_TX_QUEUES);
> + netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter),
> + min_t(unsigned int,
> + MAX_TX_QUEUES,
> + num_possible_cpus()));
> if (!netdev) {
> err = -ENOMEM;
> goto err_alloc_etherdev;
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-24 10:06 ` Peter P Waskiewicz Jr
@ 2009-11-24 13:14 ` John Fastabend
2009-11-29 8:18 ` David Miller
0 siblings, 1 reply; 27+ messages in thread
From: John Fastabend @ 2009-11-24 13:14 UTC (permalink / raw)
To: Peter P Waskiewicz Jr
Cc: Eric Dumazet, robert@herjulf.net, Jesper Dangaard Brouer,
Linux Netdev List
Peter P Waskiewicz Jr wrote:
> On Tue, 2009-11-24 at 02:55 -0700, Eric Dumazet wrote:
>
>> Peter P Waskiewicz Jr wrote:
>>
>>
>>> You might have this elsewhere, but it sounds like you're connecting back
>>> to back with another 82599 NIC. Our optics in that NIC are dual-rate,
>>> and the software mechanism that tries to "autoneg" link speed gets out
>>> of sync easily in back-to-back setups.
>>>
>>> If it's really annoying, and you're willing to run with a local patch to
>>> disable the autotry mechanism, try this:
>>>
>>> diff --git a/drivers/net/ixgbe/ixgbe_main.c
>>> b/drivers/net/ixgbe/ixgbe_main.c
>>> index a5036f7..62c0915 100644
>>> --- a/drivers/net/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ixgbe/ixgbe_main.c
>>> @@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct
>>> work_struct *work)
>>> autoneg = hw->phy.autoneg_advertised;
>>> if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
>>> hw->mac.ops.get_link_capabilities(hw, &autoneg,
>>> &negotiation);
>>> +
>>> + /* force 10G only */
>>> + autoneg = IXGBE_LINK_SPEED_10GB_FULL;
>>> +
>>> if (hw->mac.ops.setup_link)
>>> hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
>>> adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
>>>
>> Thanks ! This did the trick :)
>>
>> If I am not mistaken, number of TX queues should be capped by number of possible cpus ?
>>
>> Its currently a fixed 128 value, allocating 128*128 = 16384 bytes,
>> and polluting "tc -s -d class show dev fiber0" output.
>>
>>
>
> Yes, this is a stupid issue we haven't gotten around to fixing yet.
> This looks fine to me. Thanks for putting it together.
>
>
I believe the patch below will break DCB and FCoE though; both features
have the potential to set real_num_tx_queues to greater than the number
of CPUs. This could result in real_num_tx_queues > num_tx_queues.
The current solution isn't that great either, so maybe we should set it to the
minimum of MAX_TX_QUEUES and num_possible_cpus() * 2 + 8.
That should cover the maximum possible queues for DCB, FCoE and their
combinations (with 16 possible CPUs, for example, that caps the device at
16 * 2 + 8 = 40 queues instead of 128):
general multiq = num_possible_cpus()
DCB = 8 tx queues
FCoE = 2 * num_possible_cpus()
FCoE + DCB = 8 tx queues + num_possible_cpus()
thanks,
john.
>> [PATCH net-next-2.6] ixgbe: Do not allocate too many netdev txqueues
>>
>> Instead of allocating 128 struct netdev_queue per device, use the minimum
>> value between 128 and number of possible cpus, to reduce ram usage and
>> "tc -s -d class show dev ..." output
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
>> index ebcec30..ec2508d 100644
>> --- a/drivers/net/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ixgbe/ixgbe_main.c
>> @@ -5582,7 +5583,10 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
>> pci_set_master(pdev);
>> pci_save_state(pdev);
>>
>> - netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter), MAX_TX_QUEUES);
>> + netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter),
>> + min_t(unsigned int,
>> + MAX_TX_QUEUES,
>> + num_possible_cpus()));
>> if (!netdev) {
>> err = -ENOMEM;
>> goto err_alloc_etherdev;
>>
>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-24 9:55 ` Eric Dumazet
2009-11-24 10:06 ` Peter P Waskiewicz Jr
@ 2009-11-26 14:10 ` Badalian Vyacheslav
1 sibling, 0 replies; 27+ messages in thread
From: Badalian Vyacheslav @ 2009-11-26 14:10 UTC (permalink / raw)
To: Eric Dumazet
Cc: Peter P Waskiewicz Jr, robert@herjulf.net, Jesper Dangaard Brouer,
Linux Netdev List
Eric Dumazet wrote:
> Peter P Waskiewicz Jr wrote:
>
>> You might have this elsewhere, but it sounds like you're connecting back
>> to back with another 82599 NIC. Our optics in that NIC are dual-rate,
>> and the software mechanism that tries to "autoneg" link speed gets out
>> of sync easily in back-to-back setups.
>>
>> If it's really annoying, and you're willing to run with a local patch to
>> disable the autotry mechanism, try this:
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_main.c
>> b/drivers/net/ixgbe/ixgbe_main.c
>> index a5036f7..62c0915 100644
>> --- a/drivers/net/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ixgbe/ixgbe_main.c
>> @@ -4670,6 +4670,10 @@ static void ixgbe_multispeed_fiber_task(struct
>> work_struct *work)
>> autoneg = hw->phy.autoneg_advertised;
>> if ((!autoneg) && (hw->mac.ops.get_link_capabilities))
>> hw->mac.ops.get_link_capabilities(hw, &autoneg,
>> &negotiation);
>> +
>> + /* force 10G only */
>> + autoneg = IXGBE_LINK_SPEED_10GB_FULL;
>> +
>> if (hw->mac.ops.setup_link)
>> hw->mac.ops.setup_link(hw, autoneg, negotiation, true);
>> adapter->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
>
> Thanks ! This did the trick :)
>
> If I am not mistaken, number of TX queues should be capped by number of possible cpus ?
>
> Its currently a fixed 128 value, allocating 128*128 = 16384 bytes,
> and polluting "tc -s -d class show dev fiber0" output.
>
> [PATCH net-next-2.6] ixgbe: Do not allocate too many netdev txqueues
>
> Instead of allocating 128 struct netdev_queue per device, use the minimum
> value between 128 and number of possible cpus, to reduce ram usage and
> "tc -s -d class show dev ..." output
>
> diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
> index ebcec30..ec2508d 100644
> --- a/drivers/net/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ixgbe/ixgbe_main.c
> @@ -5582,7 +5583,10 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
> pci_set_master(pdev);
> pci_save_state(pdev);
>
> - netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter), MAX_TX_QUEUES);
> + netdev = alloc_etherdev_mq(sizeof(struct ixgbe_adapter),
> + min_t(unsigned int,
> + MAX_TX_QUEUES,
> + num_possible_cpus()));
> if (!netdev) {
> err = -ENOMEM;
> goto err_alloc_etherdev;
>
>
This also fixes the long TC rule loading time for me.
Tested-by: Badalian Vyacheslav <slavon.net@gmail.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-24 13:14 ` John Fastabend
@ 2009-11-29 8:18 ` David Miller
2009-11-30 13:02 ` Eric Dumazet
0 siblings, 1 reply; 27+ messages in thread
From: David Miller @ 2009-11-29 8:18 UTC (permalink / raw)
To: john.r.fastabend
Cc: peter.p.waskiewicz.jr, eric.dumazet, robert, hawk, netdev
From: John Fastabend <john.r.fastabend@intel.com>
Date: Tue, 24 Nov 2009 13:14:12 +0000
> Believe the below patch will break DCB and FCoE though, both features
> have the potential to set real_num_tx_queues to greater then the
> number of CPUs. This could result in real_num_tx_queues >
> num_tx_queues.
>
> The current solution isn't that great though, maybe we should set to
> the minimum of MAX_TX_QUEUES and num_possible_cpus() * 2 + 8.
>
> That should cover the maximum possible queues for DCB, FCoE and their
> combinations.
>
> general multiq = num_possible_cpus()
> DCB = 8 tx queue's
> FCoE = 2*num_possible_cpus()
> FCoE + DCB = 8 tx queues + num_possible_cpus
Eric, I'm tossing your patch because of this problem, just FYI.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-29 8:18 ` David Miller
@ 2009-11-30 13:02 ` Eric Dumazet
2009-11-30 20:20 ` John Fastabend
0 siblings, 1 reply; 27+ messages in thread
From: Eric Dumazet @ 2009-11-30 13:02 UTC (permalink / raw)
To: David Miller
Cc: john.r.fastabend, peter.p.waskiewicz.jr, robert, hawk, netdev
David Miller wrote:
> From: John Fastabend <john.r.fastabend@intel.com>
> Date: Tue, 24 Nov 2009 13:14:12 +0000
>
>> Believe the below patch will break DCB and FCoE though, both features
>> have the potential to set real_num_tx_queues to greater then the
>> number of CPUs. This could result in real_num_tx_queues >
>> num_tx_queues.
>>
>> The current solution isn't that great though, maybe we should set to
>> the minimum of MAX_TX_QUEUES and num_possible_cpus() * 2 + 8.
>>
>> That should cover the maximum possible queues for DCB, FCoE and their
>> combinations.
>>
>> general multiq = num_possible_cpus()
>> DCB = 8 tx queue's
>> FCoE = 2*num_possible_cpus()
>> FCoE + DCB = 8 tx queues + num_possible_cpus
>
> Eric, I'm tossing your patch because of this problem, just FYI.
Sure, I guess we need a more generic way to handle this.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: ixgbe question
2009-11-30 13:02 ` Eric Dumazet
@ 2009-11-30 20:20 ` John Fastabend
0 siblings, 0 replies; 27+ messages in thread
From: John Fastabend @ 2009-11-30 20:20 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, Waskiewicz Jr, Peter P, robert@herjulf.net,
hawk@diku.dk, netdev@vger.kernel.org
Eric Dumazet wrote:
> David Miller wrote:
>
>> From: John Fastabend <john.r.fastabend@intel.com>
>> Date: Tue, 24 Nov 2009 13:14:12 +0000
>>
>>
>>> Believe the below patch will break DCB and FCoE though, both features
>>> have the potential to set real_num_tx_queues to greater then the
>>> number of CPUs. This could result in real_num_tx_queues >
>>> num_tx_queues.
>>>
>>> The current solution isn't that great though, maybe we should set to
>>> the minimum of MAX_TX_QUEUES and num_possible_cpus() * 2 + 8.
>>>
>>> That should cover the maximum possible queues for DCB, FCoE and their
>>> combinations.
>>>
>>> general multiq = num_possible_cpus()
>>> DCB = 8 tx queue's
>>> FCoE = 2*num_possible_cpus()
>>> FCoE + DCB = 8 tx queues + num_possible_cpus
>>>
>> Eric, I'm tossing your patch because of this problem, just FYI.
>>
>
> Sure, I guess we need a more generic way to handle this.
>
>
Eric,
I'll resubmit your patch with a small update to fix my concerns soon.
thanks,
john.
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads: [~2009-11-30 20:20 UTC | newest]
Thread overview: 27+ messages
-- links below jump to the message on this page --
2008-03-10 21:27 Ixgbe question Ben Greear
2008-03-11 1:01 ` Brandeburg, Jesse
-- strict thread matches above, loose matches on Subject: below --
2009-11-23 6:46 [PATCH] irq: Add node_affinity CPU masks for smarter irqbalance hints Peter P Waskiewicz Jr
2009-11-23 7:32 ` Yong Zhang
2009-11-23 9:36 ` Peter P Waskiewicz Jr
2009-11-23 10:21 ` ixgbe question Eric Dumazet
2009-11-23 10:30 ` Badalian Vyacheslav
2009-11-23 10:34 ` Waskiewicz Jr, Peter P
2009-11-23 10:37 ` Eric Dumazet
2009-11-23 14:05 ` Eric Dumazet
2009-11-23 21:26 ` David Miller
2009-11-23 14:10 ` Jesper Dangaard Brouer
2009-11-23 14:38 ` Eric Dumazet
2009-11-23 18:30 ` robert
2009-11-23 16:59 ` Eric Dumazet
2009-11-23 20:54 ` robert
2009-11-23 21:28 ` David Miller
2009-11-23 22:14 ` Robert Olsson
2009-11-23 23:28 ` Waskiewicz Jr, Peter P
2009-11-23 23:44 ` David Miller
2009-11-24 7:46 ` Eric Dumazet
2009-11-24 8:46 ` Badalian Vyacheslav
2009-11-24 9:07 ` Peter P Waskiewicz Jr
2009-11-24 9:55 ` Eric Dumazet
2009-11-24 10:06 ` Peter P Waskiewicz Jr
2009-11-24 13:14 ` John Fastabend
2009-11-29 8:18 ` David Miller
2009-11-30 13:02 ` Eric Dumazet
2009-11-30 20:20 ` John Fastabend
2009-11-26 14:10 ` Badalian Vyacheslav