From: Eric Dumazet <eric.dumazet@gmail.com>
To: Wei Gu <wei.gu@ericsson.com>
Cc: netdev <netdev@vger.kernel.org>,
Alexander Duyck <alexander.h.duyck@intel.com>,
Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Subject: Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
Date: Thu, 07 Apr 2011 10:07:30 +0200
Message-ID: <1302163650.3357.8.camel@edumazet-laptop>
In-Reply-To: <D12839161ADD3A4B8DA63D1A134D084026E48B9E82@ESGSCCMS0001.eapac.ericsson.se>
On Thursday, 07 April 2011 at 15:22 +0800, Wei Gu wrote:
> Hi guys,
> As I discussed with Eric, I get very low performance on the Linux 2.6.38 kernel with the Intel ixgbe-3.2.10 driver.
> I tested different rx buffer sizes on the Intel 10G NIC, by setting ethtool -G rx 4096.
> I get the lowest performance (~50Kpps Rx&Tx) with rx set to 4096.
> Once I decrease rx to 512 (default), I can get a maximum of 250Kpps Rx&Tx on 1 NIC.
>
> I was running this test on an HP DL580 with 4 CPU sockets and a full memory configuration.
> modprobe ixgbe RSS=8,8,8,8,8,8,8,8 FdirMode=0,0,0,0,0,0,0,0 Node=0,0,1,1,2,2,3,3
> numactl --hardware
> available: 4 nodes (0-3)
> node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39
> node 0 size: 65525 MB
> node 0 free: 63053 MB
> node 1 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47
> node 1 size: 65536 MB
> node 1 free: 63388 MB
> node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55
> node 2 size: 65536 MB
> node 2 free: 63344 MB
> node 3 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63
> node 3 size: 65535 MB
> node 3 free: 63376 MB
>
> Then I bound eth10's rx and tx IRQs to cores "24 25 26 27 28 29 30 31", one by one, which means 1 rx and 1 tx queue share 1 core.
>
>
> I did the same test on the 2.6.32 kernel: I can get >2.5M tx&rx with the same setup on RHEL6 (2.6.32). But it never reaches 10.000.000 rx&tx on a single NIC :)
>
> I also tested the ixgbe driver shipped with 2.6.38. It has the same problem.
>
> This is a perf record with the Linux-shipped ixgbe driver. It looks like it has a very high irq/s rate, and the softirq is busy in alloc_iova.
>
>
> PerfTop: 512417 irqs/sec kernel:91.3% exact: 0.0% [1000Hz cpu-clock-msecs], (all, 64 CPUs)
> ------------------------------------------------------------------------------------------------------------------------------------------------------
> - 0.82% ksoftirqd/24 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
> ▒ - _raw_spin_unlock_irqrestore
> ▒ - 44.27% alloc_iova
> ▒ intel_alloc_iova
> ▒ __intel_map_single
> ▒ intel_map_page
> ▒ - ixgbe_init_interrupt_scheme
> ▒ - 59.97% ixgbe_alloc_rx_buffers
> ▒ ixgbe_clean_rx_irq
> ▒ 0xffffffffa033a5
> ▒ net_rx_action
> ▒ __do_softirq
> ▒ + call_softirq
> ▒ - 40.03% ixgbe_change_mtu
> ▒ ixgbe_change_mtu
> ▒ dev_hard_start_xmit
> ▒ sch_direct_xmit
> ▒ dev_queue_xmit
> ▒ vlan_dev_hard_start_xmit
> ▒ hook_func
> ▒ nf_iterate
> ▒ nf_hook_slow
> ▒ NF_HOOK.clone.1
> ▒ ip_rcv
> ▒ __netif_receive_skb
> ▒ __netif_receive_skb
> ▒ netif_receive_skb
> ▒ napi_skb_finish
> ▒ napi_gro_receive
> ▒ ixgbe_clean_rx_irq
> ▒ 0xffffffffa033a5
> ▒ net_rx_action
> ▒ __do_softirq
> ▒ + call_softirq
> ▒ + 35.85% find_iova
> ▒ + 19.44% add_unmap
>
>
> Thanks
> WeiGu
What about using the driver as provided in 2.6.38?
No custom module parameters, only play with IRQ affinities.

Say you have 64 queues but want only 8 CPUs (24 -> 31) receiving traffic:
for i in `seq 0 7`
do
	echo 01000000 >/proc/irq/*/eth1-fp-$i/../smp_affinity
done
for i in `seq 8 15`
do
	echo 02000000 >/proc/irq/*/eth1-fp-$i/../smp_affinity
done
...
for i in `seq 56 63`
do
	echo 80000000 >/proc/irq/*/eth1-fp-$i/../smp_affinity
done
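
The masks in the loops above are one bit per CPU: CPU n maps to 1 << n, written as 8 hex digits. Here is a minimal sketch of that arithmetic, assuming 8 consecutive queues per CPU starting at CPU 24, as in the example. cpu_mask and queue_cpu are hypothetical helper names, not part of any tool, and the script only prints the mapping instead of writing to /proc:

```shell
#!/bin/sh
# Hypothetical helper: print the smp_affinity hex mask selecting one CPU.
# Only valid for CPUs 0-31; higher CPUs need comma-separated 32-bit groups.
cpu_mask() {
    printf '%08x\n' $((1 << $1))
}

# Hypothetical helper: map a queue index (0-63) to a CPU in 24..31,
# with 8 consecutive queues sharing one CPU.
queue_cpu() {
    echo $((24 + $1 / 8))
}

# Print a few sample assignments; nothing is written to /proc here.
for q in 0 8 56 63; do
    cpu=$(queue_cpu "$q")
    echo "queue $q -> cpu $cpu mask $(cpu_mask "$cpu")"
done
```

Queue 0 lands on CPU 24 (mask 01000000) and queue 63 on CPU 31 (mask 80000000), matching the first and last loops.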
Why is ixgbe_change_mtu() seen in your profile?
It's damn expensive, since it must call ixgbe_reinit_locked().

Are you using custom code in the kernel?