* behaviour question for igb on nehalem box
@ 2009-10-09 18:43 Chris Friesen
2009-10-09 20:22 ` Brandeburg, Jesse
0 siblings, 1 reply; 8+ messages in thread
From: Chris Friesen @ 2009-10-09 18:43 UTC (permalink / raw)
To: e1000-list, Linux Network Development list, Kirsher, Jeffrey T,
"Brandeburg, Jesse" <jesse
Hi all,
I've got some general questions around the expected behaviour of the
82576 igb net device. (On a dual quad-core Nehalem box, if it matters.)
As a caveat, the box is running CentOS 5.3 with their 2.6.18 kernel.
It's using the 1.3.16-k2 igb driver though, which looks to be the one
from mainline Linux.
The igb driver is being loaded with no parameters specified. At driver
init time, it's selecting 1 tx queue and 4 rx queues per device.
My first question is whether the number of queues makes sense. I
couldn't figure out how this would happen since the rules for selecting
the number of queues seem to be the same for rx and tx. Also, it's not
clear to me why it's limiting itself to 4 rx queues when I have 8
physical cores (and 16 virtual ones with hyperthreading enabled).
My second question is around how the rx queues are mapped to interrupts.
According to /proc/interrupts there appears to be a 1:1 mapping between
queues and interrupts. However, I've set up a test with a given amount
of traffic coming in to the device (from 4 different IP addresses and 4
ports). Under this scenario, "ethtool -S" shows the number of packets
increasing for only rx queue 0, but I see the interrupt count going up
for two interrupts.
My final question is around smp affinity for the rx and tx queue
interrupts. Do I need to affine the interrupt for each rx queue to a
single core to guarantee proper packet ordering, or can they be handled
on arbitrary cores? Should the tx queue be affined to a particular core
or left to be handled by all cores?
Thanks,
Chris
------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
* Re: behaviour question for igb on nehalem box
2009-10-09 18:43 behaviour question for igb on nehalem box Chris Friesen
@ 2009-10-09 20:22 ` Brandeburg, Jesse
2009-10-09 22:31 ` Chris Friesen
0 siblings, 1 reply; 8+ messages in thread
From: Brandeburg, Jesse @ 2009-10-09 20:22 UTC (permalink / raw)
To: Chris Friesen
Cc: e1000-list, Linux Network Development list, Allan, Bruce W,
Ronciak, John, Kirsher, Jeffrey T
On Fri, 9 Oct 2009, Chris Friesen wrote:
> I've got some general questions around the expected behaviour of the
> 82576 igb net device. (On a dual quad-core Nehalem box, if it matters.)
>
> As a caveat, the box is running CentOS 5.3 with their 2.6.18 kernel.
> It's using the 1.3.16-k2 igb driver though, which looks to be the one
> from mainline Linux.
>
> The igb driver is being loaded with no parameters specified. At driver
> init time, it's selecting 1 tx queue and 4 rx queues per device.
>
> My first question is whether the number of queues makes sense. I
It does for this kernel, because 2.6.18 doesn't support multiple tx
queues. The hardware supports RSS across receive queues, and the driver
hides the multiple receive queues from the OS.
> couldn't figure out how this would happen since the rules for selecting
> the number of queues seem to be the same for rx and tx. Also, it's not
> clear to me why it's limiting itself to 4 rx queues when I have 8
> physical cores (and 16 virtual ones with hyperthreading enabled).
For gigabit, more queues are not necessarily better, and multiqueue (MQ)
arguably isn't necessary at all for gigabit. However, it can help some
workloads by spreading out RX traffic. The hardware you have only supports 8
queues (rx and tx), and the driver is configured to set up at most 4.
> My second question is around how the rx queues are mapped to interrupts.
> According to /proc/interrupts there appears to be a 1:1 mapping between
> queues and interrupts. However, I've set up a test with a given amount
> of traffic coming in to the device (from 4 different IP addresses and 4
> ports). Under this scenario, "ethtool -S" shows the number of packets
> increasing for only rx queue 0, but I see the interrupt count going up
> for two interrupts.
One transmit interrupt and one receive interrupt? RSS spreads the
receive work out per flow, based on the IP and TCP/UDP headers. Your test
as described should be using more than one flow (and therefore more than
one rx queue) unless you got caught out by the default arp_filter
behavior (check arp -an).
> My final question is around smp affinity for the rx and tx queue
> interrupts. Do I need to affine the interrupt for each rx queue to a
> single core to guarantee proper packet ordering, or can they be handled
> on arbitrary cores? Should the tx queue be affined to a particular core
> or left to be handled by all cores?
On RHEL 5.3 you can use irqbalance; you shouldn't need to hand-affine
anything. Packets won't be received out of order unless you have the rx
interrupts going to more than one CPU per queue (that is, the smp_affinity
mask has more than one bit set). RSS is doing the flow steering.
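A minimal sketch of that smp_affinity check, in Python. It assumes the mask is the hexadecimal string read from /proc/irq/<irq>/smp_affinity; the function names are hypothetical and are not part of irqbalance or the igb driver.

```python
def cpus_in_affinity_mask(mask_text):
    """Return the CPU indices selected by a hexadecimal smp_affinity mask."""
    # Wide masks are printed as comma-separated 32-bit groups
    # (for example "00000000,00000004"), so drop the commas first.
    mask = int(mask_text.strip().replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


def may_reorder(mask_text):
    """True if the IRQ can be serviced by more than one CPU."""
    return len(cpus_in_affinity_mask(mask_text)) > 1


if __name__ == "__main__":
    for text in ("04", "0f", "00000001,00000000"):
        print(text, cpus_in_affinity_mask(text), may_reorder(text))
```

A mask of "04" pins the interrupt to CPU 2 only, while "0f" lets CPUs 0 through 3 service it, which is the situation described above as able to reorder packets within a queue.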
Going to a 2.6.27 or newer kernel will get you full tx multiqueue support.
Hope this helps,
Jesse
* Re: behaviour question for igb on nehalem box
2009-10-09 20:22 ` Brandeburg, Jesse
@ 2009-10-09 22:31 ` Chris Friesen
2009-10-09 23:20 ` Alexander Duyck
0 siblings, 1 reply; 8+ messages in thread
From: Chris Friesen @ 2009-10-09 22:31 UTC (permalink / raw)
To: Brandeburg, Jesse
Cc: e1000-list, Linux Network Development list, Allan, Bruce W,
Ronciak, John, Kirsher, Jeffrey T
On 10/09/2009 02:22 PM, Brandeburg, Jesse wrote:
> On Fri, 9 Oct 2009, Chris Friesen wrote:
>> I've got some general questions around the expected behaviour of the
>> 82576 igb net device. (On a dual quad-core Nehalem box, if it matters.)
> the hardware you have only supports 8
> queues (rx and tx) and the driver is configured to only set up 4 max.
The datasheet for the 82576 says 16 tx queues and 16 rx queues. Is that
a typo or do we have the economy version?
>> My second question is around how the rx queues are mapped to interrupts.
>> According to /proc/interrupts there appears to be a 1:1 mapping between
>> queues and interrupts. However, I've set up a test with a given amount
>> of traffic coming in to the device (from 4 different IP addresses and 4
>> ports). Under this scenario, "ethtool -S" shows the number of packets
>> increasing for only rx queue 0, but I see the interrupt count going up
>> for two interrupts.
>
> one transmit interrupt and one receive interrupt?
No, two rx interrupts. (Can't remember if the tx interrupt was going up
as well or no...was only looking at rx.)
> RSS will spread the
> receive work out in a flow based way, based on ip/xDP header. Your test
> as described should be using more than one flow (and therefore more than
> one rx queue) unless you got caught out by the default arp_filter
> behavior (check arp -an).
I was surprised as well since it didn't match what I expected. What's
the story around the arp_filter? I just logged onto the test box and
"arp -an" gives:
? (47.135.251.129) at 00:00:5E:00:01:08 [ether] on eth0
but I'm not sure that's worth anything since someone is running a test
and it's currently using all four rx queues and all four rx interrupt
counts are increasing. I'll have to see if they changed anything.
> Hope this helps,
That's great, thanks.
Chris
* Re: behaviour question for igb on nehalem box
2009-10-09 22:31 ` Chris Friesen
@ 2009-10-09 23:20 ` Alexander Duyck
2009-10-09 23:48 ` Alexander Duyck
0 siblings, 1 reply; 8+ messages in thread
From: Alexander Duyck @ 2009-10-09 23:20 UTC (permalink / raw)
To: Chris Friesen
Cc: e1000-list, gospo@redhat.com, Linux Network Development list,
Allan, Bruce W, Brandeburg, Jesse, Ronciak, John,
Kirsher, Jeffrey T
Chris Friesen wrote:
> On 10/09/2009 02:22 PM, Brandeburg, Jesse wrote:
>> On Fri, 9 Oct 2009, Chris Friesen wrote:
>>> I've got some general questions around the expected behaviour of the
>>> 82576 igb net device. (On a dual quad-core Nehalem box, if it matters.)
>
>> the hardware you have only supports 8
>> queues (rx and tx) and the driver is configured to only set up 4 max.
>
> The datasheet for the 82576 says 16 tx queues and 16 rx queues. Is that
> a typo or do we have the economy version?
Actually, the limitation is due to the fact that there are only 10
interrupts available. On kernels that support TX multi-queue, the number
of queues would be 4 TX and 4 RX, which would consume 8 interrupts,
leaving one for the link status change and one unused.
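The interrupt budget above can be checked with a few lines of Python. This is only a sketch of the arithmetic in this message (10 interrupts, one per queue, one for link status changes); the function name and its parameters are hypothetical and do not come from the driver.

```python
def spare_interrupts(total, rx_queues, tx_queues, other=1):
    """Interrupts left after one per queue plus `other` (link status)."""
    used = rx_queues + tx_queues + other
    if used > total:
        raise ValueError(
            "need %d interrupts but only %d are available" % (used, total)
        )
    return total - used


if __name__ == "__main__":
    # 4 RX + 4 TX + 1 link status = 9 of 10, so one is left unused.
    print(spare_interrupts(10, rx_queues=4, tx_queues=4))  # 1
    # 16 RX + 16 TX queues would need 33 interrupts under this scheme.
    try:
        spare_interrupts(10, rx_queues=16, tx_queues=16)
    except ValueError as err:
        print(err)
```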
However on the kernel you are using I don't believe multi-queue NAPI is
enabled so you shouldn't have multiple RX queues either. On a 2.6.18
kernel you should have only 1 RX and 1 TX queue unless you are using the
driver provided on e1000.sourceforge.net which uses fake netdevs to
support multi-queue NAPI. I believe this may be a bug that was
introduced when SR-IOV support was back-ported from the 2.6.30 kernel.
>>> My second question is around how the rx queues are mapped to interrupts.
>>> According to /proc/interrupts there appears to be a 1:1 mapping between
>>> queues and interrupts. However, I've set up a test with a given amount
>>> of traffic coming in to the device (from 4 different IP addresses and 4
>>> ports). Under this scenario, "ethtool -S" shows the number of packets
>>> increasing for only rx queue 0, but I see the interrupt count going up
>>> for two interrupts.
>> one transmit interrupt and one receive interrupt?
>
> No, two rx interrupts. (Can't remember if the tx interrupt was going up
> as well or no...was only looking at rx.)
This may be due to the bug I mentioned above. Multiple RX queues
shouldn't be present on the 2.6.18 kernel as I do not believe
multi-queue NAPI has been back-ported and it could have negative effects.
>> RSS will spread the
>> receive work out in a flow based way, based on ip/xDP header. Your test
>> as described should be using more than one flow (and therefore more than
>> one rx queue) unless you got caught out by the default arp_filter
>> behavior (check arp -an).
>
> I was surprised as well since it didn't match what I expected. What's
> the story around the arp_filter? I just logged onto the test box and
> "arp -an" gives:
>
> ? (47.135.251.129) at 00:00:5E:00:01:08 [ether] on eth0
>
> but I'm not sure that's worth anything since someone is running a test
> and it's currently using all four rx queues and all four rx interrupt
> counts are increasing. I'll have to see if they changed anything.
>
>
>> Hope this helps,
>
> That's great, thanks.
>
> Chris
>
* Re: behaviour question for igb on nehalem box
2009-10-09 23:20 ` Alexander Duyck
@ 2009-10-09 23:48 ` Alexander Duyck
2009-10-13 17:32 ` Chris Friesen
0 siblings, 1 reply; 8+ messages in thread
From: Alexander Duyck @ 2009-10-09 23:48 UTC (permalink / raw)
To: Chris Friesen
Cc: e1000-list, Linux Network Development list, Allan, Bruce W,
Brandeburg, Jesse, Ronciak, John, Kirsher, Jeffrey T,
gospo@redhat.com
Alexander Duyck wrote:
> Chris Friesen wrote:
>> On 10/09/2009 02:22 PM, Brandeburg, Jesse wrote:
>>> On Fri, 9 Oct 2009, Chris Friesen wrote:
>>>> I've got some general questions around the expected behaviour of the
>>>> 82576 igb net device. (On a dual quad-core Nehalem box, if it matters.)
>>> the hardware you have only supports 8
>>> queues (rx and tx) and the driver is configured to only set up 4 max.
>> The datasheet for the 82576 says 16 tx queues and 16 rx queues. Is that
>> a typo or do we have the economy version?
>
> Actually the limitation is due to the fact that there are only 10
> interrupts available. On kernels that support TX multi-queue the number
> of queues would be 4 TX and 4 RX, which would consume 8 interrupts
> leaving 1 for the link status change and one unused.
>
> However on the kernel you are using I don't believe multi-queue NAPI is
> enabled so you shouldn't have multiple RX queues either. On a 2.6.18
> kernel you should have only 1 RX and 1 TX queue unless you are using the
> driver provided on e1000.sourceforge.net which uses fake netdevs to
> support multi-queue NAPI. I believe this may be a bug that was
> introduced when SR-IOV support was back-ported from the 2.6.30 kernel.
Actually, after looking closer at the Red Hat source, it looks like they
have done the fake netdev workaround in their own code, so I guess the igb
driver in the RHEL kernel does support multiple RX queues.
>>>> My second question is around how the rx queues are mapped to interrupts.
>>>> According to /proc/interrupts there appears to be a 1:1 mapping between
>>>> queues and interrupts. However, I've set up at test with a given amount
>>>> of traffic coming in to the device (from 4 different IP addresses and 4
>>>> ports). Under this scenario, "ethtool -S" shows the number of packets
>>>> increasing for only rx queue 0, but I see the interrupt count going up
>>>> for two interrupts.
>>> one transmit interrupt and one receive interrupt?
>> No, two rx interrupts. (Can't remember if the tx interrupt was going up
>> as well or no...was only looking at rx.)
>
> This may be due to the bug I mentioned above. Multiple RX queues
> shouldn't be present on the 2.6.18 kernel as I do not believe
> multi-queue NAPI has been back-ported and it could have negative effects.
The odds of any 2 flows overlapping when you are only using 4 flows are
pretty high, especially if the addresses/ports are close in range. You
typically need something on the order of 16 flows over a wide
range of port numbers in order to get a good distribution.
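As a rough illustration of those odds, the Python sketch below treats the RSS hash as if it placed each flow on one of the queues uniformly and independently. That is an idealisation, not a description of the real hash (addresses and ports that are close together may behave worse), and the function names are hypothetical.

```python
from math import perm


def p_collision(flows, queues):
    """Probability that at least two flows share a queue."""
    if flows > queues:
        return 1.0
    # perm(queues, flows) counts the placements with no shared queue.
    return 1.0 - perm(queues, flows) / queues ** flows


def expected_queues_used(flows, queues):
    """Expected number of queues that receive at least one flow."""
    return queues * (1.0 - (1.0 - 1.0 / queues) ** flows)


if __name__ == "__main__":
    print(p_collision(4, 4))            # 0.90625
    print(expected_queues_used(4, 4))   # 2.734375
    print(expected_queues_used(16, 4))
```

Under this assumption, 4 flows on 4 queues share a queue about 91% of the time and occupy fewer than 3 queues on average, while 16 flows are expected to touch nearly all 4.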
Thanks,
Alex
* Re: behaviour question for igb on nehalem box
2009-10-09 23:48 ` Alexander Duyck
@ 2009-10-13 17:32 ` Chris Friesen
2009-10-16 22:15 ` Richard Scobie
0 siblings, 1 reply; 8+ messages in thread
From: Chris Friesen @ 2009-10-13 17:32 UTC (permalink / raw)
To: Alexander Duyck
Cc: e1000-list, Linux Network Development list, Allan, Bruce W,
Brandeburg, Jesse, Ronciak, John, Kirsher, Jeffrey T,
gospo@redhat.com
On 10/09/2009 05:48 PM, Alexander Duyck wrote:
> The odds of any 2 flows overlapping when you are only using 4 flows are
> pretty high, especially if the addresses/ports are close in range. You
> typically need something on the order of 16 flows over a wide
> range of port numbers in order to get a good distribution.
Yes, I realize this. However, I was surprised that we were seeing the
packet count increasing for only one queue but the interrupt count
increasing for more than one.
Also, if we really crank up the traffic levels the box apparently
panics. They're working on getting a serial cable hooked up to it to
get the debug information, so I don't really have much information on
that part just yet.
We're going to try the out-of-tree drivers. Unfortunately it appears
that the out-of-tree igb driver doesn't compile for this kernel. I
suspect the compat code needs tweaking.
Chris
* Re: behaviour question for igb on nehalem box
2009-10-13 17:32 ` Chris Friesen
@ 2009-10-16 22:15 ` Richard Scobie
2009-10-16 22:48 ` [E1000-devel] " Brandeburg, Jesse
0 siblings, 1 reply; 8+ messages in thread
From: Richard Scobie @ 2009-10-16 22:15 UTC (permalink / raw)
To: Chris Friesen
Cc: e1000-list, Linux Network Development list, Allan, Bruce W,
Brandeburg, Jesse, Ronciak, John, Kirsher, Jeffrey T,
gospo@redhat.com
I have just put together a Nehalem system (1 x Xeon) running
2.6.30.8-64.fc11.x86_64, which has quad onboard 82576, and noticed while
testing with just a single interface that the RX queues on the other 3
were receiving interrupts, as observed in /proc/interrupts.
Is this normal behaviour?
Regards,
Richard
* Re: [E1000-devel] behaviour question for igb on nehalem box
2009-10-16 22:15 ` Richard Scobie
@ 2009-10-16 22:48 ` Brandeburg, Jesse
0 siblings, 0 replies; 8+ messages in thread
From: Brandeburg, Jesse @ 2009-10-16 22:48 UTC (permalink / raw)
To: Richard Scobie
Cc: Chris Friesen, e1000-list, Linux Network Development list,
Allan, Bruce W, Ronciak, John, Kirsher, Jeffrey T,
gospo@redhat.com
On Fri, 16 Oct 2009, Richard Scobie wrote:
> I have just put together a Nehalem system (1 x Xeon),
> 2.6.30.8-64.fc11.x86_64, which has quad onboard 82576 and noticed during
> testing using just a single interface, that the RX queues on the other 3
> were receiving interrupts - observed in /proc/interrupts.
>
> Is this normal behaviour?
Hi Richard,
This is normal, since we trigger an interrupt on every queue during our
watchdog. So if the interface is up it should be triggering an interrupt
on every queue every two seconds.
This has been (and will probably continue to be) necessary in order to pick
up any straggler packets left behind by the (extremely rare, but expected to
occur) missed interrupts that can happen when traffic is running. It is also
extremely useful if you're doing irq affinitization and/or debugging
interrupts.
You can bring down the interfaces and they will stop interrupting.
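One way to use the two-second figure when reading /proc/interrupts is to subtract the watchdog baseline before deciding that a queue is carrying traffic. The Python sketch below assumes two samples of per-queue interrupt counts taken a known number of seconds apart; the queue names, the `slack` allowance, and the function name are all hypothetical.

```python
WATCHDOG_PERIOD_S = 2.0  # one interrupt per queue every two seconds


def queues_with_traffic(before, after, elapsed_s, slack=2):
    """Names of queues whose interrupt count grew beyond the watchdog rate."""
    baseline = elapsed_s / WATCHDOG_PERIOD_S + slack
    return sorted(
        name for name, count in after.items()
        if count - before.get(name, 0) > baseline
    )


if __name__ == "__main__":
    before = {"eth0-rx-0": 1000, "eth0-rx-1": 500, "eth1-rx-0": 40}
    after = {"eth0-rx-0": 91000, "eth0-rx-1": 531, "eth1-rx-0": 70}
    # Over 60 seconds the watchdog alone accounts for about 30 interrupts.
    print(queues_with_traffic(before, after, elapsed_s=60))  # ['eth0-rx-0']
```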
Jesse