* [RFC] bnx2x: Insane RX rings
@ 2010-09-09 20:45 Eric Dumazet
2010-09-09 21:21 ` Krzysztof Olędzki
From: Eric Dumazet @ 2010-09-09 20:45 UTC (permalink / raw)
To: netdev; +Cc: Eilon Greenstein
So I have a small dev machine, 4GB of RAM,
two E5540 CPUs (quad core, 2 threads per core),
so a total of 16 threads.
Two ethernet ports, eth0 and eth1,
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
bnx2x 0000:02:00.0: eth0: using MSI-X IRQs: sp 68 fp[0] 69 ... fp[15] 84
bnx2x 0000:02:00.1: eth1: using MSI-X IRQs: sp 85 fp[0] 86 ... fp[15] 101
Default configuration :
ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX: 4078
RX Mini: 0
RX Jumbo: 0
TX: 4078
Current hardware settings:
RX: 4078
RX Mini: 0
RX Jumbo: 0
TX: 4078
Problem is : with 16 RX queues per device, that's 4078*16*2 Kbytes per
Ethernet port.
Total :
skbuff_head_cache 130747 131025 256 15 1 : tunables 120 60 8 : slabdata 8735 8735 40
size-2048 130866 130888 2048 2 1 : tunables 24 12 8 : slabdata 65444 65444 28
That's about 300 Mbytes of memory, allocated just in case some network traffic occurs.
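(Back-of-the-envelope check, assuming ~2KB per RX data buffer and 256 bytes
per skbuff head, matching the slab object sizes above:

	4078 entries * 16 queues * 2048 bytes ~= 127 MB of data buffers per port
	4078 entries * 16 queues *  256 bytes ~=  16 MB of skbuff heads per port

so two idle ports pin roughly 2 * (127 + 16) ~= 286 MB, which lines up with
the ~130K objects in each slab and the ~300 Mbytes total.)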
Let's do something about that?
Thanks
* Re: [RFC] bnx2x: Insane RX rings
2010-09-09 20:45 [RFC] bnx2x: Insane RX rings Eric Dumazet
@ 2010-09-09 21:21 ` Krzysztof Olędzki
2010-09-09 21:30 ` David Miller
From: Krzysztof Olędzki @ 2010-09-09 21:21 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, Eilon Greenstein
On 2010-09-09 22:45, Eric Dumazet wrote:
> So I have a small dev machine, 4GB of RAM,
> two E5540 CPUs (quad core, 2 threads per core),
> so a total of 16 threads.
>
> Two ethernet ports, eth0 and eth1,
>
> 02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
> 02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711E 10Gigabit PCIe
>
> bnx2x 0000:02:00.0: eth0: using MSI-X IRQs: sp 68 fp[0] 69 ... fp[15] 84
> bnx2x 0000:02:00.1: eth1: using MSI-X IRQs: sp 85 fp[0] 86 ... fp[15] 101
>
>
> Default configuration :
>
> ethtool -g eth0
> Ring parameters for eth0:
> Pre-set maximums:
> RX: 4078
> RX Mini: 0
> RX Jumbo: 0
> TX: 4078
> Current hardware settings:
> RX: 4078
> RX Mini: 0
> RX Jumbo: 0
> TX: 4078
>
> Problem is : with 16 RX queues per device, that's 4078*16*2 Kbytes per
> Ethernet port.
>
> Total :
>
> skbuff_head_cache 130747 131025 256 15 1 : tunables 120 60 8 : slabdata 8735 8735 40
> size-2048 130866 130888 2048 2 1 : tunables 24 12 8 : slabdata 65444 65444 28
>
> That's about 300 Mbytes of memory, allocated just in case some network traffic occurs.
>
> Let's do something about that?
Yep, it is ~8MB per queue, not so much alone, but a lot together. For
this reason I use something like bnx2.num_queues=2 on servers where I
don't need much CPU power for network workload.
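For reference, a sketch of how that limit can be applied (assuming the driver
exposes a num_queues module parameter, as bnx2x of this era does; adjust the
module name to whichever driver is actually in use):

	# on the kernel command line, for a built-in driver:
	bnx2x.num_queues=2

	# or for a modular driver, e.g. in /etc/modprobe.d/bnx2x.conf:
	options bnx2x num_queues=2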
Best regards,
Krzysztof Olędzki
* Re: [RFC] bnx2x: Insane RX rings
2010-09-09 21:21 ` Krzysztof Olędzki
@ 2010-09-09 21:30 ` David Miller
2010-09-09 21:38 ` Rick Jones
From: David Miller @ 2010-09-09 21:30 UTC (permalink / raw)
To: ole; +Cc: eric.dumazet, netdev, eilong
From: Krzysztof Olędzki <ole@ans.pl>
Date: Thu, 09 Sep 2010 23:21:01 +0200
> On 2010-09-09 22:45, Eric Dumazet wrote:
>> Problem is : with 16 RX queues per device, that's 4078*16*2 Kbytes per
>> Ethernet port.
>>
>> Total :
>>
>> skbuff_head_cache 130747 131025 256 15 1 : tunables 120 60 8 :
>> slabdata 8735 8735 40
>> size-2048 130866 130888 2048 2 1 : tunables 24 12 8 : slabdata 65444
>> 65444 28
>>
>> That's about 300 Mbytes of memory, allocated just in case some network
>> traffic occurs.
>>
>> Let's do something about that?
>
> Yep, it is ~8MB per queue, not so much alone, but a lot together. For
> this reason I use something like bnx2.num_queues=2 on servers where I
> don't need much CPU power for network workload.
I think simply that the RX queue size should be scaled by the number
of queues we have.
If people want enormous RX ring sizes even when there are many queues,
they can use ethtool to get that.
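For example (illustrative only - the exact maximum depends on the NIC):

	# restore a large per-queue RX ring on a given interface
	ethtool -G eth0 rx 4078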
Taking up 130MB of memory per-card, just for RX packet buffers, is
certainly over the top.
* Re: [RFC] bnx2x: Insane RX rings
2010-09-09 21:30 ` David Miller
@ 2010-09-09 21:38 ` Rick Jones
2010-09-10 11:16 ` Eilon Greenstein
From: Rick Jones @ 2010-09-09 21:38 UTC (permalink / raw)
To: David Miller; +Cc: ole, eric.dumazet, netdev, eilong
David Miller wrote:
> From: Krzysztof Olędzki <ole@ans.pl>
> Date: Thu, 09 Sep 2010 23:21:01 +0200
>
>
>>On 2010-09-09 22:45, Eric Dumazet wrote:
>>
>>>Problem is : with 16 RX queues per device, that's 4078*16*2 Kbytes per
>>>Ethernet port.
>>>
>>>Total :
>>>
>>>skbuff_head_cache 130747 131025 256 15 1 : tunables 120 60 8 :
>>>slabdata 8735 8735 40
>>>size-2048 130866 130888 2048 2 1 : tunables 24 12 8 : slabdata 65444
>>>65444 28
>>>
>>>That's about 300 Mbytes of memory, allocated just in case some network
>>>traffic occurs.
>>>
>>>Let's do something about that?
>>
>>Yep, it is ~8MB per queue, not so much alone, but a lot together. For
>>this reason I use something like bnx2.num_queues=2 on servers where I
>>don't need much CPU power for network workload.
>
>
> I think simply that the RX queue size should be scaled by the number
> of queues we have.
>
> If people want enormous RX ring sizes even when there are many queues,
> they can use ethtool to get that.
>
> Taking up 130MB of memory per-card, just for RX packet buffers, is
> certainly over the top.
It gets even better if one considers JumboFrames... that said, I've had
customer contacts (indirect) where they were quite keen to have a ring size of
at least 2048 packets - I never could get it confirmed, but I suspect they had
applications/systems that might "go out to lunch" for long enough periods of
time that they wanted that degree of FIFO.
Doesn't necessarily change "what should be the defaults" much, but there it is.
rick jones
* Re: [RFC] bnx2x: Insane RX rings
2010-09-09 21:38 ` Rick Jones
@ 2010-09-10 11:16 ` Eilon Greenstein
2010-09-10 15:46 ` Rick Jones
2010-09-10 16:42 ` David Miller
From: Eilon Greenstein @ 2010-09-10 11:16 UTC (permalink / raw)
To: David Miller, Rick Jones
Cc: ole@ans.pl, eric.dumazet@gmail.com, netdev@vger.kernel.org
On Thu, 2010-09-09 at 14:38 -0700, Rick Jones wrote:
> David Miller wrote:
> > From: Krzysztof Olędzki <ole@ans.pl>
> > Date: Thu, 09 Sep 2010 23:21:01 +0200
> >
> >
> >>On 2010-09-09 22:45, Eric Dumazet wrote:
> >>
> >>>Problem is : with 16 RX queues per device, that's 4078*16*2 Kbytes per
> >>>Ethernet port.
> >>>
> >>>Total :
> >>>
> >>>skbuff_head_cache 130747 131025 256 15 1 : tunables 120 60 8 :
> >>>slabdata 8735 8735 40
> >>>size-2048 130866 130888 2048 2 1 : tunables 24 12 8 : slabdata 65444
> >>>65444 28
> >>>
> >>>That's about 300 Mbytes of memory, allocated just in case some network
> >>>traffic occurs.
> >>>
> >>>Let's do something about that?
> >>
> >>Yep, it is ~8MB per queue, not so much alone, but a lot together. For
> >>this reason I use something like bnx2.num_queues=2 on servers where I
> >>don't need much CPU power for network workload.
> >
> >
> > I think simply that the RX queue size should be scaled by the number
> > of queues we have.
There are a few factors that can be considered when scaling the ring
sizes:
- Number of queues per device
- Number of devices
- Available amount of memory
- Others...
I'm thinking about adding a scaling factor based only on the number of
queues - this will still leave issues for systems with many ports. Does
that sound reasonable, or is it not enough? Do you think the number of
devices or even the amount of free memory should be considered as well?
Thanks,
Eilon
> > If people want enormous RX ring sizes even when there are many queues,
> > they can use ethtool to get that.
> >
> > Taking up 130MB of memory per-card, just for RX packet buffers, is
> > certainly over the top.
>
> It gets even better if one considers JumboFrames... that said, I've had
> customer contacts (indirect) where they were quite keen to have a ring size of
> at least 2048 packets - I never could get it confirmed, but I suspect they had
> applications/systems that might "go out to lunch" for long enough periods of
> time that they wanted that degree of FIFO.
>
> Doesn't necessarily change "what should be the defaults" much, but there it is.
>
> rick jones
>
* Re: [RFC] bnx2x: Insane RX rings
2010-09-10 11:16 ` Eilon Greenstein
@ 2010-09-10 15:46 ` Rick Jones
2010-09-10 15:54 ` Rick Jones
2010-09-10 16:42 ` David Miller
From: Rick Jones @ 2010-09-10 15:46 UTC (permalink / raw)
To: eilong
Cc: David Miller, ole@ans.pl, eric.dumazet@gmail.com,
netdev@vger.kernel.org
>>>I think simply that the RX queue size should be scaled by the number
>>>of queues we have.
>
>
> There are a few factors that can be considered when scaling the ring
> sizes:
> - Number of queues per device
> - Number of devices
> - Available amount of memory
> - Others...
>
> I'm thinking about adding a scaling factor based only on the number of
> queues - this will still leave issues for systems with many ports. Does
> that sound reasonable, or is it not enough? Do you think the number of
> devices or even the amount of free memory should be considered as well?
At one level we are talking about horses and barn doors - for example, the
minimum memory requirements for ProLiants have already been set (and
communicated for some time) taking memory usage of their LOMs (LAN on
Motherboard) into account.
rick jones
* Re: [RFC] bnx2x: Insane RX rings
2010-09-10 11:16 ` Eilon Greenstein
2010-09-10 15:46 ` Rick Jones
@ 2010-09-10 16:42 ` David Miller
From: David Miller @ 2010-09-10 16:42 UTC (permalink / raw)
To: eilong; +Cc: rick.jones2, ole, eric.dumazet, netdev
From: "Eilon Greenstein" <eilong@broadcom.com>
Date: Fri, 10 Sep 2010 14:16:14 +0300
> There are a few factors that can be considered when scaling the ring
> sizes:
> - Number of queues per device
> - Number of devices
> - Available amount of memory
> - Others...
>
> I'm thinking about adding a scaling factor based only on the number of
> queues - this will still leave issues for systems with many ports. Does
> that sound reasonable, or is it not enough? Do you think the number of
> devices or even the amount of free memory should be considered as well?
I think scaling based upon the number of queues is a good place
to start.
Multi-port is less of an issue. The problem we really care about stems
from the fact that the exact same port will require more memory than
another one simply because it has more queues active.
I would even argue that this is a zero-sum change: since the traffic
ought to be distributed across the queues, you still have enough buffers
to handle the load.
Of course I understand that a certain level of buffering is necessary
even on a per-queue level with many queues active, so if you scale
based upon the number of queues but then enforce a minimum (something
like 128 entries), that would be a reasonable thing to do.
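A minimal sketch of that heuristic (illustrative only - the constants and
names below are made up for the example, not the driver's actual identifiers):

	/* Spread the old single-queue default across the active RX queues,
	 * but never drop below a small per-queue floor. */
	#define RX_DESC_TOTAL_DEFAULT	4078
	#define RX_DESC_MIN_PER_QUEUE	128

	static int default_rx_ring_size(int num_queues)
	{
		int size = RX_DESC_TOTAL_DEFAULT / num_queues;

		return size < RX_DESC_MIN_PER_QUEUE ? RX_DESC_MIN_PER_QUEUE : size;
	}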
Thanks for looking into this Eilon.