netdev.vger.kernel.org archive mirror
* What's the benefit of large Rx rings?
@ 2015-11-22 20:19 Yuval Mintz
  2015-11-22 21:53 ` Alexander Duyck
  2015-11-23 17:27 ` David Laight
  0 siblings, 2 replies; 6+ messages in thread
From: Yuval Mintz @ 2015-11-22 20:19 UTC (permalink / raw)
  To: netdev

Hi,

This might be a dumb question, but I recently touched this
and felt like I'm missing something basic -

NAPI is scheduled from soft-interrupt context, and it
has a ~strict quota for handling Rx packets [even though we're
allowing practically unlimited handling of Tx completions].
Given these facts, what's the benefit of having arbitrarily large
Rx buffer rings? Assuming the quota is 64, I would have expected
that having more than twice or thrice as many buffers could not
help in real traffic scenarios - in any given time-unit
[the time between 2 NAPI runs, which should be relatively
constant] the CPU can't handle more than the quota; if HW is
generating more packets on a regular basis, the buffers are bound
to get exhausted, no matter how many there are.

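[For reference, a minimal sketch of the pattern I'm referring to - a
made-up driver's poll routine, where the foo_*() helpers are
hypothetical and only the NAPI/kernel calls are real:]

static int foo_poll(struct napi_struct *napi, int budget)
{
        struct foo_ring *ring = container_of(napi, struct foo_ring, napi);
        int work_done = 0;

        /* Rx: consume at most 'budget' (typically 64) packets per run. */
        while (work_done < budget && foo_rx_has_work(ring)) {
                foo_handle_one_rx_packet(ring);
                work_done++;
        }

        /* Tx completions: cleaned without counting against the budget. */
        foo_clean_tx_completions(ring);

        if (work_done < budget)
                napi_complete_done(napi, work_done); /* done; re-enable IRQ */
        /* else: stay scheduled and get polled again on the next softirq run */

        return work_done;
}
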
While there isn't any obvious downside to allowing drivers to
increase ring sizes [other than memory footprint],
I feel like I'm missing the scenarios where having thousands of
buffers can actually help.
And in the unlikely case that I'm not missing anything,
why aren't we supplying some `default' max and min amounts
in a common header?

Thanks,
Yuval


* Re: What's the benefit of large Rx rings?
  2015-11-22 20:19 What's the benefit of large Rx rings? Yuval Mintz
@ 2015-11-22 21:53 ` Alexander Duyck
  2015-11-23  4:47   ` Yuval Mintz
  2015-11-23 17:27 ` David Laight
  1 sibling, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2015-11-22 21:53 UTC (permalink / raw)
  To: Yuval Mintz; +Cc: netdev

On Sun, Nov 22, 2015 at 12:19 PM, Yuval Mintz <Yuval.Mintz@qlogic.com> wrote:
> Hi,
>
> This might be a dumb question, but I recently touched this
> and felt like I'm missing something basic -
>
> NAPI is scheduled from soft-interrupt context, and it
> has a ~strict quota for handling Rx packets [even though we're
> allowing practically unlimited handling of Tx completions].
> Given these facts, what's the benefit of having arbitrarily large
> Rx buffer rings? Assuming the quota is 64, I would have expected
> that having more than twice or thrice as many buffers could not
> help in real traffic scenarios - in any given time-unit
> [the time between 2 NAPI runs, which should be relatively
> constant] the CPU can't handle more than the quota; if HW is
> generating more packets on a regular basis, the buffers are bound
> to get exhausted, no matter how many there are.
>
> While there isn't any obvious downside to allowing drivers to
> increase ring sizes [other than memory footprint],
> I feel like I'm missing the scenarios where having thousands of
> buffers can actually help.
> And in the unlikely case that I'm not missing anything,
> why aren't we supplying some `default' max and min amounts
> in a common header?

The main benefit of large Rx rings is that you could theoretically
support longer delays between device interrupts.  So, for example, if
you have a protocol such as UDP that doesn't care about latency, you
could set a large ring size and a large interrupt delay, and process
several hundred or possibly even several thousand packets per device
interrupt instead of just a few.

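To put rough, illustrative numbers on it (my own back-of-the-envelope,
not from any particular driver): 10Gb/s of 1500-byte frames is roughly
810K packets per second, so even a 200us coalescing delay can
accumulate on the order of 160 packets before the first interrupt
fires - already a couple of NAPI budgets' worth - which is part of why
default ring sizes tend to be in the hundreds or thousands of entries.
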
- Alex


* Re: What's the benefit of large Rx rings?
  2015-11-22 21:53 ` Alexander Duyck
@ 2015-11-23  4:47   ` Yuval Mintz
  2015-11-23 15:16     ` Alexander Duyck
  0 siblings, 1 reply; 6+ messages in thread
From: Yuval Mintz @ 2015-11-23  4:47 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev

>> This might be a dumb question, but I recently touched this
>> and felt like I'm missing something basic -
>>
>> NAPI is scheduled from soft-interrupt context, and it
>> has a ~strict quota for handling Rx packets [even though we're
>> allowing practically unlimited handling of Tx completions].
>> Given these facts, what's the benefit of having arbitrarily large
>> Rx buffer rings? Assuming the quota is 64, I would have expected
>> that having more than twice or thrice as many buffers could not
>> help in real traffic scenarios - in any given time-unit
>> [the time between 2 NAPI runs, which should be relatively
>> constant] the CPU can't handle more than the quota; if HW is
>> generating more packets on a regular basis, the buffers are bound
>> to get exhausted, no matter how many there are.
>>
>> While there isn't any obvious downside to allowing drivers to
>> increase ring sizes [other than memory footprint],
>> I feel like I'm missing the scenarios where having thousands of
>> buffers can actually help.
>> And in the unlikely case that I'm not missing anything,
>> why aren't we supplying some `default' max and min amounts
>> in a common header?

> The main benefit of large Rx rings is that you could theoretically
> support longer delays between device interrupts.  So, for example, if
> you have a protocol such as UDP that doesn't care about latency, you
> could set a large ring size and a large interrupt delay, and process
> several hundred or possibly even several thousand packets per device
> interrupt instead of just a few.

So we're basically spending hundreds of MBs [at least for high-speed
Ethernet devices] on memory that helps us mostly on the first
coalesced interrupt [since later it all goes through NAPI re-scheduling]?
Sounds a bit... wasteful.


* Re: What's the benefit of large Rx rings?
  2015-11-23  4:47   ` Yuval Mintz
@ 2015-11-23 15:16     ` Alexander Duyck
  2015-11-23 16:48       ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2015-11-23 15:16 UTC (permalink / raw)
  To: Yuval Mintz; +Cc: netdev

On Sun, Nov 22, 2015 at 8:47 PM, Yuval Mintz <Yuval.Mintz@qlogic.com> wrote:
>>> This might be a dumb question, but I recently touched this
>>> and felt like I'm missing something basic -
>>>
>>> NAPI is scheduled from soft-interrupt context, and it
>>> has a ~strict quota for handling Rx packets [even though we're
>>> allowing practically unlimited handling of Tx completions].
>>> Given these facts, what's the benefit of having arbitrarily large
>>> Rx buffer rings? Assuming the quota is 64, I would have expected
>>> that having more than twice or thrice as many buffers could not
>>> help in real traffic scenarios - in any given time-unit
>>> [the time between 2 NAPI runs, which should be relatively
>>> constant] the CPU can't handle more than the quota; if HW is
>>> generating more packets on a regular basis, the buffers are bound
>>> to get exhausted, no matter how many there are.
>>>
>>> While there isn't any obvious downside to allowing drivers to
>>> increase ring sizes [other than memory footprint],
>>> I feel like I'm missing the scenarios where having thousands of
>>> buffers can actually help.
>>> And in the unlikely case that I'm not missing anything,
>>> why aren't we supplying some `default' max and min amounts
>>> in a common header?
>
>> The main benefit of large Rx rings is that you could theoretically
>> support longer delays between device interrupts.  So, for example, if
>> you have a protocol such as UDP that doesn't care about latency, you
>> could set a large ring size and a large interrupt delay, and process
>> several hundred or possibly even several thousand packets per device
>> interrupt instead of just a few.
>
> So we're basically spending hundreds of MBs [at least for high-speed
> Ethernet devices] on memory that helps us mostly on the first
> coalesced interrupt [since later it all goes through NAPI re-scheduling]?
> Sounds a bit... wasteful.

The hundreds of MBs might be stretching it a bit.  It is most likely
more like tens of MBs, not hundreds.  For example, the ixgbe driver
uses 512 buffers for Rx by default.  Each Rx buffer is 4K, so that
comes out to only 2MB per ring.  Other than that there are 8K worth of
descriptors and another 12K worth of buffer info data.

It all depends on priorities.  You could decrease the delay between
interrupts and reduce the Rx ring size, but it means that on a lightly
loaded system you may see significantly higher CPU utilization.

Another thing to keep in mind is that for things like virtualization
the interrupt latency is increased, and as a result you need more
buffering to allow for the greater delay between the IRQ and when the
NAPI instance in the guest actually begins polling.

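To make that concrete with rough, made-up numbers: at ~810K packets
per second (10Gb/s of 1500-byte frames), a 1ms gap before the guest's
vCPU gets scheduled and its NAPI poll actually runs already means
~800 packets in flight - more than a 512-entry ring can absorb - so
the larger rings buy real headroom there.
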
- Alex


* Re: What's the benefit of large Rx rings?
  2015-11-23 15:16     ` Alexander Duyck
@ 2015-11-23 16:48       ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 6+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-11-23 16:48 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Yuval Mintz, netdev

On Mon, Nov 23, 2015 at 07:16:25AM -0800, Alexander Duyck wrote:
> On Sun, Nov 22, 2015 at 8:47 PM, Yuval Mintz <Yuval.Mintz@qlogic.com> wrote:
> >>> This might be a dumb question, but I recently touched this
> >>> and felt like I'm missing something basic -
> >>>
> >>> NAPI is scheduled from soft-interrupt context, and it
> >>> has a ~strict quota for handling Rx packets [even though we're
> >>> allowing practically unlimited handling of Tx completions].
> >>> Given these facts, what's the benefit of having arbitrarily large
> >>> Rx buffer rings? Assuming the quota is 64, I would have expected
> >>> that having more than twice or thrice as many buffers could not
> >>> help in real traffic scenarios - in any given time-unit
> >>> [the time between 2 NAPI runs, which should be relatively
> >>> constant] the CPU can't handle more than the quota; if HW is
> >>> generating more packets on a regular basis, the buffers are bound
> >>> to get exhausted, no matter how many there are.
> >>>
> >>> While there isn't any obvious downside to allowing drivers to
> >>> increase ring sizes [other than memory footprint],
> >>> I feel like I'm missing the scenarios where having thousands of
> >>> buffers can actually help.
> >>> And in the unlikely case that I'm not missing anything,
> >>> why aren't we supplying some `default' max and min amounts
> >>> in a common header?
> >
> >> The main benefit of large Rx rings is that you could theoretically
> >> support longer delays between device interrupts.  So, for example, if
> >> you have a protocol such as UDP that doesn't care about latency, you
> >> could set a large ring size and a large interrupt delay, and process
> >> several hundred or possibly even several thousand packets per device
> >> interrupt instead of just a few.
> >
> > So we're basically spending hundreds of MBs [at least for high-speed
> > Ethernet devices] on memory that helps us mostly on the first
> > coalesced interrupt [since later it all goes through NAPI re-scheduling]?
> > Sounds a bit... wasteful.
> 
> The hundreds of MBs might be stretching it a bit.  It is most likely
> more like tens of MBs, not hundreds.  For example, the ixgbe driver
> uses 512 buffers for Rx by default.  Each Rx buffer is 4K, so that
> comes out to only 2MB per ring.  Other than that there are 8K worth of
> descriptors and another 12K worth of buffer info data.
>
> It all depends on priorities.  You could decrease the delay between
> interrupts and reduce the Rx ring size, but it means that on a lightly
> loaded system you may see significantly higher CPU utilization.
>
> Another thing to keep in mind is that for things like virtualization
> the interrupt latency is increased, and as a result you need more
> buffering to allow for the greater delay between the IRQ and when the
> NAPI instance in the guest actually begins polling.

There are other factors, too, that may cause extra processing during
that softirq.  If you have netfilter rules, for example, they are
processed in the same softirq that is receiving the packets; it's part
of it.  Some rules then get checked for a given packet while others
are skipped, etc.

Even TCP processing is done at this time, especially if you don't use
RFS.  If a given socket's rx buffer starts to fill up, it may trigger
a buffer collapse (tcp_collapse()), which consumes extra CPU time for
that packet alone.

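A rough sketch of the (heavily simplified) per-packet path, just to
show where that work lands; the details vary with GRO, RPS/RFS and
the ruleset:

  driver NAPI poll (NET_RX softirq)
    -> napi_gro_receive() / netif_receive_skb()
       -> ip_rcv() + netfilter hooks (PREROUTING, INPUT, ...)
          -> tcp_v4_rcv() -> tcp_data_queue()
             -> possibly tcp_collapse() when the socket rx buffer is
                under pressure

All of that is charged to the same softirq run that is supposed to be
draining the Rx ring.
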
But how big a ring is worth having is a good question, as this extra
processing depends on traffic pattern, CPU model, memory
speed/availability, etc., while the NIC line rate stays nearly constant.

  Marcelo


* RE: What's the benefit of large Rx rings?
  2015-11-22 20:19 What's the benefit of large Rx rings? Yuval Mintz
  2015-11-22 21:53 ` Alexander Duyck
@ 2015-11-23 17:27 ` David Laight
  1 sibling, 0 replies; 6+ messages in thread
From: David Laight @ 2015-11-23 17:27 UTC (permalink / raw)
  To: 'Yuval Mintz', netdev

From: Yuval Mintz
> Sent: 22 November 2015 20:19
> This might be a dumb question, but I recently touched this
> and felt like I'm missing something basic -
> 
> NAPI is scheduled from soft-interrupt context, and it
> has a ~strict quota for handling Rx packets [even though we're
> allowing practically unlimited handling of Tx completions].
> Given these facts, what's the benefit of having arbitrarily large
> Rx buffer rings? Assuming the quota is 64, I would have expected
> that having more than twice or thrice as many buffers could not
> help in real traffic scenarios - in any given time-unit
> [the time between 2 NAPI runs, which should be relatively
> constant] the CPU can't handle more than the quota; if HW is
> generating more packets on a regular basis, the buffers are bound
> to get exhausted, no matter how many there are.

What you don't want is guaranteed packet loss for common scenarios.

The worst one I've seen was not having enough buffers for a
single 8k NFS UDP datagram (that was a long time ago).

But a 64k send using hardware TSO will most likely give you
about 40 receive frames back to back; unless you can keep up
with line speed (unlikely at high speeds on a slow CPU), you
may need several times that many buffers to handle rx data
on multiple connections.
At some point data will get discarded, but usually the delays
in sending acks will slow down the receive data.

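(Quick sanity check on that number: 64k of payload split into
~1448-byte MSS segments is 65536/1448, i.e. roughly 45 frames, so
'about 40' back-to-back frames per TSO send is about right - and a
handful of concurrent senders multiplies that.)
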
In reality it is all a trade-off between having a lot of rx buffers
and recovering from rx discards.

	David

