Re: What's the benefit of large Rx rings?

From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>, netdev <netdev@vger.kernel.org>
Subject: Re: What's the benefit of large Rx rings?
Date: Mon, 23 Nov 2015 14:48:30 -0200
Message-ID: <20151123164829.GA6114@mrl.redhat.com>
In-Reply-To: <CAKgT0Uem9MdrtJTA1gsoz2D_Fy2N2CdZq5ufdXBs7zsYkLMo-Q@mail.gmail.com>

On Mon, Nov 23, 2015 at 07:16:25AM -0800, Alexander Duyck wrote:
> On Sun, Nov 22, 2015 at 8:47 PM, Yuval Mintz <Yuval.Mintz@qlogic.com> wrote:
> >>> This might be a dumb question, but I recently touched this
> >>> and felt like I'm missing something basic -
> >>>
> >>> NAPI is being scheduled from soft-interrupt context, and it
> >>> has a ~strict quota for handling Rx packets [even though we're
> >>> allowing practically unlimited handling of Tx completions].
> >>> Given these facts, what's the benefit of having arbitrarily large
> >>> Rx buffer rings? Assuming quota is 64, I would have expected
> >>> that having more than twice or thrice as many buffers could not
> >>> help in real traffic scenarios - in any given time-unit
> >>> [the time between 2 NAPI runs which should be relatively
> >>> constant] CPU can't handle more than the quota; If HW is
> >>> generating more packets on a regular basis the buffers are bound
> >>> to get exhausted, no matter how many there are.
> >>>
> >>> While there isn't any obvious downside to allowing drivers to
> >>> increase ring sizes to be larger [other than memory footprint],
> >>> I feel like I'm missing the scenarios where having Ks of
> >>> buffers can actually help.
> >>> And for the unlikely case that I'm not missing anything,
> >>> why aren't we supplying some `default' max and min amounts
> >>> in a common header?
> >
> >> The main benefit of large Rx rings is that you could theoretically
> >> support longer delays between device interrupts.  So for example if
> >> you have a protocol such as UDP that doesn't care about latency then
> >> you could theoretically set a large ring size, a large interrupt delay
> >> and process several hundred or possibly even several thousand packets
> >> per device interrupt instead of just a few.
> >
> > So we're basically spending hundreds of MBs [at least for high-speed
> > ethernet devices] on memory that helps us mostly on the first
> > coalesced interrupt [since later it all goes through napi re-scheduling]?
> > Sounds a bit... wasteful.
> 
> The hundreds of MBs might be stretching it a bit.  It is most likely
> more like tens of MBs, not hundreds.  For example the ixgbe driver
> uses 512 buffers for Rx by default.  Each Rx buffer is 4K so that
> comes out to only 2MB per ring.  Other than that there are 8K worth of
> descriptors and another 12K worth of buffer info data.
> 
> It all depends on priorities.  You could decrease the delay between
> interrupts and reduce the Rx ring size but it means for a lightly
> loaded system you may see significantly higher CPU utilization.
> 
> Another thing to keep in mind is for things like virtualization the
> interrupt latency is increased and as a result you need more buffering
> to allow for the greater delay between the IRQ and when the NAPI
> instance in the guest actually begins polling.
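
A quick sanity check of the per-ring numbers quoted above, as a minimal
Python sketch. The 512 entries and 4K buffers come from the quoted mail;
the 16-byte descriptor and 24-byte buffer-info sizes are assumptions
back-computed from the "8K" and "12K" totals, not read from the ixgbe
source, and the ring count passed to total_mib() is hypothetical.

```python
# Figures from the quoted mail: 512 Rx buffers of 4K each per ring.
RING_ENTRIES = 512
BUFFER_BYTES = 4096
# Assumed per-entry sizes, back-computed from the "8K" and "12K" totals.
DESC_BYTES = 16
BUF_INFO_BYTES = 24


def ring_footprint(entries, buffer_bytes=BUFFER_BYTES,
                   desc_bytes=DESC_BYTES, info_bytes=BUF_INFO_BYTES):
    """Return (buffers, descriptors, buffer_info) in bytes for one Rx ring."""
    return (entries * buffer_bytes,
            entries * desc_bytes,
            entries * info_bytes)


def total_mib(entries, rings):
    """Total Rx memory in MiB for `rings` rings of `entries` entries each."""
    return rings * sum(ring_footprint(entries)) / (1024 * 1024)


if __name__ == "__main__":
    bufs, descs, info = ring_footprint(RING_ENTRIES)
    print("per ring: buffers=%d descriptors=%d buffer_info=%d bytes"
          % (bufs, descs, info))
    # 16 rings is a made-up count, used only for illustration.
    print("16 rings: %.2f MiB" % total_mib(RING_ENTRIES, 16))
```

With 16 rings (a made-up count) this comes to roughly 32 MiB, i.e. tens
of MBs rather than hundreds.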

There are other factors, too, that may cause extra processing during
that softirq. If you have netfilter rules, for example, they are
processed in the same softirq that is receiving the packets; it's part
of it. And some rules are checked for some packets and skipped for
others, so the per-packet cost varies, etc.

Even TCP processing is done at this time, especially if you don't use
RFS. If a given socket's rx buffer starts to fill up, it may trigger
a buffer collapse (tcp_collapse()), which consumes extra CPU time
for that packet only.

But as for how big a ring is worth having, that's a good question, as
this extra processing depends on traffic pattern, CPU model, memory
speed/availability, etc., while the NIC line rate may remain nearly
constant.
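
To put rough numbers on that trade-off, here is a minimal Python sketch
of how long a full ring can absorb line-rate traffic while the CPU is
busy elsewhere (netfilter, tcp_collapse(), a delayed guest). It assumes
a 10 Gbit/s link and the standard Ethernet framing overhead of 20 bytes
per frame (preamble, SFD, inter-frame gap); the function names are made
up for illustration.

```python
# Per-frame overhead on the wire: preamble (7) + SFD (1) + inter-frame gap (12).
ETH_OVERHEAD_BYTES = 20


def packets_per_second(link_bps, frame_bytes):
    """Maximum frame rate for a link speed (bits/s) and frame size (bytes)."""
    return link_bps / ((frame_bytes + ETH_OVERHEAD_BYTES) * 8)


def ring_absorb_seconds(ring_entries, link_bps, frame_bytes):
    """Time a ring of `ring_entries` free buffers lasts with no polling."""
    return ring_entries / packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    link = 10e9  # assumed 10 Gbit/s link
    for entries in (512, 4096):
        for frame in (64, 1518):
            usec = ring_absorb_seconds(entries, link, frame) * 1e6
            print("%4d entries, %4d-byte frames: %.1f us" % (entries, frame, usec))
```

Under these assumptions, with 64-byte frames a 512-entry ring covers
about 34 microseconds without polling and a 4096-entry ring about 275
microseconds; how much of that is actually needed depends on the
per-packet cost mentioned above.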

  Marcelo

Thread overview: 6+ messages
2015-11-22 20:19 What's the benefit of large Rx rings? Yuval Mintz
2015-11-22 21:53 ` Alexander Duyck
2015-11-23  4:47   ` Yuval Mintz
2015-11-23 15:16     ` Alexander Duyck
2015-11-23 16:48       ` Marcelo Ricardo Leitner [this message]
2015-11-23 17:27 ` David Laight
