From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Zvi Effron <zeffron@riotgames.com>
Cc: rob.sherwood@gmail.com, bjorn.topel@gmail.com,
	aforster@cloudflare.com, justin.azoff@gmail.com,
	Xdp <xdp-newbies@vger.kernel.org>,
	magnus.karlsson@intel.com, brouer@redhat.com
Subject: Re: AF_XDP umem and jumbo frames?
Date: Sun, 7 Oct 2018 17:14:02 +0200
Message-ID: <20181007171402.3d25f9d9@redhat.com>
In-Reply-To: <CAC1LvL1wds7CL11xUhF_e0QU__C1mX4X69XoHOv95Y=1b4ZQug@mail.gmail.com>


On Fri, 5 Oct 2018 11:47:25 -0700 Zvi Effron <zeffron@riotgames.com> wrote:

> If the requirement is just for contiguous memory, could this be
> resolved with allowing the driver to request multiple contiguous 4KB
> pages instead of one higher order page? Does that reduce the cost? (Or
> does it actually increase it?)

Sorry, but your question does not make sense. A higher order page _is_
multiple contiguous 4KB pages. Thus, the answer is that it is the same.
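
To make the page-order arithmetic concrete, here is a minimal sketch,
assuming 4 KiB base pages as discussed in this thread. The helper names
(page_order, order_bytes) are made up for illustration and are not kernel
APIs; the kernel has its own get_order() helper for this. The sketch also
ignores the headroom and tailroom a real driver reserves in each buffer.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, as in the thread


def page_order(nbytes: int) -> int:
    """Return the smallest order such that PAGE_SIZE << order >= nbytes."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    order = 0
    while (PAGE_SIZE << order) < nbytes:
        order += 1
    return order


def order_bytes(order: int) -> int:
    """Size in bytes of one order-N page (2**N contiguous base pages)."""
    return PAGE_SIZE << order


if __name__ == "__main__":
    # Illustrative frame sizes only; real buffers also need headroom/tailroom.
    for frame in (64, 1500, 3050, 4096, 9000):
        o = page_order(frame)
        pages = order_bytes(o) // PAGE_SIZE
        print(f"{frame:>5} B frame -> order {o} "
              f"({order_bytes(o)} B = {pages} contiguous 4 KiB pages)")
```

Under these assumptions a 9000-byte frame needs an order-2 page, that is,
four physically contiguous 4KB pages, which is the same memory whether it
is requested as "one higher order page" or as "multiple contiguous pages".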

--Jesper

> On Thu, Oct 4, 2018 at 12:45 PM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> > On Thu, 4 Oct 2018 08:47:45 -0700
> > Rob Sherwood <rob.sherwood@gmail.com> wrote:
> >  
> > > [not speaking for my current employer, but just from past experience ]
> > >
> > > Certainly a lot of the 'hard' requirements (hard meaning - "without
> > > this it won't work")  I've seen could be served with a ~3k non-full
> > > jumbo frame.  
> >
> > Glad to hear that _most_ use-cases can be solved with a ~3k non-full
> > jumbo-frame.
> >  
> > > But at least what I've seen in the past was that because
> > > many of the host-side operations are per-packet limited (e.g., because
> > > of CPU or RAM, but ultimately turns into a max pps per host), a
> > > trivial way to increase application performance/reduce CPU for
> > > networking was to run at as large a frame size as possible.  For
> > > example, if your application/host is really pps limited, then getting
> > > the frame size to increase from 3k to 9k means either 3x more
> > > bandwidth for the same cpu usage (assuming the application is
> > > bandwidth limited) or 1/3x the CPU usage for the same bandwidth (if
> > > the application is not bandwidth limited).  Either way, IMHO it's a
> > > pretty big win.  
> >
> > With XDP we have basically solved the issue of being PPS (packets per
> > sec) limited.  And we can avoid these workarounds of using jumbo frames.
> > That is why it is a bit provoking to ask for jumbo-frames ;-)
> >
> >
> > People on this list might not realize that there is a significant
> > overhead in supporting frames larger than 4K for XDP, that is, larger
> > than one memory page. So let me explain...
> >
> > It is actually trivially easy for XDP to support jumbo frames, if the
> > NIC hardware supports storing RX frames into higher order pages (aka
> > compound pages, multiple 4K pages physically adjacent to each other),
> > which most HW does. (Page order0 = 4KB, order1=8KB, order2=16KB, order3=32KB).
> > XDP will then work out-of-the-box, as the requirement is really that
> > the packet payload is laid out in physically contiguous memory.
> >
> > The kernel page allocator can give us high-order pages, sure, but it costs
> > more, see slide 12 of [1].  The large jump at order-1 is because
> > order-0 has a Per-Cpu-Pages (PCP) cache.  From order-1 and above, the
> > page allocator goes through a central (per NUMA) lock, which makes
> > things even worse, as this does not scale to multiple CPUs.  And there
> > is also the point of wasting memory when processing 64Byte packets.
> > So, it is not 100% out of the picture that we could support jumbo-frames
> > for XDP.  Mostly because we can work around this cost/issue by having
> > recycle caches for these pages, which we even do for order-0 pages.
> > Hint, I actually left this door open, as you can specify page-order
> > when setting up the page_pool API in the driver...
> >
> > [1] http://people.netfilter.org/hawk/presentations/MM-summit2017/MM-summit2017-JesperBrouer.pdf
> >
> > --
> > Best regards,
> >   Jesper Dangaard Brouer
> >   MSc.CS, Principal Kernel Engineer at Red Hat
> >   LinkedIn: http://www.linkedin.com/in/brouer
> >
> >
> >
> >  
> > > On Thu, Oct 4, 2018 at 12:52 AM Jesper Dangaard Brouer
> > > <brouer@redhat.com> wrote:  
> > > >
> > > > On Thu, 4 Oct 2018 08:44:27 +0200
> > > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > > >  
> > > > > Den tors 27 sep. 2018 kl 02:56 skrev Rob Sherwood <rob.sherwood@gmail.com>:  
> > > > > >
> > > > > > Thanks for the reference and the page-per-packet point makes sense.
> > > > > > At the same time, not supporting jumbo frames seems like a non-trivial
> > > > > > limitation.  Are there a subset of drivers that do support jumbo
> > > > > > frames (or LRO or the other features that require multiple pages per
> > > > > > packet)?
> > > > > >  
> > > > >
> > > > > No, not at the moment. XDP has a strict "one frame cannot exceed a
> > > > > page" constraint. Everything that applies to XDP in terms of
> > > > > constraints, applies to AF_XDP as well.
> > > > >
> > > > > Just to clarify, XDP supports jumbo frames -- i.e. larger than 1500B
> > > > > payload, just not the maximum 9000B size. My personal observation is
> > > > > that many deployments that "require jumbo frames", are usually OK with
> > > > > an MTU of ~3000B. Jumbo frames, yes. Full jumbo frames, no. :-)  
> > > >
> > > > Thank you for clarifying that, Björn.
> > > >
> > > > Can Alex or Rob explain:
> > > >
> > > > (1) What is your use-case for wanting jumbo-frames?
> > > >
> > > > And (2) will an MTU of ~3000Bytes be sufficient? (which XDP does support)
> > > >
> > > >  
> > > > > > On Tue, Sep 25, 2018 at 9:44 AM Alex Forster <aforster@cloudflare.com> wrote:  
> > > > > > >  
> > > > > > > > On my test box running 4.18 if XDP is in use the MTU can not be
> > > > > > > > set higher than 3050.  
> > > > > > >
> > > > > > > Ah, that answers a few questions for me. Thanks!
> > > > > > >
> > > > > > > Alex Forster  
> > > >
> > > > --
> > > > Best regards,
> > > >   Jesper Dangaard Brouer
> > > >   MSc.CS, Principal Kernel Engineer at Red Hat
> > > >   LinkedIn: http://www.linkedin.com/in/brouer  
> >
> >  



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
