From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Justin Azoff <justin.azoff@gmail.com>
Cc: rob.sherwood@gmail.com, bjorn.topel@gmail.com,
    aforster@cloudflare.com, xdp-newbies@vger.kernel.org,
    magnus.karlsson@intel.com, brouer@redhat.com
Subject: Re: AF_XDP umem and jumbo frames?
Date: Sun, 7 Oct 2018 18:13:39 +0200
Message-ID: <20181007181339.2d94c762@redhat.com>
In-Reply-To: <CAJL276AAN2CcyV_vFV1CX7_8BHHoiP9w0z=GQL0imsC+4ujORA@mail.gmail.com>

On Fri, 5 Oct 2018 15:56:31 -0400
Justin Azoff <justin.azoff@gmail.com> wrote:
> > People on this list might not realize that there is a significant
> > overhead in supporting frames larger than 4K for XDP, that is, larger
> > than one memory page. So let me explain...
> >
> > It is actually trivially easy for XDP to support jumbo frames, if the
> > NIC hardware supports storing RX frames into higher-order pages (aka
> > compound pages: several 4K pages physically after each other), which
> > most HW does. (Page order0 = 4KB, order1=8KB, order2=16KB, order3=32KB).
> > XDP will then work out-of-the-box, as the requirement is really that
> > the packet payload is laid out as physically contiguous memory.
> >
>
> For the use cases of XDP_DROP or XDP_PASS, could XDP send as much of
> the packet as fits in a single page up to the eBPF program and allow
> decisions based on that?
>
> For the flow bypass, DDoS drop stuff, you only need the L3 header to
> make the PASS/DROP decision, not the entire packet.
>
> I suppose this would be a bit more complicated for modifying headers
> and using XDP_TX.

The key phrase in your question is "a bit more complicated": the
reasoning goes that with just a bit more complexity we can support
feature "X". But XDP is designed for performance, where every
nanosecond counts. Feature creep will slowly but surely kill this
performance edge.

I'll try to explain the overhead of jumbo frames again, from another
angle. XDP has gained performance up front by saying we don't support
jumbo frames. Instead of allocating 3x 4KB pages per RX packet, we
only need to allocate a single 4KB page. That in itself is a huge
performance win. Are you saying that you want a feature that is used
in 1-5% of use cases, and that in general is going to slow down the
baseline performance of XDP?

One thing I realize is that people on this list are perhaps not
familiar with how NIC RX (via DMA) works. On RX, we cannot know the RX
packet size up front. Thus, when filling the NIC RX-ring memory slots,
we have to allocate room for the worst case: e.g. 9000 bytes needs a
minimum of 3x4K=12K, and due to page-alloc limits (pages are allocated
in power-of-two orders) a minimum of 4x4K=16K. Thus, regardless of
packet length, the alloc size is the same. (I will not go into detail
on how different drivers try to reduce this memory overhead, but only
say that those tricks cost CPU cycles.)
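
The worst-case sizing of an RX slot can be checked with a small
sketch. This is plain Python for illustration, not kernel code. It
assumes 4096-byte pages and power-of-two page orders; the helper name
rx_buffer_order is made up for this example, and it ignores the
headroom and tailroom a real driver would add.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages (order 0)


def rx_buffer_order(max_frame_bytes, page_size=PAGE_SIZE):
    """Return (pages_needed, order, alloc_bytes) for a worst-case RX frame.

    Hypothetical helper for illustration; ignores headroom/tailroom.
    """
    # Round up to whole pages: 9000 bytes needs 3 pages.
    pages_needed = -(-max_frame_bytes // page_size)
    # Round the page count up to a power of two: 3 pages becomes order 2.
    order = (pages_needed - 1).bit_length()
    # An order-N allocation is page_size * 2**N bytes.
    alloc_bytes = page_size << order
    return pages_needed, order, alloc_bytes


for frame in (1500, 9000):
    pages, order, alloc = rx_buffer_order(frame)
    print(f"{frame}B max frame: {pages} page(s), order {order}, "
          f"{alloc} bytes per RX slot")
```

The allocation depends only on the configured maximum frame size, not
on the length of the packet that actually arrives in the slot.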

A last word on adding features to XDP: when adding features, I look
long and hard for ways that the feature checks can be pushed to setup
time, rather than runtime.
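
A toy sketch of the setup-time versus runtime idea, again in plain
Python and not a description of any real driver. The class RxQueue and
its methods are hypothetical, and the "frame must fit in one page" rule
is simplified (it ignores headroom and tailroom). The size limit is
validated once when the queue is configured, so the per-packet path
carries no size branch at all.

```python
PAGE_SIZE = 4096  # assumption: one 4 KiB page per RX frame


class RxQueue:
    """Toy RX queue; all names here are hypothetical, not a kernel API."""

    def __init__(self, mtu, page_size=PAGE_SIZE):
        # Setup-time check: runs once, when the queue is configured.
        # Simplification: ignores headroom/tailroom inside the page.
        if mtu > page_size:
            raise ValueError(
                f"MTU {mtu} does not fit in one {page_size}-byte page")
        self.mtu = mtu
        self.packets = 0
        self.bytes = 0

    def receive(self, frame_len):
        # Runtime path: no per-packet size check here, because setup
        # already guaranteed that every frame fits in a single page.
        self.packets += 1
        self.bytes += frame_len


queue = RxQueue(mtu=1500)
for length in (64, 1500, 512):
    queue.receive(length)
print(queue.packets, queue.bytes)

try:
    RxQueue(mtu=9000)  # rejected at setup, never reaches the fast path
except ValueError as err:
    print("setup rejected:", err)
```
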
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer