From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Thomas Graf <tgraf@suug.ch>,
Brenden Blanco <bblanco@plumgrid.com>,
John Fastabend <john.fastabend@gmail.com>,
Tom Herbert <tom@herbertland.com>,
Daniel Borkmann <daniel@iogearbox.net>,
"David S. Miller" <davem@davemloft.net>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
Or Gerlitz <ogerlitz@mellanox.com>,
brouer@redhat.com
Subject: Re: [RFC PATCH 1/5] bpf: add PHYS_DEV prog type for early driver filter
Date: Tue, 5 Apr 2016 10:11:23 +0200 [thread overview]
Message-ID: <20160405101123.4e128d13@redhat.com> (raw)
In-Reply-To: <20160405022507.GB80209@ast-mbp.thefacebook.com>
On Mon, 4 Apr 2016 19:25:08 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> On Tue, Apr 05, 2016 at 12:04:39AM +0200, Thomas Graf wrote:
> > On 04/04/16 at 01:00pm, Alexei Starovoitov wrote:
> > > Exactly. That the most important part of this rfc.
> > > Right now redirect to different queue, batching, prefetch and tons of
> > > other code are mising. We have to plan the whole project, so we can
> > > incrementally add features without breaking abi.
> > > So new IFLA, xdp_metadata struct and enum for bpf return codes are
> > > the main things to agree on.
> >
> > +1
> > This is the most important statement in this thread so far. A plan
> > that gets us from this RFC series to a functional forwarding engine
> > with redirect and load/write is essential. [...]
+1 agreed, I love to see the energy in this thread! :-)
> exactly. I think the next step 2 is to figure out the redirect return code
> and 'rewiring' of the rx dma buffer into tx ring and auto-batching.
>
> As this rfc showed even when using standard page alloc/free the peformance
> is hitting 10Gbps hw limit and not being cpu bounded, so recycling of
> the pages and avoiding map/unmap will come at step 3.
Yes, I've also noticed the standard page alloc/free performance is
slowing us down. I will be working on identifying (and measuring) page
allocator bottlenecks.
For the early drop case, we should be able to hack the driver to,
recycle the page directly (and even avoid the DMA unmap). But for the
TX (forward) case, we would need some kind of page-pool cache API to
recycle the page through (also useful for normal netstack usage).
I'm interested in implementing a generic page-pool cache mechanism,
and I plan to bring up this topic at the MM-summit in 2 weeks. Input
are welcome... but as Alexei says this is likely a step 3 project.
> Batching is necessary even for basic redirect, since ringing doorbell
> for every tx buffer is not an option.
Yes, we know TX batching is essential for performance. If we create a
RX bundle/batch step (in the driver) then eBFP forward step can work on
a RX-bundle and build up TX-bundle(s) (per TX device), that can TX bulk
send these. Notice that I propose building TX-bundles, not sending
each individual packet and flushing tail/doorbell, because I want to
maximize icache efficiency of the eBFP program.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
next prev parent reply other threads:[~2016-04-05 8:11 UTC|newest]
Thread overview: 62+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-02 1:21 [RFC PATCH 0/5] Add driver bpf hook for early packet drop Brenden Blanco
2016-04-02 1:21 ` [RFC PATCH 1/5] bpf: add PHYS_DEV prog type for early driver filter Brenden Blanco
2016-04-02 16:39 ` Tom Herbert
2016-04-03 7:02 ` Brenden Blanco
2016-04-04 22:07 ` Thomas Graf
2016-04-05 8:19 ` Jesper Dangaard Brouer
2016-04-04 8:49 ` Daniel Borkmann
2016-04-04 13:07 ` Jesper Dangaard Brouer
2016-04-04 13:36 ` Daniel Borkmann
2016-04-04 14:09 ` Tom Herbert
2016-04-04 15:12 ` Jesper Dangaard Brouer
2016-04-04 15:29 ` Brenden Blanco
2016-04-04 16:07 ` John Fastabend
2016-04-04 16:17 ` Brenden Blanco
2016-04-04 20:00 ` Alexei Starovoitov
2016-04-04 22:04 ` Thomas Graf
2016-04-05 2:25 ` Alexei Starovoitov
2016-04-05 8:11 ` Jesper Dangaard Brouer [this message]
2016-04-05 9:29 ` Jesper Dangaard Brouer
2016-04-05 22:06 ` Alexei Starovoitov
2016-04-04 14:33 ` Eric Dumazet
2016-04-04 15:18 ` Edward Cree
2016-04-02 1:21 ` [RFC PATCH 2/5] net: add ndo to set bpf prog in adapter rx Brenden Blanco
2016-04-02 1:21 ` [RFC PATCH 3/5] rtnl: add option for setting link bpf prog Brenden Blanco
2016-04-02 1:21 ` [RFC PATCH 4/5] mlx4: add support for fast rx drop bpf program Brenden Blanco
2016-04-02 2:08 ` Eric Dumazet
2016-04-02 2:47 ` Alexei Starovoitov
2016-04-04 14:57 ` Jesper Dangaard Brouer
2016-04-04 15:22 ` Eric Dumazet
2016-04-04 18:50 ` Alexei Starovoitov
2016-04-05 14:15 ` Or Gerlitz
2016-04-06 4:05 ` Brenden Blanco
2016-04-03 6:15 ` Brenden Blanco
2016-04-05 2:20 ` Brenden Blanco
2016-04-05 2:44 ` Eric Dumazet
2016-04-05 18:59 ` Eran Ben Elisha
2016-04-02 8:23 ` Jesper Dangaard Brouer
2016-04-03 6:11 ` Brenden Blanco
2016-04-04 18:27 ` Alexei Starovoitov
2016-04-05 6:04 ` Jesper Dangaard Brouer
2016-04-02 18:40 ` Johannes Berg
2016-04-03 6:38 ` Brenden Blanco
2016-04-04 7:35 ` Johannes Berg
2016-04-04 9:57 ` Daniel Borkmann
2016-04-04 18:46 ` Alexei Starovoitov
2016-04-04 21:01 ` Daniel Borkmann
2016-04-05 1:17 ` Alexei Starovoitov
2016-04-04 8:33 ` Jesper Dangaard Brouer
2016-04-04 9:22 ` Daniel Borkmann
2016-04-02 1:21 ` [RFC PATCH 5/5] Add sample for adding simple drop program to link Brenden Blanco
2016-04-06 19:48 ` Jesper Dangaard Brouer
2016-04-06 20:01 ` Jesper Dangaard Brouer
2016-04-06 23:11 ` Alexei Starovoitov
2016-04-06 20:03 ` Daniel Borkmann
2016-04-02 16:47 ` [RFC PATCH 0/5] Add driver bpf hook for early packet drop Tom Herbert
2016-04-03 5:41 ` Brenden Blanco
2016-04-04 7:48 ` Jesper Dangaard Brouer
2016-04-04 18:10 ` Alexei Starovoitov
2016-04-02 18:41 ` Johannes Berg
2016-04-02 22:57 ` Tom Herbert
2016-04-03 2:28 ` Lorenzo Colitti
2016-04-04 7:37 ` Johannes Berg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160405101123.4e128d13@redhat.com \
--to=brouer@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=bblanco@plumgrid.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=john.fastabend@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=ogerlitz@mellanox.com \
--cc=tgraf@suug.ch \
--cc=tom@herbertland.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).