From: Brenden Blanco <bblanco@plumgrid.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
davem@davemloft.net, netdev@vger.kernel.org,
Martin KaFai Lau <kafai@fb.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Ari Saha <as754m@att.com>, Or Gerlitz <gerlitz.or@gmail.com>,
john.fastabend@gmail.com, hannes@stressinduktion.org,
Thomas Graf <tgraf@suug.ch>, Tom Herbert <tom@herbertland.com>,
Daniel Borkmann <daniel@iogearbox.net>
Subject: Re: [PATCH v6 12/12] net/mlx4_en: add prefetch in xdp rx path
Date: Fri, 8 Jul 2016 09:49:40 -0700
Message-ID: <20160708164939.GA30632@gmail.com>
In-Reply-To: <1467961005.17638.28.camel@edumazet-glaptop3.roam.corp.google.com>
On Fri, Jul 08, 2016 at 08:56:45AM +0200, Eric Dumazet wrote:
> On Thu, 2016-07-07 at 21:16 -0700, Alexei Starovoitov wrote:
>
> > I've tried this style of prefetching in the past for normal stack
> > and it didn't help at all.
>
> This is very nice, but my experience showed opposite numbers.
> So I guess you did not choose the proper prefetch strategy.
>
> prefetching in mlx4 gave me good results, once I made sure our compiler
> was not moving the actual prefetch operations on x86_64 (ie forcing use
> of asm volatile as in x86_32 instead of the builtin prefetch). You might
> check if your compiler does the proper thing because this really hurt me
> in the past.
>
> In my case, I was using 40Gbit NIC, and prefetching 128 bytes instead of
> 64 bytes allowed to remove one stall in GRO engine when using TCP with
> TS (total header size : 66 bytes), or tunnels.
>
> The problem with prefetch is that it works well assuming a given rate
> (in pps), and given cpus, as prefetch behavior is varying among flavors.
>
> Brenden chose to prefetch N+3, based on some experiments, on some
> hardware,
>
> prefetch N+3 can actually slow down if you receive a moderate load,
> which is the case 99% of the time in typical workloads on modern servers
> with multi queue NIC.
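
For readers following the thread, the "prefetch N+3" idea discussed above can be sketched roughly as below. This is a userspace illustration only, not the mlx4 patch: struct rx_slot, prefetch_pkt(), rx_poll(), RING_SIZE and the 64-byte CACHE_LINE are all hypothetical names and assumptions. It uses the GCC/Clang __builtin_prefetch builtin, so it does not address the compiler-reordering concern Eric raises; the stride and the number of bytes to prefetch are left as parameters precisely because the right values depend on load and hardware.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */
#define RING_SIZE  8    /* hypothetical ring size, must be a power of two */

/* Hypothetical stand-in for an rx descriptor: a buffer pointer and length. */
struct rx_slot {
    const uint8_t *data;
    size_t len;
};

/* Hint that the first `bytes` of a buffer will be read soon,
 * issuing one prefetch per cache line. */
static inline void prefetch_pkt(const uint8_t *p, size_t bytes)
{
    for (size_t off = 0; off < bytes; off += CACHE_LINE)
        __builtin_prefetch(p + off, 0 /* read */, 3 /* high temporal locality */);
}

/* Walk `count` slots starting at `head`. While handling slot i, prefetch
 * the packet `stride` slots ahead (stride 0 disables prefetching).
 * Returns the sum of each packet's first byte as a stand-in for real work. */
static uint64_t rx_poll(const struct rx_slot *ring, unsigned head,
                        unsigned count, unsigned stride, size_t pf_bytes)
{
    uint64_t acc = 0;

    for (unsigned i = 0; i < count; i++) {
        if (stride && i + stride < count) {
            const struct rx_slot *ahead =
                &ring[(head + i + stride) & (RING_SIZE - 1)];
            /* Never compute addresses past the end of the buffer. */
            size_t n = pf_bytes < ahead->len ? pf_bytes : ahead->len;
            prefetch_pkt(ahead->data, n);
        }

        const struct rx_slot *cur = &ring[(head + i) & (RING_SIZE - 1)];
        if (cur->len)
            acc += cur->data[0];
    }
    return acc;
}
```

Calling rx_poll() with stride 0, 1 and 3, and with pf_bytes of 64 versus 128, gives a simple way to compare the variants mentioned in this thread on a given machine. The result is identical in every case; only the memory access timing differs.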

Thanks for the feedback Eric!

This particular patch in the series is meant to be standalone exactly
for this reason. I don't pretend to assert that this optimization will
work for everybody, or even for a future version of me with different
hardware. But, it passes my internal criteria for usefulness:
1. It provides a measurable gain in the experiments that I have at hand
2. The code is easy to review
3. The change does not negatively impact non-XDP users

I would love to have a solution for all mlx4 driver users, but this
patch set is focused on a different goal. So, without munging a
different set of changes for the universal use case, and probably
violating criteria #2 or #3, I went with what you see.

In hopes of not derailing the whole patch series, what is an actionable
next step for this patch #12?

Ideas:
- Pick a safer N? (I saw improvements with N=1 as well)
- Drop this patch?

One thing I definitely don't want to do is go into the weeds trying to
get a universal prefetch logic in order to merge the XDP framework, even
though I agree the net result would benefit everybody.

>
> This is why it was hard to upstream such changes, because they focus on
> max throughput instead of low latencies.
>
>
>