From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
Brenden Blanco <bblanco@plumgrid.com>,
davem@davemloft.net, netdev@vger.kernel.org,
Jamal Hadi Salim <jhs@mojatatu.com>,
Saeed Mahameed <saeedm@dev.mellanox.co.il>,
Martin KaFai Lau <kafai@fb.com>, Ari Saha <as754m@att.com>,
Or Gerlitz <gerlitz.or@gmail.com>,
john.fastabend@gmail.com, hannes@stressinduktion.org,
Thomas Graf <tgraf@suug.ch>, Tom Herbert <tom@herbertland.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Tariq Toukan <ttoukan.linux@gmail.com>,
Mel Gorman <mgorman@techsingularity.net>,
linux-mm <linux-mm@kvack.org>,
brouer@redhat.com
Subject: Re: order-0 vs order-N driver allocation. Was: [PATCH v10 07/12] net/mlx4_en: add page recycle to prepare rx ring for tx support
Date: Mon, 8 Aug 2016 10:01:15 +0200
Message-ID: <20160808100115.143d6ed3@redhat.com>
In-Reply-To: <20160808021525.GA81429@ast-mbp>
On Sun, 7 Aug 2016 19:15:27 -0700 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> On Fri, Aug 05, 2016 at 09:15:33AM +0200, Eric Dumazet wrote:
> > On Thu, 2016-08-04 at 18:19 +0200, Jesper Dangaard Brouer wrote:
> >
> > > I actually agree, that we should switch to order-0 allocations.
> > >
> > > *BUT* this will cause performance regressions on platforms with
> > > expensive DMA operations (as they no longer amortize the cost of
> > > mapping a larger page).
> >
> >
> > We much prefer reliable behavior, even if it is ~1 % slower than the
> > super-optimized thing that opens highways for attackers.
>
> +1
> It's more important to have deterministic performance at fresh boot
> and after long uptime when high order-N are gone.
Yes, exactly. High-order (order-N) page allocations might look good in
benchmarks on a freshly booted system, but once the page allocator gets
fragmented (after long uptime), the performance characteristics change.
(I discussed this with Christoph Lameter during MM-summit, and he has
seen issues with this kind of fragmentation in production.)
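
As a rough illustration of how this fragmentation shows up: /proc/buddyinfo
reports, per zone, the number of free blocks at each order (the first count
is order-0, the next order-1, and so on). The Python sketch below parses one
such line and counts how many free blocks are large enough for an order-N
request. The sample line and the helper names are made up for illustration;
real numbers would come from reading /proc/buddyinfo on the machine in
question.

```python
def parse_buddyinfo_line(line):
    """Return (node, zone, counts) for one /proc/buddyinfo line.

    counts[i] is the number of free blocks of order i.
    """
    fields = line.split()
    # Layout: "Node", "<id>,", "zone", "<name>", then one count per order.
    node = int(fields[1].rstrip(","))
    zone = fields[3]
    counts = [int(x) for x in fields[4:]]
    return node, zone, counts


def free_blocks_at_or_above(counts, order):
    """Number of free blocks whose order is >= `order`.

    Any such block can serve an order-`order` request, either directly
    or after the allocator splits it.
    """
    return sum(counts[order:])


# Hypothetical sample resembling a fragmented zone: plenty of small
# blocks, nothing left at order-5 or above.
sample = ("Node 0, zone   Normal   9512   3120    811     96      7"
          "      0      0      0      0      0      0")

node, zone, counts = parse_buddyinfo_line(sample)
print(zone, "blocks usable for order-0:", free_blocks_at_or_above(counts, 0))
print(zone, "blocks usable for order-5:", free_blocks_at_or_above(counts, 5))
```

In a zone that looks like this sample, an order-0 request is served
immediately, while an order-5 request cannot be met from the free lists
at all and has to wait for compaction or reclaim, or fail.
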
> > Anyway, in most cases pages are re-used, so we only call
> > dma_sync_single_range_for_cpu(), and there is no way to avoid this.
> >
> > Using order-0 pages [1] is actually faster, since when we use high-order
> > pages (multiple frames per 'page') we can not reuse the pages.
> >
> > [1] I had a local patch to allocate these pages using a very simple
> > allocator allocating max order (order-10) pages and splitting them into
> > order-0 pages, in order to lower TLB footprint. But I could not measure a
> > gain doing so on x86, at least on my lab machines.
>
> Which driver was that?
> I suspect that should indeed be the case for any driver that
> uses build_skb and <256 copybreak.
>
> Saeed,
> could you please share the performance numbers for mlx5 order-0 vs order-N ?
> You mentioned that there was some performance improvement. We need to know
> how much we'll lose when we turn off order-N.
I'm not sure the comparison will be "fair" with the mlx5 driver, because
(1) the order-N page mode (MPWQE) is a hardware feature, and (2) the
order-0 page mode is done "wrongly" (by preallocating SKBs together
with RX ring entries).
AFAIK MPWQE (Multi-Packet Work Queue Element), also called Striding RQ,
is a hardware feature for ConnectX4-Lx. Thus the need to support
two modes in the mlx5 driver.
Commit[1] 461017cb006a ("net/mlx5e: Support RX multi-packet WQE
(Striding RQ)") states this gives a 10-15% performance improvement for
netperf TCP stream (and ability to absorb bursty traffic).
[1] https://git.kernel.org/torvalds/c/461017cb006
The MPWQE mode uses order-5 pages. The critical question is: what
happens to the performance when order-5 allocations get slower (or
impossible) due to page fragmentation? (Note that the page allocator
uses a central lock for order-N pages.)
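
For scale: an order-N allocation is 2^N physically contiguous pages, so
order-5 means 32 contiguous pages. A minimal sketch of the arithmetic in
Python, assuming 4 KiB base pages (as on x86; other architectures may use
a different base page size). The function name is only illustrative.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, as on x86


def order_to_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of one order-`order` allocation (2**order pages)."""
    return page_size << order


for order in (0, 3, 5):
    pages = 1 << order
    kib = order_to_bytes(order) // 1024
    print(f"order-{order}: {pages} page(s), {kib} KiB contiguous")
```

Under that assumption each order-5 allocation needs 128 KiB of physically
contiguous free memory, which is what becomes scarce once the allocator
is fragmented.
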
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer