netdev.vger.kernel.org archive mirror
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Brenden Blanco <bblanco@plumgrid.com>,
	davem@davemloft.net, netdev@vger.kernel.org,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Saeed Mahameed <saeedm@dev.mellanox.co.il>,
	Martin KaFai Lau <kafai@fb.com>, Ari Saha <as754m@att.com>,
	Or Gerlitz <gerlitz.or@gmail.com>,
	john.fastabend@gmail.com, hannes@stressinduktion.org,
	Thomas Graf <tgraf@suug.ch>, Tom Herbert <tom@herbertland.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Tariq Toukan <ttoukan.linux@gmail.com>,
	brouer@redhat.com, Mel Gorman <mgorman@techsingularity.net>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: order-0 vs order-N driver allocation. Was: [PATCH v10 07/12] net/mlx4_en: add page recycle to prepare rx ring for tx support
Date: Thu, 4 Aug 2016 18:19:13 +0200
Message-ID: <20160804181913.26ee17b9@redhat.com>
In-Reply-To: <20160803174107.GA38399@ast-mbp.thefacebook.com>


On Wed, 3 Aug 2016 10:45:13 -0700 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Mon, Jul 25, 2016 at 09:35:20AM +0200, Eric Dumazet wrote:
> > On Tue, 2016-07-19 at 12:16 -0700, Brenden Blanco wrote:  
> > > The mlx4 driver by default allocates order-3 pages for the ring to
> > > consume in multiple fragments. When the device has an xdp program, this
> > > behavior will prevent tx actions since the page must be re-mapped in
> > > TODEVICE mode, which cannot be done if the page is still shared.
> > > 
> > > Start by making the allocator configurable based on whether xdp is
> > > running, such that order-0 pages are always used and never shared.
> > > 
> > > Since this will stress the page allocator, add a simple page cache to
> > > each rx ring. Pages in the cache are left dma-mapped, and in drop-only
> > > stress tests the page allocator is eliminated from the perf report.
> > > 
> > > Note that setting an xdp program will now require the rings to be
> > > reconfigured.  
> > 
> > Again, this has nothing to do with XDP ?
> > 
> > Please submit a separate patch, switching this driver to order-0
> > allocations.
> > 
> > I mentioned this order-3 vs order-0 issue earlier [1], and proposed to
> > send a generic patch, but I have been traveling lately and am currently
> > on vacation.
> > 
> > order-3 pages are problematic when dealing with hostile traffic anyway,
> > so we should exclusively use order-0 pages, and page recycling like
> > Intel drivers.
> > 
> > [1] http://lists.openwall.net/netdev/2016/04/11/88
> 
> Completely agree. These multi-page tricks work only for benchmarks and
> not for production.
> Eric, if you can submit that patch for mlx4 that would be awesome.
> 
> I think we should default to order-0 for both mlx4 and mlx5.
> Alternatively we're thinking of adding a netlink or ethtool switch to
> preserve the old behavior, but frankly I don't see who needs these
> order-N allocation schemes.

I actually agree that we should switch to order-0 allocations.

*BUT* this will cause performance regressions on platforms with
expensive DMA operations (as they no longer amortize the cost of
mapping a larger page).

Plus, the base cost of an order-0 page is 246 cycles (see [1] slide#9),
and the 10G wirespeed target is approx 201 cycles per packet.  Thus, for
these speeds some page recycling tricks are needed.  I described how the
Intel drivers do a cool trick in [1] slide#14, but it does not address
the DMA part and costs some extra atomic ops.
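
For reference, the 201-cycle figure can be reproduced with a quick
back-of-the-envelope calculation.  This is only a sketch: the
minimum-size frames and the 3 GHz CPU clock are assumptions that are
not stated in this mail, and the constant and function names are made
up for illustration.

```python
# Per-packet cycle budget at 10 Gbit/s wirespeed.
# ASSUMPTIONS (not from the mail): minimum-size Ethernet frames, 3 GHz CPU.

LINK_BPS = 10_000_000_000   # 10 Gbit/s link
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including FCS
WIRE_OVERHEAD_BYTES = 20    # 7 preamble + 1 SFD + 12 inter-frame gap
CPU_HZ = 3_000_000_000      # assumed CPU clock
ORDER0_PAGE_CYCLES = 246    # order-0 page cost quoted in the mail


def packets_per_second(link_bps, frame_bytes=MIN_FRAME_BYTES):
    """Maximum packet rate for a given frame size on the wire."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def cycles_per_packet(link_bps, cpu_hz=CPU_HZ):
    """CPU cycles available per packet at full packet rate."""
    return cpu_hz / packets_per_second(link_bps)


pps = packets_per_second(LINK_BPS)
budget = cycles_per_packet(LINK_BPS)
print(f"{pps / 1e6:.2f} Mpps, {budget:.1f} cycles per packet")
# prints "14.88 Mpps, 201.6 cycles per packet"
print("order-0 page cost exceeds budget:", ORDER0_PAGE_CYCLES > budget)
```

Under these assumptions a single order-0 page operation per packet
already costs more than the whole per-packet budget, which is why some
form of recycling is needed.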

I started coding on the page-pool last week, which addresses both the
DMA mapping and recycling (with fewer atomic ops).  (P.S. I'm still on
vacation this week.)
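
To illustrate the general idea of recycling pages while leaving them
DMA-mapped, here is a toy userspace model.  It is not the page-pool
code and not the mlx4 patch; the class, its counters and the drop-test
helper are all hypothetical names, and "DMA mapping" is just a counter.

```python
# Toy model of a per-ring page recycle cache (hypothetical, illustration only).
# Pages kept in the cache stay "DMA-mapped", so a recycled page costs
# neither a page allocation nor a map/unmap.

class PageCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.free = []        # recycled pages, still mapped
        self.allocs = 0       # trips to the (simulated) page allocator
        self.dma_maps = 0     # simulated DMA map operations
        self.dma_unmaps = 0   # simulated DMA unmap operations

    def get(self):
        """Hand out a mapped page, preferring a recycled one."""
        if self.free:
            return self.free.pop()
        self.allocs += 1
        self.dma_maps += 1
        return {"id": self.allocs, "mapped": True}

    def put(self, page):
        """Return a page; keep it mapped if the cache has room."""
        if len(self.free) < self.capacity:
            self.free.append(page)
        else:
            page["mapped"] = False
            self.dma_unmaps += 1


def run_drop_test(cache, packets):
    """Drop-only workload: every page comes straight back to the cache."""
    for _ in range(packets):
        cache.put(cache.get())


cache = PageCache(capacity=8)
run_drop_test(cache, 1000)
print(cache.allocs, cache.dma_maps, cache.dma_unmaps)
```

In this drop-only toy workload the allocator and the mapping step are
each hit once, no matter how many packets are processed.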

[1] http://people.netfilter.org/hawk/presentations/MM-summit2016/generic_page_pool_mm_summit2016.pdf

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
