From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Alexander Lobakin <aleksander.lobakin@intel.com>,
	 Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: intel-wired-lan@lists.osuosl.org,
	 Tony Nguyen <anthony.l.nguyen@intel.com>,
	 "David S. Miller" <davem@davemloft.net>,
	 Eric Dumazet <edumazet@google.com>,
	 Jakub Kicinski <kuba@kernel.org>,
	 Paolo Abeni <pabeni@redhat.com>,
	 Mina Almasry <almasrymina@google.com>,
	 nex.sw.ncis.osdt.itp.upstreaming@intel.com,
	 netdev@vger.kernel.org,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH iwl-next 11/12] idpf: convert header split mode to libeth + napi_build_skb()
Date: Thu, 20 Jun 2024 12:29:23 -0400
Message-ID: <667458e38c879_2b190d294f5@willemb.c.googlers.com.notmuch>
In-Reply-To: <c38e22b5-090c-4e9f-80aa-37806aed5eaa@intel.com>

Alexander Lobakin wrote:
> From: Willem De Bruijn <willemdebruijn.kernel@gmail.com>
> Date: Mon, 17 Jun 2024 14:13:07 -0400
> 
> > Alexander Lobakin wrote:
> >> From: Willem De Bruijn <willemdebruijn.kernel@gmail.com>
> >> Date: Thu, 30 May 2024 09:46:46 -0400
> >>
> >>> Alexander Lobakin wrote:
> >>>> Currently, idpf uses the following model for the header buffers:
> >>>>
> >>>> * buffers are allocated via dma_alloc_coherent();
> >>>> * when receiving, napi_alloc_skb() is called and then the header is
> >>>>   copied to the newly allocated linear part.
> >>>>
> >>>> This is far from optimal, as the DMA coherent zone is slow on many
> >>>> systems and the memcpy() defeats the idea and benefits of the header
> >>>> split.
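> >>>>
> >>>> A minimal sketch of that model (simplified; the names are
> >>>> illustrative, not the actual idpf code):
> >>>>
> >>>>	/* at ring init: header buffers carved from one coherent block */
> >>>>	hdr_buf = dma_alloc_coherent(dev, hdr_buf_size * ring_len,
> >>>>				     &hdr_buf_dma, GFP_KERNEL);
> >>>>
> >>>>	/* on Rx: allocate a fresh skb and copy the header into it */
> >>>>	skb = napi_alloc_skb(napi, hdr_len);
> >>>>	if (unlikely(!skb))
> >>>>		return NULL;
> >>>>	memcpy(__skb_put(skb, hdr_len),
> >>>>	       hdr_buf + idx * hdr_buf_size, hdr_len);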
> >>>
> >>> In the previous revision this assertion was called out, as we have
> >>> lots of experience with the existing implementation and with a
> >>> previous one, based on dynamic allocation, that performed much
> >>> worse. You would
> >> napi_build_skb() is not a dynamic allocation. On the contrary,
> >> napi_alloc_skb() from the current implementation actually *is* a
> >> dynamic allocation: it allocates a page frag for every header buffer
> >> each time.
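> >>
> >> Roughly, the contrast looks like this (a hand-wavy sketch, not the
> >> actual driver code; hdr_va and truesize stand for the Page Pool frag
> >> we already own and its rounded-up size):
> >>
> >>	/* current: a new sk_buff *and* a fresh page frag on every call */
> >>	skb = napi_alloc_skb(napi, hdr_len);
> >>
> >>	/* this series: only the sk_buff comes from the NAPI cache, the
> >>	 * data area is the buffer Page Pool already gave us
> >>	 */
> >>	skb = napi_build_skb(hdr_va, truesize);
> >>	skb_mark_for_recycle(skb);	/* frag goes back to the pool later */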
> >>
> >> Page Pool refills header buffers from its pool of recycled frags.
> >> Plus, on x86_64, the truesize of a header buffer is 1024, meaning it
> >> picks a new page from the pool only every 4th buffer. While testing
> >> common workloads, I saw literally zero new page allocations, as the
> >> skb core recycles frags from skbs back to the pool.
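> >>
> >> As a back-of-the-envelope illustration (a hypothetical refill loop,
> >> assuming 4 KiB pages, i.e. 4096 / 1024 = 4 header buffers per page):
> >>
> >>	page = page_pool_alloc_frag(pool, &offset, 1024, GFP_ATOMIC);
> >>	/* only every 4th call takes a page off the pool, and that page
> >>	 * normally comes from the recycle ring, not the page allocator
> >>	 */
> >>	buf->va = page_address(page) + offset;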
> >>
> >> IOW, the current version you're defending actually performs more
> >> dynamic allocations on the hotpath than this one ¯\_(ツ)_/¯
> >>
> >> (I explained all this several times already)
> >>
> >>> share performance numbers in the next revision
> >>
> >> I can't share raw numbers publicly, only percentages.
> >>
> >> I shared the before/after percentages in the cover letter. Every test
> >> yielded more Mpps after this change, especially the non-XDP_PASS
> >> ones, where there's no networking stack overhead.
> > 
> > This is the main concern: AF_XDP has no existing users, but TCP/IP is
> > used in production environments, so we cannot risk TCP/IP regressions
> > in favor of somewhat faster AF_XDP. A secondary point is that a
> > functional AF_XDP implementation soon, with optimizations later, is
> > preferable to the fastest solution later.
> 
> I have perf numbers before-after for all the common workloads and I see
> only improvements there.

Good. That was the request, and not only from me, as a reminder.

> Do you have any to prove that this change
> introduces regressions?

I have no data yet. We can run some tests on your github series too.


Thread overview: 35+ messages
2024-05-28 13:48 [PATCH iwl-next 00/12] idpf: XDP chapter I: convert Rx to libeth Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 01/12] libeth: add cacheline / struct alignment helpers Alexander Lobakin
2024-05-30  1:34   ` Jakub Kicinski
2024-06-12 10:07     ` Przemek Kitszel
2024-06-12 20:55       ` Jakub Kicinski
2024-06-13 10:47     ` Alexander Lobakin
2024-06-13 13:47       ` Jakub Kicinski
2024-05-28 13:48 ` [PATCH iwl-next 02/12] idpf: stop using macros for accessing queue descriptors Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 03/12] idpf: split &idpf_queue into 4 strictly-typed queue structures Alexander Lobakin
2024-06-01  8:53   ` Simon Horman
2024-06-13 11:03     ` Alexander Lobakin
2024-06-15  7:32       ` Simon Horman
2024-05-28 13:48 ` [PATCH iwl-next 04/12] idpf: avoid bloating &idpf_q_vector with big %NR_CPUS Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 05/12] idpf: strictly assert cachelines of queue and queue vector structures Alexander Lobakin
     [not found]   ` <b25cab15-f73c-4df8-bfca-434a8f717a31@intel.com>
2024-06-12 13:03     ` [Intel-wired-lan] " Alexander Lobakin
2024-06-12 13:08       ` Alexander Lobakin
2024-06-12 22:42         ` Jacob Keller
2024-06-12 22:40       ` Jacob Keller
2024-05-28 13:48 ` [PATCH iwl-next 06/12] idpf: merge singleq and splitq &net_device_ops Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 07/12] idpf: compile singleq code only under default-n CONFIG_IDPF_SINGLEQ Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 08/12] idpf: reuse libeth's definitions of parsed ptype structures Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 09/12] idpf: remove legacy Page Pool Ethtool stats Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 10/12] libeth: support different types of buffers for Rx Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 11/12] idpf: convert header split mode to libeth + napi_build_skb() Alexander Lobakin
2024-05-30  1:40   ` Jakub Kicinski
2024-06-13 10:58     ` Alexander Lobakin
2024-05-30 13:46   ` Willem de Bruijn
2024-06-17 11:06     ` Alexander Lobakin
2024-06-17 18:13       ` Willem de Bruijn
2024-06-20 12:46         ` Alexander Lobakin
2024-06-20 16:29           ` Willem de Bruijn [this message]
2024-05-28 13:48 ` [PATCH iwl-next 12/12] idpf: use libeth Rx buffer management for payload buffer Alexander Lobakin
2024-06-01  9:08   ` Simon Horman
2024-06-13 11:05     ` Alexander Lobakin
2024-06-15  7:35       ` Simon Horman
