From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: <intel-wired-lan@lists.osuosl.org>,
Tony Nguyen <anthony.l.nguyen@intel.com>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>,
Mina Almasry <almasrymina@google.com>,
<nex.sw.ncis.osdt.itp.upstreaming@intel.com>,
<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH iwl-next 11/12] idpf: convert header split mode to libeth + napi_build_skb()
Date: Thu, 13 Jun 2024 12:58:48 +0200 [thread overview]
Message-ID: <bb1156fa-41eb-473b-bbdb-975765008d13@intel.com> (raw)
In-Reply-To: <20240529184012.5e999a93@kernel.org>
From: Jakub Kicinski <kuba@kernel.org>
Date: Wed, 29 May 2024 18:40:12 -0700
> On Tue, 28 May 2024 15:48:45 +0200 Alexander Lobakin wrote:
>> Currently, idpf uses the following model for the header buffers:
>>
>> * buffers are allocated via dma_alloc_coherent();
>> * when receiving, napi_alloc_skb() is called and then the header is
>> copied to the newly allocated linear part.
>>
>> This is far from optimal, as the DMA coherent zone is slow on many
>> systems and the memcpy() negates the whole idea and benefit of header
>> split. Not to mention that XDP can't be run on DMA coherent buffers,
>> and allocating an skb just to run an XDP program is a bad idea anyway.
>> Instead, use libeth to create page_pools for the header buffers, allocate
>> them dynamically and then build an skb via napi_build_skb() around them
>> with no memory copy. With one exception...
>> When you enable header split, you except you'll always have a separate
>
> accept
"expect" :D Thanks for spotting, nice catch.
>
>> header buffer, so that you can reserve headroom and tailroom only
>> there and then use full buffers for the data. For example, this is how
>> TCP zerocopy works -- the payload has to be aligned to PAGE_SIZE.
>> The current hardware running idpf does *not* guarantee that you'll
>> always have headers placed separately. For example, on my setup, even
>> ICMP packets are written as one piece to the data buffers. You can't
>> build a valid skb around a data buffer in this case.
>> To not complicate things and not lose TCP zerocopy etc., when this
>> happens, use the empty header buffer and pull either the full frame
>> (if it's short) or just the Ethernet header there, then build an skb
>> around it. The GRO layer will pull more from the data buffer later.
>> This W/A will hopefully be removed one day.
>
> Hopefully soon, cause it will prevent you from mapping data buffers to
> user space or using DMABUF memory :(
Correct. The HW team has been informed and some work on it is happening
right now. I told them that stuff like devmem, i.e. setups where the
kernel doesn't have access to the data buffers, will simply choke on this.
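
Roughly, the fallback looks like this (a heavily simplified sketch; the
names here are made up for illustration, not the actual idpf ones):

#include <linux/etherdevice.h>
#include <linux/skbuff.h>

/* No-split fallback: the HW wrote the whole frame to the data buffer,
 * so pull (at least) the Ethernet header into the empty header buffer
 * and build the skb around that one instead.
 */
static struct sk_buff *rx_build_skb_no_split(void *hdr_buf, u32 hdr_truesize,
					     u32 hdr_room, const void *data_buf,
					     u32 len)
{
	struct sk_buff *skb;
	/* Pull the full frame if it fits the header buffer, otherwise
	 * only the Ethernet header; GRO pulls more from the data buffer
	 * later.
	 */
	u32 copy = len <= hdr_room ? len : ETH_HLEN;

	/* This memcpy() is exactly what breaks for devmem / user-mapped
	 * data buffers: it needs CPU access to the payload page.
	 */
	memcpy(hdr_buf + NET_SKB_PAD, data_buf, copy);

	skb = napi_build_skb(hdr_buf, hdr_truesize);
	if (unlikely(!skb))
		return NULL;

	skb_mark_for_recycle(skb);	/* header buffer is page_pool-backed */
	skb_reserve(skb, NET_SKB_PAD);
	__skb_put(skb, copy);

	return skb;
}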
I mean, I don't care much about unknown packet types, since they're very
unlikely and almost always garbage, nor about headers that don't fit
into 256 bytes (I can't imagine such a situation with known protocols,
but if it can happen, I can bump the buffer to 512 or so). But
currently, and it's a shame, idpf does header split only for
TCP/UDP/SCTP; on my setup it didn't want to split even regular ICMP,
although it parsed the type correctly =\ I dunno why they did it this
way.
There are some configuration bits which in theory allow enabling hsplit
for other frame types, but setting them didn't change anything,
unfortunately.
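
For contrast, when the HW does split the headers properly, the path is a
plain copy-free napi_build_skb() (same disclaimer: simplified sketch
with made-up names):

#include <linux/skbuff.h>

/* Normal header-split path: the header already sits in a page_pool-backed
 * header buffer, so build the skb directly around it with no copy; the
 * payload buffer is attached as a frag afterwards.
 */
static struct sk_buff *rx_build_skb_split(void *hdr_buf, u32 hdr_truesize,
					  u32 hdr_len)
{
	struct sk_buff *skb = napi_build_skb(hdr_buf, hdr_truesize);

	if (unlikely(!skb))
		return NULL;

	/* Let kfree_skb() & co. return the buffer to its page_pool */
	skb_mark_for_recycle(skb);
	skb_reserve(skb, NET_SKB_PAD);
	__skb_put(skb, hdr_len);

	return skb;
}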
Thanks,
Olek
Thread overview: 35+ messages
2024-05-28 13:48 [PATCH iwl-next 00/12] idpf: XDP chapter I: convert Rx to libeth Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 01/12] libeth: add cacheline / struct alignment helpers Alexander Lobakin
2024-05-30 1:34 ` Jakub Kicinski
2024-06-12 10:07 ` Przemek Kitszel
2024-06-12 20:55 ` Jakub Kicinski
2024-06-13 10:47 ` Alexander Lobakin
2024-06-13 13:47 ` Jakub Kicinski
2024-05-28 13:48 ` [PATCH iwl-next 02/12] idpf: stop using macros for accessing queue descriptors Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 03/12] idpf: split &idpf_queue into 4 strictly-typed queue structures Alexander Lobakin
2024-06-01 8:53 ` Simon Horman
2024-06-13 11:03 ` Alexander Lobakin
2024-06-15 7:32 ` Simon Horman
2024-05-28 13:48 ` [PATCH iwl-next 04/12] idpf: avoid bloating &idpf_q_vector with big %NR_CPUS Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 05/12] idpf: strictly assert cachelines of queue and queue vector structures Alexander Lobakin
[not found] ` <b25cab15-f73c-4df8-bfca-434a8f717a31@intel.com>
2024-06-12 13:03 ` [Intel-wired-lan] " Alexander Lobakin
2024-06-12 13:08 ` Alexander Lobakin
2024-06-12 22:42 ` Jacob Keller
2024-06-12 22:40 ` Jacob Keller
2024-05-28 13:48 ` [PATCH iwl-next 06/12] idpf: merge singleq and splitq &net_device_ops Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 07/12] idpf: compile singleq code only under default-n CONFIG_IDPF_SINGLEQ Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 08/12] idpf: reuse libeth's definitions of parsed ptype structures Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 09/12] idpf: remove legacy Page Pool Ethtool stats Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 10/12] libeth: support different types of buffers for Rx Alexander Lobakin
2024-05-28 13:48 ` [PATCH iwl-next 11/12] idpf: convert header split mode to libeth + napi_build_skb() Alexander Lobakin
2024-05-30 1:40 ` Jakub Kicinski
2024-06-13 10:58 ` Alexander Lobakin [this message]
2024-05-30 13:46 ` Willem de Bruijn
2024-06-17 11:06 ` Alexander Lobakin
2024-06-17 18:13 ` Willem de Bruijn
2024-06-20 12:46 ` Alexander Lobakin
2024-06-20 16:29 ` Willem de Bruijn
2024-05-28 13:48 ` [PATCH iwl-next 12/12] idpf: use libeth Rx buffer management for payload buffer Alexander Lobakin
2024-06-01 9:08 ` Simon Horman
2024-06-13 11:05 ` Alexander Lobakin
2024-06-15 7:35 ` Simon Horman