From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: <netdev@vger.kernel.org>, <bpf@vger.kernel.org>,
<magnus.karlsson@intel.com>, <stfomichev@gmail.com>,
<kuba@kernel.org>, <pabeni@redhat.com>, <horms@kernel.org>,
<bjorn@kernel.org>, <lorenzo@kernel.org>, <toke@redhat.com>,
Liang Chen <liangchen.linux@gmail.com>,
Yunsheng Lin <linyunsheng@huawei.com>,
huangjie.albert <huangjie.albert@bytedance.com>
Subject: Re: [PATCH RFC net-next 0/4] xdp: reuse generic skb XDP handling for veth
Date: Fri, 15 May 2026 20:18:54 +0200 [thread overview]
Message-ID: <agdjjtlXxlzy6Mx5@boxer> (raw)
In-Reply-To: <d784e8d3-e6af-4437-93f2-3103720b8454@kernel.org>
On Thu, May 14, 2026 at 07:13:07AM +0200, Jesper Dangaard Brouer wrote:
>
>
> On 09/05/2026 10.48, Maciej Fijalkowski wrote:
> > Hi,
> >
> > this series is an attempt to make skb-backed XDP handling in veth use
> > the generic skb XDP machinery instead of converting skb-backed packets
> > into xdp_frames for XDP_TX and XDP_REDIRECT.
>
> I support this idea. Thanks for working on this Maciej.
Hi Jesper, good to read!
>
> I basically proposed the same back in Aug 2023 [1]. My patchset [2] was
> motivated by a 23.5% performance improvement (see [3]) that comes from
> avoiding reallocating most veth packets when XDP is loaded.
>
> Please read veth_benchmark04.org [4] as it documents a number of
> pitfalls, and highlights that the main trick to avoid reallocation is
> changing the net_device needed_headroom (when XDP is loaded).
Thanks, will do.
>
> [1] https://lore.kernel.org/all/169272715407.1975370.3989385869434330916.stgit@firesoul/
> [2] https://lore.kernel.org/all/169272709850.1975370.16698220879817216294.stgit@firesoul/
> [3] https://github.com/xdp-project/xdp-project/blob/main/areas/core/veth_benchmark03.org
> [4] https://github.com/xdp-project/xdp-project/blob/main/areas/core/veth_benchmark04.org
>
>
> > This was brought up by
> > Jakub during review of previous work, which was focused on addressing
> > splats from AF_XDP selftests shrinking multi-buffer packet:
> >
> > https://lore.kernel.org/bpf/20251029221315.2694841-1-maciej.fijalkowski@intel.com/
> >
> > veth currently has two different XDP input paths:
> > - xdp_frame input, coming through ndo_xdp_xmit(), for example from
> > devmap redirects; and
> > - skb input, coming through ndo_start_xmit(), for example from the
> > regular networking stack, pktgen, or other skb producers.
> >
> > The xdp_frame path is naturally frame-based and this series keeps it on
> > the existing veth xdp_frame handling path. The skb path, however, is
> > still fundamentally skb-backed, but today veth converts it into an
> > xdp_frame for XDP_TX/XDP_REDIRECT. That conversion is awkward and has
> > been pointed out as undesirable before, because veth has to pin skb data,
> > construct an xdp_frame view of it, and then restore/massage XDP memory
> > metadata around that conversion.
> >
>
> Yes, I really dislike the veth approach of stealing the SKB "head"/data
> page by bumping the page refcnt directly. It is awkward, as you say.
> I would be happy to see that code improved.
>
> > This series takes a different approach: skb-backed veth XDP now reuses
> > the generic skb XDP execution path. veth provides its own xdp_buff,
> > xdp_rxq_info and page_pool to the generic XDP helper, so the generic code
> > can still perform the required skb COW using veth's page_pool when the
> > skb head/frags cannot be used directly. XDP_TX then uses generic_xdp_tx()
> > and XDP_REDIRECT uses xdp_do_generic_redirect(), while the xdp_frame
> > input path keeps using the existing veth xdp_frame TX/redirect handling.
>
> I also used this approach in my patchset, so I like it.
>
> https://lore.kernel.org/all/169272715407.1975370.3989385869434330916.stgit@firesoul/
>
> > The problem this series tries to address more generally is that
> > skb-backed generic XDP can end up with memory whose provenance is not
> > described correctly by a single queue-level MEM_TYPE_PAGE_SHARED value.
> > When skb is COWed the underlying memory is page_pool backed but current
> > logic does not respect it.
> >
> > For that reason the series introduces MEM_TYPE_PAGE_POOL_OR_SHARED. This
> > type is not bound to a single registered page_pool allocator. Instead,
> > the return path inspects the individual netmem and dispatches either to
> > the page_pool return path or to page_frag_free(). This lets generic
> > skb-backed XDP handle mixed page_pool/page-shared memory without
> > mutating rxq->mem.type per packet.
>
> I'm unsure about the introduction of MEM_TYPE_PAGE_POOL_OR_SHARED as an
> ambiguous memory type. In [5] I considered adding two new memory types,
> MEM_TYPE_KMALLOC_SKB and MEM_TYPE_SKB_SMALL_HEAD_CACHE, that __xdp_return
> would handle, but I labeled it as an "uncertain approach" myself.
I assume your hacks were done on top of veth without a page_pool being used
for the reallocations? Otherwise I need to understand your reasoning better.
I had a bit of a shattered week, so expect a response on Monday.
Just as a reminder, I need this to be fixed, as currently the AF_XDP test
suite splats heavily when we hit skb_pp_cow_data() calls and
__xdp_return() still sees MEM_TYPE_PAGE_SHARED when a packet is shrunk via
bpf_xdp_adjust_tail().
It also seems all parties agree (per Jakub's response) that we should come
up with common conditions for taking the conversion path (and bump veth
headroom). To be fully transparent, I assume we should include Toke in the
discussion, as you two discussed this back on your RFC.
Thanks,
Maciej
>
> [5] https://github.com/xdp-project/xdp-project/blob/main/areas/core/veth_benchmark04.org#uncertain-approach
>
>
> > The veth part also removes the old rq->xdp_mem juggling. For incoming
> > xdp_frames, veth now uses a local rxq view whose mem.type is taken from
> > frame->mem_type. This preserves the frame's original memory type for
> > XDP_TX/XDP_REDIRECT without overwriting the persistent rq->xdp_rxq memory
> > model used by skb-backed generic XDP.
> >
> > One visible datapath change is that skb-backed veth XDP_TX no longer
> > uses veth's xdp_frame bulk queue. It now follows the generic skb XDP_TX
> > path. The xdp_frame-originated path is unchanged and still uses the
> > existing veth bulk path. The old skb-backed path had batching, but it
> > also paid the cost of converting skb-backed packets into xdp_frames. The
> > new path removes that conversion and keeps skb-backed packets on the
> > generic skb XDP path. From my local tests, which consisted of
> > pktgen + xdp_bench, I did not notice any major performance regressions;
> > however, Lorenzo and Jesper might disagree here, hence the RFC status. I
> > am fed up with internal wars with Sashiko, so I would be pleased to get
> > some human feedback.
> >
>
> My benchmarking in the documents above showed that the needed_headroom
> change was a bigger performance boost than losing the batching.
>
> For a long time, I have considered adding batching to the generic-XDP code
> path, simply via the SKB-list trick. I did an experiment doing this in the
> past, and my benchmarking showed a 30% TX performance boost, because
> xmit_more takes effect. Given people are not supposed to use generic-XDP
> if they care about performance, I never followed up. If veth starts to use
> this generic redirect code path, then it makes sense to do this. In
> production we have code that does XDP redirect of SKBs into veth (I have
> recommended redirecting native xdp_frames instead, but because traffic
> first needs to travel through some DDoS filters in iptables, it cannot be
> done without first moving those filters into XDP).
>
>
> > In particular, let's discuss:
> >
> > - whether MEM_TYPE_PAGE_POOL_OR_SHARED is an acceptable way to describe
> > skb-backed generic XDP memory that may be page_pool-backed after COW
> > or ordinary page-shared memory otherwise;
> >
> > - whether passing caller-provided xdp_buff/rxq/page_pool context into
> > the generic skb XDP helper is the right API shape;
> >
> > - whether letting veth provide its own page_pool to generic XDP is
> > acceptable for avoiding the old skb->xdp_frame conversion; could
> > veth just piggy-back on the system page_pool? Even though it could,
> > we would still need the xdp_buff to be passed (for metadata) and other
> > refactors that allow veth to bump stats and do the redirect flush;
> >
> > - whether skb-backed veth XDP_TX using generic_xdp_tx(), while keeping
> > xdp_frame-originated traffic on the existing veth bulk path, is an
> > acceptable split;
> >
> > - the INDIRECT_CALL for taking the COW path; I wanted to preserve
> > existing behavior, but is it really needed, or would it be
> > possible to come up with conditions that cover both the generic
> > XDP path and veth?
> >
> > FWIW I do like the end result on veth side. If I missed CCing someone,
> > mea culpa.
>
> Added Cc
> - Liang Chen <liangchen.linux@gmail.com>
> - Yunsheng Lin <linyunsheng@huawei.com>
> - huangjie.albert@bytedance.com
>
> >
> > Thanks,
> > Maciej
> >
> > Maciej Fijalkowski (4):
> > xdp: add mixed page_pool/page_shared memory type
> > xdp: return status from generic_xdp_tx()
> > xdp: split generic XDP skb handling
> > veth: use generic skb XDP handling
> >
> > drivers/net/veth.c | 179 ++++++++------------------------------
> > include/linux/netdevice.h | 33 ++++++-
> > include/net/xdp.h | 1 +
> > kernel/bpf/devmap.c | 2 +-
> > net/core/dev.c | 124 ++++++++++++++++++++------
> > net/core/filter.c | 2 +-
> > net/core/xdp.c | 54 ++++++++++--
> > 7 files changed, 216 insertions(+), 179 deletions(-)
> >
>
Thread overview: 15+ messages
2026-05-09 8:48 [PATCH RFC net-next 0/4] xdp: reuse generic skb XDP handling for veth Maciej Fijalkowski
2026-05-09 8:48 ` [PATCH RFC net-next 1/4] xdp: add mixed page_pool/page_shared memory type Maciej Fijalkowski
2026-05-09 8:48 ` [PATCH RFC net-next 2/4] xdp: return status from generic_xdp_tx() Maciej Fijalkowski
2026-05-12 12:57 ` Björn Töpel
2026-05-12 17:13 ` Maciej Fijalkowski
2026-05-09 8:48 ` [PATCH RFC net-next 3/4] xdp: split generic XDP skb handling Maciej Fijalkowski
2026-05-09 8:48 ` [PATCH RFC net-next 4/4] veth: use generic skb XDP handling Maciej Fijalkowski
2026-05-12 14:32 ` Björn Töpel
2026-05-12 17:06 ` Maciej Fijalkowski
2026-05-13 11:31 ` Björn Töpel
2026-05-12 12:55 ` [PATCH RFC net-next 0/4] xdp: reuse generic skb XDP handling for veth Björn Töpel
2026-05-12 17:12 ` Maciej Fijalkowski
2026-05-14 5:13 ` Jesper Dangaard Brouer
2026-05-15 18:18 ` Maciej Fijalkowski [this message]
2026-05-15 0:54 ` Jakub Kicinski