All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: <bpf@vger.kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>,
	<hawk@kernel.org>, <netdev@vger.kernel.org>,
	<magnus.karlsson@intel.com>, <aleksander.lobakin@intel.com>,
	<ilias.apalodimas@linaro.org>, <lorenzo@kernel.org>,
	<kuba@kernel.org>,
	<syzbot+ff145014d6b0ce64a173@syzkaller.appspotmail.com>,
	Ihor Solodrai <ihor.solodrai@linux.dev>,
	Octavian Purdila <tavip@google.com>
Subject: Re: [PATCH v3 bpf 1/2] xdp: introduce xdp_convert_skb_to_buff()
Date: Fri, 24 Oct 2025 19:19:58 +0200	[thread overview]
Message-ID: <aPu1PjwidRCdVvyv@boxer> (raw)
In-Reply-To: <87cy6cfy7t.fsf@toke.dk>

On Fri, Oct 24, 2025 at 11:25:26AM +0200, Toke Høiland-Jørgensen wrote:
> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
> 
> > Currently, generic XDP hook uses xdp_rxq_info from netstack Rx queues
> > which do not have its XDP memory model registered. There is a case when
> > XDP program calls bpf_xdp_adjust_tail() BPF helper, which in turn
> > releases underlying memory. This happens when it consumes enough amount
> > of bytes and when XDP buffer has fragments. For this action the memory
> > model knowledge passed to XDP program is crucial so that core can call
> > suitable function for freeing/recycling the page.
> >
> > For netstack queues it defaults to MEM_TYPE_PAGE_SHARED (0) due to lack
> > of mem model registration. The problem we're fixing here is when kernel
> > copied the skb to new buffer backed by system's page_pool and XDP buffer
> > is built around it. Then when bpf_xdp_adjust_tail() calls
> > __xdp_return(), it acts incorrectly due to mem type not being set to
> > MEM_TYPE_PAGE_POOL and causes a page leak.
> >
> > Pull out the existing code from bpf_prog_run_generic_xdp() that
> > init/prepares xdp_buff onto new helper xdp_convert_skb_to_buff() and
> > embed there rxq's mem_type initialization that is assigned to xdp_buff.
> > Make it agnostic to current skb->data position.
> >
> > This problem was triggered by syzbot as well as AF_XDP test suite which
> > is about to be integrated to BPF CI.
> >
> > Reported-by: syzbot+ff145014d6b0ce64a173@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/netdev/6756c37b.050a0220.a30f1.019a.GAE@google.com/
> > Fixes: e6d5dbdd20aa ("xdp: add multi-buff support for xdp running in generic mode")
> > Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>
> > Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> > Co-developed-by: Octavian Purdila <tavip@google.com>
> > Signed-off-by: Octavian Purdila <tavip@google.com> # whole analysis, testing, initiating a fix
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> # commit msg and proposed more robust fix
> > ---
> >  include/net/xdp.h | 27 +++++++++++++++++++++++++++
> >  net/core/dev.c    | 25 ++++---------------------
> >  2 files changed, 31 insertions(+), 21 deletions(-)
> >
> > diff --git a/include/net/xdp.h b/include/net/xdp.h
> > index aa742f413c35..cec43f56ae9a 100644
> > --- a/include/net/xdp.h
> > +++ b/include/net/xdp.h
> > @@ -384,6 +384,33 @@ struct sk_buff *xdp_build_skb_from_frame(struct xdp_frame *xdpf,
> >  					 struct net_device *dev);
> >  struct xdp_frame *xdpf_clone(struct xdp_frame *xdpf);
> >  
> > +static inline
> > +void xdp_convert_skb_to_buff(struct sk_buff *skb, struct xdp_buff *xdp,
> > +			     struct xdp_rxq_info *xdp_rxq)
> > +{
> > +	u32 frame_sz, pkt_len;
> > +
> > +	/* SKB "head" area always have tailroom for skb_shared_info */
> > +	frame_sz = skb_end_pointer(skb) - skb->head;
> > +	frame_sz += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > +
> > +	DEBUG_NET_WARN_ON_ONCE(!skb_mac_header_was_set(skb));
> > +	pkt_len =  skb->tail - skb->mac_header;
> 
> Should probably just use the helpers here:
> 
> pkt_len = skb_tail_pointer(skb) - skb_mac_header(skb);
> 
> that way you don't have to open-code the WARN_ON_ONCE, and you get the
> right behaviour even when NET_SKBUFF_DATA_USES_OFFSET is set

Exactly, I thought I could just strip out skb->head out of calculation,
but apparently not ;)

> 
> -Toke
> 

  reply	other threads:[~2025-10-24 17:20 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-22 12:52 [PATCH v3 bpf 0/2] xdp: fix page_pool leaks Maciej Fijalkowski
2025-10-22 12:52 ` [PATCH v3 bpf 1/2] xdp: introduce xdp_convert_skb_to_buff() Maciej Fijalkowski
2025-10-22 14:53   ` kernel test robot
2025-10-22 14:53   ` kernel test robot
2025-10-22 14:54   ` kernel test robot
2025-10-24  8:07   ` kernel test robot
2025-10-24  9:25   ` Toke Høiland-Jørgensen
2025-10-24 17:19     ` Maciej Fijalkowski [this message]
2025-10-22 12:52 ` [PATCH v3 bpf 2/2] veth: update mem type in xdp_buff Maciej Fijalkowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aPu1PjwidRCdVvyv@boxer \
    --to=maciej.fijalkowski@intel.com \
    --cc=aleksander.lobakin@intel.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=hawk@kernel.org \
    --cc=ihor.solodrai@linux.dev \
    --cc=ilias.apalodimas@linaro.org \
    --cc=kuba@kernel.org \
    --cc=lorenzo@kernel.org \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=syzbot+ff145014d6b0ce64a173@syzkaller.appspotmail.com \
    --cc=tavip@google.com \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.