From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>, <bpf@vger.kernel.org>,
<ast@kernel.org>, <daniel@iogearbox.net>,
<ilias.apalodimas@linaro.org>, <toke@redhat.com>,
<lorenzo@kernel.org>, <kuba@kernel.org>, <netdev@vger.kernel.org>,
<magnus.karlsson@intel.com>, <andrii@kernel.org>,
<stfomichev@gmail.com>,
<syzbot+ff145014d6b0ce64a173@syzkaller.appspotmail.com>,
Ihor Solodrai <ihor.solodrai@linux.dev>,
Octavian Purdila <tavip@google.com>
Subject: Re: [PATCH v2 bpf 1/2] xdp: update xdp_rxq_info's mem type in XDP generic hook
Date: Mon, 20 Oct 2025 19:53:26 +0200 [thread overview]
Message-ID: <aPZ3FvcIVOPVxQum@boxer> (raw)
In-Reply-To: <025d2281-caf0-4f88-8f31-b0bfa5596aec@intel.com>
On Mon, Oct 20, 2025 at 05:36:06PM +0200, Alexander Lobakin wrote:
> From: Jesper Dangaard Brouer <hawk@kernel.org>
> Date: Mon, 20 Oct 2025 13:20:57 +0200
>
> >
> >
> > On 17/10/2025 16.31, Maciej Fijalkowski wrote:
> >> Currently, generic XDP hook uses xdp_rxq_info from netstack Rx queues
> >> which do not have its XDP memory model registered. There is a case when
> >> XDP program calls bpf_xdp_adjust_tail() BPF helper, which in turn
> >> releases underlying memory. This happens when it consumes enough amount
> >> of bytes and when XDP buffer has fragments. For this action the memory
> >> model knowledge passed to XDP program is crucial so that core can call
> >> suitable function for freeing/recycling the page.
> >>
> >> For netstack queues it defaults to MEM_TYPE_PAGE_SHARED (0) due to lack
> >> of mem model registration. The problem we're fixing here is when kernel
> >> copied the skb to new buffer backed by system's page_pool and XDP buffer
> >> is built around it. Then when bpf_xdp_adjust_tail() calls
> >> __xdp_return(), it acts incorrectly due to mem type not being set to
> >> MEM_TYPE_PAGE_POOL and causes a page leak.
> >>
> >
> > Does the code not set the skb->pp_recycle ?
Hi Jesper,
yes it does and I based on this my initial solution, however Jakub had
concerns:
https://lore.kernel.org/bpf/20251001082737.23f5037f@kernel.org/
>
> You mean this CoW code which replaces the buffers in the skb with system
> PP-backed ones?
skb_pp_cow_data() takes arbitrary page_pool as an input, it does not imply
we're gonna be dealing with system pp only. veth provides its own and
generic xdp uses system pp (the two existing skb_pp_cow_data() callsites).
> Maybe that's the problem (I don't remember the details of the function)?
>
> >
> >> Pull out the existing code from bpf_prog_run_generic_xdp() that
> >> init/prepares xdp_buff onto new helper xdp_convert_skb_to_buff() and
> >> embed there rxq's mem_type initialization that is assigned to xdp_buff.
>
> [...]
>
> >> + if (skb_is_nonlinear(skb)) {
> >> + skb_shinfo(skb)->xdp_frags_size = skb->data_len;
> >> + xdp_buff_set_frags_flag(xdp);
> >> + } else {
> >> + xdp_buff_clear_frags_flag(xdp);
> >> + }
> >> +
> >
> > The SKB should be marked via skb->pp_recycle, but I guess you are trying
> > to catch code that doesn't set this correctly?
> > (Slightly worried this will "paper-over" some other buggy code?)
> >
> >> + xdp->rxq->mem.type = page_pool_page_is_pp(virt_to_page(xdp->data)) ?
>
> BTW this may return incorrect results if the page is not order-0.
> IIRC system PPs always return order-0 pages, what about veth code etc?
veth's pp works on order-0 pages well, however I agree it would be better
to use virt_to_head_page() here.
>
> >> + MEM_TYPE_PAGE_POOL : MEM_TYPE_PAGE_SHARED;
> >> +}
> >
> > In the beginning PP_MAGIC / PP_SIGNATURE was primarily used as a
> > debugging feature to catch faulty code that released pp pages to the
> > real page allocator. It seems to have evolved into something more
> > critical. Someone also tried to elevate this into a page flag, which
> > would make this more reliable.
exactly here we have the very same issue and we need to correctly return
pages back to their originating page pool.
> Thanks,
> Olek
next prev parent reply other threads:[~2025-10-20 17:53 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-17 14:31 [PATCH v2 bpf 0/2] xdp: fix page_pool leaks Maciej Fijalkowski
2025-10-17 14:31 ` [PATCH v2 bpf 1/2] xdp: update xdp_rxq_info's mem type in XDP generic hook Maciej Fijalkowski
2025-10-17 16:45 ` Toke Høiland-Jørgensen
2025-10-20 11:20 ` Jesper Dangaard Brouer
2025-10-20 15:36 ` Alexander Lobakin
2025-10-20 17:53 ` Maciej Fijalkowski [this message]
2025-10-22 1:01 ` Jakub Kicinski
2025-10-22 11:22 ` Maciej Fijalkowski
2025-10-22 14:04 ` Jakub Kicinski
2025-10-17 14:31 ` [PATCH v2 bpf 2/2] veth: update mem type in xdp_buff Maciej Fijalkowski
2025-10-17 16:33 ` Toke Høiland-Jørgensen
2025-10-17 16:52 ` Maciej Fijalkowski
2025-10-17 17:51 ` Toke Høiland-Jørgensen
2025-10-17 18:12 ` Toke Høiland-Jørgensen
2025-10-17 20:08 ` Maciej Fijalkowski
2025-10-20 10:25 ` Toke Høiland-Jørgensen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aPZ3FvcIVOPVxQum@boxer \
--to=maciej.fijalkowski@intel.com \
--cc=aleksander.lobakin@intel.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=hawk@kernel.org \
--cc=ihor.solodrai@linux.dev \
--cc=ilias.apalodimas@linaro.org \
--cc=kuba@kernel.org \
--cc=lorenzo@kernel.org \
--cc=magnus.karlsson@intel.com \
--cc=netdev@vger.kernel.org \
--cc=stfomichev@gmail.com \
--cc=syzbot+ff145014d6b0ce64a173@syzkaller.appspotmail.com \
--cc=tavip@google.com \
--cc=toke@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.