From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Amery Hung <ameryhung@gmail.com>,
Martin KaFai Lau <martin.lau@kernel.org>,
Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
Sasha Levin <sashal@kernel.org>,
ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net,
kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com,
edumazet@google.com, pabeni@redhat.com, andrii@kernel.org,
martin.lau@linux.dev, netdev@vger.kernel.org,
bpf@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.1] bpf: Clear pfmemalloc flag when freeing all fragments
Date: Thu, 9 Oct 2025 11:55:05 -0400 [thread overview]
Message-ID: <20251009155752.773732-39-sashal@kernel.org> (raw)
In-Reply-To: <20251009155752.773732-1-sashal@kernel.org>
From: Amery Hung <ameryhung@gmail.com>
[ Upstream commit 8f12d1137c2382c80aada8e05d7cc650cd4e403c ]
It is possible for bpf_xdp_adjust_tail() to free all fragments. The
kfunc currently clears the XDP_FLAGS_HAS_FRAGS bit, but not
XDP_FLAGS_FRAGS_PF_MEMALLOC. So far, this has not caused a issue when
building sk_buff from xdp_buff since all readers of xdp_buff->flags
use the flag only when there are fragments. Clear the
XDP_FLAGS_FRAGS_PF_MEMALLOC bit as well to make the flags correct.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20250922233356.3356453-2-ameryhung@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it fixes
- When shrinking non-linear XDP buffers, bpf_xdp_adjust_tail() can
free all frags but previously only cleared the HAS_FRAGS bit,
leaving XDP_FLAGS_FRAGS_PF_MEMALLOC set. This makes xdp_buff->flags
inconsistent: “no frags, but pfmemalloc set”.
- The fix adds an explicit clear for the pfmemalloc-frags bit when the
last frag is freed, ensuring flags correctly reflect state.
- Precise code changes
- Adds an inline helper to clear the pfmemalloc-frags bit:
- include/net/xdp.h:139: xdp_buff_clear_frag_pfmemalloc(struct
xdp_buff *xdp) clears XDP_FLAGS_FRAGS_PF_MEMALLOC by masking it
off.
- Invokes the helper when all fragments are freed in the shrink path:
- net/core/filter.c: in bpf_xdp_frags_shrink_tail(), after computing
that all frags are gone, it previously did:
- xdp_buff_clear_frags_flag(xdp);
- xdp->data_end -= offset;
Now it also does:
- xdp_buff_clear_frag_pfmemalloc(xdp);
- Concretely, in this tree: net/core/filter.c:4198 starts
bpf_xdp_frags_shrink_tail; when sinfo->nr_frags drops to zero, it
now calls both xdp_buff_clear_frags_flag(xdp) and
xdp_buff_clear_frag_pfmemalloc(xdp) before adjusting data_end.
- Why it matters
- pfmemalloc indicates frags came from memory under pressure. With no
frags, the flag must be false; leaving it set is incorrect state.
- Current skb-build paths only read the pfmemalloc flag when there are
frags (e.g., xdp_build_skb_from_buff uses pfmemalloc bit only if
xdp_buff_has_frags is true; see net/core/xdp.c:666-667, 720, 826 in
this tree). That’s why this hasn’t caused user-visible bugs yet.
However, correctness of flags avoids subtle future regressions and
makes the state coherent for any readers that don’t gate on
HAS_FRAGS.
- Scope and risk assessment
- Small, contained change: one new inline helper in a header and one
extra call in a single function.
- No API or ABI changes; no architectural refactoring.
- Touches BPF/XDP fast path but only modifies a bit when
sinfo->nr_frags becomes zero, which is the correct behavior by
definition.
- Extremely low regression risk; clearing a now-irrelevant bit cannot
break consumers and only improves state consistency.
- Backport considerations
- The bug and code paths exist in stable lines which support non-
linear XDP buffers:
- v6.1.y and v6.6.y have XDP_FLAGS_FRAGS_PF_MEMALLOC and the same
shrink path which only clears HAS_FRAGS, not PF_MEMALLOC (e.g.,
v6.6.99 net/core/filter.c shows only xdp_buff_clear_frags_flag();
include/net/xdp.h lacks the clear helper).
- The backport is trivial: add the inline clear helper to
include/net/xdp.h and invoke it in bpf_xdp_frags_shrink_tail()
alongside the existing HAS_FRAGS clear.
- No dependencies on recent infrastructure beyond the
FRAGS_PF_MEMALLOC flag (present since the XDP frags work was
introduced).
- Stable criteria fit
- Fixes a correctness bug that could lead to subtle misbehavior.
- Minimal and surgical; not a feature.
- No behavioral surprises or architectural changes.
- Applies cleanly to affected stable branches that have non-linear XDP
and the FRAGS_PF_MEMALLOC flag.
Conclusion: This is a low-risk correctness fix in BPF/XDP flag handling
and should be backported to stable.
include/net/xdp.h | 5 +++++
net/core/filter.c | 1 +
2 files changed, 6 insertions(+)
diff --git a/include/net/xdp.h b/include/net/xdp.h
index b40f1f96cb117..f288c348a6c13 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -115,6 +115,11 @@ static __always_inline void xdp_buff_set_frag_pfmemalloc(struct xdp_buff *xdp)
xdp->flags |= XDP_FLAGS_FRAGS_PF_MEMALLOC;
}
+static __always_inline void xdp_buff_clear_frag_pfmemalloc(struct xdp_buff *xdp)
+{
+ xdp->flags &= ~XDP_FLAGS_FRAGS_PF_MEMALLOC;
+}
+
static __always_inline void
xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq)
{
diff --git a/net/core/filter.c b/net/core/filter.c
index da391e2b0788d..43408bd3a87a4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4210,6 +4210,7 @@ static int bpf_xdp_frags_shrink_tail(struct xdp_buff *xdp, int offset)
if (unlikely(!sinfo->nr_frags)) {
xdp_buff_clear_frags_flag(xdp);
+ xdp_buff_clear_frag_pfmemalloc(xdp);
xdp->data_end -= offset;
}
--
2.51.0
next parent reply other threads:[~2025-10-09 15:59 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20251009155752.773732-1-sashal@kernel.org>
2025-10-09 15:55 ` Sasha Levin [this message]
2025-10-09 15:55 ` [PATCH AUTOSEL 6.17] selftests: drv-net: Pull data before parsing headers Sasha Levin
2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.1] selftests/bpf: Upon failures, exit with code 1 in test_xsk.sh Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251009155752.773732-39-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=ameryhung@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hawk@kernel.org \
--cc=john.fastabend@gmail.com \
--cc=kuba@kernel.org \
--cc=maciej.fijalkowski@intel.com \
--cc=martin.lau@kernel.org \
--cc=martin.lau@linux.dev \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).