netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 6.17-6.1] bpf: Clear pfmemalloc flag when freeing all fragments
       [not found] <20251009155752.773732-1-sashal@kernel.org>
@ 2025-10-09 15:55 ` Sasha Levin
  2025-10-09 15:55 ` [PATCH AUTOSEL 6.17] selftests: drv-net: Pull data before parsing headers Sasha Levin
  2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.1] selftests/bpf: Upon failures, exit with code 1 in test_xsk.sh Sasha Levin
  2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2025-10-09 15:55 UTC (permalink / raw)
  To: patches, stable
  Cc: Amery Hung, Martin KaFai Lau, Maciej Fijalkowski, Sasha Levin,
	ast, daniel, davem, kuba, hawk, john.fastabend, edumazet, pabeni,
	andrii, martin.lau, netdev, bpf

From: Amery Hung <ameryhung@gmail.com>

[ Upstream commit 8f12d1137c2382c80aada8e05d7cc650cd4e403c ]

It is possible for bpf_xdp_adjust_tail() to free all fragments. The
kfunc currently clears the XDP_FLAGS_HAS_FRAGS bit, but not
XDP_FLAGS_FRAGS_PF_MEMALLOC. So far, this has not caused a issue when
building sk_buff from xdp_buff since all readers of xdp_buff->flags
use the flag only when there are fragments. Clear the
XDP_FLAGS_FRAGS_PF_MEMALLOC bit as well to make the flags correct.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20250922233356.3356453-2-ameryhung@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

- What it fixes
  - When shrinking non-linear XDP buffers, bpf_xdp_adjust_tail() can
    free all frags but previously only cleared the HAS_FRAGS bit,
    leaving XDP_FLAGS_FRAGS_PF_MEMALLOC set. This makes xdp_buff->flags
    inconsistent: “no frags, but pfmemalloc set”.
  - The fix adds an explicit clear for the pfmemalloc-frags bit when the
    last frag is freed, ensuring flags correctly reflect state.

- Precise code changes
  - Adds an inline helper to clear the pfmemalloc-frags bit:
    - include/net/xdp.h:139: xdp_buff_clear_frag_pfmemalloc(struct
      xdp_buff *xdp) clears XDP_FLAGS_FRAGS_PF_MEMALLOC by masking it
      off.
  - Invokes the helper when all fragments are freed in the shrink path:
    - net/core/filter.c: in bpf_xdp_frags_shrink_tail(), after computing
      that all frags are gone, it previously did:
      - xdp_buff_clear_frags_flag(xdp);
      - xdp->data_end -= offset;
      Now it also does:
      - xdp_buff_clear_frag_pfmemalloc(xdp);
    - Concretely, in this tree: net/core/filter.c:4198 starts
      bpf_xdp_frags_shrink_tail; when sinfo->nr_frags drops to zero, it
      now calls both xdp_buff_clear_frags_flag(xdp) and
      xdp_buff_clear_frag_pfmemalloc(xdp) before adjusting data_end.

- Why it matters
  - pfmemalloc indicates frags came from memory under pressure. With no
    frags, the flag must be false; leaving it set is incorrect state.
  - Current skb-build paths only read the pfmemalloc flag when there are
    frags (e.g., xdp_build_skb_from_buff uses pfmemalloc bit only if
    xdp_buff_has_frags is true; see net/core/xdp.c:666-667, 720, 826 in
    this tree). That’s why this hasn’t caused user-visible bugs yet.
    However, correctness of flags avoids subtle future regressions and
    makes the state coherent for any readers that don’t gate on
    HAS_FRAGS.

- Scope and risk assessment
  - Small, contained change: one new inline helper in a header and one
    extra call in a single function.
  - No API or ABI changes; no architectural refactoring.
  - Touches BPF/XDP fast path but only modifies a bit when
    sinfo->nr_frags becomes zero, which is the correct behavior by
    definition.
  - Extremely low regression risk; clearing a now-irrelevant bit cannot
    break consumers and only improves state consistency.

- Backport considerations
  - The bug and code paths exist in stable lines which support non-
    linear XDP buffers:
    - v6.1.y and v6.6.y have XDP_FLAGS_FRAGS_PF_MEMALLOC and the same
      shrink path which only clears HAS_FRAGS, not PF_MEMALLOC (e.g.,
      v6.6.99 net/core/filter.c shows only xdp_buff_clear_frags_flag();
      include/net/xdp.h lacks the clear helper).
  - The backport is trivial: add the inline clear helper to
    include/net/xdp.h and invoke it in bpf_xdp_frags_shrink_tail()
    alongside the existing HAS_FRAGS clear.
  - No dependencies on recent infrastructure beyond the
    FRAGS_PF_MEMALLOC flag (present since the XDP frags work was
    introduced).

- Stable criteria fit
  - Fixes a correctness bug that could lead to subtle misbehavior.
  - Minimal and surgical; not a feature.
  - No behavioral surprises or architectural changes.
  - Applies cleanly to affected stable branches that have non-linear XDP
    and the FRAGS_PF_MEMALLOC flag.

Conclusion: This is a low-risk correctness fix in BPF/XDP flag handling
and should be backported to stable.

 include/net/xdp.h | 5 +++++
 net/core/filter.c | 1 +
 2 files changed, 6 insertions(+)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index b40f1f96cb117..f288c348a6c13 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -115,6 +115,11 @@ static __always_inline void xdp_buff_set_frag_pfmemalloc(struct xdp_buff *xdp)
 	xdp->flags |= XDP_FLAGS_FRAGS_PF_MEMALLOC;
 }
 
+static __always_inline void xdp_buff_clear_frag_pfmemalloc(struct xdp_buff *xdp)
+{
+	xdp->flags &= ~XDP_FLAGS_FRAGS_PF_MEMALLOC;
+}
+
 static __always_inline void
 xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq)
 {
diff --git a/net/core/filter.c b/net/core/filter.c
index da391e2b0788d..43408bd3a87a4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4210,6 +4210,7 @@ static int bpf_xdp_frags_shrink_tail(struct xdp_buff *xdp, int offset)
 
 	if (unlikely(!sinfo->nr_frags)) {
 		xdp_buff_clear_frags_flag(xdp);
+		xdp_buff_clear_frag_pfmemalloc(xdp);
 		xdp->data_end -= offset;
 	}
 
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH AUTOSEL 6.17] selftests: drv-net: Pull data before parsing headers
       [not found] <20251009155752.773732-1-sashal@kernel.org>
  2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.1] bpf: Clear pfmemalloc flag when freeing all fragments Sasha Levin
@ 2025-10-09 15:55 ` Sasha Levin
  2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.1] selftests/bpf: Upon failures, exit with code 1 in test_xsk.sh Sasha Levin
  2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2025-10-09 15:55 UTC (permalink / raw)
  To: patches, stable
  Cc: Amery Hung, Martin KaFai Lau, Sasha Levin, ast, daniel, davem,
	kuba, hawk, john.fastabend, mohsin.bashr, netdev, bpf

From: Amery Hung <ameryhung@gmail.com>

[ Upstream commit efec2e55bdefb889639a6e7fe1f1f2431cdddc6a ]

It is possible for drivers to generate xdp packets with data residing
entirely in fragments. To keep parsing headers using direct packet
access, call bpf_xdp_pull_data() to pull headers into the linear data
area.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250922233356.3356453-9-ameryhung@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

- What it fixes and why it matters
  - The BPF selftest program parsed Ethernet/IP/UDP headers using direct
    packet access without guaranteeing those headers are in the linear
    area. On drivers/NIC configs that place header data entirely in XDP
    fragments (multi-buffer/HDS), this can lead to invalid accesses or
    verifier failures. The patch ensures headers are pulled into the
    linear area before parsing, so the tests behave correctly on such
    drivers.

- Scope and contained changes
  - Single selftests file only:
    tools/testing/selftests/net/lib/xdp_native.bpf.c.
  - Adds kfunc declaration for `bpf_xdp_pull_data()` at
    xdp_native.bpf.c:17 to request pulling bytes from non-linear XDP
    data into the linear region.
  - Updates the UDP header parsing helper to pull and then re-read
    pointers:
    - Pull L2 first, then re-load pointers (xdp_native.bpf.c:78–86 and
      96–106).
    - For IPv4, pull up to L3+L4 and re-load pointers
      (xdp_native.bpf.c:91–106).
    - For IPv6, same pattern (xdp_native.bpf.c:109–124).
    - This ensures `data`/`data_end` are refreshed after each pull to
      satisfy the verifier and correctness of direct accesses.
  - Updates TX path similarly:
    - Pull L2 then re-load pointers (xdp_native.bpf.c:182–190).
    - For IPv4, pull up to L3+L4, re-load pointers, then validate, swap
      L2 and swap IPv4 src/dst (xdp_native.bpf.c:196–221).
    - For IPv6, same flow including `eth = data` reload before swapping
      MACs (xdp_native.bpf.c:233–261).
  - No kernel subsystem logic is changed; only test-side BPF program
    logic.

- Backport suitability vs. stable rules
  - Fixes a real-world issue affecting test correctness on drivers that
    produce non-linear XDP frames (user-visible in CI/selftests).
  - Minimal, self-contained change confined to selftests; no API or ABI
    changes; no architecture changes; low regression risk to the kernel
    proper.
  - Aligns with stable policy to keep selftests working on stable trees
    that already have the underlying feature.

- Important dependency to include
  - This change depends on kernel support for the kfunc
    `bpf_xdp_pull_data()` which is introduced by “bpf: Support pulling
    non-linear xdp data” (net/core/filter.c:12253). Ensure that commit
    is present in the target stable branch; otherwise the selftest
    program load will fail on kernels without this kfunc.
  - There is a follow-up fix that must be included to avoid verifier
    failures: “selftests: drv-net: Reload pkt pointer after calling
    filter_udphdr” (commit 11ae737efea10). It re-computes header length
    using a freshly reloaded `ctx->data` after `filter_udphdr()` because
    `bpf_xdp_pull_data()` invalidates earlier packet pointers. In this
    tree, that fix manifests as changing `hdr_len` calculations to `...
    - (void *)(long)ctx->data` (e.g., xdp_native.bpf.c:430–436 and
    582–590). Backport this fix alongside the main patch to prevent non-
    deterministic verifier errors depending on compiler codegen.

- Risk and side effects
  - Selftests-only; no effect on runtime kernel paths.
  - The only meaningful risk is missing dependencies: if
    `bpf_xdp_pull_data()` support isn’t in the target stable branch, or
    if the follow-up “Reload pkt pointer” fix is omitted, test load or
    verification can fail. With both present, changes are
    straightforward and low risk.

Given the above, this is a good candidate for stable backport on
branches that already include `bpf_xdp_pull_data()` support, and it
should be backported together with the follow-up “Reload pkt pointer”
fix to avoid verifier regressions.

 .../selftests/net/lib/xdp_native.bpf.c        | 89 +++++++++++++++----
 1 file changed, 74 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/net/lib/xdp_native.bpf.c b/tools/testing/selftests/net/lib/xdp_native.bpf.c
index 521ba38f2ddda..df4eea5c192b3 100644
--- a/tools/testing/selftests/net/lib/xdp_native.bpf.c
+++ b/tools/testing/selftests/net/lib/xdp_native.bpf.c
@@ -14,6 +14,8 @@
 #define MAX_PAYLOAD_LEN 5000
 #define MAX_HDR_LEN 64
 
+extern int bpf_xdp_pull_data(struct xdp_md *xdp, __u32 len) __ksym __weak;
+
 enum {
 	XDP_MODE = 0,
 	XDP_PORT = 1,
@@ -68,30 +70,57 @@ static void record_stats(struct xdp_md *ctx, __u32 stat_type)
 
 static struct udphdr *filter_udphdr(struct xdp_md *ctx, __u16 port)
 {
-	void *data_end = (void *)(long)ctx->data_end;
-	void *data = (void *)(long)ctx->data;
 	struct udphdr *udph = NULL;
-	struct ethhdr *eth = data;
+	void *data, *data_end;
+	struct ethhdr *eth;
+	int err;
+
+	err = bpf_xdp_pull_data(ctx, sizeof(*eth));
+	if (err)
+		return NULL;
+
+	data_end = (void *)(long)ctx->data_end;
+	data = eth = (void *)(long)ctx->data;
 
 	if (data + sizeof(*eth) > data_end)
 		return NULL;
 
 	if (eth->h_proto == bpf_htons(ETH_P_IP)) {
-		struct iphdr *iph = data + sizeof(*eth);
+		struct iphdr *iph;
+
+		err = bpf_xdp_pull_data(ctx, sizeof(*eth) + sizeof(*iph) +
+					     sizeof(*udph));
+		if (err)
+			return NULL;
+
+		data_end = (void *)(long)ctx->data_end;
+		data = (void *)(long)ctx->data;
+
+		iph = data + sizeof(*eth);
 
 		if (iph + 1 > (struct iphdr *)data_end ||
 		    iph->protocol != IPPROTO_UDP)
 			return NULL;
 
-		udph = (void *)eth + sizeof(*iph) + sizeof(*eth);
-	} else if (eth->h_proto  == bpf_htons(ETH_P_IPV6)) {
-		struct ipv6hdr *ipv6h = data + sizeof(*eth);
+		udph = data + sizeof(*iph) + sizeof(*eth);
+	} else if (eth->h_proto == bpf_htons(ETH_P_IPV6)) {
+		struct ipv6hdr *ipv6h;
+
+		err = bpf_xdp_pull_data(ctx, sizeof(*eth) + sizeof(*ipv6h) +
+					     sizeof(*udph));
+		if (err)
+			return NULL;
+
+		data_end = (void *)(long)ctx->data_end;
+		data = (void *)(long)ctx->data;
+
+		ipv6h = data + sizeof(*eth);
 
 		if (ipv6h + 1 > (struct ipv6hdr *)data_end ||
 		    ipv6h->nexthdr != IPPROTO_UDP)
 			return NULL;
 
-		udph = (void *)eth + sizeof(*ipv6h) + sizeof(*eth);
+		udph = data + sizeof(*ipv6h) + sizeof(*eth);
 	} else {
 		return NULL;
 	}
@@ -145,17 +174,34 @@ static void swap_machdr(void *data)
 
 static int xdp_mode_tx_handler(struct xdp_md *ctx, __u16 port)
 {
-	void *data_end = (void *)(long)ctx->data_end;
-	void *data = (void *)(long)ctx->data;
 	struct udphdr *udph = NULL;
-	struct ethhdr *eth = data;
+	void *data, *data_end;
+	struct ethhdr *eth;
+	int err;
+
+	err = bpf_xdp_pull_data(ctx, sizeof(*eth));
+	if (err)
+		return XDP_PASS;
+
+	data_end = (void *)(long)ctx->data_end;
+	data = eth = (void *)(long)ctx->data;
 
 	if (data + sizeof(*eth) > data_end)
 		return XDP_PASS;
 
 	if (eth->h_proto == bpf_htons(ETH_P_IP)) {
-		struct iphdr *iph = data + sizeof(*eth);
-		__be32 tmp_ip = iph->saddr;
+		struct iphdr *iph;
+		__be32 tmp_ip;
+
+		err = bpf_xdp_pull_data(ctx, sizeof(*eth) + sizeof(*iph) +
+					     sizeof(*udph));
+		if (err)
+			return XDP_PASS;
+
+		data_end = (void *)(long)ctx->data_end;
+		data = (void *)(long)ctx->data;
+
+		iph = data + sizeof(*eth);
 
 		if (iph + 1 > (struct iphdr *)data_end ||
 		    iph->protocol != IPPROTO_UDP)
@@ -169,8 +215,10 @@ static int xdp_mode_tx_handler(struct xdp_md *ctx, __u16 port)
 			return XDP_PASS;
 
 		record_stats(ctx, STATS_RX);
+		eth = data;
 		swap_machdr((void *)eth);
 
+		tmp_ip = iph->saddr;
 		iph->saddr = iph->daddr;
 		iph->daddr = tmp_ip;
 
@@ -178,9 +226,19 @@ static int xdp_mode_tx_handler(struct xdp_md *ctx, __u16 port)
 
 		return XDP_TX;
 
-	} else if (eth->h_proto  == bpf_htons(ETH_P_IPV6)) {
-		struct ipv6hdr *ipv6h = data + sizeof(*eth);
+	} else if (eth->h_proto == bpf_htons(ETH_P_IPV6)) {
 		struct in6_addr tmp_ipv6;
+		struct ipv6hdr *ipv6h;
+
+		err = bpf_xdp_pull_data(ctx, sizeof(*eth) + sizeof(*ipv6h) +
+					     sizeof(*udph));
+		if (err)
+			return XDP_PASS;
+
+		data_end = (void *)(long)ctx->data_end;
+		data = (void *)(long)ctx->data;
+
+		ipv6h = data + sizeof(*eth);
 
 		if (ipv6h + 1 > (struct ipv6hdr *)data_end ||
 		    ipv6h->nexthdr != IPPROTO_UDP)
@@ -194,6 +252,7 @@ static int xdp_mode_tx_handler(struct xdp_md *ctx, __u16 port)
 			return XDP_PASS;
 
 		record_stats(ctx, STATS_RX);
+		eth = data;
 		swap_machdr((void *)eth);
 
 		__builtin_memcpy(&tmp_ipv6, &ipv6h->saddr, sizeof(tmp_ipv6));
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH AUTOSEL 6.17-6.1] selftests/bpf: Upon failures, exit with code 1 in test_xsk.sh
       [not found] <20251009155752.773732-1-sashal@kernel.org>
  2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.1] bpf: Clear pfmemalloc flag when freeing all fragments Sasha Levin
  2025-10-09 15:55 ` [PATCH AUTOSEL 6.17] selftests: drv-net: Pull data before parsing headers Sasha Levin
@ 2025-10-09 15:55 ` Sasha Levin
  2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2025-10-09 15:55 UTC (permalink / raw)
  To: patches, stable
  Cc: Ricardo B. Marlière, Andrii Nakryiko, Magnus Karlsson,
	Sasha Levin, maciej.fijalkowski, ast, daniel, eddyz87, netdev,
	bpf

From: Ricardo B. Marlière <rbm@suse.com>

[ Upstream commit 2a912258c90e895363c0ffc0be8a47f112ab67b7 ]

Currently, even if some subtests fails, the end result will still yield
"ok 1 selftests: bpf: test_xsk.sh". Fix it by exiting with 1 if there are
any failures.

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20250828-selftests-bpf-test_xsk_ret-v1-1-e6656c01f397@suse.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Based on my comprehensive analysis, here is my determination:

**Backport Status: YES**

## Analysis

### Code Change
The commit adds a simple 2-line fix to ensure the test script exits with
code 1 when failures occur:
```bash
+else
+       exit 1
```

### Rationale for Backporting

**1. Pattern of Similar Backports**
My research shows that similar test exit code fixes have been
consistently backported to stable kernels:
- `selftests/net: have gro.sh -t return a correct exit code` (commit
  784e6abd99f24) was backported by AUTOSEL
- `selftests: ksft: Fix finished() helper exit code on skipped tests`
  (commit 170c966cbe274) was backported by AUTOSEL
- `selftests: xsk: fix reporting of failed tests` (commit 895b62eed2ab4)
  was backported to stable 6.1 branches

**2. Critical for Testing Infrastructure**
This fix addresses a real bug in test reporting that affects:
- **CI/Automated Testing**: Systems running selftests on stable kernels
  rely on correct exit codes to detect regressions
- **False Positives**: The current behavior reports "ok" even when tests
  fail, masking real problems
- **Quality Assurance**: Proper exit codes are essential for stable
  kernel validation

**3. Meets Stable Kernel Rules**
According to Documentation/process/stable-kernel-rules.rst:
- ✅ **Obviously correct**: Trivial 2-line addition with clear intent
- ✅ **Small and contained**: Only 2 lines in a single shell script
- ✅ **Fixes a real bug**: Test infrastructure incorrectly reporting
  success on failures
- ✅ **Already in mainline**: Commit 2a912258c90e exists in upstream
- ✅ **Minimal risk**: Changes only test infrastructure, cannot affect
  kernel runtime
- ✅ **Benefits users**: Helps developers and organizations running tests
  on stable kernels

**4. Historical Evidence**
The commit 8f610b24a1a44 shows this has already been selected by AUTOSEL
for backporting, with the marker `[ Upstream commit
2a912258c90e895363c0ffc0be8a47f112ab67b7 ]` and signed by Sasha Levin.

### Conclusion
This is a clear candidate for stable backporting. It fixes test
infrastructure that provides critical validation for stable kernels,
follows established backporting patterns for similar fixes, and meets
all stable kernel rules criteria.

 tools/testing/selftests/bpf/test_xsk.sh | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_xsk.sh b/tools/testing/selftests/bpf/test_xsk.sh
index 65aafe0003db0..62db060298a4a 100755
--- a/tools/testing/selftests/bpf/test_xsk.sh
+++ b/tools/testing/selftests/bpf/test_xsk.sh
@@ -241,4 +241,6 @@ done
 
 if [ $failures -eq 0 ]; then
         echo "All tests successful!"
+else
+	exit 1
 fi
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-10-09 16:00 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20251009155752.773732-1-sashal@kernel.org>
2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.1] bpf: Clear pfmemalloc flag when freeing all fragments Sasha Levin
2025-10-09 15:55 ` [PATCH AUTOSEL 6.17] selftests: drv-net: Pull data before parsing headers Sasha Levin
2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.1] selftests/bpf: Upon failures, exit with code 1 in test_xsk.sh Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).