* [PATCH net v2] bnxt: fix head underflow on XDP head-grow
@ 2026-06-09 20:44 Joe Damato
2026-06-09 21:57 ` Michael Chan
2026-06-10 20:45 ` sashiko-bot
0 siblings, 2 replies; 3+ messages in thread
From: Joe Damato @ 2026-06-09 20:44 UTC (permalink / raw)
To: netdev, Michael Chan, Pavan Chebbi, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
Stanislav Fomichev, Andy Gospodarek, Somnath Kotur
Cc: horms, Joe Damato, linux-kernel, bpf
The xdp.py test test_xdp_native_adjst_head_grow_data crashes when run on
a bnxt machine (and also crashes in NIPA).
It seems that the bug is an underflow in bnxt_rx_multi_page_skb, which
builds the skb head:
napi_build_skb(data_ptr - bp->rx_offset, rxr->rx_page_size);
The problem with this expression is that in page mode, rx_offset is:
bp->rx_offset = NET_IP_ALIGN + XDP_PACKET_HEADROOM;
Which evaluates (at least on x86_64) to 258.
The test test_xdp_native_adjst_head_grow_data tests a case where the
head is adjusted by -256.
When this test runs, data_ptr is shifted to frag_start + 2 (where
frag_start = page_address(page) + offset).
Then, bnxt_rx_multi_page_skb is invoked and the napi_build_skb
expression subtracts 258, landing at an address before frag_start. This
could be either the previous fragment or the previous physical page when
the offset is < 256 (e.g. if the fragment started at offset 0).
When the skb is freed, the page pool fragment reference is dropped on
either the wrong page or the wrong frag of the right page. In either
case, the corrupted reference count can lead to the page being
prematurely recycled while still in use. Once (incorrectly) recycled, it
can be handed out again and on driver teardown this would result in a
double free.
The commit under fixes updated this code to handle the case where the
native page size is >= 64k, but it unintentionally broke the head grow
case.
To fix this, add an offset field to struct bnxt_sw_rx_bd, mirroring the
existing offset field in struct bnxt_sw_rx_agg_bd. Populate it on
allocation and preserve it on reuse.
In bnxt_rx_multi_page_skb, use the newly added offset field to compute
the fragment start and pass that to napi_build_skb. Adjust the layout
with skb_reserve.
There are two cases, the non-adjustment case and the adjustment case.
In both cases, the skb is built at page_address(page) + offset to
account for the case where the native page size >= 64K and skb_reserve
is called with data_ptr - (page_address(page) + offset). That
difference equals bp->rx_offset when data_ptr was not moved, or
bp->rx_offset + xdp_adjust when XDP adjusted the head.
Re-running the failing test with this commit applied causes the test to
run successfully to completion.
The other rx_skb_func implementations don't have this issue.
Fixes: f6974b4c2d8e ("bnxt_en: Fix page pool logic for page size >= 64K")
Signed-off-by: Joe Damato <joe@dama.to>
---
v2:
- Add offset to bnxt_sw_rx_bd, mirroring bnxt_sw_rx_agg_bd,
as suggested by Michael Chan, eliminating math and simplifying code
in bnxt_rx_multi_page_skb
- Re-ran the xdp.py test to confirm the crash is resolved.
v1: https://lore.kernel.org/netdev/20260608203129.920104-1-joe@dama.to/
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 10 ++++++++--
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 +
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 35e1f8f663c7..12e73ffff6c3 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1011,6 +1011,7 @@ int bnxt_alloc_rx_data(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
mapping += bp->rx_dma_offset;
rx_buf->data = page;
rx_buf->data_ptr = page_address(page) + offset + bp->rx_offset;
+ rx_buf->offset = offset;
} else {
u8 *data = __bnxt_alloc_rx_frag(bp, &mapping, rxr, gfp);
@@ -1019,6 +1020,7 @@ int bnxt_alloc_rx_data(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
rx_buf->data = data;
rx_buf->data_ptr = data + bp->rx_offset;
+ rx_buf->offset = 0;
}
rx_buf->mapping = mapping;
@@ -1040,6 +1042,7 @@ void bnxt_reuse_rx_data(struct bnxt_rx_ring_info *rxr, u16 cons, void *data)
prod_rx_buf->data_ptr = cons_rx_buf->data_ptr;
prod_rx_buf->mapping = cons_rx_buf->mapping;
+ prod_rx_buf->offset = cons_rx_buf->offset;
prod_bd = &rxr->rx_desc_ring[RX_RING(bp, prod)][RX_IDX(prod)];
cons_bd = &rxr->rx_desc_ring[RX_RING(bp, cons)][RX_IDX(cons)];
@@ -1175,8 +1178,11 @@ static struct sk_buff *bnxt_rx_multi_page_skb(struct bnxt *bp,
struct page *page = data;
u16 prod = rxr->rx_prod;
struct sk_buff *skb;
+ void *frag_start;
int err;
+ frag_start = page_address(page) + rxr->rx_buf_ring[cons].offset;
+
err = bnxt_alloc_rx_data(bp, rxr, prod, GFP_ATOMIC);
if (unlikely(err)) {
bnxt_reuse_rx_data(rxr, cons, data);
@@ -1185,13 +1191,13 @@ static struct sk_buff *bnxt_rx_multi_page_skb(struct bnxt *bp,
dma_addr -= bp->rx_dma_offset;
dma_sync_single_for_cpu(&bp->pdev->dev, dma_addr, rxr->rx_page_size,
bp->rx_dir);
- skb = napi_build_skb(data_ptr - bp->rx_offset, rxr->rx_page_size);
+ skb = napi_build_skb(frag_start, rxr->rx_page_size);
if (!skb) {
page_pool_recycle_direct(rxr->page_pool, page);
return NULL;
}
skb_mark_for_recycle(skb);
- skb_reserve(skb, bp->rx_offset);
+ skb_reserve(skb, data_ptr - (u8 *)frag_start);
__skb_put(skb, len);
return skb;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 61c847b36b9f..a2e75cfb65ab 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -911,6 +911,7 @@ struct bnxt_sw_rx_bd {
void *data;
u8 *data_ptr;
dma_addr_t mapping;
+ unsigned int offset;
};
struct bnxt_sw_rx_agg_bd {
base-commit: 0aa05daef7848a5ac11158949dc73cd741995dc1
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH net v2] bnxt: fix head underflow on XDP head-grow
2026-06-09 20:44 [PATCH net v2] bnxt: fix head underflow on XDP head-grow Joe Damato
@ 2026-06-09 21:57 ` Michael Chan
2026-06-10 20:45 ` sashiko-bot
1 sibling, 0 replies; 3+ messages in thread
From: Michael Chan @ 2026-06-09 21:57 UTC (permalink / raw)
To: Joe Damato
Cc: netdev, Pavan Chebbi, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
Andy Gospodarek, Somnath Kotur, horms, linux-kernel, bpf
[-- Attachment #1: Type: text/plain, Size: 2893 bytes --]
On Tue, Jun 9, 2026 at 1:45 PM Joe Damato <joe@dama.to> wrote:
>
> The xdp.py test test_xdp_native_adjst_head_grow_data crashes when run on
> a bnxt machine (and also crashes in NIPA).
>
> It seems that the bug is an underflow in bnxt_rx_multi_page_skb, which
> builds the skb head:
>
> napi_build_skb(data_ptr - bp->rx_offset, rxr->rx_page_size);
>
> The problem with this expression is that in page mode, rx_offset is:
>
> bp->rx_offset = NET_IP_ALIGN + XDP_PACKET_HEADROOM;
>
> Which evaluates (at least on x86_64) to 258.
>
> The test test_xdp_native_adjst_head_grow_data tests a case where the
> head is adjusted by -256.
>
> When this test runs, data_ptr is shifted to frag_start + 2 (where
> frag_start = page_address(page) + offset).
>
> Then, bnxt_rx_multi_page_skb is invoked and the napi_build_skb
> expression subtracts 258, landing at an address before frag_start. This
> could be either the previous fragment or the previous physical page when
> the offset is < 256 (e.g. if the fragment started at offset 0).
>
> When the skb is freed, the page pool fragment reference is dropped on
> either the wrong page or the wrong frag of the right page. In either
> case, the corrupted reference count can lead to the page being
> prematurely recycled while still in use. Once (incorrectly) recycled, it
> can be handed out again and on driver teardown this would result in a
> double free.
>
> The commit under fixes updated this code to handle the case where the
> native page size is >= 64k, but it unintentionally broke the head grow
> case.
>
> To fix this, add an offset field to struct bnxt_sw_rx_bd, mirroring the
> existing offset field in struct bnxt_sw_rx_agg_bd. Populate it on
> allocation and preserve it on reuse.
>
> In bnxt_rx_multi_page_skb, use the newly added offset field to compute
> the fragment start and pass that to napi_build_skb. Adjust the layout
> with skb_reserve.
>
> There are two cases, the non-adjustment case and the adjustment case.
>
> In both cases, the skb is built at page_address(page) + offset to
> account for the case where the native page size >= 64K and skb_reserve
> is called with data_ptr - (page_address(page) + offset). That
> difference equals bp->rx_offset when data_ptr was not moved, or
> bp->rx_offset + xdp_adjust when XDP adjusted the head.
>
> Re-running the failing test with this commit applied causes the test to
> run successfully to completion.
>
> The other rx_skb_func implementations don't have this issue.
>
> Fixes: f6974b4c2d8e ("bnxt_en: Fix page pool logic for page size >= 64K")
> Signed-off-by: Joe Damato <joe@dama.to>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
As a side note, struct bnxt_tpa_info is similar to struct
bnxt_sw_rx_bd, but it doesn't require the offset field because we
always allocate frags for TPA. Thanks.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5469 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH net v2] bnxt: fix head underflow on XDP head-grow
2026-06-09 20:44 [PATCH net v2] bnxt: fix head underflow on XDP head-grow Joe Damato
2026-06-09 21:57 ` Michael Chan
@ 2026-06-10 20:45 ` sashiko-bot
1 sibling, 0 replies; 3+ messages in thread
From: sashiko-bot @ 2026-06-10 20:45 UTC (permalink / raw)
To: Joe Damato; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] If `bp->rx_skb_func()` fails, `bnxt_rx_pkt()` skips aggregate buffer cleanup, permanently leaking `rx_agg_ring` hardware slots and `netmem` references.
--
commit 374adfb9335a0a96705993cee5701e4d6ce2423c
Author: Joe Damato <joe@dama.to>
bnxt: fix head underflow on XDP head-grow
This commit fixes a head underflow bug in bnxt_rx_multi_page_skb() when
handling XDP adjust head grow cases. It accomplishes this by adding an offset
field to struct bnxt_sw_rx_bd and using it to compute the correct fragment
start address.
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 35e1f8f663c78..12e73ffff6c3d 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
[ ... ]
> @@ -1185,13 +1191,13 @@ static struct sk_buff *bnxt_rx_multi_page_skb(struct bnxt *bp,
> dma_addr -= bp->rx_dma_offset;
> dma_sync_single_for_cpu(&bp->pdev->dev, dma_addr, rxr->rx_page_size,
> bp->rx_dir);
> - skb = napi_build_skb(data_ptr - bp->rx_offset, rxr->rx_page_size);
> + skb = napi_build_skb(frag_start, rxr->rx_page_size);
> if (!skb) {
> page_pool_recycle_direct(rxr->page_pool, page);
> return NULL;
[Severity: High]
This isn't a bug introduced by this patch, but does returning NULL here
leak aggregate buffers in bnxt_rx_pkt()?
Looking at bnxt_rx_pkt(), when bp->rx_skb_func() (which maps to
bnxt_rx_multi_page_skb() in page mode) returns NULL, the code jumps
to oom_next_rx:
drivers/net/ethernet/broadcom/bnxt/bnxt.c:bnxt_rx_pkt()
skb = bp->rx_skb_func(bp, rxr, cons, data, data_ptr, dma_addr,
payload | len);
if (!skb)
goto oom_next_rx;
If the packet had aggregate buffers (agg_bufs > 0), this bypasses the
cleanup block later in the function that would recycle these buffers via
bnxt_reuse_rx_agg_bufs().
Can this omission cause a permanent leak of hardware aggregate slots and
netmem references, eventually wedging the RX ring under memory pressure?
> }
> skb_mark_for_recycle(skb);
> - skb_reserve(skb, bp->rx_offset);
> + skb_reserve(skb, data_ptr - (u8 *)frag_start);
> __skb_put(skb, len);
>
> return skb;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609204458.2237787-2-joe@dama.to?part=1
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-06-10 20:45 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-09 20:44 [PATCH net v2] bnxt: fix head underflow on XDP head-grow Joe Damato
2026-06-09 21:57 ` Michael Chan
2026-06-10 20:45 ` sashiko-bot
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.