* [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5
@ 2025-09-15 22:58 Amery Hung
2025-09-15 22:58 ` [PATCH net v2 1/2] net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ Amery Hung
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Amery Hung @ 2025-09-15 22:58 UTC
To: netdev
Cc: bpf, andrew+netdev, davem, edumazet, pabeni, kuba, martin.lau,
noren, dtatulea, saeedm, tariqt, mbloch, cpaasch, ameryhung,
kernel-team
v1 -> v2
- Simplify truesize calculation (Tariq)
- Narrow the scope of local variables (Tariq)
- Make truesize adjustment conditional (Tariq)
v1
- Separate the set from [0] (Dragos)
- Split legacy RQ and striding RQ fixes (Dragos)
- Drop conditional truesize and end frag ptr update (Dragos)
- Fix truesize calculation in striding RQ (Dragos)
- Fix the always zero headlen passed to __pskb_pull_tail() that
causes kernel panic (Nimrod)
Link: https://lore.kernel.org/bpf/20250910034103.650342-1-ameryhung@gmail.com/
---
Hi all,
This patchset, separated from [0], contains fixes to mlx5 when handling
non-linear xdp_buff. The driver currently generates skb based on
information obtained before the XDP program runs, such as the number of
fragments and the size of the linear data. However, the XDP program can
actually change them through bpf_xdp_adjust_{head,tail}(). Fix the bugs
by generating skb according to xdp_buff after the XDP program runs.
[0] https://lore.kernel.org/bpf/20250905173352.3759457-1-ameryhung@gmail.com/
---
Amery Hung (2):
net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy
RQ
net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for
striding RQ
.../net/ethernet/mellanox/mlx5/core/en_rx.c | 47 +++++++++++++++----
1 file changed, 38 insertions(+), 9 deletions(-)
--
2.47.3
* [PATCH net v2 1/2] net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ
2025-09-15 22:58 [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5 Amery Hung
@ 2025-09-15 22:58 ` Amery Hung
2025-09-15 22:58 ` [PATCH net v2 2/2] net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for striding RQ Amery Hung
2025-09-16 13:52 ` [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5 Tariq Toukan
2 siblings, 0 replies; 7+ messages in thread
From: Amery Hung @ 2025-09-15 22:58 UTC
To: netdev
Cc: bpf, andrew+netdev, davem, edumazet, pabeni, kuba, martin.lau,
noren, dtatulea, saeedm, tariqt, mbloch, cpaasch, ameryhung,
kernel-team
XDP programs can release xdp_buff fragments when calling
bpf_xdp_adjust_tail(). The driver currently assumes the number of
fragments to be unchanged and may generate an skb with a wrong truesize
or with invalid frags. Fix the bug by generating skb according to
xdp_buff after the XDP program runs.
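The accounting described above can be illustrated with a small user-space
sketch. This is not the driver code: the struct rx_state and the helper
rewind_after_xdp() are hypothetical names, and the sketch assumes every
fragment is charged the same frag_stride. Only the arithmetic (rewind the
end-fragment index and subtract the released buffers from truesize) follows
the hunk in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-packet state the driver tracks. */
struct rx_state {
	size_t end_frag;       /* index one past the last fragment in use (models `wi`) */
	unsigned int truesize; /* bytes of buffer memory charged to the skb */
};

/*
 * Rewind the state after an XDP program released fragments.
 * old_nr_frags: fragment count recorded before the program ran
 * new_nr_frags: fragment count read back after the program ran
 * frag_stride:  buffer bytes charged per fragment (assumed uniform here)
 */
static void rewind_after_xdp(struct rx_state *st, unsigned char old_nr_frags,
			     unsigned char new_nr_frags, unsigned int frag_stride)
{
	unsigned char nr_frags_free;

	/* Assumption of this sketch: a program can drop fragments, not add them. */
	assert(new_nr_frags <= old_nr_frags);

	nr_frags_free = old_nr_frags - new_nr_frags;
	if (nr_frags_free) {
		assert(st->end_frag >= nr_frags_free);
		st->end_frag -= nr_frags_free;
		st->truesize -= nr_frags_free * frag_stride;
	}
}
```

With four fragments of stride 2048 plus one linear buffer, dropping two
fragments moves end_frag from 4 to 2 and leaves a truesize of three strides.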
Fixes: ea5d49bdae8b ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ")
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../net/ethernet/mellanox/mlx5/core/en_rx.c | 24 ++++++++++++++-----
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index b8c609d91d11..fadf04564981 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1773,14 +1773,26 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
}
prog = rcu_dereference(rq->xdp_prog);
- if (prog && mlx5e_xdp_handle(rq, prog, mxbuf)) {
- if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
- struct mlx5e_wqe_frag_info *pwi;
+ if (prog) {
+ u8 nr_frags_free, old_nr_frags = sinfo->nr_frags;
+
+ if (mlx5e_xdp_handle(rq, prog, mxbuf)) {
+ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
+ struct mlx5e_wqe_frag_info *pwi;
+
+ wi -= old_nr_frags - sinfo->nr_frags;
+
+ for (pwi = head_wi; pwi < wi; pwi++)
+ pwi->frag_page->frags++;
+ }
+ return NULL; /* page/packet was consumed by XDP */
+ }
- for (pwi = head_wi; pwi < wi; pwi++)
- pwi->frag_page->frags++;
+ nr_frags_free = old_nr_frags - sinfo->nr_frags;
+ if (unlikely(nr_frags_free)) {
+ wi -= nr_frags_free;
+ truesize -= nr_frags_free * frag_info->frag_stride;
}
- return NULL; /* page/packet was consumed by XDP */
}
skb = mlx5e_build_linear_skb(
--
2.47.3
* [PATCH net v2 2/2] net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for striding RQ
2025-09-15 22:58 [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5 Amery Hung
2025-09-15 22:58 ` [PATCH net v2 1/2] net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ Amery Hung
@ 2025-09-15 22:58 ` Amery Hung
2025-09-16 13:52 ` [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5 Tariq Toukan
2 siblings, 0 replies; 7+ messages in thread
From: Amery Hung @ 2025-09-15 22:58 UTC
To: netdev
Cc: bpf, andrew+netdev, davem, edumazet, pabeni, kuba, martin.lau,
noren, dtatulea, saeedm, tariqt, mbloch, cpaasch, ameryhung,
kernel-team
XDP programs can change the layout of an xdp_buff through
bpf_xdp_adjust_tail() and bpf_xdp_adjust_head(). Therefore, the driver
cannot assume that the size of the linear data area or the fragments
stay unchanged. Fix the bug in mlx5 by generating skb according to
xdp_buff after XDP programs run.
Currently, when handling multi-buf XDP, the mlx5 driver assumes the
layout of an xdp_buff to be unchanged. That is, the linear data area
continues to be empty and fragments remain the same. This may cause
the driver to generate an erroneous skb or trigger a kernel
warning. When an XDP program has added linear data through
bpf_xdp_adjust_head(), the linear data is ignored because
mlx5e_build_linear_skb() builds an skb without linear data and the
driver then pulls data from fragments to fill the linear data area.
When an XDP program has shrunk the non-linear data through
bpf_xdp_adjust_tail(), the delta passed to __pskb_pull_tail() may
exceed the actual non-linear data size and trigger the BUG_ON in it.
To fix the issue, first record the original number of fragments. If the
number of fragments changes after the XDP program runs, rewind the end
fragment pointer by the difference and recalculate the truesize. Then,
build the skb with the linear data area matching the xdp_buff. Finally,
only pull data in if there is non-linear data and fill the linear part
up to 256 bytes.
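The two calculations in this fix can be modelled in user space as follows.
This is a sketch under stated assumptions, not the driver code:
truesize_after_xdp() and pull_len() are hypothetical helpers,
SKETCH_PAGE_SIZE assumes 4 KiB pages, and SKETCH_MAX_HEAD is the 256-byte
limit mentioned above. Returning 0 when the linear part already reaches the
limit is a choice made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */
#define SKETCH_MAX_HEAD  256u  /* linear-part limit named in the commit message */

/* Round x up to a multiple of a, where a is a power of two. */
static uint32_t align_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Recalculate truesize after nr_frags_free fragments were released.
 * The last fragment was charged its stride-aligned size; every other
 * released fragment is assumed to have been charged a full page.
 */
static uint32_t truesize_after_xdp(uint32_t truesize, uint32_t nr_frags_free,
				   uint32_t last_frag_bytes, uint32_t log_stride_sz)
{
	uint32_t released;

	if (nr_frags_free == 0)
		return truesize;

	released = align_up(last_frag_bytes, 1u << log_stride_sz) +
		   (nr_frags_free - 1) * SKETCH_PAGE_SIZE;
	assert(released <= truesize);
	return truesize - released;
}

/*
 * Bytes to pull from the fragments into the linear part: fill it up to
 * SKETCH_MAX_HEAD, but never ask for more than the non-linear data holds.
 */
static uint32_t pull_len(uint32_t linear_len, uint32_t data_len)
{
	uint32_t room;

	if (data_len == 0 || linear_len >= SKETCH_MAX_HEAD)
		return 0;

	room = SKETCH_MAX_HEAD - linear_len;
	return room < data_len ? room : data_len;
}
```

Bounding the pull length by the non-linear data size is what keeps the
request within what the fragments actually hold after the program has shrunk
the tail.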
Fixes: f52ac7028bec ("net/mlx5e: RX, Add XDP multi-buffer support in Striding RQ")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../net/ethernet/mellanox/mlx5/core/en_rx.c | 23 ++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index fadf04564981..aa1368698a40 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -2016,6 +2016,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
u32 byte_cnt = cqe_bcnt;
struct skb_shared_info *sinfo;
unsigned int truesize = 0;
+ u32 pg_consumed_bytes;
struct bpf_prog *prog;
struct sk_buff *skb;
u32 linear_frame_sz;
@@ -2069,7 +2070,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
while (byte_cnt) {
/* Non-linear mode, hence non-XSK, which always uses PAGE_SIZE. */
- u32 pg_consumed_bytes = min_t(u32, PAGE_SIZE - frag_offset, byte_cnt);
+ pg_consumed_bytes = min_t(u32, PAGE_SIZE - frag_offset, byte_cnt);
if (test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state))
truesize += pg_consumed_bytes;
@@ -2085,10 +2086,15 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
}
if (prog) {
+ u8 nr_frags_free, old_nr_frags = sinfo->nr_frags;
+ u32 len;
+
if (mlx5e_xdp_handle(rq, prog, mxbuf)) {
if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
struct mlx5e_frag_page *pfp;
+ frag_page -= old_nr_frags - sinfo->nr_frags;
+
for (pfp = head_page; pfp < frag_page; pfp++)
pfp->frags++;
@@ -2099,9 +2105,18 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
return NULL; /* page/packet was consumed by XDP */
}
+ nr_frags_free = old_nr_frags - sinfo->nr_frags;
+ if (unlikely(nr_frags_free)) {
+ frag_page -= nr_frags_free;
+ truesize -= ALIGN(pg_consumed_bytes, BIT(rq->mpwqe.log_stride_sz)) +
+ (nr_frags_free - 1) * PAGE_SIZE;
+ }
+
+ len = mxbuf->xdp.data_end - mxbuf->xdp.data;
+
skb = mlx5e_build_linear_skb(
rq, mxbuf->xdp.data_hard_start, linear_frame_sz,
- mxbuf->xdp.data - mxbuf->xdp.data_hard_start, 0,
+ mxbuf->xdp.data - mxbuf->xdp.data_hard_start, len,
mxbuf->xdp.data - mxbuf->xdp.data_meta);
if (unlikely(!skb)) {
mlx5e_page_release_fragmented(rq->page_pool,
@@ -2126,8 +2141,10 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
do
pagep->frags++;
while (++pagep < frag_page);
+
+ headlen = min_t(u16, MLX5E_RX_MAX_HEAD - len, skb->data_len);
+ __pskb_pull_tail(skb, headlen);
}
- __pskb_pull_tail(skb, headlen);
} else {
dma_addr_t addr;
--
2.47.3
* Re: [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5
2025-09-15 22:58 [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5 Amery Hung
2025-09-15 22:58 ` [PATCH net v2 1/2] net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ Amery Hung
2025-09-15 22:58 ` [PATCH net v2 2/2] net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for striding RQ Amery Hung
@ 2025-09-16 13:52 ` Tariq Toukan
2025-09-21 11:24 ` Tariq Toukan
2 siblings, 1 reply; 7+ messages in thread
From: Tariq Toukan @ 2025-09-16 13:52 UTC
To: Amery Hung, netdev
Cc: bpf, andrew+netdev, davem, edumazet, pabeni, kuba, martin.lau,
noren, dtatulea, saeedm, tariqt, mbloch, cpaasch, kernel-team
On 16/09/2025 1:58, Amery Hung wrote:
> v1 -> v2
> - Simplify truesize calculation (Tariq)
> - Narrow the scope of local variables (Tariq)
> - Make truesize adjustment conditional (Tariq)
>
> v1
> - Separate the set from [0] (Dragos)
> - Split legacy RQ and striding RQ fixes (Dragos)
> - Drop conditional truesize and end frag ptr update (Dragos)
> - Fix truesize calculation in striding RQ (Dragos)
> - Fix the always zero headlen passed to __pskb_pull_tail() that
> causes kernel panic (Nimrod)
>
> Link: https://lore.kernel.org/bpf/20250910034103.650342-1-ameryhung@gmail.com/
>
> ---
>
> Hi all,
>
> This patchset, separated from [0], contains fixes to mlx5 when handling
> non-linear xdp_buff. The driver currently generates skb based on
> information obtained before the XDP program runs, such as the number of
> fragments and the size of the linear data. However, the XDP program can
> actually change them through bpf_xdp_adjust_{head,tail}(). Fix the bugs
> by generating skb according to xdp_buff after the XDP program runs.
>
> [0] https://lore.kernel.org/bpf/20250905173352.3759457-1-ameryhung@gmail.com/
>
> ---
>
> Amery Hung (2):
> net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy
> RQ
> net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for
> striding RQ
>
> .../net/ethernet/mellanox/mlx5/core/en_rx.c | 47 +++++++++++++++----
> 1 file changed, 38 insertions(+), 9 deletions(-)
>
Thanks for your patches.
They LGTM.
As these are touching a sensitive area, I am taking them into internal
functional and perf testing.
I'll update with results once completed.
* Re: [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5
2025-09-16 13:52 ` [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5 Tariq Toukan
@ 2025-09-21 11:24 ` Tariq Toukan
2025-09-22 17:51 ` Amery Hung
2025-09-22 18:03 ` Jakub Kicinski
0 siblings, 2 replies; 7+ messages in thread
From: Tariq Toukan @ 2025-09-21 11:24 UTC
To: Amery Hung, netdev
Cc: bpf, andrew+netdev, davem, edumazet, pabeni, kuba, martin.lau,
noren, dtatulea, saeedm, tariqt, mbloch, cpaasch, kernel-team
On 16/09/2025 16:52, Tariq Toukan wrote:
>
>
> On 16/09/2025 1:58, Amery Hung wrote:
>> v1 -> v2
>> - Simplify truesize calculation (Tariq)
>> - Narrow the scope of local variables (Tariq)
>> - Make truesize adjustment conditional (Tariq)
>>
>> v1
>> - Separate the set from [0] (Dragos)
>> - Split legacy RQ and striding RQ fixes (Dragos)
>> - Drop conditional truesize and end frag ptr update (Dragos)
>> - Fix truesize calculation in striding RQ (Dragos)
>> - Fix the always zero headlen passed to __pskb_pull_tail() that
>> causes kernel panic (Nimrod)
>>
>> Link: https://lore.kernel.org/bpf/20250910034103.650342-1-
>> ameryhung@gmail.com/
>>
>> ---
>>
>> Hi all,
>>
>> This patchset, separated from [0], contains fixes to mlx5 when handling
>> non-linear xdp_buff. The driver currently generates skb based on
>> information obtained before the XDP program runs, such as the number of
>> fragments and the size of the linear data. However, the XDP program can
>> actually change them through bpf_xdp_adjust_{head,tail}(). Fix the bugs
>> by generating skb according to xdp_buff after the XDP program runs.
>>
>> [0] https://lore.kernel.org/bpf/20250905173352.3759457-1-
>> ameryhung@gmail.com/
>>
>> ---
>>
>> Amery Hung (2):
>> net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy
>> RQ
>> net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for
>> striding RQ
>>
>> .../net/ethernet/mellanox/mlx5/core/en_rx.c | 47 +++++++++++++++----
>> 1 file changed, 38 insertions(+), 9 deletions(-)
>>
>
> Thanks for your patches.
> They LGTM.
>
> As these are touching a sensitive area, I am taking them into internal
> functional and perf testing.
> I'll update with results once completed.
>
Initial testing passed.
Thanks for your patches.
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
* Re: [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5
2025-09-21 11:24 ` Tariq Toukan
@ 2025-09-22 17:51 ` Amery Hung
2025-09-22 18:03 ` Jakub Kicinski
1 sibling, 0 replies; 7+ messages in thread
From: Amery Hung @ 2025-09-22 17:51 UTC
To: Tariq Toukan
Cc: netdev, bpf, andrew+netdev, davem, edumazet, pabeni, kuba,
martin.lau, noren, dtatulea, saeedm, tariqt, mbloch, cpaasch,
kernel-team
On Sun, Sep 21, 2025 at 4:24 AM Tariq Toukan <ttoukan.linux@gmail.com> wrote:
>
>
>
> On 16/09/2025 16:52, Tariq Toukan wrote:
> >
> >
> > On 16/09/2025 1:58, Amery Hung wrote:
> >> v1 -> v2
> >> - Simplify truesize calculation (Tariq)
> >> - Narrow the scope of local variables (Tariq)
> >> - Make truesize adjustment conditional (Tariq)
> >>
> >> v1
> >> - Separate the set from [0] (Dragos)
> >> - Split legacy RQ and striding RQ fixes (Dragos)
> >> - Drop conditional truesize and end frag ptr update (Dragos)
> >> - Fix truesize calculation in striding RQ (Dragos)
> >> - Fix the always zero headlen passed to __pskb_pull_tail() that
> >> causes kernel panic (Nimrod)
> >>
> >> Link: https://lore.kernel.org/bpf/20250910034103.650342-1-
> >> ameryhung@gmail.com/
> >>
> >> ---
> >>
> >> Hi all,
> >>
> >> This patchset, separated from [0], contains fixes to mlx5 when handling
> >> non-linear xdp_buff. The driver currently generates skb based on
> >> information obtained before the XDP program runs, such as the number of
> >> fragments and the size of the linear data. However, the XDP program can
> >> actually change them through bpf_xdp_adjust_{head,tail}(). Fix the bugs
> >> by generating skb according to xdp_buff after the XDP program runs.
> >>
> >> [0] https://lore.kernel.org/bpf/20250905173352.3759457-1-
> >> ameryhung@gmail.com/
> >>
> >> ---
> >>
> >> Amery Hung (2):
> >> net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy
> >> RQ
> >> net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for
> >> striding RQ
> >>
> >> .../net/ethernet/mellanox/mlx5/core/en_rx.c | 47 +++++++++++++++----
> >> 1 file changed, 38 insertions(+), 9 deletions(-)
> >>
> >
> > Thanks for your patches.
> > They LGTM.
> >
> > As these are touching a sensitive area, I am taking them into internal
> > functional and perf testing.
> > I'll update with results once completed.
> >
>
> Initial testing passed.
> Thanks for your patches.
Thanks for testing and the review!
>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
>
* Re: [PATCH net v2 0/2] Fix generating skb from non-linear xdp_buff for mlx5
2025-09-21 11:24 ` Tariq Toukan
2025-09-22 17:51 ` Amery Hung
@ 2025-09-22 18:03 ` Jakub Kicinski
1 sibling, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2025-09-22 18:03 UTC
To: Tariq Toukan
Cc: Amery Hung, netdev, bpf, andrew+netdev, davem, edumazet, pabeni,
martin.lau, noren, dtatulea, saeedm, tariqt, mbloch, cpaasch,
kernel-team
On Sun, 21 Sep 2025 14:24:53 +0300 Tariq Toukan wrote:
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
To be clear - you have to take these via your tree now.