From: "Rao, Nikhil" <nikhirao@amd.com>
To: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Cc: netdev@vger.kernel.org, magnus.karlsson@intel.com,
sdf@fomichev.me, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, horms@kernel.org,
kerneljasonxing@gmail.com, ",nikhil.rao"@amd.com
Subject: Re: [PATCH net v2 2/2] xsk: Fix zero-copy AF_XDP fragment drop
Date: Tue, 10 Feb 2026 13:10:37 -0800 [thread overview]
Message-ID: <64112a86-156e-43cf-8fa5-8038ea0037e9@amd.com> (raw)
In-Reply-To: <aYtaioooXhsXVL0G@boxer>
Sorry for the duplicate - sending without html this time.
On 2/10/2026 8:19 AM, Maciej Fijalkowski wrote:
>
> On Mon, Feb 09, 2026 at 10:55:16PM +0100, Maciej Fijalkowski wrote:
>> On Mon, Feb 09, 2026 at 06:24:51PM +0000, Nikhil P. Rao wrote:
>>> AF_XDP should ensure that only a complete packet is sent to application.
>>> In the zero-copy case, if the Rx queue gets full as fragments are being
>>> enqueued, the remaining fragments are dropped.
>>
>> All of the descs that current xdp_buff was carrying will be dropped which
>> is incorrect as some of them have been exposed to Rx queue already and I
>> don't see the error path that would rewind them. So that's my
>> understanding of this issue.
>>
>> However, we were trying to keep the single-buf case as fast as we can, see
>> below.
>>
>>>
>>> Add a check to ensure that the Rx queue has enough space for all
>>> fragments of a packet before starting to enqueue them.
>>>
>>> Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
>>> Signed-off-by: Nikhil P. Rao <nikhil.rao@amd.com>
>>> ---
>>> net/xdp/xsk.c | 22 +++++++++++-----------
>>> 1 file changed, 11 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
>>> index f2ec4f78bbb6..b65be95abcdc 100644
>>> --- a/net/xdp/xsk.c
>>> +++ b/net/xdp/xsk.c
>>> @@ -166,15 +166,20 @@ static int xsk_rcv_zc(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
>>> u32 frags = xdp_buff_has_frags(xdp);
>>> struct xdp_buff_xsk *pos, *tmp;
>>> struct list_head *xskb_list;
>>> + u32 num_desc = 1;
>>> u32 contd = 0;
>>> - int err;
>>>
>>> - if (frags)
>>> + if (frags) {
>>> + num_desc = xdp_get_shared_info_from_buff(xdp)->nr_frags + 1;
>>> contd = XDP_PKT_CONTD;
>>> + }
>>>
>>> - err = __xsk_rcv_zc(xs, xskb, len, contd);
>>> - if (err)
>>> - goto err;
>>> + if (xskq_prod_nb_free(xs->rx, num_desc) < num_desc) {
>>
>> this will hurt single buf performance unfortunately, I'd rather have frag
>> part still executed separately. Did you measure what impact on throughput
>> this patch has?
>>
>> Further thought here is once we are sure about sufficient space in xsk
>> queue then we could skip sanity check that xskq_prod_reserve_desc()
>> contains. Look at batching that is done on Tx side.
>>
>> Please see what works best here. Whether keeping linear part execution
>> separate from frags + producing frags in a 'batched' way or including
>> linear part with this 'batched' production of descriptors.
>
> What I meant was patch below. However this is not a fix so I wouldn't
> incorporate it to your set. Maybe let's go with just processing linear
> part separately just like it used to be and then I can follow-up with this
> diff.
>
Agreed, this change can be implemented without any change to single buf
performance.
Let me know if this looks good to you:
if (frags) {
} else {
/* handle single buf */
}
/* handle multi-buf */
>>From 153e1bc5d2baf6328667956ae16d47103085eac8 Mon Sep 17 00:00:00 2001
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Date: Tue, 10 Feb 2026 13:39:17 +0000
> Subject: [PATCH bpf-next] xsk: avoid double checking against rx queue being
> full
>
prev parent reply other threads:[~2026-02-10 21:10 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-09 18:24 [PATCH net v2 0/2] xsk: Fixes for AF_XDP fragment handling Nikhil P. Rao
2026-02-09 18:24 ` [PATCH net v2 1/2] xsk: Fix fragment node deletion to prevent buffer leak Nikhil P. Rao
2026-02-09 21:29 ` Maciej Fijalkowski
2026-02-09 18:24 ` [PATCH net v2 2/2] xsk: Fix zero-copy AF_XDP fragment drop Nikhil P. Rao
2026-02-09 21:55 ` Maciej Fijalkowski
2026-02-10 16:19 ` Maciej Fijalkowski
2026-02-10 21:10 ` Rao, Nikhil [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=64112a86-156e-43cf-8fa5-8038ea0037e9@amd.com \
--to=nikhirao@amd.com \
--cc=",nikhil.rao"@amd.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=kerneljasonxing@gmail.com \
--cc=kuba@kernel.org \
--cc=maciej.fijalkowski@intel.com \
--cc=magnus.karlsson@intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox