BPF List
From: Paolo Abeni <pabeni@redhat.com>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	bjorn@kernel.org, magnus.karlsson@intel.com,
	maciej.fijalkowski@intel.com, jonathan.lemon@gmail.com,
	sdf@fomichev.me, ast@kernel.org, daniel@iogearbox.net,
	hawk@kernel.org, john.fastabend@gmail.com, bpf@vger.kernel.org,
	netdev@vger.kernel.org, Jason Xing <kernelxing@tencent.com>
Subject: Re: [PATCH net-next v3] xsk: skip validating skb list in xmit path
Date: Thu, 27 Nov 2025 18:58:18 +0100
Message-ID: <f8d6dbe0-b213-4990-a8af-2f95d25d21be@redhat.com>
In-Reply-To: <CAL+tcoDdntkJ8SFaqjPvkJoCDwiitqsCNeFUq7CYa_fajPQL4A@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3317 bytes --]

On 11/27/25 1:49 PM, Jason Xing wrote:
> On Thu, Nov 27, 2025 at 8:02 PM Paolo Abeni <pabeni@redhat.com> wrote:
>> On 11/25/25 12:57 PM, Jason Xing wrote:
>>> This patch also removes total ~4% consumption which can be observed
>>> by perf:
>>> |--2.97%--validate_xmit_skb
>>> |          |
>>> |           --1.76%--netif_skb_features
>>> |                     |
>>> |                      --0.65%--skb_network_protocol
>>> |
>>> |--1.06%--validate_xmit_xfrm
>>>
>>> The above result has been verified on different NICs, like I40E. I
>>> managed to see the number is going up by 4%.
>>
>> I must admit this delta is surprising, and does not fit my experience in
>> slightly different scenarios with the plain UDP TX path.
> 
> My take is that when the path is extremely hot, even the mathematics
> calculation could cause unexpected overhead. You can see the pps is
> now over 2,000,000. The reason why I say this is because I've done a
> few similar tests to verify this thought.

Uhm... 2M is not that huge. Prior to the H/W vulnerability fallout
(Spectre and friends), reasonably good H/W (2016-era) could do ~2Mpps
with a single plain UDP socket.
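
For scale, here is a minimal sketch of the per-packet time budget implied by the figures in this thread. It assumes a single TX path running at the round 2,000,000 pps number and takes the ~4% perf share at face value; the helper names are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Nanoseconds available per packet at a given packet rate.
 * Assumption: one TX path handles the whole rate. */
static double ns_per_packet(double pps)
{
	return 1e9 / pps;
}

/* Nanoseconds per packet attributed to a code path that accounts
 * for `share` (0.0 .. 1.0) of the per-packet budget. */
static double ns_for_share(double pps, double share)
{
	return ns_per_packet(pps) * share;
}

/* Print the budget for the numbers quoted in the thread. */
static void print_budget(void)
{
	double pps = 2000000.0;	/* rate quoted in the thread */
	double share = 0.04;	/* ~4% reported by perf */

	printf("budget: %.0f ns/packet\n", ns_per_packet(pps));
	printf("share:  %.0f ns/packet\n", ns_for_share(pps, share));
}
```

Under those assumptions the budget is 500 ns per packet, and a 4% share corresponds to about 20 ns per packet.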

Also, validate_xmit_xfrm() should be basically a no-op; possibly some bad
luck with icache?

Could you please try the attached patch instead?

It should not be as good as skipping the whole validation, but it should
give some measurable gain.

>>> [1] - analysis of the validate_xmit_skb()
>>> 1. validate_xmit_unreadable_skb()
>>>    xsk doesn't initialize skb->unreadable, so the function will not free
>>>    the skb.
>>> 2. validate_xmit_vlan()
>>>    xsk also doesn't initialize skb->vlan_all.
>>> 3. sk_validate_xmit_skb()
>>>    skb from xsk_build_skb() doesn't have either sk_validate_xmit_skb or
>>>    sk_state, so the skb will not be validated.
>>> 4. netif_needs_gso()
>>>    af_xdp doesn't support gso/tso.
>>> 5. skb_needs_linearize() && __skb_linearize()
>>>    skb doesn't have frag_list as always, so skb_has_frag_list() returns
>>>    false. In copy mode, skb can put more data in the frags[] that can be
>>>    found in xsk_build_skb_zerocopy().
>>
>> I'm not sure I parse this last sentence correctly, could you please
>> re-phrase?
>>
>> I read it as the xsk xmit path could build skb with nr_frags > 0.
>> That in turn will need validation from
>> validate_xmit_skb()/skb_needs_linearize() depending on the egress device
>> (lack of NETIF_F_SG), regardless of any other offload required.
> 
> There are two paths where the allocation of frags happen:
> 1) xsk_build_skb() -> xsk_build_skb_zerocopy() -> skb_fill_page_desc()
> -> shinfo->frags[i]
> 2) xsk_build_skb() -> skb_add_rx_frag() -> ... -> shinfo->frags[i]
> 
> Neither of them touches skb->frag_list, which means frag_list is NULL.
> IIUC, there is no place where frag_list is used (which I actually
> tested). We can see skb_needs_linearize() needs to check
> skb_has_frag_list() first, so it will not proceed after seeing it
> return false.

skb_needs_linearize():
https://elixir.bootlin.com/linux/v6.18-rc7/source/include/linux/skbuff.h#L4322

	return skb_is_nonlinear(skb) &&
	       ((skb_has_frag_list(skb) && !(features & NETIF_F_FRAGLIST)) ||
		(skb_shinfo(skb)->nr_frags && !(features & NETIF_F_SG)));

can return true even if `!skb_has_frag_list(skb)`.

I think you still need to call validate_xmit_skb().
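
To make the point concrete, here is a small user-space model of the boolean expression quoted above. It is only a sketch: `struct model_skb`, the `MODEL_F_*` bits and the function names are hypothetical stand-ins, not kernel definitions, and the model assumes that an skb counts as nonlinear when its `data_len` is non-zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature bits for this model; not the kernel's values. */
#define MODEL_F_SG		(1u << 0)
#define MODEL_F_FRAGLIST	(1u << 1)

/* Simplified stand-in for the sk_buff fields the check reads. */
struct model_skb {
	unsigned int data_len;	/* bytes held outside the linear area */
	unsigned int nr_frags;	/* entries in shinfo->frags[] */
	void *frag_list;	/* NULL on the xsk paths discussed above */
};

/* Assumption: nonlinear means some data lives outside the linear area. */
static bool model_is_nonlinear(const struct model_skb *skb)
{
	return skb->data_len != 0;
}

/* Same shape as the quoted expression: either branch of the || can
 * trigger linearization on its own. */
static bool model_needs_linearize(const struct model_skb *skb,
				  unsigned int features)
{
	return model_is_nonlinear(skb) &&
	       ((skb->frag_list != NULL && !(features & MODEL_F_FRAGLIST)) ||
		(skb->nr_frags && !(features & MODEL_F_SG)));
}
```

With `frag_list == NULL`, `nr_frags == 1` and a non-zero `data_len`, the model returns true when the feature mask lacks `MODEL_F_SG` and false when it is set. So a NULL frag_list alone does not rule out linearization.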

/P


[-- Attachment #2: sec_path.patch --]
[-- Type: text/x-patch, Size: 383 bytes --]

diff --git a/net/core/dev.c b/net/core/dev.c
index 9094c0fb8c68..39516a5766e5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4030,7 +4030,8 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
 		}
 	}
 
-	skb = validate_xmit_xfrm(skb, features, again);
+	if (skb_sec_path(skb))
+		skb = validate_xmit_xfrm(skb, features, again);
 
 	return skb;
 
