netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
To: arthur@arthurfabre.com
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
	jakub@cloudflare.com, hawk@kernel.org, yan@cloudflare.com,
	jbrandeburg@cloudflare.com, thoiland@redhat.com,
	lbiancon@redhat.com, Arthur Fabre <afabre@cloudflare.com>
Subject: Re: [PATCH RFC bpf-next 16/20] trait: Support sk_buffs
Date: Mon, 10 Mar 2025 12:45:03 +0100	[thread overview]
Message-ID: <Z87Qv2Jf3p3MeXRC@lore-desk> (raw)
In-Reply-To: <20250305-afabre-traits-010-rfc2-v1-16-d0ecfb869797@cloudflare.com>

[-- Attachment #1: Type: text/plain, Size: 5394 bytes --]

> From: Arthur Fabre <afabre@cloudflare.com>
> 
> Hide the space used by traits from skb_headroom(): that space isn't
> actually usable.
> 
> Preserve the trait store in pskb_expand_head() by copying it ahead of
> the new headroom. The struct xdp_frame at the start of the headroom
> isn't needed anymore, so we can overwrite it with traits, and introduce
> a new flag to indicate traits are stored at the start of the headroom.
> 
> Cloned skbs share the same packet data and headroom as the original skb,
> so changes to traits in one would be reflected in the other.
> Is that ok?
> Are there cases where we would want a clone to have different traits?
> For now, prevent clones from using traits.
> 
> Signed-off-by: Arthur Fabre <afabre@cloudflare.com>
> ---
>  include/linux/skbuff.h | 25 +++++++++++++++++++++++--
>  net/core/skbuff.c      | 25 +++++++++++++++++++++++--
>  2 files changed, 46 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index d7dfee152ebd26ce87a230222e94076aca793adc..886537508789202339c925b5613574de67b7e43c 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -39,6 +39,7 @@
>  #include <net/net_debug.h>
>  #include <net/dropreason-core.h>
>  #include <net/netmem.h>
> +#include <net/trait.h>
>  
>  /**
>   * DOC: skb checksums
> @@ -729,6 +730,8 @@ enum skb_traits_type {
>  	SKB_TRAITS_NONE,
>  	/* Trait store in headroom, offset by sizeof(struct xdp_frame) */
>  	SKB_TRAITS_AFTER_XDP,
> +	/* Trait store at start of headroom */
> +	SKB_TRAITS_AT_HEAD,
>  };
>  
>  /**
> @@ -1029,7 +1032,7 @@ struct sk_buff {
>  	__u8			csum_not_inet:1;
>  #endif
>  	__u8			unreadable:1;
> -	__u8			traits_type:1;	/* See enum skb_traits_type */
> +	__u8			traits_type:2;	/* See enum skb_traits_type */
>  #if defined(CONFIG_NET_SCHED) || defined(CONFIG_NET_XGRESS)
>  	__u16			tc_index;	/* traffic control index */
>  #endif
> @@ -2836,6 +2839,18 @@ static inline void *pskb_pull(struct sk_buff *skb, unsigned int len)
>  
>  void skb_condense(struct sk_buff *skb);
>  
> +static inline void *skb_traits(const struct sk_buff *skb)
> +{
> +	switch (skb->traits_type) {
> +	case SKB_TRAITS_AFTER_XDP:
> +		return skb->head + _XDP_FRAME_SIZE;
> +	case SKB_TRAITS_AT_HEAD:
> +		return skb->head;
> +	default:
> +		return NULL;
> +	}
> +}
> +
>  /**
>   *	skb_headroom - bytes at buffer head
>   *	@skb: buffer to check
> @@ -2844,7 +2859,13 @@ void skb_condense(struct sk_buff *skb);
>   */
>  static inline unsigned int skb_headroom(const struct sk_buff *skb)
>  {
> -	return skb->data - skb->head;
> +	int trait_size = 0;
> +	void *traits = skb_traits(skb);
> +
> +	if (traits)
> +		trait_size = traits_size(traits);
> +
> +	return skb->data - skb->head - trait_size;

I am not fully aware of all possible use-cases, but do we really need to
store hw medata traits (e.g. hw rx checksum or hw rx hash) in the skb
headroom when we convert the xdp_frame/xdp_buff in the skb? All of these
fields already have dedicated fields in the skb struct. Moreover, we need
to set them in order to have a real performance improvements when we execute
XDP_PASS. Something like:

https://lore.kernel.org/bpf/01ce17910fdd7b693c23132663fa884d5ec7f440.1726935917.git.lorenzo@kernel.org/

Regards,
Lorenzo

>  }
>  
>  /**
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 7b03b64fdcb276f68ce881d1d8da8e4c6b897efc..83f58517738e8ff12990c28b09336ed44f4be32a 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -1515,6 +1515,19 @@ static struct sk_buff *__skb_clone(struct sk_buff *n, struct sk_buff *skb)
>  	atomic_inc(&(skb_shinfo(skb)->dataref));
>  	skb->cloned = 1;
>  
> +	/* traits would end up shared with the clone,
> +	 * and edits would be reflected there.
> +	 *
> +	 * Is that ok? What if the original skb and the clone take different paths?
> +	 * Does that even happen?
> +	 *
> +	 * If that's not ok, we could copy the traits and store them in an extension header
> +	 * for clones.
> +	 *
> +	 * For now, pretend the clone doesn't have any traits.
> +	 */
> +	skb->traits_type = SKB_TRAITS_NONE;
> +
>  	return n;
>  #undef C
>  }
> @@ -2170,7 +2183,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
>  	unsigned int osize = skb_end_offset(skb);
>  	unsigned int size = osize + nhead + ntail;
>  	long off;
> -	u8 *data;
> +	u8 *data, *head;
>  	int i;
>  
>  	BUG_ON(nhead < 0);
> @@ -2187,10 +2200,18 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
>  		goto nodata;
>  	size = SKB_WITH_OVERHEAD(size);
>  
> +	head = skb->head;
> +	if (skb->traits_type != SKB_TRAITS_NONE) {
> +		head = skb_traits(skb) + traits_size(skb_traits(skb));
> +		/* struct xdp_frame isn't needed in the headroom, drop it */
> +		memcpy(data, skb_traits(skb), traits_size(skb_traits(skb)));
> +		skb->traits_type = SKB_TRAITS_AT_HEAD;
> +	}
> +
>  	/* Copy only real data... and, alas, header. This should be
>  	 * optimized for the cases when header is void.
>  	 */
> -	memcpy(data + nhead, skb->head, skb_tail_pointer(skb) - skb->head);
> +	memcpy(data + nhead, head, skb_tail_pointer(skb) - head);
>  
>  	memcpy((struct skb_shared_info *)(data + size),
>  	       skb_shinfo(skb),
> 
> -- 
> 2.43.0
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

  reply	other threads:[~2025-03-10 11:45 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-05 14:31 [PATCH RFC bpf-next 00/20] traits: Per packet metadata KV store arthur
2025-03-05 14:31 ` [PATCH RFC bpf-next 01/20] trait: limited KV store for packet metadata arthur
2025-03-07  6:36   ` Alexei Starovoitov
2025-03-07 11:14     ` Arthur Fabre
2025-03-07 17:29       ` Alexei Starovoitov
2025-03-10 14:45         ` Arthur Fabre
2025-03-07 19:24   ` Jakub Sitnicki
2025-03-05 14:31 ` [PATCH RFC bpf-next 02/20] trait: XDP support arthur
2025-03-07 19:13   ` Lorenzo Bianconi
2025-03-10 15:50     ` Arthur Fabre
2025-03-05 14:32 ` [PATCH RFC bpf-next 03/20] trait: basic XDP selftest arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 04/20] trait: basic XDP benchmark arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 05/20] trait: Replace memcpy calls with inline copies arthur
2025-03-10 10:50   ` Lorenzo Bianconi
2025-03-10 15:52     ` Arthur Fabre
2025-03-10 22:15   ` David Laight
2025-03-05 14:32 ` [PATCH RFC bpf-next 06/20] trait: Replace memmove calls with inline move arthur
2025-03-06 10:14   ` Jesper Dangaard Brouer
2025-03-05 14:32 ` [PATCH RFC bpf-next 07/20] xdp: Track if metadata is supported in xdp_frame <> xdp_buff conversions arthur
2025-03-05 15:24   ` Alexander Lobakin
2025-03-05 17:02     ` Arthur Fabre
2025-03-06 11:12       ` Jesper Dangaard Brouer
2025-03-10 11:10         ` Lorenzo Bianconi
2025-03-05 14:32 ` [PATCH RFC bpf-next 08/20] trait: Propagate presence of traits to sk_buff arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 09/20] bnxt: Propagate trait presence to skb arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 10/20] ice: " arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 11/20] veth: " arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 12/20] virtio_net: " arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 13/20] mlx5: move xdp_buff scope one level up arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 14/20] mlx5: Propagate trait presence to skb arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 15/20] xdp generic: " arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 16/20] trait: Support sk_buffs arthur
2025-03-10 11:45   ` Lorenzo Bianconi [this message]
2025-03-05 14:32 ` [PATCH RFC bpf-next 17/20] trait: Allow socket filters to access traits arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 18/20] trait: registration API arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 19/20] trait: Sync linux/bpf.h to tools/ for trait registration arthur
2025-03-05 14:32 ` [PATCH RFC bpf-next 20/20] trait: register traits in benchmarks and tests arthur

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z87Qv2Jf3p3MeXRC@lore-desk \
    --to=lorenzo.bianconi@redhat.com \
    --cc=afabre@cloudflare.com \
    --cc=arthur@arthurfabre.com \
    --cc=bpf@vger.kernel.org \
    --cc=hawk@kernel.org \
    --cc=jakub@cloudflare.com \
    --cc=jbrandeburg@cloudflare.com \
    --cc=lbiancon@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=thoiland@redhat.com \
    --cc=yan@cloudflare.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).