BPF List
 help / color / mirror / Atom feed
From: Paolo Abeni <pabeni@redhat.com>
To: Alexander Lobakin <aleksander.lobakin@intel.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>
Cc: "Lorenzo Bianconi" <lorenzo@kernel.org>,
	"Daniel Xu" <dxu@dxuuu.xyz>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Toke Høiland-Jørgensen" <toke@kernel.org>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v2 5/8] net: skbuff: introduce napi_skb_cache_get_bulk()
Date: Thu, 9 Jan 2025 14:16:22 +0100	[thread overview]
Message-ID: <d97b6ec6-59fb-4123-9d96-27b9b32dc5cc@redhat.com> (raw)
In-Reply-To: <20250107152940.26530-6-aleksander.lobakin@intel.com>

On 1/7/25 4:29 PM, Alexander Lobakin wrote:
> Add a function to get an array of skbs from the NAPI percpu cache.
> It's supposed to be a drop-in replacement for
> kmem_cache_alloc_bulk(skbuff_head_cache, GFP_ATOMIC) and
> xdp_alloc_skb_bulk(GFP_ATOMIC). The difference (apart from the
> requirement to call it only from the BH) is that it tries to use
> as many NAPI cache entries for skbs as possible, and allocate new
> ones only if needed.
> 
> The logic is as follows:
> 
> * there is enough skbs in the cache: decache them and return to the
>   caller;
> * not enough: try refilling the cache first. If there is now enough
>   skbs, return;
> * still not enough: try allocating skbs directly to the output array
>   with %GFP_ZERO, maybe we'll be able to get some. If there's now
>   enough, return;
> * still not enough: return as many as we were able to obtain.
> 
> Most of times, if called from the NAPI polling loop, the first one will
> be true, sometimes (rarely) the second one. The third and the fourth --
> only under heavy memory pressure.
> It can save significant amounts of CPU cycles if there are GRO cycles
> and/or Tx completion cycles (anything that descends to
> napi_skb_cache_put()) happening on this CPU.
> 
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> Tested-by: Daniel Xu <dxu@dxuuu.xyz>
> ---
>  include/linux/skbuff.h |  1 +
>  net/core/skbuff.c      | 62 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 63 insertions(+)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index bb2b751d274a..1c089c7c14e1 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1315,6 +1315,7 @@ struct sk_buff *build_skb_around(struct sk_buff *skb,
>  				 void *data, unsigned int frag_size);
>  void skb_attempt_defer_free(struct sk_buff *skb);
>  
> +u32 napi_skb_cache_get_bulk(void **skbs, u32 n);
>  struct sk_buff *napi_build_skb(void *data, unsigned int frag_size);
>  struct sk_buff *slab_build_skb(void *data);
>  
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index a441613a1e6c..42eb31dcc9ce 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -367,6 +367,68 @@ static struct sk_buff *napi_skb_cache_get(void)
>  	return skb;
>  }
>  
> +/**
> + * napi_skb_cache_get_bulk - obtain a number of zeroed skb heads from the cache
> + * @skbs: pointer to an at least @n-sized array to fill with skb pointers
> + * @n: number of entries to provide
> + *
> + * Tries to obtain @n &sk_buff entries from the NAPI percpu cache and writes
> + * the pointers into the provided array @skbs. If there are less entries
> + * available, tries to replenish the cache and bulk-allocates the diff from
> + * the MM layer if needed.
> + * The heads are being zeroed with either memset() or %__GFP_ZERO, so they are
> + * ready for {,__}build_skb_around() and don't have any data buffers attached.
> + * Must be called *only* from the BH context.
> + *
> + * Return: number of successfully allocated skbs (@n if no actual allocation
> + *	   needed or kmem_cache_alloc_bulk() didn't fail).
> + */
> +u32 napi_skb_cache_get_bulk(void **skbs, u32 n)
> +{
> +	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
> +	u32 bulk, total = n;
> +
> +	local_lock_nested_bh(&napi_alloc_cache.bh_lock);
> +
> +	if (nc->skb_count >= n)
> +		goto get;

I (mis?)understood from the commit message this condition should be
likely, too?!?

> +	/* No enough cached skbs. Try refilling the cache first */
> +	bulk = min(NAPI_SKB_CACHE_SIZE - nc->skb_count, NAPI_SKB_CACHE_BULK);
> +	nc->skb_count += kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> +					       GFP_ATOMIC | __GFP_NOWARN, bulk,
> +					       &nc->skb_cache[nc->skb_count]);
> +	if (likely(nc->skb_count >= n))
> +		goto get;
> +
> +	/* Still not enough. Bulk-allocate the missing part directly, zeroed */
> +	n -= kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> +				   GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
> +				   n - nc->skb_count, &skbs[nc->skb_count]);

You should probably cap 'n' to NAPI_SKB_CACHE_SIZE. Also what about
latency spikes when n == 48 (should be the maximum possible with such
limit) here?

/P


  reply	other threads:[~2025-01-09 13:16 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-07 15:29 [PATCH net-next v2 0/8] bpf: cpumap: enable GRO for XDP_PASS frames Alexander Lobakin
2025-01-07 15:29 ` [PATCH net-next v2 1/8] net: gro: decouple GRO from the NAPI layer Alexander Lobakin
2025-01-09 14:24   ` Paolo Abeni
2025-01-13 13:50     ` Alexander Lobakin
2025-01-13 21:01       ` Jakub Kicinski
2025-01-14 17:19         ` Alexander Lobakin
2025-01-07 15:29 ` [PATCH net-next v2 2/8] net: gro: expose GRO init/cleanup to use outside of NAPI Alexander Lobakin
2025-01-07 15:29 ` [PATCH net-next v2 3/8] bpf: cpumap: switch to GRO from netif_receive_skb_list() Alexander Lobakin
2025-01-07 15:29 ` [PATCH net-next v2 4/8] bpf: cpumap: reuse skb array instead of a linked list to chain skbs Alexander Lobakin
2025-01-07 15:29 ` [PATCH net-next v2 5/8] net: skbuff: introduce napi_skb_cache_get_bulk() Alexander Lobakin
2025-01-09 13:16   ` Paolo Abeni [this message]
2025-01-13 13:47     ` Alexander Lobakin
2025-01-07 15:29 ` [PATCH net-next v2 6/8] bpf: cpumap: switch to napi_skb_cache_get_bulk() Alexander Lobakin
2025-01-07 15:29 ` [PATCH net-next v2 7/8] veth: use napi_skb_cache_get_bulk() instead of xdp_alloc_skb_bulk() Alexander Lobakin
2025-01-07 15:29 ` [PATCH net-next v2 8/8] xdp: remove xdp_alloc_skb_bulk() Alexander Lobakin
2025-01-07 17:17 ` [PATCH net-next v2 0/8] bpf: cpumap: enable GRO for XDP_PASS frames Jesper Dangaard Brouer
2025-01-08 13:39   ` Alexander Lobakin
2025-01-09 17:02     ` Toke Høiland-Jørgensen
2025-01-13 13:50       ` Alexander Lobakin
2025-01-09  1:26   ` Daniel Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d97b6ec6-59fb-4123-9d96-27b9b32dc5cc@redhat.com \
    --to=pabeni@redhat.com \
    --cc=aleksander.lobakin@intel.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dxu@dxuuu.xyz \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lorenzo@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=netdev@vger.kernel.org \
    --cc=toke@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox