From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>,
	Lorenzo Bianconi <lorenzo@kernel.org>, Daniel Xu <dxu@dxuuu.xyz>,
	John Fastabend <john.fastabend@gmail.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next 6/9] net: skbuff: introduce napi_skb_cache_get_bulk()
Date: Fri, 30 Aug 2024 18:25:05 +0200
Message-ID: <20240830162508.1009458-7-aleksander.lobakin@intel.com>
In-Reply-To: <20240830162508.1009458-1-aleksander.lobakin@intel.com>

Add a function to get an array of skbs from the NAPI percpu cache.
It's supposed to be a drop-in replacement for
kmem_cache_alloc_bulk(skbuff_head_cache, GFP_ATOMIC) and
xdp_alloc_skb_bulk(GFP_ATOMIC). The difference (apart from the
requirement to call it only from the BH) is that it tries to use
as many NAPI cache entries for skbs as possible, and allocate new
ones only if needed.

The logic is as follows:

* there are enough skbs in the cache: decache them and return to the
  caller;
* not enough: try refilling the cache first. If there are now enough
  skbs, return;
* still not enough: try allocating skbs directly to the output array
  with %__GFP_ZERO, maybe we'll be able to get some. If there are now
  enough, return;
* still not enough: return as many as we were able to obtain.

Most of the time, if called from the NAPI polling loop, the first case
applies, sometimes (rarely) the second one. The third and the fourth
happen only under heavy memory pressure.
It can save a significant amount of CPU cycles if there are GRO cycles
and/or Tx completion cycles (anything that descends to
napi_skb_cache_put()) happening on this CPU.
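
For illustration only, the four cases above can be modelled in userspace
roughly as follows. This is a sketch, not the kernel code: struct cache,
alloc_bulk(), cache_get_bulk(), the CACHE_* / OBJ_SIZE constants and the
`budget' counter are all hypothetical stand-ins, and alloc_bulk() is
assumed to be all-or-nothing like the slab bulk allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SIZE 64 /* hypothetical stand-in for NAPI_SKB_CACHE_SIZE */
#define CACHE_BULK 16 /* hypothetical stand-in for NAPI_SKB_CACHE_BULK */
#define OBJ_SIZE   64 /* hypothetical object size, not sizeof(struct sk_buff) */

struct cache {
	void *objs[CACHE_SIZE];
	unsigned int count;
};

/* Hypothetical allocator: all-or-nothing, fails when nr exceeds *budget,
 * which lets a caller emulate memory pressure.
 */
unsigned int alloc_bulk(unsigned int *budget, int zero, unsigned int nr,
			void **out)
{
	if (nr > *budget)
		return 0;

	for (unsigned int i = 0; i < nr; i++) {
		out[i] = zero ? calloc(1, OBJ_SIZE) : malloc(OBJ_SIZE);
		if (!out[i]) {
			while (i--)
				free(out[i]);
			return 0;
		}
	}

	*budget -= nr;
	return nr;
}

/* Model of the fallback chain: cache, refill, direct zeroed allocation,
 * and finally a short return.
 */
unsigned int cache_get_bulk(struct cache *c, unsigned int *budget,
			    void **out, unsigned int n)
{
	unsigned int bulk, total = n;

	assert(c->count <= CACHE_SIZE);

	/* Case 1: the cache already holds enough objects */
	if (c->count >= n)
		goto get;

	/* Case 2: refill the cache by at most one bulk */
	bulk = CACHE_SIZE - c->count;
	if (bulk > CACHE_BULK)
		bulk = CACHE_BULK;
	c->count += alloc_bulk(budget, 0, bulk, &c->objs[c->count]);
	if (c->count >= n)
		goto get;

	/* Case 3: allocate the missing tail, zeroed, straight into @out */
	n -= alloc_bulk(budget, 1, n - c->count, &out[c->count]);
	if (c->count >= n)
		goto get;

	/* Case 4: give up on the tail and report a short count */
	total -= n - c->count;
	n = c->count;

get:
	/* Move the last n cached objects to the head of @out and zero them */
	for (unsigned int base = c->count - n, i = 0; i < n; i++) {
		out[i] = c->objs[base + i];
		memset(out[i], 0, OBJ_SIZE);
	}

	c->count -= n;
	return total;
}
```

In this model a caller would treat any return value below the requested
@n as a partial success and process only the first `returned' entries of
the output array.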

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
 include/linux/skbuff.h |  1 +
 net/core/skbuff.c      | 62 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index cf8f6ce06742..2bc3ca79bc6e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1304,6 +1304,7 @@ struct sk_buff *build_skb_around(struct sk_buff *skb,
 				 void *data, unsigned int frag_size);
 void skb_attempt_defer_free(struct sk_buff *skb);
 
+u32 napi_skb_cache_get_bulk(void **skbs, u32 n);
 struct sk_buff *napi_build_skb(void *data, unsigned int frag_size);
 struct sk_buff *slab_build_skb(void *data);
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a52638363ea5..0a34f3aa00d1 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -366,6 +366,68 @@ static struct sk_buff *napi_skb_cache_get(void)
 	return skb;
 }
 
+/**
+ * napi_skb_cache_get_bulk - obtain a number of zeroed skb heads from the cache
+ * @skbs: pointer to an at least @n-sized array to fill with skb pointers
+ * @n: number of entries to provide
+ *
+ * Tries to obtain @n &sk_buff entries from the NAPI percpu cache and writes
+ * the pointers into the provided array @skbs. If there are fewer entries
+ * available, tries to replenish the cache and bulk-allocates the diff from
+ * the MM layer if needed.
+ * The heads are zeroed with either memset() or %__GFP_ZERO, so they are
+ * ready for {,__}build_skb_around() and don't have any data buffers attached.
+ * Must be called *only* from the BH context.
+ *
+ * Return: number of successfully allocated skbs (@n if no actual allocation
+ *	   needed or kmem_cache_alloc_bulk() didn't fail).
+ */
+u32 napi_skb_cache_get_bulk(void **skbs, u32 n)
+{
+	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
+	u32 bulk, total = n;
+
+	local_lock_nested_bh(&napi_alloc_cache.bh_lock);
+
+	if (nc->skb_count >= n)
+		goto get;
+
+	/* Not enough cached skbs. Try refilling the cache first */
+	bulk = min(NAPI_SKB_CACHE_SIZE - nc->skb_count, NAPI_SKB_CACHE_BULK);
+	nc->skb_count += kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
+					       GFP_ATOMIC | __GFP_NOWARN, bulk,
+					       &nc->skb_cache[nc->skb_count]);
+	if (likely(nc->skb_count >= n))
+		goto get;
+
+	/* Still not enough. Bulk-allocate the missing part directly, zeroed */
+	n -= kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
+				   GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
+				   n - nc->skb_count, &skbs[nc->skb_count]);
+	if (likely(nc->skb_count >= n))
+		goto get;
+
+	/* kmem_cache didn't allocate the number we need, limit the output */
+	total -= n - nc->skb_count;
+	n = nc->skb_count;
+
+get:
+	for (u32 base = nc->skb_count - n, i = 0; i < n; i++) {
+		u32 cache_size = kmem_cache_size(net_hotdata.skbuff_cache);
+
+		skbs[i] = nc->skb_cache[base + i];
+
+		kasan_mempool_unpoison_object(skbs[i], cache_size);
+		memset(skbs[i], 0, offsetof(struct sk_buff, tail));
+	}
+
+	nc->skb_count -= n;
+	local_unlock_nested_bh(&napi_alloc_cache.bh_lock);
+
+	return total;
+}
+EXPORT_SYMBOL_GPL(napi_skb_cache_get_bulk);
+
 static inline void __finalize_skb_around(struct sk_buff *skb, void *data,
 					 unsigned int size)
 {
-- 
2.46.0

