improve the kmem_cache_alloc

Linux io-uring development
 help / color / mirror / Atom feed

* improve the kmem_cache_alloc_bulk API
@ 2026-05-27  7:02 Christoph Hellwig
  2026-05-27  7:02 ` [PATCH] mm/slab: improve kmem_cache_alloc_bulk Christoph Hellwig
  2026-05-27  9:11 ` improve the kmem_cache_alloc_bulk API Vlastimil Babka (SUSE)
  0 siblings, 2 replies; 11+ messages in thread
From: Christoph Hellwig @ 2026-05-27  7:02 UTC (permalink / raw)
  To: Vlastimil Babka, Harry Yoo, Andrew Morton
  Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
	Jesper Dangaard Brouer, linux-arm-msm, dri-devel, freedreno,
	linux-kernel, linux-mm, io-uring, kasan-dev, bpf, netdev

Hi all,

kmem_cache_alloc_bulk has a very unintuitive and undocumented return
value convention.  Fix that and add documentation.

Note that the few comments explaining it mention that the gfp flags
must allow "spinning".  That's not really a term used in the memory
allocator, is this supposed to mean "block" or "sleep"?

Diffstat:
 drivers/gpu/drm/msm/msm_iommu.c       |    6 +--
 drivers/gpu/drm/panthor/panthor_mmu.c |   12 ++-----
 include/linux/slab.h                  |    6 ++-
 io_uring/io_uring.c                   |   23 +++++--------
 lib/test_meminit.c                    |   19 +++++------
 mm/kasan/kasan_test_c.c               |    5 +-
 mm/kfence/kfence_test.c               |    9 ++---
 mm/slub.c                             |   58 ++++++++++++++++++----------------
 net/bpf/test_run.c                    |    7 +---
 net/core/skbuff.c                     |   23 +++++++------
 tools/include/linux/slab.h            |    2 -
 tools/testing/shared/linux.c          |   19 ++++-------
 12 files changed, 92 insertions(+), 97 deletions(-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] mm/slab: improve kmem_cache_alloc_bulk
  2026-05-27  7:02 improve the kmem_cache_alloc_bulk API Christoph Hellwig
@ 2026-05-27  7:02 ` Christoph Hellwig
  2026-05-27  7:53   ` bot+bpf-ci
                     ` (2 more replies)
  2026-05-27  9:11 ` improve the kmem_cache_alloc_bulk API Vlastimil Babka (SUSE)
  1 sibling, 3 replies; 11+ messages in thread
From: Christoph Hellwig @ 2026-05-27  7:02 UTC (permalink / raw)
  To: Vlastimil Babka, Harry Yoo, Andrew Morton
  Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
	Jesper Dangaard Brouer, linux-arm-msm, dri-devel, freedreno,
	linux-kernel, linux-mm, io-uring, kasan-dev, bpf, netdev

The kmem_cache_alloc_bulk return value is weird.  It returns the number
of allocated objects, but that must always be 0 or the requested number
based on the implementations and the handling in the callers, but that
assumption is not actually documented anywhere, which confuses automated
review tools.

Fix this by returning a bool if the allocation succeeded and adding a
kerneldoc comment explaining the API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/msm/msm_iommu.c       |  6 +--
 drivers/gpu/drm/panthor/panthor_mmu.c | 12 +++---
 include/linux/slab.h                  |  6 ++-
 io_uring/io_uring.c                   | 23 +++++------
 lib/test_meminit.c                    | 19 +++++----
 mm/kasan/kasan_test_c.c               |  5 +--
 mm/kfence/kfence_test.c               |  9 +++--
 mm/slub.c                             | 58 +++++++++++++++------------
 net/bpf/test_run.c                    |  7 ++--
 net/core/skbuff.c                     | 24 ++++++-----
 tools/include/linux/slab.h            |  2 +-
 tools/testing/shared/linux.c          | 19 ++++-----
 12 files changed, 93 insertions(+), 97 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 058c71c82cf5..533104d71f6c 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -330,17 +330,15 @@ static int
 msm_iommu_pagetable_prealloc_allocate(struct msm_mmu *mmu, struct msm_mmu_prealloc *p)
 {
 	struct kmem_cache *pt_cache = get_pt_cache(mmu);
-	int ret;
 
 	p->pages = kvmalloc_objs(*p->pages, p->count);
 	if (!p->pages)
 		return -ENOMEM;
 
-	ret = kmem_cache_alloc_bulk(pt_cache, GFP_KERNEL, p->count, p->pages);
-	if (ret != p->count) {
+	if (!kmem_cache_alloc_bulk(pt_cache, GFP_KERNEL, p->count, p->pages)) {
 		kfree(p->pages);
 		p->pages = NULL;
-		p->count = ret;
+		p->count = 0;
 		return -ENOMEM;
 	}
 
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 75d98dad7b1d..b80d7e1d5123 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -1274,10 +1274,9 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx,
 		goto err_cleanup;
 	}
 
-	ret = kmem_cache_alloc_bulk(pt_cache, GFP_KERNEL, pt_count,
-				    op_ctx->rsvd_page_tables.pages);
-	op_ctx->rsvd_page_tables.count = ret;
-	if (ret != pt_count) {
+	if (!kmem_cache_alloc_bulk(pt_cache, GFP_KERNEL, pt_count,
+			op_ctx->rsvd_page_tables.pages)) {
+		op_ctx->rsvd_page_tables.count = 0;
 		ret = -ENOMEM;
 		goto err_cleanup;
 	}
@@ -1328,9 +1327,8 @@ static int panthor_vm_prepare_unmap_op_ctx(struct panthor_vm_op_ctx *op_ctx,
 			goto err_cleanup;
 		}
 
-		ret = kmem_cache_alloc_bulk(pt_cache, GFP_KERNEL, pt_count,
-					    op_ctx->rsvd_page_tables.pages);
-		if (ret != pt_count) {
+		if (!kmem_cache_alloc_bulk(pt_cache, GFP_KERNEL, pt_count,
+				op_ctx->rsvd_page_tables.pages)) {
 			ret = -ENOMEM;
 			goto err_cleanup;
 		}
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 2b5ab488e96b..6a7b452d43a0 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -815,8 +815,10 @@ kmem_buckets *kmem_buckets_create(const char *name, slab_flags_t flags,
  */
 void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p);
 
-int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t size, void **p);
-#define kmem_cache_alloc_bulk(...)	alloc_hooks(kmem_cache_alloc_bulk_noprof(__VA_ARGS__))
+bool kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags,
+		size_t size, void **p);
+#define kmem_cache_alloc_bulk(...) \
+	alloc_hooks(kmem_cache_alloc_bulk_noprof(__VA_ARGS__))
 
 static __always_inline void kfree_bulk(size_t size, void **p)
 {
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 103b6c88f252..bf847ca823f7 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -978,29 +978,24 @@ __cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
 {
 	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_ZERO;
 	void *reqs[IO_REQ_ALLOC_BATCH];
-	int ret;
-
-	ret = kmem_cache_alloc_bulk(req_cachep, gfp, ARRAY_SIZE(reqs), reqs);
+	int nr_reqs = ARRAY_SIZE(reqs);
 
 	/*
-	 * Bulk alloc is all-or-nothing. If we fail to get a batch,
-	 * retry single alloc to be on the safe side.
+	 * Bulk alloc is all-or-nothing. If we fail to get a batch, retry a
+	 * single allocation to be on the safe side.
 	 */
-	if (unlikely(ret <= 0)) {
+	if (!kmem_cache_alloc_bulk(req_cachep, gfp, nr_reqs, reqs)) {
 		reqs[0] = kmem_cache_alloc(req_cachep, gfp);
 		if (!reqs[0])
 			return false;
-		ret = 1;
+		nr_reqs = 1;
 	}
 
-	percpu_ref_get_many(&ctx->refs, ret);
-	ctx->nr_req_allocated += ret;
-
-	while (ret--) {
-		struct io_kiocb *req = reqs[ret];
+	percpu_ref_get_many(&ctx->refs, nr_reqs);
+	ctx->nr_req_allocated += nr_reqs;
 
-		io_req_add_to_cache(req, ctx);
-	}
+	while (nr_reqs--)
+		io_req_add_to_cache(reqs[nr_reqs], ctx);
 	return true;
 }
 
diff --git a/lib/test_meminit.c b/lib/test_meminit.c
index d028a6552cd6..8f178fdf80ff 100644
--- a/lib/test_meminit.c
+++ b/lib/test_meminit.c
@@ -229,14 +229,12 @@ static int __init do_kmem_cache_size(size_t size, bool want_ctor,
 	for (iter = 0; iter < 10; iter++) {
 		/* Do a test of bulk allocations */
 		if (!want_rcu && !want_ctor) {
-			int ret;
-
-			ret = kmem_cache_alloc_bulk(c, alloc_mask, BULK_SIZE, bulk_array);
-			if (!ret) {
+			if (!kmem_cache_alloc_bulk(c, alloc_mask, BULK_SIZE,
+					bulk_array)) {
 				fail = true;
 			} else {
 				int i;
-				for (i = 0; i < ret; i++)
+				for (i = 0; i < BULK_SIZE; i++)
 					fail |= check_buf(bulk_array[i], size, want_ctor, want_rcu, want_zero);
 				kmem_cache_free_bulk(c, ret, bulk_array);
 			}
@@ -354,17 +352,18 @@ static int __init do_kmem_cache_size_bulk(int size, int *total_failures)
 
 	c = kmem_cache_create("test_cache", size, size, 0, NULL);
 	for (iter = 0; (iter < maxiter) && !fail; iter++) {
-		num = kmem_cache_alloc_bulk(c, GFP_KERNEL, ARRAY_SIZE(objects),
-					    objects);
-		for (i = 0; i < num; i++) {
+		if (!kmem_cache_alloc_bulk(c, GFP_KERNEL, ARRAY_SIZE(objects),
+				objects))
+			continue;
+
+		for (i = 0; i < ARRAY_SIZE(objects); i++) {
 			bytes = count_nonzero_bytes(objects[i], size);
 			if (bytes)
 				fail = true;
 			fill_with_garbage(objects[i], size);
 		}
 
-		if (num)
-			kmem_cache_free_bulk(c, num, objects);
+		kmem_cache_free_bulk(c, num, objects);
 	}
 	kmem_cache_destroy(c);
 	*total_failures += fail;
diff --git a/mm/kasan/kasan_test_c.c b/mm/kasan/kasan_test_c.c
index 3f4ed29178b3..b9e167ed5be3 100644
--- a/mm/kasan/kasan_test_c.c
+++ b/mm/kasan/kasan_test_c.c
@@ -1225,14 +1225,13 @@ static void kmem_cache_bulk(struct kunit *test)
 	struct kmem_cache *cache;
 	size_t size = 200;
 	char *p[10];
-	bool ret;
 	int i;
 
 	cache = kmem_cache_create("test_cache", size, 0, 0, NULL);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, cache);
 
-	ret = kmem_cache_alloc_bulk(cache, GFP_KERNEL, ARRAY_SIZE(p), (void **)&p);
-	if (!ret) {
+	if (!kmem_cache_alloc_bulk(cache, GFP_KERNEL, ARRAY_SIZE(p),
+			(void **)&p)) {
 		kunit_err(test, "Allocation failed: %s\n", __func__);
 		kmem_cache_destroy(cache);
 		return;
diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index 10424cd25e5a..c472e66e7242 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -761,9 +761,10 @@ static void test_memcache_alloc_bulk(struct kunit *test)
 	timeout = jiffies + msecs_to_jiffies(100 * kfence_sample_interval);
 	do {
 		void *objects[100];
-		int i, num = kmem_cache_alloc_bulk(test_cache, GFP_ATOMIC, ARRAY_SIZE(objects),
-						   objects);
-		if (!num)
+		int i;
+
+		if (!kmem_cache_alloc_bulk(test_cache, GFP_ATOMIC,
+				ARRAY_SIZE(objects), objects))
 			continue;
 		for (i = 0; i < ARRAY_SIZE(objects); i++) {
 			if (is_kfence_address(objects[i])) {
@@ -771,7 +772,7 @@ static void test_memcache_alloc_bulk(struct kunit *test)
 				break;
 			}
 		}
-		kmem_cache_free_bulk(test_cache, num, objects);
+		kmem_cache_free_bulk(test_cache, ARRAY_SIZE(objects), objects);
 		/*
 		 * kmem_cache_alloc_bulk() disables interrupts, and calling it
 		 * in a tight loop may not give KFENCE a chance to switch the
diff --git a/mm/slub.c b/mm/slub.c
index a2bf3756ca7d..d9790e7c17f6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4981,8 +4981,8 @@ static int __prefill_sheaf_pfmemalloc(struct kmem_cache *s,
 	return ret;
 }
 
-static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
-				   size_t size, void **p);
+static bool __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
+		size_t size, void **p);
 
 /*
  * returns a sheaf that has at least the requested size
@@ -5154,9 +5154,8 @@ int kmem_cache_refill_sheaf(struct kmem_cache *s, gfp_t gfp,
 			return __prefill_sheaf_pfmemalloc(s, sheaf, gfp);
 
 		if (!__kmem_cache_alloc_bulk(s, gfp, sheaf->capacity - sheaf->size,
-					     &sheaf->objects[sheaf->size])) {
+					     &sheaf->objects[sheaf->size]))
 			return -ENOMEM;
-		}
 		sheaf->size = sheaf->capacity;
 
 		return 0;
@@ -7289,9 +7288,8 @@ refill_objects(struct kmem_cache *s, void **p, gfp_t gfp, unsigned int min,
 	return refilled;
 }
 
-static inline
-int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
-			    void **p)
+static bool __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
+		size_t size, void **p)
 {
 	int i;
 
@@ -7312,30 +7310,42 @@ int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 		stat_add(s, ALLOC_SLOWPATH, i);
 	}
 
-	return i;
+	return true;
 
 error:
 	__kmem_cache_free_bulk(s, i, p);
-	return 0;
-
+	return false;
 }
 
-/*
- * Note that interrupts must be enabled when calling this function and gfp
- * flags must allow spinning.
+/**
+ * kmem_cache_alloc_bulk - Allocate multiple objects
+ * @s:		The cache to allocate from
+ * @flags:	GFP_* flags. See kmalloc().
+ * @size:	Number of objects to allocate
+ * @p:		Array of allocated objects
+ *
+ * Allocate @size objects from @s and places them into @p.
+ *
+ * Interrupts must be enabled when calling this function and @flags must allow
+ * spinning.
+ *
+ * Unlike alloc_pages_bulk(), this function does not check for already allocated
+ * objects in @p, and thus the caller does not need to zero it.
+ *
+ * Return: %true if the allocation succeeded, or %false if it failed.
  */
-int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t size,
-				 void **p)
+bool kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags,
+		size_t size, void **p)
 {
 	unsigned int i = 0;
 	void *kfence_obj;
 
 	if (!size)
-		return 0;
+		return false;
 
 	s = slab_pre_alloc_hook(s, flags);
 	if (unlikely(!s))
-		return 0;
+		return false;
 
 	/*
 	 * to make things simpler, only assume at most once kfence allocated
@@ -7352,18 +7362,18 @@ int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t size,
 	}
 
 	i = alloc_from_pcs_bulk(s, flags, size, p);
-
 	if (i < size) {
 		/*
 		 * If we ran out of memory, don't bother with freeing back to
 		 * the percpu sheaves, we have bigger problems.
 		 */
-		if (unlikely(__kmem_cache_alloc_bulk(s, flags, size - i, p + i) == 0)) {
+		if (unlikely(!__kmem_cache_alloc_bulk(s, flags, size - i,
+				p + i))) {
 			if (i > 0)
 				__kmem_cache_free_bulk(s, i, p);
 			if (kfence_obj)
 				__kfence_free(kfence_obj);
-			return 0;
+			return false;
 		}
 	}
 
@@ -7382,12 +7392,8 @@ int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t size,
 	 * memcg and kmem_cache debug support and memory initialization.
 	 * Done outside of the IRQ disabled fastpath loop.
 	 */
-	if (unlikely(!slab_post_alloc_hook(s, NULL, flags, size, p,
-		    slab_want_init_on_alloc(flags, s), s->object_size))) {
-		return 0;
-	}
-
-	return size;
+	return likely(slab_post_alloc_hook(s, NULL, flags, size, p,
+			slab_want_init_on_alloc(flags, s), s->object_size));
 }
 EXPORT_SYMBOL(kmem_cache_alloc_bulk_noprof);
 
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 2bc04feadfab..99ab9ddb05e3 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -243,12 +243,11 @@ static int xdp_recv_frames(struct xdp_frame **frames, int nframes,
 			   struct net_device *dev)
 {
 	gfp_t gfp = __GFP_ZERO | GFP_ATOMIC;
-	int i, n;
+	int i;
 	LIST_HEAD(list);
 
-	n = kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, gfp, nframes,
-				  (void **)skbs);
-	if (unlikely(n == 0)) {
+	if (!kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, gfp, nframes,
+				   (void **)skbs)) {
 		for (i = 0; i < nframes; i++)
 			xdp_return_frame(frames[i]);
 		return -ENOMEM;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 44ac121cfccb..73045b688385 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -288,11 +288,11 @@ static inline struct sk_buff *napi_skb_cache_get(bool alloc)
 
 	local_lock_nested_bh(&napi_alloc_cache.bh_lock);
 	if (unlikely(!nc->skb_count)) {
-		if (alloc)
-			nc->skb_count = kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
-						GFP_ATOMIC | __GFP_NOWARN,
-						NAPI_SKB_CACHE_BULK,
-						nc->skb_cache);
+		if (alloc && kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
+						   GFP_ATOMIC | __GFP_NOWARN,
+						   NAPI_SKB_CACHE_BULK,
+						   nc->skb_cache))
+			nc->skb_count = NAPI_SKB_CACHE_BULK;
 		if (unlikely(!nc->skb_count)) {
 			local_unlock_nested_bh(&napi_alloc_cache.bh_lock);
 			return NULL;
@@ -353,16 +353,18 @@ u32 napi_skb_cache_get_bulk(void **skbs, u32 n)
 
 	/* No enough cached skbs. Try refilling the cache first */
 	bulk = min(NAPI_SKB_CACHE_SIZE - nc->skb_count, NAPI_SKB_CACHE_BULK);
-	nc->skb_count += kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
-					       GFP_ATOMIC | __GFP_NOWARN, bulk,
-					       &nc->skb_cache[nc->skb_count]);
+	if (kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
+				  GFP_ATOMIC | __GFP_NOWARN, bulk,
+				  &nc->skb_cache[nc->skb_count]))
+		nc->skb_count += bulk;
 	if (likely(nc->skb_count >= n))
 		goto get;
 
 	/* Still not enough. Bulk-allocate the missing part directly, zeroed */
-	n -= kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
-				   GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
-				   n - nc->skb_count, &skbs[nc->skb_count]);
+	if (kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
+				  GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
+				  n - nc->skb_count, &skbs[nc->skb_count]))
+		n = nc->skb_count;
 	if (likely(nc->skb_count >= n))
 		goto get;
 
diff --git a/tools/include/linux/slab.h b/tools/include/linux/slab.h
index 6d8e9413d5a4..2e63c2e726aa 100644
--- a/tools/include/linux/slab.h
+++ b/tools/include/linux/slab.h
@@ -183,7 +183,7 @@ __kmem_cache_create(const char *name, unsigned int size, unsigned int align,
 		default: __kmem_cache_create)(__name, __object_size, __args, __VA_ARGS__)
 
 void kmem_cache_free_bulk(struct kmem_cache *cachep, size_t size, void **list);
-int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size,
+bool kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size,
 			  void **list);
 struct slab_sheaf *
 kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int size);
diff --git a/tools/testing/shared/linux.c b/tools/testing/shared/linux.c
index 8c7257155958..e9c3bc9b3272 100644
--- a/tools/testing/shared/linux.c
+++ b/tools/testing/shared/linux.c
@@ -154,7 +154,7 @@ void kmem_cache_shrink(struct kmem_cache *cachep)
 {
 }
 
-int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size,
+bool kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size,
 			  void **p)
 {
 	size_t i;
@@ -213,7 +213,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size,
 		pthread_mutex_unlock(&cachep->lock);
 		if (cachep->callback)
 			cachep->exec_callback = true;
-		return 0;
+		return false;
 	}
 
 	for (i = 0; i < size; i++) {
@@ -224,7 +224,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size,
 			printf("Allocating %p from slab\n", p[i]);
 	}
 
-	return size;
+	return true;
 }
 
 struct kmem_cache *
@@ -271,8 +271,8 @@ kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int size)
 
 	sheaf->cache = s;
 	sheaf->capacity = capacity;
-	sheaf->size = kmem_cache_alloc_bulk(s, gfp, size, sheaf->objects);
-	if (!sheaf->size) {
+	sheaf->size = size;
+	if (!kmem_cache_alloc_bulk(s, gfp, size, sheaf->objects)) {
 		free(sheaf);
 		return NULL;
 	}
@@ -284,7 +284,6 @@ int kmem_cache_refill_sheaf(struct kmem_cache *s, gfp_t gfp,
 		 struct slab_sheaf **sheafp, unsigned int size)
 {
 	struct slab_sheaf *sheaf = *sheafp;
-	int refill;
 
 	if (sheaf->size >= size)
 		return 0;
@@ -299,12 +298,10 @@ int kmem_cache_refill_sheaf(struct kmem_cache *s, gfp_t gfp,
 		return 0;
 	}
 
-	refill = kmem_cache_alloc_bulk(s, gfp, size - sheaf->size,
-				       &sheaf->objects[sheaf->size]);
-	if (!refill)
+	if (!kmem_cache_alloc_bulk(s, gfp, size - sheaf->size,
+			&sheaf->objects[sheaf->size]))
 		return -ENOMEM;
-
-	sheaf->size += refill;
+	sheaf->size += (size - sheaf->size);
 	return 0;
 }
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm/slab: improve kmem_cache_alloc_bulk
  2026-05-27  7:02 ` [PATCH] mm/slab: improve kmem_cache_alloc_bulk Christoph Hellwig
@ 2026-05-27  7:53   ` bot+bpf-ci
  2026-05-27  8:51   ` Jesper Dangaard Brouer
  2026-05-27  9:38   ` Vlastimil Babka (SUSE)
  2 siblings, 0 replies; 11+ messages in thread
From: bot+bpf-ci @ 2026-05-27  7:53 UTC (permalink / raw)
  To: hch, vbabka, harry, akpm
  Cc: hao.li, cl, rientjes, roman.gushchin, hawk, linux-arm-msm,
	dri-devel, freedreno, linux-kernel, linux-mm, io-uring, kasan-dev,
	bpf, netdev, ast, andrii, daniel, martin.lau, eddyz87,
	yonghong.song, clm, ihor.solodrai

[-- Attachment #1: Type: text/plain, Size: 4512 bytes --]

> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> index 75d98dad7b1d..b80d7e1d5123 100644
> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> @@ -1274,10 +1274,9 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx,
>  		goto err_cleanup;
>  	}
>
> -	ret = kmem_cache_alloc_bulk(pt_cache, GFP_KERNEL, pt_count,
> -				    op_ctx->rsvd_page_tables.pages);
> -	op_ctx->rsvd_page_tables.count = ret;
> -	if (ret != pt_count) {
> +	if (!kmem_cache_alloc_bulk(pt_cache, GFP_KERNEL, pt_count,
> +			op_ctx->rsvd_page_tables.pages)) {
> +		op_ctx->rsvd_page_tables.count = 0;
>  		ret = -ENOMEM;
>  		goto err_cleanup;
>  	}

Does this omit setting op_ctx->rsvd_page_tables.count to pt_count on success?

Since op_ctx is initialized via memset previously, the count will remain 0 when
the allocation succeeds. This could lead to memory leaks when these page tables
are consumed or cleaned up later, since the cleanup code won't know about the
allocated page tables.

The original code set the count to ret on both success and failure. The
updated code only sets count = 0 in the failure path, but doesn't set
count = pt_count in the success path.

[ ... ]

> diff --git a/lib/test_meminit.c b/lib/test_meminit.c
> index 6298f66c964b..3727dee7c262 100644
> --- a/lib/test_meminit.c
> +++ b/lib/test_meminit.c
> @@ -229,14 +229,12 @@ static int __init do_kmem_cache_size(size_t size, bool want_ctor,
>  	for (iter = 0; iter < 10; iter++) {
>  		/* Do a test of bulk allocations */
>  		if (!want_rcu && !want_ctor) {
> -			int ret;
> -
> -			ret = kmem_cache_alloc_bulk(c, alloc_mask, BULK_SIZE, bulk_array);
> -			if (!ret) {
> +			if (!kmem_cache_alloc_bulk(c, alloc_mask, BULK_SIZE,
> +					bulk_array)) {
>  				fail = true;
>  			} else {
>  				int i;
> -				for (i = 0; i < ret; i++)
> +				for (i = 0; i < BULK_SIZE; i++)
>  					fail |= check_buf(bulk_array[i], size, want_ctor, want_rcu, want_zero);
>  				kmem_cache_free_bulk(c, ret, bulk_array);
>  			}

Will this cause a compilation error?

The local variable ret is removed from this scope, but it's still used in the
kmem_cache_free_bulk() call. This should fail to compile because ret is
undeclared.

The fix should probably replace ret with BULK_SIZE in the free call:

	kmem_cache_free_bulk(c, BULK_SIZE, bulk_array);


> @@ -354,17 +352,18 @@ static int __init do_kmem_cache_size_bulk(int size, int *total_failures)
>
>  	c = kmem_cache_create("test_cache", size, size, 0, NULL);
>  	for (iter = 0; (iter < maxiter) && !fail; iter++) {
> -		num = kmem_cache_alloc_bulk(c, GFP_KERNEL, ARRAY_SIZE(objects),
> -					    objects);
> -		for (i = 0; i < num; i++) {
> +		if (!kmem_cache_alloc_bulk(c, GFP_KERNEL, ARRAY_SIZE(objects),
> +				objects))
> +			continue;
> +
> +		for (i = 0; i < ARRAY_SIZE(objects); i++) {
>  			bytes = count_nonzero_bytes(objects[i], size);
>  			if (bytes)
>  				fail = true;
>  			fill_with_garbage(objects[i], size);
>  		}
>
> -		if (num)
> -			kmem_cache_free_bulk(c, num, objects);
> +		kmem_cache_free_bulk(c, num, objects);
>  	}
>  	kmem_cache_destroy(c);
>  	*total_failures += fail;

Is num uninitialized here?

It's no longer assigned the result of the allocation call. Passing it to
kmem_cache_free_bulk() will use indeterminate stack data, which could lead to
memory corruption. The fix should probably be:

	kmem_cache_free_bulk(c, ARRAY_SIZE(objects), objects);

[ ... ]

> diff --git a/mm/slub.c b/mm/slub.c
> index a2bf3756ca7d..d9790e7c17f6 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c

[ ... ]

> @@ -7352,18 +7362,18 @@ int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t size,
>  	 */
>  	unsigned int i = 0;
>  	void *kfence_obj;
>
>  	if (!size)
> -		return 0;
> +		return false;

Does this incorrectly report an allocation failure when size is 0?

Previously, callers interpreting the return value against size (ret != size)
would treat 0 objects allocated for a request of 0 as a success. Returning
false here contradicts the typical kernel behavior where allocating 0 items is
a successful no-op.

This might cause callers to incorrectly abort with an error when they request
0 objects, which should be a valid no-op case.


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26496962101

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm/slab: improve kmem_cache_alloc_bulk
  2026-05-27  7:02 ` [PATCH] mm/slab: improve kmem_cache_alloc_bulk Christoph Hellwig
  2026-05-27  7:53   ` bot+bpf-ci
@ 2026-05-27  8:51   ` Jesper Dangaard Brouer
  2026-05-27 13:56     ` Alexander Lobakin
  2026-05-27  9:38   ` Vlastimil Babka (SUSE)
  2 siblings, 1 reply; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2026-05-27  8:51 UTC (permalink / raw)
  To: Christoph Hellwig, Vlastimil Babka, Harry Yoo, Andrew Morton
  Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
	linux-arm-msm, dri-devel, freedreno, linux-kernel, linux-mm,
	io-uring, kasan-dev, bpf, netdev, Alexander Lobakin, Matt Fleming,
	kernel-team



On 27/05/2026 09.02, Christoph Hellwig wrote:
> The kmem_cache_alloc_bulk return value is weird.  It returns the number
> of allocated objects, but that must always be 0 or the requested number
> based on the implementations and the handling in the callers, but that
> assumption is not actually documented anywhere, which confuses automated
> review tools.
> 

I remember, this API behavior was requested by AKPM when I developed
kmem_cache_alloc_bulk.  I trusted AKPM's decision, but I cannot explain
why this choice was made.

I kept the netdev code usage below. The current napi_skb_cache_get_bulk
have a retry logic that assumes that a partial bulk number can be
returned (which it cannot as Hellwig explains).  Cc Alex/Olek please
review the changes below as you added this retry logic.


> Fix this by returning a bool if the allocation succeeded and adding a
> kerneldoc comment explaining the API.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/gpu/drm/msm/msm_iommu.c       |  6 +--
>   drivers/gpu/drm/panthor/panthor_mmu.c | 12 +++---
>   include/linux/slab.h                  |  6 ++-
>   io_uring/io_uring.c                   | 23 +++++------
>   lib/test_meminit.c                    | 19 +++++----
>   mm/kasan/kasan_test_c.c               |  5 +--
>   mm/kfence/kfence_test.c               |  9 +++--
>   mm/slub.c                             | 58 +++++++++++++++------------
>   net/bpf/test_run.c                    |  7 ++--
>   net/core/skbuff.c                     | 24 ++++++-----
>   tools/include/linux/slab.h            |  2 +-
>   tools/testing/shared/linux.c          | 19 ++++-----
>   12 files changed, 93 insertions(+), 97 deletions(-)
> 

[...]

> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 44ac121cfccb..73045b688385 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -288,11 +288,11 @@ static inline struct sk_buff *napi_skb_cache_get(bool alloc)
>   
>   	local_lock_nested_bh(&napi_alloc_cache.bh_lock);
>   	if (unlikely(!nc->skb_count)) {
> -		if (alloc)
> -			nc->skb_count = kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> -						GFP_ATOMIC | __GFP_NOWARN,
> -						NAPI_SKB_CACHE_BULK,
> -						nc->skb_cache);
> +		if (alloc && kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> +						   GFP_ATOMIC | __GFP_NOWARN,
> +						   NAPI_SKB_CACHE_BULK,
> +						   nc->skb_cache))
> +			nc->skb_count = NAPI_SKB_CACHE_BULK;
>   		if (unlikely(!nc->skb_count)) {
>   			local_unlock_nested_bh(&napi_alloc_cache.bh_lock);
>   			return NULL;
> @@ -353,16 +353,18 @@ u32 napi_skb_cache_get_bulk(void **skbs, u32 n)
>   
>   	/* No enough cached skbs. Try refilling the cache first */
>   	bulk = min(NAPI_SKB_CACHE_SIZE - nc->skb_count, NAPI_SKB_CACHE_BULK);
> -	nc->skb_count += kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> -					       GFP_ATOMIC | __GFP_NOWARN, bulk,
> -					       &nc->skb_cache[nc->skb_count]);
> +	if (kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> +				  GFP_ATOMIC | __GFP_NOWARN, bulk,
> +				  &nc->skb_cache[nc->skb_count]))
> +		nc->skb_count += bulk;
>   	if (likely(nc->skb_count >= n))
>   		goto get;
>   
>   	/* Still not enough. Bulk-allocate the missing part directly, zeroed */
> -	n -= kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> -				   GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
> -				   n - nc->skb_count, &skbs[nc->skb_count]);
> +	if (kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> +				  GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
> +				  n - nc->skb_count, &skbs[nc->skb_count]))
> +		n = nc->skb_count;
>   	if (likely(nc->skb_count >= n))
>   		goto get;
>   

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: improve the kmem_cache_alloc_bulk API
  2026-05-27  7:02 improve the kmem_cache_alloc_bulk API Christoph Hellwig
  2026-05-27  7:02 ` [PATCH] mm/slab: improve kmem_cache_alloc_bulk Christoph Hellwig
@ 2026-05-27  9:11 ` Vlastimil Babka (SUSE)
  2026-05-27 12:21   ` Christoph Hellwig
  1 sibling, 1 reply; 11+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-05-27  9:11 UTC (permalink / raw)
  To: Christoph Hellwig, Harry Yoo, Andrew Morton
  Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
	Jesper Dangaard Brouer, linux-arm-msm, dri-devel, freedreno,
	linux-kernel, linux-mm, io-uring, kasan-dev, bpf, netdev

On 5/27/26 09:02, Christoph Hellwig wrote:
> Hi all,
> 
> kmem_cache_alloc_bulk has a very unintuitive and undocumented return
> value convention.  Fix that and add documentation.
> 
> Note that the few comments explaining it mention that the gfp flags
> must allow "spinning".  That's not really a term used in the memory
> allocator, is this supposed to mean "block" or "sleep"?

Page allocator now has alloc_pages_nolock() for when no spinning is
possible, and it uses ALLOC_TRYLOCK internally.

Slab has kmalloc_nolock() relying on that when it needs new pages.

In terms of gfp flags, such context is currently indicated by lack of
__GFP_KSWAPD_RECLAIM, where lack of __GFP_DIRECT_RECLAIM only means "no
sleeping" - see gfpflags_allow_spinning(). Slab uses it internally as
there's no ALLOC_TRYLOCK, but also there are callers from memcg and stackdepot.

Like the rest of gfp flags it's far from ideal, maybe we'll figure out a
better design eventually.

> Diffstat:
>  drivers/gpu/drm/msm/msm_iommu.c       |    6 +--
>  drivers/gpu/drm/panthor/panthor_mmu.c |   12 ++-----
>  include/linux/slab.h                  |    6 ++-
>  io_uring/io_uring.c                   |   23 +++++--------
>  lib/test_meminit.c                    |   19 +++++------
>  mm/kasan/kasan_test_c.c               |    5 +-
>  mm/kfence/kfence_test.c               |    9 ++---
>  mm/slub.c                             |   58 ++++++++++++++++++----------------
>  net/bpf/test_run.c                    |    7 +---
>  net/core/skbuff.c                     |   23 +++++++------
>  tools/include/linux/slab.h            |    2 -
>  tools/testing/shared/linux.c          |   19 ++++-------
>  12 files changed, 92 insertions(+), 97 deletions(-)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm/slab: improve kmem_cache_alloc_bulk
  2026-05-27  7:02 ` [PATCH] mm/slab: improve kmem_cache_alloc_bulk Christoph Hellwig
  2026-05-27  7:53   ` bot+bpf-ci
  2026-05-27  8:51   ` Jesper Dangaard Brouer
@ 2026-05-27  9:38   ` Vlastimil Babka (SUSE)
  2026-05-27 12:20     ` Christoph Hellwig
  2 siblings, 1 reply; 11+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-05-27  9:38 UTC (permalink / raw)
  To: Christoph Hellwig, Harry Yoo, Andrew Morton
  Cc: Hao Li, Christoph Lameter, David Rientjes, Roman Gushchin,
	Jesper Dangaard Brouer, linux-arm-msm, dri-devel, freedreno,
	linux-kernel, linux-mm, io-uring, kasan-dev, bpf, netdev

On 5/27/26 09:02, Christoph Hellwig wrote:
> The kmem_cache_alloc_bulk return value is weird.  It returns the number
> of allocated objects, but that must always be 0 or the requested number
> based on the implementations and the handling in the callers, but that
> assumption is not actually documented anywhere, which confuses automated
> review tools.
> 
> Fix this by returning a bool if the allocation succeeded and adding a
> kerneldoc comment explaining the API.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Would 0 / -ENOMEM be more like what people would expect? I guess both that
and bool are better than the current API.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm/slab: improve kmem_cache_alloc_bulk
  2026-05-27  9:38   ` Vlastimil Babka (SUSE)
@ 2026-05-27 12:20     ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2026-05-27 12:20 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE)
  Cc: Christoph Hellwig, Harry Yoo, Andrew Morton, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Jesper Dangaard Brouer, linux-arm-msm, dri-devel, freedreno,
	linux-kernel, linux-mm, io-uring, kasan-dev, bpf, netdev

On Wed, May 27, 2026 at 11:38:21AM +0200, Vlastimil Babka (SUSE) wrote:
> On 5/27/26 09:02, Christoph Hellwig wrote:
> > The kmem_cache_alloc_bulk return value is weird.  It returns the number
> > of allocated objects, but that must always be 0 or the requested number
> > based on the implementations and the handling in the callers, but that
> > assumption is not actually documented anywhere, which confuses automated
> > review tools.
> > 
> > Fix this by returning a bool if the allocation succeeded and adding a
> > kerneldoc comment explaining the API.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Would 0 / -ENOMEM be more like what people would expect? I guess both that
> and bool are better than the current API.

I find an errno return where the API could not return anything but the
specific error code a bit odd.  But even that would be a lot better
than the current version.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: improve the kmem_cache_alloc_bulk API
  2026-05-27  9:11 ` improve the kmem_cache_alloc_bulk API Vlastimil Babka (SUSE)
@ 2026-05-27 12:21   ` Christoph Hellwig
  2026-05-27 14:07     ` Vlastimil Babka (SUSE)
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2026-05-27 12:21 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE)
  Cc: Christoph Hellwig, Harry Yoo, Andrew Morton, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin,
	Jesper Dangaard Brouer, linux-arm-msm, dri-devel, freedreno,
	linux-kernel, linux-mm, io-uring, kasan-dev, bpf, netdev

On Wed, May 27, 2026 at 11:11:31AM +0200, Vlastimil Babka (SUSE) wrote:
> > value convention.  Fix that and add documentation.
> > 
> > Note that the few comments explaining it mention that the gfp flags
> > must allow "spinning".  That's not really a term used in the memory
> > allocator, is this supposed to mean "block" or "sleep"?
> 
> Page allocator now has alloc_pages_nolock() for when no spinning is
> possible, and it uses ALLOC_TRYLOCK internally.
> 
> Slab has kmalloc_nolock() relying on that when it needs new pages.

The comment long predates that, and it isn't expressed using gfp flags,
but by requiring separate functions so I somehow doubt that was meant.
But I could also not see why it would not support GFP_ATOMIC /
GFP_NOWAIT allocation, so I might just be confused.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm/slab: improve kmem_cache_alloc_bulk
  2026-05-27  8:51   ` Jesper Dangaard Brouer
@ 2026-05-27 13:56     ` Alexander Lobakin
  2026-05-27 14:07       ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Alexander Lobakin @ 2026-05-27 13:56 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Christoph Hellwig
  Cc: Vlastimil Babka, Harry Yoo, Andrew Morton, Hao Li,
	Christoph Lameter, David Rientjes, Roman Gushchin, linux-arm-msm,
	dri-devel, freedreno, linux-kernel, linux-mm, io-uring, kasan-dev,
	bpf, netdev, Matt Fleming, kernel-team

From: Jesper Dangaard Brouer <hawk@kernel.org>
Date: Wed, 27 May 2026 10:51:42 +0200

> 
> 
> On 27/05/2026 09.02, Christoph Hellwig wrote:
>> The kmem_cache_alloc_bulk return value is weird.  It returns the number
>> of allocated objects, but that must always be 0 or the requested number
>> based on the implementations and the handling in the callers, but that
>> assumption is not actually documented anywhere, which confuses automated
>> review tools.
>>
> 
> I remember, this API behavior was requested by AKPM when I developed
> kmem_cache_alloc_bulk.  I trusted AKPM's decision, but I cannot explain
> why this choice was made.

I sorta remember that when I was reading this function, I also noticed
that it always returns only 2 possible values (0 or the requested
number), but didn't pay enough attention or it was already after I
introduced napi_skb_cache_get_bulk().

> 
> I kept the netdev code usage below. The current napi_skb_cache_get_bulk
> have a retry logic that assumes that a partial bulk number can be
> returned (which it cannot as Hellwig explains).  Cc Alex/Olek please
> review the changes below as you added this retry logic.

As far as I can see, the diff below doesn't introduce any functional
changes (but allows for a bit better compiler optimization). The logic
is still the same:

1) try to allocate non-zeroed skbs into the cache
2) if not enough, try to allocate zeroed skbs directly
3) if still not enough, return less than requested

The logic is still valid even if kmem_cache_alloc_bulk() return bool --
we might have some skbs in the cache (but less than requested) and then
the first allocation try may fail, but the second one succeed (as it
allocates from a different (the zeroed) zone).

> 
> 
>> Fix this by returning a bool if the allocation succeeded and adding a
>> kerneldoc comment explaining the API.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> ---
>>   drivers/gpu/drm/msm/msm_iommu.c       |  6 +--
>>   drivers/gpu/drm/panthor/panthor_mmu.c | 12 +++---
>>   include/linux/slab.h                  |  6 ++-
>>   io_uring/io_uring.c                   | 23 +++++------
>>   lib/test_meminit.c                    | 19 +++++----
>>   mm/kasan/kasan_test_c.c               |  5 +--
>>   mm/kfence/kfence_test.c               |  9 +++--
>>   mm/slub.c                             | 58 +++++++++++++++------------
>>   net/bpf/test_run.c                    |  7 ++--
>>   net/core/skbuff.c                     | 24 ++++++-----
>>   tools/include/linux/slab.h            |  2 +-
>>   tools/testing/shared/linux.c          | 19 ++++-----
>>   12 files changed, 93 insertions(+), 97 deletions(-)
>>
> 
> [...]
> 
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index 44ac121cfccb..73045b688385 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -288,11 +288,11 @@ static inline struct sk_buff
>> *napi_skb_cache_get(bool alloc)
>>         local_lock_nested_bh(&napi_alloc_cache.bh_lock);
>>       if (unlikely(!nc->skb_count)) {
>> -        if (alloc)
>> -            nc->skb_count =
>> kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
>> -                        GFP_ATOMIC | __GFP_NOWARN,
>> -                        NAPI_SKB_CACHE_BULK,
>> -                        nc->skb_cache);
>> +        if (alloc && kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
>> +                           GFP_ATOMIC | __GFP_NOWARN,
>> +                           NAPI_SKB_CACHE_BULK,
>> +                           nc->skb_cache))
>> +            nc->skb_count = NAPI_SKB_CACHE_BULK;
>>           if (unlikely(!nc->skb_count)) {
>>               local_unlock_nested_bh(&napi_alloc_cache.bh_lock);
>>               return NULL;
>> @@ -353,16 +353,18 @@ u32 napi_skb_cache_get_bulk(void **skbs, u32 n)
>>         /* No enough cached skbs. Try refilling the cache first */
>>       bulk = min(NAPI_SKB_CACHE_SIZE - nc->skb_count,
>> NAPI_SKB_CACHE_BULK);
>> -    nc->skb_count += kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
>> -                           GFP_ATOMIC | __GFP_NOWARN, bulk,
>> -                           &nc->skb_cache[nc->skb_count]);
>> +    if (kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
>> +                  GFP_ATOMIC | __GFP_NOWARN, bulk,
>> +                  &nc->skb_cache[nc->skb_count]))
>> +        nc->skb_count += bulk;
>>       if (likely(nc->skb_count >= n))
>>           goto get;
>>         /* Still not enough. Bulk-allocate the missing part directly,
>> zeroed */
>> -    n -= kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
>> -                   GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
>> -                   n - nc->skb_count, &skbs[nc->skb_count]);
>> +    if (kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
>> +                  GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
>> +                  n - nc->skb_count, &skbs[nc->skb_count]))
>> +        n = nc->skb_count;

kmem_cache_alloc_bulk() allocates `n - nc->skb_count`, but here you
assign `nc->skb_count` to n.
Ah wait,

n -= n - nc->skb_count
n = n - (n - nc->skb_count)
n = n - n + nc->skb_count
n = nc->skb_count

Correct :D

>>       if (likely(nc->skb_count >= n))
>>           goto get;

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> # skbuff

Thanks,
Olek

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: improve the kmem_cache_alloc_bulk API
  2026-05-27 12:21   ` Christoph Hellwig
@ 2026-05-27 14:07     ` Vlastimil Babka (SUSE)
  0 siblings, 0 replies; 11+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-05-27 14:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Harry Yoo, Andrew Morton, Hao Li, Christoph Lameter,
	David Rientjes, Roman Gushchin, Jesper Dangaard Brouer,
	linux-arm-msm, dri-devel, freedreno, linux-kernel, linux-mm,
	io-uring, kasan-dev, bpf, netdev

On 5/27/26 14:21, Christoph Hellwig wrote:
> On Wed, May 27, 2026 at 11:11:31AM +0200, Vlastimil Babka (SUSE) wrote:
>> > value convention.  Fix that and add documentation.
>> > 
>> > Note that the few comments explaining it mention that the gfp flags
>> > must allow "spinning".  That's not really a term used in the memory
>> > allocator, is this supposed to mean "block" or "sleep"?
>> 
>> Page allocator now has alloc_pages_nolock() for when no spinning is
>> possible, and it uses ALLOC_TRYLOCK internally.
>> 
>> Slab has kmalloc_nolock() relying on that when it needs new pages.
> 
> The comment long predates that, and it isn't expressed using gfp flags,

Do we both mean this comment?

-/* Note that interrupts must be enabled when calling this function. */
+/*
+ * Note that interrupts must be enabled when calling this function and gfp     
+ * flags must allow spinning.               
+ */
 int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t size,                                                                                                                                                    
                                 void **p)   

commit 46dea1744498 ("slab: refill sheaves from all nodes") from this January.
Previously it was just interrupts enabled.

> but by requiring separate functions so I somehow doubt that was meant.

Yeah, it's expressed by the _nolock variants. But slab propagates it internally
by the gfp flags, and since 46dea1744498 it affects kmem_cache_alloc_bulk().

> But I could also not see why it would not support GFP_ATOMIC /
> GFP_NOWAIT allocation, so I might just be confused.

Yeah those are supported because they can spin, just not sleep.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm/slab: improve kmem_cache_alloc_bulk
  2026-05-27 13:56     ` Alexander Lobakin
@ 2026-05-27 14:07       ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2026-05-27 14:07 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Jesper Dangaard Brouer, Christoph Hellwig, Vlastimil Babka,
	Harry Yoo, Andrew Morton, Hao Li, Christoph Lameter,
	David Rientjes, Roman Gushchin, linux-arm-msm, dri-devel,
	freedreno, linux-kernel, linux-mm, io-uring, kasan-dev, bpf,
	netdev, Matt Fleming, kernel-team

On Wed, May 27, 2026 at 03:56:38PM +0200, Alexander Lobakin wrote:
> >> -    n -= kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> >> -                   GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
> >> -                   n - nc->skb_count, &skbs[nc->skb_count]);
> >> +    if (kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> >> +                  GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
> >> +                  n - nc->skb_count, &skbs[nc->skb_count]))
> >> +        n = nc->skb_count;
> 
> kmem_cache_alloc_bulk() allocates `n - nc->skb_count`, but here you
> assign `nc->skb_count` to n.
> Ah wait,
> 
> n -= n - nc->skb_count
> n = n - (n - nc->skb_count)
> n = n - n + nc->skb_count
> n = nc->skb_count
> 
> Correct :D

Exactly the steps I went through when writing this patch :)


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-05-27 14:07 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-27  7:02 improve the kmem_cache_alloc_bulk API Christoph Hellwig
2026-05-27  7:02 ` [PATCH] mm/slab: improve kmem_cache_alloc_bulk Christoph Hellwig
2026-05-27  7:53   ` bot+bpf-ci
2026-05-27  8:51   ` Jesper Dangaard Brouer
2026-05-27 13:56     ` Alexander Lobakin
2026-05-27 14:07       ` Christoph Hellwig
2026-05-27  9:38   ` Vlastimil Babka (SUSE)
2026-05-27 12:20     ` Christoph Hellwig
2026-05-27  9:11 ` improve the kmem_cache_alloc_bulk API Vlastimil Babka (SUSE)
2026-05-27 12:21   ` Christoph Hellwig
2026-05-27 14:07     ` Vlastimil Babka (SUSE)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox