Linux cgroups development
 help / color / mirror / Atom feed
From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Harry Yoo <harry@kernel.org>
Cc: Hao Li <hao.li@linux.dev>, Christoph Lameter <cl@gentwo.org>,
	David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Suren Baghdasaryan <surenb@google.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Marco Elver <elver@google.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	kasan-dev@googlegroups.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH v2 02/16] mm/slab: do not init any kfence objects on allocation
Date: Thu, 11 Jun 2026 16:47:19 +0200	[thread overview]
Message-ID: <e71bfc13-c233-4f85-a6ec-76327d3c6510@kernel.org> (raw)
In-Reply-To: <159d1e20-5b21-4329-ac9a-f7a5cb0fd56a@kernel.org>

On 6/11/26 10:34, Vlastimil Babka (SUSE) wrote:
> On 6/11/26 05:19, Harry Yoo wrote:
>> 
>>> This potentially adds overhead of the is_kfence_address() check to
>>> allocation hotpath, but that one is designed to be as small as possible,
>>> and it's only evaluated if zeroing is about to happen. This means (aside
>>> from init_on_alloc hardening) only for __GFP_ZERO allocations, and the
>>> zeroing itself comes with an overhead likely larger than the added
>>> check.
>>> 
>>> Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
>>> ---
>>>  mm/kfence/core.c |  2 +-
>>>  mm/slub.c        | 23 ++++++++---------------
>>>  2 files changed, 9 insertions(+), 16 deletions(-)
>>> 
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index e2ee8f1aaccf..8e5264d3ddbf 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -4565,9 +4565,10 @@ struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s, gfp_t flags)
>>>  
>>>  static __fastpath_inline
>>>  bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
>>> -			  gfp_t flags, size_t size, void **p, bool init,
>>> +			  gfp_t flags, size_t size, void **p,
>>>  			  unsigned int orig_size)
>>>  {
>>> +	bool init = slab_want_init_on_alloc(flags, s);
>>>  	unsigned int zero_size = s->object_size;
>>>  	bool kasan_init = init;
>>>  	size_t i;
>>> @@ -4608,7 +4609,8 @@ bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
>>>  	for (i = 0; i < size; i++) {
>>>  		p[i] = kasan_slab_alloc(s, p[i], init_flags, kasan_init);
>>>  		if (p[i] && init && (!kasan_init ||
>>> -				     !kasan_has_integrated_init()))
>>> +				     !kasan_has_integrated_init())
>>> +				 && !is_kfence_address(p[i]))
>> 
>> I hope we could make it bit more verbose and straightforward,
>> something like:
>> 
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 5d7ea72ebebd..29cf4590f9d9 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -4573,7 +4573,6 @@ bool slab_post_alloc_hook(struct kmem_cache *s,
>> gfp_t flags, size_t size,
>>  {
>>  	bool init = slab_want_init_on_alloc(flags, s);
>>  	unsigned int zero_size = s->object_size;
>> -	bool kasan_init = init;
>>  	size_t i;
>>  	gfp_t init_flags = flags & gfp_allowed_mask;
>> 
>> @@ -4591,29 +4590,37 @@ bool slab_post_alloc_hook(struct kmem_cache *s,
>> gfp_t flags, size_t size,
>>  	if (slub_debug_orig_size(s))
>>  		zero_size = ac->orig_size;
>> 
>> -	/*
>> -	 * When slab_debug is enabled, avoid memory initialization integrated
>> -	 * into KASAN and instead zero out the memory via the memset below with
>> -	 * the proper size. Otherwise, KASAN might overwrite SLUB redzones and
>> -	 * cause false-positive reports. This does not lead to a performance
>> -	 * penalty on production builds, as slab_debug is not intended to be
>> -	 * enabled there.
>> -	 */
>> -	if (__slub_debug_enabled())
>> -		kasan_init = false;
>> -
>> -	/*
>> -	 * As memory initialization might be integrated into KASAN,
>> -	 * kasan_slab_alloc and initialization memset must be
>> -	 * kept together to avoid discrepancies in behavior.
>> -	 *
>> -	 * As p[i] might get tagged, memset and kmemleak hook come after KASAN.
>> -	 */
>>  	for (i = 0; i < size; i++) {
>> -		p[i] = kasan_slab_alloc(s, p[i], init_flags, kasan_init);
>> -		if (p[i] && init && (!kasan_init ||
>> -				     !kasan_has_integrated_init())
>> -				 && !is_kfence_address(p[i]))
>> +		bool skip_init = false;
>> +
>> +		if (is_kfence_address(p[i])) {
>> +			/*
>> +			 * kfence zeroes the object instead of SLUB to avoid
>> +			 * overwriting its own redzone, and zeroing of
>> +			 * s->object_size will corrupt it.
>> +			 */
>> +			skip_init = true;
> 
> But now we perform this check even if init is false, making it more hot.
> 
>> +		} else if (__slub_debug_enabled()) {
>> +			/*
>> +			 * KASAN never zeroes memory when slab_debug is enabled
>> +			 * to avoid overwriting SLUB redzones. This does not
>> +			 * lead to a performance penalty on production builds,
>> +			 * as slab_debug is not intended to be enabled there.
>> +			 */
>> +			skip_init = false;
>> +		} else if (kasan_has_integrated_init()) {
>> +			/*
>> +			 * ARM64 can set memory tags and zero the memory using
>> +			 * a single instruction. Since HW_TAGS KASAN uses that
>> +			 * while tagging the object, a separate zeroing is
>> +			 * unnecessary unless slab_debug is enabled.
>> +			 */
> 
> (I like the new/updated comments)
> 
>> +			skip_init = true;
>> +		}>
> 
> And these two are now done in every loop iteration even though they don't
> depend on the object. Yeah it's a static key and build-time constant but still.
> 
> But maybe there's some middle ground?
> 
> Above the loop do (with your comments).

OK, not so simple, we still need the kasan_init variable too.
I've ended up with this, thoughts?

From 3a1c4398ce9f361a4e6f4d9946eab6237eea89c2 Mon Sep 17 00:00:00 2001
From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Date: Wed, 10 Jun 2026 17:40:04 +0200
Subject: [PATCH] mm/slab: do not init any kfence objects on allocation

When init (zeroing) on allocation is requested, for kmalloc() we
generally have to zero the full object size even if a smaller size is
requested, in order to provide krealloc()'s __GFP_ZERO guarantees.

When we end up allocating a kfence object, kfence perfoms the zeroing on
its own because has its own redzone beyond the requested size. Thus
slab_post_alloc_hook() has an 'init' parameter which has to be evaluated
in all callers (via slab_want_init_on_alloc()) and should be false for
kfence allocations.

For kfence allocations in slab_alloc_node() this is achieved by subtly
skipping over the slab_want_init_on_alloc() call. Other callers (i.e.
kmem_cache_alloc_bulk_noprof()) however evaluate it unconditionally even
if they do end up with a kfence allocation. This is only subtly not a
problem, as those are not kmalloc allocations and thus the "requested
size" equals s->object_size and thus it cannot interfere with kfence's
redzone. There's just a unnecessary double zeroing (in both kfence and
slab_post_alloc_hook()), but it's all very fragile and contradicts the
comment in kfence_guarded_alloc().

Remove this subtlety and simplify the code by eliminating the init
parameter from slab_post_alloc_hook() and make it call
slab_want_init_on_alloc() itself. Instead add a is_kfence_address()
check before performing the memset, which will start doing the right
thing for all callers of slab_post_alloc_hook().

This potentially adds overhead of the is_kfence_address() check to
allocation hotpath, but that one is designed to be as small as possible,
and it's only evaluated if zeroing is about to happen. This means (aside
from init_on_alloc hardening) only for __GFP_ZERO allocations, and the
zeroing itself comes with an overhead likely larger than the added
check.

While at it, refactor the handling of evaluating when KASAN does the
init instead of SLUB, with no intended functional changes. A
non-functional change is that we don't pass kasan_init as true to
kasan_slab_alloc() if kasan has no integrated init, but then the value
is ignored anyway, so it's theoretically more correct.

Thanks to Harry Yoo for the initial refactoring attempt, and for updated
comments that are used here.

Link: https://patch.msgid.link/20260610-slab_alloc_flags-v2-2-7190909db118@kernel.org
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
 mm/kfence/core.c |  2 +-
 mm/slub.c        | 60 ++++++++++++++++++++++--------------------------
 2 files changed, 29 insertions(+), 33 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 655dc5ce3240..5e0b406924e9 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -500,7 +500,7 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g
 
 	/*
 	 * We check slab_want_init_on_alloc() ourselves, rather than letting
-	 * SL*B do the initialization, as otherwise we might overwrite KFENCE's
+	 * slab do the initialization, as otherwise it might overwrite KFENCE's
 	 * redzone.
 	 */
 	if (unlikely(slab_want_init_on_alloc(gfp, cache)))
diff --git a/mm/slub.c b/mm/slub.c
index e2ee8f1aaccf..d762cbe5d040 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4565,13 +4565,13 @@ struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s, gfp_t flags)
 
 static __fastpath_inline
 bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
-			  gfp_t flags, size_t size, void **p, bool init,
+			  gfp_t flags, size_t size, void **p,
 			  unsigned int orig_size)
 {
+	bool init = slab_want_init_on_alloc(flags, s);
 	unsigned int zero_size = s->object_size;
-	bool kasan_init = init;
-	size_t i;
 	gfp_t init_flags = flags & gfp_allowed_mask;
+	bool kasan_init = false;
 
 	/*
 	 * For kmalloc object, the allocated size (object_size) can be larger
@@ -4588,28 +4588,33 @@ bool slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
 		zero_size = orig_size;
 
 	/*
-	 * When slab_debug is enabled, avoid memory initialization integrated
-	 * into KASAN and instead zero out the memory via the memset below with
-	 * the proper size. Otherwise, KASAN might overwrite SLUB redzones and
-	 * cause false-positive reports. This does not lead to a performance
+	 * ARM64 can set memory tags and zero the memory using a single
+	 * instruction. Since HW_TAGS KASAN uses that while tagging the object,
+	 * separate zeroing is unnecessary.
+	 *
+	 * However, KASAN never zeroes memory when slab_debug is enabled to
+	 * avoid overwriting SLUB redzones. This does not lead to a performance
 	 * penalty on production builds, as slab_debug is not intended to be
 	 * enabled there.
 	 */
-	if (__slub_debug_enabled())
-		kasan_init = false;
+	if (kasan_has_integrated_init() && !__slub_debug_enabled()) {
+		kasan_init = init;
+		init = false;
+	}
 
-	/*
-	 * As memory initialization might be integrated into KASAN,
-	 * kasan_slab_alloc and initialization memset must be
-	 * kept together to avoid discrepancies in behavior.
-	 *
-	 * As p[i] might get tagged, memset and kmemleak hook come after KASAN.
-	 */
-	for (i = 0; i < size; i++) {
+	for (size_t i = 0; i < size; i++) {
 		p[i] = kasan_slab_alloc(s, p[i], init_flags, kasan_init);
-		if (p[i] && init && (!kasan_init ||
-				     !kasan_has_integrated_init()))
+
+		/*
+		 * memset and hooks come after KASAN as p[i] might get tagged
+		 *
+		 * kfence zeroes the object instead of SLUB to avoid overwriting
+		 * its own redzone starting at orig_size, which could happen
+		 * with SLUB zeroing full s->object_size
+		 */
+		if (init && p[i] && !is_kfence_address(p[i]))
 			memset(p[i], 0, zero_size);
+
 		if (gfpflags_allow_spinning(flags))
 			kmemleak_alloc_recursive(p[i], s->object_size, 1,
 						 s->flags, init_flags);
@@ -4910,7 +4915,6 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
 		gfp_t gfpflags, int node, unsigned long addr, size_t orig_size)
 {
 	void *object;
-	bool init = false;
 
 	s = slab_pre_alloc_hook(s, gfpflags);
 	if (unlikely(!s))
@@ -4926,16 +4930,13 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list
 		object = __slab_alloc_node(s, gfpflags, node, addr, orig_size);
 
 	maybe_wipe_obj_freeptr(s, object);
-	init = slab_want_init_on_alloc(gfpflags, s);
 
 out:
 	/*
-	 * When init equals 'true', like for kzalloc() family, only
-	 * @orig_size bytes might be zeroed instead of s->object_size
 	 * In case this fails due to memcg_slab_post_alloc_hook(),
 	 * object is set to NULL
 	 */
-	slab_post_alloc_hook(s, lru, gfpflags, 1, &object, init, orig_size);
+	slab_post_alloc_hook(s, lru, gfpflags, 1, &object, orig_size);
 
 	return object;
 }
@@ -5230,7 +5231,6 @@ kmem_cache_alloc_from_sheaf_noprof(struct kmem_cache *s, gfp_t gfp,
 				   struct slab_sheaf *sheaf)
 {
 	void *ret = NULL;
-	bool init;
 
 	if (sheaf->size == 0)
 		goto out;
@@ -5240,10 +5240,8 @@ kmem_cache_alloc_from_sheaf_noprof(struct kmem_cache *s, gfp_t gfp,
 	if (likely(!ret))
 		ret = sheaf->objects[--sheaf->size];
 
-	init = slab_want_init_on_alloc(gfp, s);
-
 	/* add __GFP_NOFAIL to force successful memcg charging */
-	slab_post_alloc_hook(s, NULL, gfp | __GFP_NOFAIL, 1, &ret, init, s->object_size);
+	slab_post_alloc_hook(s, NULL, gfp | __GFP_NOFAIL, 1, &ret, s->object_size);
 out:
 	trace_kmem_cache_alloc(_RET_IP_, ret, s, gfp, NUMA_NO_NODE);
 
@@ -5423,8 +5421,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
 
 success:
 	maybe_wipe_obj_freeptr(s, ret);
-	slab_post_alloc_hook(s, NULL, alloc_gfp, 1, &ret,
-			     slab_want_init_on_alloc(alloc_gfp, s), orig_size);
+	slab_post_alloc_hook(s, NULL, alloc_gfp, 1, &ret, orig_size);
 
 	ret = kasan_kmalloc(s, ret, orig_size, alloc_gfp);
 	return ret;
@@ -7339,8 +7336,7 @@ bool kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags,
 
 out:
 	/* memcg and kmem_cache debug support and memory initialization */
-	return likely(slab_post_alloc_hook(s, NULL, flags, size, p,
-			slab_want_init_on_alloc(flags, s), s->object_size));
+	return likely(slab_post_alloc_hook(s, NULL, flags, size, p, s->object_size));
 }
 EXPORT_SYMBOL(kmem_cache_alloc_bulk_noprof);
 
-- 
2.54.0



  reply	other threads:[~2026-06-11 14:47 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-10 15:40 [PATCH v2 00/16] mm/slab: introduce alloc_flags and slab_alloc_context Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 01/16] mm/slab: do not limit zeroing to orig_size when only red zoning is enabled Vlastimil Babka (SUSE)
2026-06-11  4:28   ` Harry Yoo
2026-06-10 15:40 ` [PATCH v2 02/16] mm/slab: do not init any kfence objects on allocation Vlastimil Babka (SUSE)
2026-06-11  3:19   ` Harry Yoo
2026-06-11  8:34     ` Vlastimil Babka (SUSE)
2026-06-11 14:47       ` Vlastimil Babka (SUSE) [this message]
2026-06-11 15:11         ` Harry Yoo
2026-06-11 16:37           ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 03/16] mm/slab: stop inlining __slab_alloc_node() Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 04/16] mm/slab: introduce slab_alloc_context Vlastimil Babka (SUSE)
2026-06-11  4:49   ` Harry Yoo
2026-06-10 15:40 ` [PATCH v2 05/16] mm/slab: introduce alloc_flags and SLAB_ALLOC_TRYLOCK Vlastimil Babka (SUSE)
2026-06-11  4:57   ` Harry Yoo
2026-06-11  6:40   ` Harry Yoo
2026-06-11  8:51     ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 06/16] mm/slab: add alloc_flags to slab_alloc_context Vlastimil Babka (SUSE)
2026-06-11  5:06   ` Harry Yoo
2026-06-10 15:40 ` [PATCH v2 07/16] mm/slab: replace struct partial_context with slab_alloc_context Vlastimil Babka (SUSE)
2026-06-11  6:05   ` Harry Yoo
2026-06-10 15:40 ` [PATCH v2 08/16] mm/slab: pass alloc_flags to new slab allocation Vlastimil Babka (SUSE)
2026-06-11  7:52   ` Harry Yoo
2026-06-10 15:40 ` [PATCH v2 09/16] mm/slab: pass alloc_flags through slab_post_alloc_hook() chain Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 10/16] mm/slab: replace slab_alloc_node() parameters with slab_alloc_context Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 11/16] mm/slab: allow kmem_cache_alloc_bulk() with any gfp flags Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 12/16] mm/slab: pass slab_alloc_context to __do_kmalloc_node() Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 13/16] mm/slab: allow __GFP_NOMEMALLOC and __GFP_NOWARN for kmalloc_nolock() Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 14/16] mm/slab: introduce kmalloc_flags() Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 15/16] mm/slab: remove __GFP_NO_OBJ_EXT usage from alloc_slab_obj_exts() Vlastimil Babka (SUSE)
2026-06-11 16:28   ` Vlastimil Babka (SUSE)
2026-06-10 15:40 ` [PATCH v2 16/16] mm/slab: replace __GFP_NO_OBJ_EXT with SLAB_ALLOC_NO_RECURSE for sheaves Vlastimil Babka (SUSE)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e71bfc13-c233-4f85-a6ec-76327d3c6510@kernel.org \
    --to=vbabka@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=ast@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=cl@gentwo.org \
    --cc=dvyukov@google.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=hao.li@linux.dev \
    --cc=harry@kernel.org \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=surenb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox