From: "Harry Yoo (Oracle)" <harry@kernel.org>
To: Marco Elver <elver@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Nathan Chancellor <nathan@kernel.org>,
Nicolas Schier <nsc@kernel.org>, Dennis Zhou <dennis@kernel.org>,
Tejun Heo <tj@kernel.org>, Christoph Lameter <cl@gentwo.org>,
Hao Li <hao.li@linux.dev>, David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Kees Cook <kees@kernel.org>,
"Gustavo A. R. Silva" <gustavoars@kernel.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Alexander Potapenko <glider@google.com>,
Dmitry Vyukov <dvyukov@google.com>,
Nick Desaulniers <nick.desaulniers+lkml@gmail.com>,
Bill Wendling <morbo@google.com>,
Justin Stitt <justinstitt@google.com>,
linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-hardening@vger.kernel.org,
kasan-dev@googlegroups.com, llvm@lists.linux.dev,
Andrey Konovalov <andreyknvl@gmail.com>,
Florent Revest <revest@google.com>,
GONG Ruiqi <gongruiqi@huaweicloud.com>,
Jann Horn <jannh@google.com>, KP Singh <kpsingh@kernel.org>,
Matteo Rizzo <matteorizzo@google.com>
Subject: Re: [PATCH v1] slab: support for compiler-assisted type-based slab cache partitioning
Date: Fri, 3 Apr 2026 15:27:48 +0900 [thread overview]
Message-ID: <ac9d5O5XehnXRc5A@hyeyoo> (raw)
In-Reply-To: <20260331111240.153913-1-elver@google.com>
On Tue, Mar 31, 2026 at 01:12:27PM +0200, Marco Elver wrote:
> Rework the general infrastructure around RANDOM_KMALLOC_CACHES into more
> flexible PARTITION_KMALLOC_CACHES, with the former being a partitioning
> mode of the latter.
>
> Introduce a new mode, TYPED_KMALLOC_CACHES, which leverages a feature
> available in Clang 22 and later, called "allocation tokens" via
> __builtin_infer_alloc_token [1]. Unlike RANDOM_KMALLOC_CACHES, this mode
> deterministically assigns a slab cache to an allocation of type T,
> regardless of allocation site.
>
> The builtin __builtin_infer_alloc_token(<malloc-args>, ...) instructs
> the compiler to infer an allocation type from arguments commonly passed
> to memory-allocating functions and returns a type-derived token ID. The
> implementation passes kmalloc-args to the builtin: the compiler performs
> best-effort type inference, and then recognizes common patterns such as
> `kmalloc(sizeof(T), ...)`, `kmalloc(sizeof(T) * n, ...)`, but also
> `(T *)kmalloc(...)`. Where the compiler fails to infer a type, the
> fallback token (default: 0) is chosen.
>
> Note: kmalloc_obj(..) APIs fix the pattern in which size and result type
> are expressed, and therefore ensure there's not much drift in which
> patterns the compiler needs to recognize. Specifically, kmalloc_obj()
> and friends expand to `(TYPE *)KMALLOC(__obj_size, GFP)`, which the
> compiler recognizes via the cast to TYPE*.
>
> Clang's default token ID calculation is described as [1]:
>
> typehashpointersplit: This mode assigns a token ID based on the hash
> of the allocated type's name, where the top half ID-space is reserved
> for types that contain pointers and the bottom half for types that do
> not contain pointers.
>
> Separating pointer-containing objects from pointerless objects and data
> allocations can help mitigate certain classes of memory corruption
> exploits [2]: an attacker who gains a buffer overflow on a primitive
> buffer cannot use it to directly corrupt pointers or other critical
> metadata in an object residing in a different, isolated heap region.
>
> It is important to note that heap isolation strategies are a
> best-effort approach and do not provide a 100% security guarantee,
> albeit one achievable at relatively low performance cost. Note that this
> also does not prevent cross-cache attacks, and SLAB_VIRTUAL [3] should
> be used as a complementary mitigation (once available).
>
> With all that, my kernel (x86 defconfig) shows me a histogram of slab
> cache object distribution per /proc/slabinfo (after boot):
>
> <slab cache> <objs> <hist>
> kmalloc-part-15 1537 +++++++++++++++
> kmalloc-part-14 2996 +++++++++++++++++++++++++++++
> kmalloc-part-13 1555 +++++++++++++++
> kmalloc-part-12 1045 ++++++++++
> kmalloc-part-11 1717 +++++++++++++++++
> kmalloc-part-10 1489 ++++++++++++++
> kmalloc-part-09 851 ++++++++
> kmalloc-part-08 710 +++++++
> kmalloc-part-07 100 +
> kmalloc-part-06 217 ++
> kmalloc-part-05 105 +
> kmalloc-part-04 4047 ++++++++++++++++++++++++++++++++++++++++
> kmalloc-part-03 276 ++
> kmalloc-part-02 283 ++
> kmalloc-part-01 316 +++
> kmalloc 1599 +++++++++++++++
>
> The above /proc/slabinfo snapshot shows me there are 6943 allocated
> objects (slabs 00 - 07) that the compiler claims contain no pointers or
> it was unable to infer the type of, and 11900 objects that contain
> pointers (slabs 08 - 15). On the whole, this looks relatively sane.
>
> Additionally, when I compile my kernel with -Rpass=alloc-token, which
> provides diagnostics where (after dead-code elimination) type inference
> failed, I see 179 allocation sites where the compiler failed to identify
> a type (down from 966 when I sent the RFC [4]). Some initial review
> confirms these are mostly variable sized buffers, but also include
> structs with trailing flexible-array members.
>
> Link: https://clang.llvm.org/docs/AllocToken.html [1]
> Link: https://blog.dfsec.com/ios/2025/05/30/blasting-past-ios-18/ [2]
> Link: https://lwn.net/Articles/944647/ [3]
> Link: https://lore.kernel.org/all/20250825154505.1558444-1-elver@google.com/ [4]
> Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434
> Signed-off-by: Marco Elver <elver@google.com>
> ---
> Changelog:
> v1:
> * Rebase and switch to builtin name that was released in Clang 22.
> * Keep RANDOM_KMALLOC_CACHES the default.
Presumably because only the latest Clang supports it?
> RFC: https://lore.kernel.org/all/20250825154505.1558444-1-elver@google.com/
> ---
> Makefile | 5 ++
> include/linux/percpu.h | 2 +-
> include/linux/slab.h | 94 ++++++++++++++++++++-------------
> kernel/configs/hardening.config | 2 +-
> mm/Kconfig | 45 ++++++++++++----
> mm/kfence/kfence_test.c | 4 +-
> mm/slab.h | 4 +-
> mm/slab_common.c | 48 ++++++++---------
> mm/slub.c | 31 +++++------
> 9 files changed, 144 insertions(+), 91 deletions(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 15a60b501b95..c0bf00ee6025 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -864,10 +877,10 @@ unsigned int kmem_cache_sheaf_size(struct slab_sheaf *sheaf);
> * with the exception of kunit tests
> */
>
> -void *__kmalloc_noprof(size_t size, gfp_t flags)
> +void *__kmalloc_noprof(size_t size, gfp_t flags, kmalloc_token_t token)
> __assume_kmalloc_alignment __alloc_size(1);
>
> -void *__kmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
> +void *__kmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node, kmalloc_token_t token)
> __assume_kmalloc_alignment __alloc_size(1);
So the @token parameter is unused when CONFIG_PARTITION_KMALLOC_CACHES is
disabled but still increases the kernel size by a few kilobytes...
but yeah I'm not sure if we can avoid it without hurting readability.
Just saying. (does anybody care?)
> void *__kmalloc_cache_noprof(struct kmem_cache *s, gfp_t flags, size_t size)
> diff --git a/mm/Kconfig b/mm/Kconfig
> index ebd8ea353687..fa4ffc1fcb80 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -247,22 +247,47 @@ config SLUB_STATS
> out which slabs are relevant to a particular load.
> Try running: slabinfo -DA
>
> -config RANDOM_KMALLOC_CACHES
> - default n
> +config PARTITION_KMALLOC_CACHES
> depends on !SLUB_TINY
> - bool "Randomize slab caches for normal kmalloc"
> + bool "Partitioned slab caches for normal kmalloc"
> help
> - A hardening feature that creates multiple copies of slab caches for
> - normal kmalloc allocation and makes kmalloc randomly pick one based
> - on code address, which makes the attackers more difficult to spray
> - vulnerable memory objects on the heap for the purpose of exploiting
> - memory vulnerabilities.
> + A hardening feature that creates multiple isolated copies of slab
> + caches for normal kmalloc allocations. This makes it more difficult
> + to exploit memory-safety vulnerabilities by attacking vulnerable
> + co-located memory objects. Several modes are provided.
>
> Currently the number of copies is set to 16, a reasonably large value
> that effectively diverges the memory objects allocated for different
> subsystems or modules into different caches, at the expense of a
> - limited degree of memory and CPU overhead that relates to hardware and
> - system workload.
> + limited degree of memory and CPU overhead that relates to hardware
> + and system workload.
> +
> +choice
> + prompt "Partitioned slab cache mode"
> + depends on PARTITION_KMALLOC_CACHES
> + default RANDOM_KMALLOC_CACHES
> + help
> + Selects the slab cache partitioning mode.
> +
> +config RANDOM_KMALLOC_CACHES
> + bool "Randomize slab caches for normal kmalloc"
> + help
> + Randomly pick a slab cache based on code address.
> +
> +config TYPED_KMALLOC_CACHES
> + bool "Type based slab cache selection for normal kmalloc"
> + depends on $(cc-option,-falloc-token-max=123)
> + help
> + Rely on Clang's allocation tokens to choose a slab cache, where token
> + IDs are derived from the allocated type.
> +
> + The current effectiveness of Clang's type inference can be judged by
> + -Rpass=alloc-token, which provides diagnostics where (after dead-code
> + elimination) type inference failed.
> +
> + Requires Clang 22 or later.
Assuming not all people building the kernel are security experts
(including myself), could you please add some insight/guidance on how to
decide between RANDOM_KMALLOC_CACHES and TYPED_KMALLOC_CACHES?
Something like what Florent wrote [1]:
| One more perspective on this: in a data center environment, attackers
| typically get a first foothold by compromising a userspace network
| service. If they can do that once, they can do that a bunch of times,
| and gain code execution on different machines every time.
|
| Before trying to exploit a kernel memory corruption to elevate
| privileges on a machine, they can test the SLAB properties of the
| running kernel to make sure it's as they wish (eg: with timing side
| channels like in the SLUBStick paper). So with RANDOM_KMALLOC_CACHES,
| attackers can just keep retrying their attacks until they land on a
| machine where the types T and S are collocated and only then proceed
| with their exploit.
|
| With TYPED_KMALLOC_CACHES (and with SLAB_VIRTUAL hopefully someday),
| they are simply never able to cross the "objects without pointers" to
| "objects with pointers" boundary which really gets in the way of many
| exploitation techniques and feels at least to me like a much stronger
| security boundary.
|
| This limit of RANDOM_KMALLOC_CACHES may not be as relevant in other
| deployments (eg: on a smartphone) but it makes me strongly prefer
| TYPED_KMALLOC_CACHES for server use cases at least.
[1] https://lore.kernel.org/all/CALGbS4U6fox7SwmdHfDuawmOWfQeQsxtA1X_VqRxTHpSs-sBYw@mail.gmail.com
Otherwise the patch is really straightforward and looks good to me.
Thanks!
--
Cheers,
Harry / Hyeonggon