linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/5] slab: Set freed variables to NULL by default
@ 2025-03-21 20:40 Kees Cook
  2025-03-21 20:40 ` [PATCH 1/5] treewide: Replace kfree() casts with union members Kees Cook
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Kees Cook @ 2025-03-21 20:40 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kees Cook, Christoph Lameter, Miguel Ojeda, Nathan Chancellor,
	Marco Elver, Nick Desaulniers, Przemek Kitszel, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
	Hyeonggon Yoo, linux-kernel, linux-mm, linux-hardening

Hi!

This is very much an RFC series, but I wanted to make sure it actually
worked before I proposed it. This series seeks to give kfree() the
side-effect of assigning NULL to the kfree() argument when possible.
This would make a subset of "dangling pointer" flaws turn into NULL
derefs instead of Use-After-Free[1]. It effectively turns:

	kfree(var);

into:

	kfree(var);
	var = NULL;

when "var" is actually addressable. (i.e. not "kfree(get_ptrs())" etc.)

It depends on a builtin, __builtin_is_lvalue(), which is not landed in any
compiler yet, but I do have it working in a Clang patch[2]. This should
be essentially free (pardon the pun), so I think if folks can tolerate
a little bit of renaming needed for when kfree is needed as a function
and not a macro, it should be good. Please let me know what you think.

Thanks!

-Kees

(Yes, I'm still working on the kmalloc_objs() series, but I needed to
take a break from fixing all the allocation corner cases I've found there.)

[1] https://github.com/KSPP/linux/issues/87
[2] https://github.com/kees/llvm-project/commits/builtin_is_lvalue/

Kees Cook (5):
  treewide: Replace kfree() casts with union members
  treewide: Prepare for kfree() to __kfree() rename
  compiler_types: Introduce __is_lvalue()
  slab: Set freed variables to NULL by default
  [DEBUG] slab: Report number of NULLings

 arch/mips/alchemy/common/dbdma.c |  2 +-
 include/linux/compiler_types.h   | 10 ++++++++++
 include/linux/netlink.h          |  1 +
 include/linux/slab.h             | 33 ++++++++++++++++++++++++++++++--
 include/net/pkt_cls.h            |  5 ++++-
 io_uring/futex.c                 |  2 +-
 io_uring/io_uring.c              | 12 ++++++------
 kernel/bpf/core.c                |  3 ++-
 mm/slab_common.c                 | 12 ++++++++----
 mm/slub.c                        |  6 +++---
 net/sched/ematch.c               |  2 +-
 net/wireless/nl80211.c           |  2 +-
 12 files changed, 69 insertions(+), 21 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/5] treewide: Replace kfree() casts with union members
  2025-03-21 20:40 [RFC 0/5] slab: Set freed variables to NULL by default Kees Cook
@ 2025-03-21 20:40 ` Kees Cook
  2025-03-23 10:26   ` David Laight
  2025-03-21 20:40 ` [PATCH 2/5] treewide: Prepare for kfree() to __kfree() rename Kees Cook
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Kees Cook @ 2025-03-21 20:40 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kees Cook, Christoph Lameter, Miguel Ojeda, Nathan Chancellor,
	Marco Elver, Nick Desaulniers, Przemek Kitszel, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
	Hyeonggon Yoo, linux-kernel, linux-mm, linux-hardening

As a prerequisite to being able to optionally take the address of any
lvalues used in a call to kfree(), replace casts to the kfree() argument
with unions to include an actual pointer.

This is an example subset. There are another handful remaining:

$ git grep '\bkfree((void \*)'
arch/mips/alchemy/common/dbdma.c:       kfree((void *)ctp->cdb_membase);
arch/s390/kernel/crash_dump.c:  kfree((void *)(unsigned long)addr);
drivers/crypto/atmel-sha204a.c: kfree((void *)i2c_priv->hwrng.priv);
drivers/infiniband/hw/cxgb4/mem.c:              kfree((void *) (unsigned long) mhp->kva);
drivers/isdn/mISDN/fsm.c:       kfree((void *) fsm->jumpmatrix);
drivers/misc/altera-stapl/altera.c:           kfree((void *)vars[variable_id]);
drivers/misc/altera-stapl/altera.c:                             kfree((void *)vars[i]);
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h:                        kfree((void *)x); \
drivers/net/ethernet/qlogic/qed/qed_main.c:     kfree((void *)cdev);
drivers/net/usb/cx82310_eth.c:  kfree((void *)dev->partial_data);
drivers/net/usb/cx82310_eth.c:  kfree((void *)dev->partial_data);
drivers/scsi/snic/snic_io.c:            kfree((void *)rqi->sge_va);
drivers/scsi/snic/snic_io.c:                    kfree((void *)rqi->sge_va);
drivers/staging/rtl8723bs/os_dep/os_intfs.c:    /* kfree((void *)padapter); */
drivers/video/fbdev/grvga.c:            kfree((void *)virtual_start);
drivers/video/fbdev/grvga.c:                    kfree((void *)info->screen_base);
drivers/xen/grant-table.c:                      kfree((void *)page_private(pages[i]));
net/ieee802154/nl802154.c:      kfree((void *)cb->args[0]);
net/sched/em_ipset.c:           kfree((void *) em->data);
net/sched/em_meta.c:    kfree((void *) v->val);

Signed-off-by: Kees Cook <kees@kernel.org>
---
 include/linux/netlink.h | 1 +
 include/net/pkt_cls.h   | 5 ++++-
 net/sched/ematch.c      | 2 +-
 net/wireless/nl80211.c  | 2 +-
 4 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index c3ae84a77e16..26eb9eea8a74 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -295,6 +295,7 @@ struct netlink_callback {
 	bool			strict_check;
 	union {
 		u8		ctx[NETLINK_CTX_SIZE];
+		void *		ptr;
 
 		/* args is deprecated. Cast a struct over ctx instead
 		 * for proper type safety.
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index c64fd896b1f9..4faf8d6eed1d 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -403,7 +403,10 @@ struct tcf_ematch_ops;
  */
 struct tcf_ematch {
 	struct tcf_ematch_ops * ops;
-	unsigned long		data;
+	union {
+		unsigned long	data;
+		void *		ptr;
+	};
 	unsigned int		datalen;
 	u16			matchid;
 	u16			flags;
diff --git a/net/sched/ematch.c b/net/sched/ematch.c
index 5c1235e6076a..f4b00e7aca6a 100644
--- a/net/sched/ematch.c
+++ b/net/sched/ematch.c
@@ -411,7 +411,7 @@ void tcf_em_tree_destroy(struct tcf_ematch_tree *tree)
 			if (em->ops->destroy)
 				em->ops->destroy(em);
 			else if (!tcf_em_is_simple(em))
-				kfree((void *) em->data);
+				kfree(em->ptr);
 			module_put(em->ops->owner);
 		}
 	}
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index d7d3da0f6833..b5a3ae07d84c 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -3270,7 +3270,7 @@ static int nl80211_dump_wiphy(struct sk_buff *skb, struct netlink_callback *cb)
 
 static int nl80211_dump_wiphy_done(struct netlink_callback *cb)
 {
-	kfree((void *)cb->args[0]);
+	kfree(cb->ptr);
 	return 0;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/5] treewide: Prepare for kfree() to __kfree() rename
  2025-03-21 20:40 [RFC 0/5] slab: Set freed variables to NULL by default Kees Cook
  2025-03-21 20:40 ` [PATCH 1/5] treewide: Replace kfree() casts with union members Kees Cook
@ 2025-03-21 20:40 ` Kees Cook
  2025-03-21 20:40 ` [PATCH 3/5] compiler_types: Introduce __is_lvalue() Kees Cook
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2025-03-21 20:40 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kees Cook, Christoph Lameter, Miguel Ojeda, Nathan Chancellor,
	Marco Elver, Nick Desaulniers, Przemek Kitszel, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
	Hyeonggon Yoo, linux-kernel, linux-mm, linux-hardening

In preparation for making kfree() a wrapper macro, replace address-taken
instances of kfree with __kfree so the future renaming of kfree to
__kfree will work correctly. (Or to avoid needing to create a union for
a cast.)

This is an example subset needed to build my bootable image. I'm sure
there are more, but they immediately throw build errors when encountered
so they cannot be silently missed.

Signed-off-by: Kees Cook <kees@kernel.org>
---
 arch/mips/alchemy/common/dbdma.c |  2 +-
 include/linux/slab.h             |  2 ++
 io_uring/futex.c                 |  2 +-
 io_uring/io_uring.c              | 12 ++++++------
 kernel/bpf/core.c                |  3 ++-
 5 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/mips/alchemy/common/dbdma.c b/arch/mips/alchemy/common/dbdma.c
index 6a3c890f7bbf..08548e5daead 100644
--- a/arch/mips/alchemy/common/dbdma.c
+++ b/arch/mips/alchemy/common/dbdma.c
@@ -422,7 +422,7 @@ u32 au1xxx_dbdma_ring_alloc(u32 chanid, int entries)
 		 * Lost....do it again, allocate extra, and round
 		 * the address base.
 		 */
-		kfree((const void *)desc_base);
+		__kfree((const void *)desc_base);
 		i = entries * sizeof(au1x_ddma_desc_t);
 		i += (sizeof(au1x_ddma_desc_t) - 1);
 		desc_base = (u32)kmalloc(i, GFP_KERNEL|GFP_DMA);
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 09eedaecf120..3e807ccc8583 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -469,6 +469,8 @@ void kfree(const void *objp);
 void kfree_sensitive(const void *objp);
 size_t __ksize(const void *objp);
 
+#define __kfree(x)	kfree(x)
+
 DEFINE_FREE(kfree, void *, if (!IS_ERR_OR_NULL(_T)) kfree(_T))
 DEFINE_FREE(kfree_sensitive, void *, if (_T) kfree_sensitive(_T))
 
diff --git a/io_uring/futex.c b/io_uring/futex.c
index 43e2143255f5..e46a019fbd08 100644
--- a/io_uring/futex.c
+++ b/io_uring/futex.c
@@ -41,7 +41,7 @@ bool io_futex_cache_init(struct io_ring_ctx *ctx)
 
 void io_futex_cache_free(struct io_ring_ctx *ctx)
 {
-	io_alloc_cache_free(&ctx->futex_cache, kfree);
+	io_alloc_cache_free(&ctx->futex_cache, __kfree);
 }
 
 static void __io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ceacf6230e34..0a41a3a981b1 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -360,11 +360,11 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 free_ref:
 	percpu_ref_exit(&ctx->refs);
 err:
-	io_alloc_cache_free(&ctx->apoll_cache, kfree);
+	io_alloc_cache_free(&ctx->apoll_cache, __kfree);
 	io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free);
 	io_alloc_cache_free(&ctx->rw_cache, io_rw_cache_free);
-	io_alloc_cache_free(&ctx->uring_cache, kfree);
-	io_alloc_cache_free(&ctx->msg_cache, kfree);
+	io_alloc_cache_free(&ctx->uring_cache, __kfree);
+	io_alloc_cache_free(&ctx->msg_cache, __kfree);
 	io_futex_cache_free(ctx);
 	kvfree(ctx->cancel_table.hbs);
 	xa_destroy(&ctx->io_bl_xa);
@@ -2702,11 +2702,11 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 	io_sqe_files_unregister(ctx);
 	io_cqring_overflow_kill(ctx);
 	io_eventfd_unregister(ctx);
-	io_alloc_cache_free(&ctx->apoll_cache, kfree);
+	io_alloc_cache_free(&ctx->apoll_cache, __kfree);
 	io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free);
 	io_alloc_cache_free(&ctx->rw_cache, io_rw_cache_free);
-	io_alloc_cache_free(&ctx->uring_cache, kfree);
-	io_alloc_cache_free(&ctx->msg_cache, kfree);
+	io_alloc_cache_free(&ctx->uring_cache, __kfree);
+	io_alloc_cache_free(&ctx->msg_cache, __kfree);
 	io_futex_cache_free(ctx);
 	io_destroy_buffers(ctx);
 	io_free_region(ctx, &ctx->param_region);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index da729cbbaeb9..9d2721d24c40 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -280,7 +280,8 @@ void __bpf_prog_free(struct bpf_prog *fp)
 		mutex_destroy(&fp->aux->used_maps_mutex);
 		mutex_destroy(&fp->aux->dst_mutex);
 		kfree(fp->aux->poke_tab);
-		kfree(fp->aux);
+		/* "fp" may be in read-only memory */
+		__kfree(fp->aux);
 	}
 	free_percpu(fp->stats);
 	free_percpu(fp->active);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/5] compiler_types: Introduce __is_lvalue()
  2025-03-21 20:40 [RFC 0/5] slab: Set freed variables to NULL by default Kees Cook
  2025-03-21 20:40 ` [PATCH 1/5] treewide: Replace kfree() casts with union members Kees Cook
  2025-03-21 20:40 ` [PATCH 2/5] treewide: Prepare for kfree() to __kfree() rename Kees Cook
@ 2025-03-21 20:40 ` Kees Cook
  2025-03-22  3:38   ` Jann Horn
  2025-03-21 20:41 ` [PATCH 4/5] slab: Set freed variables to NULL by default Kees Cook
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Kees Cook @ 2025-03-21 20:40 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kees Cook, Miguel Ojeda, Nathan Chancellor, Marco Elver,
	Nick Desaulniers, Przemek Kitszel, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Roman Gushchin, Hyeonggon Yoo, linux-kernel, linux-mm,
	linux-hardening

If __builtin_is_lvalue() is available, use it with __is_lvalue(). There
is patch to Clang to provide this builtin now[1].

Link: https://github.com/kees/llvm-project/commits/builtin_is_lvalue/ [1]
Signed-off-by: Kees Cook <kees@kernel.org>
---
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
---
 include/linux/compiler_types.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index e09d323be845..eb016808dfa8 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -468,6 +468,16 @@ struct ftrace_likely_data {
 #define __annotated(var, attr)	__builtin_has_attribute(var, attr)
 #endif
 
+/*
+ * Determine if a given expression is an lvalue for potential
+ * assignment. Without the builtin, report nothing is an lvalue.
+ */
+#if __has_builtin(__builtin_is_lvalue)
+#define __is_lvalue(expr)	__builtin_is_lvalue(expr)
+#else
+#define __is_lvalue(expr)	false
+#endif
+
 /*
  * Some versions of gcc do not mark 'asm goto' volatile:
  *
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/5] slab: Set freed variables to NULL by default
  2025-03-21 20:40 [RFC 0/5] slab: Set freed variables to NULL by default Kees Cook
                   ` (2 preceding siblings ...)
  2025-03-21 20:40 ` [PATCH 3/5] compiler_types: Introduce __is_lvalue() Kees Cook
@ 2025-03-21 20:41 ` Kees Cook
  2025-03-22  1:50   ` Jann Horn
  2025-03-27 19:42   ` Matthew Wilcox
  2025-03-21 20:41 ` [PATCH 5/5] [DEBUG] slab: Report number of NULLings Kees Cook
  2025-03-27 13:00 ` [RFC 0/5] slab: Set freed variables to NULL by default Harry Yoo
  5 siblings, 2 replies; 16+ messages in thread
From: Kees Cook @ 2025-03-21 20:41 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kees Cook, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	linux-mm, Miguel Ojeda, Nathan Chancellor, Marco Elver,
	Nick Desaulniers, Przemek Kitszel, linux-kernel, linux-hardening

To defang a subset of "dangling pointer" use-after-free flaws[1], take the
address of any lvalues passed to kfree() and set them to NULL after
freeing.

To do this manually, kfree_and_null() (and the "sensitive" variant)
are introduced.

Link: https://github.com/KSPP/linux/issues/87 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
---
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org
---
 include/linux/slab.h | 30 +++++++++++++++++++++++++++---
 mm/slab_common.c     |  8 ++++----
 mm/slub.c            |  6 +++---
 3 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 3e807ccc8583..2717ad238fa2 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -465,11 +465,35 @@ void * __must_check krealloc_noprof(const void *objp, size_t new_size,
 				    gfp_t flags) __realloc_size(2);
 #define krealloc(...)				alloc_hooks(krealloc_noprof(__VA_ARGS__))
 
-void kfree(const void *objp);
-void kfree_sensitive(const void *objp);
+void __kfree(const void *objp);
+void __kfree_sensitive(const void *objp);
 size_t __ksize(const void *objp);
 
-#define __kfree(x)	kfree(x)
+static inline void kfree_and_null(void **ptr)
+{
+	__kfree(*ptr);
+	*ptr = NULL;
+}
+static inline void kfree_sensitive_and_null(void **ptr)
+{
+	__kfree_sensitive(*ptr);
+	*ptr = NULL;
+}
+
+#define __force_lvalue_expr(x)	\
+	__builtin_choose_expr(__is_lvalue(x), x, (void *){ NULL })
+
+#define __free_and_null(__how, x)	\
+({					\
+	typeof(x) *__ptr = &(x);	\
+	__how ## _and_null((void **)__ptr);	\
+})
+#define __free_and_maybe_null(__how, x)	\
+	__builtin_choose_expr(__is_lvalue(x), \
+		__free_and_null(__how, __force_lvalue_expr(x)), \
+		__kfree(x))
+#define kfree(x)	   __free_and_maybe_null(kfree, x)
+#define kfree_sensitive(x) __free_and_maybe_null(kfree_sensitive, x)
 
 DEFINE_FREE(kfree, void *, if (!IS_ERR_OR_NULL(_T)) kfree(_T))
 DEFINE_FREE(kfree_sensitive, void *, if (_T) kfree_sensitive(_T))
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 4030907b6b7d..9a82952ec266 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1211,7 +1211,7 @@ module_init(slab_proc_init);
 #endif /* CONFIG_SLUB_DEBUG */
 
 /**
- * kfree_sensitive - Clear sensitive information in memory before freeing
+ * __kfree_sensitive - Clear sensitive information in memory before freeing
  * @p: object to free memory of
  *
  * The memory of the object @p points to is zeroed before freed.
@@ -1221,7 +1221,7 @@ module_init(slab_proc_init);
  * deal bigger than the requested buffer size passed to kmalloc(). So be
  * careful when using this function in performance sensitive code.
  */
-void kfree_sensitive(const void *p)
+void __kfree_sensitive(const void *p)
 {
 	size_t ks;
 	void *mem = (void *)p;
@@ -1231,9 +1231,9 @@ void kfree_sensitive(const void *p)
 		kasan_unpoison_range(mem, ks);
 		memzero_explicit(mem, ks);
 	}
-	kfree(mem);
+	__kfree(mem);
 }
-EXPORT_SYMBOL(kfree_sensitive);
+EXPORT_SYMBOL(__kfree_sensitive);
 
 size_t ksize(const void *objp)
 {
diff --git a/mm/slub.c b/mm/slub.c
index 1f50129dcfb3..38dd898667bf 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4729,12 +4729,12 @@ static void free_large_kmalloc(struct folio *folio, void *object)
 }
 
 /**
- * kfree - free previously allocated memory
+ * __kfree - free previously allocated memory
  * @object: pointer returned by kmalloc() or kmem_cache_alloc()
  *
  * If @object is NULL, no operation is performed.
  */
-void kfree(const void *object)
+void __kfree(const void *object)
 {
 	struct folio *folio;
 	struct slab *slab;
@@ -4756,7 +4756,7 @@ void kfree(const void *object)
 	s = slab->slab_cache;
 	slab_free(s, slab, x, _RET_IP_);
 }
-EXPORT_SYMBOL(kfree);
+EXPORT_SYMBOL(__kfree);
 
 static __always_inline __realloc_size(2) void *
 __do_krealloc(const void *p, size_t new_size, gfp_t flags)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/5] [DEBUG] slab: Report number of NULLings
  2025-03-21 20:40 [RFC 0/5] slab: Set freed variables to NULL by default Kees Cook
                   ` (3 preceding siblings ...)
  2025-03-21 20:41 ` [PATCH 4/5] slab: Set freed variables to NULL by default Kees Cook
@ 2025-03-21 20:41 ` Kees Cook
  2025-03-24 16:16   ` Christoph Lameter (Ampere)
  2025-03-27 13:00 ` [RFC 0/5] slab: Set freed variables to NULL by default Harry Yoo
  5 siblings, 1 reply; 16+ messages in thread
From: Kees Cook @ 2025-03-21 20:41 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kees Cook, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	linux-mm, Miguel Ojeda, Nathan Chancellor, Marco Elver,
	Nick Desaulniers, Przemek Kitszel, linux-kernel, linux-hardening

Just to get a sense of what's happening, report the number of NULL
assignments that have been done. After booting an otherwise standard
Ubuntu image, this shows about 240,000 NULLifications have been performed.

Signed-off-by: Kees Cook <kees@kernel.org>
---
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org
---
 include/linux/slab.h | 3 +++
 mm/slab_common.c     | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 2717ad238fa2..a4740c8b6ccb 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -469,6 +469,8 @@ void __kfree(const void *objp);
 void __kfree_sensitive(const void *objp);
 size_t __ksize(const void *objp);
 
+extern atomic_t count_nulled;
+
 static inline void kfree_and_null(void **ptr)
 {
 	__kfree(*ptr);
@@ -487,6 +489,7 @@ static inline void kfree_sensitive_and_null(void **ptr)
 ({					\
 	typeof(x) *__ptr = &(x);	\
 	__how ## _and_null((void **)__ptr);	\
+	atomic_inc(&count_nulled);	\
 })
 #define __free_and_maybe_null(__how, x)	\
 	__builtin_choose_expr(__is_lvalue(x), \
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 9a82952ec266..0412cbab81f9 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -42,6 +42,9 @@ LIST_HEAD(slab_caches);
 DEFINE_MUTEX(slab_mutex);
 struct kmem_cache *kmem_cache;
 
+atomic_t count_nulled = ATOMIC_INIT(0);
+EXPORT_SYMBOL(count_nulled);
+
 /*
  * Set of flags that will prevent slab merging
  */
@@ -1084,6 +1087,7 @@ static void print_slabinfo_header(struct seq_file *m)
 	 * without _too_ many complaints.
 	 */
 	seq_puts(m, "slabinfo - version: 2.1\n");
+	seq_printf(m, "# nulled: %d\n", atomic_read(&count_nulled));
 	seq_puts(m, "# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>");
 	seq_puts(m, " : tunables <limit> <batchcount> <sharedfactor>");
 	seq_puts(m, " : slabdata <active_slabs> <num_slabs> <sharedavail>");
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] slab: Set freed variables to NULL by default
  2025-03-21 20:41 ` [PATCH 4/5] slab: Set freed variables to NULL by default Kees Cook
@ 2025-03-22  1:50   ` Jann Horn
  2025-03-22  7:18     ` Kees Cook
  2025-03-27 19:42   ` Matthew Wilcox
  1 sibling, 1 reply; 16+ messages in thread
From: Jann Horn @ 2025-03-22  1:50 UTC (permalink / raw)
  To: Kees Cook
  Cc: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	linux-mm, Miguel Ojeda, Nathan Chancellor, Marco Elver,
	Nick Desaulniers, Przemek Kitszel, linux-kernel, linux-hardening

On Fri, Mar 21, 2025 at 9:41 PM Kees Cook <kees@kernel.org> wrote:
> To defang a subset of "dangling pointer" use-after-free flaws[1], take the
> address of any lvalues passed to kfree() and set them to NULL after
> freeing.
>
> To do this manually, kfree_and_null() (and the "sensitive" variant)
> are introduced.

Unless callers of kfree() are allowed to rely on this behavior, we
might want to have an option to use a poison value instead of NULL for
this in debug builds.


Hmm... in theory, we could have an object that points to its own type, like so:

    struct foo {
     struct foo *ptr;
    };

And if that was initialized like so:

    struct foo *obj = kmalloc(sizeof(struct foo));
    obj->ptr = obj;

Then the following is currently fine, but would turn into UAF with this patch:

    kfree(obj->ptr);

That is a fairly contrived example; but maybe it would make sense to
reorder the NULL assignment and the actual freeing to avoid this
particular case?

Another pattern that is theoretically currently fine but would be
problematic after this would be if code continues to access a pointer
after the pointed-to object has been freed, but just to check for
NULL-ness to infer something about the state of the object containing
the pointer, or to check pointer identity, or something like that. But
I guess that's probably not very common? Something like:

    if (ptr) {
        mutex_lock(&some_lock);
        ...
        kfree(ptr);
    }
    ...
    if (ptr) {
        ...
        mutex_unlock(&some_lock);
    }

But from scrolling through the kernel sources and glancing at a few
hundred kfree() calls (not a lot compared to the ~40000 callsites we
have), I haven't actually found any obvious instances like that. There
is KMSAN test code that intentionally does a UAF, which might need to
be adjusted to not hit a NULL deref instead.
(And just an irrelevant sidenote: snd_gf1_dma_interrupt() has
commented-out debugging code that would UAF the "block" variable if it
was not commented out.)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/5] compiler_types: Introduce __is_lvalue()
  2025-03-21 20:40 ` [PATCH 3/5] compiler_types: Introduce __is_lvalue() Kees Cook
@ 2025-03-22  3:38   ` Jann Horn
  2025-03-22  7:03     ` Kees Cook
  0 siblings, 1 reply; 16+ messages in thread
From: Jann Horn @ 2025-03-22  3:38 UTC (permalink / raw)
  To: Kees Cook
  Cc: Vlastimil Babka, Miguel Ojeda, Nathan Chancellor, Marco Elver,
	Nick Desaulniers, Przemek Kitszel, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Roman Gushchin, Hyeonggon Yoo, linux-kernel, linux-mm,
	linux-hardening

On Fri, Mar 21, 2025 at 9:41 PM Kees Cook <kees@kernel.org> wrote:
> If __builtin_is_lvalue() is available, use it with __is_lvalue(). There
> is patch to Clang to provide this builtin now[1].
>
> Link: https://github.com/kees/llvm-project/commits/builtin_is_lvalue/ [1]
> Signed-off-by: Kees Cook <kees@kernel.org>

Before you land that LLVM patch, it might be a good idea to figure out
how this interacts with the fun C quirk where you can have temporary
rvalues which can contain array members to which you can technically
create lvalues but must not write. As far as I understand, calling
"kfree(getS().ptrs[0])" as in the following example would cause UB
with your patch:

```
$ cat kees-kfree-test.c
#include <stdlib.h>

#define __is_lvalue(expr)      __builtin_is_lvalue(expr)

void __kfree(void *ptr);
void BEFORE_SET_TO_NULL();
void AFTER_SET_TO_NULL();
static inline void kfree_and_null(void **ptr)
{
       __kfree(*ptr);
       BEFORE_SET_TO_NULL();
       *ptr = NULL;
       AFTER_SET_TO_NULL();
}
#define __force_lvalue_expr(x) \
       __builtin_choose_expr(__is_lvalue(x), x, (void *){ NULL })
#define __free_and_null(__how, x)      \
({                                     \
       typeof(x) *__ptr = &(x);        \
       __how ## _and_null((void **)__ptr);     \
})
#define __free_and_maybe_null(__how, x)        \
       __builtin_choose_expr(__is_lvalue(x), \
               __free_and_null(__how, __force_lvalue_expr(x)), \
               __kfree(x))
#define kfree(x)          __free_and_maybe_null(kfree, x)

struct S {
    void *ptrs[1];
};
struct S getS(void);

int is_lvalue_test(void) {
  return __is_lvalue(getS().ptrs[0]);
}
void testfn2(void) {
  kfree(getS().ptrs[0]);
}
$ [...]/bin/clang-14 -c -o kees-kfree-test.o kees-kfree-test.c -O3 -Wall
$ objdump -d -Mintel -r kees-kfree-test.o

kees-kfree-test.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <is_lvalue_test>:
   0:   b8 01 00 00 00          mov    eax,0x1
   5:   c3                      ret
   6:   66 2e 0f 1f 84 00 00    cs nop WORD PTR [rax+rax*1+0x0]
   d:   00 00 00

0000000000000010 <testfn2>:
  10:   50                      push   rax
  11:   e8 00 00 00 00          call   16 <testfn2+0x6>
                        12: R_X86_64_PLT32      getS-0x4
  16:   48 89 c7                mov    rdi,rax
  19:   e8 00 00 00 00          call   1e <testfn2+0xe>
                        1a: R_X86_64_PLT32      __kfree-0x4
  1e:   31 c0                   xor    eax,eax
  20:   e8 00 00 00 00          call   25 <testfn2+0x15>
                        21: R_X86_64_PLT32      BEFORE_SET_TO_NULL-0x4
  25:   31 c0                   xor    eax,eax
  27:   59                      pop    rcx
  28:   e9 00 00 00 00          jmp    2d <testfn2+0x1d>
                        29: R_X86_64_PLT32      AFTER_SET_TO_NULL-0x4
jannh@horn:~/test/kees-kfree$
```

As far as I understand, this causes UB in C99 ("If an attempt is made
to modify the result of a function call or to access it after the next
sequence point, the behavior is undefined.") and in C11 ("A non-lvalue
expression with structure or union type, where the structure or union
contains a member with array type (including, recursively, members of
all contained structures and unions) refers to an object with
automatic storage duration and temporary lifetime. 36) Its lifetime
begins when the expression is evaluated and its initial value is the
value of the expression. Its lifetime ends when the evaluation of the
containing full expression or full declarator ends. Any attempt to
modify an object with temporary lifetime results in undefined
behavior.").

Basically, something like getS().ptrs[0] gives you something that is
technically an lvalue but must not actually be written to, and
->isModifiableLvalue() does not catch that.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/5] compiler_types: Introduce __is_lvalue()
  2025-03-22  3:38   ` Jann Horn
@ 2025-03-22  7:03     ` Kees Cook
  0 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2025-03-22  7:03 UTC (permalink / raw)
  To: Jann Horn
  Cc: Vlastimil Babka, Miguel Ojeda, Nathan Chancellor, Marco Elver,
	Nick Desaulniers, Przemek Kitszel, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Roman Gushchin, Hyeonggon Yoo, linux-kernel, linux-mm,
	linux-hardening, Andrew Pinski

On Sat, Mar 22, 2025 at 04:38:21AM +0100, Jann Horn wrote:
> On Fri, Mar 21, 2025 at 9:41 PM Kees Cook <kees@kernel.org> wrote:
> > If __builtin_is_lvalue() is available, use it with __is_lvalue(). There
> > is patch to Clang to provide this builtin now[1].
> >
> > Link: https://github.com/kees/llvm-project/commits/builtin_is_lvalue/ [1]
> > Signed-off-by: Kees Cook <kees@kernel.org>
> 
> Before you land that LLVM patch, it might be a good idea to figure out
> how this interacts with the fun C quirk where you can have temporary
> rvalues which can contain array members to which you can technically
> create lvalues but must not write. As far as I understand, calling
> "kfree(getS().ptrs[0])" as in the following example would cause UB
> with your patch:

Yay UB! I can confirm that currently isModifiableLvalue() does not catch
this.

> 
> ```
> $ cat kees-kfree-test.c
> #include <stdlib.h>
> 
> #define __is_lvalue(expr)      __builtin_is_lvalue(expr)
> 
> void __kfree(void *ptr);
> void BEFORE_SET_TO_NULL();
> void AFTER_SET_TO_NULL();
> static inline void kfree_and_null(void **ptr)
> {
>        __kfree(*ptr);
>        BEFORE_SET_TO_NULL();
>        *ptr = NULL;
>        AFTER_SET_TO_NULL();
> }
> #define __force_lvalue_expr(x) \
>        __builtin_choose_expr(__is_lvalue(x), x, (void *){ NULL })
> #define __free_and_null(__how, x)      \
> ({                                     \
>        typeof(x) *__ptr = &(x);        \
>        __how ## _and_null((void **)__ptr);     \
> })
> #define __free_and_maybe_null(__how, x)        \
>        __builtin_choose_expr(__is_lvalue(x), \
>                __free_and_null(__how, __force_lvalue_expr(x)), \
>                __kfree(x))
> #define kfree(x)          __free_and_maybe_null(kfree, x)
> 
> struct S {
>     void *ptrs[1];
> };
> struct S getS(void);
> 
> int is_lvalue_test(void) {
>   return __is_lvalue(getS().ptrs[0]);
> }
> void testfn2(void) {
>   kfree(getS().ptrs[0]);
> }
> $ [...]/bin/clang-14 -c -o kees-kfree-test.o kees-kfree-test.c -O3 -Wall
> $ objdump -d -Mintel -r kees-kfree-test.o
> 
> kees-kfree-test.o:     file format elf64-x86-64
> 
> 
> Disassembly of section .text:
> 
> 0000000000000000 <is_lvalue_test>:
>    0:   b8 01 00 00 00          mov    eax,0x1
>    5:   c3                      ret
>    6:   66 2e 0f 1f 84 00 00    cs nop WORD PTR [rax+rax*1+0x0]
>    d:   00 00 00
> 
> 0000000000000010 <testfn2>:
>   10:   50                      push   rax
>   11:   e8 00 00 00 00          call   16 <testfn2+0x6>
>                         12: R_X86_64_PLT32      getS-0x4
>   16:   48 89 c7                mov    rdi,rax
>   19:   e8 00 00 00 00          call   1e <testfn2+0xe>
>                         1a: R_X86_64_PLT32      __kfree-0x4
>   1e:   31 c0                   xor    eax,eax
>   20:   e8 00 00 00 00          call   25 <testfn2+0x15>
>                         21: R_X86_64_PLT32      BEFORE_SET_TO_NULL-0x4
>   25:   31 c0                   xor    eax,eax
>   27:   59                      pop    rcx
>   28:   e9 00 00 00 00          jmp    2d <testfn2+0x1d>
>                         29: R_X86_64_PLT32      AFTER_SET_TO_NULL-0x4

I don't see UB manifested here, though? It looks more like a dead store
was eliminated (i.e. it was an automatic variable that wasn't going to
be referenced outside of the expression statement).

If I add a global and assign the global from *ptr, it all seems to work
fine:

--- kees-kfree-test.c.orig    2025-03-22 00:00:53.550633347 -0700
+++ kees-kfree-test.c       2025-03-21 23:58:57.124268268 -0700
@@ -2,6 +2,8 @@
 
 #define __is_lvalue(expr)      __builtin_is_modifiable_lvalue(expr)
 
+void *g;
+
 void __kfree(void *ptr);
 void BEFORE_SET_TO_NULL();
 void AFTER_SET_TO_NULL();
@@ -11,6 +13,7 @@
        BEFORE_SET_TO_NULL();
        *ptr = NULL;
        AFTER_SET_TO_NULL();
+       g = *ptr;
 }
 #define __force_lvalue_expr(x) \
        __builtin_choose_expr(__is_lvalue(x), x, (void *){ NULL })

...

  27:   e8 00 00 00 00          call   2c <testfn2+0x1c>
                        28: R_X86_64_PLT32      AFTER_SET_TO_NULL-0x4
  2c:   48 c7 05 00 00 00 00    mov    QWORD PTR [rip+0x0],0x0        # 37 <testfn2+0x27>
  33:   00 00 00 00 

> jannh@horn:~/test/kees-kfree$
> ```
> 
> As far as I understand, this causes UB in C99 ("If an attempt is made
> to modify the result of a function call or to access it after the next
> sequence point, the behavior is undefined.") and in C11 ("A non-lvalue
> expression with structure or union type, where the structure or union
> contains a member with array type (including, recursively, members of
> all contained structures and unions) refers to an object with
> automatic storage duration and temporary lifetime. 36) Its lifetime
> begins when the expression is evaluated and its initial value is the
> value of the expression. Its lifetime ends when the evaluation of the
> containing full expression or full declarator ends. Any attempt to
> modify an object with temporary lifetime results in undefined
> behavior.").
> 
> Basically, something like getS().ptrs[0] gives you something that is
> technically an lvalue but must not actually be written to, and
> ->isModifiableLvalue() does not catch that.

But I agree, any mention of UB does give me pause! :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] slab: Set freed variables to NULL by default
  2025-03-22  1:50   ` Jann Horn
@ 2025-03-22  7:18     ` Kees Cook
  2025-03-27 19:23       ` Jann Horn
  0 siblings, 1 reply; 16+ messages in thread
From: Kees Cook @ 2025-03-22  7:18 UTC (permalink / raw)
  To: Jann Horn
  Cc: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	linux-mm, Miguel Ojeda, Nathan Chancellor, Marco Elver,
	Nick Desaulniers, Przemek Kitszel, linux-kernel, linux-hardening

On Sat, Mar 22, 2025 at 02:50:15AM +0100, Jann Horn wrote:
> On Fri, Mar 21, 2025 at 9:41 PM Kees Cook <kees@kernel.org> wrote:
> > To defang a subset of "dangling pointer" use-after-free flaws[1], take the
> > address of any lvalues passed to kfree() and set them to NULL after
> > freeing.
> >
> > To do this manually, kfree_and_null() (and the "sensitive" variant)
> > are introduced.
> 
> Unless callers of kfree() are allowed to rely on this behavior, we
> might want to have an option to use a poison value instead of NULL for
> this in debug builds.

Sure -- we have many to choose from. Is there a specific one you think
would be good?

> 
> Hmm... in theory, we could have an object that points to its own type, like so:
> 
>     struct foo {
>      struct foo *ptr;
>     };
> 
> And if that was initialized like so:
> 
>     struct foo *obj = kmalloc(sizeof(struct foo));
>     obj->ptr = obj;
> 
> Then the following is currently fine, but would turn into UAF with this patch:
> 
>     kfree(obj->ptr);
> 
> That is a fairly contrived example; but maybe it would make sense to
> reorder the NULL assignment and the actual freeing to avoid this
> particular case?

Ew. Yeah. Reordering this looks okay, yes:

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 3f8fb60963e3..8913b05eca33 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -473,13 +473,15 @@ extern atomic_t count_nulled;
 
 static inline void kfree_and_null(void **ptr)
 {
-	__kfree(*ptr);
+	void *addr = *ptr;
 	*ptr = NULL;
+	__kfree(addr);
 }
 static inline void kfree_sensitive_and_null(void **ptr)
 {
-	__kfree_sensitive(*ptr);
+	void *addr = *ptr;
 	*ptr = NULL;
+	__kfree_sensitive(addr);
 }
 
 #define __force_assignable(x)	\

> Another pattern that is theoretically currently fine but would be
> problematic after this would be if code continues to access a pointer
> after the pointed-to object has been freed, but just to check for
> NULL-ness to infer something about the state of the object containing
> the pointer, or to check pointer identity, or something like that. But
> I guess that's probably not very common? Something like:
> 
>     if (ptr) {
>         mutex_lock(&some_lock);
>         ...
>         kfree(ptr);
>     }
>     ...
>     if (ptr) {
>         ...
>         mutex_unlock(&some_lock);
>     }

Yeah, this would be bad form IMO. But it's not impossible. This idea
was mentioned on the Fediverse thread for this RFC too.
https://fosstodon.org/@kees/114202495615591299

> But from scrolling through the kernel sources and glancing at a few
> hundred kfree() calls (not a lot compared to the ~40000 callsites we
> have), I haven't actually found any obvious instances like that. There
> is KMSAN test code that intentionally does a UAF, which might need to
> be adjusted to not hit a NULL deref instead.
> (And just an irrelevant sidenote: snd_gf1_dma_interrupt() has
> commented-out debugging code that would UAF the "block" variable if it
> was not commented out.)

Yeah, the heap LKDTM tests need to switch to __kfree() too. :)

I'm going to see if I can build a sensible Coccinelle script to look for
this. It's currently getting very confused by the may "for_each" helpers,
but here's what I've got currently:

@use_freed@
expression E, RHS;
@@

(
        kfree(E);
        ... when != E
?       E = RHS
|
        kfree(E);
        ...
*       E
)

It is finding a few, though. For example:

--- ./drivers/md/dm-ima.c
+++ /tmp/nothing/drivers/md/dm-ima.c
@@ -584,13 +584,11 @@ error:
 exit:
        kfree(md->ima.active_table.device_metadata);
 
-       if (md->ima.active_table.device_metadata !=
            md->ima.inactive_table.device_metadata)
                kfree(md->ima.inactive_table.device_metadata);


-- 
Kees Cook

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] treewide: Replace kfree() casts with union members
  2025-03-21 20:40 ` [PATCH 1/5] treewide: Replace kfree() casts with union members Kees Cook
@ 2025-03-23 10:26   ` David Laight
  0 siblings, 0 replies; 16+ messages in thread
From: David Laight @ 2025-03-23 10:26 UTC (permalink / raw)
  To: Kees Cook
  Cc: Vlastimil Babka, Christoph Lameter, Miguel Ojeda,
	Nathan Chancellor, Marco Elver, Nick Desaulniers, Przemek Kitszel,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Roman Gushchin, Hyeonggon Yoo, linux-kernel, linux-mm,
	linux-hardening

On Fri, 21 Mar 2025 13:40:57 -0700
Kees Cook <kees@kernel.org> wrote:

> As a prerequisite to being able to optionally take the address of any
> lvalues used in a call to kfree(), replace casts to the kfree() argument
> with unions to include an actual pointer.

Having different names for the argument to kfree() and other uses
of the data is just brain-dead and will make reading code harder
and writing code that access data after it is freed so much easier.

	David

> 
> This is an example subset. There are another handful remaining:
> 
> $ git grep '\bkfree((void \*)'
> arch/mips/alchemy/common/dbdma.c:       kfree((void *)ctp->cdb_membase);
> arch/s390/kernel/crash_dump.c:  kfree((void *)(unsigned long)addr);
> drivers/crypto/atmel-sha204a.c: kfree((void *)i2c_priv->hwrng.priv);
> drivers/infiniband/hw/cxgb4/mem.c:              kfree((void *) (unsigned long) mhp->kva);
> drivers/isdn/mISDN/fsm.c:       kfree((void *) fsm->jumpmatrix);
> drivers/misc/altera-stapl/altera.c:           kfree((void *)vars[variable_id]);
> drivers/misc/altera-stapl/altera.c:                             kfree((void *)vars[i]);
> drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h:                        kfree((void *)x); \
> drivers/net/ethernet/qlogic/qed/qed_main.c:     kfree((void *)cdev);
> drivers/net/usb/cx82310_eth.c:  kfree((void *)dev->partial_data);
> drivers/net/usb/cx82310_eth.c:  kfree((void *)dev->partial_data);
> drivers/scsi/snic/snic_io.c:            kfree((void *)rqi->sge_va);
> drivers/scsi/snic/snic_io.c:                    kfree((void *)rqi->sge_va);
> drivers/staging/rtl8723bs/os_dep/os_intfs.c:    /* kfree((void *)padapter); */
> drivers/video/fbdev/grvga.c:            kfree((void *)virtual_start);
> drivers/video/fbdev/grvga.c:                    kfree((void *)info->screen_base);
> drivers/xen/grant-table.c:                      kfree((void *)page_private(pages[i]));
> net/ieee802154/nl802154.c:      kfree((void *)cb->args[0]);
> net/sched/em_ipset.c:           kfree((void *) em->data);
> net/sched/em_meta.c:    kfree((void *) v->val);
> 
> Signed-off-by: Kees Cook <kees@kernel.org>
> ---
>  include/linux/netlink.h | 1 +
>  include/net/pkt_cls.h   | 5 ++++-
>  net/sched/ematch.c      | 2 +-
>  net/wireless/nl80211.c  | 2 +-
>  4 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/netlink.h b/include/linux/netlink.h
> index c3ae84a77e16..26eb9eea8a74 100644
> --- a/include/linux/netlink.h
> +++ b/include/linux/netlink.h
> @@ -295,6 +295,7 @@ struct netlink_callback {
>  	bool			strict_check;
>  	union {
>  		u8		ctx[NETLINK_CTX_SIZE];
> +		void *		ptr;
>  
>  		/* args is deprecated. Cast a struct over ctx instead
>  		 * for proper type safety.
> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> index c64fd896b1f9..4faf8d6eed1d 100644
> --- a/include/net/pkt_cls.h
> +++ b/include/net/pkt_cls.h
> @@ -403,7 +403,10 @@ struct tcf_ematch_ops;
>   */
>  struct tcf_ematch {
>  	struct tcf_ematch_ops * ops;
> -	unsigned long		data;
> +	union {
> +		unsigned long	data;
> +		void *		ptr;
> +	};
>  	unsigned int		datalen;
>  	u16			matchid;
>  	u16			flags;
> diff --git a/net/sched/ematch.c b/net/sched/ematch.c
> index 5c1235e6076a..f4b00e7aca6a 100644
> --- a/net/sched/ematch.c
> +++ b/net/sched/ematch.c
> @@ -411,7 +411,7 @@ void tcf_em_tree_destroy(struct tcf_ematch_tree *tree)
>  			if (em->ops->destroy)
>  				em->ops->destroy(em);
>  			else if (!tcf_em_is_simple(em))
> -				kfree((void *) em->data);
> +				kfree(em->ptr);
>  			module_put(em->ops->owner);
>  		}
>  	}
> diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
> index d7d3da0f6833..b5a3ae07d84c 100644
> --- a/net/wireless/nl80211.c
> +++ b/net/wireless/nl80211.c
> @@ -3270,7 +3270,7 @@ static int nl80211_dump_wiphy(struct sk_buff *skb, struct netlink_callback *cb)
>  
>  static int nl80211_dump_wiphy_done(struct netlink_callback *cb)
>  {
> -	kfree((void *)cb->args[0]);
> +	kfree(cb->ptr);
>  	return 0;
>  }
>  


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] [DEBUG] slab: Report number of NULLings
  2025-03-21 20:41 ` [PATCH 5/5] [DEBUG] slab: Report number of NULLings Kees Cook
@ 2025-03-24 16:16   ` Christoph Lameter (Ampere)
  2025-03-25 19:45     ` Kees Cook
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-03-24 16:16 UTC (permalink / raw)
  To: Kees Cook
  Cc: Vlastimil Babka, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Roman Gushchin, Hyeonggon Yoo, linux-mm,
	Miguel Ojeda, Nathan Chancellor, Marco Elver, Nick Desaulniers,
	Przemek Kitszel, linux-kernel, linux-hardening

On Fri, 21 Mar 2025, Kees Cook wrote:

> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 2717ad238fa2..a4740c8b6ccb 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -469,6 +469,8 @@ void __kfree(const void *objp);
>  void __kfree_sensitive(const void *objp);
>  size_t __ksize(const void *objp);
>
> +extern atomic_t count_nulled;

That is a scalability issue. Use a vmstat counter instead?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] [DEBUG] slab: Report number of NULLings
  2025-03-24 16:16   ` Christoph Lameter (Ampere)
@ 2025-03-25 19:45     ` Kees Cook
  0 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2025-03-25 19:45 UTC (permalink / raw)
  To: Christoph Lameter (Ampere)
  Cc: Vlastimil Babka, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Roman Gushchin, Hyeonggon Yoo, linux-mm,
	Miguel Ojeda, Nathan Chancellor, Marco Elver, Nick Desaulniers,
	Przemek Kitszel, linux-kernel, linux-hardening

On Mon, Mar 24, 2025 at 09:16:47AM -0700, Christoph Lameter (Ampere) wrote:
> On Fri, 21 Mar 2025, Kees Cook wrote:
> 
> > diff --git a/include/linux/slab.h b/include/linux/slab.h
> > index 2717ad238fa2..a4740c8b6ccb 100644
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -469,6 +469,8 @@ void __kfree(const void *objp);
> >  void __kfree_sensitive(const void *objp);
> >  size_t __ksize(const void *objp);
> >
> > +extern atomic_t count_nulled;
> 
> That is a scalability issue. Use a vmstat counter instead?

Yeah, this patch (marked "DEBUG") isn't intended for upstreaming. It was
just a quick hack to get a ballpark statistic. :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 0/5] slab: Set freed variables to NULL by default
  2025-03-21 20:40 [RFC 0/5] slab: Set freed variables to NULL by default Kees Cook
                   ` (4 preceding siblings ...)
  2025-03-21 20:41 ` [PATCH 5/5] [DEBUG] slab: Report number of NULLings Kees Cook
@ 2025-03-27 13:00 ` Harry Yoo
  5 siblings, 0 replies; 16+ messages in thread
From: Harry Yoo @ 2025-03-27 13:00 UTC (permalink / raw)
  To: Kees Cook
  Cc: Vlastimil Babka, Christoph Lameter, Miguel Ojeda,
	Nathan Chancellor, Marco Elver, Nick Desaulniers, Przemek Kitszel,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Roman Gushchin, Hyeonggon Yoo, linux-kernel, linux-mm,
	linux-hardening

On Fri, Mar 21, 2025 at 01:40:56PM -0700, Kees Cook wrote:
> Hi!
> 
> This is very much an RFC series, but I wanted to make sure it actually
> worked before I proposed it. This series seeks to give kfree() the
>` side-effect of assigning NULL to the kfree() argument when possible.
> This would make a subset of "dangling pointer" flaws turn into NULL
> derefs instead of Use-After-Free[1]. It effectively turns:
> 
> 	kfree(var);
> 
> into:
> 
> 	kfree(var);
> 	var = NULL;
> 
> when "var" is actually addressable. (i.e. not "kfree(get_ptrs())" etc.)

Maybe a dumb question, but if 'var' is a local variable, is it common that
a dangling pointer in a local variable to lead to a use-after-free bug?

Also, I don't expect this to have a significant performance impact, but have
you measured it and is the overhead low enough to enable it unconditionally?

> It depends on a builtin, __builtin_is_lvalue(), which is not landed in any
> compiler yet, but I do have it working in a Clang patch[2]. This should
> be essentially free (pardon the pun), so I think if folks can tolerate
> a little bit of renaming needed for when kfree is needed as a function
> and not a macro, it should be good. Please let me know what you think.

I don't have a strong opinion on the slab side, but now users of kfree
should be aware of this subtle change (e.g., when debugging code).

-- 
Cheers,
Harry (formerly known as Hyeonggon)


> Thanks!
> 
> -Kees
> 
> (Yes, I'm still working on the kmalloc_objs() series, but I needed to
> take a break from fixing all the allocation corner cases I've found there.)
> 
> [1] https://github.com/KSPP/linux/issues/87
> [2] https://github.com/kees/llvm-project/commits/builtin_is_lvalue/
> 
> Kees Cook (5):
>   treewide: Replace kfree() casts with union members
>   treewide: Prepare for kfree() to __kfree() rename
>   compiler_types: Introduce __is_lvalue()
>   slab: Set freed variables to NULL by default
>   [DEBUG] slab: Report number of NULLings
> 
>  arch/mips/alchemy/common/dbdma.c |  2 +-
>  include/linux/compiler_types.h   | 10 ++++++++++
>  include/linux/netlink.h          |  1 +
>  include/linux/slab.h             | 33 ++++++++++++++++++++++++++++++--
>  include/net/pkt_cls.h            |  5 ++++-
>  io_uring/futex.c                 |  2 +-
>  io_uring/io_uring.c              | 12 ++++++------
>  kernel/bpf/core.c                |  3 ++-
>  mm/slab_common.c                 | 12 ++++++++----
>  mm/slub.c                        |  6 +++---
>  net/sched/ematch.c               |  2 +-
>  net/wireless/nl80211.c           |  2 +-
>  12 files changed, 69 insertions(+), 21 deletions(-)
> 
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] slab: Set freed variables to NULL by default
  2025-03-22  7:18     ` Kees Cook
@ 2025-03-27 19:23       ` Jann Horn
  0 siblings, 0 replies; 16+ messages in thread
From: Jann Horn @ 2025-03-27 19:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	linux-mm, Miguel Ojeda, Nathan Chancellor, Marco Elver,
	Nick Desaulniers, Przemek Kitszel, linux-kernel, linux-hardening

On Sat, Mar 22, 2025 at 8:18 AM Kees Cook <kees@kernel.org> wrote:
> On Sat, Mar 22, 2025 at 02:50:15AM +0100, Jann Horn wrote:
> > On Fri, Mar 21, 2025 at 9:41 PM Kees Cook <kees@kernel.org> wrote:
> > > To defang a subset of "dangling pointer" use-after-free flaws[1], take the
> > > address of any lvalues passed to kfree() and set them to NULL after
> > > freeing.
> > >
> > > To do this manually, kfree_and_null() (and the "sensitive" variant)
> > > are introduced.
> >
> > Unless callers of kfree() are allowed to rely on this behavior, we
> > might want to have an option to use a poison value instead of NULL for
> > this in debug builds.
>
> Sure -- we have many to choose from. Is there a specific one you think
> would be good?

Forgot to reply to this, sorry. No, I don't have a particular one in mind.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] slab: Set freed variables to NULL by default
  2025-03-21 20:41 ` [PATCH 4/5] slab: Set freed variables to NULL by default Kees Cook
  2025-03-22  1:50   ` Jann Horn
@ 2025-03-27 19:42   ` Matthew Wilcox
  1 sibling, 0 replies; 16+ messages in thread
From: Matthew Wilcox @ 2025-03-27 19:42 UTC (permalink / raw)
  To: Kees Cook
  Cc: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	linux-mm, Miguel Ojeda, Nathan Chancellor, Marco Elver,
	Nick Desaulniers, Przemek Kitszel, linux-kernel, linux-hardening

On Fri, Mar 21, 2025 at 01:41:00PM -0700, Kees Cook wrote:
>  /**
> - * kfree - free previously allocated memory
> + * __kfree - free previously allocated memory

No.  After applying this patch, we now have documentation for __kfree --
a function we want precisely 0 people to call directly, and no
documentation for kfree(), which is the function we want everybody to
use.

I don't know what the right solution to kernel documentation is here,
but this is definitely not it.  Something I've done in the past is to
note that kernel-doc ignores preprocessor, so we could do something
like:

#if 0
/**
 * kfree - blah blah
 * @blah: blah
 *
 * blah blah blah
 */
void kfree(const void *object) { }
#endif
void __kfree(const void *object)
{
...
}

We'd get warnings if people asked the kbuild system to show them all
EXPORT_SYMBOL functions which don't have kdoc, but nobody's that
invested in having complete documentation.  I don't know if this is
a good approach to solving the problem.


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-03-27 19:42 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-21 20:40 [RFC 0/5] slab: Set freed variables to NULL by default Kees Cook
2025-03-21 20:40 ` [PATCH 1/5] treewide: Replace kfree() casts with union members Kees Cook
2025-03-23 10:26   ` David Laight
2025-03-21 20:40 ` [PATCH 2/5] treewide: Prepare for kfree() to __kfree() rename Kees Cook
2025-03-21 20:40 ` [PATCH 3/5] compiler_types: Introduce __is_lvalue() Kees Cook
2025-03-22  3:38   ` Jann Horn
2025-03-22  7:03     ` Kees Cook
2025-03-21 20:41 ` [PATCH 4/5] slab: Set freed variables to NULL by default Kees Cook
2025-03-22  1:50   ` Jann Horn
2025-03-22  7:18     ` Kees Cook
2025-03-27 19:23       ` Jann Horn
2025-03-27 19:42   ` Matthew Wilcox
2025-03-21 20:41 ` [PATCH 5/5] [DEBUG] slab: Report number of NULLings Kees Cook
2025-03-24 16:16   ` Christoph Lameter (Ampere)
2025-03-25 19:45     ` Kees Cook
2025-03-27 13:00 ` [RFC 0/5] slab: Set freed variables to NULL by default Harry Yoo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).