* [PATCH v2 1/5] mm/kasan: Don't store metadata inside kmalloc object when slub_debug_orig_size is on
2024-09-11 6:45 [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled Feng Tang
@ 2024-09-11 6:45 ` Feng Tang
2024-09-11 6:45 ` [PATCH v2 2/5] mm/slub: Consider kfence case for get_orig_size() Feng Tang
` (4 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Feng Tang @ 2024-09-11 6:45 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Marco Elver, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino
Cc: linux-mm, kasan-dev, linux-kernel, Feng Tang
For a kmalloc object, when both the kasan and slub redzone sanity
checks are enabled, they could both manipulate its data space, e.g.
storing kasan's free metadata and setting up the kmalloc redzone,
and may affect the accuracy of that object's 'orig_size'.
As an accurate 'orig_size' will soon be needed by functions like
krealloc(), save kasan's free metadata in slub's metadata area
instead of inside the object when 'orig_size' is enabled.
This also makes the code easier to maintain and understand. Size wise,
when these two options are both enabled, the slub metadata space is
already huge, and this only slightly increases the overall size.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
---
mm/kasan/generic.c | 7 +++++--
mm/slab.h | 6 ++++++
mm/slub.c | 17 -----------------
3 files changed, 11 insertions(+), 19 deletions(-)
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index 6310a180278b..8b9e348113b1 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -392,9 +392,12 @@ void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
* 1. Object is SLAB_TYPESAFE_BY_RCU, which means that it can
* be touched after it was freed, or
* 2. Object has a constructor, which means it's expected to
- * retain its content until the next allocation.
+ * retain its content until the next allocation, or
+ * 3. It is from a kmalloc cache which enables the debug option
+ * to store original size.
*/
- if ((cache->flags & SLAB_TYPESAFE_BY_RCU) || cache->ctor) {
+ if ((cache->flags & SLAB_TYPESAFE_BY_RCU) || cache->ctor ||
+ slub_debug_orig_size(cache)) {
cache->kasan_info.free_meta_offset = *size;
*size += sizeof(struct kasan_free_meta);
goto free_meta_added;
diff --git a/mm/slab.h b/mm/slab.h
index f22fb760b286..f72a8849b988 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -689,6 +689,12 @@ void __kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
void __check_heap_object(const void *ptr, unsigned long n,
const struct slab *slab, bool to_user);
+static inline bool slub_debug_orig_size(struct kmem_cache *s)
+{
+ return (kmem_cache_debug_flags(s, SLAB_STORE_USER) &&
+ (s->flags & SLAB_KMALLOC));
+}
+
#ifdef CONFIG_SLUB_DEBUG
void skip_orig_size_check(struct kmem_cache *s, const void *object);
#endif
diff --git a/mm/slub.c b/mm/slub.c
index 21f71cb6cc06..87c95f170f13 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -230,12 +230,6 @@ static inline bool kmem_cache_debug(struct kmem_cache *s)
return kmem_cache_debug_flags(s, SLAB_DEBUG_FLAGS);
}
-static inline bool slub_debug_orig_size(struct kmem_cache *s)
-{
- return (kmem_cache_debug_flags(s, SLAB_STORE_USER) &&
- (s->flags & SLAB_KMALLOC));
-}
-
void *fixup_red_left(struct kmem_cache *s, void *p)
{
if (kmem_cache_debug_flags(s, SLAB_RED_ZONE))
@@ -760,21 +754,10 @@ static inline void set_orig_size(struct kmem_cache *s,
void *object, unsigned int orig_size)
{
void *p = kasan_reset_tag(object);
- unsigned int kasan_meta_size;
if (!slub_debug_orig_size(s))
return;
- /*
- * KASAN can save its free meta data inside of the object at offset 0.
- * If this meta data size is larger than 'orig_size', it will overlap
- * the data redzone in [orig_size+1, object_size]. Thus, we adjust
- * 'orig_size' to be as at least as big as KASAN's meta data.
- */
- kasan_meta_size = kasan_metadata_size(s, true);
- if (kasan_meta_size > orig_size)
- orig_size = kasan_meta_size;
-
p += get_info_end(s);
p += sizeof(struct track) * 2;
--
2.34.1
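As a side note for readers, the condition this patch adds in kasan_cache_create() — store KASAN's free meta outside the object payload when the cache is RCU-typesafe, has a constructor, or stores the original size — can be modeled with a small userspace sketch. The struct and function names below are illustrative analogues, not kernel API:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of the three conditions checked in kasan_cache_create(). */
struct cache_model {
	bool typesafe_by_rcu;   /* SLAB_TYPESAFE_BY_RCU is set */
	bool has_ctor;          /* cache->ctor != NULL */
	bool orig_size_enabled; /* slub_debug_orig_size(cache) is true */
};

/*
 * Returns true when the free meta must be appended to the slab metadata
 * area instead of being stored inside the object at offset 0.
 */
static bool free_meta_outside_object(const struct cache_model *c)
{
	return c->typesafe_by_rcu || c->has_ctor || c->orig_size_enabled;
}
```

Any one of the three conditions is enough to move the free meta out of the object, which is what keeps 'orig_size' accurate in the third case.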
* [PATCH v2 2/5] mm/slub: Consider kfence case for get_orig_size()
2024-09-11 6:45 [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled Feng Tang
2024-09-11 6:45 ` [PATCH v2 1/5] mm/kasan: Don't store metadata inside kmalloc object when slub_debug_orig_size is on Feng Tang
@ 2024-09-11 6:45 ` Feng Tang
2024-09-11 6:45 ` [PATCH v2 3/5] mm/slub: Move krealloc() and related code to slub.c Feng Tang
` (3 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Feng Tang @ 2024-09-11 6:45 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Marco Elver, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino
Cc: linux-mm, kasan-dev, linux-kernel, Feng Tang
When 'orig_size' of a kmalloc object is enabled by a debug option, it
should contain either the actual requested size or the cache's
'object_size'.
But this does not hold if that object is a kfence-allocated one: its
'orig_size' in the metadata could be zero or some other value. This is
not a big issue for the current 'orig_size' usage, as init_object()
and check_object() during the alloc/free process are skipped for
kfence addresses.
As 'orig_size' will be used by functions like krealloc(), handle this
by returning 'object_size' from get_orig_size() for kfence addresses.
Signed-off-by: Feng Tang <feng.tang@intel.com>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/slub.c b/mm/slub.c
index 87c95f170f13..021991e17287 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -768,7 +768,7 @@ static inline unsigned int get_orig_size(struct kmem_cache *s, void *object)
{
void *p = kasan_reset_tag(object);
- if (!slub_debug_orig_size(s))
+ if (!slub_debug_orig_size(s) || is_kfence_address(object))
return s->object_size;
p += get_info_end(s);
--
2.34.1
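The guard this patch adds can be sketched as a pure function; 'debug_orig_size' and 'is_kfence' below stand in for slub_debug_orig_size() and is_kfence_address(), so this is a behavioral model rather than kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of get_orig_size() after this patch: fall back to the cache's
 * object_size when orig_size tracking is off or the object is
 * kfence-allocated (whose metadata has no valid orig_size).
 */
static unsigned int get_orig_size_model(unsigned int object_size,
					unsigned int stored_orig_size,
					bool debug_orig_size,
					bool is_kfence)
{
	if (!debug_orig_size || is_kfence)
		return object_size;
	return stored_orig_size;
}
```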
* [PATCH v2 3/5] mm/slub: Move krealloc() and related code to slub.c
2024-09-11 6:45 [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled Feng Tang
2024-09-11 6:45 ` [PATCH v2 1/5] mm/kasan: Don't store metadata inside kmalloc object when slub_debug_orig_size is on Feng Tang
2024-09-11 6:45 ` [PATCH v2 2/5] mm/slub: Consider kfence case for get_orig_size() Feng Tang
@ 2024-09-11 6:45 ` Feng Tang
2024-09-11 6:45 ` [PATCH v2 4/5] mm/slub: Improve redzone check and zeroing for krealloc() Feng Tang
` (2 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Feng Tang @ 2024-09-11 6:45 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Marco Elver, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino
Cc: linux-mm, kasan-dev, linux-kernel, Feng Tang
This is a preparation for the following refactoring of krealloc().
Since krealloc() will call several internal functions defined in
slub.c, moving it there allows more efficient function calls.
Signed-off-by: Feng Tang <feng.tang@intel.com>
---
mm/slab_common.c | 84 ------------------------------------------------
mm/slub.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 84 insertions(+), 84 deletions(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index af6b14769fbd..5734b61a106f 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1185,90 +1185,6 @@ module_init(slab_proc_init);
#endif /* CONFIG_SLUB_DEBUG */
-static __always_inline __realloc_size(2) void *
-__do_krealloc(const void *p, size_t new_size, gfp_t flags)
-{
- void *ret;
- size_t ks;
-
- /* Check for double-free before calling ksize. */
- if (likely(!ZERO_OR_NULL_PTR(p))) {
- if (!kasan_check_byte(p))
- return NULL;
- ks = ksize(p);
- } else
- ks = 0;
-
- /* If the object still fits, repoison it precisely. */
- if (ks >= new_size) {
- /* Zero out spare memory. */
- if (want_init_on_alloc(flags)) {
- kasan_disable_current();
- memset((void *)p + new_size, 0, ks - new_size);
- kasan_enable_current();
- }
-
- p = kasan_krealloc((void *)p, new_size, flags);
- return (void *)p;
- }
-
- ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
- if (ret && p) {
- /* Disable KASAN checks as the object's redzone is accessed. */
- kasan_disable_current();
- memcpy(ret, kasan_reset_tag(p), ks);
- kasan_enable_current();
- }
-
- return ret;
-}
-
-/**
- * krealloc - reallocate memory. The contents will remain unchanged.
- * @p: object to reallocate memory for.
- * @new_size: how many bytes of memory are required.
- * @flags: the type of memory to allocate.
- *
- * If @p is %NULL, krealloc() behaves exactly like kmalloc(). If @new_size
- * is 0 and @p is not a %NULL pointer, the object pointed to is freed.
- *
- * If __GFP_ZERO logic is requested, callers must ensure that, starting with the
- * initial memory allocation, every subsequent call to this API for the same
- * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
- * __GFP_ZERO is not fully honored by this API.
- *
- * This is the case, since krealloc() only knows about the bucket size of an
- * allocation (but not the exact size it was allocated with) and hence
- * implements the following semantics for shrinking and growing buffers with
- * __GFP_ZERO.
- *
- * new bucket
- * 0 size size
- * |--------|----------------|
- * | keep | zero |
- *
- * In any case, the contents of the object pointed to are preserved up to the
- * lesser of the new and old sizes.
- *
- * Return: pointer to the allocated memory or %NULL in case of error
- */
-void *krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
-{
- void *ret;
-
- if (unlikely(!new_size)) {
- kfree(p);
- return ZERO_SIZE_PTR;
- }
-
- ret = __do_krealloc(p, new_size, flags);
- if (ret && kasan_reset_tag(p) != kasan_reset_tag(ret))
- kfree(p);
-
- return ret;
-}
-EXPORT_SYMBOL(krealloc_noprof);
-
/**
* kfree_sensitive - Clear sensitive information in memory before freeing
* @p: object to free memory of
diff --git a/mm/slub.c b/mm/slub.c
index 021991e17287..c1796f9dd30f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4712,6 +4712,90 @@ void kfree(const void *object)
}
EXPORT_SYMBOL(kfree);
+static __always_inline __realloc_size(2) void *
+__do_krealloc(const void *p, size_t new_size, gfp_t flags)
+{
+ void *ret;
+ size_t ks;
+
+ /* Check for double-free before calling ksize. */
+ if (likely(!ZERO_OR_NULL_PTR(p))) {
+ if (!kasan_check_byte(p))
+ return NULL;
+ ks = ksize(p);
+ } else
+ ks = 0;
+
+ /* If the object still fits, repoison it precisely. */
+ if (ks >= new_size) {
+ /* Zero out spare memory. */
+ if (want_init_on_alloc(flags)) {
+ kasan_disable_current();
+ memset((void *)p + new_size, 0, ks - new_size);
+ kasan_enable_current();
+ }
+
+ p = kasan_krealloc((void *)p, new_size, flags);
+ return (void *)p;
+ }
+
+ ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
+ if (ret && p) {
+ /* Disable KASAN checks as the object's redzone is accessed. */
+ kasan_disable_current();
+ memcpy(ret, kasan_reset_tag(p), ks);
+ kasan_enable_current();
+ }
+
+ return ret;
+}
+
+/**
+ * krealloc - reallocate memory. The contents will remain unchanged.
+ * @p: object to reallocate memory for.
+ * @new_size: how many bytes of memory are required.
+ * @flags: the type of memory to allocate.
+ *
+ * If @p is %NULL, krealloc() behaves exactly like kmalloc(). If @new_size
+ * is 0 and @p is not a %NULL pointer, the object pointed to is freed.
+ *
+ * If __GFP_ZERO logic is requested, callers must ensure that, starting with the
+ * initial memory allocation, every subsequent call to this API for the same
+ * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
+ * __GFP_ZERO is not fully honored by this API.
+ *
+ * This is the case, since krealloc() only knows about the bucket size of an
+ * allocation (but not the exact size it was allocated with) and hence
+ * implements the following semantics for shrinking and growing buffers with
+ * __GFP_ZERO.
+ *
+ * new bucket
+ * 0 size size
+ * |--------|----------------|
+ * | keep | zero |
+ *
+ * In any case, the contents of the object pointed to are preserved up to the
+ * lesser of the new and old sizes.
+ *
+ * Return: pointer to the allocated memory or %NULL in case of error
+ */
+void *krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
+{
+ void *ret;
+
+ if (unlikely(!new_size)) {
+ kfree(p);
+ return ZERO_SIZE_PTR;
+ }
+
+ ret = __do_krealloc(p, new_size, flags);
+ if (ret && kasan_reset_tag(p) != kasan_reset_tag(ret))
+ kfree(p);
+
+ return ret;
+}
+EXPORT_SYMBOL(krealloc_noprof);
+
struct detached_freelist {
struct slab *slab;
void *tail;
--
2.34.1
* [PATCH v2 4/5] mm/slub: Improve redzone check and zeroing for krealloc()
2024-09-11 6:45 [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled Feng Tang
` (2 preceding siblings ...)
2024-09-11 6:45 ` [PATCH v2 3/5] mm/slub: Move krealloc() and related code to slub.c Feng Tang
@ 2024-09-11 6:45 ` Feng Tang
2024-09-11 6:45 ` [PATCH v2 5/5] mm/slub, kunit: Add testcase for krealloc redzone and zeroing Feng Tang
2024-10-02 10:42 ` [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled Vlastimil Babka
5 siblings, 0 replies; 21+ messages in thread
From: Feng Tang @ 2024-09-11 6:45 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Marco Elver, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino
Cc: linux-mm, kasan-dev, linux-kernel, Feng Tang
One problem with the current krealloc() is that its caller doesn't
pass the old request size: say the object is a 64-byte kmalloc one,
but the caller may have requested only 48 bytes. Then when krealloc()
shrinks or grows within the same object, or allocates a new bigger
object, it lacks this 'original size' information to do accurate data
preserving or zeroing (when __GFP_ZERO is set).
Thus with slub debug redzone and object tracking enabled, parts of the
object after krealloc() might contain redzone data instead of zeroes,
which violates the __GFP_ZERO guarantee. The good thing is that in
this case kmalloc caches do have the 'orig_size' feature, so solve the
problem by utilizing 'orig_size' to do accurate data zeroing and
preserving.
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Feng Tang <feng.tang@intel.com>
---
mm/slub.c | 54 ++++++++++++++++++++++++++++++++++++++----------------
1 file changed, 38 insertions(+), 16 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index c1796f9dd30f..e0fb0a26c796 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4717,33 +4717,51 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
{
void *ret;
size_t ks;
+ int orig_size = 0;
+ struct kmem_cache *s;
- /* Check for double-free before calling ksize. */
+ /* Check for double-free. */
if (likely(!ZERO_OR_NULL_PTR(p))) {
if (!kasan_check_byte(p))
return NULL;
- ks = ksize(p);
+
+ s = virt_to_cache(p);
+ orig_size = get_orig_size(s, (void *)p);
+ ks = s->object_size;
} else
ks = 0;
- /* If the object still fits, repoison it precisely. */
- if (ks >= new_size) {
- /* Zero out spare memory. */
- if (want_init_on_alloc(flags)) {
- kasan_disable_current();
+ /* If the object doesn't fit, allocate a bigger one */
+ if (new_size > ks)
+ goto alloc_new;
+
+ /* Zero out spare memory. */
+ if (want_init_on_alloc(flags)) {
+ kasan_disable_current();
+ if (orig_size < new_size)
+ memset((void *)p + orig_size, 0, new_size - orig_size);
+ else
memset((void *)p + new_size, 0, ks - new_size);
- kasan_enable_current();
- }
+ kasan_enable_current();
+ }
- p = kasan_krealloc((void *)p, new_size, flags);
- return (void *)p;
+ if (slub_debug_orig_size(s) && !is_kfence_address(p)) {
+ set_orig_size(s, (void *)p, new_size);
+ if (s->flags & SLAB_RED_ZONE && new_size < ks)
+ memset_no_sanitize_memory((void *)p + new_size,
+ SLUB_RED_ACTIVE, ks - new_size);
}
+ p = kasan_krealloc((void *)p, new_size, flags);
+ return (void *)p;
+
+alloc_new:
ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
if (ret && p) {
/* Disable KASAN checks as the object's redzone is accessed. */
kasan_disable_current();
- memcpy(ret, kasan_reset_tag(p), ks);
+ if (orig_size)
+ memcpy(ret, kasan_reset_tag(p), orig_size);
kasan_enable_current();
}
@@ -4764,16 +4782,20 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
* memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible that
* __GFP_ZERO is not fully honored by this API.
*
- * This is the case, since krealloc() only knows about the bucket size of an
- * allocation (but not the exact size it was allocated with) and hence
- * implements the following semantics for shrinking and growing buffers with
- * __GFP_ZERO.
+ * When slub_debug_orig_size() is off, krealloc() only knows about the bucket
+ * size of an allocation (but not the exact size it was allocated with) and
+ * hence implements the following semantics for shrinking and growing buffers
+ * with __GFP_ZERO.
*
* new bucket
* 0 size size
* |--------|----------------|
* | keep | zero |
*
+ * Otherwise, the original allocation size 'orig_size' could be used to
+ * precisely clear the requested size, and the new size will also be stored
+ * as the new 'orig_size'.
+ *
* In any case, the contents of the object pointed to are preserved up to the
* lesser of the new and old sizes.
*
--
2.34.1
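The reuse-path zeroing rule in this patch — clear [orig_size, new_size) when growing within the object, [new_size, ks) when shrinking — can be modeled in userspace. This is a sketch of just the memset logic, using plain buffers instead of slab objects:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Model of the __GFP_ZERO handling when krealloc() reuses the same
 * object: p is the object, orig_size the previously requested size,
 * new_size the new request, ks the usable object size.
 */
static void zero_spare(char *p, size_t orig_size, size_t new_size, size_t ks)
{
	if (orig_size < new_size)
		memset(p + orig_size, 0, new_size - orig_size); /* grow */
	else
		memset(p + new_size, 0, ks - new_size);         /* shrink */
}
```

Without 'orig_size', only the shrink branch is possible, which is exactly why bytes between the old request size and the bucket size could leak redzone data.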
* [PATCH v2 5/5] mm/slub, kunit: Add testcase for krealloc redzone and zeroing
2024-09-11 6:45 [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled Feng Tang
` (3 preceding siblings ...)
2024-09-11 6:45 ` [PATCH v2 4/5] mm/slub: Improve redzone check and zeroing for krealloc() Feng Tang
@ 2024-09-11 6:45 ` Feng Tang
2024-10-02 10:42 ` [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled Vlastimil Babka
5 siblings, 0 replies; 21+ messages in thread
From: Feng Tang @ 2024-09-11 6:45 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Marco Elver, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino
Cc: linux-mm, kasan-dev, linux-kernel, Feng Tang
Danilo Krummrich raised an issue about krealloc()+GFP_ZERO [1], and
Vlastimil suggested adding a test case to sanity-check the
kmalloc redzone and zeroing by utilizing kmalloc's 'orig_size' debug
feature.
It covers krealloc() growing and shrinking within the current kmalloc
object, as well as the case of re-allocating a new bigger object.
[1]. https://lore.kernel.org/lkml/20240812223707.32049-1-dakr@kernel.org/
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
---
Hi Danilo,
I kept your Reviewed-by tag, as I think this v2 mostly changes which
kmalloc slab is used. Let me know if you want it dropped, thanks.
lib/slub_kunit.c | 42 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
diff --git a/lib/slub_kunit.c b/lib/slub_kunit.c
index 6e3a1e5a7142..b3d158f38b98 100644
--- a/lib/slub_kunit.c
+++ b/lib/slub_kunit.c
@@ -186,6 +186,47 @@ static void test_leak_destroy(struct kunit *test)
KUNIT_EXPECT_EQ(test, 1, slab_errors);
}
+static void test_krealloc_redzone_zeroing(struct kunit *test)
+{
+ u8 *p;
+ int i;
+ struct kmem_cache *s = test_kmem_cache_create("TestSlub_krealloc", 64,
+ SLAB_KMALLOC|SLAB_STORE_USER|SLAB_RED_ZONE);
+
+ p = __kmalloc_cache_noprof(s, GFP_KERNEL, 48);
+ memset(p, 0xff, 48);
+
+ kasan_disable_current();
+ OPTIMIZER_HIDE_VAR(p);
+
+ /* Test shrink */
+ p = krealloc(p, 40, GFP_KERNEL | __GFP_ZERO);
+ for (i = 40; i < 64; i++)
+ KUNIT_EXPECT_EQ(test, p[i], SLUB_RED_ACTIVE);
+
+ /* Test grow within the same 64B kmalloc object */
+ p = krealloc(p, 56, GFP_KERNEL | __GFP_ZERO);
+ for (i = 40; i < 56; i++)
+ KUNIT_EXPECT_EQ(test, p[i], 0);
+ for (i = 56; i < 64; i++)
+ KUNIT_EXPECT_EQ(test, p[i], SLUB_RED_ACTIVE);
+
+ validate_slab_cache(s);
+ KUNIT_EXPECT_EQ(test, 0, slab_errors);
+
+ memset(p, 0xff, 56);
+ /* Test grow with allocating a bigger 128B object */
+ p = krealloc(p, 112, GFP_KERNEL | __GFP_ZERO);
+ for (i = 0; i < 56; i++)
+ KUNIT_EXPECT_EQ(test, p[i], 0xff);
+ for (i = 56; i < 112; i++)
+ KUNIT_EXPECT_EQ(test, p[i], 0);
+
+ kfree(p);
+ kasan_enable_current();
+ kmem_cache_destroy(s);
+}
+
static int test_init(struct kunit *test)
{
slab_errors = 0;
@@ -208,6 +249,7 @@ static struct kunit_case test_cases[] = {
KUNIT_CASE(test_kmalloc_redzone_access),
KUNIT_CASE(test_kfree_rcu),
KUNIT_CASE(test_leak_destroy),
+ KUNIT_CASE(test_krealloc_redzone_zeroing),
{}
};
--
2.34.1
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-09-11 6:45 [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled Feng Tang
` (4 preceding siblings ...)
2024-09-11 6:45 ` [PATCH v2 5/5] mm/slub, kunit: Add testcase for krealloc redzone and zeroing Feng Tang
@ 2024-10-02 10:42 ` Vlastimil Babka
2024-10-04 6:44 ` Marco Elver
5 siblings, 1 reply; 21+ messages in thread
From: Vlastimil Babka @ 2024-10-02 10:42 UTC (permalink / raw)
To: Feng Tang, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Marco Elver, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino
Cc: linux-mm, kasan-dev, linux-kernel
On 9/11/24 08:45, Feng Tang wrote:
> Danilo Krummrich's patch [1] raised one problem about krealloc() that
> its caller doesn't pass the old request size, say the object is 64
> bytes kmalloc one, but caller originally only requested 48 bytes. Then
> when krealloc() shrinks or grows in the same object, or allocate a new
> bigger object, it lacks this 'original size' information to do accurate
> data preserving or zeroing (when __GFP_ZERO is set).
>
> Thus with slub debug redzone and object tracking enabled, parts of the
> object after krealloc() might contain redzone data instead of zeroes,
> which is violating the __GFP_ZERO guarantees. Good thing is in this
> case, kmalloc caches do have this 'orig_size' feature, which could be
> used to improve the situation here.
>
> To make the 'orig_size' accurate, we adjust some kasan/slub meta data
> handling. Also add a slub kunit test case for krealloc().
>
> This patchset has dependency over patches in both -mm tree and -slab
> trees, so it is written based on linux-next tree '20240910' version.
>
> [1]. https://lore.kernel.org/lkml/20240812223707.32049-1-dakr@kernel.org/
Thanks, added to slab/for-next
>
> Thanks,
> Feng
>
> Changelog:
>
> Since v1:
> * Drop the patch changing generic kunit code from this patchset,
> and will send it separately.
> * Separate the krealloc moving form slab_common.c to slub.c to a
> new patch for better review (Danilo/Vlastimil)
> * Improve commit log and comments (Vlastimil/Danilo)
> * Rework the kunit test case to remove its dependency over
> slub_debug (which is incomplete in v1) (Vlastimil)
> * Add ack and review tag from developers.
>
> Feng Tang (5):
> mm/kasan: Don't store metadata inside kmalloc object when
> slub_debug_orig_size is on
> mm/slub: Consider kfence case for get_orig_size()
> mm/slub: Move krealloc() and related code to slub.c
> mm/slub: Improve redzone check and zeroing for krealloc()
> mm/slub, kunit: Add testcase for krealloc redzone and zeroing
>
> lib/slub_kunit.c | 42 +++++++++++++++
> mm/kasan/generic.c | 7 ++-
> mm/slab.h | 6 +++
> mm/slab_common.c | 84 ------------------------------
> mm/slub.c | 125 ++++++++++++++++++++++++++++++++++++++-------
> 5 files changed, 160 insertions(+), 104 deletions(-)
>
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-02 10:42 ` [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled Vlastimil Babka
@ 2024-10-04 6:44 ` Marco Elver
2024-10-04 9:18 ` Vlastimil Babka
0 siblings, 1 reply; 21+ messages in thread
From: Marco Elver @ 2024-10-04 6:44 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Feng Tang, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Shuah Khan, David Gow, Danilo Krummrich,
Alexander Potapenko, Andrey Ryabinin, Dmitry Vyukov,
Vincenzo Frascino, linux-mm, kasan-dev, linux-kernel,
Eric Dumazet
On Wed, 2 Oct 2024 at 12:42, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 9/11/24 08:45, Feng Tang wrote:
> > Danilo Krummrich's patch [1] raised one problem about krealloc() that
> > its caller doesn't pass the old request size, say the object is 64
> > bytes kmalloc one, but caller originally only requested 48 bytes. Then
> > when krealloc() shrinks or grows in the same object, or allocate a new
> > bigger object, it lacks this 'original size' information to do accurate
> > data preserving or zeroing (when __GFP_ZERO is set).
> >
> > Thus with slub debug redzone and object tracking enabled, parts of the
> > object after krealloc() might contain redzone data instead of zeroes,
> > which is violating the __GFP_ZERO guarantees. Good thing is in this
> > case, kmalloc caches do have this 'orig_size' feature, which could be
> > used to improve the situation here.
> >
> > To make the 'orig_size' accurate, we adjust some kasan/slub meta data
> > handling. Also add a slub kunit test case for krealloc().
> >
> > This patchset has dependency over patches in both -mm tree and -slab
> > trees, so it is written based on linux-next tree '20240910' version.
> >
> > [1]. https://lore.kernel.org/lkml/20240812223707.32049-1-dakr@kernel.org/
>
> Thanks, added to slab/for-next
This series just hit -next, and we're seeing several "KFENCE: memory
corruption ...". Here's one:
https://lore.kernel.org/all/66ff8bf6.050a0220.49194.0453.GAE@google.com/
One more (no link):
> ==================================================================
> BUG: KFENCE: memory corruption in xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051
>
> Corrupted memory at 0xffff88823bf5a0d0 [ 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 ] (in kfence-#172):
> xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051
> xfs_iext_destroy+0x66/0x100 fs/xfs/libxfs/xfs_iext_tree.c:1062
> xfs_inode_free_callback+0x91/0x1d0 fs/xfs/xfs_icache.c:145
> rcu_do_batch kernel/rcu/tree.c:2567 [inline]
[...]
>
> kfence-#172: 0xffff88823bf5a000-0xffff88823bf5a0cf, size=208, cache=kmalloc-256
>
> allocated by task 5494 on cpu 0 at 101.266046s (0.409225s ago):
> __do_krealloc mm/slub.c:4784 [inline]
> krealloc_noprof+0xd6/0x2e0 mm/slub.c:4838
> xfs_iext_realloc_root fs/xfs/libxfs/xfs_iext_tree.c:613 [inline]
[...]
>
> freed by task 16 on cpu 0 at 101.573936s (0.186416s ago):
> xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051
> xfs_iext_destroy+0x66/0x100 fs/xfs/libxfs/xfs_iext_tree.c:1062
> xfs_inode_free_callback+0x91/0x1d0 fs/xfs/xfs_icache.c:145
[...]
>
> CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.12.0-rc1-next-20241003-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
> ==================================================================
Unfortunately there's no reproducer yet it seems. Unless it's
immediately obvious to say what's wrong, is it possible to take this
series out of -next to confirm this series is causing the memory
corruptions? Syzbot should then stop finding these crashes.
Thanks,
-- Marco
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-04 6:44 ` Marco Elver
@ 2024-10-04 9:18 ` Vlastimil Babka
2024-10-04 9:52 ` Vlastimil Babka
0 siblings, 1 reply; 21+ messages in thread
From: Vlastimil Babka @ 2024-10-04 9:18 UTC (permalink / raw)
To: Marco Elver
Cc: Feng Tang, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Shuah Khan, David Gow, Danilo Krummrich,
Alexander Potapenko, Andrey Ryabinin, Dmitry Vyukov,
Vincenzo Frascino, linux-mm, kasan-dev, linux-kernel,
Eric Dumazet
On 10/4/24 08:44, Marco Elver wrote:
> On Wed, 2 Oct 2024 at 12:42, Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 9/11/24 08:45, Feng Tang wrote:
>> > Danilo Krummrich's patch [1] raised one problem about krealloc() that
>> > its caller doesn't pass the old request size, say the object is 64
>> > bytes kmalloc one, but caller originally only requested 48 bytes. Then
>> > when krealloc() shrinks or grows in the same object, or allocate a new
>> > bigger object, it lacks this 'original size' information to do accurate
>> > data preserving or zeroing (when __GFP_ZERO is set).
>> >
>> > Thus with slub debug redzone and object tracking enabled, parts of the
>> > object after krealloc() might contain redzone data instead of zeroes,
>> > which is violating the __GFP_ZERO guarantees. Good thing is in this
>> > case, kmalloc caches do have this 'orig_size' feature, which could be
>> > used to improve the situation here.
>> >
>> > To make the 'orig_size' accurate, we adjust some kasan/slub meta data
>> > handling. Also add a slub kunit test case for krealloc().
>> >
>> > This patchset has dependency over patches in both -mm tree and -slab
>> > trees, so it is written based on linux-next tree '20240910' version.
>> >
>> > [1]. https://lore.kernel.org/lkml/20240812223707.32049-1-dakr@kernel.org/
>>
>> Thanks, added to slab/for-next
>
> This series just hit -next, and we're seeing several "KFENCE: memory
> corruption ...". Here's one:
> https://lore.kernel.org/all/66ff8bf6.050a0220.49194.0453.GAE@google.com/
>
> One more (no link):
>
>> ==================================================================
>> BUG: KFENCE: memory corruption in xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051
>>
>> Corrupted memory at 0xffff88823bf5a0d0 [ 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 ] (in kfence-#172):
>> xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051
>> xfs_iext_destroy+0x66/0x100 fs/xfs/libxfs/xfs_iext_tree.c:1062
>> xfs_inode_free_callback+0x91/0x1d0 fs/xfs/xfs_icache.c:145
>> rcu_do_batch kernel/rcu/tree.c:2567 [inline]
> [...]
>>
>> kfence-#172: 0xffff88823bf5a000-0xffff88823bf5a0cf, size=208, cache=kmalloc-256
>>
>> allocated by task 5494 on cpu 0 at 101.266046s (0.409225s ago):
>> __do_krealloc mm/slub.c:4784 [inline]
>> krealloc_noprof+0xd6/0x2e0 mm/slub.c:4838
>> xfs_iext_realloc_root fs/xfs/libxfs/xfs_iext_tree.c:613 [inline]
> [...]
>>
>> freed by task 16 on cpu 0 at 101.573936s (0.186416s ago):
>> xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051
>> xfs_iext_destroy+0x66/0x100 fs/xfs/libxfs/xfs_iext_tree.c:1062
>> xfs_inode_free_callback+0x91/0x1d0 fs/xfs/xfs_icache.c:145
> [...]
>>
>> CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.12.0-rc1-next-20241003-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
>> ==================================================================
>
> Unfortunately there's no reproducer yet, it seems. Unless it's
> immediately obvious what's wrong, is it possible to take this
> series out of -next to confirm this series is causing the memory
> corruptions? Syzbot should then stop finding these crashes.
I think it's commit d0a38fad51cc7 doing this in __do_krealloc():
- ks = ksize(p);
+
+ s = virt_to_cache(p);
+ orig_size = get_orig_size(s, (void *)p);
+ ks = s->object_size;
so for kfence objects we don't get their actual allocation size but the
potentially larger bucket size?
I guess we could do:
ks = kfence_ksize(p) ?: s->object_size;
?
> Thanks,
> -- Marco
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-04 9:18 ` Vlastimil Babka
@ 2024-10-04 9:52 ` Vlastimil Babka
2024-10-04 10:28 ` Feng Tang
2024-10-14 7:52 ` Feng Tang
0 siblings, 2 replies; 21+ messages in thread
From: Vlastimil Babka @ 2024-10-04 9:52 UTC (permalink / raw)
To: Marco Elver
Cc: Feng Tang, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Shuah Khan, David Gow, Danilo Krummrich,
Alexander Potapenko, Andrey Ryabinin, Dmitry Vyukov,
Vincenzo Frascino, linux-mm, kasan-dev, linux-kernel,
Eric Dumazet
On 10/4/24 11:18, Vlastimil Babka wrote:
> On 10/4/24 08:44, Marco Elver wrote:
>
> I think it's commit d0a38fad51cc7 doing in __do_krealloc()
>
> - ks = ksize(p);
> +
> + s = virt_to_cache(p);
> + orig_size = get_orig_size(s, (void *)p);
> + ks = s->object_size;
>
> so for kfence objects we don't get their actual allocation size but the
> potentially larger bucket size?
>
> I guess we could do:
>
> ks = kfence_ksize(p) ?: s->object_size;
>
> ?
Hmm, this probably isn't the whole story; we also have:
- memcpy(ret, kasan_reset_tag(p), ks);
+ if (orig_size)
+ memcpy(ret, kasan_reset_tag(p), orig_size);
orig_size for kfence will again be s->object_size, so the memcpy might be a
(read) buffer overflow from a kfence allocation.
I think get_orig_size() should perhaps return kfence_ksize(p) for kfence
allocations, in addition to the change above.
Or alternatively we don't change get_orig_size() (in a different commit) at
all, but __do_krealloc() will have an "if is_kfence_address()" that sets
both orig_size and ks to kfence_ksize(p) appropriately. That might be easier
to follow.
But either way means rewriting 2 commits. I think it's indeed better to drop
the series now from -next and submit a v3.
Vlastimil
>> Thanks,
>> -- Marco
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-04 9:52 ` Vlastimil Babka
@ 2024-10-04 10:28 ` Feng Tang
2024-10-14 7:52 ` Feng Tang
1 sibling, 0 replies; 21+ messages in thread
From: Feng Tang @ 2024-10-04 10:28 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Marco Elver, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Shuah Khan, David Gow, Danilo Krummrich,
Alexander Potapenko, Andrey Ryabinin, Dmitry Vyukov,
Vincenzo Frascino, linux-mm@kvack.org, kasan-dev@googlegroups.com,
linux-kernel@vger.kernel.org, Eric Dumazet
On Fri, Oct 04, 2024 at 05:52:10PM +0800, Vlastimil Babka wrote:
> On 10/4/24 11:18, Vlastimil Babka wrote:
> > On 10/4/24 08:44, Marco Elver wrote:
> >
> > I think it's commit d0a38fad51cc7 doing in __do_krealloc()
> >
> > - ks = ksize(p);
> > +
> > + s = virt_to_cache(p);
> > + orig_size = get_orig_size(s, (void *)p);
> > + ks = s->object_size;
> >
> > so for kfence objects we don't get their actual allocation size but the
> > potentially larger bucket size?
> >
> > I guess we could do:
> >
> > ks = kfence_ksize(p) ?: s->object_size;
> >
> > ?
>
> Hmm this probably is not the whole story, we also have:
>
> - memcpy(ret, kasan_reset_tag(p), ks);
> + if (orig_size)
> + memcpy(ret, kasan_reset_tag(p), orig_size);
>
> orig_size for kfence will be again s->object_size so the memcpy might be a
> (read) buffer overflow from a kfence allocation.
>
> I think get_orig_size() should perhaps return kfence_ksize(p) for kfence
> allocations, in addition to the change above.
>
> Or alternatively we don't change get_orig_size() (in a different commit) at
> all, but __do_krealloc() will have an "if is_kfence_address()" that sets
> both orig_size and ks to kfence_ksize(p) appropriately. That might be easier
> to follow.
>
> But either way means rewriting 2 commits. I think it's indeed better to drop
> the series now from -next and submit a v3.
Yes, we can revert now. Sorry for the inconvenience.
Thanks,
Feng
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-04 9:52 ` Vlastimil Babka
2024-10-04 10:28 ` Feng Tang
@ 2024-10-14 7:52 ` Feng Tang
2024-10-14 8:53 ` Vlastimil Babka
1 sibling, 1 reply; 21+ messages in thread
From: Feng Tang @ 2024-10-14 7:52 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Marco Elver, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Shuah Khan, David Gow, Danilo Krummrich,
Alexander Potapenko, Andrey Ryabinin, Dmitry Vyukov,
Vincenzo Frascino, linux-mm@kvack.org, kasan-dev@googlegroups.com,
linux-kernel@vger.kernel.org, Eric Dumazet
On Fri, Oct 04, 2024 at 05:52:10PM +0800, Vlastimil Babka wrote:
> On 10/4/24 11:18, Vlastimil Babka wrote:
> > On 10/4/24 08:44, Marco Elver wrote:
> >
> > I think it's commit d0a38fad51cc7 doing in __do_krealloc()
> >
> > - ks = ksize(p);
> > +
> > + s = virt_to_cache(p);
> > + orig_size = get_orig_size(s, (void *)p);
> > + ks = s->object_size;
> >
> > so for kfence objects we don't get their actual allocation size but the
> > potentially larger bucket size?
> >
> > I guess we could do:
> >
> > ks = kfence_ksize(p) ?: s->object_size;
> >
> > ?
>
> Hmm this probably is not the whole story, we also have:
>
> - memcpy(ret, kasan_reset_tag(p), ks);
> + if (orig_size)
> + memcpy(ret, kasan_reset_tag(p), orig_size);
>
> orig_size for kfence will be again s->object_size so the memcpy might be a
> (read) buffer overflow from a kfence allocation.
>
> I think get_orig_size() should perhaps return kfence_ksize(p) for kfence
> allocations, in addition to the change above.
>
> Or alternatively we don't change get_orig_size() (in a different commit) at
> all, but __do_krealloc() will have an "if is_kfence_address()" that sets
> both orig_size and ks to kfence_ksize(p) appropriately. That might be easier
> to follow.
Thanks for the suggestion!
As there was an error report about a NULL slab for the big kmalloc case, how
about the following code for
__do_krealloc(const void *p, size_t new_size, gfp_t flags)
{
void *ret;
size_t ks = 0;
int orig_size = 0;
struct kmem_cache *s = NULL;
/* Check for double-free. */
if (likely(!ZERO_OR_NULL_PTR(p))) {
if (!kasan_check_byte(p))
return NULL;
ks = ksize(p);
/* Some objects have no orig_size, like big kmalloc case */
if (is_kfence_address(p)) {
orig_size = kfence_ksize(p);
} else if (virt_to_slab(p)) {
s = virt_to_cache(p);
orig_size = get_orig_size(s, (void *)p);
}
} else {
goto alloc_new;
}
/* If the object doesn't fit, allocate a bigger one */
if (new_size > ks)
goto alloc_new;
/* Zero out spare memory. */
if (want_init_on_alloc(flags)) {
kasan_disable_current();
if (orig_size && orig_size < new_size)
memset((void *)p + orig_size, 0, new_size - orig_size);
else
memset((void *)p + new_size, 0, ks - new_size);
kasan_enable_current();
}
/* Setup kmalloc redzone when needed */
if (s && slub_debug_orig_size(s) && !is_kfence_address(p)) {
set_orig_size(s, (void *)p, new_size);
if (s->flags & SLAB_RED_ZONE && new_size < ks)
memset_no_sanitize_memory((void *)p + new_size,
SLUB_RED_ACTIVE, ks - new_size);
}
p = kasan_krealloc((void *)p, new_size, flags);
return (void *)p;
alloc_new:
ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
if (ret && p) {
/* Disable KASAN checks as the object's redzone is accessed. */
kasan_disable_current();
memcpy(ret, kasan_reset_tag(p), orig_size ?: ks);
kasan_enable_current();
}
return ret;
}
I've run it with the syzbot reproducer; so far the issue hasn't been
reproduced on my local machine.
Thanks,
Feng
>
> But either way means rewriting 2 commits. I think it's indeed better to drop
> the series now from -next and submit a v3.
>
> Vlastimil
>
> >> Thanks,
> >> -- Marco
> >
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-14 7:52 ` Feng Tang
@ 2024-10-14 8:53 ` Vlastimil Babka
2024-10-14 12:52 ` Feng Tang
0 siblings, 1 reply; 21+ messages in thread
From: Vlastimil Babka @ 2024-10-14 8:53 UTC (permalink / raw)
To: Feng Tang
Cc: Marco Elver, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Shuah Khan, David Gow, Danilo Krummrich,
Alexander Potapenko, Andrey Ryabinin, Dmitry Vyukov,
Vincenzo Frascino, linux-mm@kvack.org, kasan-dev@googlegroups.com,
linux-kernel@vger.kernel.org, Eric Dumazet
On 10/14/24 09:52, Feng Tang wrote:
> On Fri, Oct 04, 2024 at 05:52:10PM +0800, Vlastimil Babka wrote:
> Thanks for the suggestion!
>
> As there were error report about the NULL slab for big kmalloc object, how
> about the following code for
>
> __do_krealloc(const void *p, size_t new_size, gfp_t flags)
> {
> void *ret;
> size_t ks = 0;
> int orig_size = 0;
> struct kmem_cache *s = NULL;
>
> /* Check for double-free. */
> if (likely(!ZERO_OR_NULL_PTR(p))) {
> if (!kasan_check_byte(p))
> return NULL;
>
> ks = ksize(p);
I think this will result in __ksize() doing
skip_orig_size_check(folio_slab(folio)->slab_cache, object);
and we don't want that?
Also the checks below repeat some of the checks of ksize().
So I think in __do_krealloc() we should do things manually to determine ks
and not call ksize(), just without breaking any of the cases ksize() handles
(kfence, large kmalloc).
>
> /* Some objects have no orig_size, like big kmalloc case */
> if (is_kfence_address(p)) {
> orig_size = kfence_ksize(p);
> } else if (virt_to_slab(p)) {
> s = virt_to_cache(p);
> orig_size = get_orig_size(s, (void *)p);
> }
> } else {
> goto alloc_new;
> }
>
> /* If the object doesn't fit, allocate a bigger one */
> if (new_size > ks)
> goto alloc_new;
>
> /* Zero out spare memory. */
> if (want_init_on_alloc(flags)) {
> kasan_disable_current();
> if (orig_size && orig_size < new_size)
> memset((void *)p + orig_size, 0, new_size - orig_size);
> else
> memset((void *)p + new_size, 0, ks - new_size);
> kasan_enable_current();
> }
>
> /* Setup kmalloc redzone when needed */
> if (s && slub_debug_orig_size(s) && !is_kfence_address(p)) {
> set_orig_size(s, (void *)p, new_size);
> if (s->flags & SLAB_RED_ZONE && new_size < ks)
> memset_no_sanitize_memory((void *)p + new_size,
> SLUB_RED_ACTIVE, ks - new_size);
> }
>
> p = kasan_krealloc((void *)p, new_size, flags);
> return (void *)p;
>
> alloc_new:
> ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
> if (ret && p) {
> /* Disable KASAN checks as the object's redzone is accessed. */
> kasan_disable_current();
> memcpy(ret, kasan_reset_tag(p), orig_size ?: ks);
> kasan_enable_current();
> }
>
> return ret;
> }
>
> I've run it with the reproducer of syzbot, so far the issue hasn't been
> reproduced on my local machine.
>
> Thanks,
> Feng
>
>>
>> But either way means rewriting 2 commits. I think it's indeed better to drop
>> the series now from -next and submit a v3.
>>
>> Vlastimil
>>
>> >> Thanks,
>> >> -- Marco
>> >
>>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-14 8:53 ` Vlastimil Babka
@ 2024-10-14 12:52 ` Feng Tang
2024-10-14 13:12 ` Vlastimil Babka
0 siblings, 1 reply; 21+ messages in thread
From: Feng Tang @ 2024-10-14 12:52 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Marco Elver, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Shuah Khan, David Gow, Danilo Krummrich,
Alexander Potapenko, Andrey Ryabinin, Dmitry Vyukov,
Vincenzo Frascino, linux-mm@kvack.org, kasan-dev@googlegroups.com,
linux-kernel@vger.kernel.org, Eric Dumazet
On Mon, Oct 14, 2024 at 10:53:32AM +0200, Vlastimil Babka wrote:
> On 10/14/24 09:52, Feng Tang wrote:
> > On Fri, Oct 04, 2024 at 05:52:10PM +0800, Vlastimil Babka wrote:
> > Thanks for the suggestion!
> >
> > As there were error report about the NULL slab for big kmalloc object, how
> > about the following code for
> >
> > __do_krealloc(const void *p, size_t new_size, gfp_t flags)
> > {
> > void *ret;
> > size_t ks = 0;
> > int orig_size = 0;
> > struct kmem_cache *s = NULL;
> >
> > /* Check for double-free. */
> > if (likely(!ZERO_OR_NULL_PTR(p))) {
> > if (!kasan_check_byte(p))
> > return NULL;
> >
> > ks = ksize(p);
>
> I think this will result in __ksize() doing
> skip_orig_size_check(folio_slab(folio)->slab_cache, object);
> and we don't want that?
I think that's fine, as the later code will re-set the orig_size anyway.
> Also the checks below repeat some of the checks of ksize().
Yes, there is some redundancy, mostly the virt_to_slab() check.
> So I think in __do_krealloc() we should do things manually to determine ks
> and not call ksize(). Just not break any of the cases ksize() handles
> (kfence, large kmalloc).
OK, originally I tried not to expose internals of __ksize(). Let me
try this way.
Thanks,
Feng
>
> >
> > /* Some objects have no orig_size, like big kmalloc case */
> > if (is_kfence_address(p)) {
> > orig_size = kfence_ksize(p);
> > } else if (virt_to_slab(p)) {
> > s = virt_to_cache(p);
> > orig_size = get_orig_size(s, (void *)p);
> > }
> > } else {
> > goto alloc_new;
> > }
> >
> > /* If the object doesn't fit, allocate a bigger one */
> > if (new_size > ks)
> > goto alloc_new;
> >
> > /* Zero out spare memory. */
> > if (want_init_on_alloc(flags)) {
> > kasan_disable_current();
> > if (orig_size && orig_size < new_size)
> > memset((void *)p + orig_size, 0, new_size - orig_size);
> > else
> > memset((void *)p + new_size, 0, ks - new_size);
> > kasan_enable_current();
> > }
> >
> > /* Setup kmalloc redzone when needed */
> > if (s && slub_debug_orig_size(s) && !is_kfence_address(p)) {
> > set_orig_size(s, (void *)p, new_size);
> > if (s->flags & SLAB_RED_ZONE && new_size < ks)
> > memset_no_sanitize_memory((void *)p + new_size,
> > SLUB_RED_ACTIVE, ks - new_size);
> > }
> >
> > p = kasan_krealloc((void *)p, new_size, flags);
> > return (void *)p;
> >
> > alloc_new:
> > ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
> > if (ret && p) {
> > /* Disable KASAN checks as the object's redzone is accessed. */
> > kasan_disable_current();
> > memcpy(ret, kasan_reset_tag(p), orig_size ?: ks);
> > kasan_enable_current();
> > }
> >
> > return ret;
> > }
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-14 12:52 ` Feng Tang
@ 2024-10-14 13:12 ` Vlastimil Babka
2024-10-14 14:20 ` Feng Tang
2024-10-14 20:35 ` Kees Cook
0 siblings, 2 replies; 21+ messages in thread
From: Vlastimil Babka @ 2024-10-14 13:12 UTC (permalink / raw)
To: Feng Tang, Kees Cook
Cc: Marco Elver, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Shuah Khan, David Gow, Danilo Krummrich,
Alexander Potapenko, Andrey Ryabinin, Dmitry Vyukov,
Vincenzo Frascino, linux-mm@kvack.org, kasan-dev@googlegroups.com,
linux-kernel@vger.kernel.org, Eric Dumazet
On 10/14/24 14:52, Feng Tang wrote:
> On Mon, Oct 14, 2024 at 10:53:32AM +0200, Vlastimil Babka wrote:
>> On 10/14/24 09:52, Feng Tang wrote:
>> > On Fri, Oct 04, 2024 at 05:52:10PM +0800, Vlastimil Babka wrote:
>> > Thanks for the suggestion!
>> >
>> > As there were error report about the NULL slab for big kmalloc object, how
>> > about the following code for
>> >
>> > __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>> > {
>> > void *ret;
>> > size_t ks = 0;
>> > int orig_size = 0;
>> > struct kmem_cache *s = NULL;
>> >
>> > /* Check for double-free. */
>> > if (likely(!ZERO_OR_NULL_PTR(p))) {
>> > if (!kasan_check_byte(p))
>> > return NULL;
>> >
>> > ks = ksize(p);
>>
>> I think this will result in __ksize() doing
>> skip_orig_size_check(folio_slab(folio)->slab_cache, object);
>> and we don't want that?
>
> I think that's fine. As later code will re-set the orig_size anyway.
But you also read it first.
>> > /* Some objects have no orig_size, like big kmalloc case */
>> > if (is_kfence_address(p)) {
>> > orig_size = kfence_ksize(p);
>> > } else if (virt_to_slab(p)) {
>> > s = virt_to_cache(p);
>> > orig_size = get_orig_size(s, (void *)p);
here.
>> > }
>> Also the checks below repeat some of the checks of ksize().
>
> Yes, there is some redundancy, mostly the virt_to_slab()
>
>> So I think in __do_krealloc() we should do things manually to determine ks
>> and not call ksize(). Just not break any of the cases ksize() handles
>> (kfence, large kmalloc).
>
> OK, originally I tried not to expose internals of __ksize(). Let me
> try this way.
ksize() makes assumptions that a user outside of slab itself is calling it.
But we (well mostly Kees) also introduced kmalloc_size_roundup() to avoid
querying ksize() for the purposes of writing beyond the original
kmalloc(size) up to the bucket size. So maybe we can also investigate if the
skip_orig_size_check() mechanism can be removed now?
Still I think __do_krealloc() should rather do its own thing and not call
ksize().
> Thanks,
> Feng
>
>>
>> >
>> > } else {
>> > goto alloc_new;
>> > }
>> >
>> > /* If the object doesn't fit, allocate a bigger one */
>> > if (new_size > ks)
>> > goto alloc_new;
>> >
>> > /* Zero out spare memory. */
>> > if (want_init_on_alloc(flags)) {
>> > kasan_disable_current();
>> > if (orig_size && orig_size < new_size)
>> > memset((void *)p + orig_size, 0, new_size - orig_size);
>> > else
>> > memset((void *)p + new_size, 0, ks - new_size);
>> > kasan_enable_current();
>> > }
>> >
>> > /* Setup kmalloc redzone when needed */
>> > if (s && slub_debug_orig_size(s) && !is_kfence_address(p)) {
>> > set_orig_size(s, (void *)p, new_size);
>> > if (s->flags & SLAB_RED_ZONE && new_size < ks)
>> > memset_no_sanitize_memory((void *)p + new_size,
>> > SLUB_RED_ACTIVE, ks - new_size);
>> > }
>> >
>> > p = kasan_krealloc((void *)p, new_size, flags);
>> > return (void *)p;
>> >
>> > alloc_new:
>> > ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
>> > if (ret && p) {
>> > /* Disable KASAN checks as the object's redzone is accessed. */
>> > kasan_disable_current();
>> > memcpy(ret, kasan_reset_tag(p), orig_size ?: ks);
>> > kasan_enable_current();
>> > }
>> >
>> > return ret;
>> > }
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-14 13:12 ` Vlastimil Babka
@ 2024-10-14 14:20 ` Feng Tang
2024-10-14 20:40 ` Kees Cook
2024-11-04 11:28 ` Feng Tang
2024-10-14 20:35 ` Kees Cook
1 sibling, 2 replies; 21+ messages in thread
From: Feng Tang @ 2024-10-14 14:20 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Kees Cook, Marco Elver, Andrew Morton, Christoph Lameter,
Pekka Enberg, David Rientjes, Joonsoo Kim, Roman Gushchin,
Hyeonggon Yoo, Andrey Konovalov, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino, linux-mm@kvack.org,
kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org,
Eric Dumazet
On Mon, Oct 14, 2024 at 03:12:09PM +0200, Vlastimil Babka wrote:
> On 10/14/24 14:52, Feng Tang wrote:
> > On Mon, Oct 14, 2024 at 10:53:32AM +0200, Vlastimil Babka wrote:
> >> On 10/14/24 09:52, Feng Tang wrote:
> >> > On Fri, Oct 04, 2024 at 05:52:10PM +0800, Vlastimil Babka wrote:
> >> > Thanks for the suggestion!
> >> >
> >> > As there were error report about the NULL slab for big kmalloc object, how
> >> > about the following code for
> >> >
> >> > __do_krealloc(const void *p, size_t new_size, gfp_t flags)
> >> > {
> >> > void *ret;
> >> > size_t ks = 0;
> >> > int orig_size = 0;
> >> > struct kmem_cache *s = NULL;
> >> >
> >> > /* Check for double-free. */
> >> > if (likely(!ZERO_OR_NULL_PTR(p))) {
> >> > if (!kasan_check_byte(p))
> >> > return NULL;
> >> >
> >> > ks = ksize(p);
> >>
> >> I think this will result in __ksize() doing
> >> skip_orig_size_check(folio_slab(folio)->slab_cache, object);
> >> and we don't want that?
> >
> > I think that's fine. As later code will re-set the orig_size anyway.
>
> But you also read it first.
>
> >> > /* Some objects have no orig_size, like big kmalloc case */
> >> > if (is_kfence_address(p)) {
> >> > orig_size = kfence_ksize(p);
> >> > } else if (virt_to_slab(p)) {
> >> > s = virt_to_cache(p);
> >> > orig_size = get_orig_size(s, (void *)p);
>
> here.
Aha, you are right!
>
> >> > }
>
> >> Also the checks below repeat some of the checks of ksize().
> >
> > Yes, there is some redundancy, mostly the virt_to_slab()
> >
> >> So I think in __do_krealloc() we should do things manually to determine ks
> >> and not call ksize(). Just not break any of the cases ksize() handles
> >> (kfence, large kmalloc).
> >
> > OK, originally I tried not to expose internals of __ksize(). Let me
> > try this way.
>
> ksize() makes assumptions that a user outside of slab itself is calling it.
>
> But we (well mostly Kees) also introduced kmalloc_size_roundup() to avoid
> querying ksize() for the purposes of writing beyond the original
> kmalloc(size) up to the bucket size. So maybe we can also investigate if the
> skip_orig_size_check() mechanism can be removed now?
I did a quick grep, and fortunately it seems there are much fewer ksize()
users than before. We used to see some trouble in network code, which
is now very clean and no longer needs to skip the orig_size check. Will
check the other call sites later.
> Still I think __do_krealloc() should rather do its own thing and not call
> ksize().
Yes. I made some changes:
static __always_inline __realloc_size(2) void *
__do_krealloc(const void *p, size_t new_size, gfp_t flags)
{
void *ret;
size_t ks = 0;
int orig_size = 0;
struct kmem_cache *s = NULL;
/* Check for double-free. */
if (unlikely(ZERO_OR_NULL_PTR(p)))
goto alloc_new;
if (!kasan_check_byte(p))
return NULL;
if (is_kfence_address(p)) {
ks = orig_size = kfence_ksize(p);
} else {
struct folio *folio;
folio = virt_to_folio(p);
if (unlikely(!folio_test_slab(folio))) {
/* Big kmalloc object */
WARN_ON(folio_size(folio) <= KMALLOC_MAX_CACHE_SIZE);
WARN_ON(p != folio_address(folio));
ks = folio_size(folio);
} else {
s = folio_slab(folio)->slab_cache;
orig_size = get_orig_size(s, (void *)p);
ks = s->object_size;
}
}
/* If the old object doesn't fit, allocate a bigger one */
if (new_size > ks)
goto alloc_new;
/* Zero out spare memory. */
if (want_init_on_alloc(flags)) {
kasan_disable_current();
if (orig_size && orig_size < new_size)
memset((void *)p + orig_size, 0, new_size - orig_size);
else
memset((void *)p + new_size, 0, ks - new_size);
kasan_enable_current();
}
/* Setup kmalloc redzone when needed */
if (s && slub_debug_orig_size(s)) {
set_orig_size(s, (void *)p, new_size);
if (s->flags & SLAB_RED_ZONE && new_size < ks)
memset_no_sanitize_memory((void *)p + new_size,
SLUB_RED_ACTIVE, ks - new_size);
}
p = kasan_krealloc((void *)p, new_size, flags);
return (void *)p;
alloc_new:
ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
if (ret && p) {
/* Disable KASAN checks as the object's redzone is accessed. */
kasan_disable_current();
memcpy(ret, kasan_reset_tag(p), orig_size ?: ks);
kasan_enable_current();
}
return ret;
}
Thanks,
Feng
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-14 14:20 ` Feng Tang
@ 2024-10-14 20:40 ` Kees Cook
2024-11-04 11:28 ` Feng Tang
1 sibling, 0 replies; 21+ messages in thread
From: Kees Cook @ 2024-10-14 20:40 UTC (permalink / raw)
To: Feng Tang
Cc: Vlastimil Babka, Marco Elver, Andrew Morton, Christoph Lameter,
Pekka Enberg, David Rientjes, Joonsoo Kim, Roman Gushchin,
Hyeonggon Yoo, Andrey Konovalov, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino, linux-mm@kvack.org,
kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org,
Eric Dumazet
On Mon, Oct 14, 2024 at 10:20:36PM +0800, Feng Tang wrote:
> On Mon, Oct 14, 2024 at 03:12:09PM +0200, Vlastimil Babka wrote:
> > On 10/14/24 14:52, Feng Tang wrote:
> > > On Mon, Oct 14, 2024 at 10:53:32AM +0200, Vlastimil Babka wrote:
> > >> On 10/14/24 09:52, Feng Tang wrote:
> > > OK, originally I tried not to expose internals of __ksize(). Let me
> > > try this way.
> >
> > ksize() makes assumptions that a user outside of slab itself is calling it.
> >
> > But we (well mostly Kees) also introduced kmalloc_size_roundup() to avoid
> > querying ksize() for the purposes of writing beyond the original
> > kmalloc(size) up to the bucket size. So maybe we can also investigate if the
> > skip_orig_size_check() mechanism can be removed now?
>
> I did a quick grep, and fortunately it seems that the ksize() user are
> much less than before. We used to see some trouble in network code, which
> is now very clean without the need to skip orig_size check. Will check
> other call site later.
Right -- only things that are performing a reallocation should be using
ksize(). e.g. see __slab_build_skb()
--
Kees Cook
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-14 14:20 ` Feng Tang
2024-10-14 20:40 ` Kees Cook
@ 2024-11-04 11:28 ` Feng Tang
2024-11-04 11:45 ` Vlastimil Babka
1 sibling, 1 reply; 21+ messages in thread
From: Feng Tang @ 2024-11-04 11:28 UTC (permalink / raw)
To: Vlastimil Babka, Kees Cook
Cc: Kees Cook, Marco Elver, Andrew Morton, Christoph Lameter,
Pekka Enberg, David Rientjes, Joonsoo Kim, Roman Gushchin,
Hyeonggon Yoo, Andrey Konovalov, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino, linux-mm@kvack.org,
kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org,
Eric Dumazet
On Mon, Oct 14, 2024 at 10:20:36PM +0800, Tang, Feng wrote:
> On Mon, Oct 14, 2024 at 03:12:09PM +0200, Vlastimil Babka wrote:
> > >
> > >> So I think in __do_krealloc() we should do things manually to determine ks
> > >> and not call ksize(). Just not break any of the cases ksize() handles
> > >> (kfence, large kmalloc).
> > >
> > > OK, originally I tried not to expose internals of __ksize(). Let me
> > > try this way.
> >
> > ksize() makes assumptions that a user outside of slab itself is calling it.
> >
> > But we (well mostly Kees) also introduced kmalloc_size_roundup() to avoid
> > querying ksize() for the purposes of writing beyond the original
> > kmalloc(size) up to the bucket size. So maybe we can also investigate if the
> > skip_orig_size_check() mechanism can be removed now?
>
> I did a quick grep, and fortunately it seems that the ksize() user are
> much less than before. We used to see some trouble in network code, which
> is now very clean without the need to skip orig_size check. Will check
> other call site later.
I did a further check of ksize() usage, and there are still some
places to be handled. The one that stands out is kfree_sensitive(), and
another potential one is sound/soc/codecs/cs-amp-lib-test.c
Some details:
* Thanks to Kees Cook, who has cured many cases of ksize() as below:
drivers/base/devres.c: total_old_size = ksize(container_of(ptr, struct devres, data));
drivers/net/ethernet/intel/igb/igb_main.c: } else if (size > ksize(q_vector)) {
net/core/skbuff.c: *size = ksize(data);
net/openvswitch/flow_netlink.c: new_acts_size = max(next_offset + req_size, ksize(*sfa) * 2);
kernel/bpf/verifier.c: alloc_bytes = max(ksize(orig), kmalloc_size_roundup(bytes));
* Some callers use ksize() mostly for calculation or sanity checks,
and not for accessing that extra space, which is fine:
drivers/gpu/drm/drm_managed.c: WARN_ON(dev + 1 > (struct drm_device *) (container + ksize(container)));
lib/kunit/string-stream-test.c: actual_bytes_used = ksize(stream);
lib/kunit/string-stream-test.c: actual_bytes_used += ksize(frag_container);
lib/kunit/string-stream-test.c: actual_bytes_used += ksize(frag_container->fragment);
mm/nommu.c: return ksize(objp);
mm/util.c: memcpy(n, kasan_reset_tag(p), ksize(p));
security/tomoyo/gc.c: tomoyo_memory_used[TOMOYO_MEMORY_POLICY] -= ksize(ptr);
security/tomoyo/memory.c: const size_t s = ksize(ptr);
drivers/md/dm-vdo/memory-alloc.c: add_kmalloc_block(ksize(p));
drivers/md/dm-vdo/memory-alloc.c: add_kmalloc_block(ksize(p));
drivers/md/dm-vdo/memory-alloc.c: remove_kmalloc_block(ksize(ptr));
* One usage may need to be handled
sound/soc/codecs/cs-amp-lib-test.c: KUNIT_ASSERT_GE_MSG(test, ksize(buf), priv->cal_blob->size, "Buffer to small");
* A bigger problem is kfree_sensitive(), which will use ksize() to
get the total size and then zero all of it.
One solution for this could be to get the kmem_cache first and
do the skip_orig_size_check()
Thanks,
Feng
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-11-04 11:28 ` Feng Tang
@ 2024-11-04 11:45 ` Vlastimil Babka
2024-11-04 12:37 ` Feng Tang
0 siblings, 1 reply; 21+ messages in thread
From: Vlastimil Babka @ 2024-11-04 11:45 UTC (permalink / raw)
To: Feng Tang, Kees Cook
Cc: Marco Elver, Andrew Morton, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
Andrey Konovalov, Shuah Khan, David Gow, Danilo Krummrich,
Alexander Potapenko, Andrey Ryabinin, Dmitry Vyukov,
Vincenzo Frascino, linux-mm@kvack.org, kasan-dev@googlegroups.com,
linux-kernel@vger.kernel.org, Eric Dumazet
On 11/4/24 12:28, Feng Tang wrote:
> On Mon, Oct 14, 2024 at 10:20:36PM +0800, Tang, Feng wrote:
>> On Mon, Oct 14, 2024 at 03:12:09PM +0200, Vlastimil Babka wrote:
>> > >
>> > >> So I think in __do_krealloc() we should do things manually to determine ks
>> > >> and not call ksize(). Just not break any of the cases ksize() handles
>> > >> (kfence, large kmalloc).
>> > >
>> > > OK, originally I tried not to expose internals of __ksize(). Let me
>> > > try this way.
>> >
>> > ksize() makes assumptions that a user outside of slab itself is calling it.
>> >
>> > But we (well mostly Kees) also introduced kmalloc_size_roundup() to avoid
>> > querying ksize() for the purposes of writing beyond the original
>> > kmalloc(size) up to the bucket size. So maybe we can also investigate if the
>> > skip_orig_size_check() mechanism can be removed now?
>>
>> I did a quick grep, and fortunately it seems that the ksize() user are
>> much less than before. We used to see some trouble in network code, which
>> is now very clean without the need to skip orig_size check. Will check
>> other call site later.
>
>
> I did more further check about ksize() usage, and there are still some
> places to be handled. The thing stands out is kfree_sensitive(), and
> another potential one is sound/soc/codecs/cs-amp-lib-test.c
>
> Some details:
>
> * Thanks to Kees Cook, who has cured many cases of ksize() as below:
>
> drivers/base/devres.c: total_old_size = ksize(container_of(ptr, struct devres, data));
> drivers/net/ethernet/intel/igb/igb_main.c: } else if (size > ksize(q_vector)) {
> net/core/skbuff.c: *size = ksize(data);
> net/openvswitch/flow_netlink.c: new_acts_size = max(next_offset + req_size, ksize(*sfa) * 2);
> kernel/bpf/verifier.c: alloc_bytes = max(ksize(orig), kmalloc_size_roundup(bytes));
>
> * Some callers use ksize() mostly for calculation or sanity checks,
> and not for accessing the extra space, which is fine:
>
> drivers/gpu/drm/drm_managed.c: WARN_ON(dev + 1 > (struct drm_device *) (container + ksize(container)));
> lib/kunit/string-stream-test.c: actual_bytes_used = ksize(stream);
> lib/kunit/string-stream-test.c: actual_bytes_used += ksize(frag_container);
> lib/kunit/string-stream-test.c: actual_bytes_used += ksize(frag_container->fragment);
> mm/nommu.c: return ksize(objp);
> mm/util.c: memcpy(n, kasan_reset_tag(p), ksize(p));
> security/tomoyo/gc.c: tomoyo_memory_used[TOMOYO_MEMORY_POLICY] -= ksize(ptr);
> security/tomoyo/memory.c: const size_t s = ksize(ptr);
> drivers/md/dm-vdo/memory-alloc.c: add_kmalloc_block(ksize(p));
> drivers/md/dm-vdo/memory-alloc.c: add_kmalloc_block(ksize(p));
> drivers/md/dm-vdo/memory-alloc.c: remove_kmalloc_block(ksize(ptr));
>
> * One usage may need to be handled
>
> sound/soc/codecs/cs-amp-lib-test.c: KUNIT_ASSERT_GE_MSG(test, ksize(buf), priv->cal_blob->size, "Buffer to small");
>
> * A bigger problem is kfree_sensitive(), which uses ksize() to
> get the total size and then zeroes all of it.
>
> One solution for this could be to get the kmem_cache first, and
> do the skip_orig_size_check()
Maybe add a parameter to __ksize() that controls whether we do
skip_orig_size_check()? The current ksize() would pass "false" to it (once
the remaining wrong users are handled), while another ksize_internal()
variant would pass "true" and be used from kfree_sensitive().
> Thanks,
> Feng
^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-11-04 11:45 ` Vlastimil Babka
@ 2024-11-04 12:37 ` Feng Tang
0 siblings, 0 replies; 21+ messages in thread
From: Feng Tang @ 2024-11-04 12:37 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Kees Cook, Marco Elver, Andrew Morton, Christoph Lameter,
Pekka Enberg, David Rientjes, Joonsoo Kim, Roman Gushchin,
Hyeonggon Yoo, Andrey Konovalov, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino, linux-mm@kvack.org,
kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org,
Eric Dumazet
On Mon, Nov 04, 2024 at 12:45:51PM +0100, Vlastimil Babka wrote:
> On 11/4/24 12:28, Feng Tang wrote:
> > On Mon, Oct 14, 2024 at 10:20:36PM +0800, Tang, Feng wrote:
> >> On Mon, Oct 14, 2024 at 03:12:09PM +0200, Vlastimil Babka wrote:
> >> > >
> >> > >> So I think in __do_krealloc() we should do things manually to determine ks
> >> > >> and not call ksize(). Just not break any of the cases ksize() handles
> >> > >> (kfence, large kmalloc).
> >> > >
> >> > > OK, originally I tried not to expose internals of __ksize(). Let me
> >> > > try this way.
> >> >
> >> > ksize() makes assumptions that a user outside of slab itself is calling it.
> >> >
> >> > But we (well mostly Kees) also introduced kmalloc_size_roundup() to avoid
> >> > querying ksize() for the purposes of writing beyond the original
> >> > kmalloc(size) up to the bucket size. So maybe we can also investigate if the
> >> > skip_orig_size_check() mechanism can be removed now?
> >>
> >> I did a quick grep, and fortunately it seems that the ksize() users
> >> are much fewer than before. We used to see some trouble in network code,
> >> which is now very clean without the need to skip the orig_size check.
> >> Will check the other call sites later.
> >
> >
> > I did a further check of ksize() usage, and there are still some
> > places to be handled. The one that stands out is kfree_sensitive(), and
> > another potential one is sound/soc/codecs/cs-amp-lib-test.c
> >
> > Some details:
> >
> > * Thanks to Kees Cook, who has cured many cases of ksize() as below:
> >
> > drivers/base/devres.c: total_old_size = ksize(container_of(ptr, struct devres, data));
> > drivers/net/ethernet/intel/igb/igb_main.c: } else if (size > ksize(q_vector)) {
> > net/core/skbuff.c: *size = ksize(data);
> > net/openvswitch/flow_netlink.c: new_acts_size = max(next_offset + req_size, ksize(*sfa) * 2);
> > kernel/bpf/verifier.c: alloc_bytes = max(ksize(orig), kmalloc_size_roundup(bytes));
> >
> > * Some callers use ksize() mostly for calculation or sanity checks,
> > and not for accessing the extra space, which is fine:
> >
> > drivers/gpu/drm/drm_managed.c: WARN_ON(dev + 1 > (struct drm_device *) (container + ksize(container)));
> > lib/kunit/string-stream-test.c: actual_bytes_used = ksize(stream);
> > lib/kunit/string-stream-test.c: actual_bytes_used += ksize(frag_container);
> > lib/kunit/string-stream-test.c: actual_bytes_used += ksize(frag_container->fragment);
> > mm/nommu.c: return ksize(objp);
> > mm/util.c: memcpy(n, kasan_reset_tag(p), ksize(p));
> > security/tomoyo/gc.c: tomoyo_memory_used[TOMOYO_MEMORY_POLICY] -= ksize(ptr);
> > security/tomoyo/memory.c: const size_t s = ksize(ptr);
> > drivers/md/dm-vdo/memory-alloc.c: add_kmalloc_block(ksize(p));
> > drivers/md/dm-vdo/memory-alloc.c: add_kmalloc_block(ksize(p));
> > drivers/md/dm-vdo/memory-alloc.c: remove_kmalloc_block(ksize(ptr));
> >
> > * One usage may need to be handled
> >
> > sound/soc/codecs/cs-amp-lib-test.c: KUNIT_ASSERT_GE_MSG(test, ksize(buf), priv->cal_blob->size, "Buffer to small");
> >
> > * A bigger problem is kfree_sensitive(), which uses ksize() to
> > get the total size and then zeroes all of it.
> >
> > One solution for this could be to get the kmem_cache first, and
> > do the skip_orig_size_check()
>
> Maybe add a parameter to __ksize() that controls whether we do
> skip_orig_size_check()? The current ksize() would pass "false" to it (once
> the remaining wrong users are handled), while another ksize_internal()
> variant would pass "true" and be used from kfree_sensitive().
Sounds good to me! And for future wrong usages of ksize(), we can fix
them case by case as they are detected.
Thanks,
Feng
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled
2024-10-14 13:12 ` Vlastimil Babka
2024-10-14 14:20 ` Feng Tang
@ 2024-10-14 20:35 ` Kees Cook
1 sibling, 0 replies; 21+ messages in thread
From: Kees Cook @ 2024-10-14 20:35 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Feng Tang, Marco Elver, Andrew Morton, Christoph Lameter,
Pekka Enberg, David Rientjes, Joonsoo Kim, Roman Gushchin,
Hyeonggon Yoo, Andrey Konovalov, Shuah Khan, David Gow,
Danilo Krummrich, Alexander Potapenko, Andrey Ryabinin,
Dmitry Vyukov, Vincenzo Frascino, linux-mm@kvack.org,
kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org,
Eric Dumazet
On Mon, Oct 14, 2024 at 03:12:09PM +0200, Vlastimil Babka wrote:
> On 10/14/24 14:52, Feng Tang wrote:
> > On Mon, Oct 14, 2024 at 10:53:32AM +0200, Vlastimil Babka wrote:
> >> On 10/14/24 09:52, Feng Tang wrote:
> >> > On Fri, Oct 04, 2024 at 05:52:10PM +0800, Vlastimil Babka wrote:
> >> > Thanks for the suggestion!
> >> >
> >> > As there were error reports about the NULL slab for big kmalloc
> >> > objects, how about the following code for
> >> >
> >> > __do_krealloc(const void *p, size_t new_size, gfp_t flags)
> >> > {
> >> > void *ret;
> >> > size_t ks = 0;
> >> > int orig_size = 0;
> >> > struct kmem_cache *s = NULL;
> >> >
> >> > /* Check for double-free. */
> >> > if (likely(!ZERO_OR_NULL_PTR(p))) {
> >> > if (!kasan_check_byte(p))
> >> > return NULL;
> >> >
> >> > ks = ksize(p);
> >>
> >> I think this will result in __ksize() doing
> >> skip_orig_size_check(folio_slab(folio)->slab_cache, object);
> >> and we don't want that?
> >
> > I think that's fine. As later code will re-set the orig_size anyway.
>
> But you also read it first.
>
> >> > /* Some objects have no orig_size, like big kmalloc case */
> >> > if (is_kfence_address(p)) {
> >> > orig_size = kfence_ksize(p);
> >> > } else if (virt_to_slab(p)) {
> >> > s = virt_to_cache(p);
> >> > orig_size = get_orig_size(s, (void *)p);
>
> here.
>
> >> > }
>
> >> Also the checks below repeat some of the checks of ksize().
> >
> > Yes, there is some redundancy, mostly the virt_to_slab()
> >
> >> So I think in __do_krealloc() we should do things manually to determine ks
> >> and not call ksize(). Just not break any of the cases ksize() handles
> >> (kfence, large kmalloc).
> >
> > OK, originally I tried not to expose internals of __ksize(). Let me
> > try this way.
>
> ksize() makes assumptions that a user outside of slab itself is calling it.
>
> But we (well mostly Kees) also introduced kmalloc_size_roundup() to avoid
> querying ksize() for the purposes of writing beyond the original
> kmalloc(size) up to the bucket size. So maybe we can also investigate if the
> skip_orig_size_check() mechanism can be removed now?
>
> Still I think __do_krealloc() should rather do its own thing and not call
> ksize().
The goal was to avoid having users of the allocation APIs change the
sizes of allocations without calling into realloc. This is because
otherwise the "alloc_size" attribute that compilers use to inform
__builtin_dynamic_object_size() can get confused:
ptr = alloc(less_than_bucket_size);
...
size = ksize(ptr); /* larger size! */
memcpy(ptr, src, size); /* compiler instrumentation doesn't see that ptr "grows" */
So the callers use kmalloc_size_roundup() to just allocate the rounded
up size immediately. Internally, the allocator can do what it wants.
--
Kees Cook
^ permalink raw reply [flat|nested] 21+ messages in thread