* [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error
[not found] <CGME20250226081354epcas2p44c2f53d569296ac2e5f8a7b01f4552fa@epcas2p4.samsung.com>
@ 2025-02-26 8:11 ` Hyesoo Yu
[not found] ` <CGME20250226081357epcas2p2f4c462b215b75291a9aeeec23aa1eaca@epcas2p2.samsung.com>
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Hyesoo Yu @ 2025-02-26 8:11 UTC (permalink / raw)
Cc: janghyuck.kim, vbabka, harry.yoo, Hyesoo Yu, Christoph Lameter,
Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
Roman Gushchin, Hyeonggon Yoo, linux-mm, linux-kernel
Dear Maintainer,
The purpose is to improve the debugging capabilities of the slub allocator
when a error occurs. The following improvements have been made:
- Added WARN() calls at specific locations (slab_err, object_err) to detect
errors effectively and to generate a crash dump if panic_on_warn is enabled.
- Additionally, the error printing location in check_object has been adjusted to
display the broken data before the restoration process. This improvement
allows for a better understanding of how the data was corrupted.
This series combines two patches that were discussed seperately in the links below.
https://lore.kernel.org/linux-mm/20250120082908.4162780-1-hyesoo.yu@samsung.com/
https://lore.kernel.org/linux-mm/20250120083023.4162932-1-hyesoo.yu@samsung.com/
Thanks you.
version 2 changes
- Replaced direct calling of BUG_ON with the use of WARN() to trigger a panic.
- Modified the code to print the broken data only once before the restore.
version 3 changes
- Moved WARN() from slab_fix to slab_err and object to call WARN on all error
reporting paths.
- Changed the parameter type of check_bytes_and_report.
version 4 changes
- Modified the print format to include specific error names.
- Removed the redundant warning by removing WARN() in kmem_cache_destroy
Hyesoo Yu (2):
mm: slub: Print the broken data before restoring slub.
mm: slub: call WARN() when the slab detect an error
mm/slab_common.c | 3 ---
mm/slub.c | 63 +++++++++++++++++++++++++-----------------------
2 files changed, 33 insertions(+), 33 deletions(-)
--
2.28.0
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v4 1/2] mm: slub: Print the broken data before restoring slub.
[not found] ` <CGME20250226081357epcas2p2f4c462b215b75291a9aeeec23aa1eaca@epcas2p2.samsung.com>
@ 2025-02-26 8:12 ` Hyesoo Yu
2025-02-27 11:51 ` Harry Yoo
0 siblings, 1 reply; 14+ messages in thread
From: Hyesoo Yu @ 2025-02-26 8:12 UTC (permalink / raw)
Cc: janghyuck.kim, vbabka, harry.yoo, Hyesoo Yu, Christoph Lameter,
Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
Roman Gushchin, Hyeonggon Yoo, linux-mm, linux-kernel
Previously, the restore occured after printing the object in slub.
After commit 47d911b02cbe ("slab: make check_object() more consistent"),
the bytes are printed after the restore. This information about the bytes
before the restore is highly valuable for debugging purpose.
For instance, in a event of cache issue, it displays byte patterns
by breaking them down into 64-bytes units. Without this information,
we can only speculate on how it was broken. Hence the corrupted regions
should be printed prior to the restoration process. However if an object
breaks in multiple places, the same log may be output multiple times.
Therefore the slub log is reported only once to prevent redundant printing,
by sending a parameter indicating whether an error has occurred previously.
Changes in v4:
- Change the print format to include specific error names.
Changes in v3:
- Change the parameter type of check_bytes_and_report.
Changes in v2:
- Instead of using print_section every time on check_bytes_and_report,
just print it once for the entire slub object before the restore.
Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
---
mm/slub.c | 32 ++++++++++++++------------------
1 file changed, 14 insertions(+), 18 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index b3969d63cc04..8c13cd43c0fd 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1192,8 +1192,8 @@ static void restore_bytes(struct kmem_cache *s, char *message, u8 data,
static pad_check_attributes int
check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
- u8 *object, char *what,
- u8 *start, unsigned int value, unsigned int bytes)
+ u8 *object, char *what, u8 *start, unsigned int value,
+ unsigned int bytes, bool slab_obj_print)
{
u8 *fault;
u8 *end;
@@ -1212,10 +1212,11 @@ check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
if (slab_add_kunit_errors())
goto skip_bug_print;
- slab_bug(s, "%s overwritten", what);
- pr_err("0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
- fault, end - 1, fault - addr,
- fault[0], value);
+ pr_err("[%s overwritten] 0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
+ what, fault, end - 1, fault - addr, fault[0], value);
+
+ if (slab_obj_print)
+ object_err(s, slab, object, "Object corrupt");
skip_bug_print:
restore_bytes(s, what, value, fault, end);
@@ -1279,7 +1280,7 @@ static int check_pad_bytes(struct kmem_cache *s, struct slab *slab, u8 *p)
return 1;
return check_bytes_and_report(s, slab, p, "Object padding",
- p + off, POISON_INUSE, size_from_object(s) - off);
+ p + off, POISON_INUSE, size_from_object(s) - off, true);
}
/* Check the pad bytes at the end of a slab page */
@@ -1329,11 +1330,11 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
if (s->flags & SLAB_RED_ZONE) {
if (!check_bytes_and_report(s, slab, object, "Left Redzone",
- object - s->red_left_pad, val, s->red_left_pad))
+ object - s->red_left_pad, val, s->red_left_pad, ret))
ret = 0;
if (!check_bytes_and_report(s, slab, object, "Right Redzone",
- endobject, val, s->inuse - s->object_size))
+ endobject, val, s->inuse - s->object_size, ret))
ret = 0;
if (slub_debug_orig_size(s) && val == SLUB_RED_ACTIVE) {
@@ -1342,7 +1343,7 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
if (s->object_size > orig_size &&
!check_bytes_and_report(s, slab, object,
"kmalloc Redzone", p + orig_size,
- val, s->object_size - orig_size)) {
+ val, s->object_size - orig_size, ret)) {
ret = 0;
}
}
@@ -1350,7 +1351,7 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
if ((s->flags & SLAB_POISON) && s->object_size < s->inuse) {
if (!check_bytes_and_report(s, slab, p, "Alignment padding",
endobject, POISON_INUSE,
- s->inuse - s->object_size))
+ s->inuse - s->object_size, ret))
ret = 0;
}
}
@@ -1366,11 +1367,11 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
if (kasan_meta_size < s->object_size - 1 &&
!check_bytes_and_report(s, slab, p, "Poison",
p + kasan_meta_size, POISON_FREE,
- s->object_size - kasan_meta_size - 1))
+ s->object_size - kasan_meta_size - 1, ret))
ret = 0;
if (kasan_meta_size < s->object_size &&
!check_bytes_and_report(s, slab, p, "End Poison",
- p + s->object_size - 1, POISON_END, 1))
+ p + s->object_size - 1, POISON_END, 1, ret))
ret = 0;
}
/*
@@ -1396,11 +1397,6 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
ret = 0;
}
- if (!ret && !slab_in_kunit_test()) {
- print_trailer(s, slab, object);
- add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
- }
-
return ret;
}
--
2.28.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v4 2/2] mm: slub: call WARN() when the slab detect an error
[not found] ` <CGME20250226081359epcas2p2a6a1f3f92540660129164734fa6eaa64@epcas2p2.samsung.com>
@ 2025-02-26 8:12 ` Hyesoo Yu
2025-02-27 12:55 ` Harry Yoo
2025-02-27 14:38 ` Vlastimil Babka
0 siblings, 2 replies; 14+ messages in thread
From: Hyesoo Yu @ 2025-02-26 8:12 UTC (permalink / raw)
Cc: janghyuck.kim, vbabka, harry.yoo, Hyesoo Yu, Christoph Lameter,
Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
Roman Gushchin, Hyeonggon Yoo, linux-mm, linux-kernel
If a slab object is corrupted or an error occurs in its internal
value, continuing after restoration may cause other side effects.
At this point, it is difficult to debug because the problem occurred
in the past. It is useful to use WARN() to catch errors at the point
of issue because WARN() could trigger panic for system debugging when
panic_on_warn is enabled. WARN() is added where to detect the error
on slab_err and object_err.
It makes sense to only do the WARN() after printing the logs. slab_err
is splited to __slab_err that calls the WARN() and it is called after
printing logs.
Changes in v4:
- Remove WARN() in kmem_cache_destroy to remove redundant warning.
Changes in v3:
- move the WARN from slab_fix to slab_err, object_err and check_obj to
use WARN on all error reporting paths.
Changes in v2:
- Replace direct calling with BUG_ON with the use of WARN in slab_fix.
Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
---
mm/slab_common.c | 3 ---
mm/slub.c | 31 +++++++++++++++++++------------
2 files changed, 19 insertions(+), 15 deletions(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 477fa471da18..d13f4ffe252b 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -517,9 +517,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
kasan_cache_shutdown(s);
err = __kmem_cache_shutdown(s);
- if (!slab_in_kunit_test())
- WARN(err, "%s %s: Slab cache still has objects when called from %pS",
- __func__, s->name, (void *)_RET_IP_);
list_del(&s->list);
diff --git a/mm/slub.c b/mm/slub.c
index 8c13cd43c0fd..4961eeccf3ad 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1096,8 +1096,6 @@ static void print_trailer(struct kmem_cache *s, struct slab *slab, u8 *p)
/* Beginning of the filler is the free pointer */
print_section(KERN_ERR, "Padding ", p + off,
size_from_object(s) - off);
-
- dump_stack();
}
static void object_err(struct kmem_cache *s, struct slab *slab,
@@ -1109,6 +1107,8 @@ static void object_err(struct kmem_cache *s, struct slab *slab,
slab_bug(s, "%s", reason);
print_trailer(s, slab, object);
add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
+
+ WARN_ON(1);
}
static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab,
@@ -1125,6 +1125,14 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab,
return false;
}
+static void __slab_err(struct slab *slab)
+{
+ print_slab_info(slab);
+ add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
+
+ WARN_ON(1);
+}
+
static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab,
const char *fmt, ...)
{
@@ -1138,9 +1146,7 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab,
vsnprintf(buf, sizeof(buf), fmt, args);
va_end(args);
slab_bug(s, "%s", buf);
- print_slab_info(slab);
- dump_stack();
- add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
+ __slab_err(slab);
}
static void init_object(struct kmem_cache *s, void *object, u8 val)
@@ -1313,9 +1319,10 @@ slab_pad_check(struct kmem_cache *s, struct slab *slab)
while (end > fault && end[-1] == POISON_INUSE)
end--;
- slab_err(s, slab, "Padding overwritten. 0x%p-0x%p @offset=%tu",
- fault, end - 1, fault - start);
+ slab_bug(s, "Padding overwritten. 0x%p-0x%p @offset=%tu",
+ fault, end - 1, fault - start);
print_section(KERN_ERR, "Padding ", pad, remainder);
+ __slab_err(slab);
restore_bytes(s, "slab padding", POISON_INUSE, fault, end);
}
@@ -5428,14 +5435,13 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
return !!oo_objects(s->oo);
}
-static void list_slab_objects(struct kmem_cache *s, struct slab *slab,
- const char *text)
+static void list_slab_objects(struct kmem_cache *s, struct slab *slab)
{
#ifdef CONFIG_SLUB_DEBUG
void *addr = slab_address(slab);
void *p;
- slab_err(s, slab, text, s->name);
+ slab_bug(s, "Objects remaining on __kmem_cache_shutdown()");
spin_lock(&object_map_lock);
__fill_map(object_map, s, slab);
@@ -5450,6 +5456,8 @@ static void list_slab_objects(struct kmem_cache *s, struct slab *slab,
}
}
spin_unlock(&object_map_lock);
+
+ __slab_err(slab);
#endif
}
@@ -5470,8 +5478,7 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
remove_partial(n, slab);
list_add(&slab->slab_list, &discard);
} else {
- list_slab_objects(s, slab,
- "Objects remaining in %s on __kmem_cache_shutdown()");
+ list_slab_objects(s, slab);
}
}
spin_unlock_irq(&n->list_lock);
--
2.28.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v4 1/2] mm: slub: Print the broken data before restoring slub.
2025-02-26 8:12 ` [PATCH v4 1/2] mm: slub: Print the broken data before restoring slub Hyesoo Yu
@ 2025-02-27 11:51 ` Harry Yoo
2025-02-27 12:36 ` Harry Yoo
0 siblings, 1 reply; 14+ messages in thread
From: Harry Yoo @ 2025-02-27 11:51 UTC (permalink / raw)
To: Hyesoo Yu
Cc: janghyuck.kim, vbabka, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
On Wed, Feb 26, 2025 at 05:12:00PM +0900, Hyesoo Yu wrote:
> Previously, the restore occured after printing the object in slub.
> After commit 47d911b02cbe ("slab: make check_object() more consistent"),
> the bytes are printed after the restore. This information about the bytes
> before the restore is highly valuable for debugging purpose.
> For instance, in a event of cache issue, it displays byte patterns
> by breaking them down into 64-bytes units. Without this information,
> we can only speculate on how it was broken. Hence the corrupted regions
> should be printed prior to the restoration process. However if an object
> breaks in multiple places, the same log may be output multiple times.
> Therefore the slub log is reported only once to prevent redundant printing,
> by sending a parameter indicating whether an error has occurred previously.
>
> Changes in v4:
> - Change the print format to include specific error names.
>
> Changes in v3:
> - Change the parameter type of check_bytes_and_report.
>
> Changes in v2:
> - Instead of using print_section every time on check_bytes_and_report,
> just print it once for the entire slub object before the restore.
IMHO it is not a good practice to include patch version changes
in the changelog, because the changelog should make sense on its own.
> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
> ---
I think that's why people usually place patch version log just below '---' line.
(More details can be found in the "Submitting patches" documentation
https://docs.kernel.org/process/submitting-patches.html#commentary)
Anyway, the code itself looks good to me (with a nit below).
Please feel free to add:
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> mm/slub.c | 32 ++++++++++++++------------------
> 1 file changed, 14 insertions(+), 18 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index b3969d63cc04..8c13cd43c0fd 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1192,8 +1192,8 @@ static void restore_bytes(struct kmem_cache *s, char *message, u8 data,
>
> static pad_check_attributes int
> check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
> - u8 *object, char *what,
> - u8 *start, unsigned int value, unsigned int bytes)
> + u8 *object, char *what, u8 *start, unsigned int value,
> + unsigned int bytes, bool slab_obj_print)
> {
> u8 *fault;
> u8 *end;
> @@ -1212,10 +1212,11 @@ check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
> if (slab_add_kunit_errors())
> goto skip_bug_print;
>
> - slab_bug(s, "%s overwritten", what);
> - pr_err("0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
> - fault, end - 1, fault - addr,
> - fault[0], value);
> + pr_err("[%s overwritten] 0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
> + what, fault, end - 1, fault - addr, fault[0], value);
> +
> + if (slab_obj_print)
> + object_err(s, slab, object, "Object corrupt");
>
> skip_bug_print:
> restore_bytes(s, what, value, fault, end);
> @@ -1279,7 +1280,7 @@ static int check_pad_bytes(struct kmem_cache *s, struct slab *slab, u8 *p)
> return 1;
>
> return check_bytes_and_report(s, slab, p, "Object padding",
> - p + off, POISON_INUSE, size_from_object(s) - off);
> + p + off, POISON_INUSE, size_from_object(s) - off, true);
> }
>
> /* Check the pad bytes at the end of a slab page */
> @@ -1329,11 +1330,11 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
>
> if (s->flags & SLAB_RED_ZONE) {
> if (!check_bytes_and_report(s, slab, object, "Left Redzone",
> - object - s->red_left_pad, val, s->red_left_pad))
> + object - s->red_left_pad, val, s->red_left_pad, ret))
> ret = 0;
>
> if (!check_bytes_and_report(s, slab, object, "Right Redzone",
> - endobject, val, s->inuse - s->object_size))
> + endobject, val, s->inuse - s->object_size, ret))
> ret = 0;
>
> if (slub_debug_orig_size(s) && val == SLUB_RED_ACTIVE) {
> @@ -1342,7 +1343,7 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
> if (s->object_size > orig_size &&
> !check_bytes_and_report(s, slab, object,
> "kmalloc Redzone", p + orig_size,
> - val, s->object_size - orig_size)) {
> + val, s->object_size - orig_size, ret)) {
> ret = 0;
> }
> }
> @@ -1350,7 +1351,7 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
> if ((s->flags & SLAB_POISON) && s->object_size < s->inuse) {
> if (!check_bytes_and_report(s, slab, p, "Alignment padding",
> endobject, POISON_INUSE,
> - s->inuse - s->object_size))
> + s->inuse - s->object_size, ret))
> ret = 0;
> }
> }
> @@ -1366,11 +1367,11 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
> if (kasan_meta_size < s->object_size - 1 &&
> !check_bytes_and_report(s, slab, p, "Poison",
> p + kasan_meta_size, POISON_FREE,
> - s->object_size - kasan_meta_size - 1))
> + s->object_size - kasan_meta_size - 1, ret))
> ret = 0;
> if (kasan_meta_size < s->object_size &&
> !check_bytes_and_report(s, slab, p, "End Poison",
> - p + s->object_size - 1, POISON_END, 1))
> + p + s->object_size - 1, POISON_END, 1, ret))
> ret = 0;
> }
> /*
> @@ -1396,11 +1397,6 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
> ret = 0;
> }
>
> - if (!ret && !slab_in_kunit_test()) {
nit: check_object() was the only user of slab_in_kunit_test().
Can we remove it altogether?
--
Cheers,
Harry
> - print_trailer(s, slab, object);
> - add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> - }
> -
> return ret;
> }
>
> --
> 2.28.0
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error
2025-02-26 8:11 ` [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error Hyesoo Yu
[not found] ` <CGME20250226081357epcas2p2f4c462b215b75291a9aeeec23aa1eaca@epcas2p2.samsung.com>
[not found] ` <CGME20250226081359epcas2p2a6a1f3f92540660129164734fa6eaa64@epcas2p2.samsung.com>
@ 2025-02-27 11:53 ` Harry Yoo
2025-02-27 16:12 ` Vlastimil Babka
3 siblings, 0 replies; 14+ messages in thread
From: Harry Yoo @ 2025-02-27 11:53 UTC (permalink / raw)
To: Hyesoo Yu
Cc: janghyuck.kim, vbabka, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
On Wed, Feb 26, 2025 at 05:11:59PM +0900, Hyesoo Yu wrote:
> Dear Maintainer,
>
> The purpose is to improve the debugging capabilities of the slub allocator
> when a error occurs. The following improvements have been made:
>
> - Added WARN() calls at specific locations (slab_err, object_err) to detect
> errors effectively and to generate a crash dump if panic_on_warn is enabled.
>
> - Additionally, the error printing location in check_object has been adjusted to
> display the broken data before the restoration process. This improvement
> allows for a better understanding of how the data was corrupted.
>
> This series combines two patches that were discussed seperately in the links below.
> https://urldefense.com/v3/__https://lore.kernel.org/linux-mm/20250120082908.4162780-1-hyesoo.yu@samsung.com/__;!!ACWV5N9M2RV99hQ!JpvsczvJJcu4xw6xseDcLQJyiNXgZmwubb5cXEfORBj3VslI2ZTgmipoW7pdQ6qTldrr0mnk2l99xw3nio0$
> https://urldefense.com/v3/__https://lore.kernel.org/linux-mm/20250120083023.4162932-1-hyesoo.yu@samsung.com/__;!!ACWV5N9M2RV99hQ!JpvsczvJJcu4xw6xseDcLQJyiNXgZmwubb5cXEfORBj3VslI2ZTgmipoW7pdQ6qTldrr0mnk2l99Cdp4khE$
IMHO it will be helpful if the cover letter includes error reporting output
before and after this patch series.
--
Cheers,
Harry
> Thanks you.
>
> version 2 changes
> - Replaced direct calling of BUG_ON with the use of WARN() to trigger a panic.
> - Modified the code to print the broken data only once before the restore.
>
> version 3 changes
> - Moved WARN() from slab_fix to slab_err and object to call WARN on all error
> reporting paths.
> - Changed the parameter type of check_bytes_and_report.
>
> version 4 changes
> - Modified the print format to include specific error names.
> - Removed the redundant warning by removing WARN() in kmem_cache_destroy
>
> Hyesoo Yu (2):
> mm: slub: Print the broken data before restoring slub.
> mm: slub: call WARN() when the slab detect an error
>
> mm/slab_common.c | 3 ---
> mm/slub.c | 63 +++++++++++++++++++++++++-----------------------
> 2 files changed, 33 insertions(+), 33 deletions(-)
>
> --
> 2.28.0
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 1/2] mm: slub: Print the broken data before restoring slub.
2025-02-27 11:51 ` Harry Yoo
@ 2025-02-27 12:36 ` Harry Yoo
0 siblings, 0 replies; 14+ messages in thread
From: Harry Yoo @ 2025-02-27 12:36 UTC (permalink / raw)
To: Hyesoo Yu
Cc: janghyuck.kim, vbabka, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
On Thu, Feb 27, 2025 at 08:51:19PM +0900, Harry Yoo wrote:
> On Wed, Feb 26, 2025 at 05:12:00PM +0900, Hyesoo Yu wrote:
> > @@ -1396,11 +1397,6 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
> > ret = 0;
> > }
> >
> > - if (!ret && !slab_in_kunit_test()) {
>
> nit: check_object() was the only user of slab_in_kunit_test().
> Can we remove it altogether?
Uh, there is another user in mm/slab_common. But it is also removed
in patch 2. So can be removed in patch 2.
--
Cheers,
Harry
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 2/2] mm: slub: call WARN() when the slab detect an error
2025-02-26 8:12 ` [PATCH v4 2/2] mm: slub: call WARN() when the slab detect an error Hyesoo Yu
@ 2025-02-27 12:55 ` Harry Yoo
2025-02-27 15:18 ` Vlastimil Babka
2025-02-27 14:38 ` Vlastimil Babka
1 sibling, 1 reply; 14+ messages in thread
From: Harry Yoo @ 2025-02-27 12:55 UTC (permalink / raw)
To: Hyesoo Yu
Cc: janghyuck.kim, vbabka, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
On Wed, Feb 26, 2025 at 05:12:01PM +0900, Hyesoo Yu wrote:
> If a slab object is corrupted or an error occurs in its internal
> value, continuing after restoration may cause other side effects.
> At this point, it is difficult to debug because the problem occurred
> in the past. It is useful to use WARN() to catch errors at the point
> of issue because WARN() could trigger panic for system debugging when
> panic_on_warn is enabled. WARN() is added where to detect the error
> on slab_err and object_err.
>
> It makes sense to only do the WARN() after printing the logs. slab_err
> is splited to __slab_err that calls the WARN() and it is called after
> printing logs.
>
> Changes in v4:
> - Remove WARN() in kmem_cache_destroy to remove redundant warning.
>
> Changes in v3:
> - move the WARN from slab_fix to slab_err, object_err and check_obj to
> use WARN on all error reporting paths.
>
> Changes in v2:
> - Replace direct calling with BUG_ON with the use of WARN in slab_fix.
Same here, please rephrase the changelog without "Changes in vN" in the
changelog, and move the patch version changes below "---" line.
> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
> ---
Otherwise this change in general looks good to me (with a suggestion below).
Please feel free to add:
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
# Suggestion
There's a case where SLUB just calls pr_err() and dump_stack() instead of
slab_err() or object_err() in free_consistency_checks().
I would like to add something like this:
diff --git a/mm/slub.c b/mm/slub.c
index b7660ee85db0..d5609fa63da4 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1022,12 +1022,16 @@ static void slab_bug(struct kmem_cache *s, char *fmt, ...)
{
struct va_format vaf;
va_list args;
+ const char *name = "<unknown>";
+
+ if (s)
+ name = s->name;
va_start(args, fmt);
vaf.fmt = fmt;
vaf.va = &args;
pr_err("=============================================================================\n");
- pr_err("BUG %s (%s): %pV\n", s->name, print_tainted(), &vaf);
+ pr_err("BUG %s (%s): %pV\n", name, print_tainted(), &vaf);
pr_err("-----------------------------------------------------------------------------\n\n");
va_end(args);
}
@@ -1628,9 +1632,8 @@ static inline int free_consistency_checks(struct kmem_cache *s,
slab_err(s, slab, "Attempt to free object(0x%p) outside of slab",
object);
} else if (!slab->slab_cache) {
- pr_err("SLUB <none>: no slab for object 0x%p.\n",
- object);
- dump_stack();
+ slab_err(NULL, slab, "No slab for object 0x%p",
+ object);
} else
object_err(s, slab, object,
"page slab pointer corrupt.");
--
Cheers,
Harry
> mm/slab_common.c | 3 ---
> mm/slub.c | 31 +++++++++++++++++++------------
> 2 files changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 477fa471da18..d13f4ffe252b 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -517,9 +517,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
> kasan_cache_shutdown(s);
>
> err = __kmem_cache_shutdown(s);
> - if (!slab_in_kunit_test())
> - WARN(err, "%s %s: Slab cache still has objects when called from %pS",
> - __func__, s->name, (void *)_RET_IP_);
>
> list_del(&s->list);
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 8c13cd43c0fd..4961eeccf3ad 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1096,8 +1096,6 @@ static void print_trailer(struct kmem_cache *s, struct slab *slab, u8 *p)
> /* Beginning of the filler is the free pointer */
> print_section(KERN_ERR, "Padding ", p + off,
> size_from_object(s) - off);
> -
> - dump_stack();
> }
>
> static void object_err(struct kmem_cache *s, struct slab *slab,
> @@ -1109,6 +1107,8 @@ static void object_err(struct kmem_cache *s, struct slab *slab,
> slab_bug(s, "%s", reason);
> print_trailer(s, slab, object);
> add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> +
> + WARN_ON(1);
> }
>
> static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab,
> @@ -1125,6 +1125,14 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab,
> return false;
> }
>
> +static void __slab_err(struct slab *slab)
> +{
> + print_slab_info(slab);
> + add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> +
> + WARN_ON(1);
> +}
> +
> static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab,
> const char *fmt, ...)
> {
> @@ -1138,9 +1146,7 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab,
> vsnprintf(buf, sizeof(buf), fmt, args);
> va_end(args);
> slab_bug(s, "%s", buf);
> - print_slab_info(slab);
> - dump_stack();
> - add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> + __slab_err(slab);
> }
>
> static void init_object(struct kmem_cache *s, void *object, u8 val)
> @@ -1313,9 +1319,10 @@ slab_pad_check(struct kmem_cache *s, struct slab *slab)
> while (end > fault && end[-1] == POISON_INUSE)
> end--;
>
> - slab_err(s, slab, "Padding overwritten. 0x%p-0x%p @offset=%tu",
> - fault, end - 1, fault - start);
> + slab_bug(s, "Padding overwritten. 0x%p-0x%p @offset=%tu",
> + fault, end - 1, fault - start);
> print_section(KERN_ERR, "Padding ", pad, remainder);
> + __slab_err(slab);
>
> restore_bytes(s, "slab padding", POISON_INUSE, fault, end);
> }
> @@ -5428,14 +5435,13 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
> return !!oo_objects(s->oo);
> }
>
> -static void list_slab_objects(struct kmem_cache *s, struct slab *slab,
> - const char *text)
> +static void list_slab_objects(struct kmem_cache *s, struct slab *slab)
> {
> #ifdef CONFIG_SLUB_DEBUG
> void *addr = slab_address(slab);
> void *p;
>
> - slab_err(s, slab, text, s->name);
> + slab_bug(s, "Objects remaining on __kmem_cache_shutdown()");
>
> spin_lock(&object_map_lock);
> __fill_map(object_map, s, slab);
> @@ -5450,6 +5456,8 @@ static void list_slab_objects(struct kmem_cache *s, struct slab *slab,
> }
> }
> spin_unlock(&object_map_lock);
> +
> + __slab_err(slab);
> #endif
> }
>
> @@ -5470,8 +5478,7 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
> remove_partial(n, slab);
> list_add(&slab->slab_list, &discard);
> } else {
> - list_slab_objects(s, slab,
> - "Objects remaining in %s on __kmem_cache_shutdown()");
> + list_slab_objects(s, slab);
> }
> }
> spin_unlock_irq(&n->list_lock);
> --
> 2.28.0
>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v4 2/2] mm: slub: call WARN() when the slab detect an error
2025-02-26 8:12 ` [PATCH v4 2/2] mm: slub: call WARN() when the slab detect an error Hyesoo Yu
2025-02-27 12:55 ` Harry Yoo
@ 2025-02-27 14:38 ` Vlastimil Babka
1 sibling, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2025-02-27 14:38 UTC (permalink / raw)
To: Hyesoo Yu
Cc: janghyuck.kim, harry.yoo, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
On 2/26/25 09:12, Hyesoo Yu wrote:
> If a slab object is corrupted or an error occurs in its internal
> value, continuing after restoration may cause other side effects.
> At this point, it is difficult to debug because the problem occurred
> in the past. It is useful to use WARN() to catch errors at the point
> of issue because WARN() could trigger panic for system debugging when
> panic_on_warn is enabled. WARN() is added where to detect the error
> on slab_err and object_err.
>
> It makes sense to only do the WARN() after printing the logs. slab_err
> is splited to __slab_err that calls the WARN() and it is called after
> printing logs.
>
> Changes in v4:
> - Remove WARN() in kmem_cache_destroy to remove redundant warning.
>
> Changes in v3:
> - move the WARN from slab_fix to slab_err, object_err and check_obj to
> use WARN on all error reporting paths.
>
> Changes in v2:
> - Replace direct calling with BUG_ON with the use of WARN in slab_fix.
>
> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
As Harry said. I'll remove that locally.
> ---
> mm/slab_common.c | 3 ---
> mm/slub.c | 31 +++++++++++++++++++------------
> 2 files changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 477fa471da18..d13f4ffe252b 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -517,9 +517,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
> kasan_cache_shutdown(s);
>
> err = __kmem_cache_shutdown(s);
> - if (!slab_in_kunit_test())
> - WARN(err, "%s %s: Slab cache still has objects when called from %pS",
> - __func__, s->name, (void *)_RET_IP_);
I think I'll keep this one, because the more detailed warning via
list_slab_objects() is only enabled with CONFIG_SLUB_DEBUG.
If it's not enabled, the kmem_cache_destroy() failure rather should not be
silent.
So slab_in_kunit_test() would also stay.
> } else {
> - list_slab_objects(s, slab,
> - "Objects remaining in %s on __kmem_cache_shutdown()");
> + list_slab_objects(s, slab);
I tried to extract slab_bug() and __slab_err() from list_slab_objects() but
they were also only available with CONFIG_SLUB_DEBUG.
Perhaps we can improve that, but as a follow-up cleanup so we don't hold
this up further.
> }
> }
> spin_unlock_irq(&n->list_lock);
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 2/2] mm: slub: call WARN() when the slab detect an error
2025-02-27 12:55 ` Harry Yoo
@ 2025-02-27 15:18 ` Vlastimil Babka
0 siblings, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2025-02-27 15:18 UTC (permalink / raw)
To: Harry Yoo, Hyesoo Yu
Cc: janghyuck.kim, Christoph Lameter, Pekka Enberg, David Rientjes,
Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
linux-mm, linux-kernel
On 2/27/25 13:55, Harry Yoo wrote:
> On Wed, Feb 26, 2025 at 05:12:01PM +0900, Hyesoo Yu wrote:
>> If a slab object is corrupted or an error occurs in its internal
>> value, continuing after restoration may cause other side effects.
>> At this point, it is difficult to debug because the problem occurred
>> in the past. It is useful to use WARN() to catch errors at the point
>> of issue because WARN() could trigger panic for system debugging when
>> panic_on_warn is enabled. WARN() is added where to detect the error
>> on slab_err and object_err.
>>
>> It makes sense to only do the WARN() after printing the logs. slab_err
>> is splited to __slab_err that calls the WARN() and it is called after
>> printing logs.
>>
>> Changes in v4:
>> - Remove WARN() in kmem_cache_destroy to remove redundant warning.
>>
>> Changes in v3:
>> - move the WARN from slab_fix to slab_err, object_err and check_obj to
>> use WARN on all error reporting paths.
>>
>> Changes in v2:
>> - Replace direct calling with BUG_ON with the use of WARN in slab_fix.
>
> Same here, please rephrase the changelog without "Changes in vN" in the
> changelog, and move the patch version changes below "---" line.
>
>> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
>> ---
>
> Otherwise this change in general looks good to me (with a suggestion below).
> Please feel free to add:
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>
> # Suggestion
>
> There's a case where SLUB just calls pr_err() and dump_stack() instead of
> slab_err() or object_err() in free_consistency_checks().
>
> I would like to add something like this:
>
> diff --git a/mm/slub.c b/mm/slub.c
> index b7660ee85db0..d5609fa63da4 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1022,12 +1022,16 @@ static void slab_bug(struct kmem_cache *s, char *fmt, ...)
> {
> struct va_format vaf;
> va_list args;
> + const char *name = "<unknown>";
> +
> + if (s)
> + name = s->name;
>
> va_start(args, fmt);
> vaf.fmt = fmt;
> vaf.va = &args;
> pr_err("=============================================================================\n");
> - pr_err("BUG %s (%s): %pV\n", s->name, print_tainted(), &vaf);
> + pr_err("BUG %s (%s): %pV\n", name, print_tainted(), &vaf);
s ? s->name : "<unknown>"
more concise
> pr_err("-----------------------------------------------------------------------------\n\n");
> va_end(args);
> }
> @@ -1628,9 +1632,8 @@ static inline int free_consistency_checks(struct kmem_cache *s,
> slab_err(s, slab, "Attempt to free object(0x%p) outside of slab",
> object);
> } else if (!slab->slab_cache) {
> - pr_err("SLUB <none>: no slab for object 0x%p.\n",
> - object);
> - dump_stack();
> + slab_err(NULL, slab, "No slab for object 0x%p",
> + object);
Good suggestion, added that locally. Probably not likely to trigger as a
use-after-free would mean we're rather hit !folio_test_slab() above than a
folio that has a slab flag but has a NULL pointer (or the pointer might be
garbage and not NULL). But at least the error handling will be consistent.
Changed the error message to "No slab cache" as that's more accurate.
Thanks.
> } else
> object_err(s, slab, object,
> "page slab pointer corrupt.");
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error
2025-02-26 8:11 ` [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error Hyesoo Yu
` (2 preceding siblings ...)
2025-02-27 11:53 ` [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error Harry Yoo
@ 2025-02-27 16:12 ` Vlastimil Babka
2025-02-27 16:26 ` Vlastimil Babka
3 siblings, 1 reply; 14+ messages in thread
From: Vlastimil Babka @ 2025-02-27 16:12 UTC (permalink / raw)
To: Hyesoo Yu
Cc: janghyuck.kim, harry.yoo, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
On 2/26/25 09:11, Hyesoo Yu wrote:
> Dear Maintainer,
>
> The purpose is to improve the debugging capabilities of the slub allocator
> when a error occurs. The following improvements have been made:
>
> - Added WARN() calls at specific locations (slab_err, object_err) to detect
> errors effectively and to generate a crash dump if panic_on_warn is enabled.
>
> - Additionally, the error printing location in check_object has been adjusted to
> display the broken data before the restoration process. This improvement
> allows for a better understanding of how the data was corrupted.
>
> This series combines two patches that were discussed seperately in the links below.
> https://lore.kernel.org/linux-mm/20250120082908.4162780-1-hyesoo.yu@samsung.com/
> https://lore.kernel.org/linux-mm/20250120083023.4162932-1-hyesoo.yu@samsung.com/
>
> Thanks you.
Thanks. On top of things already mentioned, I added some kunit suppressions
in patch 2. Please check the result:
https://web.git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/log/?h=slab/for-6.15/fixes-cleanups
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error
2025-02-27 16:12 ` Vlastimil Babka
@ 2025-02-27 16:26 ` Vlastimil Babka
2025-02-28 12:47 ` Harry Yoo
0 siblings, 1 reply; 14+ messages in thread
From: Vlastimil Babka @ 2025-02-27 16:26 UTC (permalink / raw)
To: Hyesoo Yu
Cc: janghyuck.kim, harry.yoo, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
On 2/27/25 17:12, Vlastimil Babka wrote:
> On 2/26/25 09:11, Hyesoo Yu wrote:
>> Dear Maintainer,
>>
>> The purpose is to improve the debugging capabilities of the slub allocator
>> when a error occurs. The following improvements have been made:
>>
>> - Added WARN() calls at specific locations (slab_err, object_err) to detect
>> errors effectively and to generate a crash dump if panic_on_warn is enabled.
>>
>> - Additionally, the error printing location in check_object has been adjusted to
>> display the broken data before the restoration process. This improvement
>> allows for a better understanding of how the data was corrupted.
>>
>> This series combines two patches that were discussed seperately in the links below.
>> https://lore.kernel.org/linux-mm/20250120082908.4162780-1-hyesoo.yu@samsung.com/
>> https://lore.kernel.org/linux-mm/20250120083023.4162932-1-hyesoo.yu@samsung.com/
>>
>> Thanks you.
>
> Thanks. On top of things already mentioned, I added some kunit suppressions
> in patch 2. Please check the result:
>
> https://web.git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/log/?h=slab/for-6.15/fixes-cleanups
What do you think about the following patch on top?
---8<---
From c38dadde6293cacdb91f95afc3615c22dec5830a Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Thu, 27 Feb 2025 16:05:46 +0100
Subject: [PATCH] mm, slab: cleanup slab_bug() parameters
slab_err() has variadic printf arguments but instead of passing them to
slab_bug() it does vsnprintf() to a buffer and passes %s, buf.
To allow passing them directly, turn slab_bug() to __slab_bug() with a
va_list parameter, and slab_bug() a wrapper with fmt, ... parameters.
Then slab_err() can call __slab_bug() without the intermediate buffer.
Also constify fmt everywhere, which also simplifies object_err()'s
call to slab_bug().
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/slub.c | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index a9a02b4ae4d6..d94af020b305 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1017,12 +1017,12 @@ void skip_orig_size_check(struct kmem_cache *s, const void *object)
set_orig_size(s, (void *)object, s->object_size);
}
-static void slab_bug(struct kmem_cache *s, char *fmt, ...)
+static void __slab_bug(struct kmem_cache *s, const char *fmt, va_list argsp)
{
struct va_format vaf;
va_list args;
- va_start(args, fmt);
+ va_copy(args, argsp);
vaf.fmt = fmt;
vaf.va = &args;
pr_err("=============================================================================\n");
@@ -1031,8 +1031,17 @@ static void slab_bug(struct kmem_cache *s, char *fmt, ...)
va_end(args);
}
+static void slab_bug(struct kmem_cache *s, const char *fmt, ...)
+{
+ va_list args;
+
+ va_start(args, fmt);
+ __slab_bug(s, fmt, args);
+ va_end(args);
+}
+
__printf(2, 3)
-static void slab_fix(struct kmem_cache *s, char *fmt, ...)
+static void slab_fix(struct kmem_cache *s, const char *fmt, ...)
{
struct va_format vaf;
va_list args;
@@ -1088,12 +1097,12 @@ static void print_trailer(struct kmem_cache *s, struct slab *slab, u8 *p)
}
static void object_err(struct kmem_cache *s, struct slab *slab,
- u8 *object, char *reason)
+ u8 *object, const char *reason)
{
if (slab_add_kunit_errors())
return;
- slab_bug(s, "%s", reason);
+ slab_bug(s, reason);
print_trailer(s, slab, object);
add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
@@ -1129,15 +1138,14 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab,
const char *fmt, ...)
{
va_list args;
- char buf[100];
if (slab_add_kunit_errors())
return;
va_start(args, fmt);
- vsnprintf(buf, sizeof(buf), fmt, args);
+ __slab_bug(s, fmt, args);
va_end(args);
- slab_bug(s, "%s", buf);
+
__slab_err(slab);
}
@@ -1175,7 +1183,7 @@ static void init_object(struct kmem_cache *s, void *object, u8 val)
s->inuse - poison_size);
}
-static void restore_bytes(struct kmem_cache *s, char *message, u8 data,
+static void restore_bytes(struct kmem_cache *s, const char *message, u8 data,
void *from, void *to)
{
slab_fix(s, "Restoring %s 0x%p-0x%p=0x%x", message, from, to - 1, data);
@@ -1190,7 +1198,7 @@ static void restore_bytes(struct kmem_cache *s, char *message, u8 data,
static pad_check_attributes int
check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
- u8 *object, char *what, u8 *start, unsigned int value,
+ u8 *object, const char *what, u8 *start, unsigned int value,
unsigned int bytes, bool slab_obj_print)
{
u8 *fault;
--
2.48.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error
2025-02-27 16:26 ` Vlastimil Babka
@ 2025-02-28 12:47 ` Harry Yoo
2025-02-28 16:02 ` Vlastimil Babka
0 siblings, 1 reply; 14+ messages in thread
From: Harry Yoo @ 2025-02-28 12:47 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Hyesoo Yu, janghyuck.kim, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
On Thu, Feb 27, 2025 at 05:26:26PM +0100, Vlastimil Babka wrote:
> On 2/27/25 17:12, Vlastimil Babka wrote:
> > On 2/26/25 09:11, Hyesoo Yu wrote:
> >> Dear Maintainer,
> >>
> >> The purpose is to improve the debugging capabilities of the slub allocator
> >> when a error occurs. The following improvements have been made:
> >>
> >> - Added WARN() calls at specific locations (slab_err, object_err) to detect
> >> errors effectively and to generate a crash dump if panic_on_warn is enabled.
> >>
> >> - Additionally, the error printing location in check_object has been adjusted to
> >> display the broken data before the restoration process. This improvement
> >> allows for a better understanding of how the data was corrupted.
> >>
> >> This series combines two patches that were discussed seperately in the links below.
> >> https://lore.kernel.org/linux-mm/20250120082908.4162780-1-hyesoo.yu@samsung.com/
> >> https://lore.kernel.org/linux-mm/20250120083023.4162932-1-hyesoo.yu@samsung.com/
> >>
> >> Thanks you.
> >
> > Thanks. On top of things already mentioned, I added some kunit suppressions
> > in patch 2. Please check the result:
> >
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/log/?h=slab/for-6.15/fixes-cleanups
>
> What do you think about the following patch on top?
>
> ---8<---
> From c38dadde6293cacdb91f95afc3615c22dec5830a Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Thu, 27 Feb 2025 16:05:46 +0100
> Subject: [PATCH] mm, slab: cleanup slab_bug() parameters
>
> slab_err() has variadic printf arguments but instead of passing them to
> slab_bug() it does vsnprintf() to a buffer and passes %s, buf.
>
> To allow passing them directly, turn slab_bug() to __slab_bug() with a
> va_list parameter, and slab_bug() a wrapper with fmt, ... parameters.
> Then slab_err() can call __slab_bug() without the intermediate buffer.
>
> Also constify fmt everywhere, which also simplifies object_err()'s
> call to slab_bug().
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
Looks good to me.
FWIW,
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
--
Cheers,
Harry
> mm/slub.c | 28 ++++++++++++++++++----------
> 1 file changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index a9a02b4ae4d6..d94af020b305 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1017,12 +1017,12 @@ void skip_orig_size_check(struct kmem_cache *s, const void *object)
> set_orig_size(s, (void *)object, s->object_size);
> }
>
> -static void slab_bug(struct kmem_cache *s, char *fmt, ...)
> +static void __slab_bug(struct kmem_cache *s, const char *fmt, va_list argsp)
> {
> struct va_format vaf;
> va_list args;
>
> - va_start(args, fmt);
> + va_copy(args, argsp);
> vaf.fmt = fmt;
> vaf.va = &args;
> pr_err("=============================================================================\n");
> @@ -1031,8 +1031,17 @@ static void slab_bug(struct kmem_cache *s, char *fmt, ...)
> va_end(args);
> }
>
> +static void slab_bug(struct kmem_cache *s, const char *fmt, ...)
> +{
> + va_list args;
> +
> + va_start(args, fmt);
> + __slab_bug(s, fmt, args);
> + va_end(args);
> +}
> +
> __printf(2, 3)
> -static void slab_fix(struct kmem_cache *s, char *fmt, ...)
> +static void slab_fix(struct kmem_cache *s, const char *fmt, ...)
> {
> struct va_format vaf;
> va_list args;
> @@ -1088,12 +1097,12 @@ static void print_trailer(struct kmem_cache *s, struct slab *slab, u8 *p)
> }
>
> static void object_err(struct kmem_cache *s, struct slab *slab,
> - u8 *object, char *reason)
> + u8 *object, const char *reason)
> {
> if (slab_add_kunit_errors())
> return;
>
> - slab_bug(s, "%s", reason);
> + slab_bug(s, reason);
> print_trailer(s, slab, object);
> add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
>
> @@ -1129,15 +1138,14 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab,
> const char *fmt, ...)
> {
> va_list args;
> - char buf[100];
>
> if (slab_add_kunit_errors())
> return;
>
> va_start(args, fmt);
> - vsnprintf(buf, sizeof(buf), fmt, args);
> + __slab_bug(s, fmt, args);
> va_end(args);
> - slab_bug(s, "%s", buf);
> +
> __slab_err(slab);
> }
>
> @@ -1175,7 +1183,7 @@ static void init_object(struct kmem_cache *s, void *object, u8 val)
> s->inuse - poison_size);
> }
>
> -static void restore_bytes(struct kmem_cache *s, char *message, u8 data,
> +static void restore_bytes(struct kmem_cache *s, const char *message, u8 data,
> void *from, void *to)
> {
> slab_fix(s, "Restoring %s 0x%p-0x%p=0x%x", message, from, to - 1, data);
> @@ -1190,7 +1198,7 @@ static void restore_bytes(struct kmem_cache *s, char *message, u8 data,
>
> static pad_check_attributes int
> check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
> - u8 *object, char *what, u8 *start, unsigned int value,
> + u8 *object, const char *what, u8 *start, unsigned int value,
> unsigned int bytes, bool slab_obj_print)
> {
> u8 *fault;
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error
2025-02-28 12:47 ` Harry Yoo
@ 2025-02-28 16:02 ` Vlastimil Babka
2025-03-04 1:37 ` Hyesoo Yu
0 siblings, 1 reply; 14+ messages in thread
From: Vlastimil Babka @ 2025-02-28 16:02 UTC (permalink / raw)
To: Harry Yoo
Cc: Hyesoo Yu, janghyuck.kim, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
On 2/28/25 13:47, Harry Yoo wrote:
> On Thu, Feb 27, 2025 at 05:26:26PM +0100, Vlastimil Babka wrote:
>> ---8<---
>> From c38dadde6293cacdb91f95afc3615c22dec5830a Mon Sep 17 00:00:00 2001
>> From: Vlastimil Babka <vbabka@suse.cz>
>> Date: Thu, 27 Feb 2025 16:05:46 +0100
>> Subject: [PATCH] mm, slab: cleanup slab_bug() parameters
>>
>> slab_err() has variadic printf arguments but instead of passing them to
>> slab_bug() it does vsnprintf() to a buffer and passes %s, buf.
>>
>> To allow passing them directly, turn slab_bug() to __slab_bug() with a
>> va_list parameter, and slab_bug() a wrapper with fmt, ... parameters.
>> Then slab_err() can call __slab_bug() without the intermediate buffer.
>>
>> Also constify fmt everywhere, which also simplifies object_err()'s
>> call to slab_bug().
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>
> Looks good to me.
>
> FWIW,
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Thanks, adding to slab/for-next
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error
2025-02-28 16:02 ` Vlastimil Babka
@ 2025-03-04 1:37 ` Hyesoo Yu
0 siblings, 0 replies; 14+ messages in thread
From: Hyesoo Yu @ 2025-03-04 1:37 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Harry Yoo, janghyuck.kim, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
Hyeonggon Yoo, linux-mm, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1128 bytes --]
On Fri, Feb 28, 2025 at 05:02:00PM +0100, Vlastimil Babka wrote:
> On 2/28/25 13:47, Harry Yoo wrote:
> > On Thu, Feb 27, 2025 at 05:26:26PM +0100, Vlastimil Babka wrote:
> >> ---8<---
> >> From c38dadde6293cacdb91f95afc3615c22dec5830a Mon Sep 17 00:00:00 2001
> >> From: Vlastimil Babka <vbabka@suse.cz>
> >> Date: Thu, 27 Feb 2025 16:05:46 +0100
> >> Subject: [PATCH] mm, slab: cleanup slab_bug() parameters
> >>
> >> slab_err() has variadic printf arguments but instead of passing them to
> >> slab_bug() it does vsnprintf() to a buffer and passes %s, buf.
> >>
> >> To allow passing them directly, turn slab_bug() to __slab_bug() with a
> >> va_list parameter, and slab_bug() a wrapper with fmt, ... parameters.
> >> Then slab_err() can call __slab_bug() without the intermediate buffer.
> >>
> >> Also constify fmt everywhere, which also simplifies object_err()'s
> >> call to slab_bug().
> >>
> >> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> >> ---
> >
> > Looks good to me.
> >
> > FWIW,
> > Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>
> Thanks, adding to slab/for-next
>
>
Looks good to me.
Thanks!
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-03-04 1:39 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CGME20250226081354epcas2p44c2f53d569296ac2e5f8a7b01f4552fa@epcas2p4.samsung.com>
2025-02-26 8:11 ` [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error Hyesoo Yu
[not found] ` <CGME20250226081357epcas2p2f4c462b215b75291a9aeeec23aa1eaca@epcas2p2.samsung.com>
2025-02-26 8:12 ` [PATCH v4 1/2] mm: slub: Print the broken data before restoring slub Hyesoo Yu
2025-02-27 11:51 ` Harry Yoo
2025-02-27 12:36 ` Harry Yoo
[not found] ` <CGME20250226081359epcas2p2a6a1f3f92540660129164734fa6eaa64@epcas2p2.samsung.com>
2025-02-26 8:12 ` [PATCH v4 2/2] mm: slub: call WARN() when the slab detect an error Hyesoo Yu
2025-02-27 12:55 ` Harry Yoo
2025-02-27 15:18 ` Vlastimil Babka
2025-02-27 14:38 ` Vlastimil Babka
2025-02-27 11:53 ` [PATCH v4 0/2] mm: slub: Enhanced debugging in slub error Harry Yoo
2025-02-27 16:12 ` Vlastimil Babka
2025-02-27 16:26 ` Vlastimil Babka
2025-02-28 12:47 ` Harry Yoo
2025-02-28 16:02 ` Vlastimil Babka
2025-03-04 1:37 ` Hyesoo Yu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).