* [PATCH v3 1/2] debugobjects: Show the state of debug_objects_enabled
2025-06-17 5:35 [PATCH v3 0/2] debugobjects: Allow object pool refill mostly in non-atomic context Waiman Long
@ 2025-06-17 5:35 ` Waiman Long
2025-06-17 5:35 ` [PATCH v3 2/2] debugobjects: Allow object pool refill mostly in non-atomic context Waiman Long
1 sibling, 0 replies; 5+ messages in thread
From: Waiman Long @ 2025-06-17 5:35 UTC
To: Thomas Gleixner, Andrew Morton; +Cc: linux-kernel, Waiman Long
With a PREEMPT_RT kernel, there is a fair chance that debug_objects
gets disabled because the pool runs out of free debug objects, as
debug object allocation is not allowed in non-preemptible context. With
!PREEMPT_RT kernels, the chance of this should be minimal. If we
impose restrictions on where debug object allocation can be done, the
chance of running out of them increases.
Currently, it is not easy to figure out whether debug object tracking
is disabled. Fix that by showing the state of "debug_objects_enabled"
in the stats debugfs file and by always printing a message to the
console log when tracking gets disabled.
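For illustration only, the new stats line could be consumed from user
space roughly as follows. This is a minimal sketch, not part of the
patch: parse_debug_objects_state() is a hypothetical helper, and it
assumes the caller has already read the stats file (typically
/sys/kernel/debug/debug_objects/stats when debugfs is mounted in the
usual place) into a NUL-terminated buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: scan the text of the
 * debug_objects stats file for the "debug_objects" line.
 *
 * Returns 1 if tracking is enabled, 0 if it is disabled, and -1 if the
 * line is missing or malformed (e.g. a kernel without this patch).
 */
static int parse_debug_objects_state(const char *stats)
{
	const char *line = stats;

	while (line && *line) {
		char key[32], val[32];

		/* Each stats line has the form "<key> : <value>". */
		if (sscanf(line, "%31[^ :\t\n] : %31s", key, val) == 2 &&
		    strcmp(key, "debug_objects") == 0) {
			if (strcmp(val, "enabled") == 0)
				return 1;
			if (strcmp(val, "disabled") == 0)
				return 0;
			return -1;
		}

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A monitoring script could treat a return value of 0 as a hint to look
for the "debug_objects disabled: ..." message in the console log.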
Signed-off-by: Waiman Long <longman@redhat.com>
---
lib/debugobjects.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 7f50c4480a4e..5598105ecf0d 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -125,6 +125,12 @@ static int __init disable_object_debug(char *str)
}
early_param("no_debug_objects", disable_object_debug);
+static void debug_objects_disable(const char *msg)
+{
+ debug_objects_enabled = false;
+ printk_deferred(KERN_WARNING "debug_objects disabled: %s\n", msg);
+}
+
static const char *obj_states[ODEBUG_STATE_MAX] = {
[ODEBUG_STATE_NONE] = "none",
[ODEBUG_STATE_INIT] = "initialized",
@@ -690,7 +696,7 @@ static struct debug_obj *lookup_object_or_alloc(void *addr, struct debug_bucket
}
/* Out of memory. Do the cleanup outside of the locked region */
- debug_objects_enabled = false;
+ debug_objects_disable("out of memory");
return NULL;
}
@@ -1161,6 +1167,8 @@ static int debug_stats_show(struct seq_file *m, void *v)
seq_printf(m, "on_free_list : %u\n", pool_count(&pool_to_free));
seq_printf(m, "objs_allocated: %d\n", debug_objects_allocated);
seq_printf(m, "objs_freed : %d\n", debug_objects_freed);
+ seq_printf(m, "debug_objects : %s\n", debug_objects_enabled ? "enabled"
+ : "disabled");
return 0;
}
DEFINE_SHOW_ATTRIBUTE(debug_stats);
@@ -1314,7 +1322,7 @@ check_results(void *addr, enum debug_obj_state state, int fixups, int warnings)
out:
raw_spin_unlock_irqrestore(&db->lock, flags);
if (res)
- debug_objects_enabled = false;
+ debug_objects_disable("selftest");
return res;
}
@@ -1486,11 +1494,8 @@ void __init debug_objects_mem_init(void)
cache = kmem_cache_create("debug_objects_cache", sizeof (struct debug_obj), 0,
SLAB_DEBUG_OBJECTS | SLAB_NOLEAKTRACE, NULL);
- if (!cache || !debug_objects_replace_static_objects(cache)) {
- debug_objects_enabled = false;
- pr_warn("Out of memory.\n");
- return;
- }
+ if (!cache || !debug_objects_replace_static_objects(cache))
+ debug_objects_disable("out of memory");
/*
* Adjust the thresholds for allocating and freeing objects
--
2.49.0
* [PATCH v3 2/2] debugobjects: Allow object pool refill mostly in non-atomic context
2025-06-17 5:35 [PATCH v3 0/2] debugobjects: Allow object pool refill mostly in non-atomic context Waiman Long
2025-06-17 5:35 ` [PATCH v3 1/2] debugobjects: Show the state of debug_objects_enabled Waiman Long
@ 2025-06-17 5:35 ` Waiman Long
2025-06-24 8:28 ` kernel test robot
2025-06-26 15:22 ` Thomas Gleixner
1 sibling, 2 replies; 5+ messages in thread
From: Waiman Long @ 2025-06-17 5:35 UTC
To: Thomas Gleixner, Andrew Morton; +Cc: linux-kernel, Waiman Long
With a PREEMPT_RT kernel, object pool refill can only happen in
preemptible context. With !PREEMPT_RT kernels, pool refill can happen
in any context. This can sometimes lead to problems such as the
circular locking dependency shown below.
-> #3 (&zone->lock){-.-.}-{2:2}:
-> #2 (&base->lock){-.-.}-{2:2}:
-> #1 (&console_sch_key){-.-.}-{2:2}:
-> #0 (console_owner){..-.}-{0:0}:
The "console_owner" is from calling printk() from the following call
chain:
rmqueue_bulk() => expand() => __warn_printk() => _printk()
This is due to the invocation of the VM_WARN_ONCE() macro in
__add_to_free_list().
The "base->lock" is from lock_timer_base(), and "zone->lock" comes from
add_timer_on() calling debug_object_activate(), whose pool refill does
an actual memory allocation that acquires the zone lock.
The "console_sch_key" comes from an s390 console driver in
drivers/s390/cio. The console_sch_key -> timer dependency happens
because the console driver sets a timeout value while holding
its lock. Apparently it is pretty common for a console driver to use
a timer for timeouts or other timing purposes, so this may happen with
other console drivers as well.
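To make the cycle easier to see, the chain above can be written as a
tiny dependency graph. This is only a toy user-space model, not
lockdep: the enum, add_dep() and would_deadlock() are made-up names,
and the console_owner -> console_sch_key edge is my reading of the
report rather than something spelled out in the text above.

```c
#include <stdbool.h>

/* The four locks from the report, as nodes of a toy graph. */
enum lock_id { ZONE_LOCK, BASE_LOCK, CONSOLE_SCH_KEY, CONSOLE_OWNER, NR_LOCKS };

/* dep[a][b] is true when lock b was taken while lock a was held. */
static bool dep[NR_LOCKS][NR_LOCKS];

static void add_dep(enum lock_id held, enum lock_id taken)
{
	dep[held][taken] = true;
}

/* Depth-first search: is 'target' reachable from 'from'? */
static bool reaches(int from, int target, bool *seen)
{
	if (from == target)
		return true;
	if (seen[from])
		return false;
	seen[from] = true;

	for (int next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && reaches(next, target, seen))
			return true;
	return false;
}

/* Would taking 'taken' while holding 'held' close a cycle? */
static bool would_deadlock(enum lock_id held, enum lock_id taken)
{
	bool seen[NR_LOCKS] = { false };

	return reaches(taken, held, seen);
}

/* Record the already-known ordering from the report. */
static void record_report_chain(void)
{
	add_dep(CONSOLE_OWNER, CONSOLE_SCH_KEY);  /* console write path (assumed) */
	add_dep(CONSOLE_SCH_KEY, BASE_LOCK);      /* timeout armed under driver lock */
	add_dep(BASE_LOCK, ZONE_LOCK);            /* pool refill allocates under timer base lock */
}
```

With that chain recorded, would_deadlock(ZONE_LOCK, CONSOLE_OWNER)
models the printk() issued under the zone lock and reports the cycle.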
Three debug object functions invoke debug_objects_fill_pool() for pool
refill: __debug_object_init(), debug_object_activate() and
debug_object_assert_init(). Thomas suggested that we enforce the pool
refill only in the init function and remove the
debug_objects_fill_pool() call from the other two to avoid the kind of
circular locking problem shown above. The init function has to keep
the refill because it can be called in a cluster, with many debug
objects initialized consecutively, which can exhaust the global object
pool if the init function is not allowed to refill it. See [1] for
such an example. The call patterns of the other two are typically more
spread out. Of the three, the activate function is called at least an
order of magnitude more often than the other two.
Removing the pool refill call from the other two may make pool
exhaustion happen more easily, leading to the disabling of debug
object tracking. As a middle ground, allow pool refill from the
activate and assert_init functions if they are called from a non-atomic
context, which is roughly half of the time, depending on the workload.
However, in_atomic() may not know that preemption has been disabled,
for example when a spinlock has been acquired, if CONFIG_PREEMPT_COUNT
is not set. So make DEBUG_OBJECTS select PREEMPT_COUNT to make
sure that the preemption state is properly captured. The overhead of
adding PREEMPT_COUNT should be insignificant compared with the overhead
imposed by enabling the debug object tracking code itself.
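The interaction between the new check and PREEMPT_COUNT can be
sketched with a toy user-space model. Nothing here is kernel code:
struct toy_task and the toy_*() helpers are invented for illustration,
and the model covers only the "init or not atomic" test, not the other
conditions in debug_objects_fill_pool().

```c
#include <stdbool.h>

/* Toy stand-in for a task's preemption counter. */
struct toy_task {
	bool preempt_count_enabled;	/* models CONFIG_PREEMPT_COUNT=y */
	int preempt_count;
};

/* Taking a spinlock only bumps the counter when counting is enabled. */
static void toy_spin_lock(struct toy_task *t)
{
	if (t->preempt_count_enabled)
		t->preempt_count++;
}

static void toy_spin_unlock(struct toy_task *t)
{
	if (t->preempt_count_enabled)
		t->preempt_count--;
}

/* Atomic context is inferred purely from the counter. */
static bool toy_in_atomic(const struct toy_task *t)
{
	return t->preempt_count != 0;
}

/* Mirrors the check added by the patch: refill from init, or when not atomic. */
static bool toy_may_refill(const struct toy_task *t, bool init)
{
	return init || !toy_in_atomic(t);
}
```

In this model, a task that holds a toy spinlock without counting
enabled still passes toy_may_refill(t, false), which is the case that
selecting PREEMPT_COUNT is meant to close.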
[1] https://lore.kernel.org/lkml/202506121115.b69b8c2-lkp@intel.com/
Signed-off-by: Waiman Long <longman@redhat.com>
---
lib/Kconfig.debug | 1 +
lib/debugobjects.c | 15 +++++++++++----
2 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ebe33181b6e6..854a2f12a64b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -723,6 +723,7 @@ source "mm/Kconfig.debug"
config DEBUG_OBJECTS
bool "Debug object operations"
depends on DEBUG_KERNEL
+ select PREEMPT_COUNT
help
If you say Y here, additional code will be inserted into the
kernel to track the life time of various objects and validate
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 5598105ecf0d..d85f87f359d2 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -700,11 +700,18 @@ static struct debug_obj *lookup_object_or_alloc(void *addr, struct debug_bucket
return NULL;
}
-static void debug_objects_fill_pool(void)
+static void debug_objects_fill_pool(bool init)
{
if (!static_branch_likely(&obj_cache_enabled))
return;
+ /*
+ * Attempt to fill the pool only if called from __debug_object_init()
+ * or not in atomic context.
+ */
+ if (!init && in_atomic())
+ return;
+
if (likely(!pool_should_refill(&pool_global)))
return;
@@ -740,7 +747,7 @@ __debug_object_init(void *addr, const struct debug_obj_descr *descr, int onstack
struct debug_bucket *db;
unsigned long flags;
- debug_objects_fill_pool();
+ debug_objects_fill_pool(true);
db = get_bucket((unsigned long) addr);
@@ -817,7 +824,7 @@ int debug_object_activate(void *addr, const struct debug_obj_descr *descr)
if (!debug_objects_enabled)
return 0;
- debug_objects_fill_pool();
+ debug_objects_fill_pool(false);
db = get_bucket((unsigned long) addr);
@@ -1006,7 +1013,7 @@ void debug_object_assert_init(void *addr, const struct debug_obj_descr *descr)
if (!debug_objects_enabled)
return;
- debug_objects_fill_pool();
+ debug_objects_fill_pool(false);
db = get_bucket((unsigned long) addr);
--
2.49.0