* [PATCH] debugobjects: skip activate fixup when disabled by a concurrent OOM
@ 2026-06-10 3:47 Jiayuan Chen
2026-06-28 6:01 ` Andrew Morton
0 siblings, 1 reply; 3+ messages in thread
From: Jiayuan Chen @ 2026-06-10 3:47 UTC
To: linux-kernel; +Cc: Jiayuan Chen, Andrew Morton, Thomas Gleixner
When a tracking object cannot be allocated, lookup_object_or_alloc()
sets debug_objects_enabled = false and the caller runs
debug_objects_oom(), which wipes the whole hash. The flag is cleared
before the hash is wiped.

debug_object_activate() only tests debug_objects_enabled once on entry.
If another CPU hits OOM and wipes the hash after that test, the lookup
here misses and the object is taken as ODEBUG_STATE_NOTAVAILABLE.
fixup_activate() then "repairs" it; for timers that overwrites a live
timer's callback with stub_timer, which later fires a bogus WARN.

Re-check debug_objects_enabled while still holding the bucket lock,
before the fixup. The flag is cleared before the hash is wiped, and both
the wipe and the lookup are serialized by the bucket lock, so a
wipe-induced miss is guaranteed to observe the cleared flag and the
spurious fixup is skipped.
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
lib/debugobjects.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index b18a682fe3da..fcb7949cb2be 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -865,6 +865,16 @@ int debug_object_activate(void *addr, const struct debug_obj_descr *descr)
 		}
 	}
 
+	/*
+	 * A concurrent OOM teardown may have disabled debugobjects and
+	 * wiped the hash after the check at function entry. So we need
+	 * to check it again here.
+	 */
+	if (!debug_objects_enabled) {
+		raw_spin_unlock_irqrestore(&db->lock, flags);
+		return 0;
+	}
+
 	raw_spin_unlock_irqrestore(&db->lock, flags);
 	debug_print_object(&o, "activate");
--
2.43.0
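The ordering argument in the changelog above can be sketched as a small,
single-threaded userspace model. This is only an illustration under stated
assumptions, not the kernel code: struct model, model_oom(),
model_activate() and the ACTIVATE_* values are hypothetical names, the
"other CPU" is simulated by calling the teardown inline in the window
between the entry check and taking the bucket lock, and the whole hash is
reduced to a single "tracked" bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified single-bucket model; every name here is hypothetical. */
enum activate_result {
	ACTIVATE_DISABLED,	/* entry check saw debugobjects disabled */
	ACTIVATE_OK,		/* object found in the hash, no fixup */
	ACTIVATE_SKIPPED,	/* lookup missed, re-check saw the cleared flag */
	ACTIVATE_FIXUP,		/* lookup missed, treated as NOTAVAILABLE */
};

struct model {
	bool enabled;		/* stands in for debug_objects_enabled */
	bool tracked;		/* is the object present in the bucket? */
	bool locked;		/* stands in for the bucket lock */
};

static void model_lock(struct model *m)
{
	assert(!m->locked);
	m->locked = true;
}

static void model_unlock(struct model *m)
{
	assert(m->locked);
	m->locked = false;
}

/* OOM teardown: clear the flag first, then wipe the hash under the lock. */
static void model_oom(struct model *m)
{
	m->enabled = false;
	model_lock(m);
	m->tracked = false;
	model_unlock(m);
}

static enum activate_result model_activate(struct model *m, bool oom_races,
					    bool recheck)
{
	enum activate_result ret;

	if (!m->enabled)
		return ACTIVATE_DISABLED;

	/* Race window: entry check passed, bucket lock not yet taken. */
	if (oom_races)
		model_oom(m);

	model_lock(m);
	if (m->tracked)
		ret = ACTIVATE_OK;
	else if (recheck && !m->enabled)
		ret = ACTIVATE_SKIPPED;	/* miss caused by the wipe */
	else
		ret = ACTIVATE_FIXUP;	/* miss handed to the fixup path */
	model_unlock(m);

	return ret;
}
```

In this model, a racing teardown without the re-check ends in
ACTIVATE_FIXUP for an object that was tracked a moment earlier, while the
same sequence with the re-check ends in ACTIVATE_SKIPPED. An object that
was never tracked still reaches ACTIVATE_FIXUP when no teardown happened,
so the re-check only filters out misses that follow a disable.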
* Re: [PATCH] debugobjects: skip activate fixup when disabled by a concurrent OOM
From: Andrew Morton @ 2026-06-28 6:01 UTC
To: Jiayuan Chen; +Cc: linux-kernel, Thomas Gleixner
On Wed, 10 Jun 2026 11:47:25 +0800 Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> When a tracking object cannot be allocated, lookup_object_or_alloc()
> sets debug_objects_enabled = false and the caller runs
> debug_objects_oom(), which wipes the whole hash. The flag is cleared
> before the hash is wiped.
>
> debug_object_activate() only tests debug_objects_enabled once on entry.
> If another CPU hits OOM and wipes the hash after that test, the lookup
> here misses and the object is taken as ODEBUG_STATE_NOTAVAILABLE.
> fixup_activate() then "repairs" it; for timers that overwrites a live
> timer's callback with stub_timer, which later fires a bogus WARN.
>
> Re-check debug_objects_enabled while still holding the bucket lock,
> before the fixup. The flag is cleared before the hash is wiped, and both
> the wipe and the lookup are serialized by the bucket lock, so a
> wipe-induced miss is guaranteed to observe the cleared flag and the
> spurious fixup is skipped.
Thanks for working on this.
The patch was sent at an inopportune time - late in -rc people aren't
paying much attention to new material so bugfixes can fall through
cracks.
How was this problem observed? Code inspection? Fault injection?
Real-world failures?
If it is known that this WARN can trigger in real-world situations then
we'll probably want to fix earlier kernels, which means a Fixes: and a
cc:stable.
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -865,6 +865,16 @@ int debug_object_activate(void *addr, const struct debug_obj_descr *descr)
>  		}
>  	}
>
> +	/*
> +	 * A concurrent OOM teardown may have disabled debugobjects and
> +	 * wiped the hash after the check at function entry. So we need
> +	 * to check it again here.
> +	 */
> +	if (!debug_objects_enabled) {
> +		raw_spin_unlock_irqrestore(&db->lock, flags);
> +		return 0;
> +	}
> +
Sashiko AI review suggests that further fixing might be needed:
https://sashiko.dev/#/patchset/20260610034726.213910-1-jiayuan.chen@linux.dev
* Re: [PATCH] debugobjects: skip activate fixup when disabled by a concurrent OOM
From: Thomas Gleixner @ 2026-06-28 16:52 UTC
To: Andrew Morton, Jiayuan Chen; +Cc: linux-kernel
On Sat, Jun 27 2026 at 23:01, Andrew Morton wrote:
> On Wed, 10 Jun 2026 11:47:25 +0800 Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> The patch was sent at an inopportune time - late in -rc people aren't
> paying much attention to new material so bugfixes can fall through
> cracks.
Indeed I missed it somehow.
> If it is known that this WARN can trigger in real-world situations then
> we'll probably want to fix earlier kernels, which means a Fixes: and a
> cc:stable.
It's already fixed upstream:
b81dde13cc16 ("debugobjects: Plug race against a concurrent OOM disable")
syzbot triggered the problem...