The Linux Kernel Mailing List
From: Lance Yang <lance.yang@linux.dev>
To: leitao@debian.org
Cc: catalin.marinas@arm.com, akpm@linux-foundation.org,
	lance.yang@linux.dev, dave@stgolabs.net, oleg@redhat.com,
	cai@lca.pw, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@meta.com, stable@vger.kernel.org
Subject: Re: [PATCH v2] mm/kmemleak: avoid soft lockup when scanning task stacks
Date: Sat, 13 Jun 2026 00:52:06 +0800
Message-ID: <20260612165206.93126-1-lance.yang@linux.dev>
In-Reply-To: <20260612-kmemleak-stack-resched-v2-1-53240de79e88@debian.org>


On Fri, Jun 12, 2026 at 08:16:07AM -0700, Breno Leitao wrote:
>kmemleak_scan() walks every thread and scans its kernel stack under a
>single rcu_read_lock() with no reschedule point. On a host with very
>many threads -- amplified by KASAN/lockdep in debug builds -- this loop
>can hog a CPU long enough to trip the soft lockup watchdog:
>
>  watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [kmemleak:537]
>   scan_block
>   kmemleak_scan
>   kmemleak_scan_thread
>   kthread
>
>A cond_resched() cannot be added directly: the loop runs inside an RCU
>read-side critical section.
>
>Borrow the rcu_lock_break() pattern from kernel/hung_task.c: when a
>reschedule is needed, pin the two iteration cursors, drop the RCU read
>lock, cond_resched(), then re-acquire it and continue only if both
>cursors are still hashed.
>
>If a cursor was unhashed while the lock was dropped, the thread list
>cannot be walked further, so the round is aborted. Such a round scans
>only part of the task stacks, which would make live objects look
>unreferenced, so reuse the existing "scan interrupted" path to skip
>reporting; the next full scan reports the real leaks.

TBH, this reads a bit dense to me as written ...
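
For anyone else parsing it, this is roughly how I read the flow, as a
user-space sketch rather than kernel code. Everything here is made up
for illustration: struct node, walk_break(), scan_all(), gap_kill() and
the lock_depth counter are hypothetical stand-ins, not kmemleak or RCU
APIs. It tracks a single cursor instead of the g/p pair and breaks after
every node, whereas the patch only does so when need_resched() is true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; 'alive' plays the role of pid_alive(). */
struct node {
	struct node *next;
	int refs;
	bool alive;
	bool scanned;
};

/* Stand-in for the RCU read-side nesting level (not a real lock). */
static int lock_depth;

static void read_lock(void)   { lock_depth++; }
static void read_unlock(void) { lock_depth--; }

/* Runs while the lock is dropped, to simulate concurrent activity. */
typedef void (*gap_fn)(struct node *cursor);

/* Example gap hook: the cursor exits while the lock is dropped. */
static void gap_kill(struct node *cursor)
{
	cursor->alive = false;
}

/*
 * Pin the cursor, drop the lock, let other work run, re-take the lock,
 * then report whether the cursor can still be used to continue the walk.
 */
static bool walk_break(struct node *cursor, gap_fn gap)
{
	bool can_cont;

	cursor->refs++;			/* like get_task_struct() */
	read_unlock();
	if (gap)
		gap(cursor);		/* where cond_resched() would run */
	read_lock();
	can_cont = cursor->alive;	/* like pid_alive() */
	cursor->refs--;			/* like put_task_struct() */

	return can_cont;
}

/* Returns true if every node was scanned, false if the round was aborted. */
static bool scan_all(struct node *head, gap_fn gap)
{
	bool complete = true;
	struct node *n;

	read_lock();
	for (n = head; n; n = n->next) {
		n->scanned = true;
		if (!walk_break(n, gap)) {
			complete = false;	/* cursor died: stop this round */
			break;
		}
	}
	read_unlock();

	return complete;
}
```

The point of returning false is that the caller can tell a full walk
from a truncated one and treat the truncated one as "do not report".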

>Fixes: c4b28963fd79 ("mm/kmemleak: rely on rcu for task stack scanning")
>Cc: stable@vger.kernel.org
>Signed-off-by: Breno Leitao <leitao@debian.org>
>---
>Changes in v2:
>- Do not create the nasty array, but use the same pattern as
>  kernel/hung_task.c.
>- Link to v1: https://lore.kernel.org/r/20260611-kmemleak-stack-resched-v1-1-d6248ade5f4a@debian.org
>---
> mm/kmemleak.c | 42 ++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 40 insertions(+), 2 deletions(-)
>
>diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>index 7c7ba17ce7af0..d88274dc0c605 100644
>--- a/mm/kmemleak.c
>+++ b/mm/kmemleak.c
>@@ -1695,6 +1695,32 @@ static void kmemleak_cond_resched(struct kmemleak_object *object)
> 	put_object(object);
> }
> 
>+/*
>+ * Briefly drop the RCU read lock to reschedule during the task stack scan.
>+ * Both cursors are pinned across the gap; return false if either one was
>+ * unhashed meanwhile, so the caller stops this round instead of walking a
>+ * stale list.
>+ */

Personally, this comment looks a bit clunky to me with "gap" and "unhashed" ...

Maybe:

"
Drop RCU long enough to reschedule during task stack scanning. Keep both
cursors alive while RCU is dropped; return false if either cursor can no
longer continue the walk.
"

>+static bool kmemleak_stack_scan_break(struct task_struct *g,
>+				      struct task_struct *p)
>+{
>+	bool can_cont;
>+
>+	get_task_struct(g);
>+	get_task_struct(p);
>+
>+	rcu_read_unlock();
>+	cond_resched();
>+	rcu_read_lock();
>+
>+	can_cont = pid_alive(g) && pid_alive(p);
>+
>+	put_task_struct(p);
>+	put_task_struct(g);
>+
>+	return can_cont;
>+}
>+
> /*
>  * Print one leak inline. The hex dump is gated on OBJECT_ALLOCATED so it
>  * does not touch user memory that was freed concurrently; the rest of the
>@@ -1804,6 +1830,7 @@ static void kmemleak_scan(void)
> 	int __maybe_unused i;
> 	struct xarray dedup;
> 	int new_leaks = 0;
>+	bool aborted = false;
> 
> 	jiffies_last_scan = jiffies;
> 
>@@ -1890,11 +1917,21 @@ static void kmemleak_scan(void)
> 		rcu_read_lock();
> 		for_each_process_thread(g, p) {
> 			void *stack = try_get_task_stack(p);
>+
> 			if (stack) {
> 				scan_block(stack, stack + THREAD_SIZE, NULL);
> 				put_task_stack(p);
> 			}
>+			/*
>+			 * This is an expensive loop, we must to call the
>+			 * scheduler to avoid lockups
>+			 */

need_resched() plus the helper name already say most of it. Maybe just:

"
Break the RCU read-side section before rescheduling.
"

>+			if (need_resched() && !kmemleak_stack_scan_break(g, p)) {
>+				aborted = true;
>+				goto unlock;
>+			}
> 		}
>+unlock:
> 		rcu_read_unlock();
> 	}
> 
>@@ -1937,9 +1974,10 @@ static void kmemleak_scan(void)
> 	scan_gray_list();
> 
> 	/*
>-	 * If scanning was stopped do not report any new unreferenced objects.
>+	 * If scanning was stopped or a stack scan round was aborted, do not
>+	 * report any new unreferenced objects.
> 	 */

Maybe just say "stack root scan was incomplete" here? That's the actual
reason we skip reporting.

"
If scanning was stopped or the stack root scan was incomplete, do not
report any new unreferenced objects.
"
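
To spell out why an incomplete root scan has to suppress reporting, here
is a toy model of the marking step. It is only a sketch under my own
assumptions: count_unreferenced(), NOBJ and the "one object per root"
layout are hypothetical and have nothing to do with the real kmemleak
data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NOBJ 3	/* hypothetical number of tracked objects */

/*
 * Toy model: roots[i] is the index (0..NOBJ-1) of the object referenced
 * by root i, e.g. a pointer found on task i's stack. Only the first
 * nroots_scanned roots are visited, as in a round that was cut short.
 * Returns how many objects look unreferenced afterwards.
 */
static int count_unreferenced(const int *roots, size_t nroots_scanned)
{
	bool marked[NOBJ] = { false };
	int unreferenced = 0;
	size_t i;

	for (i = 0; i < nroots_scanned; i++)
		marked[roots[i]] = true;

	for (i = 0; i < NOBJ; i++)
		if (!marked[i])
			unreferenced++;

	return unreferenced;
}
```

With roots = {0, 1, 2}, scanning all three roots leaves nothing
unreferenced, while stopping after the first root makes objects 1 and 2
look leaked even though they are still live.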

>-	if (scan_should_stop())
>+	if (scan_should_stop() || aborted)
> 		return;
> 
> 	/*
>
>---

Apart from that, feel free to add:

Acked-by: Lance Yang <lance.yang@linux.dev>

Thread overview: 6+ messages
2026-06-12 15:16 [PATCH v2] mm/kmemleak: avoid soft lockup when scanning task stacks Breno Leitao
2026-06-12 16:52 ` Lance Yang [this message]
2026-06-12 17:11 ` Catalin Marinas
2026-06-13  0:53 ` SeongJae Park
2026-06-13 10:45 ` Oleg Nesterov
2026-06-13 11:42   ` Lance Yang
