Re: [PATCH RFC] mm/kmemleak: avoid soft lockup when scanning task stacks

Linux-mm Archive on lore.kernel.org
 help / color / mirror / Atom feed

From: Lance Yang <lance.yang@linux.dev>
To: Breno Leitao <leitao@debian.org>
Cc: catalin.marinas@arm.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@meta.com, sj@kernel.org
Subject: Re: [PATCH RFC] mm/kmemleak: avoid soft lockup when scanning task stacks
Date: Fri, 12 Jun 2026 17:57:12 +0800	[thread overview]
Message-ID: <6bc446b7-8e25-4add-9d72-a3c9e9191533@linux.dev> (raw)
In-Reply-To: <aivJ5Dv0hAQ1sJu1@gmail.com>



On 2026/6/12 17:09, Breno Leitao wrote:
> Hello Lance,
> 
> First of all, thanks for ther review, really awesome!

Cool!

> On Fri, Jun 12, 2026 at 11:16:05AM +0800, Lance Yang wrote:
>> On Thu, Jun 11, 2026 at 05:45:00AM -0700, Breno Leitao wrote:
>>> kmemleak_scan() walks every thread and scans its kernel stack under a
>>> single rcu_read_lock() with no reschedule point. On a host with very
>>> many threads -- amplified by KASAN/lockdep in debug builds -- this loop
>>> can hog a CPU long enough to trip the soft lockup watchdog:
>>>
>>>   watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [kmemleak:537]
>>>    scan_block
>>>    kmemleak_scan
>>>    kmemleak_scan_thread
>>>    kthread
>>
>> Neat, good catch!
>>
>>> A cond_resched() cannot be added directly: the loop runs inside an RCU
>>> read-side critical section.
>>>
>>> Split the scan in two parts:
>>>
>>> 1) get the list of tasks (with RCU read lock) in an array
>>> 2) run scan_block() for the tasks (with cond_reschd()).
>>>
>>> Is it a sane approach?
>>
>> Why not use the kernel/hung_task.c pattern here? Seems simpler, with no
>> extra task-array allocation ;)
> 
> I've looked at it, but I am not sure we want to break the loop mid-air,
> that seems to increase the false positives, given we did a half-baked
> scan, right?
> 
>> Could break RCU only when resched is needed. Pin the current cursors,
>> drop RCU, cond_resched(), take RCU again, and continue only if the
>> cursors are still alive ;)
>>
>> If either cursor died while RCU was droped, stopping this scan round
>> should be fine, IMHO.
> 
> I am not sure, this is not the same as the existing kmemleak_cond_resched()
> raciness in the object_list loops. Those iterate the marked set, where a miss
> only means "this object isn't reported until the next scan" -- under-reporting,
> self-healing, and the in-tree comment says exactly that.
> 
> Dropping a *root* mid-scan is the opposite: it makes *other* objects get
> falsely reported. So the "it's already racy, bailing is fine" reasoning doesn't
> carry over from the object loop to the stack loop.
> 
> If we go this route, the aborted round has to suppress reporting, reusing
> kmemleak's existing "scan was interrupted -> don't report" path:
> 
>        if (need_resched() && !kmemleak_stack_scan_break(g, p)) {
>                aborted = true;
>                goto unlock;
>        }

I'd expect the normal case to just drop RCU, cond_resched(), take RCU
again, see both cursors still alive, and keep walking :)

>        ...
>        if (scan_should_stop() || aborted)
>                return;

And yeah, you're right. If we do lost a cursor, bailing out and
skipping reporting fot that incomplete root scan should be the
right thing, I guess :D

> Then an abort means "this round reports nothing; the next full scan
> reports the real leaks" instead of a false-positive flood.
> 
> On boxes with very many threads, where the stack walk is long and
> need_resched() fires constantly, so the break helper runs a lot -- which makes
> aborts (and thus fully-suppressed, non-reporting rounds) plausibly more than
> "rare".
> 
> Since each round restarts from the head, the tail of the thread list is the
> most likely to be perpetually skipped, on exactly the workload this is meant to
> fix.
> 
> The snapshot avoids that by scanning a complete, similar to what we have today.
> 
> Anyway, I would love to get rid of the array, but, I am not convinced that
> dropping the scan mid-air will not cause false positives.
> 
> Thanks for the review,

Cheers!

next prev parent reply	other threads:[~2026-06-12  9:57 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-11 12:45 [PATCH RFC] mm/kmemleak: avoid soft lockup when scanning task stacks Breno Leitao
2026-06-12  1:10 ` SeongJae Park
2026-06-12  9:42   ` Breno Leitao
2026-06-12  3:16 ` Lance Yang
2026-06-12  4:27   ` Lance Yang
2026-06-12  9:09   ` Breno Leitao
2026-06-12  9:57     ` Lance Yang [this message]
2026-06-12 10:39       ` Breno Leitao
2026-06-12 11:22         ` Lance Yang
2026-06-12 11:57           ` Breno Leitao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6bc446b7-8e25-4add-9d72-a3c9e9191533@linux.dev \
    --to=lance.yang@linux.dev \
    --cc=akpm@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=kernel-team@meta.com \
    --cc=leitao@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=sj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox