Date: Fri, 12 Jun 2026 02:42:41 -0700
From: Breno Leitao
To: SeongJae Park
Cc: Catalin Marinas, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH RFC] mm/kmemleak: avoid soft lockup when scanning task stacks
References: <20260611-kmemleak-stack-resched-v1-1-d6248ade5f4a@debian.org> <20260612011049.84146-1-sj@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20260612011049.84146-1-sj@kernel.org>
X-Debian-User: leitao

On Thu, Jun 11, 2026 at 06:10:48PM -0700, SeongJae Park wrote:
> On Thu, 11 Jun 2026 05:45:00 -0700 Breno Leitao wrote:
>
> > kmemleak_scan() walks every thread and scans its kernel stack under a
> > single rcu_read_lock() with no reschedule point. On a host with very
> > many threads -- amplified by KASAN/lockdep in debug builds -- this loop
> > can hog a CPU long enough to trip the soft lockup watchdog:
> >
> >   watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [kmemleak:537]
> >     scan_block
> >     kmemleak_scan
> >     kmemleak_scan_thread
> >     kthread
> >
> > A cond_resched() cannot be added directly: the loop runs inside an RCU
> > read-side critical section.
> >
> > Split the scan in two parts:
> >
> >  1) get the list of tasks (with RCU read lock) in an array
> >  2) run scan_block() for the tasks (with cond_resched()).
> >
> > Is it a sane approach?
> >
> > Signed-off-by: Breno Leitao
> > ---
> >  mm/kmemleak.c | 26 ++++++++++++++++++++++----
> >  1 file changed, 22 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > index 7c7ba17ce7af0..9f8a35ecbb50c 100644
> > --- a/mm/kmemleak.c
> > +++ b/mm/kmemleak.c
> > @@ -62,6 +62,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include
> >  #include
> >  #include
> > @@ -1885,17 +1886,34 @@ static void kmemleak_scan(void)
> >  	 * Scanning the task stacks (may introduce false negatives).
> >  	 */
> >  	if (kmemleak_stack_scan) {
> > -		struct task_struct *p, *g;
> > +		struct task_struct **tasks, *p, *g;
> > +		unsigned int nr = 0, max, i;
> >
> > +		max = nr_threads + 64;
> > +		tasks = kvmalloc_array(max, sizeof(*tasks), GFP_KERNEL);
> > +
> > +		/* Snapshot the threads under RCU */
> >  		rcu_read_lock();
> >  		for_each_process_thread(g, p) {
> > -			void *stack = try_get_task_stack(p);
> > +			if (!tasks || nr >= max)
> > +				break;
>
> Why don't you check !tasks right after the allocation?

Good question. I will update if we agree this approach is good enough.

Thanks for the review, SJ!
--breno
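
For reference, the two-phase shape discussed above (snapshot under the read-side lock, then scan with reschedule points, with the allocation checked right away) can be sketched in plain userspace C. This is only an illustration, not the patch: struct fake_task, fake_read_lock(), fake_read_unlock(), fake_cond_resched(), scan_words() and scan_all_tasks() are all hypothetical stand-ins, the refs counter only mimics the idea of pinning an entry so that it stays valid after the lock is dropped, and sched_yield() takes the place of cond_resched().

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_FAKE_TASKS 8
#define STACK_WORDS   4

/* Hypothetical stand-in for a task: a pin count plus a "stack" to scan. */
struct fake_task {
	int refs;
	unsigned long stack[STACK_WORDS];
};

static struct fake_task table[NR_FAKE_TASKS];
static int read_depth;	/* > 0 while inside the pretend read-side section */

static void fake_read_lock(void)   { read_depth++; }
static void fake_read_unlock(void) { read_depth--; }

/* Stand-in for cond_resched(): yielding inside the read section is a bug. */
static void fake_cond_resched(void)
{
	assert(read_depth == 0);
	sched_yield();
}

/* Stand-in for scan_block(): counts the non-zero words in a range. */
static unsigned int scan_words(const unsigned long *p, size_t n)
{
	unsigned int hits = 0;

	for (size_t i = 0; i < n; i++)
		if (p[i])
			hits++;
	return hits;
}

/* Returns the number of entries scanned, or -1 if the snapshot array
 * could not be allocated. */
int scan_all_tasks(unsigned int *hits_out)
{
	unsigned int nr = 0, max = NR_FAKE_TASKS, hits = 0;
	struct fake_task **snap = malloc(max * sizeof(*snap));

	if (!snap)	/* checked right after the allocation */
		return -1;

	/* Phase 1: snapshot and pin the entries, no yield points here. */
	fake_read_lock();
	for (unsigned int i = 0; i < NR_FAKE_TASKS && nr < max; i++) {
		table[i].refs++;
		snap[nr++] = &table[i];
	}
	fake_read_unlock();

	/* Phase 2: scan outside the read section, yielding between entries. */
	for (unsigned int i = 0; i < nr; i++) {
		hits += scan_words(snap[i]->stack, STACK_WORDS);
		snap[i]->refs--;	/* drop the pin taken in phase 1 */
		fake_cond_resched();
	}

	free(snap);
	if (hits_out)
		*hits_out = hits;
	return (int)nr;
}
```

Calling scan_all_tasks() from a test main() returns the number of entries visited. In this sketch, bailing out on a failed allocation before taking the lock keeps the snapshot loop free of the "!tasks" test; whether the real code should skip the stack scan or fall back to the old loop in that case is a separate decision.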