* Re: [PATCH v3 3/3] mm/kmemleak: stop the per-cpu and struct page scans early too
[not found] ` <ajEgSI_UiTPU5eIF@redhat.com>
@ 2026-06-16 12:46 ` Breno Leitao
2026-06-16 13:10 ` Oleg Nesterov
From: Breno Leitao @ 2026-06-16 12:46 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Catalin Marinas, Andrew Morton, lance.yang, Davidlohr Bueso,
Qian Cai, sj, linux-mm, linux-kernel, kernel-team
Hello Oleg,
On Tue, Jun 16, 2026 at 12:07:04PM +0200, Oleg Nesterov wrote:
> On 06/15, Breno Leitao wrote:
> >
> > #ifdef CONFIG_SMP
> > /* per-cpu sections scanning */
> > - for_each_possible_cpu(i)
> > - scan_large_block(__per_cpu_start + per_cpu_offset(i),
> > - __per_cpu_end + per_cpu_offset(i));
> > + for_each_possible_cpu(i) {
> > + if (scan_large_block(__per_cpu_start + per_cpu_offset(i),
> > + __per_cpu_end + per_cpu_offset(i)))
> > + break;
> > + }
> > #endif
>
> The patch looks correct, but...
>
> if scan_large_block() returns true, then
>
> > @@ -1902,6 +1908,7 @@ static void kmemleak_scan(void)
> > unsigned long start_pfn = zone->zone_start_pfn;
> > unsigned long end_pfn = zone_end_pfn(zone);
> > unsigned long pfn;
> > + int stop = 0;
> >
> > for (pfn = start_pfn; pfn < end_pfn; pfn++) {
> > struct page *page = pfn_to_online_page(pfn);
> > @@ -1918,8 +1925,12 @@ static void kmemleak_scan(void)
> > /* only scan if page is in use */
> > if (page_count(page) == 0)
> > continue;
> > - scan_block(page, page + 1, NULL);
> > + stop = scan_block(page, page + 1, NULL);
>
> it is pointless to enter this for_each_populated_zone() loop and call
> kmemleak_scan_task_stacks() after that?
Upon further review, your instinct is right: once a phase trips scan_should_stop(),
the later phases are still entered and each just bails on its first
scan_block(). That can be optimized away, e.g. by lifting 'stop' to function
scope and guarding the remaining phases so they aren't entered at all.
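To make the idea concrete, here is a rough userspace sketch of the control flow, not the actual follow-up patch. The phase helpers, their loop bounds, the stop_requested flag and the phases_entered counter are all hypothetical stand-ins for illustration; only the shape (one function-scope 'stop' that guards every later phase) is the point.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; not kernel code. */
static bool stop_requested;	/* models what scan_should_stop() would report */
static int phases_entered;	/* counts the phases that were actually entered */

static bool scan_should_stop(void)
{
	return stop_requested;
}

/* Each phase returns true as soon as the scan should stop. */
static bool scan_percpu_phase(int nr_cpus, int stop_at_cpu)
{
	phases_entered++;
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu == stop_at_cpu)
			stop_requested = true;	/* simulate a stop request mid-scan */
		if (scan_should_stop())
			return true;
	}
	return false;
}

static bool scan_pages_phase(int nr_pages)
{
	phases_entered++;
	for (int pfn = 0; pfn < nr_pages; pfn++) {
		if (scan_should_stop())
			return true;
	}
	return false;
}

static bool scan_task_stacks_phase(int nr_tasks)
{
	phases_entered++;
	for (int t = 0; t < nr_tasks; t++) {
		if (scan_should_stop())
			return true;
	}
	return false;
}

/*
 * 'stop' lives at function scope, so a phase that trips it keeps every
 * later phase from being entered at all. Returns the number of phases
 * entered. Pass stop_at_cpu = -1 to simulate a scan that is never stopped.
 */
int kmemleak_scan_sketch(int stop_at_cpu)
{
	bool stop;

	stop_requested = false;
	phases_entered = 0;

	stop = scan_percpu_phase(4, stop_at_cpu);
	if (!stop)
		stop = scan_pages_phase(8);
	if (!stop)
		stop = scan_task_stacks_phase(2);

	return phases_entered;
}
```

In this model, a stop request that arrives during the per-cpu phase means the page and task-stack phases are skipped entirely, instead of each being entered only to bail on its first check.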
It is not a regression, though -- the pointless work is pre-existing.
Before this series none of these loops broke out on scan_should_stop(): the
scan ground through every CPU, every pfn and every thread, relying only on
scan_block() returning early on each call. This series already removes the bulk
of that; the bit you spotted (entering the next phase only to bail) is the
small leftover.
So I'd rather keep this series as-is and do that optimization as a follow-up
once it lands, instead of growing a patch that's already been reviewed. Are
you OK with that?
Thanks,
--breno