From: Mike Rapoport <rppt@kernel.org>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Bill Wendling <morbo@google.com>,
Daniel Jordan <daniel.m.jordan@oracle.com>,
Justin Stitt <justinstitt@google.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Miguel Ojeda <ojeda@kernel.org>,
Nathan Chancellor <nathan@kernel.org>,
Nick Desaulniers <nick.desaulniers+lkml@gmail.com>,
linux-kernel@vger.kernel.org, llvm@lists.linux.dev
Subject: Re: [PATCH 1/4] mm/mm_init: use deferred_init_memmap_chunk() in deferred_grow_zone()
Date: Tue, 19 Aug 2025 13:54:46 +0300
Message-ID: <aKRX9iIe8h9fFi9v@kernel.org>
In-Reply-To: <20250819095223.ckjdsii4gc6u4nec@master>
On Tue, Aug 19, 2025 at 09:52:23AM +0000, Wei Yang wrote:
> Hi, Mike
>
> After going through the code again, I have some trivial thoughts to discuss
> with you. If not right, please let me know.
>
> On Mon, Aug 18, 2025 at 09:46:12AM +0300, Mike Rapoport wrote:
> [...]
> > bool __init deferred_grow_zone(struct zone *zone, unsigned int order)
> > {
> >- unsigned long nr_pages_needed = ALIGN(1 << order, PAGES_PER_SECTION);
> >+ unsigned long nr_pages_needed = SECTION_ALIGN_UP(1 << order);
> > pg_data_t *pgdat = zone->zone_pgdat;
> > unsigned long first_deferred_pfn = pgdat->first_deferred_pfn;
> > unsigned long spfn, epfn, flags;
> > unsigned long nr_pages = 0;
> >- u64 i = 0;
> >
> > /* Only the last zone may have deferred pages */
> > if (zone_end_pfn(zone) != pgdat_end_pfn(pgdat))
> >@@ -2262,37 +2272,26 @@ bool __init deferred_grow_zone(struct zone *zone, unsigned int order)
> > return true;
> > }
>
> In the file above this line, there is a compare between first_deferred_pfn and
> its original value after grab pgdat_resize_lock.
Do you mean this one:
	if (first_deferred_pfn != pgdat->first_deferred_pfn) {
		pgdat_resize_unlock(pgdat, &flags);
		return true;
	}
> I am thinking to compare first_deferred_pfn with ULONG_MAX, as it compared in
> deferred_init_memmap(). This indicate this zone has already been initialized
> totally.
It may be that another CPU ran deferred_grow_zone() and won the race for the
resize lock. Then pgdat->first_deferred_pfn will be larger than
first_deferred_pfn, but the entire zone still would not be initialized.
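To illustrate the race, here is a minimal userspace sketch, not kernel code.
struct pgdat_sketch, grow_one_section() and fully_initialized() are
hypothetical names, the lock itself is not modelled, and the
PAGES_PER_SECTION value is an assumption (128 MiB sections with 4 KiB pages).
It only shows why the snapshot comparison catches a lost race that a
comparison with ULONG_MAX would miss.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumed value for illustration: 128 MiB sections with 4 KiB pages. */
#define PAGES_PER_SECTION 32768UL

/* Hypothetical stand-in for the one pg_data_t field that matters here. */
struct pgdat_sketch {
	unsigned long first_deferred_pfn;
};

/*
 * Models the part of deferred_grow_zone() that runs with the resize lock
 * held. 'snapshot' is the first_deferred_pfn value the caller read before
 * it took the lock. Returns true if the caller lost the race.
 */
bool grow_one_section(struct pgdat_sketch *pgdat, unsigned long snapshot)
{
	/* Somebody else grew the zone while we waited for the lock. */
	if (snapshot != pgdat->first_deferred_pfn)
		return true;

	/* Simplification: pretend exactly one section was initialized. */
	pgdat->first_deferred_pfn = snapshot + PAGES_PER_SECTION;
	return false;
}

/* The alternative test: true only once the whole zone is initialized. */
bool fully_initialized(const struct pgdat_sketch *pgdat)
{
	return pgdat->first_deferred_pfn == ULONG_MAX;
}
```

If two callers take the same snapshot, the first one advances
first_deferred_pfn by a section and the second one sees a mismatch, while
fully_initialized() is still false for both of them.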
> Current code guard this by spfn < zone_end_pfn(zone). Maybe a check ahead
> would be more clear?
Not sure I follow you here. The check that we don't pass zone_end_pfn is
inside the loop for every section we initialize.
> >
> >- /* If the zone is empty somebody else may have cleared out the zone */
> >- if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
> >- first_deferred_pfn)) {
> >- pgdat->first_deferred_pfn = ULONG_MAX;
> >- pgdat_resize_unlock(pgdat, &flags);
> >- /* Retry only once. */
> >- return first_deferred_pfn != ULONG_MAX;
> >+ /*
> >+ * Initialize at least nr_pages_needed in section chunks.
> >+ * If a section has less free memory than nr_pages_needed, the next
> >+ * section will be also initalized.
> >+ * Note, that it still does not guarantee that allocation of order can
> >+ * be satisfied if the sections are fragmented because of memblock
> >+ * allocations.
> >+ */
> >+ for (spfn = first_deferred_pfn, epfn = SECTION_ALIGN_UP(spfn + 1);
>
> I am expecting first_deferred_pfn is section aligned. So epfn += PAGES_PER_SECTION
> is fine?
It should be, but I'd prefer to be on the safe side and keep it this way.
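The difference between the two forms only shows up for an unaligned start.
A small userspace sketch of the arithmetic follows; section_align_up()
mirrors the usual round-up-to-power-of-two idiom, the two epfn_*() helpers
are hypothetical names, and the PAGES_PER_SECTION value is an assumption
(128 MiB sections with 4 KiB pages).

```c
/* Assumed value for illustration: 128 MiB sections with 4 KiB pages. */
#define PAGES_PER_SECTION 32768UL
#define PAGE_SECTION_MASK (~(PAGES_PER_SECTION - 1))

/* Round pfn up to the next section boundary (no-op if already aligned). */
unsigned long section_align_up(unsigned long pfn)
{
	return (pfn + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK;
}

/* First epfn as computed in the patch: always lands on a boundary. */
unsigned long epfn_align_up(unsigned long spfn)
{
	return section_align_up(spfn + 1);
}

/* First epfn as suggested: correct only if spfn is section aligned. */
unsigned long epfn_plus_section(unsigned long spfn)
{
	return spfn + PAGES_PER_SECTION;
}
```

For a section-aligned spfn both helpers return the same value. For an
unaligned spfn, epfn_align_up() stops at the next section boundary, while
epfn_plus_section() crosses it and every later chunk stays misaligned.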
> Maybe I missed something.
>
> >+ nr_pages < nr_pages_needed && spfn < zone_end_pfn(zone);
> >+ spfn = epfn, epfn += PAGES_PER_SECTION) {
> >+ nr_pages += deferred_init_memmap_chunk(spfn, epfn, zone);
> > }
> >
> > /*
> >- * Initialize and free pages in MAX_PAGE_ORDER sized increments so
> >- * that we can avoid introducing any issues with the buddy
> >- * allocator.
> >+ * There were no pages to initialize and free which means the zone's
> >+ * memory map is completely initialized.
> > */
> >- while (spfn < epfn) {
> >- /* update our first deferred PFN for this section */
> >- first_deferred_pfn = spfn;
> >-
> >- nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
> >- touch_nmi_watchdog();
> >-
> >- /* We should only stop along section boundaries */
> >- if ((first_deferred_pfn ^ spfn) < PAGES_PER_SECTION)
> >- continue;
> >-
> >- /* If our quota has been met we can stop here */
> >- if (nr_pages >= nr_pages_needed)
> >- break;
> >- }
> >+ pgdat->first_deferred_pfn = nr_pages ? spfn : ULONG_MAX;
>
> If we come here because spfn >= zone_end_pfn(zone), first_deferred_pfn is left
> a "valid" value and deferred_init_memmap() will try to do its job. But
> actually nothing left to initialize.
We run a thread for each node with memory anyway. In the very unlikely case
that we've completely initialized a deferred zone, that thread will just
finish much faster :)
> For this case, I suggest to set it ULONG_MAX too. But this is really corner
> case.
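The corner case can be reproduced with a userspace model of the new loop.
This is a sketch under stated assumptions: init_chunk_sketch() is a
hypothetical stand-in for deferred_init_memmap_chunk() that treats every pfn
below the zone end as free memory, grow_zone_sketch() returns the value that
would be stored in pgdat->first_deferred_pfn, and the PAGES_PER_SECTION value
(128 MiB sections with 4 KiB pages) is assumed.

```c
#include <limits.h>

/* Assumed value for illustration: 128 MiB sections with 4 KiB pages. */
#define PAGES_PER_SECTION 32768UL
#define PAGE_SECTION_MASK (~(PAGES_PER_SECTION - 1))
#define SECTION_ALIGN_UP(pfn) \
	(((pfn) + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK)

/*
 * Hypothetical stand-in for deferred_init_memmap_chunk(): counts the pfns
 * in [spfn, epfn) that lie below zone_end, as if all of them were free.
 */
unsigned long init_chunk_sketch(unsigned long spfn, unsigned long epfn,
				unsigned long zone_end)
{
	if (epfn > zone_end)
		epfn = zone_end;
	return epfn > spfn ? epfn - spfn : 0;
}

/* Returns the value the loop would leave in pgdat->first_deferred_pfn. */
unsigned long grow_zone_sketch(unsigned long first_deferred_pfn,
			       unsigned long zone_end, unsigned int order)
{
	unsigned long nr_pages_needed = SECTION_ALIGN_UP(1UL << order);
	unsigned long nr_pages = 0;
	unsigned long spfn, epfn;

	for (spfn = first_deferred_pfn, epfn = SECTION_ALIGN_UP(spfn + 1);
	     nr_pages < nr_pages_needed && spfn < zone_end;
	     spfn = epfn, epfn += PAGES_PER_SECTION)
		nr_pages += init_chunk_sketch(spfn, epfn, zone_end);

	/* No pages initialized means the memory map is complete. */
	return nr_pages ? spfn : ULONG_MAX;
}
```

With a zone ending at 10 sections, growing from section 9 returns the zone
end rather than ULONG_MAX, because that call did initialize pages. Only the
next call, which finds nothing left to initialize, returns ULONG_MAX.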
--
Sincerely yours,
Mike.