All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Donet Tom <donettom@linux.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Ritesh Harjani <ritesh.list@gmail.com>,
	rafael@kernel.org, Danilo Krummrich <dakr@kernel.org>
Subject: Re: [PATCH 2/2] base/node: Use curr_node_memblock_intersect_memory_block to Get Memory Block NID if CONFIG_DEFERRED_STRUCT_PAGE_INIT is Set
Date: Fri, 11 Apr 2025 13:59:38 +0300	[thread overview]
Message-ID: <Z_j2Gv9n4DOj6LSs@kernel.org> (raw)
In-Reply-To: <d982df07-e7d1-4d4f-a2c3-857901ccc0d0@linux.ibm.com>

On Fri, Apr 11, 2025 at 12:27:28AM +0530, Donet Tom wrote:
> 
> On 4/10/25 1:37 PM, Mike Rapoport wrote:
> > On Wed, Apr 09, 2025 at 10:57:57AM +0530, Donet Tom wrote:
> > > In the current implementation, when CONFIG_DEFERRED_STRUCT_PAGE_INIT is
> > > set, we iterate over all PFNs in the memory block and use
> > > early_pfn_to_nid to find the NID until a match is found.
> > > 
> > > This patch we are using curr_node_memblock_intersect_memory_block() to
> > > check if the current node's memblock intersects with the memory block
> > > passed when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. If an intersection
> > > is found, the memory block is added to the current node.
> > > 
> > > If CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set, the existing mechanism
> > > for finding the NID will continue to be used.
> > I don't think we really need different mechanisms for different settings of
> > CONFIG_DEFERRED_STRUCT_PAGE_INIT.
> > 
> > node_dev_init() runs after all struct pages are already initialized and can
> > always use pfn_to_nid().
> 
> 
> In the current implementation, if CONFIG_DEFERRED_STRUCT_PAGE_INIT
> is enabled, we perform a binary search in the memblock region to
> determine the pfn's nid. Otherwise, we use pfn_to_nid() to obtain
> the pfn's nid.
> 
> Your point is that we could unify this logic and always use
> pfn_to_nid() to determine the pfn's nid, regardless of whether
> CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. Is that
> correct?

Yes, struct pages should be ready by the time node_dev_init() is called
even when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set.
 
> > 
> > kernel_init_freeable() ->
> > 	page_alloc_init_late(); /* completes initialization of deferred pages */
> > 	...
> > 	do_basic_setup() ->
> > 		driver_init() ->
> > 			node_dev_init();
> > 
> > The next step could be refactoring register_mem_block_under_node_early() to
> > loop over memblock regions rather than over pfns.
> So it the current implementation
> 
> node_dev_init()
>     register_one_node
>         register_memory_blocks_under_node
>             walk_memory_blocks()
>                 register_mem_block_under_node_early
>                     get_nid_for_pfn
> 
> We get each node's start and end PFN from the pg_data. Using these
> values, we determine the memory block's start and end within the
> current node. To identify the node to which these memory block
> belongs,we iterate over each PFN in the range.
> 
> The problem I am facing is,
> 
> In my system node4 has a memory block ranging from memory30351
> to memory38524, and memory128433. The memory blocks between
> memory38524 and memory128433 do not belong to this node.
> 
> In  walk_memory_blocks() we iterate over all memory blocks starting
> from memory38524 to memory128433.
> In register_mem_block_under_node_early(), up to memory38524, the
> first pfn correctly returns the corresponding nid and the function
> returns from there. But after memory38524 and until memory128433,
> the loop iterates through each pfn and checks the nid. Since the nid
> does not match the required nid, the loop continues. This causes
> the soft lockups.
> 
> This issue occurs only when CONFIG_DEFERRED_STRUCT_PAGE_INIT
> is enabled, as a binary search is used to determine the PFN's nid. When
> this configuration is disabled, pfn_to_nid is faster, and the issue does
> not seen.( Faster because nid is getting from page)
> 
> To speed up the code when CONFIG_DEFERRED_STRUCT_PAGE_INIT
> is enabled, I added this function that iterates over all memblock regions
> for each memory block to determine its nid.
> 
> "Loop over memblock regions instead of iterating over PFNs" -
> My question is - in register_one_node, do you mean that we should iterate
> over all memblock regions, identify the regions belonging to the current
> node, and then retrieve the corresponding memory blocks to register them
> under that node?

I looked more closely at register_mem_block_under_node_early() and
iteration over memblock regions won't make sense there. 

It might make sense to use for_each_mem_range() as top level loop in
node_dev_init(), but that's a separate topic.
 
> Thanks
> Donet
> 

-- 
Sincerely yours,
Mike.


  reply	other threads:[~2025-04-11 10:59 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-09  5:27 [PATCH 1/2] mm/memblock: Added a New Memblock Function to Check if the Current Node's Memblock Region Intersects with a Memory Block Donet Tom
2025-04-09  5:27 ` [PATCH 2/2] base/node: Use curr_node_memblock_intersect_memory_block to Get Memory Block NID if CONFIG_DEFERRED_STRUCT_PAGE_INIT is Set Donet Tom
2025-04-10  8:07   ` Mike Rapoport
2025-04-10 18:57     ` Donet Tom
2025-04-11 10:59       ` Mike Rapoport [this message]
2025-04-11 11:36         ` Donet Tom
2025-04-15  9:46           ` Mike Rapoport
2025-04-15 10:08             ` Donet Tom
2025-04-10  2:20 ` [PATCH 1/2] mm/memblock: Added a New Memblock Function to Check if the Current Node's Memblock Region Intersects with a Memory Block Andrew Morton
2025-04-10  4:35   ` Donet Tom
2025-04-10  7:03 ` kernel test robot
2025-04-10  7:25 ` kernel test robot
2025-04-10  7:49 ` Mike Rapoport
2025-04-10 19:06   ` Donet Tom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z_j2Gv9n4DOj6LSs@kernel.org \
    --to=rppt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dakr@kernel.org \
    --cc=david@redhat.com \
    --cc=donettom@linux.ibm.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rafael@kernel.org \
    --cc=ritesh.list@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.