From: Donet Tom <donettom@linux.ibm.com>
To: David Hildenbrand <david@redhat.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
linux-kernel@vger.kernel.org
Cc: Ritesh Harjani <ritesh.list@gmail.com>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Danilo Krummrich <dakr@kernel.org>
Subject: Re: [PATCH] driver/base/node.c: Fix softlockups during the initialization of large systems with interleaved memory blocks
Date: Tue, 11 Mar 2025 20:33:30 +0530
Message-ID: <7c8b8dd2-0348-4311-b237-6129fcc60b08@linux.ibm.com>
In-Reply-To: <5f8db79f-5eb3-48f0-a7cd-a903f9cbe75e@redhat.com>

On 3/11/25 2:52 PM, David Hildenbrand wrote:
> On 10.03.25 12:53, Donet Tom wrote:
>> On large systems with more than 64TB of DRAM, if the memory blocks
>> are interleaved, node initialization (node_dev_init()) could take
>> a long time since it iterates over each memory block. If the memory
>> block belongs to the current iterating node, the first pfn_to_nid
>> will provide the correct value. Otherwise, it will iterate over all
>> PFNs and check the nid. On non-preemptive kernels, this can result
>> in a watchdog softlockup warning. Even though CONFIG_PREEMPT_LAZY
>> is enabled in kernels now [1], we may still need to fix older
>> stable kernels to avoid encountering these kernel warnings during
>> boot.
>
> If it's not an issue upstream, there is no need for an upstream patch.
>
> Fix stable kernels separately.
>
> Or did I get you wrong and this can be triggered upstream?

Yes, the issue is present upstream if CONFIG_PREEMPT_LAZY is disabled.

Thanks
Donet

>
>>
>> This patch adds a cond_resched() call in node_dev_init() to avoid
>> this warning.
>>
>> node_dev_init()
>>   register_one_node
>>     register_memory_blocks_under_node
>>       walk_memory_blocks()
>>         register_mem_block_under_node_early
>>           get_nid_for_pfn
>>             early_pfn_to_nid
>>
>> In my system node4 has a memory block ranging from memory30351
>> to memory38524, and memory128433. The memory blocks between
>> memory38524 and memory128433 do not belong to this node.
>>
>> In walk_memory_blocks() we iterate over all memblocks starting
>> from memory38524 to memory128433.
>> In register_mem_block_under_node_early(), up to memory38524, the
>> first pfn correctly returns the corresponding nid and the function
>> returns from there. But after memory38524 and until memory128433,
>> the loop iterates through each pfn and checks the nid. Since the nid
>> does not match the required nid, the loop continues. This causes
>> the soft lockups.
>>
>> [1]:
>> https://lore.kernel.org/linuxppc-dev/20241116192306.88217-1-sshegde@linux.ibm.com/
>> Fixes: 2848a28b0a60 ("drivers/base/node: consolidate node device
>> subsystem initialization in node_dev_init()")
>
> That commit only moved code; so very likely, that is not the
> problematic commit.
>
>
>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>> ---
>> drivers/base/node.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index 0ea653fa3433..107eb508e28e 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -975,5 +975,6 @@ void __init node_dev_init(void)
>>           ret = register_one_node(i);
>>           if (ret)
>>               panic("%s() failed to add node: %d\n", __func__, ret);
>> +        cond_resched();
>>       }
>>   }
>
>
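
For readers following the quoted description, here is a minimal userspace
sketch of why interleaved blocks make the walk expensive. It is a model,
not the kernel code: PFNS_PER_BLOCK, struct scan_stats, and the model_*
functions are hypothetical names, and each block is assumed to belong
wholly to one node. A block owned by the node being registered costs one
PFN lookup; a foreign block costs one lookup per PFN.

```c
#include <stddef.h>

/* Assumption: a tiny block size so the numbers stay readable. */
#define PFNS_PER_BLOCK 8UL

/* Hypothetical counters for the model; not a kernel structure. */
struct scan_stats {
    unsigned long blocks_registered; /* blocks attributed to the node */
    unsigned long pfn_checks;        /* total PFN -> nid lookups */
};

/* Model of the PFN -> nid lookup: the owner of the block holding pfn. */
static int model_pfn_to_nid(const int *block_nid, unsigned long pfn)
{
    return block_nid[pfn / PFNS_PER_BLOCK];
}

/*
 * Model of the per-block check described in the patch: stop at the
 * first PFN that belongs to nid, otherwise scan every PFN in the block.
 */
static int model_register_block(const int *block_nid, unsigned long block,
                                int nid, struct scan_stats *st)
{
    unsigned long start = block * PFNS_PER_BLOCK;
    unsigned long pfn;

    for (pfn = start; pfn < start + PFNS_PER_BLOCK; pfn++) {
        st->pfn_checks++;
        if (model_pfn_to_nid(block_nid, pfn) != nid)
            continue;            /* wrong node: keep scanning */
        st->blocks_registered++; /* first match: register and stop */
        return 1;
    }
    return 0;
}

/* Model of walking every block in [first, last] on behalf of one node. */
static struct scan_stats model_walk_blocks(const int *block_nid,
                                           unsigned long first,
                                           unsigned long last, int nid)
{
    struct scan_stats st = { 0, 0 };
    unsigned long b;

    for (b = first; b <= last; b++)
        model_register_block(block_nid, b, nid, &st);
    return st;
}
```

With the layout {4, 4, 4, 1, 1, 1, 1, 4} and nid 4, the four matching
blocks cost one lookup each while the four foreign blocks cost eight
each. At real block sizes and tens of thousands of foreign blocks, that
second term is what keeps the CPU busy long enough for the watchdog to
fire unless the loop yields.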