All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oscar Salvador <osalvador@suse.de>
To: Aditya Gupta <adityag@linux.ibm.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Danilo Krummrich <dakr@kernel.org>,
	David Hildenbrand <david@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Sourabh Jain <sourabhjain@linux.ibm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [REPORT] Softlockups on PowerNV with upstream
Date: Thu, 10 Apr 2025 07:25:33 +0200	[thread overview]
Message-ID: <Z_dWTU8UsvCHFMpN@localhost.localdomain> (raw)
In-Reply-To: <20250409180344.477916-1-adityag@linux.ibm.com>

On Wed, Apr 09, 2025 at 11:33:44PM +0530, Aditya Gupta wrote:
> Hi,
> 
> While booting current upstream kernel, I consistently get "softlockups", on IBM PowerNV system.
> 
> I have tested it only on PowerNV systems. But some architectures/platforms also
> might have it. PSeries systems don't have this issue though.
> 
> Bisect points to the following commit:
> 
>     commit 61659efdb35ce6c6ac7639342098f3c4548b794b
>     Author: Gavin Shan <gshan@redhat.com>
>     Date:   Wed Mar 12 09:30:43 2025 +1000
> 
>         drivers/base/memory: improve add_boot_memory_block()
> 
... 
> Console log
> -----------
> 
>     [    2.783371] smp: Brought up 4 nodes, 256 CPUs
>     [    2.783475] numa: Node 0 CPUs: 0-63
>     [    2.783537] numa: Node 2 CPUs: 64-127
>     [    2.783591] numa: Node 4 CPUs: 128-191
>     [    2.783653] numa: Node 6 CPUs: 192-255
>     [    2.804945] Memory: 735777792K/738197504K available (17536K kernel code, 5760K rwdata, 15232K rodata, 6528K init, 2517K bss, 1369664K reserved, 0K cma-reserved)

If I am not mistaken this is ~700GB, and PowerNV uses 16MB as section size,
and sections_per_block == 1 (I think).

The code before the mentioned commit, was something like:

 for (nr = base_section_nr; nr < base_section_nr + sections_per_block; nr++)
       if (present_section_nr(nr))
          section_count++;

 if (section_count == 0)
    return 0;
 return add_memory_block()

So, in case of PowerNV , we will just check one section at a time and
either return or call add_memory_block depending whether it is present.

Now, with the current code that is something different.
We now have 

memory_dev_init:
 for(nr = 0, nr <= __highest_present_section_nr; nr += 1)
     ret = add_boot_memory_block

add_boot_memory_block:
 for_each_present_section_nr(base_section_nr, nr) {
     if (nr >= (base_section_nr + sections_per_block))
            break;

     return add_memory_block();
 }
 return 0;

The thing is that next_present_section_nr() (which is called in
for_each_present_section_nr()) will loop until we find a present
section.
And then we will check whether the found section is beyond
base_section_nr + sections_per_block (where sections_per_block = 1).
If so, we skip add_memory_block.

Now, I think that the issue comes from for_each_present_section_nr
having to loop a lot until we find a present section.
And then the loop in memory_dev_init increments only by 1, which means
that the next iteration we might have to loop a lot again to find the
another present section. And so on and so forth.

Maybe we can fix this by making memory_dev_init() remember in which
section add_boot_memory_block returns.
Something like the following (only compile-tested)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 8f3a41d9bfaa..d97635cbfd1d 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -816,18 +816,25 @@ static int add_memory_block(unsigned long block_id, unsigned long state,
 	return 0;
 }

-static int __init add_boot_memory_block(unsigned long base_section_nr)
+static int __init add_boot_memory_block(unsigned long *base_section_nr)
 {
+	int ret;
 	unsigned long nr;

-	for_each_present_section_nr(base_section_nr, nr) {
-		if (nr >= (base_section_nr + sections_per_block))
+	for_each_present_section_nr(*base_section_nr, nr) {
+		if (nr >= (*base_section_nr + sections_per_block))
 			break;

-		return add_memory_block(memory_block_id(base_section_nr),
-					MEM_ONLINE, NULL, NULL);
+		ret = add_memory_block(memory_block_id(*base_section_nr),
+				       MEM_ONLINE, NULL, NULL);
+		*base_section = nr;
+		return ret;
 	}

+	if (nr == -1)
+		*base_section = __highest_present_section_nr + 1;
+	else
+		*base_section = nr;
 	return 0;
 }

@@ -973,9 +980,9 @@ void __init memory_dev_init(void)
 	 * Create entries for memory sections that were found
 	 * during boot and have been initialized
 	 */
-	for (nr = 0; nr <= __highest_present_section_nr;
-	     nr += sections_per_block) {
-		ret = add_boot_memory_block(nr);
+	nr = first_present_section_nr();
+	for (; nr <= __highest_present_section_nr; nr += sections_per_block) {
+		ret = add_boot_memory_block(&nr);
 		if (ret)
 			panic("%s() failed to add memory block: %d\n", __func__,
 			      ret);
 

@Aditya: can you please give it a try?



-- 
Oscar Salvador
SUSE Labs


  parent reply	other threads:[~2025-04-10  5:25 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-09 18:03 [REPORT] Softlockups on PowerNV with upstream Aditya Gupta
2025-04-10  1:35 ` Gavin Shan
2025-04-10 11:38   ` Aditya Gupta
2025-04-10  5:25 ` Oscar Salvador [this message]
2025-04-10  5:35   ` Gavin Shan
2025-04-10  8:23     ` Oscar Salvador
2025-04-10  9:44       ` Gavin Shan
2025-04-10 11:49         ` Aditya Gupta
2025-04-10 12:22         ` Aditya Gupta
2025-04-10 12:32           ` Gavin Shan
2025-04-10 11:44   ` Aditya Gupta
2025-04-10 12:26     ` Aditya Gupta

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z_dWTU8UsvCHFMpN@localhost.localdomain \
    --to=osalvador@suse.de \
    --cc=adityag@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=dakr@kernel.org \
    --cc=david@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mahesh@linux.ibm.com \
    --cc=rafael@kernel.org \
    --cc=sourabhjain@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.