From: Tom Lendacky <thomas.lendacky@amd.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Aithal, Srikanth" <sraithal@amd.com>
Cc: Linux-Next Mailing List <linux-next@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>,
	"Roth, Michael" <Michael.Roth@amd.com>
Subject: Re: linux-next regression: SNP Guest boot hangs with certain cpu/mem config combination
Date: Wed, 26 Mar 2025 17:30:35 -0500
Message-ID: <08981771-39ac-af66-e2ec-e8f9bf6aed0a@amd.com>
In-Reply-To: <rar5bkfy7iplfhitsbna3b2dmxbk7nunlaiclwars6kffdetl4@lzm7iualliua>

On 3/25/25 08:33, Kirill A. Shutemov wrote:
> On Tue, Mar 25, 2025 at 02:40:00PM +0530, Aithal, Srikanth wrote:
>> Hello,
>>
>>
>> Starting with linux-next build next-20250312, and including the recent
>> build next-20250324, we are seeing an issue where the SNP guest boot hangs
>> at the "Booting SMP configuration" step:
>>
>>
>> [    2.294722] smp: Bringing up secondary CPUs ...
>> [    2.295211] smpboot: Parallel CPU startup disabled by the platform
>> [    2.309687] smpboot: x86: Booting SMP configuration:
>> [    2.310214] .... node  #0, CPUs:          #1   #2   #3   #4 #5   #6  
>> #7   #8   #9  #10  #11  #12  #13  #14  #15  #16  #17 #18  #19  #20  #21 
>> #22  #23  #24  #25  #26  #27  #28  #29  #30 #31  #32  #33  #34  #35  #36 
>> #37  #38  #39  #40  #41  #42  #43 #44  #45  #46  #47  #48  #49  #50  #51 
>> #52  #53  #54  #55  #56 #57  #58  #59  #60  #61  #62  #63  #64  #65  #66 
>> #67  #68  #69 #70  #71  #72  #73  #74  #75  #76  #77  #78  #79  #80  #81 
>> #82 #83  #84  #85  #86  #87  #88  #89  #90  #91  #92  #93  #94  #95 #96 
>> #97  #98  #99 #100 #101 #102 #103 #104 #105 #106 #107 #108 #109 #110 #111
>> #112 #113 #114 #115 #116 #117 #118 #119 #120 #121 #122 #123 #124 #125 #126
>> #127 #128 #129 #130 #131 #132 #133 #134 #135 #136 #137 #138 #139 #140 #141
>> #142 #143 #144 #145 #146 #147 #148 #149 #150 #151 #152 #153 #154 #155 #156
>> #157 #158 #159 #160 #161 #162 #163 #164 #165 #166 #167 #168 #169 #170 #171
>> #172 #173 #174 #175 #176 #177 #178 #179 #180 #181 #182 #183 #184 #185 #186
>> #187 #188 #189 #190 #191 #192 #193 #194 #195 #196 #197 #198
>> --> The guest hangs forever at this point.
>>
>>
>> I have observed that certain vCPU and memory combinations work, while others
>> do not. The VM configuration I am using does not have any NUMA nodes.
>>
>> vcpus          Mem       SNP guest boot
>> <=240          19456M    Boots fine
>> >=241,<255     19456M    Hangs
>> 1-255          2048M     Boots fine
>> 1-255          4096M     Boots fine
>> >71            8192M     Hangs
>> >41            6144M     Hangs
>>
>> When I bisected this issue, it pointed to the following commit:
>>
>>
>> commit 800f1059c99e2b39899bdc67a7593a7bea6375d8
>> Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Date:   Mon Mar 10 10:28:55 2025 +0200
>>
>>     mm/page_alloc: fix memory accept before watermarks gets initialized
> 
> Hm. This is puzzling to me. I don't see how this commit can cause the hang.
> 
> Could you track down where the hang happens?

Let me say that the guest config is key for this. Using that config, I
think you might be able to repro this on TDX. The config does turn off TDX
support, so I'm hoping that turning it on doesn't change anything.

I've been able to track it down a bit further. The hang happens during the
CPU bringup trace points: execution eventually gets to line 2273 in
rb_allocate_cpu_buffer() and never comes back from an alloc_pages_node()
call. That's as far as I've gotten so far. I'm not an mm expert, so I'm not
sure I'll be able to progress much further.

Thanks,
Tom

> 
