public inbox for linux-kernel@vger.kernel.org
From: "Huang, Kai" <kai.huang@intel.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>,
	<linux-kernel@vger.kernel.org>, <x86@kernel.org>,
	<dave.hansen@intel.com>, <bp@alien8.de>, <tglx@linutronix.de>,
	<mingo@redhat.com>, <hpa@zytor.com>, <luto@kernel.org>,
	<peterz@infradead.org>, <rick.p.edgecombe@intel.com>,
	<ashish.kalra@amd.com>, <chao.gao@intel.com>, <bhe@redhat.com>,
	<nik.borisov@suse.com>, <pbonzini@redhat.com>,
	<seanjc@google.com>
Subject: Re: [PATCH v2 2/5] x86/kexec: do unconditional WBINVD in relocate_kernel()
Date: Wed, 20 Mar 2024 13:45:32 +1300
Message-ID: <256052ee-4e7b-45c5-8399-515fbb529a01@intel.com>
In-Reply-To: <i3nxazyv2dlauias4jmoqwpjixviuduaw6bgtfv4claxtimlm3@54xmat6zqud4>



On 20/03/2024 1:19 pm, Kirill A. Shutemov wrote:
> On Wed, Mar 20, 2024 at 10:20:50AM +1300, Huang, Kai wrote:
>>
>>
>> On 20/03/2024 3:38 am, Tom Lendacky wrote:
>>> On 3/19/24 06:13, Kirill A. Shutemov wrote:
>>>> On Tue, Mar 19, 2024 at 01:48:45AM +0000, Kai Huang wrote:
>>>>> Both SME and TDX can leave caches in incoherent state due to memory
>>>>> encryption.  During kexec, the caches must be flushed before jumping to
>>>>> the second kernel to avoid silent memory corruption to the
>>>>> second kernel.
>>>>>
>>>>> During kexec, the WBINVD in stop_this_cpu() flushes caches for all
>>>>> remote cpus when they are being stopped.  For SME, the WBINVD in
>>>>> relocate_kernel() flushes the cache for the last running cpu (which is
>>>>> executing the kexec).
>>>>>
>>>>> Similarly, for TDX after stopping all remote cpus with cache flushed, to
>>>>> support kexec, the kernel needs to flush cache for the last running cpu.
>>>>>
>>>>> Make the WBINVD in the relocate_kernel() unconditional to cover both SME
>>>>> and TDX.
>>>>
>>>> Nope. It breaks TDX guest. WBINVD triggers #VE for TDX guests.
>>>
>>> Ditto for SEV-ES/SEV-SNP, a #VC is generated and crashes the guest.
>>>
>>
>> Oh I forgot these.
>>
>> Hi Kirill,
>>
>> Then I think patch 1 will also break TDX guest after your series to enable
>> multiple cpus for the second kernel after kexec()?
> 
> Well, not exactly.
> 
> My patchset overrides stop_this_cpu() with own implementation for MADT
> wakeup method that doesn't have WBINVD. So the patch doesn't break
> anything, 

Well, your callback actually only gets called _after_ this WBINVD, so...

I guess I should have checked that myself. :-)

> but if in the future TDX (or SEV) host would use MADT wake up
> method instead of IPI we will get back to the problem with missing
> WBINVD.
> 
> I don't know if we care. There's no reason for host to use MADT wake up
> method.
> 

I don't think MADT wake up will be used in any native environment.

Anyway, regardless of whether patch 1 will break TDX/SEV-ES/SEV-SNP guests,
I think to resolve this, we can simply adjust our mindset from ...

	"do unconditional WBINVD"

to ...

	"do unconditional WBINVD when it can be done safely"

For now, AFAICT, only TDX guests and SEV-ES/SEV-SNP guests cannot do
WBINVD safely.

And they all report the CC_ATTR_GUEST_MEM_ENCRYPT flag as true, so we
can change to skip WBINVD when the kernel sees that flag:

	if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
		native_wbinvd();

Alternatively, we can have a dedicated X86_FEATURE_NO_WBINVD and get it
set for TDX/SEV-ES/SEV-SNP guests (and any other guests for which WBINVD
is not safe), and do:

	if (!boot_cpu_has(X86_FEATURE_NO_WBINVD))
		native_wbinvd();

The first one seems too generic (it applies to any CoCo VM), so the
second one looks better.
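
To make the difference between the two checks concrete, here is a small
userspace sketch. It is not kernel code: struct platform and both helpers
are hypothetical stand-ins that only model the two conditions above, and
the "coco_wbinvd_ok" platform is an assumed example of a CoCo guest where
WBINVD does not fault.

```c
#include <stdbool.h>

/* Toy model of a platform; both fields are stand-ins, not kernel APIs. */
struct platform {
	bool guest_mem_encrypt;	/* models cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) */
	bool no_wbinvd;		/* models the proposed X86_FEATURE_NO_WBINVD */
};

/* Option 1: skip WBINVD on every guest that reports memory encryption. */
static bool wbinvd_by_cc_attr(const struct platform *p)
{
	return !p->guest_mem_encrypt;
}

/* Option 2: skip WBINVD only where the dedicated flag is set. */
static bool wbinvd_by_feature(const struct platform *p)
{
	return !p->no_wbinvd;
}

/* Example platforms; the field values are assumptions for illustration. */
static const struct platform bare_metal     = { false, false };
static const struct platform tdx_guest      = { true,  true  };
/* Assumed: a CoCo guest in which WBINVD does not fault. */
static const struct platform coco_wbinvd_ok = { true,  false };
```

Both helpers agree for bare_metal (flush) and tdx_guest (no flush). They
differ only for coco_wbinvd_ok: option 1 skips the flush there as well,
while option 2 still does it.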

Any comments?

Hi Boris/Dave,

Do you have any comments?
