All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chao Gao <chao.gao@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: <kvm@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>, <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH] KVM: VMX: Flush shadow VMCS on emergency reboot
Date: Fri, 11 Apr 2025 16:46:02 +0800	[thread overview]
Message-ID: <Z/jWytoXdiGdCeXz@intel.com> (raw)
In-Reply-To: <Z_g-UQoZ8fQhVD_2@google.com>

On Thu, Apr 10, 2025 at 02:55:29PM -0700, Sean Christopherson wrote:
>On Mon, Mar 24, 2025, Chao Gao wrote:
>> Ensure the shadow VMCS cache is evicted during an emergency reboot to
>> prevent potential memory corruption if the cache is evicted after reboot.
>
>I don't suppose Intel would want to go on record and state what CPUs would actually
>be affected by this bug.  My understanding is that Intel has never shipped a CPU
>that caches shadow VMCS state.

I am not sure. Would you like me to check internally?

However, SDM Chapter 26.11 includes a footnote stating:
"
As noted in Section 26.1, execution of the VMPTRLD instruction makes a VMCS is
active. In addition, VM entry makes active any shadow VMCS referenced by the
VMCS link pointer in the current VMCS. If a shadow VMCS is made active by VM
entry, it is necessary to execute VMCLEAR for that VMCS before allowing that
VMCS to become active on another logical processor.
"

To me, this suggests that shadow VMCS may be cached, and software shouldn't
assume the CPU won't cache it. But, I don't know if this is the reality or
if the statement is merely for hardware implementation flexibility.

>
>On a very related topic, doesn't SPR+ now flush the VMCS caches on VMXOFF?  If

Actually this behavior is not publicly documented.

>that's going to be the architectural behavior going forward, will that behavior
>be enumerated to software?  Regardless of whether there's software enumeration,
>I would like to have the emergency disable path depend on that behavior.  In part
>to gain confidence that SEAM VMCSes won't screw over kdump, but also in light of
>this bug.

I don't understand how we can gain confidence that SEAM VMCSes won't screw
over kdump.

If a VMM wants to leverage the VMXOFF behavior, software enumeration
might be needed for nested virtualization. Using CPU F/M/S (SPR+) to
enumerate a behavior could be problematic for virtualization. Right?

>
>If all past CPUs never cache shadow VMCS state, and all future CPUs flush the
>caches on VMXOFF, then this is a glorified NOP, and thus probably shouldn't be
>tagged for stable.

Agreed.

Sean, I am not clear whether you intend to fix this issue and, if so, how.
Could you clarify?

>
>> This issue was identified through code inspection, as __loaded_vmcs_clear()
>> flushes both the normal VMCS and the shadow VMCS.
>> 
>> Avoid checking the "launched" state during an emergency reboot, unlike the
>> behavior in __loaded_vmcs_clear(). This is important because reboot NMIs
>> can interfere with operations like copy_shadow_to_vmcs12(), where shadow
>> VMCSes are loaded directly using VMPTRLD. In such cases, if NMIs occur
>> right after the VMCS load, the shadow VMCSes will be active but the
>> "launched" state may not be set.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>>  arch/x86/kvm/vmx/vmx.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index b70ed72c1783..dccd1c9939b8 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -769,8 +769,11 @@ void vmx_emergency_disable_virtualization_cpu(void)
>>  		return;
>>  
>>  	list_for_each_entry(v, &per_cpu(loaded_vmcss_on_cpu, cpu),
>> -			    loaded_vmcss_on_cpu_link)
>> +			    loaded_vmcss_on_cpu_link) {
>>  		vmcs_clear(v->vmcs);
>> +		if (v->shadow_vmcs)
>> +			vmcs_clear(v->shadow_vmcs);
>> +	}
>>  
>>  	kvm_cpu_vmxoff();
>>  }
>> -- 
>> 2.46.1
>> 

  reply	other threads:[~2025-04-11  8:46 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-24 14:08 [PATCH] KVM: VMX: Flush shadow VMCS on emergency reboot Chao Gao
2025-03-31 23:17 ` Huang, Kai
2025-04-10 21:55 ` Sean Christopherson
2025-04-11  8:46   ` Chao Gao [this message]
2025-04-11 16:57     ` Sean Christopherson
2025-04-14  6:24       ` Xiaoyao Li
2025-04-14 12:15       ` Huang, Kai
2025-04-14 13:18       ` Chao Gao
2025-04-15  1:03         ` Sean Christopherson
2025-04-15  1:55           ` Chao Gao
2025-10-08 23:01   ` Sean Christopherson
2025-10-09  5:36     ` Chao Gao
2025-10-10  1:16     ` dan.j.williams
2025-10-10 21:22       ` VMXON for TDX (was: Re: [PATCH] KVM: VMX: Flush shadow VMCS on emergency reboot) Sean Christopherson
2025-05-02 21:51 ` [PATCH] KVM: VMX: Flush shadow VMCS on emergency reboot Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z/jWytoXdiGdCeXz@intel.com \
    --to=chao.gao@intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.