linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Xin Li <xin3.li@intel.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	 linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	 pbonzini@redhat.com, corbet@lwn.net, tglx@linutronix.de,
	mingo@redhat.com,  bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com,  shuah@kernel.org,
	vkuznets@redhat.com, peterz@infradead.org,
	 ravi.v.shankar@intel.com, xin@zytor.com
Subject: Re: [PATCH v2 19/25] KVM: nVMX: Add FRED VMCS fields
Date: Thu, 13 Jun 2024 11:29:34 -0700	[thread overview]
Message-ID: <Zms6jkwA9PfvXCGv@google.com> (raw)
In-Reply-To: <20240207172646.3981-20-xin3.li@intel.com>

On Wed, Feb 07, 2024, Xin Li wrote:
> Add FRED VMCS fields to nested VMX context management.
> 
> Todo: change VMCS12_REVISION, as struct vmcs12 is changed.

It actually doesn't, the comment is just stale.  At this point, KVM must _never_
change VMCS12_REVISION as doing so will break backwards compatibility.

I'll post this once I've written a changelog:

---
 arch/x86/kvm/vmx/vmcs12.h | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index edf7fcef8ccf..d67bebb9f1c2 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -207,11 +207,9 @@ struct __packed vmcs12 {
 };
 
 /*
- * VMCS12_REVISION is an arbitrary id that should be changed if the content or
- * layout of struct vmcs12 is changed. MSR_IA32_VMX_BASIC returns this id, and
- * VMPTRLD verifies that the VMCS region that L1 is loading contains this id.
+ * VMCS12_REVISION is KVM's arbitrary id for the layout of struct vmcs12.
  *
- * IMPORTANT: Changing this value will break save/restore compatibility with
+ * DO NOT change this value, as it will break save/restore compatibility with
  * older kvm releases.
  */
 #define VMCS12_REVISION 0x11e57ed0
@@ -225,7 +223,8 @@ struct __packed vmcs12 {
 #define VMCS12_SIZE		KVM_STATE_NESTED_VMX_VMCS_SIZE
 
 /*
- * For save/restore compatibility, the vmcs12 field offsets must not change.
+ * For save/restore compatibility, the vmcs12 field offsets must not change,
+ * although appending fields and/or filling gaps is obviously allowed.
  */
 #define CHECK_OFFSET(field, loc) \
 	ASSERT_STRUCT_OFFSET(struct vmcs12, field, loc)

base-commit: 878fe4c2f7eead383f2b306cbafd300006dd518c
-- 

> Signed-off-by: Xin Li <xin3.li@intel.com>
> Tested-by: Shan Kang <shan.kang@intel.com>
> ---
> 
> Change since v1:
> * Remove hyperv TLFS related changes (Jeremi Piotrowski).
> * Use kvm_cpu_cap_has() instead of cpu_feature_enabled() (Chao Gao).
> ---
>  Documentation/virt/kvm/x86/nested-vmx.rst | 18 +++++
>  arch/x86/kvm/vmx/nested.c                 | 91 +++++++++++++++++++----
>  arch/x86/kvm/vmx/vmcs12.c                 | 18 +++++
>  arch/x86/kvm/vmx/vmcs12.h                 | 36 +++++++++
>  arch/x86/kvm/vmx/vmcs_shadow_fields.h     |  4 +
>  5 files changed, 152 insertions(+), 15 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/x86/nested-vmx.rst b/Documentation/virt/kvm/x86/nested-vmx.rst
> index e64ef231f310..87fa9f3877ab 100644
> --- a/Documentation/virt/kvm/x86/nested-vmx.rst
> +++ b/Documentation/virt/kvm/x86/nested-vmx.rst
> @@ -218,6 +218,24 @@ struct shadow_vmcs is ever changed.
>  		u16 host_gs_selector;
>  		u16 host_tr_selector;
>  		u64 secondary_vm_exit_controls;
> +		u64 guest_ia32_fred_config;
> +		u64 guest_ia32_fred_rsp1;
> +		u64 guest_ia32_fred_rsp2;
> +		u64 guest_ia32_fred_rsp3;
> +		u64 guest_ia32_fred_stklvls;
> +		u64 guest_ia32_fred_ssp1;
> +		u64 guest_ia32_fred_ssp2;
> +		u64 guest_ia32_fred_ssp3;
> +		u64 host_ia32_fred_config;
> +		u64 host_ia32_fred_rsp1;
> +		u64 host_ia32_fred_rsp2;
> +		u64 host_ia32_fred_rsp3;
> +		u64 host_ia32_fred_stklvls;
> +		u64 host_ia32_fred_ssp1;
> +		u64 host_ia32_fred_ssp2;
> +		u64 host_ia32_fred_ssp3;
> +		u64 injected_event_data;
> +		u64 original_event_data;
>  	};
>  
>  
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 94da6a0a2f81..f9c1fbeac302 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -686,6 +686,9 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
>  
>  	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
>  					 MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
> +
> +	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
> +					 MSR_IA32_FRED_RSP0, MSR_TYPE_RW);
>  #endif
>  	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
>  					 MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
> @@ -2498,6 +2501,8 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
>  			     vmcs12->vm_entry_instruction_len);
>  		vmcs_write32(GUEST_INTERRUPTIBILITY_INFO,
>  			     vmcs12->guest_interruptibility_info);
> +		if (kvm_cpu_cap_has(X86_FEATURE_FRED))

This is wrong, vmcs02 should be set from vmcs12 if and only if the field is enabled
in L1's VMX configuration, i.e. iff nested_cpu_has(vmcs12, ???).

Note, the ??? should be tied to whatever VMX MSR feature flag enumerates
INJECTED_EVENT_DATA.  KVM's clearing of X86_FEATURE_FRED when one or more pieces
is missing is a software decision, i.e. not archictectural.

> +			vmcs_write64(INJECTED_EVENT_DATA, vmcs12->injected_event_data);
>  		vmx->loaded_vmcs->nmi_known_unmasked =
>  			!(vmcs12->guest_interruptibility_info & GUEST_INTR_STATE_NMI);
>  	} else {
> @@ -2548,6 +2553,17 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
>  		vmcs_writel(GUEST_GDTR_BASE, vmcs12->guest_gdtr_base);
>  		vmcs_writel(GUEST_IDTR_BASE, vmcs12->guest_idtr_base);
>  
> +		if (kvm_cpu_cap_has(X86_FEATURE_FRED)) {

Same thing here.

> +			vmcs_write64(GUEST_IA32_FRED_CONFIG, vmcs12->guest_ia32_fred_config);
> +			vmcs_write64(GUEST_IA32_FRED_RSP1, vmcs12->guest_ia32_fred_rsp1);
> +			vmcs_write64(GUEST_IA32_FRED_RSP2, vmcs12->guest_ia32_fred_rsp2);
> +			vmcs_write64(GUEST_IA32_FRED_RSP3, vmcs12->guest_ia32_fred_rsp3);
> +			vmcs_write64(GUEST_IA32_FRED_STKLVLS, vmcs12->guest_ia32_fred_stklvls);
> +			vmcs_write64(GUEST_IA32_FRED_SSP1, vmcs12->guest_ia32_fred_ssp1);
> +			vmcs_write64(GUEST_IA32_FRED_SSP2, vmcs12->guest_ia32_fred_ssp2);
> +			vmcs_write64(GUEST_IA32_FRED_SSP3, vmcs12->guest_ia32_fred_ssp3);
> +		}
> +
>  		vmx->segment_cache.bitmask = 0;
>  	}
>  
> @@ -3835,6 +3851,22 @@ vmcs12_guest_cr4(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  			vcpu->arch.cr4_guest_owned_bits));
>  }
>  
> +static inline unsigned long
> +nested_vmx_get_event_data(struct kvm_vcpu *vcpu, bool for_ex_vmexit)

Heh, two form letters for the price of one:

#1
Do not use "inline" for functions that are visible only to the local compilation
unit.  "inline" is just a hint, and modern compilers are smart enough to inline
functions when appropriate without a hint.

A longer explanation/rant here: https://lore.kernel.org/all/ZAdfX+S323JVWNZC@google.com

#2
Do not wrap before the function name.  Linus has a nice explanation/rant on this[*].

[*] https://lore.kernel.org/all/CAHk-=wjoLAYG446ZNHfg=GhjSY6nFmuB_wA8fYd5iLBNXjo9Bw@mail.gmail.com

> +{
> +	struct kvm_queued_exception *ex = for_ex_vmexit ?
> +		&vcpu->arch.exception_vmexit : &vcpu->arch.exception;
> +
> +	if (ex->has_payload)
> +		return ex->payload;
> +	else if (ex->vector == PF_VECTOR)
> +		return vcpu->arch.cr2;
> +	else if (ex->vector == DB_VECTOR)
> +		return (vcpu->arch.dr6 & ~DR6_BT) ^ DR6_ACTIVE_LOW;
> +	else
> +		return 0;

I'll circle back to this on the next version, i.e. after it's reworked to account
for the suggested payload changes.  I highly doubt it's correct as-is.

>  static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
>  				      struct vmcs12 *vmcs12,
>  				      u32 vm_exit_reason, u32 exit_intr_info)
> @@ -3842,6 +3874,8 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
>  	u32 idt_vectoring;
>  	unsigned int nr;
>  
> +	vmcs12->original_event_data = 0;
> +
>  	/*
>  	 * Per the SDM, VM-Exits due to double and triple faults are never
>  	 * considered to occur during event delivery, even if the double/triple
> @@ -3880,6 +3914,12 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
>  				vcpu->arch.exception.error_code;
>  		}
>  
> +		idt_vectoring |= vcpu->arch.exception.nested ?
> +				INTR_INFO_NESTED_EXCEPTION_MASK : 0;

Please stop using ternary operators this way.  It's less readable and the same
number of lines as:

		if (vcpu->arch.exception.nested)
			idt_vectoring |= INTR_INFO_NESTED_EXCEPTION_MASK;

> +
> +		vmcs12->original_event_data =
> +			nested_vmx_get_event_data(vcpu, false);
> +
>  		vmcs12->idt_vectoring_info_field = idt_vectoring;
>  	} else if (vcpu->arch.nmi_injected) {
>  		vmcs12->idt_vectoring_info_field =
> @@ -3970,19 +4010,7 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu)
>  	struct kvm_queued_exception *ex = &vcpu->arch.exception_vmexit;
>  	u32 intr_info = ex->vector | INTR_INFO_VALID_MASK;
>  	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> -	unsigned long exit_qual;
> -
> -	if (ex->has_payload) {
> -		exit_qual = ex->payload;
> -	} else if (ex->vector == PF_VECTOR) {
> -		exit_qual = vcpu->arch.cr2;
> -	} else if (ex->vector == DB_VECTOR) {
> -		exit_qual = vcpu->arch.dr6;
> -		exit_qual &= ~DR6_BT;
> -		exit_qual ^= DR6_ACTIVE_LOW;
> -	} else {
> -		exit_qual = 0;
> -	}
> +	unsigned long exit_qual = nested_vmx_get_event_data(vcpu, true);

This can't possibly be correct, EXIT_QUAL and EVENT_DATA aren't equivalent, e.g.
the former doesn't have XFD_ERR, but the latter does.

>  	/*
>  	 * Unlike AMD's Paged Real Mode, which reports an error code on #PF
> @@ -4003,10 +4031,12 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu)
>  		intr_info |= INTR_INFO_DELIVER_CODE_MASK;
>  	}
>  
> -	if (kvm_exception_is_soft(ex->vector))
> +	if (kvm_exception_is_soft(ex->vector)) {
>  		intr_info |= INTR_TYPE_SOFT_EXCEPTION;
> -	else
> +	} else {
>  		intr_info |= INTR_TYPE_HARD_EXCEPTION;
> +		intr_info |= ex->nested ? INTR_INFO_NESTED_EXCEPTION_MASK : 0;

Again,

		if (ex->nested)
			intr_info |= INTR_INFO_NESTED_EXCEPTION_MASK;

> +	if (kvm_cpu_cap_has(X86_FEATURE_FRED)) {

And here

> +		vmcs12->guest_ia32_fred_config = vmcs_read64(GUEST_IA32_FRED_CONFIG);
> +		vmcs12->guest_ia32_fred_rsp1 = vmcs_read64(GUEST_IA32_FRED_RSP1);
> +		vmcs12->guest_ia32_fred_rsp2 = vmcs_read64(GUEST_IA32_FRED_RSP2);
> +		vmcs12->guest_ia32_fred_rsp3 = vmcs_read64(GUEST_IA32_FRED_RSP3);
> +		vmcs12->guest_ia32_fred_stklvls = vmcs_read64(GUEST_IA32_FRED_STKLVLS);
> +		vmcs12->guest_ia32_fred_ssp1 = vmcs_read64(GUEST_IA32_FRED_SSP1);
> +		vmcs12->guest_ia32_fred_ssp2 = vmcs_read64(GUEST_IA32_FRED_SSP2);
> +		vmcs12->guest_ia32_fred_ssp3 = vmcs_read64(GUEST_IA32_FRED_SSP3);
> +	}
> +
>  	vmcs12->guest_pending_dbg_exceptions =
>  		vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS);
>  
> @@ -4625,6 +4675,17 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>  	vmcs_write32(GUEST_IDTR_LIMIT, 0xFFFF);
>  	vmcs_write32(GUEST_GDTR_LIMIT, 0xFFFF);
>  
> +	if (kvm_cpu_cap_has(X86_FEATURE_FRED)) {

And here

  reply	other threads:[~2024-06-13 18:29 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-07 17:26 [PATCH v2 00/25] Enable FRED with KVM VMX Xin Li
2024-02-07 17:26 ` [PATCH v2 01/25] KVM: VMX: Cleanup VMX basic information defines and usages Xin Li
2024-02-07 17:26 ` [PATCH v2 02/25] KVM: VMX: Cleanup VMX misc " Xin Li
2024-02-07 17:26 ` [PATCH v2 03/25] KVM: VMX: Add support for the secondary VM exit controls Xin Li
2024-04-19 10:21   ` Chao Gao
2024-02-07 17:26 ` [PATCH v2 04/25] KVM: x86: Mark CR4.FRED as not reserved Xin Li
2024-04-19 10:22   ` Chao Gao
2024-06-12 21:12   ` Sean Christopherson
2024-06-13  3:27     ` Li, Xin3
2024-02-07 17:26 ` [PATCH v2 05/25] KVM: VMX: Initialize FRED VM entry/exit controls in vmcs_config Xin Li
2024-04-19 10:24   ` Chao Gao
2024-02-07 17:26 ` [PATCH v2 06/25] KVM: VMX: Defer enabling FRED MSRs save/load until after set CPUID Xin Li
2024-04-19 11:02   ` Chao Gao
2024-06-12 21:19   ` Sean Christopherson
2024-06-13  3:31     ` Li, Xin3
2024-02-07 17:26 ` [PATCH v2 07/25] KVM: VMX: Set intercept for FRED MSRs Xin Li
2024-04-19 13:35   ` Chao Gao
2024-04-19 17:06     ` Li, Xin3
2024-06-12 21:32     ` Sean Christopherson
2024-09-05 17:09       ` Xin Li
2024-09-12 20:05         ` Sean Christopherson
2024-09-18  8:35           ` Xin Li
2024-09-25 14:12             ` Sean Christopherson
2024-09-25 22:13               ` Xin Li
2024-09-27 17:48                 ` Xin Li
2024-09-30 16:56                   ` Sean Christopherson
2024-06-12 21:20   ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 08/25] KVM: VMX: Initialize VMCS FRED fields Xin Li
2024-04-19 14:01   ` Chao Gao
2024-04-19 17:02     ` Li, Xin3
2024-06-12 21:41   ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 09/25] KVM: VMX: Switch FRED RSP0 between host and guest Xin Li
2024-04-19 14:23   ` Chao Gao
2024-04-19 16:37     ` Li, Xin3
2024-06-12 21:53   ` Sean Christopherson
2024-07-10 15:51     ` Li, Xin3
2024-07-12 15:12       ` Sean Christopherson
2024-07-12 16:15         ` Li, Xin3
2024-07-12 16:27           ` Sean Christopherson
2024-07-12 17:17             ` Li, Xin3
2024-07-12 19:30               ` Sean Christopherson
2024-07-17 17:31               ` Li, Xin3
2024-07-18 13:59                 ` Sean Christopherson
2024-07-18 17:44                   ` Li, Xin3
2024-07-18 21:04         ` H. Peter Anvin
2024-07-19 15:59           ` Sean Christopherson
2024-07-21 18:09             ` Li, Xin3
2024-02-07 17:26 ` [PATCH v2 10/25] KVM: VMX: Add support for FRED context save/restore Xin Li
2024-04-29  6:31   ` Chao Gao
2024-06-12 22:09   ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 11/25] KVM: x86: Add kvm_is_fred_enabled() Xin Li
2024-04-29  8:24   ` Chao Gao
2024-05-11  1:24     ` Li, Xin3
2024-05-11  1:53       ` Chao Gao
2024-06-12 22:13   ` Sean Christopherson
2024-06-13 16:55   ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 12/25] KVM: VMX: Handle FRED event data Xin Li
2024-04-30  3:14   ` Chao Gao
2024-05-10  9:36     ` Li, Xin3
2024-05-11  3:03       ` Chao Gao
2024-06-12 23:31         ` Sean Christopherson
2024-06-13  5:29           ` Chao Gao
2024-06-13 17:57             ` Sean Christopherson
2024-06-12 22:52   ` Sean Christopherson
2024-06-13 16:57   ` Sean Christopherson
2024-06-13 18:02     ` Li, Xin3
2024-02-07 17:26 ` [PATCH v2 13/25] KVM: VMX: Handle VMX nested exception for FRED Xin Li
2024-04-30  7:34   ` Chao Gao
2024-06-13 17:01   ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 14/25] KVM: VMX: Disable FRED if FRED consistency checks fail Xin Li
2024-04-30  8:21   ` Chao Gao
2024-06-13 18:00     ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 15/25] KVM: VMX: Dump FRED context in dump_vmcs() Xin Li
2024-04-30  9:09   ` Chao Gao
2024-06-12 23:55     ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 16/25] KVM: VMX: Invoke vmx_set_cpu_caps() before nested setup Xin Li
2024-02-07 17:26 ` [PATCH v2 17/25] KVM: nVMX: Add support for the secondary VM exit controls Xin Li
2024-06-13 18:11   ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 18/25] KVM: nVMX: Add a prerequisite to SHADOW_FIELD_R[OW] macros Xin Li
2024-06-13 18:16   ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 19/25] KVM: nVMX: Add FRED VMCS fields Xin Li
2024-06-13 18:29   ` Sean Christopherson [this message]
2024-02-07 17:26 ` [PATCH v2 20/25] KVM: nVMX: Add support for VMX FRED controls Xin Li
2024-02-07 17:26 ` [PATCH v2 21/25] KVM: nVMX: Add VMCS FRED states checking Xin Li
2024-02-07 17:26 ` [PATCH v2 22/25] KVM: x86: Allow FRED/LKGS/WRMSRNS to be exposed to guests Xin Li
2024-06-13 18:31   ` Sean Christopherson
2024-02-07 17:26 ` [PATCH v2 23/25] KVM: selftests: Run debug_regs test with FRED enabled Xin Li
2024-02-07 17:26 ` [PATCH v2 24/25] KVM: selftests: Add a new VM guest mode to run user level code Xin Li
2024-02-07 17:26 ` [PATCH v2 25/25] KVM: selftests: Add fred exception tests Xin Li
2024-03-29 20:18   ` Muhammad Usama Anjum
2024-03-29 20:18     ` Muhammad Usama Anjum
2024-04-24 16:08       ` Sean Christopherson
2024-03-27  8:08 ` [PATCH v2 00/25] Enable FRED with KVM VMX Kang, Shan
2024-06-13 18:38   ` Sean Christopherson
2024-06-14  0:52     ` Li, Xin3
2024-04-15 17:58 ` Li, Xin3

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zms6jkwA9PfvXCGv@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=x86@kernel.org \
    --cc=xin3.li@intel.com \
    --cc=xin@zytor.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).