Re: [PATCH v4 08/19] KVM: VMX: Add support for FRED context save/restore

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: "Xin Li (Intel)" <xin@zytor.com>
Cc: pbonzini@redhat.com, kvm@vger.kernel.org,
	linux-doc@vger.kernel.org,  linux-kernel@vger.kernel.org,
	corbet@lwn.net, tglx@linutronix.de,  mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	 hpa@zytor.com, andrew.cooper3@citrix.com, luto@kernel.org,
	 peterz@infradead.org, chao.gao@intel.com, xin3.li@intel.com
Subject: Re: [PATCH v4 08/19] KVM: VMX: Add support for FRED context save/restore
Date: Tue, 24 Jun 2025 09:27:16 -0700	[thread overview]
Message-ID: <aFrR5Nk1Ge3_ApWy@google.com> (raw)
In-Reply-To: <20250328171205.2029296-9-xin@zytor.com>

On Fri, Mar 28, 2025, Xin Li (Intel) wrote:
> From: Xin Li <xin3.li@intel.com>
> 
> Handle FRED MSR access requests, allowing FRED context to be set/get
> from both host and guest.
> 
> During VM save/restore and live migration, FRED context needs to be
> saved/restored, which requires FRED MSRs to be accessed from userspace,
> e.g., Qemu.
> 
> Note, handling of MSR_IA32_FRED_SSP0, i.e., MSR_IA32_PL0_SSP, is not
> added yet, which is done in the KVM CET patch set.
> 
> Signed-off-by: Xin Li <xin3.li@intel.com>
> Signed-off-by: Xin Li (Intel) <xin@zytor.com>
> Tested-by: Shan Kang <shan.kang@intel.com>
> ---
> 
> Changes since v2:
> * Add a helper to convert FRED MSR index to VMCS field encoding to
>   make the code more compact (Chao Gao).
> * Get rid of the "host_initiated" check because userspace has to set
>   CPUID before MSRs (Chao Gao & Sean Christopherson).
> * Address a few cleanup comments (Sean Christopherson).
> 
> Changes since v1:
> * Use kvm_cpu_cap_has() instead of cpu_feature_enabled() (Chao Gao).
> * Fail host requested FRED MSRs access if KVM cannot virtualize FRED
>   (Chao Gao).
> * Handle the case FRED MSRs are valid but KVM cannot virtualize FRED
>   (Chao Gao).
> * Add sanity checks when writing to FRED MSRs.
> ---
>  arch/x86/kvm/vmx/vmx.c | 48 ++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c     | 28 ++++++++++++++++++++++++
>  2 files changed, 76 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 1fd32aa255f9..ae9712624413 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1426,6 +1426,24 @@ static void vmx_write_guest_kernel_gs_base(struct vcpu_vmx *vmx, u64 data)
>  	preempt_enable();
>  	vmx->msr_guest_kernel_gs_base = data;
>  }
> +
> +static u64 vmx_read_guest_fred_rsp0(struct vcpu_vmx *vmx)
> +{
> +	preempt_disable();
> +	if (vmx->guest_state_loaded)
> +		vmx->msr_guest_fred_rsp0 = read_msr(MSR_IA32_FRED_RSP0);
> +	preempt_enable();
> +	return vmx->msr_guest_fred_rsp0;
> +}
> +
> +static void vmx_write_guest_fred_rsp0(struct vcpu_vmx *vmx, u64 data)
> +{
> +	preempt_disable();
> +	if (vmx->guest_state_loaded)
> +		wrmsrns(MSR_IA32_FRED_RSP0, data);
> +	preempt_enable();
> +	vmx->msr_guest_fred_rsp0 = data;
> +}
>  #endif

Maybe add helpers to deal with the preemption stuff?  Oh, never mind, FRED
uses WRMSRNS.  Hmm, actually, can't these all be non-serializing?  KVM is
progating *guest* values to hardware, so a VM-Enter is guaranteed before the
CPU value can be consumed.

#ifdef CONFIG_X86_64
static u64 vmx_read_guest_host_msr(struct vcpu_vmx *vmx, u32 msr, u64 *cache)
{
	preempt_disable();
	if (vmx->guest_state_loaded)
		*cache = read_msr(msr);
	preempt_enable();
	return *cache;
}

static u64 vmx_write_guest_host_msr(struct vcpu_vmx *vmx, u32 msr, u64 data,
				    u64 *cache)
{
	preempt_disable();
	if (vmx->guest_state_loaded)
		wrmsrns(MSR_KERNEL_GS_BASE, data);
	preempt_enable();
	*cache = data;
}

static u64 vmx_read_guest_kernel_gs_base(struct vcpu_vmx *vmx)
{
	return vmx_read_guest_host_msr(vmx, MSR_KERNEL_GS_BASE,
				       &vmx->msr_guest_kernel_gs_base);
}

static void vmx_write_guest_kernel_gs_base(struct vcpu_vmx *vmx, u64 data)
{
	vmx_write_guest_host_msr(vmx, MSR_KERNEL_GS_BASE, data,
				 &vmx->msr_guest_kernel_gs_base);
}

static u64 vmx_read_guest_fred_rsp0(struct vcpu_vmx *vmx)
{
	return vmx_read_guest_host_msr(vmx, MSR_IA32_FRED_RSP0,
				       &vmx->msr_guest_fred_rsp0);
}

static void vmx_write_guest_fred_rsp0(struct vcpu_vmx *vmx, u64 data)
{
	return vmx_write_guest_host_msr(vmx, MSR_IA32_FRED_RSP0, data,
				        &vmx->msr_guest_fred_rsp0);
}
#endif

>  static void grow_ple_window(struct kvm_vcpu *vcpu)
> @@ -2039,6 +2057,24 @@ int vmx_get_feature_msr(u32 msr, u64 *data)
>  	}
>  }
>  
> +#ifdef CONFIG_X86_64
> +static u32 fred_msr_vmcs_fields[] = {

This should be const.

> +	GUEST_IA32_FRED_RSP1,
> +	GUEST_IA32_FRED_RSP2,
> +	GUEST_IA32_FRED_RSP3,
> +	GUEST_IA32_FRED_STKLVLS,
> +	GUEST_IA32_FRED_SSP1,
> +	GUEST_IA32_FRED_SSP2,
> +	GUEST_IA32_FRED_SSP3,
> +	GUEST_IA32_FRED_CONFIG,
> +};

I think it also makes sense to add a static_assert() here, more so to help
readers follow along than anything else.

static_assert(MSR_IA32_FRED_CONFIG - MSR_IA32_FRED_RSP1 ==
	      ARRAY_SIZE(fred_msr_vmcs_fields) - 1);

> +
> +static u32 fred_msr_to_vmcs(u32 msr)
> +{
> +	return fred_msr_vmcs_fields[msr - MSR_IA32_FRED_RSP1];
> +}
> +#endif
> +
>  /*
>   * Reads an msr value (of 'msr_info->index') into 'msr_info->data'.
>   * Returns 0 on success, non-0 otherwise.
> @@ -2061,6 +2097,12 @@ int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_KERNEL_GS_BASE:
>  		msr_info->data = vmx_read_guest_kernel_gs_base(vmx);
>  		break;
> +	case MSR_IA32_FRED_RSP0:
> +		msr_info->data = vmx_read_guest_fred_rsp0(vmx);
> +		break;
> +	case MSR_IA32_FRED_RSP1 ... MSR_IA32_FRED_CONFIG:
> +		msr_info->data = vmcs_read64(fred_msr_to_vmcs(msr_info->index));
> +		break;
>  #endif
>  	case MSR_EFER:
>  		return kvm_get_msr_common(vcpu, msr_info);
> @@ -2268,6 +2310,12 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			vmx_update_exception_bitmap(vcpu);
>  		}
>  		break;
> +	case MSR_IA32_FRED_RSP0:
> +		vmx_write_guest_fred_rsp0(vmx, data);
> +		break;
> +	case MSR_IA32_FRED_RSP1 ... MSR_IA32_FRED_CONFIG:
> +		vmcs_write64(fred_msr_to_vmcs(msr_index), data);
> +		break;
>  #endif
>  	case MSR_IA32_SYSENTER_CS:
>  		if (is_guest_mode(vcpu))
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c841817a914a..007577143337 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -318,6 +318,9 @@ static const u32 msrs_to_save_base[] = {
>  	MSR_STAR,
>  #ifdef CONFIG_X86_64
>  	MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
> +	MSR_IA32_FRED_RSP0, MSR_IA32_FRED_RSP1, MSR_IA32_FRED_RSP2,
> +	MSR_IA32_FRED_RSP3, MSR_IA32_FRED_STKLVLS, MSR_IA32_FRED_SSP1,
> +	MSR_IA32_FRED_SSP2, MSR_IA32_FRED_SSP3, MSR_IA32_FRED_CONFIG,
>  #endif
>  	MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
>  	MSR_IA32_FEAT_CTL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
> @@ -1849,6 +1852,23 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
>  
>  		data = (u32)data;
>  		break;
> +	case MSR_IA32_FRED_RSP0 ... MSR_IA32_FRED_CONFIG:
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_FRED))
> +			return 1;

Yeesh, this is a bit of a no-win situation.  Having to re-check the MSR index is
no fun, but the amount of overlap between MSRs is significant, i.e. I see why you
bundled everything together.  Ugh, and MSR_IA32_FRED_STKLVLS is buried smack dab
in the middle of everything.

> +
> +		/* Bit 11, bits 5:4, and bit 2 of the IA32_FRED_CONFIG must be zero */

Eh, the comment isn't helping much.  If we want to add more documentation, add
#defines.  But I think we can documented the reserved behavior while also tidying
up the code a bit.

After much fiddling, how about this?

	case MSR_IA32_FRED_STKLVLS:
		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_FRED))
			return 1;
		break;			

	case MSR_IA32_FRED_RSP0 ... MSR_IA32_FRED_RSP3:
	case MSR_IA32_FRED_SSP1 ... MSR_IA32_FRED_CONFIG: {
		u64 reserved_bits;

		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_FRED))
			return 1;

		if (is_noncanonical_msr_address(data, vcpu))
			return 1;

		switch (index) {
		case MSR_IA32_FRED_CONFIG:
			reserved_bits = BIT_ULL(11) | GENMASK_ULL(5, 4) | BIT_ULL(2);
			break;
		case MSR_IA32_FRED_RSP0 ... MSR_IA32_FRED_RSP3:
			reserved_bits = GENMASK_ULL(5, 0);
			break;
		case MSR_IA32_FRED_SSP1 ... MSR_IA32_FRED_SSP3:
			reserved_bits = GENMASK_ULL(2, 0);
			break;
		default:
			WARN_ON_ONCE(1);
			return 1;
		}
		if (data & reserved_bits)
			return 1;
		break;
	}

> @@ -1893,6 +1913,10 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
>  		    !guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID))
>  			return 1;
>  		break;
> +	case MSR_IA32_FRED_RSP0 ... MSR_IA32_FRED_CONFIG:
> +		if (!guest_cpu_cap_has(vcpu, X86_FEATURE_FRED))
> +			return 1;
> +		break;
>  	}
>  
>  	msr.index = index;
> @@ -7455,6 +7479,10 @@ static void kvm_probe_msr_to_save(u32 msr_index)
>  		if (!(kvm_get_arch_capabilities() & ARCH_CAP_TSX_CTRL_MSR))
>  			return;
>  		break;
> +	case MSR_IA32_FRED_RSP0 ... MSR_IA32_FRED_CONFIG:
> +		if (!kvm_cpu_cap_has(X86_FEATURE_FRED))
> +			return;
> +		break;
>  	default:
>  		break;
>  	}
> -- 
> 2.48.1
>

next prev parent reply	other threads:[~2025-06-24 16:27 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-28 17:11 [PATCH v4 00/19] Enable FRED with KVM VMX Xin Li (Intel)
2025-03-28 17:11 ` [PATCH v4 01/19] KVM: VMX: Add support for the secondary VM exit controls Xin Li (Intel)
2025-03-28 17:11 ` [PATCH v4 02/19] KVM: VMX: Initialize VM entry/exit FRED controls in vmcs_config Xin Li (Intel)
2025-04-14  7:41   ` Chao Gao
2025-04-14 16:53     ` Xin Li
2025-03-28 17:11 ` [PATCH v4 03/19] KVM: VMX: Disable FRED if FRED consistency checks fail Xin Li (Intel)
2025-06-24 15:20   ` Sean Christopherson
2025-03-28 17:11 ` [PATCH v4 04/19] x86/cea: Export per CPU array 'cea_exception_stacks' for KVM to use Xin Li (Intel)
2025-04-10  8:53   ` Christoph Hellwig
2025-04-10 14:18     ` Dave Hansen
2025-04-11 16:16       ` Xin Li
2025-03-28 17:11 ` [PATCH v4 05/19] KVM: VMX: Initialize VMCS FRED fields Xin Li (Intel)
2025-03-28 17:11 ` [PATCH v4 06/19] KVM: VMX: Set FRED MSR interception Xin Li (Intel)
2025-06-24 15:27   ` Sean Christopherson
2025-03-28 17:11 ` [PATCH v4 07/19] KVM: VMX: Save/restore guest FRED RSP0 Xin Li (Intel)
2025-06-24 15:44   ` Sean Christopherson
2025-03-28 17:11 ` [PATCH v4 08/19] KVM: VMX: Add support for FRED context save/restore Xin Li (Intel)
2025-06-24 16:27   ` Sean Christopherson [this message]
2025-06-25 17:18     ` Xin Li
2025-06-26 17:22       ` Xin Li
2025-06-26 20:50         ` Sean Christopherson
2025-03-28 17:11 ` [PATCH v4 09/19] KVM: x86: Add a helper to detect if FRED is enabled for a vCPU Xin Li (Intel)
2025-03-28 17:11 ` [PATCH v4 10/19] KVM: VMX: Virtualize FRED event_data Xin Li (Intel)
2025-03-28 17:11 ` [PATCH v4 11/19] KVM: VMX: Virtualize FRED nested exception tracking Xin Li (Intel)
2025-03-28 17:11 ` [PATCH v4 12/19] KVM: x86: Save/restore the nested flag of an exception Xin Li (Intel)
2025-03-28 17:11 ` [PATCH v4 13/19] KVM: x86: Mark CR4.FRED as not reserved Xin Li (Intel)
2025-03-28 17:12 ` [PATCH v4 14/19] KVM: VMX: Dump FRED context in dump_vmcs() Xin Li (Intel)
2025-06-24 16:32   ` Sean Christopherson
2025-06-25 17:38     ` Xin Li
2025-03-28 17:12 ` [PATCH v4 15/19] KVM: x86: Allow FRED/LKGS to be advertised to guests Xin Li (Intel)
2025-06-24 16:38   ` Sean Christopherson
2025-06-25 18:05     ` Xin Li
2025-06-25 18:29       ` Sean Christopherson
2025-03-28 17:12 ` [PATCH v4 16/19] KVM: nVMX: Add support for the secondary VM exit controls Xin Li (Intel)
2025-06-24 16:54   ` Sean Christopherson
2025-03-28 17:12 ` [PATCH v4 17/19] KVM: nVMX: Add FRED VMCS fields to nested VMX context management Xin Li (Intel)
2025-03-28 17:12 ` [PATCH v4 18/19] KVM: nVMX: Add VMCS FRED states checking Xin Li (Intel)
2025-03-28 17:12 ` [PATCH v4 19/19] KVM: nVMX: Allow VMX FRED controls Xin Li (Intel)
2025-03-28 17:25 ` [PATCH v4 00/19] Enable FRED with KVM VMX Xin Li
2025-06-24 17:06   ` Sean Christopherson
2025-06-24 17:43     ` Xin Li
2025-06-24 17:47       ` H. Peter Anvin
2025-06-24 18:02         ` Xin Li
2025-06-24 18:40           ` H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aFrR5Nk1Ge3_ApWy@google.com \
    --to=seanjc@google.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=bp@alien8.de \
    --cc=chao.gao@intel.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=xin3.li@intel.com \
    --cc=xin@zytor.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.