All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yang Zhong <yang.zhong@intel.com>
To: x86@kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org,
	seanjc@google.com
Cc: jun.nakajima@intel.com, kevin.tian@intel.com,
	jing2.liu@linux.intel.com, jing2.liu@intel.com,
	guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com
Subject: [PATCH v5 21/21] kvm: x86: Disable interception for IA32_XFD on demand
Date: Wed,  5 Jan 2022 04:35:32 -0800	[thread overview]
Message-ID: <20220105123532.12586-22-yang.zhong@intel.com> (raw)
In-Reply-To: <20220105123532.12586-1-yang.zhong@intel.com>

From: Kevin Tian <kevin.tian@intel.com>

Always intercepting IA32_XFD causes non-negligible overhead when this
register is updated frequently in the guest.

Disable r/w emulation after intercepting the first WRMSR(IA32_XFD)
with a non-zero value.

Disable WRMSR emulation implies that IA32_XFD becomes out-of-sync
with the software states in fpstate and the per-cpu xfd cache. This
leads to two additional changes accordingly:

  - Call fpu_sync_guest_vmexit_xfd_state() after vm-exit to bring
    software states back in-sync with the MSR, before handle_exit_irqoff()
    is called.

  - Always trap #NM once write interception is disabled for IA32_XFD.
    The #NM exception is rare if the guest doesn't use dynamic
    features. Otherwise, there is at most one exception per guest
    task given a dynamic feature.

p.s. We have confirmed that SDM is being revised to say that
when setting IA32_XFD[18] the AMX register state is not guaranteed
to be preserved. This clarification avoids adding mess for a creative
guest which sets IA32_XFD[18]=1 before saving active AMX state to
its own storage.

Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/vmx/vmx.c          | 24 +++++++++++++++++++-----
 arch/x86/kvm/vmx/vmx.h          |  2 +-
 arch/x86/kvm/x86.c              |  8 ++++++++
 4 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5d97f4adc1cb..2e8ffe269c23 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -647,6 +647,7 @@ struct kvm_vcpu_arch {
 	u64 smi_count;
 	bool tpr_access_reporting;
 	bool xsaves_enabled;
+	bool xfd_no_write_intercept;
 	u64 ia32_xss;
 	u64 microcode_version;
 	u64 arch_capabilities;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b601dc8a7f13..a3be4b02df2c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -162,6 +162,7 @@ static u32 vmx_possible_passthrough_msrs[MAX_POSSIBLE_PASSTHROUGH_MSRS] = {
 	MSR_FS_BASE,
 	MSR_GS_BASE,
 	MSR_KERNEL_GS_BASE,
+	MSR_IA32_XFD,
 	MSR_IA32_XFD_ERR,
 #endif
 	MSR_IA32_SYSENTER_CS,
@@ -764,10 +765,11 @@ void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu)
 	}
 
 	/*
-	 * Trap #NM if guest xfd contains a non-zero value so guest XFD_ERR
-	 * can be saved timely.
+	 * Disabling xfd interception indicates that dynamic xfeatures
+	 * might be used in the guest. Always trap #NM in this case
+	 * to save guest xfd_err timely.
 	 */
-	if (vcpu->arch.guest_fpu.fpstate->xfd)
+	if (vcpu->arch.xfd_no_write_intercept)
 		eb |= (1u << NM_VECTOR);
 
 	vmcs_write32(EXCEPTION_BITMAP, eb);
@@ -1978,9 +1980,21 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_XFD:
 		ret = kvm_set_msr_common(vcpu, msr_info);
-		/* Update #NM interception according to guest xfd */
-		if (!ret)
+		/*
+		 * Always intercepting WRMSR could incur non-negligible
+		 * overhead given xfd might be changed frequently in
+		 * guest context switch. Disable write interception
+		 * upon the first write with a non-zero value (indicating
+		 * potential usage on dynamic xfeatures). Also update
+		 * exception bitmap to trap #NM for proper virtualization
+		 * of guest xfd_err.
+		 */
+		if (!ret && data) {
+			vmx_disable_intercept_for_msr(vcpu, MSR_IA32_XFD,
+						      MSR_TYPE_RW);
+			vcpu->arch.xfd_no_write_intercept = true;
 			vmx_update_exception_bitmap(vcpu);
+		}
 		break;
 #endif
 	case MSR_IA32_SYSENTER_CS:
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index e0922c80c9d1..7f2c82e7f38f 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -352,7 +352,7 @@ struct vcpu_vmx {
 	struct lbr_desc lbr_desc;
 
 	/* Save desired MSR intercept (read: pass-through) state */
-#define MAX_POSSIBLE_PASSTHROUGH_MSRS	14
+#define MAX_POSSIBLE_PASSTHROUGH_MSRS	15
 	struct {
 		DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS);
 		DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 92a1ea13ad51..df0150671b98 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10084,6 +10084,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	vcpu->mode = OUTSIDE_GUEST_MODE;
 	smp_wmb();
 
+	/*
+	 * Sync xfd before calling handle_exit_irqoff() which may
+	 * rely on the fact that guest_fpu::xfd is up-to-date (e.g.
+	 * in #NM irqoff handler).
+	 */
+	if (vcpu->arch.xfd_no_write_intercept)
+		fpu_sync_guest_vmexit_xfd_state();
+
 	static_call(kvm_x86_handle_exit_irqoff)(vcpu);
 
 	if (vcpu->arch.guest_fpu.xfd_err)

      parent reply	other threads:[~2022-01-05 12:36 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-05 12:35 [PATCH v5 00/21] AMX Support in KVM Yang Zhong
2022-01-05 12:35 ` [PATCH v5 01/21] x86/fpu: Extend fpu_xstate_prctl() with guest permissions Yang Zhong
2022-01-05 12:35 ` [PATCH v5 02/21] x86/fpu: Prepare guest FPU for dynamically enabled FPU features Yang Zhong
2022-01-05 12:35 ` [PATCH v5 03/21] kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule Yang Zhong
2022-01-05 12:35 ` [PATCH v5 04/21] kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID Yang Zhong
2022-01-05 12:35 ` [PATCH v5 05/21] x86/fpu: Make XFD initialization in __fpstate_reset() a function argument Yang Zhong
2022-01-05 12:35 ` [PATCH v5 06/21] x86/fpu: Add guest support to xfd_enable_feature() Yang Zhong
2022-01-05 12:35 ` [PATCH v5 07/21] x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM Yang Zhong
2022-01-05 13:06   ` Paolo Bonzini
2022-01-06  0:51     ` Tian, Kevin
2022-01-06  6:02     ` Yang Zhong
2022-01-05 12:35 ` [PATCH v5 08/21] kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2 Yang Zhong
2022-01-05 12:35 ` [PATCH v5 09/21] x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation Yang Zhong
2022-01-05 12:35 ` [PATCH v5 10/21] kvm: x86: Add emulation for IA32_XFD Yang Zhong
2022-01-05 12:35 ` [PATCH v5 11/21] x86/fpu: Prepare xfd_err in struct fpu_guest Yang Zhong
2022-01-05 12:35 ` [PATCH v5 12/21] kvm: x86: Intercept #NM for saving IA32_XFD_ERR Yang Zhong
2022-01-05 22:22   ` Sean Christopherson
2022-01-05 22:30     ` Jim Mattson
2022-01-06  1:04       ` Tian, Kevin
2022-01-06  0:55     ` Tian, Kevin
2022-01-05 12:35 ` [PATCH v5 13/21] kvm: x86: Emulate IA32_XFD_ERR for guest Yang Zhong
2022-01-05 12:35 ` [PATCH v5 14/21] kvm: x86: Disable RDMSR interception of IA32_XFD_ERR Yang Zhong
2022-01-05 12:35 ` [PATCH v5 15/21] kvm: x86: Add XCR0 support for Intel AMX Yang Zhong
2022-01-05 12:35 ` [PATCH v5 16/21] kvm: x86: Add CPUID " Yang Zhong
2022-01-05 22:29   ` Sean Christopherson
2022-01-06  0:52     ` Tian, Kevin
2022-01-05 12:35 ` [PATCH v5 17/21] x86/fpu: Add uabi_size to guest_fpu Yang Zhong
2022-01-05 12:35 ` [PATCH v5 18/21] kvm: x86: Add support for getting/setting expanded xstate buffer Yang Zhong
2022-01-05 12:35 ` [PATCH v5 19/21] kvm: selftests: Add support for KVM_CAP_XSAVE2 Yang Zhong
2022-01-05 12:35 ` [PATCH v5 20/21] x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() Yang Zhong
2022-01-05 12:35 ` Yang Zhong [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220105123532.12586-22-yang.zhong@intel.com \
    --to=yang.zhong@intel.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=guang.zeng@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=jing2.liu@linux.intel.com \
    --cc=jun.nakajima@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=wei.w.wang@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.