public inbox for kvm@vger.kernel.org
* [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization
@ 2023-04-14  6:25 Chao Gao
  2023-04-14  6:25 ` [RFC PATCH v2 01/11] x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO Chao Gao
                   ` (11 more replies)
  0 siblings, 12 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Chao Gao, Alexandre Chartre, Arnaldo Carvalho de Melo,
	Babu Moger, Borislav Petkov, Borislav Petkov, Daniel Sneddon,
	Dave Hansen, David Matlack, H. Peter Anvin, Ingo Molnar,
	Josh Poimboeuf, Kim Phillips, Len Brown, linux-kernel,
	linux-kselftest, Nikunj A Dadhania, Paolo Bonzini, Pawan Gupta,
	Peter Zijlstra, Sandipan Das, Sean Christopherson, Shuah Khan,
	Thomas Gleixner, Vitaly Kuznetsov, x86, Zhang Chen

Changes since RFC v1:
 * add two kselftests (patch 10-11)
 * set virtual MSRs also on APs [Pawan]
 * enable "virtualize IA32_SPEC_CTRL" for L2 to prevent L2 from changing
   some bits of IA32_SPEC_CTRL (patch 4)
 * other misc cleanup and cosmetic changes

RFC v1: https://lore.kernel.org/lkml/20221210160046.2608762-1-chen.zhang@intel.com/


This series introduces "virtualize IA32_SPEC_CTRL" support. Below are an
introduction to this new feature and its use cases.

### Virtualize IA32_SPEC_CTRL

"Virtualize IA32_SPEC_CTRL" [1] is a new VMX feature on Intel CPUs. This feature
allows VMM to lock some bits of IA32_SPEC_CTRL MSR even when the MSR is
pass-thru'd to a guest.


### Use cases of "virtualize IA32_SPEC_CTRL" [2]

Software mitigations like retpoline and the software BHB-clearing sequence
depend on the CPU microarchitecture, and a guest cannot know exactly which
microarchitecture it is running on. When a guest is migrated between processors
of different microarchitectures, software mitigations that work perfectly on
the previous microarchitecture may not be effective on the new one. To fix this
problem, some hardware mitigations should be used in conjunction with software
mitigations. Using "virtualize IA32_SPEC_CTRL", the VMM can enforce hardware
mitigations transparently to guests and prevent those mitigations from being
unintentionally disabled when a guest writes the IA32_SPEC_CTRL MSR.


### Intention of this series

This series adds to KVM the capability of enforcing hardware mitigations for
guests transparently and efficiently (i.e., without intercepting
IA32_SPEC_CTRL MSR accesses). The capability can be used to solve the VM
migration issue in a pool consisting of processors of different
microarchitectures.

Specifically, below are two target scenarios of this series:

Scenario 1: If retpoline is used by a VM to mitigate IMBTI in CPL0, the VMM
	    can set RRSBA_DIS_S on parts that enumerate RRSBA. Note that the
	    VM is presented with a microarchitecture that doesn't enumerate
	    RRSBA.

Scenario 2: If a VM uses the software BHB-clearing sequence on transitions
	    into CPL0 to mitigate BHI, the VMM can use "virtualize
	    IA32_SPEC_CTRL" to set BHI_DIS_S on new parts that don't
	    enumerate BHI_NO.

Intel defines some virtual MSRs [2] through which guests can report in-use
software mitigations. This allows guests to opt in to the VMM deploying
hardware mitigations for them if the guests are running on, or are later
migrated to, a system on which the in-use software mitigations are not
effective. This virtual MSR interface is also added by this series.
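The interaction sketched above can be modeled in a few lines of plain C. This
is a hypothetical helper, not code from this series: it maps the mitigations a
guest reports as in-use (via the MSR_VIRTUAL_MITIGATION_CTRL bits defined in
patch 5) to the IA32_SPEC_CTRL bits the VMM would force on, given what the
host enumerates:

```c
#include <stdint.h>

/* Bit layout taken from the MSR definitions added in patches 1 and 5. */
#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED	(1ULL << 0)
#define MITI_CTRL_RETPOLINE_S_USED	(1ULL << 1)
#define SPEC_CTRL_RRSBA_DIS_S		(1ULL << 6)
#define SPEC_CTRL_BHI_DIS_S		(1ULL << 10)

/*
 * Hypothetical helper: translate the software mitigations a guest reports
 * as in-use into the IA32_SPEC_CTRL bits the VMM should lock on, depending
 * on whether the host exhibits RRSBA behavior and/or lacks BHI_NO.
 */
static uint64_t spec_ctrl_bits_to_force(uint64_t miti_ctrl,
					int host_has_rrsba,
					int host_lacks_bhi_no)
{
	uint64_t mask = 0;

	/* Retpoline in CPL0 is defeated by RRSBA; force RRSBA_DIS_S. */
	if ((miti_ctrl & MITI_CTRL_RETPOLINE_S_USED) && host_has_rrsba)
		mask |= SPEC_CTRL_RRSBA_DIS_S;

	/* The software BHB-clearing sequence is ineffective on newer parts. */
	if ((miti_ctrl & MITI_CTRL_BHB_CLEAR_SEQ_S_USED) && host_lacks_bhi_no)
		mask |= SPEC_CTRL_BHI_DIS_S;

	return mask;
}
```

The resulting mask is exactly what patch 4's IA32_SPEC_CTRL_MASK mechanism is
designed to enforce without intercepting MSR accesses.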

### Organization of this series

1. Patch 1-3	Advertise RRSBA_CTRL and BHI_CTRL to guest
2. Patch 4	Add "virtualize IA32_SPEC_CTRL" support
3. Patch 5-9	Allow guests to report in-use software mitigations to KVM so
                that KVM can enable hardware mitigations for guests.
4. Patch 10-11	Add kselftest for virtual MSRs and IA32_SPEC_CTRL

[1]: https://cdrdv2.intel.com/v1/dl/getContent/671368 Ref. #319433-047 Chapter 12
[2]: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html

Chao Gao (3):
  KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
  KVM: selftests: Add tests for virtual enumeration/mitigation MSRs
  KVM: selftests: Add tests for IA32_SPEC_CTRL MSR

Pawan Gupta (1):
  x86/bugs: Use Virtual MSRs to request hardware mitigations

Zhang Chen (7):
  x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO
  KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  KVM: x86: Advertise BHI_CTRL support
  KVM: VMX: Add IA32_SPEC_CTRL virtualization support
  KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
  KVM: VMX: Advertise MITIGATION_CTRL support
  KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT

 arch/x86/include/asm/msr-index.h              |  33 +++-
 arch/x86/include/asm/vmx.h                    |   5 +
 arch/x86/include/asm/vmxfeatures.h            |   2 +
 arch/x86/kernel/cpu/bugs.c                    |  25 +++
 arch/x86/kvm/cpuid.c                          |  22 ++-
 arch/x86/kvm/reverse_cpuid.h                  |   8 +
 arch/x86/kvm/svm/svm.c                        |   3 +
 arch/x86/kvm/vmx/capabilities.h               |   5 +
 arch/x86/kvm/vmx/nested.c                     |  13 ++
 arch/x86/kvm/vmx/vmcs.h                       |   2 +
 arch/x86/kvm/vmx/vmx.c                        | 112 ++++++++++-
 arch/x86/kvm/vmx/vmx.h                        |  43 ++++-
 arch/x86/kvm/x86.c                            |  19 +-
 tools/arch/x86/include/asm/msr-index.h        |  37 +++-
 tools/testing/selftests/kvm/Makefile          |   2 +
 .../selftests/kvm/include/x86_64/processor.h  |   5 +
 .../selftests/kvm/x86_64/spec_ctrl_msr_test.c | 178 ++++++++++++++++++
 .../kvm/x86_64/virtual_mitigation_msr_test.c  | 175 +++++++++++++++++
 18 files changed, 676 insertions(+), 13 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
 create mode 100644 tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c


base-commit: 400d2132288edbd6d500f45eab5d85526ca94e46
-- 
2.40.0



* [RFC PATCH v2 01/11] x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-04-14  9:52   ` Binbin Wu
  2023-04-14  6:25 ` [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support Chao Gao
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Zhang Chen, Chao Gao, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
	Pawan Gupta, Daniel Sneddon, Sandipan Das, Nikunj A Dadhania,
	Arnaldo Carvalho de Melo, linux-kernel

From: Zhang Chen <chen.zhang@intel.com>

To ensure that migrating a VM from a system where a software mitigation works
to a system where it doesn't won't lower the guest's security level, KVM must
mitigate BHI attacks for guests, since migration is transparent to guests and
guests won't and can't react to it.

For example, the simple BHB-clearing sequence [1] is effective in mitigating
BHI attacks on processors prior to Alder Lake, but it is not on Alder Lake.
Guests migrated from a pre-Alder Lake host to an Alder Lake host become
vulnerable to BHI attacks even if the simple BHB-clearing sequence is
deployed. In this case, KVM can enable the hardware mitigation for such
guests by setting the BHI_DIS_S bit of the IA32_SPEC_CTRL MSR.

Define the SPEC_CTRL_BHI_DIS_S bit of the IA32_SPEC_CTRL MSR and the BHI_NO
bit of the IA32_ARCH_CAPABILITIES MSR, which will be used by KVM later.

[1]: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html#inpage-nav-2-4

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Jiaan Lu <jiaan.lu@intel.com>
---
 arch/x86/include/asm/msr-index.h       | 8 +++++++-
 tools/arch/x86/include/asm/msr-index.h | 8 +++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index ad35355ee43e..60b25d87b82c 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -48,8 +48,10 @@
 #define SPEC_CTRL_STIBP			BIT(SPEC_CTRL_STIBP_SHIFT)	/* STIBP mask */
 #define SPEC_CTRL_SSBD_SHIFT		2	   /* Speculative Store Bypass Disable bit */
 #define SPEC_CTRL_SSBD			BIT(SPEC_CTRL_SSBD_SHIFT)	/* Speculative Store Bypass Disable */
-#define SPEC_CTRL_RRSBA_DIS_S_SHIFT	6	   /* Disable RRSBA behavior */
+#define SPEC_CTRL_RRSBA_DIS_S_SHIFT	6	   /* Disable RRSBA behavior in supervisor mode */
 #define SPEC_CTRL_RRSBA_DIS_S		BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
+#define SPEC_CTRL_BHI_DIS_S_SHIFT	10	   /* Disable BHI behavior in supervisor mode */
+#define SPEC_CTRL_BHI_DIS_S		BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)
 
 /* A mask for bits which the kernel toggles when controlling mitigations */
 #define SPEC_CTRL_MITIGATIONS_MASK	(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
@@ -151,6 +153,10 @@
 						 * are restricted to targets in
 						 * kernel.
 						 */
+#define ARCH_CAP_BHI_NO			BIT(20)	/*
+						 * Not susceptible to Branch History
+						 * Injection.
+						 */
 #define ARCH_CAP_PBRSB_NO		BIT(24)	/*
 						 * Not susceptible to Post-Barrier
 						 * Return Stack Buffer Predictions.
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index ad35355ee43e..6079a5fdb40b 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -48,8 +48,10 @@
 #define SPEC_CTRL_STIBP			BIT(SPEC_CTRL_STIBP_SHIFT)	/* STIBP mask */
 #define SPEC_CTRL_SSBD_SHIFT		2	   /* Speculative Store Bypass Disable bit */
 #define SPEC_CTRL_SSBD			BIT(SPEC_CTRL_SSBD_SHIFT)	/* Speculative Store Bypass Disable */
-#define SPEC_CTRL_RRSBA_DIS_S_SHIFT	6	   /* Disable RRSBA behavior */
+#define SPEC_CTRL_RRSBA_DIS_S_SHIFT	6	   /* Disable RRSBA behavior in supervisor mode */
 #define SPEC_CTRL_RRSBA_DIS_S		BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
+#define SPEC_CTRL_BHI_DIS_S_SHIFT	10         /* Disable BHI behavior in supervisor mode */
+#define SPEC_CTRL_BHI_DIS_S		BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)
 
 /* A mask for bits which the kernel toggles when controlling mitigations */
 #define SPEC_CTRL_MITIGATIONS_MASK	(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
@@ -151,6 +153,10 @@
 						 * are restricted to targets in
 						 * kernel.
 						 */
+#define ARCH_CAP_BHI_NO			BIT(20) /*
+						 * Not susceptible to Branch History
+						 * Injection.
+						 */
 #define ARCH_CAP_PBRSB_NO		BIT(24)	/*
 						 * Not susceptible to Post-Barrier
 						 * Return Stack Buffer Predictions.
-- 
2.40.0



* [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
  2023-04-14  6:25 ` [RFC PATCH v2 01/11] x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-04-16  7:04   ` Binbin Wu
  2023-05-15  6:53   ` Xiaoyao Li
  2023-04-14  6:25 ` [RFC PATCH v2 03/11] KVM: x86: Advertise BHI_CTRL support Chao Gao
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Zhang Chen, Chao Gao, Sean Christopherson,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, linux-kernel

From: Zhang Chen <chen.zhang@intel.com>

Add a KVM-only CPUID feature leaf for CPUID.7.2.EDX, with RRSBA_CTRL as the
first feature in the leaf.

RRSBA_CTRL is enumerated by CPUID.7.2.EDX[2]. If supported, RRSBA_DIS_U
(bit 5) and RRSBA_DIS_S (bit 6) of IA32_SPEC_CTRL MSR can be used to
disable RRSBA behavior for CPL3 and CPL0/1/2 respectively.

Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses after a
non-zero value is written to the MSR. Therefore, guests can already toggle
the two bits if the host supports RRSBA_CTRL, and no extra code is needed to
allow guests to toggle them.
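As a quick illustration of the enumeration described above (a sketch, not
part of the patch), the bits a guest may legitimately toggle once RRSBA_CTRL
appears in its CPUID.7.2.EDX are:

```c
#include <stdint.h>

#define CPUID_7_2_EDX_RRSBA_CTRL	(1u << 2)    /* RRSBA_CTRL enumerated */
#define SPEC_CTRL_RRSBA_DIS_U		(1ULL << 5)  /* disable RRSBA for CPL3 */
#define SPEC_CTRL_RRSBA_DIS_S		(1ULL << 6)  /* disable RRSBA for CPL0/1/2 */

/*
 * Hypothetical helper: which IA32_SPEC_CTRL bits become writable for a
 * guest once RRSBA_CTRL shows up in its CPUID.7.2.EDX.
 */
static uint64_t rrsba_togglable_bits(uint32_t cpuid_7_2_edx)
{
	if (cpuid_7_2_edx & CPUID_7_2_EDX_RRSBA_CTRL)
		return SPEC_CTRL_RRSBA_DIS_U | SPEC_CTRL_RRSBA_DIS_S;
	return 0;
}
```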

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Jiaan Lu <jiaan.lu@intel.com>
---
 arch/x86/kvm/cpuid.c         | 22 +++++++++++++++++++---
 arch/x86/kvm/reverse_cpuid.h |  7 +++++++
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 9583a110cf5f..f024c3ac2203 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -685,6 +685,10 @@ void kvm_set_cpu_caps(void)
 		SF(SGX1) | SF(SGX2) | SF(SGX_EDECCSSA)
 	);
 
+	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
+		SF(RRSBA_CTRL)
+	);
+
 	kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
 		F(LAHF_LM) | F(CMP_LEGACY) | 0 /*SVM*/ | 0 /* ExtApicSpace */ |
 		F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
@@ -949,13 +953,14 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 		break;
 	/* function 7 has additional index. */
 	case 7:
-		entry->eax = min(entry->eax, 1u);
+		entry->eax = min(entry->eax, 2u);
 		cpuid_entry_override(entry, CPUID_7_0_EBX);
 		cpuid_entry_override(entry, CPUID_7_ECX);
 		cpuid_entry_override(entry, CPUID_7_EDX);
 
-		/* KVM only supports 0x7.0 and 0x7.1, capped above via min(). */
-		if (entry->eax == 1) {
+		max_idx = entry->eax;
+
+		if (max_idx >= 1) {
 			entry = do_host_cpuid(array, function, 1);
 			if (!entry)
 				goto out;
@@ -965,6 +970,17 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 			entry->ebx = 0;
 			entry->ecx = 0;
 		}
+
+		if (max_idx >= 2) {
+			entry = do_host_cpuid(array, function, 2);
+			if (!entry)
+				goto out;
+
+			cpuid_entry_override(entry, CPUID_7_2_EDX);
+			entry->eax = 0;
+			entry->ebx = 0;
+			entry->ecx = 0;
+		}
 		break;
 	case 0xa: { /* Architectural Performance Monitoring */
 		union cpuid10_eax eax;
diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index a5717282bb9c..72bad8314a9c 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -15,6 +15,7 @@ enum kvm_only_cpuid_leafs {
 	CPUID_12_EAX	 = NCAPINTS,
 	CPUID_7_1_EDX,
 	CPUID_8000_0007_EDX,
+	CPUID_7_2_EDX,
 	NR_KVM_CPU_CAPS,
 
 	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
@@ -47,6 +48,9 @@ enum kvm_only_cpuid_leafs {
 /* CPUID level 0x80000007 (EDX). */
 #define KVM_X86_FEATURE_CONSTANT_TSC	KVM_X86_FEATURE(CPUID_8000_0007_EDX, 8)
 
+/* Intel-defined sub-features, CPUID level 0x00000007:2 (EDX) */
+#define KVM_X86_FEATURE_RRSBA_CTRL	KVM_X86_FEATURE(CPUID_7_2_EDX, 2)
+
 struct cpuid_reg {
 	u32 function;
 	u32 index;
@@ -69,6 +73,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
 	[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
 	[CPUID_7_EDX]         = {         7, 0, CPUID_EDX},
 	[CPUID_7_1_EAX]       = {         7, 1, CPUID_EAX},
+	[CPUID_7_2_EDX]       = {         7, 2, CPUID_EDX},
 	[CPUID_12_EAX]        = {0x00000012, 0, CPUID_EAX},
 	[CPUID_8000_001F_EAX] = {0x8000001f, 0, CPUID_EAX},
 	[CPUID_7_1_EDX]       = {         7, 1, CPUID_EDX},
@@ -108,6 +113,8 @@ static __always_inline u32 __feature_translate(int x86_feature)
 		return KVM_X86_FEATURE_SGX_EDECCSSA;
 	else if (x86_feature == X86_FEATURE_CONSTANT_TSC)
 		return KVM_X86_FEATURE_CONSTANT_TSC;
+	else if (x86_feature == X86_FEATURE_RRSBA_CTRL)
+		return KVM_X86_FEATURE_RRSBA_CTRL;
 
 	return x86_feature;
 }
-- 
2.40.0



* [RFC PATCH v2 03/11] KVM: x86: Advertise BHI_CTRL support
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
  2023-04-14  6:25 ` [RFC PATCH v2 01/11] x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO Chao Gao
  2023-04-14  6:25 ` [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-05-15  7:14   ` Xiaoyao Li
  2023-04-14  6:25 ` [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support Chao Gao
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Zhang Chen, Chao Gao, Sean Christopherson,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, linux-kernel

From: Zhang Chen <chen.zhang@intel.com>

Add a 100% KVM-only feature flag for BHI_CTRL, because the kernel doesn't use
it at all.

BHI_CTRL is enumerated by CPUID.7.2.EDX[4]. If supported, BHI_DIS_S (bit
10) of IA32_SPEC_CTRL MSR can be used to enable BHI_DIS_S behavior.

Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses after a
non-zero value is written to the MSR. Therefore, guests can already toggle
the BHI_DIS_S bit if the host supports BHI_CTRL, and no extra code is needed
to allow guests to toggle the bit.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Jiaan Lu <jiaan.lu@intel.com>
---
 arch/x86/kvm/cpuid.c         | 2 +-
 arch/x86/kvm/reverse_cpuid.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index f024c3ac2203..7cdd859d09a2 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -686,7 +686,7 @@ void kvm_set_cpu_caps(void)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
-		SF(RRSBA_CTRL)
+		SF(RRSBA_CTRL) | F(BHI_CTRL)
 	);
 
 	kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index 72bad8314a9c..e7e70c9aa384 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -50,6 +50,7 @@ enum kvm_only_cpuid_leafs {
 
 /* Intel-defined sub-features, CPUID level 0x00000007:2 (EDX) */
 #define KVM_X86_FEATURE_RRSBA_CTRL	KVM_X86_FEATURE(CPUID_7_2_EDX, 2)
+#define X86_FEATURE_BHI_CTRL		KVM_X86_FEATURE(CPUID_7_2_EDX, 4)
 
 struct cpuid_reg {
 	u32 function;
-- 
2.40.0



* [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
                   ` (2 preceding siblings ...)
  2023-04-14  6:25 ` [RFC PATCH v2 03/11] KVM: x86: Advertise BHI_CTRL support Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-04-17  3:17   ` Binbin Wu
                     ` (2 more replies)
  2023-04-14  6:25 ` [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations Chao Gao
                   ` (7 subsequent siblings)
  11 siblings, 3 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Zhang Chen, Chao Gao, Sean Christopherson,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, linux-kernel

From: Zhang Chen <chen.zhang@intel.com>

Currently, KVM disables interception of IA32_SPEC_CTRL after the guest writes
a non-zero value to the MSR. From then on, the guest is allowed to write any
value directly to hardware.

"virtualize IA32_SPEC_CTRL" is a new tertiary vm-exec control. This
feature allows KVM to specify that certain bits of the IA32_SPEC_CTRL
MSR cannot be modified by guest software.

Two VMCS fields are added:

  IA32_SPEC_CTRL_MASK:   bits that guest software cannot modify
  IA32_SPEC_CTRL_SHADOW: value that guest software expects to be in the
			 IA32_SPEC_CTRL MSR

On rdmsr, the shadow value is returned. On wrmsr, EDX:EAX is written to
IA32_SPEC_CTRL_SHADOW, and (cur_val & mask) | (EDX:EAX & ~mask) is written
to the IA32_SPEC_CTRL MSR, where
  * cur_val is the original value of the IA32_SPEC_CTRL MSR
  * mask is the value of IA32_SPEC_CTRL_MASK
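The rdmsr/wrmsr semantics above can be modeled in a few lines of plain C (a
sketch of the architectural behavior, not KVM code):

```c
#include <stdint.h>

/* Model of the two new VMCS fields plus the effective MSR value. */
struct spec_ctrl_virt {
	uint64_t mask;		/* IA32_SPEC_CTRL_MASK: bits guest can't modify */
	uint64_t shadow;	/* IA32_SPEC_CTRL_SHADOW: value guest reads back */
	uint64_t real;		/* effective IA32_SPEC_CTRL value in hardware */
};

/* rdmsr: the guest always sees the shadow value. */
static uint64_t guest_rdmsr(const struct spec_ctrl_virt *s)
{
	return s->shadow;
}

/* wrmsr: masked bits keep their current value; the rest take the new one. */
static void guest_wrmsr(struct spec_ctrl_virt *s, uint64_t val)
{
	s->shadow = val;
	s->real = (s->real & s->mask) | (val & ~s->mask);
}
```

For example, with BHI_DIS_S (bit 10) in the mask and set in hardware, a guest
write of 0 updates the shadow to 0 but leaves bit 10 set in the real MSR.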

Add a software mask, i.e., loaded_vmcs->spec_ctrl_mask, to represent the bits
the guest shouldn't change. It is 0 for now, and some bits will be added by
following patches. Use a per-VMCS cache to avoid unnecessary vmcs_write()s on
nested transitions, because the mask is expected to change rarely and to be
the same for vmcs01 and vmcs02.

To prevent the guest from changing the bits in the mask, enable "virtualize
IA32_SPEC_CTRL" if it is supported, or emulate its behavior by intercepting
the IA32_SPEC_CTRL MSR. Emulating the "virtualize IA32_SPEC_CTRL" behavior is
mainly to give the same capability to KVM running on potentially broken
hardware or in L1 guests.

To prevent L2 from evading the enforcement, enable "virtualize
IA32_SPEC_CTRL" in vmcs02. Always update the guest (shadow) value of the
IA32_SPEC_CTRL MSR and the mask to preserve them across nested transitions.
Note that the shadow value may change because L2 may access IA32_SPEC_CTRL
directly, and the mask may change due to migration while L2 vCPUs are
running.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Jiaan Lu <jiaan.lu@intel.com>
---
 arch/x86/include/asm/vmx.h         |  5 ++++
 arch/x86/include/asm/vmxfeatures.h |  2 ++
 arch/x86/kvm/vmx/capabilities.h    |  5 ++++
 arch/x86/kvm/vmx/nested.c          | 13 ++++++++++
 arch/x86/kvm/vmx/vmcs.h            |  2 ++
 arch/x86/kvm/vmx/vmx.c             | 34 ++++++++++++++++++++-----
 arch/x86/kvm/vmx/vmx.h             | 40 +++++++++++++++++++++++++++++-
 7 files changed, 94 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 498dc600bd5c..b9f88ecf20c3 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -81,6 +81,7 @@
  * Definitions of Tertiary Processor-Based VM-Execution Controls.
  */
 #define TERTIARY_EXEC_IPI_VIRT			VMCS_CONTROL_BIT(IPI_VIRT)
+#define TERTIARY_EXEC_SPEC_CTRL_VIRT		VMCS_CONTROL_BIT(SPEC_CTRL_VIRT)
 
 #define PIN_BASED_EXT_INTR_MASK                 VMCS_CONTROL_BIT(INTR_EXITING)
 #define PIN_BASED_NMI_EXITING                   VMCS_CONTROL_BIT(NMI_EXITING)
@@ -233,6 +234,10 @@ enum vmcs_field {
 	TERTIARY_VM_EXEC_CONTROL_HIGH	= 0x00002035,
 	PID_POINTER_TABLE		= 0x00002042,
 	PID_POINTER_TABLE_HIGH		= 0x00002043,
+	IA32_SPEC_CTRL_MASK             = 0x0000204A,
+	IA32_SPEC_CTRL_MASK_HIGH        = 0x0000204B,
+	IA32_SPEC_CTRL_SHADOW           = 0x0000204C,
+	IA32_SPEC_CTRL_SHADOW_HIGH      = 0x0000204D,
 	GUEST_PHYSICAL_ADDRESS          = 0x00002400,
 	GUEST_PHYSICAL_ADDRESS_HIGH     = 0x00002401,
 	VMCS_LINK_POINTER               = 0x00002800,
diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
index c6a7eed03914..c70d0769b7d0 100644
--- a/arch/x86/include/asm/vmxfeatures.h
+++ b/arch/x86/include/asm/vmxfeatures.h
@@ -89,4 +89,6 @@
 
 /* Tertiary Processor-Based VM-Execution Controls, word 3 */
 #define VMX_FEATURE_IPI_VIRT		( 3*32+  4) /* Enable IPI virtualization */
+#define VMX_FEATURE_SPEC_CTRL_VIRT	( 3*32+  7) /* Enable IA32_SPEC_CTRL virtualization */
+
 #endif /* _ASM_X86_VMXFEATURES_H */
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 45162c1bcd8f..a7ab70b55acf 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -138,6 +138,11 @@ static inline bool cpu_has_tertiary_exec_ctrls(void)
 		CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
 }
 
+static __always_inline bool cpu_has_spec_ctrl_virt(void)
+{
+	return vmcs_config.cpu_based_3rd_exec_ctrl & TERTIARY_EXEC_SPEC_CTRL_VIRT;
+}
+
 static inline bool cpu_has_vmx_virtualize_apic_accesses(void)
 {
 	return vmcs_config.cpu_based_2nd_exec_ctrl &
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f63b28f46a71..96861a42316d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -276,6 +276,8 @@ static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs)
 	vmx->loaded_vmcs = vmcs;
 	vmx_vcpu_load_vmcs(vcpu, cpu, prev);
 	vmx_sync_vmcs_host_state(vmx, prev);
+	vmx_set_spec_ctrl_mask(vmx, prev->spec_ctrl_mask);
+	vmx_set_guest_spec_ctrl(vmx, vmx_get_guest_spec_ctrl(vmx));
 	put_cpu();
 
 	vcpu->arch.regs_avail = ~VMX_REGS_LAZY_LOAD_SET;
@@ -2342,6 +2344,17 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
 		secondary_exec_controls_set(vmx, exec_control);
 	}
 
+	/*
+	 * TERTIARY EXEC CONTROLS
+	 */
+	if (cpu_has_tertiary_exec_ctrls()) {
+		exec_control = __tertiary_exec_controls_get(vmcs01);
+
+		/* IPI virtualization for L2 isn't supported */
+		exec_control &= ~TERTIARY_EXEC_IPI_VIRT;
+		tertiary_exec_controls_set(vmx, exec_control);
+	}
+
 	/*
 	 * ENTRY CONTROLS
 	 *
diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
index 7c1996b433e2..5f1fc43c8be0 100644
--- a/arch/x86/kvm/vmx/vmcs.h
+++ b/arch/x86/kvm/vmx/vmcs.h
@@ -73,6 +73,8 @@ struct loaded_vmcs {
 	struct list_head loaded_vmcss_on_cpu_link;
 	struct vmcs_host_state host_state;
 	struct vmcs_controls_shadow controls_shadow;
+	/* write-through cache of the SPEC_CTRL_MASK field in VMCS */
+	u64 spec_ctrl_mask;
 };
 
 static __always_inline bool is_intr_type(u32 intr_info, u32 type)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 56e0c7ae961d..9f6919bec2b3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1996,7 +1996,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		    !guest_has_spec_ctrl_msr(vcpu))
 			return 1;
 
-		msr_info->data = to_vmx(vcpu)->spec_ctrl;
+		msr_info->data = vmx->guest_spec_ctrl;
 		break;
 	case MSR_IA32_SYSENTER_CS:
 		msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
@@ -2145,7 +2145,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	struct vmx_uret_msr *msr;
 	int ret = 0;
 	u32 msr_index = msr_info->index;
-	u64 data = msr_info->data;
+	u64 data = msr_info->data, spec_ctrl_mask;
 	u32 index;
 
 	switch (msr_index) {
@@ -2256,13 +2256,22 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		    !guest_has_spec_ctrl_msr(vcpu))
 			return 1;
 
-		if (kvm_spec_ctrl_test_value(data))
+		spec_ctrl_mask = vmx_get_spec_ctrl_mask(vmx);
+		if (kvm_spec_ctrl_test_value(data | spec_ctrl_mask))
 			return 1;
 
-		vmx->spec_ctrl = data;
+		vmx_set_guest_spec_ctrl(vmx, data);
+
 		if (!data)
 			break;
 
+		/*
+		 * Don't disable intercept to prevent guest from changing some
+		 * bits if "virtualize IA32_SPEC_CTRL" isn't supported.
+		 */
+		if (spec_ctrl_mask && !cpu_has_spec_ctrl_virt())
+			break;
+
 		/*
 		 * For non-nested:
 		 * When it's written (to non-zero) for the first time, pass
@@ -4822,7 +4831,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 		__vmx_vcpu_reset(vcpu);
 
 	vmx->rmode.vm86_active = 0;
-	vmx->spec_ctrl = 0;
+	vmx_set_spec_ctrl_mask(vmx, 0);
+	vmx_set_guest_spec_ctrl(vmx, 0);
 
 	vmx->msr_ia32_umwait_control = 0;
 
@@ -7117,8 +7127,20 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
 	if (!cpu_feature_enabled(X86_FEATURE_MSR_SPEC_CTRL))
 		return;
 
-	if (flags & VMX_RUN_SAVE_SPEC_CTRL)
+	if (flags & VMX_RUN_SAVE_SPEC_CTRL) {
 		vmx->spec_ctrl = __rdmsr(MSR_IA32_SPEC_CTRL);
+		/*
+		 * For simplicity, always keep vmx->guest_spec_ctrl
+		 * up-to-date. The guest value is needed for userspace reads,
+		 * e.g. live migration, and for nested transitions. At a
+		 * later point, it is hard to tell which one is the latest
+		 * guest value since the intercept state may change.
+		 */
+		if (cpu_has_spec_ctrl_virt())
+			vmx->guest_spec_ctrl = vmcs_read64(IA32_SPEC_CTRL_SHADOW);
+		else
+			vmx->guest_spec_ctrl = vmx->spec_ctrl;
+	}
 
 	/*
 	 * If the guest/host SPEC_CTRL values differ, restore the host value.
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index cb766f65a3eb..021d86b52e18 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -291,6 +291,7 @@ struct vcpu_vmx {
 #endif
 
 	u64		      spec_ctrl;
+	u64		      guest_spec_ctrl;
 	u32		      msr_ia32_umwait_control;
 
 	/*
@@ -589,7 +590,8 @@ static inline u8 vmx_get_rvi(void)
 
 #define KVM_REQUIRED_VMX_TERTIARY_VM_EXEC_CONTROL 0
 #define KVM_OPTIONAL_VMX_TERTIARY_VM_EXEC_CONTROL			\
-	(TERTIARY_EXEC_IPI_VIRT)
+	(TERTIARY_EXEC_IPI_VIRT |					\
+	 TERTIARY_EXEC_SPEC_CTRL_VIRT)
 
 #define BUILD_CONTROLS_SHADOW(lname, uname, bits)						\
 static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val)			\
@@ -624,6 +626,20 @@ BUILD_CONTROLS_SHADOW(exec, CPU_BASED_VM_EXEC_CONTROL, 32)
 BUILD_CONTROLS_SHADOW(secondary_exec, SECONDARY_VM_EXEC_CONTROL, 32)
 BUILD_CONTROLS_SHADOW(tertiary_exec, TERTIARY_VM_EXEC_CONTROL, 64)
 
+static inline void vmx_set_spec_ctrl_mask(struct vcpu_vmx *vmx, u64 mask)
+{
+	if (vmx->loaded_vmcs->spec_ctrl_mask != mask) {
+		if (cpu_has_spec_ctrl_virt())
+			vmcs_write64(IA32_SPEC_CTRL_MASK, mask);
+		vmx->loaded_vmcs->spec_ctrl_mask = mask;
+	}
+}
+
+static inline u64 vmx_get_spec_ctrl_mask(struct vcpu_vmx *vmx)
+{
+	return vmx->loaded_vmcs->spec_ctrl_mask;
+}
+
 /*
  * VMX_REGS_LAZY_LOAD_SET - The set of registers that will be updated in the
  * cache on demand.  Other registers not listed here are synced to
@@ -750,4 +766,26 @@ static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
 	       to_vmx(vcpu)->nested.enlightened_vmcs_enabled;
 }
 
+static inline u64 vmx_get_guest_spec_ctrl(struct vcpu_vmx *vmx)
+{
+	return vmx->guest_spec_ctrl;
+}
+
+static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
+{
+	vmx->guest_spec_ctrl = val;
+
+	/*
+	 * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
+	 * regardless of the MSR intercept state.
+	 */
+	if (cpu_has_spec_ctrl_virt())
+		vmcs_write64(IA32_SPEC_CTRL_SHADOW, val);
+
+	/*
+	 * Update the effective value of IA32_SPEC_CTRL to reflect changes to
+	 * guest's IA32_SPEC_CTRL. Bits in the mask should always be set.
+	 */
+	vmx->spec_ctrl = val | vmx_get_spec_ctrl_mask(vmx);
+}
 #endif /* __KVM_X86_VMX_H */
-- 
2.40.0



* [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
                   ` (3 preceding siblings ...)
  2023-04-14  6:25 ` [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-04-17 13:43   ` Binbin Wu
  2023-04-14  6:25 ` [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support Chao Gao
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Pawan Gupta, Zhang Chen, Chao Gao, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Peter Zijlstra, Babu Moger, Daniel Sneddon, Sandipan Das,
	Nikunj A Dadhania, Josh Poimboeuf, Kim Phillips,
	Alexandre Chartre, linux-kernel

From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

Guests that have a different family/model than the host may not be aware of
hardware mitigations (such as RRSBA_DIS_S) available on the host. This is
particularly true when guests migrate. To solve this problem, Intel
processors have added a virtual MSR interface through which guests can report
their mitigation status and request the VMM to deploy relevant hardware
mitigations.

Use this virtualized MSR interface to request relevant hardware controls
for retpoline mitigation.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Co-developed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Jiaan Lu <jiaan.lu@intel.com>
---
 arch/x86/include/asm/msr-index.h | 25 +++++++++++++++++++++++++
 arch/x86/kernel/cpu/bugs.c       | 25 +++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 60b25d87b82c..aec213f0c6fc 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -166,6 +166,7 @@
 						 * IA32_XAPIC_DISABLE_STATUS MSR
 						 * supported
 						 */
+#define ARCH_CAP_VIRTUAL_ENUM		BIT_ULL(63) /* MSR_VIRTUAL_ENUMERATION supported */
 
 #define MSR_IA32_FLUSH_CMD		0x0000010b
 #define L1D_FLUSH			BIT(0)	/*
@@ -1103,6 +1104,30 @@
 #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
 #define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
 #define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE   0x1F
+
+/* Intel virtual MSRs */
+#define MSR_VIRTUAL_ENUMERATION			0x50000000
+#define VIRT_ENUM_MITIGATION_CTRL_SUPPORT	BIT(0)	/*
+							 * Mitigation ctrl via virtual
+							 * MSRs supported
+							 */
+
+#define MSR_VIRTUAL_MITIGATION_ENUM		0x50000001
+#define MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT	BIT(0)	/* VMM supports BHI_DIS_S */
+#define MITI_ENUM_RETPOLINE_S_SUPPORT		BIT(1)	/* VMM supports RRSBA_DIS_S */
+
+#define MSR_VIRTUAL_MITIGATION_CTRL		0x50000002
+#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT	0	/*
+							 * Request VMM to deploy
+							 * BHI_DIS_S mitigation
+							 */
+#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED		BIT(MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT)
+#define MITI_CTRL_RETPOLINE_S_USED_BIT		1	/*
+							 * Request VMM to deploy
+							 * RRSBA_DIS_S mitigation
+							 */
+#define MITI_CTRL_RETPOLINE_S_USED		BIT(MITI_CTRL_RETPOLINE_S_USED_BIT)
+
 /* AMD-V MSRs */
 
 #define MSR_VM_CR                       0xc0010114
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index f9d060e71c3e..5326c03d9d5e 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1435,6 +1435,27 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
 	dump_stack();
 }
 
+/* Speculation control using virtualized MSRs */
+static void spec_ctrl_setup_virtualized_msr(void)
+{
+	u64 msr_virt_enum, msr_mitigation_enum;
+
+	/* When retpoline is being used, request relevant hardware controls */
+	if (!boot_cpu_has(X86_FEATURE_RETPOLINE))
+		return;
+
+	if (!(x86_read_arch_cap_msr() & ARCH_CAP_VIRTUAL_ENUM))
+		return;
+
+	rdmsrl(MSR_VIRTUAL_ENUMERATION, msr_virt_enum);
+	if (!(msr_virt_enum & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
+		return;
+
+	rdmsrl(MSR_VIRTUAL_MITIGATION_ENUM, msr_mitigation_enum);
+	if (msr_mitigation_enum & MITI_ENUM_RETPOLINE_S_SUPPORT)
+		msr_set_bit(MSR_VIRTUAL_MITIGATION_CTRL, MITI_CTRL_RETPOLINE_S_USED_BIT);
+}
+
 static void __init spectre_v2_select_mitigation(void)
 {
 	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -1546,6 +1567,8 @@ static void __init spectre_v2_select_mitigation(void)
 	    mode == SPECTRE_V2_RETPOLINE)
 		spec_ctrl_disable_kernel_rrsba();
 
+	spec_ctrl_setup_virtualized_msr();
+
 	spectre_v2_enabled = mode;
 	pr_info("%s\n", spectre_v2_strings[mode]);
 
@@ -2115,6 +2138,8 @@ void x86_spec_ctrl_setup_ap(void)
 
 	if (ssb_mode == SPEC_STORE_BYPASS_DISABLE)
 		x86_amd_ssb_disable();
+
+	spec_ctrl_setup_virtualized_msr();
 }
 
 bool itlb_multihit_kvm_mitigation;
-- 
2.40.0



* [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
                   ` (4 preceding siblings ...)
  2023-04-14  6:25 ` [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-05-18 10:14   ` Xiaoyao Li
  2023-04-14  6:25 ` [RFC PATCH v2 07/11] KVM: VMX: Advertise MITIGATION_CTRL support Chao Gao
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Zhang Chen, Chao Gao, Sean Christopherson,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, linux-kernel

From: Zhang Chen <chen.zhang@intel.com>

Bit 63 of the IA32_ARCH_CAPABILITIES MSR indicates the availability of
MSR_VIRTUAL_ENUMERATION (index 0x50000000), which enumerates paravirt
features such as mitigation enumeration, used by guests to hint to the
VMM which software mitigations are in use.

Advertise ARCH_CAP_VIRTUAL_ENUM support for VMX and emulate reads/writes
of MSR_VIRTUAL_ENUMERATION. For now, the MSR always reads as 0.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Jiaan Lu <jiaan.lu@intel.com>
---
 arch/x86/kvm/svm/svm.c |  1 +
 arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h |  1 +
 arch/x86/kvm/x86.c     | 16 +++++++++++++++-
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 57f241c5a371..195d0cf9309a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4093,6 +4093,7 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
 {
 	switch (index) {
 	case MSR_IA32_MCG_EXT_CTL:
+	case MSR_VIRTUAL_ENUMERATION:
 	case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
 		return false;
 	case MSR_IA32_SMBASE:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9f6919bec2b3..85419137decb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1943,6 +1943,8 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
 	return !(msr->data & ~valid_bits);
 }
 
+#define VIRTUAL_ENUMERATION_VALID_BITS	0ULL
+
 static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 {
 	switch (msr->index) {
@@ -1950,6 +1952,9 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 		if (!nested)
 			return 1;
 		return vmx_get_vmx_msr(&vmcs_config.nested, msr->index, &msr->data);
+	case MSR_VIRTUAL_ENUMERATION:
+		msr->data = VIRTUAL_ENUMERATION_VALID_BITS;
+		return 0;
 	default:
 		return KVM_MSR_RET_INVALID;
 	}
@@ -2096,6 +2101,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_DEBUGCTLMSR:
 		msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
 		break;
+	case MSR_VIRTUAL_ENUMERATION:
+		if (!msr_info->host_initiated &&
+		    !(vcpu->arch.arch_capabilities & ARCH_CAP_VIRTUAL_ENUM))
+			return 1;
+		msr_info->data = vmx->msr_virtual_enumeration;
+		break;
 	default:
 	find_uret_msr:
 		msr = vmx_find_uret_msr(vmx, msr_info->index);
@@ -2437,6 +2448,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		}
 		ret = kvm_set_msr_common(vcpu, msr_info);
 		break;
+	case MSR_VIRTUAL_ENUMERATION:
+		if (!msr_info->host_initiated)
+			return 1;
+		if (data & ~VIRTUAL_ENUMERATION_VALID_BITS)
+			return 1;
+
+		vmx->msr_virtual_enumeration = data;
+		break;
 
 	default:
 	find_uret_msr:
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 021d86b52e18..a7faaf9fdc26 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -292,6 +292,7 @@ struct vcpu_vmx {
 
 	u64		      spec_ctrl;
 	u64		      guest_spec_ctrl;
+	u64		      msr_virtual_enumeration;
 	u32		      msr_ia32_umwait_control;
 
 	/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3c58dbae7b4c..a1bc52bebdcc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1537,6 +1537,7 @@ static const u32 emulated_msrs_all[] = {
 
 	MSR_K7_HWCR,
 	MSR_KVM_POLL_CONTROL,
+	MSR_VIRTUAL_ENUMERATION,
 };
 
 static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
@@ -1570,6 +1571,7 @@ static const u32 msr_based_features_all[] = {
 	MSR_IA32_UCODE_REV,
 	MSR_IA32_ARCH_CAPABILITIES,
 	MSR_IA32_PERF_CAPABILITIES,
+	MSR_VIRTUAL_ENUMERATION,
 };
 
 static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
@@ -1591,7 +1593,8 @@ static unsigned int num_msr_based_features;
 	 ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \
 	 ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
 	 ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
-	 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO)
+	 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | \
+	 ARCH_CAP_VIRTUAL_ENUM)
 
 static u64 kvm_get_arch_capabilities(void)
 {
@@ -1610,6 +1613,17 @@ static u64 kvm_get_arch_capabilities(void)
 	 */
 	data |= ARCH_CAP_PSCHANGE_MC_NO;
 
+	/*
+	 * Virtual enumeration is a paravirt feature. The only usage for now
+	 * is to bridge the gap caused by microarchitecture changes between
+	 * different Intel processors. And its usage is linked to "virtualize
+	 * IA32_SPEC_CTRL" which is a VMX feature. Whether AMD SVM can benefit
+	 * from the same usage and how to implement it is still unclear. Limit
+	 * virtual enumeration to VMX.
+	 */
+	if (static_call(kvm_x86_has_emulated_msr)(NULL, MSR_VIRTUAL_ENUMERATION))
+		data |= ARCH_CAP_VIRTUAL_ENUM;
+
 	/*
 	 * If we're doing cache flushes (either "always" or "cond")
 	 * we will do one whenever the guest does a vmlaunch/vmresume.
-- 
2.40.0



* [RFC PATCH v2 07/11] KVM: VMX: Advertise MITIGATION_CTRL support
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
                   ` (5 preceding siblings ...)
  2023-04-14  6:25 ` [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-04-14  6:25 ` [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT Chao Gao
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Zhang Chen, Chao Gao, Sean Christopherson,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, linux-kernel

From: Zhang Chen <chen.zhang@intel.com>

Advertise MITIGATION_CTRL support and emulate accesses to two associated
MSRs.

MITIGATION_CTRL is enumerated by bit 0 of MSR_VIRTUAL_ENUMERATION. If
supported, two virtual MSRs MSR_VIRTUAL_MITIGATION_ENUM(0x50000001) and
MSR_VIRTUAL_MITIGATION_CTRL(0x50000002) are available.

The two MSRs allow a guest to report its software mitigation status.
This information is preserved across live migration, so KVM can
leverage it to deploy the necessary hardware mitigations and guarantee
that guests maintain the same security level after migration.

Note that MSR_VIRTUAL_MITIGATION_ENUM is also a feature MSR since each
bit in the MSR represents a software mitigation that the underlying VMM
understands.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Jiaan Lu <jiaan.lu@intel.com>
---
 arch/x86/kvm/svm/svm.c |  2 ++
 arch/x86/kvm/vmx/vmx.c | 36 +++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.h |  2 ++
 arch/x86/kvm/x86.c     |  3 +++
 4 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 195d0cf9309a..80bb7a62e9b2 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4094,6 +4094,8 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
 	switch (index) {
 	case MSR_IA32_MCG_EXT_CTL:
 	case MSR_VIRTUAL_ENUMERATION:
+	case MSR_VIRTUAL_MITIGATION_ENUM:
+	case MSR_VIRTUAL_MITIGATION_CTRL:
 	case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
 		return false;
 	case MSR_IA32_SMBASE:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 85419137decb..980498c4c30c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1943,7 +1943,9 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
 	return !(msr->data & ~valid_bits);
 }
 
-#define VIRTUAL_ENUMERATION_VALID_BITS	0ULL
+#define VIRTUAL_ENUMERATION_VALID_BITS	VIRT_ENUM_MITIGATION_CTRL_SUPPORT
+#define MITI_ENUM_VALID_BITS		0ULL
+#define MITI_CTRL_VALID_BITS		0ULL
 
 static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 {
@@ -1955,6 +1957,9 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 	case MSR_VIRTUAL_ENUMERATION:
 		msr->data = VIRTUAL_ENUMERATION_VALID_BITS;
 		return 0;
+	case MSR_VIRTUAL_MITIGATION_ENUM:
+		msr->data = MITI_ENUM_VALID_BITS;
+		return 0;
 	default:
 		return KVM_MSR_RET_INVALID;
 	}
@@ -2107,6 +2112,18 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		msr_info->data = vmx->msr_virtual_enumeration;
 		break;
+	case MSR_VIRTUAL_MITIGATION_ENUM:
+		if (!msr_info->host_initiated &&
+		    !(vmx->msr_virtual_enumeration & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
+			return 1;
+		msr_info->data = vmx->msr_virtual_mitigation_enum;
+		break;
+	case MSR_VIRTUAL_MITIGATION_CTRL:
+		if (!msr_info->host_initiated &&
+		    !(vmx->msr_virtual_enumeration & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
+			return 1;
+		msr_info->data = vmx->msr_virtual_mitigation_ctrl;
+		break;
 	default:
 	find_uret_msr:
 		msr = vmx_find_uret_msr(vmx, msr_info->index);
@@ -2456,7 +2473,23 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 
 		vmx->msr_virtual_enumeration = data;
 		break;
+	case MSR_VIRTUAL_MITIGATION_ENUM:
+		if (!msr_info->host_initiated)
+			return 1;
+		if (data & ~MITI_ENUM_VALID_BITS)
+			return 1;
+
+		vmx->msr_virtual_mitigation_enum = data;
+		break;
+	case MSR_VIRTUAL_MITIGATION_CTRL:
+		if (!msr_info->host_initiated &&
+		    !(vmx->msr_virtual_enumeration & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
+			return 1;
+		if (data & ~MITI_CTRL_VALID_BITS)
+			return 1;
 
+		vmx->msr_virtual_mitigation_ctrl = data;
+		break;
 	default:
 	find_uret_msr:
 		msr = vmx_find_uret_msr(vmx, msr_index);
@@ -4852,6 +4885,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vmx->rmode.vm86_active = 0;
 	vmx_set_spec_ctrl_mask(vmx, 0);
 	vmx_set_guest_spec_ctrl(vmx, 0);
+	vmx->msr_virtual_mitigation_ctrl = 0;
 
 	vmx->msr_ia32_umwait_control = 0;
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a7faaf9fdc26..b81480c879b5 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -293,6 +293,8 @@ struct vcpu_vmx {
 	u64		      spec_ctrl;
 	u64		      guest_spec_ctrl;
 	u64		      msr_virtual_enumeration;
+	u64		      msr_virtual_mitigation_enum;
+	u64		      msr_virtual_mitigation_ctrl;
 	u32		      msr_ia32_umwait_control;
 
 	/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a1bc52bebdcc..3b567dc03b27 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1538,6 +1538,8 @@ static const u32 emulated_msrs_all[] = {
 	MSR_K7_HWCR,
 	MSR_KVM_POLL_CONTROL,
 	MSR_VIRTUAL_ENUMERATION,
+	MSR_VIRTUAL_MITIGATION_ENUM,
+	MSR_VIRTUAL_MITIGATION_CTRL,
 };
 
 static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
@@ -1572,6 +1574,7 @@ static const u32 msr_based_features_all[] = {
 	MSR_IA32_ARCH_CAPABILITIES,
 	MSR_IA32_PERF_CAPABILITIES,
 	MSR_VIRTUAL_ENUMERATION,
+	MSR_VIRTUAL_MITIGATION_ENUM,
 };
 
 static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
-- 
2.40.0



* [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
                   ` (6 preceding siblings ...)
  2023-04-14  6:25 ` [RFC PATCH v2 07/11] KVM: VMX: Advertise MITIGATION_CTRL support Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-05-18 10:25   ` Xiaoyao Li
  2023-05-22  9:43   ` Liu, Jingqi
  2023-04-14  6:25 ` [RFC PATCH v2 09/11] KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT Chao Gao
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Chao Gao, Zhang Chen, Sean Christopherson,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, linux-kernel

Allow a guest to query whether the underlying VMM understands retpoline,
and to report the status of retpoline in supervisor mode, i.e. CPL < 3,
via MSR_VIRTUAL_MITIGATION_ENUM/CTRL.

Disable RRSBA behavior for the guest by setting RRSBA_DIS_S if the guest
is using retpoline and the processor exhibits the behavior.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Jiaan Lu <jiaan.lu@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 980498c4c30c..25afb4c3e55e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1944,8 +1944,8 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
 }
 
 #define VIRTUAL_ENUMERATION_VALID_BITS	VIRT_ENUM_MITIGATION_CTRL_SUPPORT
-#define MITI_ENUM_VALID_BITS		0ULL
-#define MITI_CTRL_VALID_BITS		0ULL
+#define MITI_ENUM_VALID_BITS		MITI_ENUM_RETPOLINE_S_SUPPORT
+#define MITI_CTRL_VALID_BITS		MITI_CTRL_RETPOLINE_S_USED
 
 static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 {
@@ -2173,7 +2173,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	struct vmx_uret_msr *msr;
 	int ret = 0;
 	u32 msr_index = msr_info->index;
-	u64 data = msr_info->data, spec_ctrl_mask;
+	u64 data = msr_info->data, arch_msr = 0, spec_ctrl_mask = 0;
 	u32 index;
 
 	switch (msr_index) {
@@ -2488,6 +2488,24 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		if (data & ~MITI_CTRL_VALID_BITS)
 			return 1;
 
+		if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+			rdmsrl(MSR_IA32_ARCH_CAPABILITIES, arch_msr);
+
+		if (data & MITI_CTRL_RETPOLINE_S_USED &&
+		    kvm_cpu_cap_has(X86_FEATURE_RRSBA_CTRL) &&
+		    arch_msr & ARCH_CAP_RRSBA)
+			spec_ctrl_mask |= SPEC_CTRL_RRSBA_DIS_S;
+
+		/*
+		 * Intercept IA32_SPEC_CTRL to disallow guest from changing
+		 * certain bits.
+		 */
+		if (spec_ctrl_mask && !cpu_has_spec_ctrl_virt())
+			vmx_enable_intercept_for_msr(vcpu, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
+
+		vmx_set_spec_ctrl_mask(vmx, spec_ctrl_mask);
+		vmx_set_guest_spec_ctrl(vmx, vmx_get_guest_spec_ctrl(vmx));
+
 		vmx->msr_virtual_mitigation_ctrl = data;
 		break;
 	default:
-- 
2.40.0



* [RFC PATCH v2 09/11] KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
                   ` (7 preceding siblings ...)
  2023-04-14  6:25 ` [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-05-22  9:41   ` Liu, Jingqi
  2023-04-14  6:25 ` [RFC PATCH v2 10/11] KVM: selftests: Add tests for virtual enumeration/mitigation MSRs Chao Gao
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Zhang Chen, Chao Gao, Sean Christopherson,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, linux-kernel

From: Zhang Chen <chen.zhang@intel.com>

Allow a guest to query whether the underlying VMM understands the
BHB-clearing sequence, and to report the status of the BHB-clearing
sequence in supervisor mode, i.e. CPL < 3, via
MSR_VIRTUAL_MITIGATION_ENUM/CTRL.

Enable BHI_DIS_S for the guest if the guest is using the BHB-clearing
sequence and the sequence isn't effective on the processor.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Jiaan Lu <jiaan.lu@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 25afb4c3e55e..abbbf8e66b53 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1944,8 +1944,10 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
 }
 
 #define VIRTUAL_ENUMERATION_VALID_BITS	VIRT_ENUM_MITIGATION_CTRL_SUPPORT
-#define MITI_ENUM_VALID_BITS		MITI_ENUM_RETPOLINE_S_SUPPORT
-#define MITI_CTRL_VALID_BITS		MITI_CTRL_RETPOLINE_S_USED
+#define MITI_ENUM_VALID_BITS		(MITI_ENUM_RETPOLINE_S_SUPPORT | \
+					 MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT)
+#define MITI_CTRL_VALID_BITS		(MITI_CTRL_RETPOLINE_S_USED | \
+					 MITI_CTRL_BHB_CLEAR_SEQ_S_USED)
 
 static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 {
@@ -2496,6 +2498,11 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		    arch_msr & ARCH_CAP_RRSBA)
 			spec_ctrl_mask |= SPEC_CTRL_RRSBA_DIS_S;
 
+		if (data & MITI_CTRL_BHB_CLEAR_SEQ_S_USED &&
+		    kvm_cpu_cap_has(X86_FEATURE_BHI_CTRL) &&
+		    !(arch_msr & ARCH_CAP_BHI_NO))
+			spec_ctrl_mask |= SPEC_CTRL_BHI_DIS_S;
+
 		/*
 		 * Intercept IA32_SPEC_CTRL to disallow guest from changing
 		 * certain bits.
-- 
2.40.0



* [RFC PATCH v2 10/11] KVM: selftests: Add tests for virtual enumeration/mitigation MSRs
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
                   ` (8 preceding siblings ...)
  2023-04-14  6:25 ` [RFC PATCH v2 09/11] KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-05-22  9:39   ` Liu, Jingqi
  2023-04-14  6:25 ` [RFC PATCH v2 11/11] KVM: selftests: Add tests for IA32_SPEC_CTRL MSR Chao Gao
  2023-04-14  9:51 ` [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Binbin Wu
  11 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Chao Gao, Paolo Bonzini, Shuah Khan,
	Arnaldo Carvalho de Melo, Borislav Petkov, Pawan Gupta,
	Zhang Chen, Daniel Sneddon, linux-kernel, linux-kselftest

Three virtual MSRs are added for guests to report the usage of software
mitigations, and they are enumerated in an architectural way. Access
the three MSRs to ensure the behavior is as expected. Specifically,

1. the three cases below should cause #GP:
 * access to a non-present MSR
 * a write to a read-only MSR
 * toggling a reserved bit of a writable MSR

2. rdmsr/wrmsr in other cases should succeed

3. rdmsr should return the value last written

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 tools/arch/x86/include/asm/msr-index.h        |  23 +++
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/x86_64/virtual_mitigation_msr_test.c  | 175 ++++++++++++++++++
 3 files changed, 199 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c

diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 6079a5fdb40b..55f75e9ebbb7 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -166,6 +166,7 @@
 						 * IA32_XAPIC_DISABLE_STATUS MSR
 						 * supported
 						 */
+#define ARCH_CAP_VIRTUAL_ENUM		BIT_ULL(63) /* MSR_VIRTUAL_ENUMERATION supported */
 
 #define MSR_IA32_FLUSH_CMD		0x0000010b
 #define L1D_FLUSH			BIT(0)	/*
@@ -1103,6 +1104,28 @@
 #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
 #define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
 #define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE   0x1F
+
+/* Intel virtual MSRs */
+#define MSR_VIRTUAL_ENUMERATION			0x50000000
+#define VIRT_ENUM_MITIGATION_CTRL_SUPPORT	BIT(0)  /*
+							 * Mitigation ctrl via virtual
+							 * MSRs supported
+							 */
+
+#define MSR_VIRTUAL_MITIGATION_ENUM		0x50000001
+#define MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT	BIT(0)	/* VMM supports BHI_DIS_S */
+#define MITI_ENUM_RETPOLINE_S_SUPPORT		BIT(1)	/* VMM supports RRSBA_DIS_S */
+
+#define MSR_VIRTUAL_MITIGATION_CTRL		0x50000002
+#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED		BIT(0)	/*
+							 * Request VMM to deploy
+							 * BHI_DIS_S mitigation
+							 */
+#define MITI_CTRL_RETPOLINE_S_USED		BIT(1)	/*
+							 * Request VMM to deploy
+							 * RRSBA_DIS_S mitigation
+							 */
+
 /* AMD-V MSRs */
 
 #define MSR_VM_CR                       0xc0010114
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 84a627c43795..9db9a7e49a54 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -115,6 +115,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/sev_migrate_tests
 TEST_GEN_PROGS_x86_64 += x86_64/amx_test
 TEST_GEN_PROGS_x86_64 += x86_64/max_vcpuid_cap_test
 TEST_GEN_PROGS_x86_64 += x86_64/triple_fault_event_test
+TEST_GEN_PROGS_x86_64 += x86_64/virtual_mitigation_msr_test
 TEST_GEN_PROGS_x86_64 += access_tracking_perf_test
 TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c b/tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c
new file mode 100644
index 000000000000..4d924a0cf2dd
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c
@@ -0,0 +1,175 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023, Intel, Inc.
+ *
+ * tests for virtual mitigation MSR accesses
+ */
+
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+
+#include "kvm_util.h"
+#include "processor.h"
+
+static int guest_exception_count;
+static int expected_exception_count;
+static void guest_gp_handler(struct ex_regs *regs)
+{
+	/* RDMSR/WRMSR are 2 bytes */
+	regs->rip += 2;
+	++guest_exception_count;
+}
+
+static void write_msr_expect_gp(uint32_t msr, uint64_t val)
+{
+	uint64_t old_val;
+
+	old_val = rdmsr(msr);
+	wrmsr(msr, val);
+	expected_exception_count++;
+	GUEST_ASSERT_2(guest_exception_count == expected_exception_count,
+		       guest_exception_count, expected_exception_count);
+	GUEST_ASSERT_2(rdmsr(msr) == old_val, rdmsr(msr), old_val);
+}
+
+static void write_msr_expect_no_gp(uint32_t msr, uint64_t val)
+{
+	wrmsr(msr, val);
+	GUEST_ASSERT_EQ(guest_exception_count, expected_exception_count);
+	GUEST_ASSERT_EQ(rdmsr(msr), val);
+}
+
+static void read_msr_expect_gp(uint32_t msr)
+{
+	(void)rdmsr(msr);
+	expected_exception_count++;
+	GUEST_ASSERT_2(guest_exception_count == expected_exception_count,
+		       guest_exception_count, expected_exception_count);
+}
+
+static void guest_code_with_virtual_mitigation_ctrl(void)
+{
+	uint64_t val, miti_ctrl = 0;
+	int i;
+
+	val = rdmsr(MSR_VIRTUAL_ENUMERATION);
+	/* MSR_VIRTUAL_ENUMERATION is read-only. #GP is expected on write */
+	write_msr_expect_gp(MSR_VIRTUAL_ENUMERATION, val);
+
+	val = rdmsr(MSR_VIRTUAL_MITIGATION_ENUM);
+	/* MSR_VIRTUAL_MITIGATION_ENUM is read-only. #GP is expected on write */
+	write_msr_expect_gp(MSR_VIRTUAL_MITIGATION_ENUM, val);
+
+	for (i = 0; i < 64; i++) {
+		if (val & BIT_ULL(i)) {
+			miti_ctrl |= BIT_ULL(i);
+			write_msr_expect_no_gp(MSR_VIRTUAL_MITIGATION_CTRL, miti_ctrl);
+		} else {
+			write_msr_expect_gp(MSR_VIRTUAL_MITIGATION_CTRL, miti_ctrl | BIT_ULL(i));
+		}
+	}
+
+	write_msr_expect_no_gp(MSR_VIRTUAL_MITIGATION_CTRL, 0);
+	GUEST_DONE();
+}
+
+static void guest_code_no_virtual_enumeration(void)
+{
+	read_msr_expect_gp(MSR_VIRTUAL_ENUMERATION);
+	read_msr_expect_gp(MSR_VIRTUAL_MITIGATION_ENUM);
+	read_msr_expect_gp(MSR_VIRTUAL_MITIGATION_CTRL);
+	GUEST_DONE();
+}
+
+bool kvm_cpu_has_virtual_mitigation_ctrl(void)
+{
+	const struct kvm_msr_list *feature_list;
+	u64 virt_enum = 0;
+	int i;
+
+	feature_list = kvm_get_feature_msr_index_list();
+	for (i = 0; i < feature_list->nmsrs; i++) {
+		if (feature_list->indices[i] == MSR_VIRTUAL_ENUMERATION)
+			virt_enum = kvm_get_feature_msr(MSR_VIRTUAL_ENUMERATION);
+	}
+
+	return virt_enum & VIRT_ENUM_MITIGATION_CTRL_SUPPORT;
+}
+
+static void enable_virtual_mitigation_ctrl(struct kvm_vcpu *vcpu)
+{
+	vcpu_set_msr(vcpu, MSR_IA32_ARCH_CAPABILITIES, ARCH_CAP_VIRTUAL_ENUM);
+	vcpu_set_msr(vcpu, MSR_VIRTUAL_ENUMERATION, VIRT_ENUM_MITIGATION_CTRL_SUPPORT);
+	vcpu_set_msr(vcpu, MSR_VIRTUAL_MITIGATION_ENUM,
+		     kvm_get_feature_msr(MSR_VIRTUAL_MITIGATION_ENUM));
+}
+
+static void disable_virtual_enumeration(struct kvm_vcpu *vcpu)
+{
+	vcpu_set_msr(vcpu, MSR_IA32_ARCH_CAPABILITIES, 0);
+}
+
+static void test_virtual_mitiation_ctrl(bool enable)
+{
+	struct kvm_vcpu *vcpu;
+	struct kvm_run *run;
+	struct kvm_vm *vm;
+	struct ucall uc;
+	void *guest_code;
+
+	guest_code = enable ? guest_code_with_virtual_mitigation_ctrl :
+			      guest_code_no_virtual_enumeration;
+
+	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+	run = vcpu->run;
+
+	if (enable)
+		enable_virtual_mitigation_ctrl(vcpu);
+	else
+		disable_virtual_enumeration(vcpu);
+
+
+	/* Register #GP handler */
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(vcpu);
+	vm_install_exception_handler(vm, GP_VECTOR, guest_gp_handler);
+
+	while (1) {
+		vcpu_run(vcpu);
+
+		TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+			    "Unexpected exit reason: %u (%s),\n",
+			    run->exit_reason,
+			    exit_reason_str(run->exit_reason));
+
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_ABORT:
+			REPORT_GUEST_ASSERT_2(uc, "real %ld expected %ld");
+			break;
+		case UCALL_DONE:
+			goto done;
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+done:
+	kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_ARCH_CAPABILITIES));
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GET_MSR_FEATURES));
+	TEST_REQUIRE(kvm_cpu_has_virtual_mitigation_ctrl());
+
+	test_virtual_mitiation_ctrl(true);
+	test_virtual_mitiation_ctrl(false);
+
+	return 0;
+}
-- 
2.40.0



* [RFC PATCH v2 11/11] KVM: selftests: Add tests for IA32_SPEC_CTRL MSR
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
                   ` (9 preceding siblings ...)
  2023-04-14  6:25 ` [RFC PATCH v2 10/11] KVM: selftests: Add tests for virtual enumeration/mitigation MSRs Chao Gao
@ 2023-04-14  6:25 ` Chao Gao
  2023-04-14  9:51 ` [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Binbin Wu
  11 siblings, 0 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-14  6:25 UTC (permalink / raw)
  To: kvm
  Cc: Jiaan Lu, Chao Gao, Paolo Bonzini, Shuah Khan,
	Arnaldo Carvalho de Melo, Borislav Petkov, Pawan Gupta,
	Zhang Chen, Sean Christopherson, David Matlack, Vitaly Kuznetsov,
	linux-kernel, linux-kselftest

Toggle the supported bits of IA32_SPEC_CTRL and verify the result. Also
verify that the MSR value is preserved across nested transitions.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 tools/arch/x86/include/asm/msr-index.h        |   6 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/processor.h  |   5 +
 .../selftests/kvm/x86_64/spec_ctrl_msr_test.c | 178 ++++++++++++++++++
 4 files changed, 190 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c

diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 55f75e9ebbb7..9ad6c307c0d0 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -48,6 +48,12 @@
 #define SPEC_CTRL_STIBP			BIT(SPEC_CTRL_STIBP_SHIFT)	/* STIBP mask */
 #define SPEC_CTRL_SSBD_SHIFT		2	   /* Speculative Store Bypass Disable bit */
 #define SPEC_CTRL_SSBD			BIT(SPEC_CTRL_SSBD_SHIFT)	/* Speculative Store Bypass Disable */
+#define SPEC_CTRL_IPRED_DIS_U_SHIFT	3	   /* Disable IPRED behavior in user mode */
+#define SPEC_CTRL_IPRED_DIS_U		BIT(SPEC_CTRL_IPRED_DIS_U_SHIFT)
+#define SPEC_CTRL_IPRED_DIS_S_SHIFT	4	   /* Disable IPRED behavior in supervisor mode */
+#define SPEC_CTRL_IPRED_DIS_S		BIT(SPEC_CTRL_IPRED_DIS_S_SHIFT)
+#define SPEC_CTRL_RRSBA_DIS_U_SHIFT	5	   /* Disable RRSBA behavior in user mode */
+#define SPEC_CTRL_RRSBA_DIS_U		BIT(SPEC_CTRL_RRSBA_DIS_U_SHIFT)
 #define SPEC_CTRL_RRSBA_DIS_S_SHIFT	6	   /* Disable RRSBA behavior in supervisor mode */
 #define SPEC_CTRL_RRSBA_DIS_S		BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
 #define SPEC_CTRL_BHI_DIS_S_SHIFT	10         /* Disable BHI behavior in supervisor mode */
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 9db9a7e49a54..9f117cf80477 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -116,6 +116,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/amx_test
 TEST_GEN_PROGS_x86_64 += x86_64/max_vcpuid_cap_test
 TEST_GEN_PROGS_x86_64 += x86_64/triple_fault_event_test
 TEST_GEN_PROGS_x86_64 += x86_64/virtual_mitigation_msr_test
+TEST_GEN_PROGS_x86_64 += x86_64/spec_ctrl_msr_test
 TEST_GEN_PROGS_x86_64 += access_tracking_perf_test
 TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 90387ddcb2a9..355aba25dfef 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -125,8 +125,13 @@ struct kvm_x86_cpu_feature {
 #define	X86_FEATURE_IBT			KVM_X86_CPU_FEATURE(0x7, 0, EDX, 20)
 #define	X86_FEATURE_AMX_TILE		KVM_X86_CPU_FEATURE(0x7, 0, EDX, 24)
 #define	X86_FEATURE_SPEC_CTRL		KVM_X86_CPU_FEATURE(0x7, 0, EDX, 26)
+#define	X86_FEATURE_INTEL_STIBP		KVM_X86_CPU_FEATURE(0x7, 0, EDX, 27)
+#define	X86_FEATURE_SPEC_CTRL_SSBD	KVM_X86_CPU_FEATURE(0x7, 0, EDX, 31)
 #define	X86_FEATURE_ARCH_CAPABILITIES	KVM_X86_CPU_FEATURE(0x7, 0, EDX, 29)
 #define	X86_FEATURE_PKS			KVM_X86_CPU_FEATURE(0x7, 0, ECX, 31)
+#define	X86_FEATURE_IPRED_CTRL		KVM_X86_CPU_FEATURE(0x7, 2, EDX, 1)
+#define	X86_FEATURE_RRSBA_CTRL		KVM_X86_CPU_FEATURE(0x7, 2, EDX, 2)
+#define	X86_FEATURE_BHI_CTRL		KVM_X86_CPU_FEATURE(0x7, 2, EDX, 4)
 #define	X86_FEATURE_XTILECFG		KVM_X86_CPU_FEATURE(0xD, 0, EAX, 17)
 #define	X86_FEATURE_XTILEDATA		KVM_X86_CPU_FEATURE(0xD, 0, EAX, 18)
 #define	X86_FEATURE_XSAVES		KVM_X86_CPU_FEATURE(0xD, 1, EAX, 3)
diff --git a/tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c b/tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
new file mode 100644
index 000000000000..ced4640ee92e
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
@@ -0,0 +1,178 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023, Intel, Inc.
+ *
+ * tests for IA32_SPEC_CTRL MSR accesses
+ */
+
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+
+#include "kvm_util.h"
+#include "vmx.h"
+#include "processor.h"
+
+static void set_spec_ctrl(u64 val)
+{
+	/* Set the bit and verify the result */
+	wrmsr(MSR_IA32_SPEC_CTRL, val);
+	GUEST_ASSERT_2(rdmsr(MSR_IA32_SPEC_CTRL) == val, rdmsr(MSR_IA32_SPEC_CTRL), val);
+
+	/* Clear the bit and verify the result */
+	val = 0;
+	wrmsr(MSR_IA32_SPEC_CTRL, val);
+	GUEST_ASSERT_2(rdmsr(MSR_IA32_SPEC_CTRL) == val, rdmsr(MSR_IA32_SPEC_CTRL), val);
+}
+
+static void guest_code(void)
+{
+	set_spec_ctrl(SPEC_CTRL_IBRS);
+
+	if (this_cpu_has(X86_FEATURE_INTEL_STIBP))
+		set_spec_ctrl(SPEC_CTRL_STIBP);
+
+	if (this_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
+		set_spec_ctrl(SPEC_CTRL_SSBD);
+
+	if (this_cpu_has(X86_FEATURE_IPRED_CTRL)) {
+		set_spec_ctrl(SPEC_CTRL_IPRED_DIS_S);
+		set_spec_ctrl(SPEC_CTRL_IPRED_DIS_U);
+	}
+
+	if (this_cpu_has(X86_FEATURE_RRSBA_CTRL)) {
+		set_spec_ctrl(SPEC_CTRL_RRSBA_DIS_S);
+		set_spec_ctrl(SPEC_CTRL_RRSBA_DIS_U);
+	}
+
+	if (this_cpu_has(X86_FEATURE_BHI_CTRL))
+		set_spec_ctrl(SPEC_CTRL_BHI_DIS_S);
+
+	GUEST_DONE();
+}
+
+static void test_spec_ctrl_access(void)
+{
+	struct kvm_vcpu *vcpu;
+	struct kvm_run *run;
+	struct kvm_vm *vm;
+	struct ucall uc;
+
+	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+	run = vcpu->run;
+
+	while (1) {
+		vcpu_run(vcpu);
+
+		TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+			    "Unexpected exit reason: %u (%s),\n",
+			    run->exit_reason,
+			    exit_reason_str(run->exit_reason));
+
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_ABORT:
+			REPORT_GUEST_ASSERT_2(uc, "real %ld expected %ld");
+			break;
+		case UCALL_DONE:
+			goto done;
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+done:
+	kvm_vm_free(vm);
+}
+
+static void l2_guest_code(void)
+{
+	GUEST_ASSERT(rdmsr(MSR_IA32_SPEC_CTRL) == SPEC_CTRL_IBRS);
+	wrmsr(MSR_IA32_SPEC_CTRL, 0);
+
+	/* Exit to L1 */
+	__asm__ __volatile__("vmcall");
+}
+
+static void l1_guest_code(struct vmx_pages *vmx_pages)
+{
+#define L2_GUEST_STACK_SIZE 64
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	uint32_t control;
+
+	/*
+	 * Try to disable interception of writes to SPEC_CTRL by writing a
+	 * non-0 value. This test is intended to verify that SPEC_CTRL is
+	 * preserved across nested transitions, particularly when writes to
+	 * the MSR aren't intercepted by the L0 VMM or L1 VMM.
+	 */
+	wrmsr(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS);
+
+	GUEST_ASSERT(vmx_pages->vmcs_gpa);
+	GUEST_ASSERT(prepare_for_vmx_operation(vmx_pages));
+	GUEST_ASSERT(load_vmcs(vmx_pages));
+	GUEST_ASSERT(vmptrstz() == vmx_pages->vmcs_gpa);
+	prepare_vmcs(vmx_pages, l2_guest_code,
+		     &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	control = vmreadz(CPU_BASED_VM_EXEC_CONTROL);
+	control |= CPU_BASED_USE_MSR_BITMAPS;
+	vmwrite(CPU_BASED_VM_EXEC_CONTROL, control);
+
+	GUEST_ASSERT(!vmlaunch());
+
+	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
+	GUEST_ASSERT(rdmsr(MSR_IA32_SPEC_CTRL) == 0);
+
+	GUEST_DONE();
+}
+
+static void test_spec_ctrl_vmx_transition(void)
+{
+	vm_vaddr_t vmx_pages_gva;
+	struct kvm_vcpu *vcpu;
+	struct kvm_run *run;
+	struct kvm_vm *vm;
+	struct ucall uc;
+
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_VMX));
+
+	vm = vm_create_with_one_vcpu(&vcpu, l1_guest_code);
+	vcpu_alloc_vmx(vm, &vmx_pages_gva);
+	vcpu_args_set(vcpu, 1, vmx_pages_gva);
+	run = vcpu->run;
+
+	while (1) {
+		vcpu_run(vcpu);
+
+		TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+			    "Unexpected exit reason: %u (%s),\n",
+			    run->exit_reason,
+			    exit_reason_str(run->exit_reason));
+
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_ABORT:
+			REPORT_GUEST_ASSERT(uc);
+			break;
+		case UCALL_DONE:
+			goto done;
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+done:
+	kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SPEC_CTRL));
+	test_spec_ctrl_access();
+	test_spec_ctrl_vmx_transition();
+
+	return 0;
+}
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization
  2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
                   ` (10 preceding siblings ...)
  2023-04-14  6:25 ` [RFC PATCH v2 11/11] KVM: selftests: Add tests for IA32_SPEC_CTRL MSR Chao Gao
@ 2023-04-14  9:51 ` Binbin Wu
  2023-04-14 22:10   ` Pawan Gupta
  11 siblings, 1 reply; 42+ messages in thread
From: Binbin Wu @ 2023-04-14  9:51 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Alexandre Chartre, Arnaldo Carvalho de Melo, Babu Moger,
	Borislav Petkov, Borislav Petkov, Daniel Sneddon, Dave Hansen,
	David Matlack, H. Peter Anvin, Ingo Molnar, Josh Poimboeuf,
	Kim Phillips, Len Brown, linux-kernel, linux-kselftest,
	Nikunj A Dadhania, Paolo Bonzini, Pawan Gupta, Peter Zijlstra,
	Sandipan Das, Sean Christopherson, Shuah Khan, Thomas Gleixner,
	Vitaly Kuznetsov, x86, Zhang Chen


On 4/14/2023 2:25 PM, Chao Gao wrote:
> Changes since RFC v1:
>   * add two kselftests (patch 10-11)
>   * set virtual MSRs also on APs [Pawan]
>   * enable "virtualize IA32_SPEC_CTRL" for L2 to prevent L2 from changing
>     some bits of IA32_SPEC_CTRL (patch 4)
>   * other misc cleanup and cosmetic changes
>
> RFC v1: https://lore.kernel.org/lkml/20221210160046.2608762-1-chen.zhang@intel.com/
>
>
> This series introduces "virtualize IA32_SPEC_CTRL" support. Here are
> introduction and use cases of this new feature.
>
> ### Virtualize IA32_SPEC_CTRL
>
> "Virtualize IA32_SPEC_CTRL" [1] is a new VMX feature on Intel CPUs. This feature
> allows VMM to lock some bits of IA32_SPEC_CTRL MSR even when the MSR is
> pass-thru'd to a guest.
>
>
> ### Use cases of "virtualize IA32_SPEC_CTRL" [2]
>
> Software mitigations like Retpoline and software BHB-clearing sequence depend on
> CPU microarchitectures. And guest cannot know exactly the underlying
> microarchitecture. When a guest is migrated between processors of different
> microarchitectures, software mitigations which work perfectly on previous
> microachitecture may be not effective on the new one. To fix the problem, some
> hardware mitigations should be used in conjunction with software mitigations.

So even the hardware mitigations are enabled, the software mitigations 
are still needed, right?


> Using virtual IA32_SPEC_CTRL, VMM can enforce hardware mitigations transparently
> to guests and avoid those hardware mitigations being unintentionally disabled
> when guest changes IA32_SPEC_CTRL MSR.
>
>
> ### Intention of this series
>
> This series adds the capability of enforcing hardware mitigations for guests
> transparently and efficiently (i.e., without intecepting IA32_SPEC_CTRL MSR

/s/intecepting/intercepting


> accesses) to kvm. The capability can be used to solve the VM migration issue in
> a pool consisting of processors of different microarchitectures.
>
> Specifically, below are two target scenarios of this series:
>
> Scenario 1: If retpoline is used by a VM to mitigate IMBTI in CPL0, VMM can set
> 	    RRSBA_DIS_S on parts enumerates RRSBA. Note that the VM is presented
> 	    with a microarchitecture doesn't enumerate RRSBA.
>
> Scenario 2: If a VM uses software BHB-clearing sequence on transitions into CPL0
> 	    to mitigate BHI, VMM can use "virtualize IA32_SPEC_CTRL" to set
> 	    BHI_DIS_S on new parts which doesn't enumerate BHI_NO.
>
> Intel defines some virtual MSRs [2] for guests to report in-use software
> mitigations. This allows guests to opt in VMM's deploying hardware mitigations
> for them if the guests are either running or later migrated to a system on which
> in-use software mitigations are not effective. The virtual MSRs interface is
> also added in this series.
>
> ### Organization of this series
>
> 1. Patch 1-3	Advertise RRSBA_CTRL and BHI_CTRL to guest
> 2. Patch 4	Add "virtualize IA32_SPEC_CTRL" support
> 3. Patch 5-9	Allow guests to report in-use software mitigations to KVM so
>                  that KVM can enable hardware mitigations for guests.
> 4. Patch 10-11	Add kselftest for virtual MSRs and IA32_SPEC_CTRL
>
> [1]: https://cdrdv2.intel.com/v1/dl/getContent/671368 Ref. #319433-047 Chapter 12
> [2]: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html
>
> Chao Gao (3):
>    KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
>    KVM: selftests: Add tests for virtual enumeration/mitigation MSRs
>    KVM: selftests: Add tests for IA32_SPEC_CTRL MSR
>
> Pawan Gupta (1):
>    x86/bugs: Use Virtual MSRs to request hardware mitigations
>
> Zhang Chen (7):
>    x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO
>    KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
>    KVM: x86: Advertise BHI_CTRL support
>    KVM: VMX: Add IA32_SPEC_CTRL virtualization support
>    KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
>    KVM: VMX: Advertise MITIGATION_CTRL support
>    KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT
>
>   arch/x86/include/asm/msr-index.h              |  33 +++-
>   arch/x86/include/asm/vmx.h                    |   5 +
>   arch/x86/include/asm/vmxfeatures.h            |   2 +
>   arch/x86/kernel/cpu/bugs.c                    |  25 +++
>   arch/x86/kvm/cpuid.c                          |  22 ++-
>   arch/x86/kvm/reverse_cpuid.h                  |   8 +
>   arch/x86/kvm/svm/svm.c                        |   3 +
>   arch/x86/kvm/vmx/capabilities.h               |   5 +
>   arch/x86/kvm/vmx/nested.c                     |  13 ++
>   arch/x86/kvm/vmx/vmcs.h                       |   2 +
>   arch/x86/kvm/vmx/vmx.c                        | 112 ++++++++++-
>   arch/x86/kvm/vmx/vmx.h                        |  43 ++++-
>   arch/x86/kvm/x86.c                            |  19 +-
>   tools/arch/x86/include/asm/msr-index.h        |  37 +++-
>   tools/testing/selftests/kvm/Makefile          |   2 +
>   .../selftests/kvm/include/x86_64/processor.h  |   5 +
>   .../selftests/kvm/x86_64/spec_ctrl_msr_test.c | 178 ++++++++++++++++++
>   .../kvm/x86_64/virtual_mitigation_msr_test.c  | 175 +++++++++++++++++
>   18 files changed, 676 insertions(+), 13 deletions(-)
>   create mode 100644 tools/testing/selftests/kvm/x86_64/spec_ctrl_msr_test.c
>   create mode 100644 tools/testing/selftests/kvm/x86_64/virtual_mitigation_msr_test.c
>
>
> base-commit: 400d2132288edbd6d500f45eab5d85526ca94e46

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 01/11] x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO
  2023-04-14  6:25 ` [RFC PATCH v2 01/11] x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO Chao Gao
@ 2023-04-14  9:52   ` Binbin Wu
  0 siblings, 0 replies; 42+ messages in thread
From: Binbin Wu @ 2023-04-14  9:52 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
	Pawan Gupta, Daniel Sneddon, Sandipan Das, Nikunj A Dadhania,
	Arnaldo Carvalho de Melo, linux-kernel


On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <chen.zhang@intel.com>
>
> To ensure VM migration from a system where software mitigation works to
> a system where it doesn't won't harm guest's security level, KVM must
> mitigate BHI attacks for guests since migration is transparent to guests
> and guests won't and can't react to VM migration.
>
> For example, simple BHB clear sequence [1] is effective in mitigating BHI
> attacks on processors prior to Alder Lake, but it is not on Alder Lake.
> Guests migrated from prior to Alder Lake host to Alder Lake host become
> vulnerable to BHI attacks even if the simmple BHB clear sequence is

/s/simmple/simple


> deployed. In this case, KVM can enable hardware mitigation for guests by
> setting BHI_DIS_S bit of IA32_SPEC_CTRL MSR.
>
> Define the SPEC_CTRL_BHI_DIS_S of IA32_SPEC_CTRL MSR and BHI_NO bits in
> arch_capabilities, which will be used by KVM later.
>
> [1]: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html#inpage-nav-2-4
>
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Co-developed-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
>   arch/x86/include/asm/msr-index.h       | 8 +++++++-
>   tools/arch/x86/include/asm/msr-index.h | 8 +++++++-
>   2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index ad35355ee43e..60b25d87b82c 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -48,8 +48,10 @@
>   #define SPEC_CTRL_STIBP			BIT(SPEC_CTRL_STIBP_SHIFT)	/* STIBP mask */
>   #define SPEC_CTRL_SSBD_SHIFT		2	   /* Speculative Store Bypass Disable bit */
>   #define SPEC_CTRL_SSBD			BIT(SPEC_CTRL_SSBD_SHIFT)	/* Speculative Store Bypass Disable */
> -#define SPEC_CTRL_RRSBA_DIS_S_SHIFT	6	   /* Disable RRSBA behavior */
> +#define SPEC_CTRL_RRSBA_DIS_S_SHIFT	6	   /* Disable RRSBA behavior in supervisor mode */
>   #define SPEC_CTRL_RRSBA_DIS_S		BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
> +#define SPEC_CTRL_BHI_DIS_S_SHIFT	10	   /* Disable BHI behavior in supervisor mode */
> +#define SPEC_CTRL_BHI_DIS_S		BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)
>   
>   /* A mask for bits which the kernel toggles when controlling mitigations */
>   #define SPEC_CTRL_MITIGATIONS_MASK	(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
> @@ -151,6 +153,10 @@
>   						 * are restricted to targets in
>   						 * kernel.
>   						 */
> +#define ARCH_CAP_BHI_NO			BIT(20)	/*
> +						 * Not susceptible to Branch History
> +						 * Injection.
> +						 */
>   #define ARCH_CAP_PBRSB_NO		BIT(24)	/*
>   						 * Not susceptible to Post-Barrier
>   						 * Return Stack Buffer Predictions.
> diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
> index ad35355ee43e..6079a5fdb40b 100644
> --- a/tools/arch/x86/include/asm/msr-index.h
> +++ b/tools/arch/x86/include/asm/msr-index.h
> @@ -48,8 +48,10 @@
>   #define SPEC_CTRL_STIBP			BIT(SPEC_CTRL_STIBP_SHIFT)	/* STIBP mask */
>   #define SPEC_CTRL_SSBD_SHIFT		2	   /* Speculative Store Bypass Disable bit */
>   #define SPEC_CTRL_SSBD			BIT(SPEC_CTRL_SSBD_SHIFT)	/* Speculative Store Bypass Disable */
> -#define SPEC_CTRL_RRSBA_DIS_S_SHIFT	6	   /* Disable RRSBA behavior */
> +#define SPEC_CTRL_RRSBA_DIS_S_SHIFT	6	   /* Disable RRSBA behavior in supervisor mode */
>   #define SPEC_CTRL_RRSBA_DIS_S		BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
> +#define SPEC_CTRL_BHI_DIS_S_SHIFT	10         /* Disable BHI behavior in supervisor mode */
> +#define SPEC_CTRL_BHI_DIS_S		BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)
>   
>   /* A mask for bits which the kernel toggles when controlling mitigations */
>   #define SPEC_CTRL_MITIGATIONS_MASK	(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
> @@ -151,6 +153,10 @@
>   						 * are restricted to targets in
>   						 * kernel.
>   						 */
> +#define ARCH_CAP_BHI_NO			BIT(20) /*
> +						 * Not susceptible to Branch History
> +						 * Injection.
> +						 */
>   #define ARCH_CAP_PBRSB_NO		BIT(24)	/*
>   						 * Not susceptible to Post-Barrier
>   						 * Return Stack Buffer Predictions.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization
  2023-04-14  9:51 ` [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Binbin Wu
@ 2023-04-14 22:10   ` Pawan Gupta
  0 siblings, 0 replies; 42+ messages in thread
From: Pawan Gupta @ 2023-04-14 22:10 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Chao Gao, kvm, Jiaan Lu, Alexandre Chartre,
	Arnaldo Carvalho de Melo, Babu Moger, Borislav Petkov,
	Borislav Petkov, Daniel Sneddon, Dave Hansen, David Matlack,
	H. Peter Anvin, Ingo Molnar, Josh Poimboeuf, Kim Phillips,
	Len Brown, linux-kernel, linux-kselftest, Nikunj A Dadhania,
	Paolo Bonzini, Peter Zijlstra, Sandipan Das, Sean Christopherson,
	Shuah Khan, Thomas Gleixner, Vitaly Kuznetsov, x86, Zhang Chen

On Fri, Apr 14, 2023 at 05:51:43PM +0800, Binbin Wu wrote:
> 
> On 4/14/2023 2:25 PM, Chao Gao wrote:
> > Changes since RFC v1:
> >   * add two kselftests (patch 10-11)
> >   * set virtual MSRs also on APs [Pawan]
> >   * enable "virtualize IA32_SPEC_CTRL" for L2 to prevent L2 from changing
> >     some bits of IA32_SPEC_CTRL (patch 4)
> >   * other misc cleanup and cosmetic changes
> > 
> > RFC v1: https://lore.kernel.org/lkml/20221210160046.2608762-1-chen.zhang@intel.com/
> > 
> > 
> > This series introduces "virtualize IA32_SPEC_CTRL" support. Here are
> > introduction and use cases of this new feature.
> > 
> > ### Virtualize IA32_SPEC_CTRL
> > 
> > "Virtualize IA32_SPEC_CTRL" [1] is a new VMX feature on Intel CPUs. This feature
> > allows VMM to lock some bits of IA32_SPEC_CTRL MSR even when the MSR is
> > pass-thru'd to a guest.
> > 
> > 
> > ### Use cases of "virtualize IA32_SPEC_CTRL" [2]
> > 
> > Software mitigations like Retpoline and software BHB-clearing sequence depend on
> > CPU microarchitectures. And guest cannot know exactly the underlying
> > microarchitecture. When a guest is migrated between processors of different
> > microarchitectures, software mitigations which work perfectly on previous
> > microachitecture may be not effective on the new one. To fix the problem, some
> > hardware mitigations should be used in conjunction with software mitigations.
> 
> So even the hardware mitigations are enabled, the software mitigations are
> still needed, right?

Retpoline mitigation is not fully effective when RET can take a
prediction from an alternate predictor. Newer hardware provides a way to
disable this behavior (using the RRSBA_DIS_S bit in MSR SPEC_CTRL).

eIBRS is the preferred way to mitigate BTI, but if for some reason a
guest has deployed retpoline, the VMM can make it more effective by
deploying the relevant hardware control. That is why the above text
says:

  "... hardware mitigations should be used in conjunction with software
  mitigations."

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-04-14  6:25 ` [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support Chao Gao
@ 2023-04-16  7:04   ` Binbin Wu
  2023-04-16 13:25     ` Chao Gao
  2023-05-15  6:53   ` Xiaoyao Li
  1 sibling, 1 reply; 42+ messages in thread
From: Binbin Wu @ 2023-04-16  7:04 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel


On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <chen.zhang@intel.com>
>
> Add a kvm-only CPUID feature leaf for CPUID.7.2.EDX and RRSBA_CTRL
> as the first feature in the leaf.
>
> RRSBA_CTRL is enumerated by CPUID.7.2.EDX[2]. If supported, RRSBA_DIS_U
> (bit 5) and RRSBA_DIS_S (bit 6) of IA32_SPEC_CTRL MSR can be used to
> disable RRSBA behavior for CPL3 and CPL0/1/2 respectively.
>
> Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses
> after a non-zero is written to the MSR. Therefore, guests can already
> toggle the two bits if the host supports RRSBA_CTRL, and no extra code
> is needed to allow guests to toggle the two bits.
>
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
>   arch/x86/kvm/cpuid.c         | 22 +++++++++++++++++++---
>   arch/x86/kvm/reverse_cpuid.h |  7 +++++++
>   2 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 9583a110cf5f..f024c3ac2203 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -685,6 +685,10 @@ void kvm_set_cpu_caps(void)
>   		SF(SGX1) | SF(SGX2) | SF(SGX_EDECCSSA)
>   	);
>   
> +	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
> +		SF(RRSBA_CTRL)
> +	);
> +

Is it slightly better to put the new added one after

     kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX, ...

to make it in order?


>   	kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
>   		F(LAHF_LM) | F(CMP_LEGACY) | 0 /*SVM*/ | 0 /* ExtApicSpace */ |
>   		F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
> @@ -949,13 +953,14 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>   		break;
>   	/* function 7 has additional index. */
>   	case 7:
> -		entry->eax = min(entry->eax, 1u);
> +		entry->eax = min(entry->eax, 2u);
>   		cpuid_entry_override(entry, CPUID_7_0_EBX);
>   		cpuid_entry_override(entry, CPUID_7_ECX);
>   		cpuid_entry_override(entry, CPUID_7_EDX);
>   
> -		/* KVM only supports 0x7.0 and 0x7.1, capped above via min(). */
> -		if (entry->eax == 1) {
> +		max_idx = entry->eax;
> +
> +		if (max_idx >= 1) {
>   			entry = do_host_cpuid(array, function, 1);
>   			if (!entry)
>   				goto out;
> @@ -965,6 +970,17 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>   			entry->ebx = 0;
>   			entry->ecx = 0;
>   		}
> +
> +		if (max_idx >= 2) {
> +			entry = do_host_cpuid(array, function, 2);
> +			if (!entry)
> +				goto out;
> +
> +			cpuid_entry_override(entry, CPUID_7_2_EDX);
> +			entry->eax = 0;
> +			entry->ebx = 0;
> +			entry->ecx = 0;
> +		}
>   		break;
>   	case 0xa: { /* Architectural Performance Monitoring */
>   		union cpuid10_eax eax;
> diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
> index a5717282bb9c..72bad8314a9c 100644
> --- a/arch/x86/kvm/reverse_cpuid.h
> +++ b/arch/x86/kvm/reverse_cpuid.h
> @@ -15,6 +15,7 @@ enum kvm_only_cpuid_leafs {
>   	CPUID_12_EAX	 = NCAPINTS,
>   	CPUID_7_1_EDX,
>   	CPUID_8000_0007_EDX,
> +	CPUID_7_2_EDX,
>   	NR_KVM_CPU_CAPS,
>   
>   	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
> @@ -47,6 +48,9 @@ enum kvm_only_cpuid_leafs {
>   /* CPUID level 0x80000007 (EDX). */
>   #define KVM_X86_FEATURE_CONSTANT_TSC	KVM_X86_FEATURE(CPUID_8000_0007_EDX, 8)
>   
> +/* Intel-defined sub-features, CPUID level 0x00000007:2 (EDX) */
> +#define KVM_X86_FEATURE_RRSBA_CTRL	KVM_X86_FEATURE(CPUID_7_2_EDX, 2)
> +
>   struct cpuid_reg {
>   	u32 function;
>   	u32 index;
> @@ -69,6 +73,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
>   	[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
>   	[CPUID_7_EDX]         = {         7, 0, CPUID_EDX},
>   	[CPUID_7_1_EAX]       = {         7, 1, CPUID_EAX},
> +	[CPUID_7_2_EDX]       = {         7, 2, CPUID_EDX},
>   	[CPUID_12_EAX]        = {0x00000012, 0, CPUID_EAX},
>   	[CPUID_8000_001F_EAX] = {0x8000001f, 0, CPUID_EAX},
>   	[CPUID_7_1_EDX]       = {         7, 1, CPUID_EDX},
> @@ -108,6 +113,8 @@ static __always_inline u32 __feature_translate(int x86_feature)
>   		return KVM_X86_FEATURE_SGX_EDECCSSA;
>   	else if (x86_feature == X86_FEATURE_CONSTANT_TSC)
>   		return KVM_X86_FEATURE_CONSTANT_TSC;
> +	else if (x86_feature == X86_FEATURE_RRSBA_CTRL)
> +		return KVM_X86_FEATURE_RRSBA_CTRL;
>   
>   	return x86_feature;
>   }

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-04-16  7:04   ` Binbin Wu
@ 2023-04-16 13:25     ` Chao Gao
  0 siblings, 0 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-16 13:25 UTC (permalink / raw)
  To: Binbin Wu
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Sun, Apr 16, 2023 at 03:04:59PM +0800, Binbin Wu wrote:
>
>On 4/14/2023 2:25 PM, Chao Gao wrote:
>> From: Zhang Chen <chen.zhang@intel.com>
>> 
>> Add a kvm-only CPUID feature leaf for CPUID.7.2.EDX and RRSBA_CTRL
>> as the first feature in the leaf.
>> 
>> RRSBA_CTRL is enumerated by CPUID.7.2.EDX[2]. If supported, RRSBA_DIS_U
>> (bit 5) and RRSBA_DIS_S (bit 6) of IA32_SPEC_CTRL MSR can be used to
>> disable RRSBA behavior for CPL3 and CPL0/1/2 respectively.
>> 
>> Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses
>> after a non-zero is written to the MSR. Therefore, guests can already
>> toggle the two bits if the host supports RRSBA_CTRL, and no extra code
>> is needed to allow guests to toggle the two bits.
>> 
>> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
>> ---
>>   arch/x86/kvm/cpuid.c         | 22 +++++++++++++++++++---
>>   arch/x86/kvm/reverse_cpuid.h |  7 +++++++
>>   2 files changed, 26 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 9583a110cf5f..f024c3ac2203 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -685,6 +685,10 @@ void kvm_set_cpu_caps(void)
>>   		SF(SGX1) | SF(SGX2) | SF(SGX_EDECCSSA)
>>   	);
>> +	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
>> +		SF(RRSBA_CTRL)
>> +	);
>> +
>
>Is it slightly better to put the new added one after
>
>    kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX, ...
>
>to make it in order?

Yes. Will do.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support
  2023-04-14  6:25 ` [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support Chao Gao
@ 2023-04-17  3:17   ` Binbin Wu
  2023-04-18  2:07     ` Chao Gao
  2023-04-17  6:48   ` Chenyi Qiang
  2023-05-16  7:16   ` Xiaoyao Li
  2 siblings, 1 reply; 42+ messages in thread
From: Binbin Wu @ 2023-04-17  3:17 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel


On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <chen.zhang@intel.com>
>
> Currently KVM disables interception of IA32_SPEC_CTRL after a non-0 is
> written to IA32_SPEC_CTRL by guest. Then, guest is allowed to write any
> value to hardware.
>
> "virtualize IA32_SPEC_CTRL" is a new tertiary vm-exec control. This
> feature allows KVM to specify that certain bits of the IA32_SPEC_CTRL
> MSR cannot be modified by guest software.
>
> Two VMCS fields are added:
>
>    IA32_SPEC_CTRL_MASK:   bits that guest software cannot modify
>    IA32_SPEC_CTRL_SHADOW: value that guest software expects to be in the
> 			 IA32_SPEC_CTRL MSR
>
> On rdmsr, the shadow value is returned. on wrmsr, EDX:EAX is written
> to the IA32_SPEC_CTRL_SHADOW and (cur_val & mask) | (EDX:EAX & ~mask)
> is written to the IA32_SPEC_CTRL MSR, where
>    * cur_val is the original value of IA32_SPEC_CTRL MSR
>    * mask is the value of IA32_SPEC_CTRL_MASK
>
> Add a mask e.g.,

e.g. or i.e. ?


> loaded_vmcs->spec_ctrl_mask to represent the bits guest
> shouldn't change. It is 0 for now and some bits will be added by
> following patches. Use per-vmcs cache to avoid unnecessary vmcs_write()
> on nested transition because the mask is expected to be rarely changed
> and the same for vmcs01 and vmcs02.
>
> To prevent guest from changing the bits in the mask, enable "virtualize
> IA32_SPEC_CTRL" if supported or emulate its behavior by intercepting
> the IA32_SPEC_CTRL msr. Emulating "virtualize IA32_SPEC_CTRL" behavior
> is mainly to give the same capability to KVM running on potential broken
> hardware or L1 guests.
>
> To avoid L2 evading the enforcement, enable "virtualize IA32_SPEC_CTRL"
> in vmcs02. Always update the guest (shadow) value of IA32_SPEC_CTRL MSR
> and the mask to preserve them across nested transitions. Note that the
> shadow value may be changed because L2 may access the IA32_SPEC_CTRL
> directly and the mask may be changed due to migration when L2 vCPUs are
> running.
>
> Co-developed-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
>   arch/x86/include/asm/vmx.h         |  5 ++++
>   arch/x86/include/asm/vmxfeatures.h |  2 ++
>   arch/x86/kvm/vmx/capabilities.h    |  5 ++++
>   arch/x86/kvm/vmx/nested.c          | 13 ++++++++++
>   arch/x86/kvm/vmx/vmcs.h            |  2 ++
>   arch/x86/kvm/vmx/vmx.c             | 34 ++++++++++++++++++++-----
>   arch/x86/kvm/vmx/vmx.h             | 40 +++++++++++++++++++++++++++++-
>   7 files changed, 94 insertions(+), 7 deletions(-)
>
[...]

> @@ -750,4 +766,26 @@ static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
>   	       to_vmx(vcpu)->nested.enlightened_vmcs_enabled;
>   }
>   
> +static inline u64 vmx_get_guest_spec_ctrl(struct vcpu_vmx *vmx)
> +{
> +	return vmx->guest_spec_ctrl;
> +}
> +
> +static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
> +{
> +	vmx->guest_spec_ctrl = val;
> +
> +	/*
> +	 * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
> +	 * regardless of the MSR intercept state.

It is better to use "IA32_SPEC_CTRL" explicitly instead of "the MSR" to 
avoid misunderstanding.


> +	 */
> +	if (cpu_has_spec_ctrl_virt())
> +		vmcs_write64(IA32_SPEC_CTRL_SHADOW, val);
> +
> +	/*
> +	 * Update the effective value of IA32_SPEC_CTRL to reflect changes to
> +	 * guest's IA32_SPEC_CTRL. Bits in the mask should always be set.
> +	 */
> +	vmx->spec_ctrl = val | vmx_get_spec_ctrl_mask(vmx);
> +}
>   #endif /* __KVM_X86_VMX_H */

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support
  2023-04-14  6:25 ` [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support Chao Gao
  2023-04-17  3:17   ` Binbin Wu
@ 2023-04-17  6:48   ` Chenyi Qiang
  2023-04-17  7:31     ` Chao Gao
  2023-05-16  7:16   ` Xiaoyao Li
  2 siblings, 1 reply; 42+ messages in thread
From: Chenyi Qiang @ 2023-04-17  6:48 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel



On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <chen.zhang@intel.com>
> 
> Currently KVM disables interception of IA32_SPEC_CTRL after a non-0 is
> written to IA32_SPEC_CTRL by guest. Then, guest is allowed to write any
> value to hardware.
> 
> "virtualize IA32_SPEC_CTRL" is a new tertiary vm-exec control. This
> feature allows KVM to specify that certain bits of the IA32_SPEC_CTRL
> MSR cannot be modified by guest software.
> 
> Two VMCS fields are added:
> 
>   IA32_SPEC_CTRL_MASK:   bits that guest software cannot modify
>   IA32_SPEC_CTRL_SHADOW: value that guest software expects to be in the
> 			 IA32_SPEC_CTRL MSR
> 
> On rdmsr, the shadow value is returned. on wrmsr, EDX:EAX is written
> to the IA32_SPEC_CTRL_SHADOW and (cur_val & mask) | (EDX:EAX & ~mask)
> is written to the IA32_SPEC_CTRL MSR, where
>   * cur_val is the original value of IA32_SPEC_CTRL MSR
>   * mask is the value of IA32_SPEC_CTRL_MASK
> 
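[Editor's note: the rdmsr/wrmsr semantics described above can be sketched in
plain C. This is an illustrative model only — the struct and function names
are made up for the sketch and are not KVM code.]

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two new VMCS fields plus the real MSR value. */
struct spec_ctrl_virt {
	uint64_t cur_val; /* effective IA32_SPEC_CTRL value in hardware */
	uint64_t shadow;  /* IA32_SPEC_CTRL_SHADOW: what the guest observes */
	uint64_t mask;    /* IA32_SPEC_CTRL_MASK: bits the guest cannot modify */
};

/* On rdmsr, the shadow value is returned. */
static uint64_t virt_rdmsr(const struct spec_ctrl_virt *s)
{
	return s->shadow;
}

/*
 * On wrmsr, the guest value goes to the shadow, and only its unmasked
 * bits reach the real MSR; masked bits keep their current value.
 */
static void virt_wrmsr(struct spec_ctrl_virt *s, uint64_t guest_val)
{
	s->shadow = guest_val;
	s->cur_val = (s->cur_val & s->mask) | (guest_val & ~s->mask);
}
```

E.g. with RRSBA_DIS_S (bit 6) in the mask and set in hardware, a guest write
of 0 reads back as 0 but leaves bit 6 set in the effective value.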
> Add a mask e.g., loaded_vmcs->spec_ctrl_mask to represent the bits guest
> shouldn't change. It is 0 for now and some bits will be added by
> following patches. Use per-vmcs cache to avoid unnecessary vmcs_write()
> on nested transition because the mask is expected to be rarely changed
> and the same for vmcs01 and vmcs02.
> 
> To prevent guest from changing the bits in the mask, enable "virtualize
> IA32_SPEC_CTRL" if supported or emulate its behavior by intercepting
> the IA32_SPEC_CTRL msr. Emulating "virtualize IA32_SPEC_CTRL" behavior
> is mainly to give the same capability to KVM running on potentially broken
> hardware or L1 guests.
> 
> To avoid L2 evading the enforcement, enable "virtualize IA32_SPEC_CTRL"
> in vmcs02. Always update the guest (shadow) value of IA32_SPEC_CTRL MSR
> and the mask to preserve them across nested transitions. Note that the
> shadow value may be changed because L2 may access the IA32_SPEC_CTRL
> directly and the mask may be changed due to migration when L2 vCPUs are
> running.
> 
> Co-developed-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>

Duplicated SOB. Move the Co-developed-by down like other patches.

> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
>  arch/x86/include/asm/vmx.h         |  5 ++++
>  arch/x86/include/asm/vmxfeatures.h |  2 ++
>  arch/x86/kvm/vmx/capabilities.h    |  5 ++++
>  arch/x86/kvm/vmx/nested.c          | 13 ++++++++++
>  arch/x86/kvm/vmx/vmcs.h            |  2 ++
>  arch/x86/kvm/vmx/vmx.c             | 34 ++++++++++++++++++++-----
>  arch/x86/kvm/vmx/vmx.h             | 40 +++++++++++++++++++++++++++++-
>  7 files changed, 94 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index 498dc600bd5c..b9f88ecf20c3 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -81,6 +81,7 @@
>   * Definitions of Tertiary Processor-Based VM-Execution Controls.
>   */
>  #define TERTIARY_EXEC_IPI_VIRT			VMCS_CONTROL_BIT(IPI_VIRT)
> +#define TERTIARY_EXEC_SPEC_CTRL_VIRT		VMCS_CONTROL_BIT(SPEC_CTRL_VIRT)
>  
>  #define PIN_BASED_EXT_INTR_MASK                 VMCS_CONTROL_BIT(INTR_EXITING)
>  #define PIN_BASED_NMI_EXITING                   VMCS_CONTROL_BIT(NMI_EXITING)
> @@ -233,6 +234,10 @@ enum vmcs_field {
>  	TERTIARY_VM_EXEC_CONTROL_HIGH	= 0x00002035,
>  	PID_POINTER_TABLE		= 0x00002042,
>  	PID_POINTER_TABLE_HIGH		= 0x00002043,
> +	IA32_SPEC_CTRL_MASK             = 0x0000204A,
> +	IA32_SPEC_CTRL_MASK_HIGH        = 0x0000204B,
> +	IA32_SPEC_CTRL_SHADOW           = 0x0000204C,
> +	IA32_SPEC_CTRL_SHADOW_HIGH      = 0x0000204D,
>  	GUEST_PHYSICAL_ADDRESS          = 0x00002400,
>  	GUEST_PHYSICAL_ADDRESS_HIGH     = 0x00002401,
>  	VMCS_LINK_POINTER               = 0x00002800,
> diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
> index c6a7eed03914..c70d0769b7d0 100644
> --- a/arch/x86/include/asm/vmxfeatures.h
> +++ b/arch/x86/include/asm/vmxfeatures.h
> @@ -89,4 +89,6 @@
>  
>  /* Tertiary Processor-Based VM-Execution Controls, word 3 */
>  #define VMX_FEATURE_IPI_VIRT		( 3*32+  4) /* Enable IPI virtualization */
> +#define VMX_FEATURE_SPEC_CTRL_VIRT	( 3*32+  7) /* Enable IA32_SPEC_CTRL virtualization */
> +
>  #endif /* _ASM_X86_VMXFEATURES_H */
> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
> index 45162c1bcd8f..a7ab70b55acf 100644
> --- a/arch/x86/kvm/vmx/capabilities.h
> +++ b/arch/x86/kvm/vmx/capabilities.h
> @@ -138,6 +138,11 @@ static inline bool cpu_has_tertiary_exec_ctrls(void)
>  		CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
>  }
>  
> +static __always_inline bool cpu_has_spec_ctrl_virt(void)

Do we need to use __always_inline to force inline code generation, or
should this just align with the other cpu_has_xxx() functions and use the
plain inline annotation?

> +{
> +	return vmcs_config.cpu_based_3rd_exec_ctrl & TERTIARY_EXEC_SPEC_CTRL_VIRT;
> +}
> +
>  static inline bool cpu_has_vmx_virtualize_apic_accesses(void)
>  {
>  	return vmcs_config.cpu_based_2nd_exec_ctrl &



* Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support
  2023-04-17  6:48   ` Chenyi Qiang
@ 2023-04-17  7:31     ` Chao Gao
  0 siblings, 0 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-17  7:31 UTC (permalink / raw)
  To: Chenyi Qiang
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Mon, Apr 17, 2023 at 02:48:56PM +0800, Chenyi Qiang wrote:
>> Co-developed-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>
>Duplicated SOB. Move the Co-developed-by down like other patches.
>

OK. Will do

>> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
>> ---
>>  arch/x86/include/asm/vmx.h         |  5 ++++
>>  arch/x86/include/asm/vmxfeatures.h |  2 ++
>>  arch/x86/kvm/vmx/capabilities.h    |  5 ++++
>>  arch/x86/kvm/vmx/nested.c          | 13 ++++++++++
>>  arch/x86/kvm/vmx/vmcs.h            |  2 ++
>>  arch/x86/kvm/vmx/vmx.c             | 34 ++++++++++++++++++++-----
>>  arch/x86/kvm/vmx/vmx.h             | 40 +++++++++++++++++++++++++++++-
>>  7 files changed, 94 insertions(+), 7 deletions(-)
>> 
>> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
>> index 498dc600bd5c..b9f88ecf20c3 100644
>> --- a/arch/x86/include/asm/vmx.h
>> +++ b/arch/x86/include/asm/vmx.h
>> @@ -81,6 +81,7 @@
>>   * Definitions of Tertiary Processor-Based VM-Execution Controls.
>>   */
>>  #define TERTIARY_EXEC_IPI_VIRT			VMCS_CONTROL_BIT(IPI_VIRT)
>> +#define TERTIARY_EXEC_SPEC_CTRL_VIRT		VMCS_CONTROL_BIT(SPEC_CTRL_VIRT)
>>  
>>  #define PIN_BASED_EXT_INTR_MASK                 VMCS_CONTROL_BIT(INTR_EXITING)
>>  #define PIN_BASED_NMI_EXITING                   VMCS_CONTROL_BIT(NMI_EXITING)
>> @@ -233,6 +234,10 @@ enum vmcs_field {
>>  	TERTIARY_VM_EXEC_CONTROL_HIGH	= 0x00002035,
>>  	PID_POINTER_TABLE		= 0x00002042,
>>  	PID_POINTER_TABLE_HIGH		= 0x00002043,
>> +	IA32_SPEC_CTRL_MASK             = 0x0000204A,
>> +	IA32_SPEC_CTRL_MASK_HIGH        = 0x0000204B,
>> +	IA32_SPEC_CTRL_SHADOW           = 0x0000204C,
>> +	IA32_SPEC_CTRL_SHADOW_HIGH      = 0x0000204D,
>>  	GUEST_PHYSICAL_ADDRESS          = 0x00002400,
>>  	GUEST_PHYSICAL_ADDRESS_HIGH     = 0x00002401,
>>  	VMCS_LINK_POINTER               = 0x00002800,
>> diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
>> index c6a7eed03914..c70d0769b7d0 100644
>> --- a/arch/x86/include/asm/vmxfeatures.h
>> +++ b/arch/x86/include/asm/vmxfeatures.h
>> @@ -89,4 +89,6 @@
>>  
>>  /* Tertiary Processor-Based VM-Execution Controls, word 3 */
>>  #define VMX_FEATURE_IPI_VIRT		( 3*32+  4) /* Enable IPI virtualization */
>> +#define VMX_FEATURE_SPEC_CTRL_VIRT	( 3*32+  7) /* Enable IA32_SPEC_CTRL virtualization */
>> +
>>  #endif /* _ASM_X86_VMXFEATURES_H */
>> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
>> index 45162c1bcd8f..a7ab70b55acf 100644
>> --- a/arch/x86/kvm/vmx/capabilities.h
>> +++ b/arch/x86/kvm/vmx/capabilities.h
>> @@ -138,6 +138,11 @@ static inline bool cpu_has_tertiary_exec_ctrls(void)
>>  		CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
>>  }
>>  
>> +static __always_inline bool cpu_has_spec_ctrl_virt(void)
>
>Do we need to use __always_inline to force inline code generation, or
>should this just align with the other cpu_has_xxx() functions and use the
>plain inline annotation?

__always_inline is needed because this function is called from
vmx_spec_ctrl_restore_host(), which is in the .noinstr section. With just
inline, objtool may complain:

vmlinux.o: warning: objtool: vmx_spec_ctrl_restore_host+0xb3: call to cpu_has_spec_ctrl_virt() leaves .noinstr.text section

>
>> +{
>> +	return vmcs_config.cpu_based_3rd_exec_ctrl & TERTIARY_EXEC_SPEC_CTRL_VIRT;
>> +}
>> +
>>  static inline bool cpu_has_vmx_virtualize_apic_accesses(void)
>>  {
>>  	return vmcs_config.cpu_based_2nd_exec_ctrl &
>


* Re: [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations
  2023-04-14  6:25 ` [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations Chao Gao
@ 2023-04-17 13:43   ` Binbin Wu
  2023-04-18  2:01     ` Chao Gao
  0 siblings, 1 reply; 42+ messages in thread
From: Binbin Wu @ 2023-04-17 13:43 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Pawan Gupta, Zhang Chen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Peter Zijlstra,
	Babu Moger, Daniel Sneddon, Sandipan Das, Nikunj A Dadhania,
	Josh Poimboeuf, Kim Phillips, Alexandre Chartre, linux-kernel


On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
>
> Guests that have different family/model than the host may not be aware
> of hardware mitigations(such as RRSBA_DIS_S) available on host. This is
> particularly true when guests migrate. To solve this problem Intel
> processors have added a virtual MSR interface through which guests can
> report their mitigation status and request VMM to deploy relevant
> hardware mitigations.
>
> Use this virtualized MSR interface to request relevant hardware controls
> for retpoline mitigation.
>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Co-developed-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
>   arch/x86/include/asm/msr-index.h | 25 +++++++++++++++++++++++++
>   arch/x86/kernel/cpu/bugs.c       | 25 +++++++++++++++++++++++++
>   2 files changed, 50 insertions(+)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 60b25d87b82c..aec213f0c6fc 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -166,6 +166,7 @@
>   						 * IA32_XAPIC_DISABLE_STATUS MSR
>   						 * supported
>   						 */
> +#define ARCH_CAP_VIRTUAL_ENUM		BIT_ULL(63) /* MSR_VIRTUAL_ENUMERATION supported */
>   
>   #define MSR_IA32_FLUSH_CMD		0x0000010b
>   #define L1D_FLUSH			BIT(0)	/*
> @@ -1103,6 +1104,30 @@
>   #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
>   #define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
>   #define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE   0x1F
> +
> +/* Intel virtual MSRs */
> +#define MSR_VIRTUAL_ENUMERATION			0x50000000
> +#define VIRT_ENUM_MITIGATION_CTRL_SUPPORT	BIT(0)	/*
> +							 * Mitigation ctrl via virtual
> +							 * MSRs supported
> +							 */
> +
> +#define MSR_VIRTUAL_MITIGATION_ENUM		0x50000001
> +#define MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT	BIT(0)	/* VMM supports BHI_DIS_S */
> +#define MITI_ENUM_RETPOLINE_S_SUPPORT		BIT(1)	/* VMM supports RRSBA_DIS_S */
> +
> +#define MSR_VIRTUAL_MITIGATION_CTRL		0x50000002
> +#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT	0	/*
> +							 * Request VMM to deploy
> +							 * BHI_DIS_S mitigation
> +							 */
> +#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED		BIT(MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT)

It seems this is defined, but not used anywhere to request the VMM to
deploy the BHI_DIS_S mitigation?


And IMO, it is more natural to put this patch after the four capability
advertising patches.


> +#define MITI_CTRL_RETPOLINE_S_USED_BIT		1	/*
> +							 * Request VMM to deploy
> +							 * RRSBA_DIS_S mitigation
> +							 */
> +#define MITI_CTRL_RETPOLINE_S_USED		BIT(MITI_CTRL_RETPOLINE_S_USED_BIT)
> +
>   /* AMD-V MSRs */
>   
>   #define MSR_VM_CR                       0xc0010114
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index f9d060e71c3e..5326c03d9d5e 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1435,6 +1435,27 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
>   	dump_stack();
>   }
>   
> +/* Speculation control using virtualized MSRs */
> +static void spec_ctrl_setup_virtualized_msr(void)
> +{
> +	u64 msr_virt_enum, msr_mitigation_enum;
> +
> +	/* When retpoline is being used, request relevant hardware controls */
> +	if (!boot_cpu_has(X86_FEATURE_RETPOLINE))
> +		return;
> +
> +	if (!(x86_read_arch_cap_msr() & ARCH_CAP_VIRTUAL_ENUM))
> +		return;
> +
> +	rdmsrl(MSR_VIRTUAL_ENUMERATION, msr_virt_enum);
> +	if (!(msr_virt_enum & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
> +		return;
> +
> +	rdmsrl(MSR_VIRTUAL_MITIGATION_ENUM, msr_mitigation_enum);
> +	if (msr_mitigation_enum & MITI_ENUM_RETPOLINE_S_SUPPORT)
> +		msr_set_bit(MSR_VIRTUAL_MITIGATION_CTRL, MITI_CTRL_RETPOLINE_S_USED_BIT);
> +}
> +
>   static void __init spectre_v2_select_mitigation(void)
>   {
>   	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
> @@ -1546,6 +1567,8 @@ static void __init spectre_v2_select_mitigation(void)
>   	    mode == SPECTRE_V2_RETPOLINE)
>   		spec_ctrl_disable_kernel_rrsba();
>   
> +	spec_ctrl_setup_virtualized_msr();
> +
>   	spectre_v2_enabled = mode;
>   	pr_info("%s\n", spectre_v2_strings[mode]);
>   
> @@ -2115,6 +2138,8 @@ void x86_spec_ctrl_setup_ap(void)
>   
>   	if (ssb_mode == SPEC_STORE_BYPASS_DISABLE)
>   		x86_amd_ssb_disable();
> +
> +	spec_ctrl_setup_virtualized_msr();
>   }
>   
>   bool itlb_multihit_kvm_mitigation;
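[Editor's note: the enumeration dance in spec_ctrl_setup_virtualized_msr()
above can be sketched as a pure function. The helper name is hypothetical,
and the rdmsrl() results are passed in as parameters so the logic can be
exercised outside a kernel; the bit definitions are from the patch.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions copied from the patch. */
#define ARCH_CAP_VIRTUAL_ENUM			(1ull << 63)
#define VIRT_ENUM_MITIGATION_CTRL_SUPPORT	(1ull << 0)
#define MITI_ENUM_RETPOLINE_S_SUPPORT		(1ull << 1)

/*
 * Returns true when the guest should set MITI_CTRL_RETPOLINE_S_USED in
 * MSR_VIRTUAL_MITIGATION_CTRL, mirroring the early-return checks in
 * spec_ctrl_setup_virtualized_msr().
 */
static bool want_retpoline_s_request(bool retpoline_in_use, uint64_t arch_cap,
				     uint64_t virt_enum, uint64_t miti_enum)
{
	if (!retpoline_in_use)
		return false;
	if (!(arch_cap & ARCH_CAP_VIRTUAL_ENUM))
		return false;
	if (!(virt_enum & VIRT_ENUM_MITIGATION_CTRL_SUPPORT))
		return false;
	return !!(miti_enum & MITI_ENUM_RETPOLINE_S_SUPPORT);
}
```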


* Re: [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations
  2023-04-17 13:43   ` Binbin Wu
@ 2023-04-18  2:01     ` Chao Gao
  0 siblings, 0 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-18  2:01 UTC (permalink / raw)
  To: Binbin Wu
  Cc: kvm, Jiaan Lu, Pawan Gupta, Zhang Chen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Peter Zijlstra, Babu Moger, Daniel Sneddon, Sandipan Das,
	Nikunj A Dadhania, Josh Poimboeuf, Kim Phillips,
	Alexandre Chartre, linux-kernel

On Mon, Apr 17, 2023 at 09:43:59PM +0800, Binbin Wu wrote:
>
>On 4/14/2023 2:25 PM, Chao Gao wrote:
>> From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
>> 
>> Guests that have different family/model than the host may not be aware
>> of hardware mitigations(such as RRSBA_DIS_S) available on host. This is
>> particularly true when guests migrate. To solve this problem Intel
>> processors have added a virtual MSR interface through which guests can
>> report their mitigation status and request VMM to deploy relevant
>> hardware mitigations.
>> 
>> Use this virtualized MSR interface to request relevant hardware controls
>> for retpoline mitigation.
>> 
>> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
>> Co-developed-by: Zhang Chen <chen.zhang@intel.com>
>> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
>> ---
>>   arch/x86/include/asm/msr-index.h | 25 +++++++++++++++++++++++++
>>   arch/x86/kernel/cpu/bugs.c       | 25 +++++++++++++++++++++++++
>>   2 files changed, 50 insertions(+)
>> 
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index 60b25d87b82c..aec213f0c6fc 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -166,6 +166,7 @@
>>   						 * IA32_XAPIC_DISABLE_STATUS MSR
>>   						 * supported
>>   						 */
>> +#define ARCH_CAP_VIRTUAL_ENUM		BIT_ULL(63) /* MSR_VIRTUAL_ENUMERATION supported */
>>   #define MSR_IA32_FLUSH_CMD		0x0000010b
>>   #define L1D_FLUSH			BIT(0)	/*
>> @@ -1103,6 +1104,30 @@
>>   #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
>>   #define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
>>   #define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE   0x1F
>> +
>> +/* Intel virtual MSRs */
>> +#define MSR_VIRTUAL_ENUMERATION			0x50000000
>> +#define VIRT_ENUM_MITIGATION_CTRL_SUPPORT	BIT(0)	/*
>> +							 * Mitigation ctrl via virtual
>> +							 * MSRs supported
>> +							 */
>> +
>> +#define MSR_VIRTUAL_MITIGATION_ENUM		0x50000001
>> +#define MITI_ENUM_BHB_CLEAR_SEQ_S_SUPPORT	BIT(0)	/* VMM supports BHI_DIS_S */
>> +#define MITI_ENUM_RETPOLINE_S_SUPPORT		BIT(1)	/* VMM supports RRSBA_DIS_S */
>> +
>> +#define MSR_VIRTUAL_MITIGATION_CTRL		0x50000002
>> +#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT	0	/*
>> +							 * Request VMM to deploy
>> +							 * BHI_DIS_S mitigation
>> +							 */
>> +#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED		BIT(MITI_CTRL_BHB_CLEAR_SEQ_S_USED_BIT)
>
>It seems this is defined, but not used anywhere to request the VMM to
>deploy the BHI_DIS_S mitigation?

Because the Linux kernel doesn't use the BHB-clearing sequence. Instead,
"disable unprivileged eBPF by default" + SMAP + eIBRS are used.

KVM uses this bit when checking whether guests, which may not be running
Linux, are using the BHB-clearing sequence.
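[Editor's note: on the VMM side, one plausible reaction to the reported bits
— a purely hypothetical sketch, not code from this series — is to translate
them into IA32_SPEC_CTRL bits the host pins on, e.g. via the
IA32_SPEC_CTRL_MASK field introduced in patch 4.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout from the series; the helper itself is a hypothetical sketch. */
#define MITI_CTRL_BHB_CLEAR_SEQ_S_USED	(1ull << 0)
#define MITI_CTRL_RETPOLINE_S_USED	(1ull << 1)
#define SPEC_CTRL_RRSBA_DIS_S		(1ull << 6)  /* needs RRSBA_CTRL */
#define SPEC_CTRL_BHI_DIS_S		(1ull << 10) /* needs BHI_CTRL */

/*
 * Map guest-reported software mitigations to hardware mitigation bits the
 * host could enforce in IA32_SPEC_CTRL, where the host supports them.
 */
static uint64_t miti_ctrl_to_spec_ctrl_mask(uint64_t miti_ctrl,
					    bool host_has_rrsba_ctrl,
					    bool host_has_bhi_ctrl)
{
	uint64_t mask = 0;

	if ((miti_ctrl & MITI_CTRL_RETPOLINE_S_USED) && host_has_rrsba_ctrl)
		mask |= SPEC_CTRL_RRSBA_DIS_S;
	if ((miti_ctrl & MITI_CTRL_BHB_CLEAR_SEQ_S_USED) && host_has_bhi_ctrl)
		mask |= SPEC_CTRL_BHI_DIS_S;

	return mask;
}
```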

>
>
>And IMO, it is more natural to put this patch after the four capability
>advertising patches.

Makes sense. I will organize the series in that order.


* Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support
  2023-04-17  3:17   ` Binbin Wu
@ 2023-04-18  2:07     ` Chao Gao
  0 siblings, 0 replies; 42+ messages in thread
From: Chao Gao @ 2023-04-18  2:07 UTC (permalink / raw)
  To: Binbin Wu
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Mon, Apr 17, 2023 at 11:17:36AM +0800, Binbin Wu wrote:
>
>On 4/14/2023 2:25 PM, Chao Gao wrote:
>> From: Zhang Chen <chen.zhang@intel.com>
>> 
>> Currently KVM disables interception of IA32_SPEC_CTRL after a non-0 is
>> written to IA32_SPEC_CTRL by guest. Then, guest is allowed to write any
>> value to hardware.
>> 
>> "virtualize IA32_SPEC_CTRL" is a new tertiary vm-exec control. This
>> feature allows KVM to specify that certain bits of the IA32_SPEC_CTRL
>> MSR cannot be modified by guest software.
>> 
>> Two VMCS fields are added:
>> 
>>    IA32_SPEC_CTRL_MASK:   bits that guest software cannot modify
>>    IA32_SPEC_CTRL_SHADOW: value that guest software expects to be in the
>> 			 IA32_SPEC_CTRL MSR
>> 
>> On rdmsr, the shadow value is returned. on wrmsr, EDX:EAX is written
>> to the IA32_SPEC_CTRL_SHADOW and (cur_val & mask) | (EDX:EAX & ~mask)
>> is written to the IA32_SPEC_CTRL MSR, where
>>    * cur_val is the original value of IA32_SPEC_CTRL MSR
>>    * mask is the value of IA32_SPEC_CTRL_MASK
>> 
>> Add a mask e.g.,
>
>e.g. or i.e. ?

Yes, here should be "i.e.".

>> +static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
>> +{
>> +	vmx->guest_spec_ctrl = val;
>> +
>> +	/*
>> +	 * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
>> +	 * regardless of the MSR intercept state.
>
>It is better to use "IA32_SPEC_CTRL" explicitly instead of "the MSR" to
>avoid misunderstanding.

Agreed. Will do.


* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-04-14  6:25 ` [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support Chao Gao
  2023-04-16  7:04   ` Binbin Wu
@ 2023-05-15  6:53   ` Xiaoyao Li
  2023-05-16  2:04     ` Chao Gao
  1 sibling, 1 reply; 42+ messages in thread
From: Xiaoyao Li @ 2023-05-15  6:53 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <chen.zhang@intel.com>
> 
> Add a kvm-only CPUID feature leaf for CPUID.7.2.EDX and RRSBA_CTRL
> as the first feature in the leaf.
> 
> RRSBA_CTRL is enumerated by CPUID.7.2.EDX[2]. If supported, RRSBA_DIS_U
> (bit 5) and RRSBA_DIS_S (bit 6) of IA32_SPEC_CTRL MSR can be used to
> disable RRSBA behavior for CPL3 and CPL0/1/2 respectively.
> 
> Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses
> after a non-zero is written to the MSR. Therefore, guests can already
> toggle the two bits if the host supports RRSBA_CTRL, and no extra code
> is needed to allow guests to toggle the two bits.

This is a bug that also affects other bits in MSR_IA32_SPEC_CTRL which
have a dedicated enumeration CPUID bit but no support in KVM yet.

I think we need to fix this bug first.

> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
>   arch/x86/kvm/cpuid.c         | 22 +++++++++++++++++++---
>   arch/x86/kvm/reverse_cpuid.h |  7 +++++++
>   2 files changed, 26 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 9583a110cf5f..f024c3ac2203 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -685,6 +685,10 @@ void kvm_set_cpu_caps(void)
>   		SF(SGX1) | SF(SGX2) | SF(SGX_EDECCSSA)
>   	);
>   
> +	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
> +		SF(RRSBA_CTRL)
> +	);
> +

Please move this hook up to right follow the leaf CPUID_7_1_EAX.

>   	kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
>   		F(LAHF_LM) | F(CMP_LEGACY) | 0 /*SVM*/ | 0 /* ExtApicSpace */ |
>   		F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
> @@ -949,13 +953,14 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>   		break;
>   	/* function 7 has additional index. */
>   	case 7:
> -		entry->eax = min(entry->eax, 1u);
> +		entry->eax = min(entry->eax, 2u);
>   		cpuid_entry_override(entry, CPUID_7_0_EBX);
>   		cpuid_entry_override(entry, CPUID_7_ECX);
>   		cpuid_entry_override(entry, CPUID_7_EDX);
>   
> -		/* KVM only supports 0x7.0 and 0x7.1, capped above via min(). */
> -		if (entry->eax == 1) {
> +		max_idx = entry->eax;
> +
> +		if (max_idx >= 1) {
>   			entry = do_host_cpuid(array, function, 1);
>   			if (!entry)
>   				goto out;
> @@ -965,6 +970,17 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>   			entry->ebx = 0;
>   			entry->ecx = 0;
>   		}
> +
> +		if (max_idx >= 2) {
> +			entry = do_host_cpuid(array, function, 2);
> +			if (!entry)
> +				goto out;
> +
> +			cpuid_entry_override(entry, CPUID_7_2_EDX);
> +			entry->eax = 0;
> +			entry->ebx = 0;
> +			entry->ecx = 0;
> +		}
>   		break;
>   	case 0xa: { /* Architectural Performance Monitoring */
>   		union cpuid10_eax eax;
> diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
> index a5717282bb9c..72bad8314a9c 100644
> --- a/arch/x86/kvm/reverse_cpuid.h
> +++ b/arch/x86/kvm/reverse_cpuid.h
> @@ -15,6 +15,7 @@ enum kvm_only_cpuid_leafs {
>   	CPUID_12_EAX	 = NCAPINTS,
>   	CPUID_7_1_EDX,
>   	CPUID_8000_0007_EDX,
> +	CPUID_7_2_EDX,
>   	NR_KVM_CPU_CAPS,
>   
>   	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
> @@ -47,6 +48,9 @@ enum kvm_only_cpuid_leafs {
>   /* CPUID level 0x80000007 (EDX). */
>   #define KVM_X86_FEATURE_CONSTANT_TSC	KVM_X86_FEATURE(CPUID_8000_0007_EDX, 8)
>   
> +/* Intel-defined sub-features, CPUID level 0x00000007:2 (EDX) */
> +#define KVM_X86_FEATURE_RRSBA_CTRL	KVM_X86_FEATURE(CPUID_7_2_EDX, 2)
> +
>   struct cpuid_reg {
>   	u32 function;
>   	u32 index;
> @@ -69,6 +73,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
>   	[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
>   	[CPUID_7_EDX]         = {         7, 0, CPUID_EDX},
>   	[CPUID_7_1_EAX]       = {         7, 1, CPUID_EAX},
> +	[CPUID_7_2_EDX]       = {         7, 2, CPUID_EDX},
>   	[CPUID_12_EAX]        = {0x00000012, 0, CPUID_EAX},
>   	[CPUID_8000_001F_EAX] = {0x8000001f, 0, CPUID_EAX},
>   	[CPUID_7_1_EDX]       = {         7, 1, CPUID_EDX},
> @@ -108,6 +113,8 @@ static __always_inline u32 __feature_translate(int x86_feature)
>   		return KVM_X86_FEATURE_SGX_EDECCSSA;
>   	else if (x86_feature == X86_FEATURE_CONSTANT_TSC)
>   		return KVM_X86_FEATURE_CONSTANT_TSC;
> +	else if (x86_feature == X86_FEATURE_RRSBA_CTRL)
> +		return KVM_X86_FEATURE_RRSBA_CTRL;
>   
>   	return x86_feature;
>   }



* Re: [RFC PATCH v2 03/11] KVM: x86: Advertise BHI_CTRL support
  2023-04-14  6:25 ` [RFC PATCH v2 03/11] KVM: x86: Advertise BHI_CTRL support Chao Gao
@ 2023-05-15  7:14   ` Xiaoyao Li
  0 siblings, 0 replies; 42+ messages in thread
From: Xiaoyao Li @ 2023-05-15  7:14 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <chen.zhang@intel.com>
> 
> Add 100% kvm-only feature for BHI_CTRL because the kernel doesn't use it
> at all.
> 
> BHI_CTRL is enumerated by CPUID.7.2.EDX[4]. If supported, BHI_DIS_S (bit
> 10) of IA32_SPEC_CTRL MSR can be used to enable BHI_DIS_S behavior.
> 
> Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses
> after a non-zero is written to the MSR. Therefore, guests can already
> toggle the BHI_DIS_S bit if the host supports BHI_CTRL, and no extra
> code is needed to allow guests to toggle the bit.

Same as Patch 2, please first fix the virtualization of
MSR_IA32_SPEC_CTRL. Otherwise this patch makes no sense. E.g., if only
X86_FEATURE_BHI_CTRL is exposed to the guest without any other CPUID bits
related to MSR_IA32_SPEC_CTRL, then the guest cannot write
MSR_IA32_SPEC_CTRL at all.

> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
>   arch/x86/kvm/cpuid.c         | 2 +-
>   arch/x86/kvm/reverse_cpuid.h | 1 +
>   2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index f024c3ac2203..7cdd859d09a2 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -686,7 +686,7 @@ void kvm_set_cpu_caps(void)
>   	);
>   
>   	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
> -		SF(RRSBA_CTRL)
> +		SF(RRSBA_CTRL) | F(BHI_CTRL)
>   	);
>   
>   	kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
> diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
> index 72bad8314a9c..e7e70c9aa384 100644
> --- a/arch/x86/kvm/reverse_cpuid.h
> +++ b/arch/x86/kvm/reverse_cpuid.h
> @@ -50,6 +50,7 @@ enum kvm_only_cpuid_leafs {
>   
>   /* Intel-defined sub-features, CPUID level 0x00000007:2 (EDX) */
>   #define KVM_X86_FEATURE_RRSBA_CTRL	KVM_X86_FEATURE(CPUID_7_2_EDX, 2)
> +#define X86_FEATURE_BHI_CTRL		KVM_X86_FEATURE(CPUID_7_2_EDX, 4)
>   
>   struct cpuid_reg {
>   	u32 function;



* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-05-15  6:53   ` Xiaoyao Li
@ 2023-05-16  2:04     ` Chao Gao
  2023-05-16  2:22       ` Xiaoyao Li
  0 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-05-16  2:04 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Mon, May 15, 2023 at 02:53:07PM +0800, Xiaoyao Li wrote:
>On 4/14/2023 2:25 PM, Chao Gao wrote:
>> From: Zhang Chen <chen.zhang@intel.com>
>> 
>> Add a kvm-only CPUID feature leaf for CPUID.7.2.EDX and RRSBA_CTRL
>> as the first feature in the leaf.
>> 
>> RRSBA_CTRL is enumerated by CPUID.7.2.EDX[2]. If supported, RRSBA_DIS_U
>> (bit 5) and RRSBA_DIS_S (bit 6) of IA32_SPEC_CTRL MSR can be used to
>> disable RRSBA behavior for CPL3 and CPL0/1/2 respectively.
>> 
>> Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses
>> after a non-zero is written to the MSR. Therefore, guests can already
>> toggle the two bits if the host supports RRSBA_CTRL, and no extra code
>> is needed to allow guests to toggle the two bits.
>
>This is a bug that also affects other bits in MSR_IA32_SPEC_CTRL which
>have a dedicated enumeration CPUID bit but no support in KVM yet.

Do you mean passing through the MSR is a bug? The guest can write any
hardware-supported value to the MSR if the MSR isn't intercepted.

I guess this is intentional and a trade-off for performance (note that
context-switch may cause writes to the MSR). And see

commit 841c2be09fe4 ("kvm: x86: replace kvm_spec_ctrl_test_value with runtime test on the host")

it appears that this behavior is widely recognized.

>
>I think we need to fix this bug first.

I have no idea how to fix the "bug" without intercepting the MSR. The
performance penalty makes me think intercepting the MSR is not a viable
solution.

>
>> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
>> ---
>>   arch/x86/kvm/cpuid.c         | 22 +++++++++++++++++++---
>>   arch/x86/kvm/reverse_cpuid.h |  7 +++++++
>>   2 files changed, 26 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 9583a110cf5f..f024c3ac2203 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -685,6 +685,10 @@ void kvm_set_cpu_caps(void)
>>   		SF(SGX1) | SF(SGX2) | SF(SGX_EDECCSSA)
>>   	);
>> +	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
>> +		SF(RRSBA_CTRL)
>> +	);
>> +
>
>Please move this hook up to right follow the leaf CPUID_7_1_EAX.

sure. will do.


* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-05-16  2:04     ` Chao Gao
@ 2023-05-16  2:22       ` Xiaoyao Li
  2023-05-16  3:01         ` Chao Gao
  0 siblings, 1 reply; 42+ messages in thread
From: Xiaoyao Li @ 2023-05-16  2:22 UTC (permalink / raw)
  To: Chao Gao
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 5/16/2023 10:04 AM, Chao Gao wrote:
> On Mon, May 15, 2023 at 02:53:07PM +0800, Xiaoyao Li wrote:
>> On 4/14/2023 2:25 PM, Chao Gao wrote:
>>> From: Zhang Chen <chen.zhang@intel.com>
>>>
>>> Add a kvm-only CPUID feature leaf for CPUID.7.2.EDX and RRSBA_CTRL
>>> as the first feature in the leaf.
>>>
>>> RRSBA_CTRL is enumerated by CPUID.7.2.EDX[2]. If supported, RRSBA_DIS_U
>>> (bit 5) and RRSBA_DIS_S (bit 6) of IA32_SPEC_CTRL MSR can be used to
>>> disable RRSBA behavior for CPL3 and CPL0/1/2 respectively.
>>>
>>> Note that KVM does not intercept guests' IA32_SPEC_CTRL MSR accesses
>>> after a non-zero is written to the MSR. Therefore, guests can already
>>> toggle the two bits if the host supports RRSBA_CTRL, and no extra code
>>> is needed to allow guests to toggle the two bits.
>>
>> This is a bug that also matters with other bits in MSR_IA32_SPEC_CTRL which
>> has a dedicated enumeration CPUID bit and no support in KVM yet.
> 
> Do you mean passing through the MSR is a bug? A guest can write any
> hardware-supported value to the MSR if the MSR isn't intercepted.
> 
> I guess this is intentional and a trade-off for performance (note that
> context-switch may cause writes to the MSR). And see
> 
> commit 841c2be09fe4 ("kvm: x86: replace kvm_spec_ctrl_test_value with runtime test on the host")
> 
> it appears that this behavior is widely recognized.
> 
>>
>> I think we need to fix this bug at first.
> 
> I have no idea how to fix the "bug" without intercepting the MSR. The
> performance penalty makes me think intercepting the MSR is not a viable
> solution.

I thought correctness always takes higher priority over performance.

>>
>>> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
>>> ---
>>>    arch/x86/kvm/cpuid.c         | 22 +++++++++++++++++++---
>>>    arch/x86/kvm/reverse_cpuid.h |  7 +++++++
>>>    2 files changed, 26 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index 9583a110cf5f..f024c3ac2203 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -685,6 +685,10 @@ void kvm_set_cpu_caps(void)
>>>    		SF(SGX1) | SF(SGX2) | SF(SGX_EDECCSSA)
>>>    	);
>>> +	kvm_cpu_cap_init_kvm_defined(CPUID_7_2_EDX,
>>> +		SF(RRSBA_CTRL)
>>> +	);
>>> +
>>
>> Please move this hook up to right follow the leaf CPUID_7_1_EAX.
> 
> sure. will do.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-05-16  2:22       ` Xiaoyao Li
@ 2023-05-16  3:01         ` Chao Gao
  2023-05-16  7:03           ` Xiaoyao Li
  0 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-05-16  3:01 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Tue, May 16, 2023 at 10:22:22AM +0800, Xiaoyao Li wrote:
>> > I think we need to fix this bug at first.
>> 
>> I have no idea how to fix the "bug" without intercepting the MSR. The
>> performance penalty makes me think intercepting the MSR is not a viable
>> solution.
>
>I thought correctness always takes higher priority over performance.

It is generally true. However, there are situations where we have to make
trade-offs between correctness and other factors (like performance):

E.g., for instructions without control bits: to be 100% compliant with the CPU
spec, VMMs would, in theory, have to trap/decode every instruction and inject
#UD if a guest tries to use an instruction it shouldn't.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-05-16  3:01         ` Chao Gao
@ 2023-05-16  7:03           ` Xiaoyao Li
  2023-05-16  9:09             ` Chao Gao
  0 siblings, 1 reply; 42+ messages in thread
From: Xiaoyao Li @ 2023-05-16  7:03 UTC (permalink / raw)
  To: Chao Gao
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 5/16/2023 11:01 AM, Chao Gao wrote:
> On Tue, May 16, 2023 at 10:22:22AM +0800, Xiaoyao Li wrote:
>>>> I think we need to fix this bug at first.
>>>
>>> I have no idea how to fix the "bug" without intercepting the MSR. The
>>> performance penalty makes me think intercepting the MSR is not a viable
>>> solution.
>>
>> I thought correctness always takes higher priority over performance.
> 
> It is generally true. However, there are situations where we have to make
> trade-offs between correctness and other factors (like performance):
> 
> E.g., for instructions without control bits: to be 100% compliant with the CPU
> spec, VMMs would, in theory, have to trap/decode every instruction and inject
> #UD if a guest tries to use an instruction it shouldn't.

This is the virtualization hole. IMHO, they are different things.

Passing through MSR_IA32_SPEC_CTRL was introduced in commit d28b387fb74d 
("KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL"). At that time 
there were only a few bits defined, and the changelog called out that

   No attempt is made to handle STIBP here, intentionally. Filtering
   STIBP may be added in a future patch, which may require trapping all
   writes if we don't want to pass it through directly to the guest.

Per my understanding, it implied that we need to revisit this when more 
bits are added, instead of silently following the pass-through design.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support
  2023-04-14  6:25 ` [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support Chao Gao
  2023-04-17  3:17   ` Binbin Wu
  2023-04-17  6:48   ` Chenyi Qiang
@ 2023-05-16  7:16   ` Xiaoyao Li
  2023-05-16  9:20     ` Chao Gao
  2 siblings, 1 reply; 42+ messages in thread
From: Xiaoyao Li @ 2023-05-16  7:16 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 4/14/2023 2:25 PM, Chao Gao wrote:

...

> +static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
> +{
> +	vmx->guest_spec_ctrl = val;
> +
> +	/*
> +	 * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
> +	 * regardless of the MSR intercept state.
> +	 */
> +	if (cpu_has_spec_ctrl_virt())
> +		vmcs_write64(IA32_SPEC_CTRL_SHADOW, val);
> +
> +	/*
> +	 * Update the effective value of IA32_SPEC_CTRL to reflect changes to
> +	 * guest's IA32_SPEC_CTRL. Bits in the mask should always be set.
> +	 */

Why should the bits in the mask always be set?

The bits set in the mask only mean they cannot be modified by the guest. 
KVM can use the mask to force the bits to 0 as well.

> +	vmx->spec_ctrl = val | vmx_get_spec_ctrl_mask(vmx);
> +}
>   #endif /* __KVM_X86_VMX_H */


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-05-16  7:03           ` Xiaoyao Li
@ 2023-05-16  9:09             ` Chao Gao
  2023-05-18  9:50               ` Xiaoyao Li
  0 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-05-16  9:09 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Tue, May 16, 2023 at 03:03:15PM +0800, Xiaoyao Li wrote:
>On 5/16/2023 11:01 AM, Chao Gao wrote:
>> On Tue, May 16, 2023 at 10:22:22AM +0800, Xiaoyao Li wrote:
>> > > > I think we need to fix this bug at first.
>> > > 
>> > > I have no idea how to fix the "bug" without intercepting the MSR. The
>> > > performance penalty makes me think intercepting the MSR is not a viable
>> > > solution.
>> > 
>> > I thought correctness always takes higher priority over performance.
>> 
>> It is generally true. However, there are situations where we have to make
>> trade-offs between correctness and other factors (like performance):
>> 
>> E.g., for instructions without control bits: to be 100% compliant with the CPU
>> spec, VMMs would, in theory, have to trap/decode every instruction and inject
>> #UD if a guest tries to use an instruction it shouldn't.
>
>This is the virtualization hole. IMHO, they are different things.

What is the difference between these two cases?
1. Executing some unsupported instructions should cause #UD. But this is allowed
   in a KVM guest.
2. Setting some reserved bits in the SPEC_CTRL MSR should cause #GP. But this is
   allowed in a KVM guest.

>
>Passing through MSR_IA32_SPEC_CTRL was introduced in commit d28b387fb74d
>("KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL"). At that time there
>were only a few bits defined, and the changelog called out that
>
>  No attempt is made to handle STIBP here, intentionally. Filtering
>  STIBP may be added in a future patch, which may require trapping all
>  writes if we don't want to pass it through directly to the guest.
>
>Per my understanding, it implied that we need to revisit this when more bits
>are added, instead of silently following the pass-through design.

I don't object to revisiting the design. My point is that preventing guests
from setting RRSBA_CTRL/BHI_CTRL bits when they are not advertised isn't a
strong justification for intercepting the MSR. STIBP and other bits (except
IBRS) have the same problem. And the gain from fixing this is too small.

If passing through the SPEC_CTRL MSR to guests might cause security issues, I
would agree to intercept accesses to the MSR.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support
  2023-05-16  7:16   ` Xiaoyao Li
@ 2023-05-16  9:20     ` Chao Gao
  0 siblings, 0 replies; 42+ messages in thread
From: Chao Gao @ 2023-05-16  9:20 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Tue, May 16, 2023 at 03:16:59PM +0800, Xiaoyao Li wrote:
>On 4/14/2023 2:25 PM, Chao Gao wrote:
>
>...
>
>> +static inline void vmx_set_guest_spec_ctrl(struct vcpu_vmx *vmx, u64 val)
>> +{
>> +	vmx->guest_spec_ctrl = val;
>> +
>> +	/*
>> +	 * For simplicity, always keep IA32_SPEC_CTRL_SHADOW up-to-date,
>> +	 * regardless of the MSR intercept state.
>> +	 */
>> +	if (cpu_has_spec_ctrl_virt())
>> +		vmcs_write64(IA32_SPEC_CTRL_SHADOW, val);
>> +
>> +	/*
>> +	 * Update the effective value of IA32_SPEC_CTRL to reflect changes to
>> +	 * guest's IA32_SPEC_CTRL. Bits in the mask should always be set.
>> +	 */
>
>Why should the bits in the mask always be set?
>
>The bits set in the mask only mean they cannot be modified by the guest. KVM
>can use the mask to force the bits to 0 as well.

Yes.

Because there is no use case for VMMs to lock some bits to 0 behind guests,
this isn't used in this series. There was a note in v1's changelog [1]:

	Note "virtual IA32_SPEC_CTRL" is now used by VMM to enforce some bits
	of IA32_SPEC_CTRL to 1 (i.e., enabled some HW mitigations transparently
	for guests). In theory, VMM can disable some HW mitigations behind guests.
	But to keep this series simple, we leave that for future work.


But somehow I dropped it (when I tried to slim down the changelog). Will add it
back and add a comment above the definition of spec_ctrl_mask.

[1]: https://lore.kernel.org/lkml/20221210160046.2608762-5-chen.zhang@intel.com/

>
>> +	vmx->spec_ctrl = val | vmx_get_spec_ctrl_mask(vmx);
>> +}
>>   #endif /* __KVM_X86_VMX_H */
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-05-16  9:09             ` Chao Gao
@ 2023-05-18  9:50               ` Xiaoyao Li
  2023-05-19  9:43                 ` Chao Gao
  0 siblings, 1 reply; 42+ messages in thread
From: Xiaoyao Li @ 2023-05-18  9:50 UTC (permalink / raw)
  To: Chao Gao
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 5/16/2023 5:09 PM, Chao Gao wrote:
> On Tue, May 16, 2023 at 03:03:15PM +0800, Xiaoyao Li wrote:
>> On 5/16/2023 11:01 AM, Chao Gao wrote:
>>> On Tue, May 16, 2023 at 10:22:22AM +0800, Xiaoyao Li wrote:
>>>>>> I think we need to fix this bug at first.
>>>>>
>>>>> I have no idea how to fix the "bug" without intercepting the MSR. The
>>>>> performance penalty makes me think intercepting the MSR is not a viable
>>>>> solution.
>>>>
>>>> I thought correctness always takes higher priority over performance.
>>>
>>> It is generally true. However, there are situations where we have to make
>>> trade-offs between correctness and other factors (like performance):
>>>
>>> E.g., for instructions without control bits: to be 100% compliant with the CPU
>>> spec, VMMs would, in theory, have to trap/decode every instruction and inject
>>> #UD if a guest tries to use an instruction it shouldn't.
>>
>> This is the virtualization hole. IMHO, they are different things.
> 
> What is the difference between these two cases?
> 1. Executing some unsupported instructions should cause #UD. But this is allowed
>     in a KVM guest.
> 2. Setting some reserved bits in the SPEC_CTRL MSR should cause #GP. But this is
>     allowed in a KVM guest.

The difference is that for the virtualization hole, there is no way but to 
intercept and decode every instruction if we want correctness. It's a 
disaster.

But for MSR virtualization, we do have an option and we don't need to 
trap every instruction. MSR interception is the designated mechanism to 
correctly and elegantly virtualize the MSR.

>>
>> Passing through MSR_IA32_SPEC_CTRL was introduced in commit d28b387fb74d
>> ("KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL"). At that time there
>> were only a few bits defined, and the changelog called out that
>>
>>   No attempt is made to handle STIBP here, intentionally. Filtering
>>   STIBP may be added in a future patch, which may require trapping all
>>   writes if we don't want to pass it through directly to the guest.
>>
>> Per my understanding, it implied that we need to revisit this when more bits
>> are added, instead of silently following the pass-through design.
> 
> I don't object to revisiting the design. My point is that preventing guests
> from setting RRSBA_CTRL/BHI_CTRL bits when they are not advertised isn't a
> strong justification for intercepting the MSR. STIBP and other bits (except
> IBRS) have the same problem. And the gain from fixing this is too small.
> 
> If passing through the SPEC_CTRL MSR to guests might cause security issues, I
> would agree to intercept accesses to the MSR.

I never bought that. How do we define "security"? If the user wants to hide 
a feature from the guest, but KVM allows it even though KVM has a reasonable 
way to hide it, does that violate security?

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
  2023-04-14  6:25 ` [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support Chao Gao
@ 2023-05-18 10:14   ` Xiaoyao Li
  2023-05-19  9:57     ` Chao Gao
  0 siblings, 1 reply; 42+ messages in thread
From: Xiaoyao Li @ 2023-05-18 10:14 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <chen.zhang@intel.com>
> 
> Bit 63 of the IA32_ARCH_CAPABILITIES MSR indicates availability of the
> VIRTUAL_ENUMERATION_MSR (index 0x50000000), which enumerates features such
> as mitigation enumeration, used by the guest to hint to VMMs which software
> mitigations are in use.
> 
> Advertise ARCH_CAP_VIRTUAL_ENUM support for VMX and emulate read/write
> of the VIRTUAL_ENUMERATION_MSR. Now VIRTUAL_ENUMERATION_MSR is always 0.
> 
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Co-developed-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
>   arch/x86/kvm/svm/svm.c |  1 +
>   arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
>   arch/x86/kvm/vmx/vmx.h |  1 +
>   arch/x86/kvm/x86.c     | 16 +++++++++++++++-
>   4 files changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 57f241c5a371..195d0cf9309a 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4093,6 +4093,7 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
>   {
>   	switch (index) {
>   	case MSR_IA32_MCG_EXT_CTL:
> +	case MSR_VIRTUAL_ENUMERATION:
>   	case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
>   		return false;
>   	case MSR_IA32_SMBASE:
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9f6919bec2b3..85419137decb 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1943,6 +1943,8 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
>   	return !(msr->data & ~valid_bits);
>   }
>   
> +#define VIRTUAL_ENUMERATION_VALID_BITS	0ULL
> +
>   static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
>   {
>   	switch (msr->index) {
> @@ -1950,6 +1952,9 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
>   		if (!nested)
>   			return 1;
>   		return vmx_get_vmx_msr(&vmcs_config.nested, msr->index, &msr->data);
> +	case MSR_VIRTUAL_ENUMERATION:
> +		msr->data = VIRTUAL_ENUMERATION_VALID_BITS;
> +		return 0;
>   	default:
>   		return KVM_MSR_RET_INVALID;
>   	}
> @@ -2096,6 +2101,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   	case MSR_IA32_DEBUGCTLMSR:
>   		msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
>   		break;
> +	case MSR_VIRTUAL_ENUMERATION:
> +		if (!msr_info->host_initiated &&
> +		    !(vcpu->arch.arch_capabilities & ARCH_CAP_VIRTUAL_ENUM))
> +			return 1;
> +		msr_info->data = vmx->msr_virtual_enumeration;
> +		break;
>   	default:
>   	find_uret_msr:
>   		msr = vmx_find_uret_msr(vmx, msr_info->index);
> @@ -2437,6 +2448,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   		}
>   		ret = kvm_set_msr_common(vcpu, msr_info);
>   		break;
> +	case MSR_VIRTUAL_ENUMERATION:
> +		if (!msr_info->host_initiated)
> +			return 1;
> +		if (data & ~VIRTUAL_ENUMERATION_VALID_BITS)
> +			return 1;
> +
> +		vmx->msr_virtual_enumeration = data;
> +		break;
>   
>   	default:
>   	find_uret_msr:
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 021d86b52e18..a7faaf9fdc26 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -292,6 +292,7 @@ struct vcpu_vmx {
>   
>   	u64		      spec_ctrl;
>   	u64		      guest_spec_ctrl;
> +	u64		      msr_virtual_enumeration;
>   	u32		      msr_ia32_umwait_control;
>   
>   	/*
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3c58dbae7b4c..a1bc52bebdcc 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1537,6 +1537,7 @@ static const u32 emulated_msrs_all[] = {
>   
>   	MSR_K7_HWCR,
>   	MSR_KVM_POLL_CONTROL,
> +	MSR_VIRTUAL_ENUMERATION,
>   };
>   
>   static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
> @@ -1570,6 +1571,7 @@ static const u32 msr_based_features_all[] = {
>   	MSR_IA32_UCODE_REV,
>   	MSR_IA32_ARCH_CAPABILITIES,
>   	MSR_IA32_PERF_CAPABILITIES,
> +	MSR_VIRTUAL_ENUMERATION,
>   };
>   
>   static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
> @@ -1591,7 +1593,8 @@ static unsigned int num_msr_based_features;
>   	 ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \
>   	 ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
>   	 ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
> -	 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO)
> +	 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | \
> +	 ARCH_CAP_VIRTUAL_ENUM)

We cannot do this.

Otherwise, an AMD L1 with X86_FEATURE_ARCH_CAPABILITIES configured could 
expose MSR_VIRTUAL_ENUMERATION to L2 without actually supporting it.

>   
>   static u64 kvm_get_arch_capabilities(void)
>   {
> @@ -1610,6 +1613,17 @@ static u64 kvm_get_arch_capabilities(void)
>   	 */
>   	data |= ARCH_CAP_PSCHANGE_MC_NO;
>   
> +	/*
> +	 * Virtual enumeration is a paravirt feature. The only usage for now
> +	 * is to bridge the gap caused by microarchitecture changes between
> +	 * different Intel processors. And its usage is linked to "virtualize
> +	 * IA32_SPEC_CTRL" which is a VMX feature. Whether AMD SVM can benefit
> +	 * from the same usage and how to implement it is still unclear. Limit
> +	 * virtual enumeration to VMX.
> +	 */
> +	if (static_call(kvm_x86_has_emulated_msr)(NULL, MSR_VIRTUAL_ENUMERATION))
> +		data |= ARCH_CAP_VIRTUAL_ENUM;
> +
>   	/*
>   	 * If we're doing cache flushes (either "always" or "cond")
>   	 * we will do one whenever the guest does a vmlaunch/vmresume.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
  2023-04-14  6:25 ` [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT Chao Gao
@ 2023-05-18 10:25   ` Xiaoyao Li
  2023-05-19 10:26     ` Chao Gao
  2023-05-22  9:43   ` Liu, Jingqi
  1 sibling, 1 reply; 42+ messages in thread
From: Xiaoyao Li @ 2023-05-18 10:25 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 4/14/2023 2:25 PM, Chao Gao wrote:
> Allow the guest to query whether the underlying VMM understands Retpoline
> and to report the status of Retpoline in supervisor mode, i.e., CPL < 3,
> via MSR_VIRTUAL_MITIGATION_ENUM/CTRL.
> 
> Disable RRSBA behavior by setting RRSBA_DIS_S for the guest if the guest is
> using retpoline and the processor exhibits the behavior.
> 
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
>   arch/x86/kvm/vmx/vmx.c | 24 +++++++++++++++++++++---
>   1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 980498c4c30c..25afb4c3e55e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1944,8 +1944,8 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
>   }
>   
>   #define VIRTUAL_ENUMERATION_VALID_BITS	VIRT_ENUM_MITIGATION_CTRL_SUPPORT
> -#define MITI_ENUM_VALID_BITS		0ULL
> -#define MITI_CTRL_VALID_BITS		0ULL
> +#define MITI_ENUM_VALID_BITS		MITI_ENUM_RETPOLINE_S_SUPPORT
> +#define MITI_CTRL_VALID_BITS		MITI_CTRL_RETPOLINE_S_USED
>   
>   static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
>   {
> @@ -2173,7 +2173,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   	struct vmx_uret_msr *msr;
>   	int ret = 0;
>   	u32 msr_index = msr_info->index;
> -	u64 data = msr_info->data, spec_ctrl_mask;
> +	u64 data = msr_info->data, arch_msr = 0, spec_ctrl_mask = 0;

Suggest making arch_msr and spec_ctrl_mask local variables of each 
case {} block.

>   	u32 index;
>   
>   	switch (msr_index) {
> @@ -2488,6 +2488,24 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   		if (data & ~MITI_CTRL_VALID_BITS)
>   			return 1;
>   
> +		if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
> +			rdmsrl(MSR_IA32_ARCH_CAPABILITIES, arch_msr);
> +
> +		if (data & MITI_CTRL_RETPOLINE_S_USED &&
> +		    kvm_cpu_cap_has(X86_FEATURE_RRSBA_CTRL) &&

Why is kvm_cpu_cap_has() used here? It indicates whether KVM supports 
exposing this feature to the guest. But what we need here is whether the 
host supports the feature. Though they might give the same result, we'd 
better use boot_cpu_has() or even read CPUID directly (since CPUID info 
can be changed by the clearcpuid magic) to avoid confusion.

> +		    arch_msr & ARCH_CAP_RRSBA)
> +			spec_ctrl_mask |= SPEC_CTRL_RRSBA_DIS_S;
> +
> +		/*
> +		 * Intercept IA32_SPEC_CTRL to disallow guest from changing
> +		 * certain bits.
> +		 */
> +		if (spec_ctrl_mask && !cpu_has_spec_ctrl_virt())
> +			vmx_enable_intercept_for_msr(vcpu, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
> +
> +		vmx_set_spec_ctrl_mask(vmx, spec_ctrl_mask);
> +		vmx_set_guest_spec_ctrl(vmx, vmx_get_guest_spec_ctrl(vmx));
> +
>   		vmx->msr_virtual_mitigation_ctrl = data;
>   		break;
>   	default:


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support
  2023-05-18  9:50               ` Xiaoyao Li
@ 2023-05-19  9:43                 ` Chao Gao
  0 siblings, 0 replies; 42+ messages in thread
From: Chao Gao @ 2023-05-19  9:43 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Thu, May 18, 2023 at 05:50:17PM +0800, Xiaoyao Li wrote:
>On 5/16/2023 5:09 PM, Chao Gao wrote:
>> On Tue, May 16, 2023 at 03:03:15PM +0800, Xiaoyao Li wrote:
>> > On 5/16/2023 11:01 AM, Chao Gao wrote:
>> > > On Tue, May 16, 2023 at 10:22:22AM +0800, Xiaoyao Li wrote:
>> > > > > > I think we need to fix this bug at first.
>> > > > > 
>> > > > > I have no idea how to fix the "bug" without intercepting the MSR. The
>> > > > > performance penalty makes me think intercepting the MSR is not a viable
>> > > > > solution.
>> > > > 
>> > > > I thought correctness always takes higher priority over performance.
>> > > 
>> > > It is generally true. However, there are situations where we have to make
>> > > trade-offs between correctness and other factors (like performance):
>> > > 
>> > > E.g., for instructions without control bits: to be 100% compliant with the CPU
>> > > spec, VMMs would, in theory, have to trap/decode every instruction and inject
>> > > #UD if a guest tries to use an instruction it shouldn't.
>> > 
>> > This is the virtualization hole. IMHO, they are different things.
>> 
>> What is the difference between these two cases?
>> 1. Executing some unsupported instructions should cause #UD. But this is allowed
>>     in a KVM guest.
>> 2. Setting some reserved bits in the SPEC_CTRL MSR should cause #GP. But this is
>>     allowed in a KVM guest.
>
>The difference is that for the virtualization hole, there is no way but to
>intercept and decode every instruction if we want correctness. It's a disaster.
>
>But for MSR virtualization, we do have an option and we don't need to trap
>every instruction. MSR interception is the designated mechanism to correctly
>and elegantly virtualize the MSR.

The gains in these two cases are similar: some guest operations are prevented.
But the performance costs are not. So where do you draw the line on when we
can sacrifice correctness for performance?

>
>> > 
>> > Passing through MSR_IA32_SPEC_CTRL was introduced in commit d28b387fb74d
>> > ("KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL"). At that time there
>> > were only a few bits defined, and the changelog called out that
>> > 
>> >   No attempt is made to handle STIBP here, intentionally. Filtering
>> >   STIBP may be added in a future patch, which may require trapping all
>> >   writes if we don't want to pass it through directly to the guest.
>> > 
>> > Per my understanding, it implied that we need to revisit this when more bits
>> > are added, instead of silently following the pass-through design.
>> 
>> I don't object to revisiting the design. My point is that preventing guests
>> from setting RRSBA_CTRL/BHI_CTRL bits when they are not advertised isn't a
>> strong justification for intercepting the MSR. STIBP and other bits (except
>> IBRS) have the same problem. And the gain from fixing this is too small.
>> 
>> If passing through the SPEC_CTRL MSR to guests might cause security issues, I
>> would agree to intercept accesses to the MSR.
>
>I never bought that. How do we define "security"? If the user wants to hide
>a feature from the guest, but KVM allows it even though KVM has a reasonable
>way to hide it, does that violate security?

I would say no. By "security", I mean the guest becomes vulnerable to some
issue, the guest can attack the host, or the guest can access unauthorized
data.

I tried to say that if the value of intercepting IA32_SPEC_CTRL outweighs the
performance penalty, it makes sense to do that. I guess the decision was
already made when STIBP support was enabled, and it means people care more
about performance than about preventing guests from setting STIBP.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
  2023-05-18 10:14   ` Xiaoyao Li
@ 2023-05-19  9:57     ` Chao Gao
  2023-05-22  1:02       ` Xiaoyao Li
  0 siblings, 1 reply; 42+ messages in thread
From: Chao Gao @ 2023-05-19  9:57 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Thu, May 18, 2023 at 06:14:40PM +0800, Xiaoyao Li wrote:
>>   static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
>> @@ -1591,7 +1593,8 @@ static unsigned int num_msr_based_features;
>>   	 ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \
>>   	 ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
>>   	 ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
>> -	 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO)
>> +	 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | \
>> +	 ARCH_CAP_VIRTUAL_ENUM)
>
>We cannot do it.
>
>Otherwise, an AMD L1 with X86_FEATURE_ARCH_CAPABILITIES configured is
>possible to expose MSR_VIRTUAL_ENUMERATION to L2 while no support for it.

How does AMD L1 see the ARCH_CAP_VIRTUAL_ENUM feature in the first
place? because ...

>
>>   static u64 kvm_get_arch_capabilities(void)
>>   {
>> @@ -1610,6 +1613,17 @@ static u64 kvm_get_arch_capabilities(void)
>>   	 */
>>   	data |= ARCH_CAP_PSCHANGE_MC_NO;
>> +	/*
>> +	 * Virtual enumeration is a paravirt feature. The only usage for now
>> +	 * is to bridge the gap caused by microarchitecture changes between
>> +	 * different Intel processors. And its usage is linked to "virtualize
>> +	 * IA32_SPEC_CTRL" which is a VMX feature. Whether AMD SVM can benefit
>> +	 * from the same usage and how to implement it is still unclear. Limit
>> +	 * virtual enumeration to VMX.
>> +	 */
>> +	if (static_call(kvm_x86_has_emulated_msr)(NULL, MSR_VIRTUAL_ENUMERATION))
>> +		data |= ARCH_CAP_VIRTUAL_ENUM;

the feature is exposed on Intel CPUs only.

Do you mean an AMD L1 created on an Intel L0, with the Intel L0 even
emulating nested (SVM) support for the L1? This sounds like a very contrived
case.

>> +
>>   	/*
>>   	 * If we're doing cache flushes (either "always" or "cond")
>>   	 * we will do one whenever the guest does a vmlaunch/vmresume.
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
  2023-05-18 10:25   ` Xiaoyao Li
@ 2023-05-19 10:26     ` Chao Gao
  0 siblings, 0 replies; 42+ messages in thread
From: Chao Gao @ 2023-05-19 10:26 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On Thu, May 18, 2023 at 06:25:30PM +0800, Xiaoyao Li wrote:
>> @@ -2173,7 +2173,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>   	struct vmx_uret_msr *msr;
>>   	int ret = 0;
>>   	u32 msr_index = msr_info->index;
>> -	u64 data = msr_info->data, spec_ctrl_mask;
>> +	u64 data = msr_info->data, arch_msr = 0, spec_ctrl_mask = 0;
>
>Suggest making arch_msr and spec_ctrl_mask local variables of each case {}
>block.

Sure. Will do

>
>>   	u32 index;
>>   	switch (msr_index) {
>> @@ -2488,6 +2488,24 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>   		if (data & ~MITI_CTRL_VALID_BITS)
>>   			return 1;
>> +		if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
>> +			rdmsrl(MSR_IA32_ARCH_CAPABILITIES, arch_msr);
>> +
>> +		if (data & MITI_CTRL_RETPOLINE_S_USED &&
>> +		    kvm_cpu_cap_has(X86_FEATURE_RRSBA_CTRL) &&
>
>why kvm_cpu_cap_has() is used here? it means whether KVM supports expose this
>feature to guest. But what we need here is whether host supports this
>feature. Though they might get the same result, we'd better use
>boot_cpu_has() or even read CPUID directly (since cpuid info can be changed
>by clearcpuid magic) to avoid confusion.

OK. This makes sense. I will use boot_cpu_has(). clearcpuid is sometimes
helpful for debugging, so I prefer to honor it.
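The distinction under discussion can be modeled in a few lines: the host CPU may support a feature without KVM exposing it to guests, and the check here cares about the former. The two-element capability arrays and feature numbering are illustrative assumptions:

```c
#include <assert.h>
#include <stdbool.h>

#define X86_FEATURE_RRSBA_CTRL 0 /* illustrative feature index */

static bool host_caps[1]; /* what the host CPU actually supports */
static bool kvm_caps[1];  /* what KVM advertises to guests */

static bool boot_cpu_has(int feature)    { return host_caps[feature]; }
static bool kvm_cpu_cap_has(int feature) { return kvm_caps[feature]; }

/* The point of the review comment: when deciding whether the *host* can
 * honor RRSBA_DIS_S, check the host capability, not the guest-visible
 * one -- the two can diverge even though they often agree. */
static bool host_can_set_rrsba_dis_s(void)
{
	return boot_cpu_has(X86_FEATURE_RRSBA_CTRL);
}
```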

Thanks.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
  2023-05-19  9:57     ` Chao Gao
@ 2023-05-22  1:02       ` Xiaoyao Li
  0 siblings, 0 replies; 42+ messages in thread
From: Xiaoyao Li @ 2023-05-22  1:02 UTC (permalink / raw)
  To: Chao Gao
  Cc: kvm, Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 5/19/2023 5:57 PM, Chao Gao wrote:
> On Thu, May 18, 2023 at 06:14:40PM +0800, Xiaoyao Li wrote:
>>>    static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
>>> @@ -1591,7 +1593,8 @@ static unsigned int num_msr_based_features;
>>>    	 ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \
>>>    	 ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
>>>    	 ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
>>> -	 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO)
>>> +	 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | \
>>> +	 ARCH_CAP_VIRTUAL_ENUM)
>>
>> We cannot do it.
>>
>> Otherwise, an AMD L1 with X86_FEATURE_ARCH_CAPABILITIES configured is
>> possible to expose MSR_VIRTUAL_ENUMERATION to L2 while no support for it.
> 
> How does AMD L1 see the ARCH_CAP_VIRTUAL_ENUM feature in the first
> place? because ...
> 
>>
>>>    static u64 kvm_get_arch_capabilities(void)
>>>    {
>>> @@ -1610,6 +1613,17 @@ static u64 kvm_get_arch_capabilities(void)
>>>    	 */
>>>    	data |= ARCH_CAP_PSCHANGE_MC_NO;
>>> +	/*
>>> +	 * Virtual enumeration is a paravirt feature. The only usage for now
>>> +	 * is to bridge the gap caused by microarchitecture changes between
>>> +	 * different Intel processors. And its usage is linked to "virtualize
>>> +	 * IA32_SPEC_CTRL" which is a VMX feature. Whether AMD SVM can benefit
>>> +	 * from the same usage and how to implement it is still unclear. Limit
>>> +	 * virtual enumeration to VMX.
>>> +	 */
>>> +	if (static_call(kvm_x86_has_emulated_msr)(NULL, MSR_VIRTUAL_ENUMERATION))
>>> +		data |= ARCH_CAP_VIRTUAL_ENUM;
> 
> the feature is exposed on Intel CPUs only.
> 
> Do you mean an AMD L1 created on an Intel L0? And the Intel L0 even
> emulates nested (SVM) support for the L1? This sounds like a very
> contrived case.

You are right. I was thinking of a rare case but ignored the fact that 
VMX doesn't nest SVM.

Sorry for that.

>>> +
>>>    	/*
>>>    	 * If we're doing cache flushes (either "always" or "cond")
>>>    	 * we will do one whenever the guest does a vmlaunch/vmresume.
>>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 10/11] KVM: selftests: Add tests for virtual enumeration/mitigation MSRs
  2023-04-14  6:25 ` [RFC PATCH v2 10/11] KVM: selftests: Add tests for virtual enumeration/mitigation MSRs Chao Gao
@ 2023-05-22  9:39   ` Liu, Jingqi
  0 siblings, 0 replies; 42+ messages in thread
From: Liu, Jingqi @ 2023-05-22  9:39 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Paolo Bonzini, Shuah Khan, Arnaldo Carvalho de Melo,
	Borislav Petkov, Pawan Gupta, Zhang Chen, Daniel Sneddon,
	linux-kernel, linux-kselftest

On 4/14/2023 2:25 PM, Chao Gao wrote:
> Three virtual MSRs added for guest to report the usage of software
Seems it's better like below.
s/Three virtual MSRs added/Add three virtual MSRs ?
> mitigations. They are enumerated in an architectural way. Try to
> access the three MSRs to ensure the behavior is expected:
> Specifically,
>
> 1. below three cases should cause #GP:
>   * access to a non-present MSR
>   * write to read-only MSRs
>   * toggling reserved bit of a writeable MSR
>
> 2. rdmsr/wrmsr in other cases should succeed
>
> 3. rdmsr should return the value last written
>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
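The MSR behavior the selftest checks can be modeled with a small simulation. The MSR indices, the valid-bits mask, and which MSR is read-only below are illustrative assumptions, not the architectural values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_VIRT_ENUM      0x50000000u /* read-only (assumed index) */
#define MSR_VIRT_MITI_CTRL 0x50000001u /* writeable, bits 1:0 valid */
#define MITI_CTRL_VALID    0x3ULL

static uint64_t miti_ctrl;

/* Returns true on success, false to model a #GP. */
static bool toy_wrmsr(uint32_t msr, uint64_t val)
{
	switch (msr) {
	case MSR_VIRT_ENUM:
		return false;          /* write to read-only MSR: #GP */
	case MSR_VIRT_MITI_CTRL:
		if (val & ~MITI_CTRL_VALID)
			return false;  /* reserved bit toggled: #GP */
		miti_ctrl = val;
		return true;
	default:
		return false;          /* non-present MSR: #GP */
	}
}

static bool toy_rdmsr(uint32_t msr, uint64_t *val)
{
	switch (msr) {
	case MSR_VIRT_ENUM:      *val = 1; return true;
	case MSR_VIRT_MITI_CTRL: *val = miti_ctrl; return true;
	default:                 return false;
	}
}
```

The three #GP cases from the commit message map to the three `false` paths in `toy_wrmsr()`, and the read-back check corresponds to `toy_rdmsr()` returning the last value accepted by `toy_wrmsr()`.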
Thanks,
Jingqi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 09/11] KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT
  2023-04-14  6:25 ` [RFC PATCH v2 09/11] KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT Chao Gao
@ 2023-05-22  9:41   ` Liu, Jingqi
  0 siblings, 0 replies; 42+ messages in thread
From: Liu, Jingqi @ 2023-05-22  9:41 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel


On 4/14/2023 2:25 PM, Chao Gao wrote:
> From: Zhang Chen <chen.zhang@intel.com>
>
> Allow guest to query if the underying VMM understands BHB-clearing
> sequence and report the statue of BHB-clearing sequence in suprevisor
s/statue/status
> mode i.e. CPL < 3 via MSR_VIRTUAL_MITIGATION_ENUM/CTRL.
>
> Enable BHI_DIS_S for guest if guest is using BHI-clearing sequence and
> the sequence isn't effective on the processor.
>
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
Thanks,
Jingqi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
  2023-04-14  6:25 ` [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT Chao Gao
  2023-05-18 10:25   ` Xiaoyao Li
@ 2023-05-22  9:43   ` Liu, Jingqi
  1 sibling, 0 replies; 42+ messages in thread
From: Liu, Jingqi @ 2023-05-22  9:43 UTC (permalink / raw)
  To: Chao Gao, kvm
  Cc: Jiaan Lu, Zhang Chen, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

On 4/14/2023 2:25 PM, Chao Gao wrote:
> Allow guest to query if the underying VMM understands Retpoline and
> report the statue of Retpoline in suprevisor mode i.e. CPL < 3 via
s/statue/status
> MSR_VIRTUAL_MITIGATION_ENUM/CTRL.
>
> Disable RRSBA behavior by setting RRSBA_DIS_S for guest if guest is
> using retpoline and the processor has the behavior.
>
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Jiaan Lu <jiaan.lu@intel.com>
> ---
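The decision the commit message describes can be sketched as a pure function: if the guest reports it uses retpoline in supervisor mode and the processor has the RRSBA behavior, the VMM folds RRSBA_DIS_S into the guest's effective IA32_SPEC_CTRL. The bit positions below are illustrative assumptions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MITI_CTRL_RETPOLINE_S_USED (1ULL << 0) /* assumed bit */
#define SPEC_CTRL_RRSBA_DIS_S      (1ULL << 6) /* assumed bit */

/* Compute the bits the VMM forces on in the guest's IA32_SPEC_CTRL:
 * retpoline only covers supervisor mode if returns cannot fall back to
 * the alternate (RRSBA) predictor, so the VMM disables that behavior on
 * the guest's behalf when the guest says retpoline is in use. */
static uint64_t spec_ctrl_mask_for(uint64_t miti_ctrl, bool cpu_has_rrsba)
{
	uint64_t mask = 0;

	if ((miti_ctrl & MITI_CTRL_RETPOLINE_S_USED) && cpu_has_rrsba)
		mask |= SPEC_CTRL_RRSBA_DIS_S;
	return mask;
}
```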
Thanks,
Jingqi

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2023-05-22  9:43 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
2023-04-14  6:25 [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Chao Gao
2023-04-14  6:25 ` [RFC PATCH v2 01/11] x86/msr-index: Add bit definitions for BHI_DIS_S and BHI_NO Chao Gao
2023-04-14  9:52   ` Binbin Wu
2023-04-14  6:25 ` [RFC PATCH v2 02/11] KVM: x86: Advertise CPUID.7.2.EDX and RRSBA_CTRL support Chao Gao
2023-04-16  7:04   ` Binbin Wu
2023-04-16 13:25     ` Chao Gao
2023-05-15  6:53   ` Xiaoyao Li
2023-05-16  2:04     ` Chao Gao
2023-05-16  2:22       ` Xiaoyao Li
2023-05-16  3:01         ` Chao Gao
2023-05-16  7:03           ` Xiaoyao Li
2023-05-16  9:09             ` Chao Gao
2023-05-18  9:50               ` Xiaoyao Li
2023-05-19  9:43                 ` Chao Gao
2023-04-14  6:25 ` [RFC PATCH v2 03/11] KVM: x86: Advertise BHI_CTRL support Chao Gao
2023-05-15  7:14   ` Xiaoyao Li
2023-04-14  6:25 ` [RFC PATCH v2 04/11] KVM: VMX: Add IA32_SPEC_CTRL virtualization support Chao Gao
2023-04-17  3:17   ` Binbin Wu
2023-04-18  2:07     ` Chao Gao
2023-04-17  6:48   ` Chenyi Qiang
2023-04-17  7:31     ` Chao Gao
2023-05-16  7:16   ` Xiaoyao Li
2023-05-16  9:20     ` Chao Gao
2023-04-14  6:25 ` [RFC PATCH v2 05/11] x86/bugs: Use Virtual MSRs to request hardware mitigations Chao Gao
2023-04-17 13:43   ` Binbin Wu
2023-04-18  2:01     ` Chao Gao
2023-04-14  6:25 ` [RFC PATCH v2 06/11] KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support Chao Gao
2023-05-18 10:14   ` Xiaoyao Li
2023-05-19  9:57     ` Chao Gao
2023-05-22  1:02       ` Xiaoyao Li
2023-04-14  6:25 ` [RFC PATCH v2 07/11] KVM: VMX: Advertise MITIGATION_CTRL support Chao Gao
2023-04-14  6:25 ` [RFC PATCH v2 08/11] KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT Chao Gao
2023-05-18 10:25   ` Xiaoyao Li
2023-05-19 10:26     ` Chao Gao
2023-05-22  9:43   ` Liu, Jingqi
2023-04-14  6:25 ` [RFC PATCH v2 09/11] KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT Chao Gao
2023-05-22  9:41   ` Liu, Jingqi
2023-04-14  6:25 ` [RFC PATCH v2 10/11] KVM: selftests: Add tests for virtual enumeration/mitigation MSRs Chao Gao
2023-05-22  9:39   ` Liu, Jingqi
2023-04-14  6:25 ` [RFC PATCH v2 11/11] KVM: selftests: Add tests for IA32_SPEC_CTRL MSR Chao Gao
2023-04-14  9:51 ` [RFC PATCH v2 00/11] Intel IA32_SPEC_CTRL Virtualization Binbin Wu
2023-04-14 22:10   ` Pawan Gupta
