* [PATCH 0/6] LASS KVM virtualization support
@ 2023-04-20 13:37 Zeng Guang
  2023-04-20 13:37 ` [PATCH 1/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
                   ` (6 more replies)
  0 siblings, 7 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-20 13:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao, Zeng Guang

Linear Address Space Separation (LASS)[1] is a new mechanism that
enforces the same mode-based protections as paging, i.e. SMAP/SMEP but
without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, "probes" by malicious
software will provide no paging-based timing information.

LASS works in long mode and partitions the 64-bit canonical linear
address space into two halves:
    1. Lower half (LA[63]=0) --> user space
    2. Upper half (LA[63]=1) --> kernel space

When LASS is enabled, a general protection fault (#GP) or a stack fault
(#SS) will be generated if software accesses an address in the half
opposite to the one in which it resides, e.g., either from user space to
the upper half, or from kernel space to the lower half. This protection
applies to both data access and code execution.

This series adds KVM LASS virtualization support.

When the platform has LASS capability, KVM is required to expose this
feature to the guest VM, enumerated by CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6],
and to allow the guest to enable it via CR4.LASS[bit 27] on demand. For
instructions executed directly in the guest, hardware performs the LASS
violation check, while KVM also needs to apply LASS to instructions emulated
by software and inject a #GP or #SS fault into the guest.

The following LASS violation checks will be performed on the KVM emulation
path; a standalone sketch of the rules follows the list below.
User-mode access to supervisor space address:
        LA[bit 63] && (CPL == 3)
Supervisor-mode access to user space address:
        Instruction fetch: !LA[bit 63] && (CPL < 3)
        Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
                     CPL < 3) || Implicit supervisor access)
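
For illustration, below is a minimal standalone sketch of the above rules in
C. The function and parameter names are hypothetical and are not the helpers
this series introduces; the actual implementation is __vmx_check_lass() in
patch 2.

static bool lass_violation(u64 la, int cpl, bool fetch, bool implicit,
			   bool cr4_smap, bool rflags_ac)
{
	/* LA[bit 63] set means supervisor space, clear means user space. */
	bool upper_half = la >> 63;
	/* Implicit accesses to system data structures are supervisor-mode. */
	bool user_mode = (cpl == 3) && !implicit;

	if (user_mode)		/* user-mode access to supervisor space */
		return upper_half;

	if (upper_half)		/* supervisor access within its own half */
		return false;

	if (fetch)		/* supervisor fetch from user space */
		return true;

	/*
	 * Supervisor-mode data access to user space violates LASS only
	 * with CR4.SMAP=1 and RFLAGS.AC=0; AC is treated as clear for
	 * implicit accesses.
	 */
	return cr4_smap && (!rflags_ac || implicit);
}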

We tested the basic functionality of LASS virtualization, including LASS
enumeration and enabling, in non-root and nested environments. As the
current KVM unit test framework is not compatible with the LASS rule that
the kernel should run in the upper half, we instead used a kernel module and
an application test to verify LASS functionality in the guest. The
data-access-related x86 emulator code was verified with the forced emulation
prefix (FEP) mechanism; a sketch of this approach follows below. Other test
cases are a work in progress.

How to add tests for LASS in KUT or kselftest is still under investigation.

[1] Intel Architecture Instruction Set Extensions and Future Features
Programming Reference: Chapter Linear Address Space Separation (LASS)
https://cdrdv2.intel.com/v1/dl/getContent/671368

Zeng Guang (6):
  KVM: x86: Virtualize CR4.LASS
  KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  KVM: x86: Add emulator helper for LASS violation check
  KVM: x86: LASS protection on KVM emulation when LASS enabled
  KVM: x86: Advertise LASS CPUID to user space
  KVM: x86: Set KVM LASS based on hardware capability

 arch/x86/include/asm/cpuid.h       | 36 +++++++++++++++++++
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  7 +++-
 arch/x86/kvm/cpuid.c               |  8 +++--
 arch/x86/kvm/emulate.c             | 36 ++++++++++++++++---
 arch/x86/kvm/kvm_emulate.h         |  1 +
 arch/x86/kvm/vmx/nested.c          |  3 ++
 arch/x86/kvm/vmx/sgx.c             |  2 ++
 arch/x86/kvm/vmx/vmx.c             | 58 ++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h             |  2 ++
 arch/x86/kvm/x86.c                 |  9 +++++
 arch/x86/kvm/x86.h                 |  2 ++
 12 files changed, 157 insertions(+), 8 deletions(-)

-- 
2.27.0



* [PATCH 1/6] KVM: x86: Virtualize CR4.LASS
  2023-04-20 13:37 [PATCH 0/6] LASS KVM virtualization support Zeng Guang
@ 2023-04-20 13:37 ` Zeng Guang
  2023-04-24  6:45   ` Binbin Wu
  2023-04-24  7:32   ` Chao Gao
  2023-04-20 13:37 ` [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-20 13:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao, Zeng Guang

Virtualize CR4.LASS[bit 27] under KVM control instead of making it
guest-owned, as CR4.LASS is generally set once for each vCPU at boot time
and won't be toggled at runtime. Besides, KVM allows guest software to set
CR4.LASS only if the VM has the LASS capability enumerated with
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6]. By design, CR4.LASS can be
manipulated by the nested guest as well.

Note: setting CR4.LASS to 1 enables LASS in IA-32e mode. It doesn't take
effect in legacy mode even if CR4.LASS is set.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/vmx/vmx.c          | 3 +++
 arch/x86/kvm/x86.h              | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6aaae18f1854..8ff89a52ef66 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,7 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP | X86_CR4_LASS))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 43ff3276918b..c923d7599d71 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7569,6 +7569,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
 	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
 
+	entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
+	cr4_fixed1_update(X86_CR4_LASS,       eax, feature_bit(LASS));
+
 #undef cr4_fixed1_update
 }
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 7c8a30d44c29..218f4c73789a 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -475,6 +475,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
 		__reserved_bits |= X86_CR4_VMXE;        \
 	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
 		__reserved_bits |= X86_CR4_PCIDE;       \
+	if (!__cpu_has(__c, X86_FEATURE_LASS))          \
+		__reserved_bits |= X86_CR4_LASS;        \
 	__reserved_bits;                                \
 })
 
-- 
2.27.0



* [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-04-20 13:37 [PATCH 0/6] LASS KVM virtualization support Zeng Guang
  2023-04-20 13:37 ` [PATCH 1/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
@ 2023-04-20 13:37 ` Zeng Guang
  2023-04-24  7:43   ` Binbin Wu
  2023-04-25  3:10   ` Chao Gao
  2023-04-20 13:37 ` [PATCH 3/6] KVM: x86: Add emulator helper " Zeng Guang
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-20 13:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao, Zeng Guang

Intel introduces the LASS (Linear Address Space Separation) feature,
providing an independent mechanism to achieve mode-based protection.

LASS partitions the 64-bit linear address space into two halves, user-mode
addresses (LA[bit 63]=0) and supervisor-mode addresses (LA[bit 63]=1). It
stops any code execution or data access
    1. from user mode to the supervisor-mode address space
    2. from supervisor mode to the user-mode address space
and generates a LASS violation fault accordingly.

A supervisor mode data access causes a LASS violation only if supervisor
mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
or the access implicitly accesses a system data structure.

Following are the rules of the LASS violation check on the linear address (LA).
User access to supervisor-mode address space:
    LA[bit 63] && (CPL == 3)
Supervisor access to user-mode address space:
    Instruction fetch: !LA[bit 63] && (CPL < 3)
    Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
		 CPL < 3) || Implicit supervisor access)

Add new ops in kvm_x86_ops to perform the LASS violation check.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  5 +++
 arch/x86/kvm/vmx/vmx.c             | 55 ++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h             |  2 ++
 4 files changed, 63 insertions(+)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index abccd51dcfca..f76c07f2674b 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -131,6 +131,7 @@ KVM_X86_OP(msr_filter_changed)
 KVM_X86_OP(complete_emulated_msr)
 KVM_X86_OP(vcpu_deliver_sipi_vector)
 KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
+KVM_X86_OP_OPTIONAL_RET0(check_lass);
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8ff89a52ef66..31fb8699a1ff 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -69,6 +69,9 @@
 #define KVM_X86_NOTIFY_VMEXIT_VALID_BITS	(KVM_X86_NOTIFY_VMEXIT_ENABLED | \
 						 KVM_X86_NOTIFY_VMEXIT_USER)
 
+/* x86-specific emulation flags */
+#define KVM_X86_EMULFLAG_SKIP_LASS	_BITULL(1)
+
 /* x86-specific vcpu->requests bit members */
 #define KVM_REQ_MIGRATE_TIMER		KVM_ARCH_REQ(0)
 #define KVM_REQ_REPORT_TPR_ACCESS	KVM_ARCH_REQ(1)
@@ -1706,6 +1709,8 @@ struct kvm_x86_ops {
 	 * Returns vCPU specific APICv inhibit reasons
 	 */
 	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
+
+	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags);
 };
 
 struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c923d7599d71..581327ede66a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8070,6 +8070,59 @@ static void vmx_vm_destroy(struct kvm *kvm)
 	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
 }
 
+/*
+ * Determine whether an access to the linear address causes a LASS violation.
+ * LASS protection is only effective in long mode. As a prerequisite, caller
+ * should make sure VM running in long mode and invoke this api to do LASS
+ * violation check.
+ */
+bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
+{
+	bool user_mode, user_as, rflags_ac;
+
+	if (!!(flags & KVM_X86_EMULFLAG_SKIP_LASS) ||
+	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
+		return false;
+
+	WARN_ON_ONCE(!is_long_mode(vcpu));
+
+	user_as = !(la >> 63);
+
+	/*
+	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
+	 * accesses a system data structure. For implicit accesses to system
+	 * data structure, the processor acts as if RFLAGS.AC is clear.
+	 */
+	if (access & PFERR_IMPLICIT_ACCESS) {
+		user_mode = false;
+		rflags_ac = false;
+	} else {
+		user_mode = vmx_get_cpl(vcpu) == 3;
+		if (!user_mode)
+			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
+	}
+
+	if (user_mode != user_as) {
+		/*
+		 * Supervisor-mode _data_ accesses to user address space
+		 * cause LASS violations only if SMAP is enabled.
+		 */
+		if (!user_mode && !(access & PFERR_FETCH_MASK)) {
+			return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) &&
+			       !rflags_ac;
+		} else {
+			return true;
+		}
+	}
+
+	return false;
+}
+
+static bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
+{
+	return is_long_mode(vcpu) && __vmx_check_lass(vcpu, access, la, flags);
+}
+
 static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.name = "kvm_intel",
 
@@ -8207,6 +8260,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.complete_emulated_msr = kvm_complete_insn_gp,
 
 	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
+
+	.check_lass = vmx_check_lass,
 };
 
 static unsigned int vmx_handle_intel_pt_intr(void)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a3da84f4ea45..6569385a5978 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
 u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
 u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
 
+bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags);
+
 static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
 					     int type, bool value)
 {
-- 
2.27.0



* [PATCH 3/6] KVM: x86: Add emulator helper for LASS violation check
  2023-04-20 13:37 [PATCH 0/6] LASS KVM virtualization support Zeng Guang
  2023-04-20 13:37 ` [PATCH 1/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
  2023-04-20 13:37 ` [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
@ 2023-04-20 13:37 ` Zeng Guang
  2023-04-20 13:37 ` [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled Zeng Guang
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-20 13:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao, Zeng Guang

When LASS is enabled, KVM needs to apply the LASS violation check to
instruction emulation. Add a helper for the x86 emulator to perform LASS
protection.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/kvm/kvm_emulate.h | 1 +
 arch/x86/kvm/x86.c         | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 2d9662be8333..1c55247d52d7 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -224,6 +224,7 @@ struct x86_emulate_ops {
 	int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
 	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
 	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
+	bool (*check_lass)(struct x86_emulate_ctxt *ctxt, u64 access, u64 la, u64 flags);
 };
 
 /* Type, address-of, and value of an instruction's operand. */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 87feb1249ad6..704c5e4b9e76 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8193,6 +8193,14 @@ static void emulator_vm_bugged(struct x86_emulate_ctxt *ctxt)
 		kvm_vm_bugged(kvm);
 }
 
+static bool emulator_check_lass(struct x86_emulate_ctxt *ctxt,
+				u64 access, u64 la, u64 flags)
+{
+	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+
+	return static_call(kvm_x86_check_lass)(vcpu, access, la, flags);
+}
+
 static const struct x86_emulate_ops emulate_ops = {
 	.vm_bugged           = emulator_vm_bugged,
 	.read_gpr            = emulator_read_gpr,
@@ -8237,6 +8245,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.leave_smm           = emulator_leave_smm,
 	.triple_fault        = emulator_triple_fault,
 	.set_xcr             = emulator_set_xcr,
+	.check_lass          = emulator_check_lass,
 };
 
 static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
-- 
2.27.0



* [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled
  2023-04-20 13:37 [PATCH 0/6] LASS KVM virtualization support Zeng Guang
                   ` (2 preceding siblings ...)
  2023-04-20 13:37 ` [PATCH 3/6] KVM: x86: Add emulator helper " Zeng Guang
@ 2023-04-20 13:37 ` Zeng Guang
  2023-04-25  2:52   ` Binbin Wu
  2023-04-26  1:31   ` Yuan Yao
  2023-04-20 13:37 ` [PATCH 5/6] KVM: x86: Advertise LASS CPUID to user space Zeng Guang
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-20 13:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao, Zeng Guang

Do the LASS violation check for instructions emulated by KVM. Note that for
instructions executed directly in the guest, hardware performs the check.

Not all instruction emulation leads to accesses to guest linear addresses,
because 1) some instructions, like CPUID and RDMSR, don't take memory as
operands, and 2) instruction fetch in most cases is already done inside the
guest.

Four cases in which KVM may access guest linear addresses are identified
by code inspection:
- The KVM emulator uses a segmented address for instruction fetches or data
  accesses.
- For implicit data access, the KVM emulator gets the address of a system
  data structure (GDT/LDT/IDT/TR).
- For VMX instruction emulation, KVM gets the address from the "VM-exit
  instruction information" field in the VMCS.
- For SGX ENCLS instruction emulation, KVM gets the address from registers.

The LASS violation check applies to these linear addresses so as to enforce
mode-based protections the way hardware does.

As exceptions, the target memory addresses of emulated invlpg, branch,
and call instructions don't require the LASS violation check.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/kvm/emulate.c    | 36 +++++++++++++++++++++++++++++++-----
 arch/x86/kvm/vmx/nested.c |  3 +++
 arch/x86/kvm/vmx/sgx.c    |  2 ++
 3 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 5cc3efa0e21c..a9a022fd712e 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -687,7 +687,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 				       struct segmented_address addr,
 				       unsigned *max_size, unsigned size,
 				       bool write, bool fetch,
-				       enum x86emul_mode mode, ulong *linear)
+				       enum x86emul_mode mode, ulong *linear,
+				       u64 flags)
 {
 	struct desc_struct desc;
 	bool usable;
@@ -695,6 +696,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 	u32 lim;
 	u16 sel;
 	u8  va_bits;
+	u64 access = fetch ? PFERR_FETCH_MASK : 0;
 
 	la = seg_base(ctxt, addr.seg) + addr.ea;
 	*max_size = 0;
@@ -740,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 		}
 		break;
 	}
+
+	if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
+		goto bad;
+
 	if (la & (insn_alignment(ctxt, size) - 1))
 		return emulate_gp(ctxt, 0);
 	return X86EMUL_CONTINUE;
@@ -757,7 +763,7 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
 {
 	unsigned max_size;
 	return __linearize(ctxt, addr, &max_size, size, write, false,
-			   ctxt->mode, linear);
+			   ctxt->mode, linear, 0);
 }
 
 static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
@@ -770,7 +776,10 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
 
 	if (ctxt->op_bytes != sizeof(unsigned long))
 		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
-	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
+
+	/* LASS doesn't apply to address for branch and call instructions */
+	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode,
+	     &linear, KVM_X86_EMULFLAG_SKIP_LASS);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->_eip = addr.ea;
 	return rc;
@@ -845,6 +854,13 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
 static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
 			      void *data, unsigned size)
 {
+	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
+		ctxt->exception.vector = GP_VECTOR;
+		ctxt->exception.error_code = 0;
+		ctxt->exception.error_code_valid = true;
+		return X86EMUL_PROPAGATE_FAULT;
+	}
+
 	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
 }
 
@@ -852,6 +868,13 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
 			       ulong linear, void *data,
 			       unsigned int size)
 {
+	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
+		ctxt->exception.vector = GP_VECTOR;
+		ctxt->exception.error_code = 0;
+		ctxt->exception.error_code_valid = true;
+		return X86EMUL_PROPAGATE_FAULT;
+	}
+
 	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
 }
 
@@ -907,7 +930,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
 	 * against op_size.
 	 */
 	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
-			 &linear);
+			 &linear, 0);
 	if (unlikely(rc != X86EMUL_CONTINUE))
 		return rc;
 
@@ -3432,8 +3455,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
 {
 	int rc;
 	ulong linear;
+	unsigned max_size;
 
-	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
+	/* LASS doesn't apply to the memory address for invlpg */
+	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, false, false,
+	     ctxt->mode, &linear, KVM_X86_EMULFLAG_SKIP_LASS);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->ops->invlpg(ctxt, linear);
 	/* Disable writeback. */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index c8ae9d0e59b3..55c88c4593a6 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4974,6 +4974,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 		 * destination for long mode!
 		 */
 		exn = is_noncanonical_address(*ret, vcpu);
+
+		if (!exn)
+			exn = __vmx_check_lass(vcpu, 0, *ret, 0);
 	} else {
 		/*
 		 * When not in long mode, the virtual/linear address is
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index b12da2a6dec9..30cb5d0980be 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -37,6 +37,8 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
 		fault = true;
 	} else if (likely(is_long_mode(vcpu))) {
 		fault = is_noncanonical_address(*gva, vcpu);
+		if (!fault)
+			fault = __vmx_check_lass(vcpu, 0, *gva, 0);
 	} else {
 		*gva &= 0xffffffff;
 		fault = (s.unusable) ||
-- 
2.27.0



* [PATCH 5/6] KVM: x86: Advertise LASS CPUID to user space
  2023-04-20 13:37 [PATCH 0/6] LASS KVM virtualization support Zeng Guang
                   ` (3 preceding siblings ...)
  2023-04-20 13:37 ` [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled Zeng Guang
@ 2023-04-20 13:37 ` Zeng Guang
  2023-04-20 13:37 ` [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability Zeng Guang
  2023-04-24  1:20 ` [PATCH 0/6] LASS KVM virtualization support Binbin Wu
  6 siblings, 0 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-20 13:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao, Zeng Guang

LASS (Linear Address Space Separation) is an independent mechanism to
enforce mode-based protection that can prevent user-mode accesses to
supervisor-mode addresses, and vice versa. Because the LASS protections
are applied before paging, malicious software cannot acquire any
paging-based timing information to compromise the security of the system.

The CPUID bit definition to support LASS:
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6]

Advertise LASS to user space to support LASS virtualization.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/kvm/cpuid.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index ba7f7abc8964..5facb8037140 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -663,8 +663,8 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
 
 	kvm_cpu_cap_mask(CPUID_7_1_EAX,
-		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
-		F(AVX_IFMA)
+		F(AVX_VNNI) | F(AVX512_BF16) | F(LASS) | F(CMPCCXADD) |
+		F(AMX_FP16) | F(AVX_IFMA)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
-- 
2.27.0



* [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability
  2023-04-20 13:37 [PATCH 0/6] LASS KVM virtualization support Zeng Guang
                   ` (4 preceding siblings ...)
  2023-04-20 13:37 ` [PATCH 5/6] KVM: x86: Advertise LASS CPUID to user space Zeng Guang
@ 2023-04-20 13:37 ` Zeng Guang
  2023-04-25  2:57   ` Binbin Wu
  2023-04-25  7:28   ` Chao Gao
  2023-04-24  1:20 ` [PATCH 0/6] LASS KVM virtualization support Binbin Wu
  6 siblings, 2 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-20 13:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao, Zeng Guang

The host kernel may clear the LASS capability in
boot_cpu_data.x86_capability, besides it being cleared explicitly via the
clearcpuid parameter. That will leave the guest unable to manage LASS
independently. So set KVM's LASS support directly based on the hardware
capability to eliminate the dependency.

Add new helper functions to facilitate getting the result of a CPUID sub-leaf.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/include/asm/cpuid.h | 36 ++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/cpuid.c         |  4 ++++
 2 files changed, 40 insertions(+)

diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
index 9bee3e7bf973..a25dd00b7c0a 100644
--- a/arch/x86/include/asm/cpuid.h
+++ b/arch/x86/include/asm/cpuid.h
@@ -127,6 +127,42 @@ static inline unsigned int cpuid_edx(unsigned int op)
 	return edx;
 }
 
+static inline unsigned int cpuid_count_eax(unsigned int op, int count)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
+
+	return eax;
+}
+
+static inline unsigned int cpuid_count_ebx(unsigned int op, int count)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
+
+	return ebx;
+}
+
+static inline unsigned int cpuid_count_ecx(unsigned int op, int count)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
+
+	return ecx;
+}
+
+static inline unsigned int cpuid_count_edx(unsigned int op, int count)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
+
+	return edx;
+}
+
 static __always_inline bool cpuid_function_is_indexed(u32 function)
 {
 	switch (function) {
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 5facb8037140..e99b99ebe1fe 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -667,6 +667,10 @@ void kvm_set_cpu_caps(void)
 		F(AMX_FP16) | F(AVX_IFMA)
 	);
 
+	/* Set LASS based on hardware capability */
+	if (cpuid_count_eax(7, 1) & F(LASS))
+		kvm_cpu_cap_set(X86_FEATURE_LASS);
+
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
 		F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
 	);
-- 
2.27.0



* Re: [PATCH 0/6] LASS KVM virtualization support
  2023-04-20 13:37 [PATCH 0/6] LASS KVM virtualization support Zeng Guang
                   ` (5 preceding siblings ...)
  2023-04-20 13:37 ` [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability Zeng Guang
@ 2023-04-24  1:20 ` Binbin Wu
  2023-04-25  1:49   ` Zeng Guang
  6 siblings, 1 reply; 25+ messages in thread
From: Binbin Wu @ 2023-04-24  1:20 UTC (permalink / raw)
  To: Zeng Guang, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao


On 4/20/2023 9:37 PM, Zeng Guang wrote:
> Linear Address Space Separation (LASS)[1] is a new mechanism that
> enforces the same mode-based protections as paging, i.e. SMAP/SMEP but
> without traversing the paging structures. Because the protections
> enforced by LASS are applied before paging, "probes" by malicious
> software will provide no paging-based timing information.
>
> LASS works in long mode and partitions the 64-bit canonical linear
> address space into two halves:
>      1. Lower half (LA[63]=0) --> user space
>      2. Upper half (LA[63]=1) --> kernel space
>
> When LASS is enabled, a general protection #GP fault or a stack fault
> #SS will be generated if software accesses the address from the half
> in which it resides to another half,

The accessor's mode is based on CPL, not the address range,
so the description "in which it resides" feels a bit inaccurate.


> e.g., either from user space to
> upper half, or from kernel space to lower half. This protection applies
> to data access, code execution.
>
> This series add KVM LASS virtualization support.
>
> When platform has LASS capability, KVM requires to expose this feature
> to guest VM enumerated by CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], and
> allow guest to enable it via CR4.LASS[bit 27] on demand. For instruction
> executed in the guest directly, hardware will perform the LASS violation
> check, while KVM also needs to apply LASS to instructions emulated by
> software and injects #GP or #SS fault to the guest.
>
> Following LASS voilations check will be taken on KVM emulation path.

/s/voilations/violations


> User-mode access to supervisor space address:
>          LA[bit 63] && (CPL == 3)
> Supervisor-mode access to user space address:
>          Instruction fetch: !LA[bit 63] && (CPL < 3)
>          Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>                       CPL < 3) || Implicit supervisor access)
>
> We tested the basic function of LASS virtualization including LASS
> enumeration and enabling in non-root and nested environment. As current
> KVM unittest framework is not compatible to LASS rule that kernel should
> run in the upper half, we use kernel module and application test to verify
> LASS functionalities in guest instead. The data access related x86 emulator
> code is verified with forced emulation prefix (FEP) mechanism. Other test
> cases are working in progress.
>
> How to add tests for LASS in KUT or kselftest is still under investigation.
>
> [1] Intel Architecutre Instruction Set Extensions and Future Features

/s/Architecutre/Architecture


> Programming Reference: Chapter Linear Address Space Separation (LASS)
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
> Zeng Guang (6):
>    KVM: x86: Virtualize CR4.LASS
>    KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
>    KVM: x86: Add emulator helper for LASS violation check
>    KVM: x86: LASS protection on KVM emulation when LASS enabled
>    KVM: x86: Advertise LASS CPUID to user space
>    KVM: x86: Set KVM LASS based on hardware capability
>
>   arch/x86/include/asm/cpuid.h       | 36 +++++++++++++++++++
>   arch/x86/include/asm/kvm-x86-ops.h |  1 +
>   arch/x86/include/asm/kvm_host.h    |  7 +++-
>   arch/x86/kvm/cpuid.c               |  8 +++--
>   arch/x86/kvm/emulate.c             | 36 ++++++++++++++++---
>   arch/x86/kvm/kvm_emulate.h         |  1 +
>   arch/x86/kvm/vmx/nested.c          |  3 ++
>   arch/x86/kvm/vmx/sgx.c             |  2 ++
>   arch/x86/kvm/vmx/vmx.c             | 58 ++++++++++++++++++++++++++++++
>   arch/x86/kvm/vmx/vmx.h             |  2 ++
>   arch/x86/kvm/x86.c                 |  9 +++++
>   arch/x86/kvm/x86.h                 |  2 ++
>   12 files changed, 157 insertions(+), 8 deletions(-)
>


* Re: [PATCH 1/6] KVM: x86: Virtualize CR4.LASS
  2023-04-20 13:37 ` [PATCH 1/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
@ 2023-04-24  6:45   ` Binbin Wu
  2023-04-25  1:52     ` Zeng Guang
  2023-04-24  7:32   ` Chao Gao
  1 sibling, 1 reply; 25+ messages in thread
From: Binbin Wu @ 2023-04-24  6:45 UTC (permalink / raw)
  To: Zeng Guang, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

one nit below

On 4/20/2023 9:37 PM, Zeng Guang wrote:
> Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
under control of KVM or under KVM's control

Or just simply use "intercept"?

> as CR4.LASS generally set once for each vCPU at boot time and won't be
> toggled at runtime. Besides, only if VM has LASS capability enumerated with
> CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
> to set CR4.LASS. By design CR4.LASS can be manipulated by nested guest as
> well.
>
> Notes: Setting CR4.LASS to 1 enable LASS in IA-32e mode. It doesn't take
> effect in legacy mode even if CR4.LASS is set.
>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> ---
>   arch/x86/include/asm/kvm_host.h | 2 +-
>   arch/x86/kvm/vmx/vmx.c          | 3 +++
>   arch/x86/kvm/x86.h              | 2 ++
>   3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 6aaae18f1854..8ff89a52ef66 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -125,7 +125,7 @@
>   			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>   			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>   			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP | X86_CR4_LASS))
>   
>   #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>   
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 43ff3276918b..c923d7599d71 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7569,6 +7569,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
>   	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
>   	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
>   
> +	entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
> +	cr4_fixed1_update(X86_CR4_LASS,       eax, feature_bit(LASS));
> +
>   #undef cr4_fixed1_update
>   }
>   
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 7c8a30d44c29..218f4c73789a 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -475,6 +475,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
>   		__reserved_bits |= X86_CR4_VMXE;        \
>   	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
>   		__reserved_bits |= X86_CR4_PCIDE;       \
> +	if (!__cpu_has(__c, X86_FEATURE_LASS))          \
> +		__reserved_bits |= X86_CR4_LASS;        \
>   	__reserved_bits;                                \
>   })
>   



* Re: [PATCH 1/6] KVM: x86: Virtualize CR4.LASS
  2023-04-20 13:37 ` [PATCH 1/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
  2023-04-24  6:45   ` Binbin Wu
@ 2023-04-24  7:32   ` Chao Gao
  2023-04-25  2:35     ` Zeng Guang
  1 sibling, 1 reply; 25+ messages in thread
From: Chao Gao @ 2023-04-24  7:32 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel

On Thu, Apr 20, 2023 at 09:37:19PM +0800, Zeng Guang wrote:
>Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
>as CR4.LASS generally set once for each vCPU at boot time and won't be
>toggled at runtime. Besides, only if VM has LASS capability enumerated with
>CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
>to set CR4.LASS.

>By design CR4.LASS can be manipulated by nested guest as
>well.

This is inaccurate. The change in nested_vmx_cr_fixed1_bits_update() is
to allow L1 guests to set CR4.LASS in VMX operation. I would say:

Set the CR4.LASS bit in the emulated IA32_VMX_CR4_FIXED1 MSR for guests
to allow guests to enable LASS in nested VMX operation.


* Re: [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-04-20 13:37 ` [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
@ 2023-04-24  7:43   ` Binbin Wu
  2023-04-25  3:26     ` Zeng Guang
  2023-04-25  3:10   ` Chao Gao
  1 sibling, 1 reply; 25+ messages in thread
From: Binbin Wu @ 2023-04-24  7:43 UTC (permalink / raw)
  To: Zeng Guang, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao



On 4/20/2023 9:37 PM, Zeng Guang wrote:
> Intel introduce LASS (Linear Address Separation) feature providing
/s/introduce/introduces

> an independent mechanism to achieve the mode-based protection.
>
> LASS partitions 64-bit linear address space into two halves, user-mode
> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
> stops any code execution or data access
>      1. from user mode to supervisor-mode address space
>      2. from supervisor mode to user-mode address space
> and generates LASS violation fault accordingly.
IMO, the description of point 2 may be misleading, suggesting that LASS
stops any data access from supervisor mode to the user-mode address space,
although the description that follows adds the conditions.


>
> A supervisor mode data access causes a LASS violation only if supervisor
> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
> or the access implicitly accesses a system data structure.
>
> Following are the rule of LASS violation check on the linear address(LA).
/s/rule/rules

> User access to supervisor-mode address space:
>      LA[bit 63] && (CPL == 3)
> Supervisor access to user-mode address space:
>      Instruction fetch: !LA[bit 63] && (CPL < 3)
>      Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
> 		 CPL < 3) || Implicit supervisor access)
>
> Add new ops in kvm_x86_ops to do LASS violation check.
>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> ---
>   arch/x86/include/asm/kvm-x86-ops.h |  1 +
>   arch/x86/include/asm/kvm_host.h    |  5 +++
>   arch/x86/kvm/vmx/vmx.c             | 55 ++++++++++++++++++++++++++++++
>   arch/x86/kvm/vmx/vmx.h             |  2 ++
>   4 files changed, 63 insertions(+)
>
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index abccd51dcfca..f76c07f2674b 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -131,6 +131,7 @@ KVM_X86_OP(msr_filter_changed)
>   KVM_X86_OP(complete_emulated_msr)
>   KVM_X86_OP(vcpu_deliver_sipi_vector)
>   KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
> +KVM_X86_OP_OPTIONAL_RET0(check_lass);
>   
>   #undef KVM_X86_OP
>   #undef KVM_X86_OP_OPTIONAL
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 8ff89a52ef66..31fb8699a1ff 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -69,6 +69,9 @@
>   #define KVM_X86_NOTIFY_VMEXIT_VALID_BITS	(KVM_X86_NOTIFY_VMEXIT_ENABLED | \
>   						 KVM_X86_NOTIFY_VMEXIT_USER)
>   
> +/* x86-specific emulation flags */
> +#define KVM_X86_EMULFLAG_SKIP_LASS	_BITULL(1)
Do you use the flag outside of the emulator?
For the LAM patch, it's planned to move the flags inside the emulator.

> +
>   /* x86-specific vcpu->requests bit members */
>   #define KVM_REQ_MIGRATE_TIMER		KVM_ARCH_REQ(0)
>   #define KVM_REQ_REPORT_TPR_ACCESS	KVM_ARCH_REQ(1)
> @@ -1706,6 +1709,8 @@ struct kvm_x86_ops {
>   	 * Returns vCPU specific APICv inhibit reasons
>   	 */
>   	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
> +
> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags);
The flags parameter may be dropped if the caller knows whether to skip the check or not.

>   };
>   
>   struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c923d7599d71..581327ede66a 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8070,6 +8070,59 @@ static void vmx_vm_destroy(struct kvm *kvm)
>   	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>   }
>   
> +/*
> + * Determine whether an access to the linear address causes a LASS violation.
> + * LASS protection is only effective in long mode. As a prerequisite, caller
> + * should make sure VM
Should be vCPU?

> running in long mode and invoke this api to do LASS
> + * violation check.
> + */
> +bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
> +{
> +	bool user_mode, user_as, rflags_ac;
> +
> +	if (!!(flags & KVM_X86_EMULFLAG_SKIP_LASS) ||
> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
> +		return false;
> +
> +	WARN_ON_ONCE(!is_long_mode(vcpu));
> +
> +	user_as = !(la >> 63);
> +
> +	/*
> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
> +	 * accesses a system data structure. For implicit accesses to system
> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
> +	 */
> +	if (access & PFERR_IMPLICIT_ACCESS) {
> +		user_mode = false;
> +		rflags_ac = false;
> +	} else {
> +		user_mode = vmx_get_cpl(vcpu) == 3;
> +		if (!user_mode)
> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
> +	}
> +
> +	if (user_mode != user_as) {
> +		/*
> +		 * Supervisor-mode _data_ accesses to user address space
> +		 * cause LASS violations only if SMAP is enabled.
> +		 */
> +		if (!user_mode && !(access & PFERR_FETCH_MASK)) {
> +			return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) &&
> +			       !rflags_ac;
> +		} else {
> +			return true;
> +		}
> +	}
> +
> +	return false;
> +}
> +
> +static bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
> +{
> +	return is_long_mode(vcpu) && __vmx_check_lass(vcpu, access, la, flags);
> +}
> +
>   static struct kvm_x86_ops vmx_x86_ops __initdata = {
>   	.name = "kvm_intel",
>   
> @@ -8207,6 +8260,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>   	.complete_emulated_msr = kvm_complete_insn_gp,
>   
>   	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
> +
> +	.check_lass = vmx_check_lass,
>   };
>   
>   static unsigned int vmx_handle_intel_pt_intr(void)
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index a3da84f4ea45..6569385a5978 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
>   u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
>   u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>   
> +bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags);
> +
>   static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
>   					     int type, bool value)
>   {



* Re: [PATCH 0/6] LASS KVM virtualization support
  2023-04-24  1:20 ` [PATCH 0/6] LASS KVM virtualization support Binbin Wu
@ 2023-04-25  1:49   ` Zeng Guang
  0 siblings, 0 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-25  1:49 UTC (permalink / raw)
  To: Binbin Wu, Paolo Bonzini, Christopherson,, Sean, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin,
	kvm@vger.kernel.org
  Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Gao, Chao


On 4/24/2023 9:20 AM, Binbin Wu wrote:
> On 4/20/2023 9:37 PM, Zeng Guang wrote:
>> Linear Address Space Separation (LASS)[1] is a new mechanism that
>> enforces the same mode-based protections as paging, i.e. SMAP/SMEP but
>> without traversing the paging structures. Because the protections
>> enforced by LASS are applied before paging, "probes" by malicious
>> software will provide no paging-based timing information.
>>
>> LASS works in long mode and partitions the 64-bit canonical linear
>> address space into two halves:
>>       1. Lower half (LA[63]=0) --> user space
>>       2. Upper half (LA[63]=1) --> kernel space
>>
>> When LASS is enabled, a general protection #GP fault or a stack fault
>> #SS will be generated if software accesses the address from the half
>> in which it resides to another half,
> The accessor's mode is based on CPL, not the address range,
> so it feels a bit inaccurate of descripton "in which it resides".
>
This is an alternative description to implicitly signify the privilege level,
i.e., code running in the upper half means it is in supervisor mode;
otherwise it's in user mode.  :)

>> e.g., either from user space to
>> upper half, or from kernel space to lower half. This protection applies
>> to data access, code execution.
>>
>> This series add KVM LASS virtualization support.
>>
>> When platform has LASS capability, KVM requires to expose this feature
>> to guest VM enumerated by CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], and
>> allow guest to enable it via CR4.LASS[bit 27] on demand. For instruction
>> executed in the guest directly, hardware will perform the LASS violation
>> check, while KVM also needs to apply LASS to instructions emulated by
>> software and injects #GP or #SS fault to the guest.
>>
>> Following LASS voilations check will be taken on KVM emulation path.
> /s/voilations/violations
>
>
>> User-mode access to supervisor space address:
>>           LA[bit 63] && (CPL == 3)
>> Supervisor-mode access to user space address:
>>           Instruction fetch: !LA[bit 63] && (CPL < 3)
>>           Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>>                        CPL < 3) || Implicit supervisor access)
>>
>> We tested the basic function of LASS virtualization including LASS
>> enumeration and enabling in non-root and nested environment. As current
>> KVM unittest framework is not compatible to LASS rule that kernel should
>> run in the upper half, we use kernel module and application test to verify
>> LASS functionalities in guest instead. The data access related x86 emulator
>> code is verified with forced emulation prefix (FEP) mechanism. Other test
>> cases are working in progress.
>>
>> How to add tests for LASS in KUT or kselftest is still under investigation.
>>
>> [1] Intel Architecutre Instruction Set Extensions and Future Features
> /s/Architecutre/Architecture
>
Sorry for typos above. Thanks.
>> Programming Reference: Chapter Linear Address Space Separation (LASS)
>> https://cdrdv2.intel.com/v1/dl/getContent/671368
>>
>> Zeng Guang (6):
>>     KVM: x86: Virtualize CR4.LASS
>>     KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
>>     KVM: x86: Add emulator helper for LASS violation check
>>     KVM: x86: LASS protection on KVM emulation when LASS enabled
>>     KVM: x86: Advertise LASS CPUID to user space
>>     KVM: x86: Set KVM LASS based on hardware capability
>>
>>    arch/x86/include/asm/cpuid.h       | 36 +++++++++++++++++++
>>    arch/x86/include/asm/kvm-x86-ops.h |  1 +
>>    arch/x86/include/asm/kvm_host.h    |  7 +++-
>>    arch/x86/kvm/cpuid.c               |  8 +++--
>>    arch/x86/kvm/emulate.c             | 36 ++++++++++++++++---
>>    arch/x86/kvm/kvm_emulate.h         |  1 +
>>    arch/x86/kvm/vmx/nested.c          |  3 ++
>>    arch/x86/kvm/vmx/sgx.c             |  2 ++
>>    arch/x86/kvm/vmx/vmx.c             | 58 ++++++++++++++++++++++++++++++
>>    arch/x86/kvm/vmx/vmx.h             |  2 ++
>>    arch/x86/kvm/x86.c                 |  9 +++++
>>    arch/x86/kvm/x86.h                 |  2 ++
>>    12 files changed, 157 insertions(+), 8 deletions(-)
>>


* Re: [PATCH 1/6] KVM: x86: Virtualize CR4.LASS
  2023-04-24  6:45   ` Binbin Wu
@ 2023-04-25  1:52     ` Zeng Guang
  0 siblings, 0 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-25  1:52 UTC (permalink / raw)
  To: Binbin Wu, Paolo Bonzini, Christopherson,, Sean, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin,
	kvm@vger.kernel.org
  Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Gao, Chao


On 4/24/2023 2:45 PM, Binbin Wu wrote:
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
>
> one nit below
>
> On 4/20/2023 9:37 PM, Zeng Guang wrote:
>> Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
> under control of KVM or under KVM's control
>
> Or just simply use "intercept"?

OK. Will change it.

>> as CR4.LASS generally set once for each vCPU at boot time and won't be
>> toggled at runtime. Besides, only if VM has LASS capability enumerated with
>> CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
>> to set CR4.LASS. By design CR4.LASS can be manipulated by nested guest as
>> well.
>>
>> Notes: Setting CR4.LASS to 1 enable LASS in IA-32e mode. It doesn't take
>> effect in legacy mode even if CR4.LASS is set.
>>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> ---
>>    arch/x86/include/asm/kvm_host.h | 2 +-
>>    arch/x86/kvm/vmx/vmx.c          | 3 +++
>>    arch/x86/kvm/x86.h              | 2 ++
>>    3 files changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 6aaae18f1854..8ff89a52ef66 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -125,7 +125,7 @@
>>    			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>>    			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>>    			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
>> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
>> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP | X86_CR4_LASS))
>>    
>>    #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>>    
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 43ff3276918b..c923d7599d71 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -7569,6 +7569,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
>>    	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
>>    	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
>>    
>> +	entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
>> +	cr4_fixed1_update(X86_CR4_LASS,       eax, feature_bit(LASS));
>> +
>>    #undef cr4_fixed1_update
>>    }
>>    
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index 7c8a30d44c29..218f4c73789a 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -475,6 +475,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
>>    		__reserved_bits |= X86_CR4_VMXE;        \
>>    	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
>>    		__reserved_bits |= X86_CR4_PCIDE;       \
>> +	if (!__cpu_has(__c, X86_FEATURE_LASS))          \
>> +		__reserved_bits |= X86_CR4_LASS;        \
>>    	__reserved_bits;                                \
>>    })
>>    


* Re: [PATCH 1/6] KVM: x86: Virtualize CR4.LASS
  2023-04-24  7:32   ` Chao Gao
@ 2023-04-25  2:35     ` Zeng Guang
  2023-04-25  3:26       ` Chao Gao
  0 siblings, 1 reply; 25+ messages in thread
From: Zeng Guang @ 2023-04-25  2:35 UTC (permalink / raw)
  To: Gao, Chao
  Cc: Paolo Bonzini, Christopherson,, Sean, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin,
	kvm@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org


On 4/24/2023 3:32 PM, Gao, Chao wrote:
> On Thu, Apr 20, 2023 at 09:37:19PM +0800, Zeng Guang wrote:
>> Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
>> as CR4.LASS generally set once for each vCPU at boot time and won't be
>> toggled at runtime. Besides, only if VM has LASS capability enumerated with
>> CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
>> to set CR4.LASS.
>> By design CR4.LASS can be manipulated by nested guest as
>> well.
> This is inaccurate. The change in nested_vmx_cr_fixed1_bits_update() is
> to allow L1 guests to set CR4.LASS in VMX operation.

Essentially it allows the nested guest to set CR4.LASS. The L1 guest uses
cr4_fixed1 to check whether the CR4 value the nested guest requests to set
is valid or not. The nested guest will get a #GP fault if it's not allowed.

> I would say:
>
> Set the CR4.LASS bit in the emulated IA32_VMX_CR4_FIXED1 MSR for guests
> to allow guests to enable LASS in nested VMX operation.



* Re: [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled
  2023-04-20 13:37 ` [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled Zeng Guang
@ 2023-04-25  2:52   ` Binbin Wu
  2023-04-25  6:40     ` Zeng Guang
  2023-04-26  1:31   ` Yuan Yao
  1 sibling, 1 reply; 25+ messages in thread
From: Binbin Wu @ 2023-04-25  2:52 UTC (permalink / raw)
  To: Zeng Guang, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao



On 4/20/2023 9:37 PM, Zeng Guang wrote:
> Do LASS violation check for instructions emulated by KVM. Note that for
> instructions executed in the guest directly, hardware will perform the
> check.
>
> Not all instruction emulation leads to accesses to guest linear addresses
> because 1) some instrutions like CPUID, RDMSR, don't take memory as

/s/instrutions/instructions
> operands 2) instruction fetch in most cases is already done inside the
> guest.
What are the instruction fetch cases not covered in non-root mode?
And IIUC, the patch actually doesn't distinguish them and always checks
for LASS violations on instruction fetches in instruction emulation, right?

>
> Four cases in which kvm may access guest linear addresses are identified
> by code inspection:
> - KVM emulator uses segmented address for instruction fetches or data
>    accesses.
> - For implicit data access, KVM emulator gets address to a system data
to or from?

>    structure(GDT/LDT/IDT/TR).
> - For VMX instruction emulation, KVM gets the address from "VM-exit
>    instruction information" field in VMCS.
> - For SGX ENCLS instruction emulation, KVM gets the address from registers.
>
> LASS violation check applies to these linear address so as to enforce
address -> addresses

> mode-based protections as hardware behaves.
>
> As exceptions, the target memory address of emulation of invlpg, branch
> and call instructions doesn't require LASS violation check.
>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> ---
>   arch/x86/kvm/emulate.c    | 36 +++++++++++++++++++++++++++++++-----
>   arch/x86/kvm/vmx/nested.c |  3 +++
>   arch/x86/kvm/vmx/sgx.c    |  2 ++
>   3 files changed, 36 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 5cc3efa0e21c..a9a022fd712e 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -687,7 +687,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>   				       struct segmented_address addr,
>   				       unsigned *max_size, unsigned size,
>   				       bool write, bool fetch,
> -				       enum x86emul_mode mode, ulong *linear)
> +				       enum x86emul_mode mode, ulong *linear,
> +				       u64 flags)
>   {
>   	struct desc_struct desc;
>   	bool usable;
> @@ -695,6 +696,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>   	u32 lim;
>   	u16 sel;
>   	u8  va_bits;
> +	u64 access = fetch ? PFERR_FETCH_MASK : 0;
>   
>   	la = seg_base(ctxt, addr.seg) + addr.ea;
>   	*max_size = 0;
> @@ -740,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>   		}
>   		break;
>   	}
> +
> +	if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
> +		goto bad;
> +
>   	if (la & (insn_alignment(ctxt, size) - 1))
>   		return emulate_gp(ctxt, 0);
>   	return X86EMUL_CONTINUE;
> @@ -757,7 +763,7 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
>   {
>   	unsigned max_size;
>   	return __linearize(ctxt, addr, &max_size, size, write, false,
> -			   ctxt->mode, linear);
> +			   ctxt->mode, linear, 0);
>   }
>   
>   static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
> @@ -770,7 +776,10 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>   
>   	if (ctxt->op_bytes != sizeof(unsigned long))
>   		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
> -	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
> +
> +	/* LASS doesn't apply to address for branch and call instructions */
> +	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode,
> +	     &linear, KVM_X86_EMULFLAG_SKIP_LASS);
>   	if (rc == X86EMUL_CONTINUE)
>   		ctxt->_eip = addr.ea;
>   	return rc;
> @@ -845,6 +854,13 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
>   static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
>   			      void *data, unsigned size)
>   {
> +	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
> +		ctxt->exception.vector = GP_VECTOR;
> +		ctxt->exception.error_code = 0;
> +		ctxt->exception.error_code_valid = true;
> +		return X86EMUL_PROPAGATE_FAULT;
> +	}
> +
>   	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
>   }
>   
> @@ -852,6 +868,13 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
>   			       ulong linear, void *data,
>   			       unsigned int size)
>   {
> +	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
> +		ctxt->exception.vector = GP_VECTOR;
> +		ctxt->exception.error_code = 0;
> +		ctxt->exception.error_code_valid = true;
> +		return X86EMUL_PROPAGATE_FAULT;
> +	}
> +
>   	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
>   }
>   
> @@ -907,7 +930,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
>   	 * against op_size.
>   	 */
>   	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
> -			 &linear);
> +			 &linear, 0);
>   	if (unlikely(rc != X86EMUL_CONTINUE))
>   		return rc;
>   
> @@ -3432,8 +3455,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
>   {
>   	int rc;
>   	ulong linear;
> +	unsigned max_size;
>   
> -	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
> +	/* LASS doesn't apply to the memory address for invlpg */
> +	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, false, false,
> +	     ctxt->mode, &linear, KVM_X86_EMULFLAG_SKIP_LASS);
>   	if (rc == X86EMUL_CONTINUE)
>   		ctxt->ops->invlpg(ctxt, linear);
>   	/* Disable writeback. */
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index c8ae9d0e59b3..55c88c4593a6 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4974,6 +4974,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
>   		 * destination for long mode!
>   		 */
>   		exn = is_noncanonical_address(*ret, vcpu);
> +
> +		if (!exn)
> +			exn = __vmx_check_lass(vcpu, 0, *ret, 0);
>   	} else {
>   		/*
>   		 * When not in long mode, the virtual/linear address is
> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
> index b12da2a6dec9..30cb5d0980be 100644
> --- a/arch/x86/kvm/vmx/sgx.c
> +++ b/arch/x86/kvm/vmx/sgx.c
> @@ -37,6 +37,8 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
>   		fault = true;
>   	} else if (likely(is_long_mode(vcpu))) {
>   		fault = is_noncanonical_address(*gva, vcpu);
> +		if (!fault)
> +			fault = __vmx_check_lass(vcpu, 0, *gva, 0);
>   	} else {
>   		*gva &= 0xffffffff;
>   		fault = (s.unusable) ||


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability
  2023-04-20 13:37 ` [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability Zeng Guang
@ 2023-04-25  2:57   ` Binbin Wu
  2023-04-25  6:47     ` Zeng Guang
  2023-04-25  7:28   ` Chao Gao
  1 sibling, 1 reply; 25+ messages in thread
From: Binbin Wu @ 2023-04-25  2:57 UTC (permalink / raw)
  To: Zeng Guang, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Gao Chao



On 4/20/2023 9:37 PM, Zeng Guang wrote:
> Host kernel may clear LASS capability in boot_cpu_data.x86_capability
Is there some option to do it?

> besides explicitly using clearcpuid parameter. That will cause guest
> not being able to manage LASS independently. So set KVM LASS directly
> based on hardware capability to eliminate the dependency.
>
> Add new helper functions to facilitate getting result of CPUID sub-leaf.
>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> ---
>   arch/x86/include/asm/cpuid.h | 36 ++++++++++++++++++++++++++++++++++++
>   arch/x86/kvm/cpuid.c         |  4 ++++
>   2 files changed, 40 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
> index 9bee3e7bf973..a25dd00b7c0a 100644
> --- a/arch/x86/include/asm/cpuid.h
> +++ b/arch/x86/include/asm/cpuid.h
> @@ -127,6 +127,42 @@ static inline unsigned int cpuid_edx(unsigned int op)
>   	return edx;
>   }
>   
> +static inline unsigned int cpuid_count_eax(unsigned int op, int count)
> +{
> +	unsigned int eax, ebx, ecx, edx;
> +
> +	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
> +
> +	return eax;
> +}
> +
> +static inline unsigned int cpuid_count_ebx(unsigned int op, int count)
> +{
> +	unsigned int eax, ebx, ecx, edx;
> +
> +	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
> +
> +	return ebx;
> +}
> +
> +static inline unsigned int cpuid_count_ecx(unsigned int op, int count)
> +{
> +	unsigned int eax, ebx, ecx, edx;
> +
> +	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
> +
> +	return ecx;
> +}
> +
> +static inline unsigned int cpuid_count_edx(unsigned int op, int count)
> +{
> +	unsigned int eax, ebx, ecx, edx;
> +
> +	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
> +
> +	return edx;
> +}
> +
>   static __always_inline bool cpuid_function_is_indexed(u32 function)
>   {
>   	switch (function) {
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 5facb8037140..e99b99ebe1fe 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -667,6 +667,10 @@ void kvm_set_cpu_caps(void)
>   		F(AMX_FP16) | F(AVX_IFMA)
>   	);
>   
> +	/* Set LASS based on hardware capability */
> +	if (cpuid_count_eax(7, 1) & F(LASS))
> +		kvm_cpu_cap_set(X86_FEATURE_LASS);
> +
>   	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
>   		F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
>   	);


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-04-20 13:37 ` [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
  2023-04-24  7:43   ` Binbin Wu
@ 2023-04-25  3:10   ` Chao Gao
  2023-04-25  7:31     ` Zeng Guang
  1 sibling, 1 reply; 25+ messages in thread
From: Chao Gao @ 2023-04-25  3:10 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel

On Thu, Apr 20, 2023 at 09:37:20PM +0800, Zeng Guang wrote:
>+/*
>+ * Determine whether an access to the linear address causes a LASS violation.
>+ * LASS protection is only effective in long mode. As a prerequisite, caller
>+ * should make sure VM running in long mode and invoke this api to do LASS
>+ * violation check.

Could you place the comment above vmx_check_lass()?

And for __vmx_check_lass(), just add:

A variant of vmx_check_lass() without the check for long mode.

>+ */
>+bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
>+{
>+	bool user_mode, user_as, rflags_ac;
>+
>+	if (!!(flags & KVM_X86_EMULFLAG_SKIP_LASS) ||
>+	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
>+		return false;
>+
>+	WARN_ON_ONCE(!is_long_mode(vcpu));
>+
>+	user_as = !(la >> 63);
>+


>+	/*
>+	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
>+	 * accesses a system data structure. For implicit accesses to system
>+	 * data structure, the processor acts as if RFLAGS.AC is clear.
>+	 */
>+	if (access & PFERR_IMPLICIT_ACCESS) {
>+		user_mode = false;
>+		rflags_ac = false;
>+	} else {
>+		user_mode = vmx_get_cpl(vcpu) == 3;
>+		if (!user_mode)
>+			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
>+	}
>+
>+	if (user_mode != user_as) {

to reduce one level of indentation, how about:

	if (user_mode == user_as)
		return false;

	/*
	 * Supervisor-mode _data_ accesses to user address space
	 * cause LASS violations only if SMAP is enabled.
	 */
	if (!user_mode && !(access & PFERR_FETCH_MASK))
		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;

	return true;


>+		/*
>+		 * Supervisor-mode _data_ accesses to user address space
>+		 * cause LASS violations only if SMAP is enabled.
>+		 */
>+		if (!user_mode && !(access & PFERR_FETCH_MASK)) {
>+			return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) &&
>+			       !rflags_ac;
>+		} else {
>+			return true;
>+		}
>+	}
>+
>+	return false;
>+}
>+
>+static bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
>+{
>+	return is_long_mode(vcpu) && __vmx_check_lass(vcpu, access, la, flags);

Why not request all callers to check if vcpu is in long mode?

e.g.,
	return is_long_mode(vcpu) && static_call(kvm_x86_check_lass)(...);

then you can rename __vmx_check_lass() to vmx_check_lass() and drop the
original one.

>+}
>+
> static struct kvm_x86_ops vmx_x86_ops __initdata = {
> 	.name = "kvm_intel",
> 
>@@ -8207,6 +8260,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
> 	.complete_emulated_msr = kvm_complete_insn_gp,
> 
> 	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
>+
>+	.check_lass = vmx_check_lass,
> };
> 
> static unsigned int vmx_handle_intel_pt_intr(void)
>diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
>index a3da84f4ea45..6569385a5978 100644
>--- a/arch/x86/kvm/vmx/vmx.h
>+++ b/arch/x86/kvm/vmx/vmx.h
>@@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
> u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
> u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
> 
>+bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags);
>+

no one uses this function. You can defer exporting it to when the first
external caller is added.

> static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
> 					     int type, bool value)
> {
>-- 
>2.27.0
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Virtualize CR4.LASS
  2023-04-25  2:35     ` Zeng Guang
@ 2023-04-25  3:26       ` Chao Gao
  0 siblings, 0 replies; 25+ messages in thread
From: Chao Gao @ 2023-04-25  3:26 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Christopherson,, Sean, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin,
	kvm@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org

On Tue, Apr 25, 2023 at 10:35:25AM +0800, Zeng Guang wrote:
>
>On 4/24/2023 3:32 PM, Gao, Chao wrote:
>> On Thu, Apr 20, 2023 at 09:37:19PM +0800, Zeng Guang wrote:
>> > Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
>> > as CR4.LASS generally set once for each vCPU at boot time and won't be
>> > toggled at runtime. Besides, only if VM has LASS capability enumerated with
>> > CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
>> > to set CR4.LASS.
>> > By design CR4.LASS can be manipulated by nested guest as
>> > well.
>> This is inaccurate. The change in nested_vmx_cr_fixed1_bits_update() is
>> to allow L1 guests to set CR4.LASS in VMX operation.
>
>Essentially it allows the nested guest to set CR4.LASS. The L1 guest uses
>cr4_fixed1 to check whether a CR4 value the nested guest requests to set
>is valid or not. The nested guest will get a #GP fault if it's not allowed.

The change to CR4_FIXED1 has a broader impact. Without the CR4_FIXED1
change, the guest shouldn't enable LASS in VMX operation; it means:

1. before VMXON, LASS should be disabled
2. in VMX operation, LASS cannot be enabled

What you said (i.e., L1 guest allows L2 to enable LASS) belongs in #2.
But #1 isn't covered. That's why I said "inaccurate".
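
To make that concrete, a sketch of how this could be wired up, reusing
the cr4_fixed1_update() pattern that nested_vmx_cr_fixed1_bits_update()
already uses for other CR4 bits (the exact placement is assumed, not
taken from patch 1):

	/*
	 * Sketch: mark CR4.LASS as a "may be 1" bit in the emulated
	 * IA32_VMX_CR4_FIXED1 MSR iff guest CPUID enumerates LASS.
	 * L1's CR4 checks -- both the pre-VMXON check (#1) and the
	 * in-VMX-operation check (#2) -- validate a proposed CR4 as
	 * ((cr4 & fixed0) == fixed0) && !(cr4 & ~fixed1).
	 */
	entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
	cr4_fixed1_update(X86_CR4_LASS, eax, feature_bit(LASS));
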

>
>> I would say:
>> 
>> Set the CR4.LASS bit in the emulated IA32_VMX_CR4_FIXED1 MSR for guests
>> to allow guests to enable LASS in nested VMX operation.
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-04-24  7:43   ` Binbin Wu
@ 2023-04-25  3:26     ` Zeng Guang
  2023-04-26  1:46       ` Binbin Wu
  0 siblings, 1 reply; 25+ messages in thread
From: Zeng Guang @ 2023-04-25  3:26 UTC (permalink / raw)
  To: Binbin Wu, Paolo Bonzini, Christopherson,, Sean, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin,
	kvm@vger.kernel.org
  Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Gao, Chao


On 4/24/2023 3:43 PM, Binbin Wu wrote:
>
> On 4/20/2023 9:37 PM, Zeng Guang wrote:
>> Intel introduce LASS (Linear Address Separation) feature providing
> /s/introduce/introduces
OK.
>
>> an independent mechanism to achieve the mode-based protection.
>>
>> LASS partitions 64-bit linear address space into two halves, user-mode
>> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
>> stops any code execution or data access
>>       1. from user mode to supervisor-mode address space
>>       2. from supervisor mode to user-mode address space
>> and generates LASS violation fault accordingly.
> IMO, the description of point 2 may be misleading, implying that LASS
> stops any data access from supervisor mode to user-mode address space,
> although the description that follows adds the conditions.

May change to "It stops any code execution or conditional data access".
The condition is illustrated in the next paragraph.

>
>> A supervisor mode data access causes a LASS violation only if supervisor
>> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
>> or the access implicitly accesses a system data structure.
>>
>> Following are the rule of LASS violation check on the linear address(LA).
> /s/rule/rules
OK.
>> User access to supervisor-mode address space:
>>       LA[bit 63] && (CPL == 3)
>> Supervisor access to user-mode address space:
>>       Instruction fetch: !LA[bit 63] && (CPL < 3)
>>       Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>> 		 CPL < 3) || Implicit supervisor access)
>>
>> Add new ops in kvm_x86_ops to do LASS violation check.
>>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> ---
>>    arch/x86/include/asm/kvm-x86-ops.h |  1 +
>>    arch/x86/include/asm/kvm_host.h    |  5 +++
>>    arch/x86/kvm/vmx/vmx.c             | 55 ++++++++++++++++++++++++++++++
>>    arch/x86/kvm/vmx/vmx.h             |  2 ++
>>    4 files changed, 63 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
>> index abccd51dcfca..f76c07f2674b 100644
>> --- a/arch/x86/include/asm/kvm-x86-ops.h
>> +++ b/arch/x86/include/asm/kvm-x86-ops.h
>> @@ -131,6 +131,7 @@ KVM_X86_OP(msr_filter_changed)
>>    KVM_X86_OP(complete_emulated_msr)
>>    KVM_X86_OP(vcpu_deliver_sipi_vector)
>>    KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
>> +KVM_X86_OP_OPTIONAL_RET0(check_lass);
>>    
>>    #undef KVM_X86_OP
>>    #undef KVM_X86_OP_OPTIONAL
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 8ff89a52ef66..31fb8699a1ff 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -69,6 +69,9 @@
>>    #define KVM_X86_NOTIFY_VMEXIT_VALID_BITS	(KVM_X86_NOTIFY_VMEXIT_ENABLED | \
>>    						 KVM_X86_NOTIFY_VMEXIT_USER)
>>    
>> +/* x86-specific emulation flags */
>> +#define KVM_X86_EMULFLAG_SKIP_LASS	_BITULL(1)
> Do you use the flag outside of emulator?
> For LAM patch, it's planned to move the flags inside emulator.
IMO, the detailed flag is implementation specific. Is it necessary to
bind it to the emulator even though it's only used inside the emulator?
>> +
>>    /* x86-specific vcpu->requests bit members */
>>    #define KVM_REQ_MIGRATE_TIMER		KVM_ARCH_REQ(0)
>>    #define KVM_REQ_REPORT_TPR_ACCESS	KVM_ARCH_REQ(1)
>> @@ -1706,6 +1709,8 @@ struct kvm_x86_ops {
>>    	 * Returns vCPU specific APICv inhibit reasons
>>    	 */
>>    	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
>> +
>> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags);
> The flags may be dropped if the caller knows to skip it or not.
Probably I don't get you right. Do you mean we need to define another
function without flags?

>>    };
>>    
>>    struct kvm_x86_nested_ops {
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index c923d7599d71..581327ede66a 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -8070,6 +8070,59 @@ static void vmx_vm_destroy(struct kvm *kvm)
>>    	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>>    }
>>    
>> +/*
>> + * Determine whether an access to the linear address causes a LASS violation.
>> + * LASS protection is only effective in long mode. As a prerequisite, caller
>> + * should make sure VM
> Should be vCPU?
Similar meaning, I think. :)
>> running in long mode and invoke this api to do LASS
>> + * violation check.
>> + */
>> +bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
>> +{
>> +	bool user_mode, user_as, rflags_ac;
>> +
>> +	if (!!(flags & KVM_X86_EMULFLAG_SKIP_LASS) ||
>> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
>> +		return false;
>> +
>> +	WARN_ON_ONCE(!is_long_mode(vcpu));
>> +
>> +	user_as = !(la >> 63);
>> +
>> +	/*
>> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
>> +	 * accesses a system data structure. For implicit accesses to system
>> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
>> +	 */
>> +	if (access & PFERR_IMPLICIT_ACCESS) {
>> +		user_mode = false;
>> +		rflags_ac = false;
>> +	} else {
>> +		user_mode = vmx_get_cpl(vcpu) == 3;
>> +		if (!user_mode)
>> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
>> +	}
>> +
>> +	if (user_mode != user_as) {
>> +		/*
>> +		 * Supervisor-mode _data_ accesses to user address space
>> +		 * cause LASS violations only if SMAP is enabled.
>> +		 */
>> +		if (!user_mode && !(access & PFERR_FETCH_MASK)) {
>> +			return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) &&
>> +			       !rflags_ac;
>> +		} else {
>> +			return true;
>> +		}
>> +	}
>> +
>> +	return false;
>> +}
>> +
>> +static bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
>> +{
>> +	return is_long_mode(vcpu) && __vmx_check_lass(vcpu, access, la, flags);
>> +}
>> +
>>    static struct kvm_x86_ops vmx_x86_ops __initdata = {
>>    	.name = "kvm_intel",
>>    
>> @@ -8207,6 +8260,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>>    	.complete_emulated_msr = kvm_complete_insn_gp,
>>    
>>    	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
>> +
>> +	.check_lass = vmx_check_lass,
>>    };
>>    
>>    static unsigned int vmx_handle_intel_pt_intr(void)
>> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
>> index a3da84f4ea45..6569385a5978 100644
>> --- a/arch/x86/kvm/vmx/vmx.h
>> +++ b/arch/x86/kvm/vmx/vmx.h
>> @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
>>    u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
>>    u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>>    
>> +bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags);
>> +
>>    static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
>>    					     int type, bool value)
>>    {

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled
  2023-04-25  2:52   ` Binbin Wu
@ 2023-04-25  6:40     ` Zeng Guang
  0 siblings, 0 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-25  6:40 UTC (permalink / raw)
  To: Binbin Wu, Paolo Bonzini, Christopherson,, Sean, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin,
	kvm@vger.kernel.org
  Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Gao, Chao


On 4/25/2023 10:52 AM, Binbin Wu wrote:
>
> On 4/20/2023 9:37 PM, Zeng Guang wrote:
>> Do LASS violation check for instructions emulated by KVM. Note that for
>> instructions executed in the guest directly, hardware will perform the
>> check.
>>
>> Not all instruction emulation leads to accesses to guest linear addresses
>> because 1) some instrutions like CPUID, RDMSR, don't take memory as
> /s/instrutions/instructions
Oops. :P
>> operands 2) instruction fetch in most cases is already done inside the
>> guest.
> What are the instruction fetch cases not covered in non-root mode?
> And IIUC, the patch actually doesn't distinguish them and always checks
> LASS violation for instruction fetch in instruction emulation, right?

This states that most instructions needn't be fetched by KVM: KVM
intercepts most privileged instructions and completes the emulation
directly. But some instructions require KVM to fetch the code and
emulate further, e.g. lgdt/sgdt etc. KVM will always do the LASS
violation check on an instruction fetch once it happens.
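
For clarity, the decision applied to those emulated accesses can be
condensed into a standalone predicate; this is only a sketch mirroring
the logic of __vmx_check_lass() in patch 2, with illustrative names:

	/* LASS decision, side-effect free; mirrors patch 2's logic */
	static bool lass_violation(bool user_addr, bool cpl3, bool fetch,
				   bool implicit, bool smap, bool rflags_ac)
	{
		/* implicit accesses act as supervisor with RFLAGS.AC clear */
		bool user_mode = !implicit && cpl3;
		bool ac = !implicit && rflags_ac;

		if (user_mode == user_addr)
			return false;	/* access stays in its own half */

		/* supervisor *data* access to the user half needs SMAP */
		if (!user_mode && !fetch)
			return smap && !ac;

		return true;
	}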

>> Four cases in which kvm may access guest linear addresses are identified
>> by code inspection:
>> - KVM emulator uses segmented address for instruction fetches or data
>>     accesses.
>> - For implicit data access, KVM emulator gets address to a system data
> to or from?

It means the address pointing *to* a system data structure.

>>     structure(GDT/LDT/IDT/TR).
>> - For VMX instruction emulation, KVM gets the address from "VM-exit
>>     instruction information" field in VMCS.
>> - For SGX ENCLS instruction emulation, KVM gets the address from registers.
>>
>> LASS violation check applies to these linear address so as to enforce
> address -> addresses
OK.
>
>> mode-based protections as hardware behaves.
>>
>> As exceptions, the target memory address of emulation of invlpg, branch
>> and call instructions doesn't require LASS violation check.
>>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> ---
>>    arch/x86/kvm/emulate.c    | 36 +++++++++++++++++++++++++++++++-----
>>    arch/x86/kvm/vmx/nested.c |  3 +++
>>    arch/x86/kvm/vmx/sgx.c    |  2 ++
>>    3 files changed, 36 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 5cc3efa0e21c..a9a022fd712e 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -687,7 +687,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>>    				       struct segmented_address addr,
>>    				       unsigned *max_size, unsigned size,
>>    				       bool write, bool fetch,
>> -				       enum x86emul_mode mode, ulong *linear)
>> +				       enum x86emul_mode mode, ulong *linear,
>> +				       u64 flags)
>>    {
>>    	struct desc_struct desc;
>>    	bool usable;
>> @@ -695,6 +696,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>>    	u32 lim;
>>    	u16 sel;
>>    	u8  va_bits;
>> +	u64 access = fetch ? PFERR_FETCH_MASK : 0;
>>    
>>    	la = seg_base(ctxt, addr.seg) + addr.ea;
>>    	*max_size = 0;
>> @@ -740,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>>    		}
>>    		break;
>>    	}
>> +
>> +	if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
>> +		goto bad;
>> +
>>    	if (la & (insn_alignment(ctxt, size) - 1))
>>    		return emulate_gp(ctxt, 0);
>>    	return X86EMUL_CONTINUE;
>> @@ -757,7 +763,7 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
>>    {
>>    	unsigned max_size;
>>    	return __linearize(ctxt, addr, &max_size, size, write, false,
>> -			   ctxt->mode, linear);
>> +			   ctxt->mode, linear, 0);
>>    }
>>    
>>    static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>> @@ -770,7 +776,10 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>>    
>>    	if (ctxt->op_bytes != sizeof(unsigned long))
>>    		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
>> -	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
>> +
>> +	/* LASS doesn't apply to address for branch and call instructions */
>> +	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode,
>> +	     &linear, KVM_X86_EMULFLAG_SKIP_LASS);
>>    	if (rc == X86EMUL_CONTINUE)
>>    		ctxt->_eip = addr.ea;
>>    	return rc;
>> @@ -845,6 +854,13 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
>>    static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
>>    			      void *data, unsigned size)
>>    {
>> +	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
>> +		ctxt->exception.vector = GP_VECTOR;
>> +		ctxt->exception.error_code = 0;
>> +		ctxt->exception.error_code_valid = true;
>> +		return X86EMUL_PROPAGATE_FAULT;
>> +	}
>> +
>>    	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
>>    }
>>    
>> @@ -852,6 +868,13 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
>>    			       ulong linear, void *data,
>>    			       unsigned int size)
>>    {
>> +	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
>> +		ctxt->exception.vector = GP_VECTOR;
>> +		ctxt->exception.error_code = 0;
>> +		ctxt->exception.error_code_valid = true;
>> +		return X86EMUL_PROPAGATE_FAULT;
>> +	}
>> +
>>    	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
>>    }
>>    
>> @@ -907,7 +930,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
>>    	 * against op_size.
>>    	 */
>>    	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
>> -			 &linear);
>> +			 &linear, 0);
>>    	if (unlikely(rc != X86EMUL_CONTINUE))
>>    		return rc;
>>    
>> @@ -3432,8 +3455,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
>>    {
>>    	int rc;
>>    	ulong linear;
>> +	unsigned max_size;
>>    
>> -	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
>> +	/* LASS doesn't apply to the memory address for invlpg */
>> +	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, false, false,
>> +	     ctxt->mode, &linear, KVM_X86_EMULFLAG_SKIP_LASS);
>>    	if (rc == X86EMUL_CONTINUE)
>>    		ctxt->ops->invlpg(ctxt, linear);
>>    	/* Disable writeback. */
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index c8ae9d0e59b3..55c88c4593a6 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -4974,6 +4974,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
>>    		 * destination for long mode!
>>    		 */
>>    		exn = is_noncanonical_address(*ret, vcpu);
>> +
>> +		if (!exn)
>> +			exn = __vmx_check_lass(vcpu, 0, *ret, 0);
>>    	} else {
>>    		/*
>>    		 * When not in long mode, the virtual/linear address is
>> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
>> index b12da2a6dec9..30cb5d0980be 100644
>> --- a/arch/x86/kvm/vmx/sgx.c
>> +++ b/arch/x86/kvm/vmx/sgx.c
>> @@ -37,6 +37,8 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
>>    		fault = true;
>>    	} else if (likely(is_long_mode(vcpu))) {
>>    		fault = is_noncanonical_address(*gva, vcpu);
>> +		if (!fault)
>> +			fault = __vmx_check_lass(vcpu, 0, *gva, 0);
>>    	} else {
>>    		*gva &= 0xffffffff;
>>    		fault = (s.unusable) ||

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability
  2023-04-25  2:57   ` Binbin Wu
@ 2023-04-25  6:47     ` Zeng Guang
  0 siblings, 0 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-25  6:47 UTC (permalink / raw)
  To: Binbin Wu, Paolo Bonzini, Christopherson,, Sean, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin,
	kvm@vger.kernel.org
  Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Gao, Chao


On 4/25/2023 10:57 AM, Binbin Wu wrote:
>
> On 4/20/2023 9:37 PM, Zeng Guang wrote:
>> Host kernel may clear LASS capability in boot_cpu_data.x86_capability
> Is there some option to do it?

A kernel supporting LASS will turn off the LASS capability when given a
specific option, e.g. "vsyscall=emulate".

>> besides explicitly using clearcpuid parameter. That will cause guest
>> not being able to manage LASS independently. So set KVM LASS directly
>> based on hardware capability to eliminate the dependency.
>>
>> Add new helper functions to facilitate getting result of CPUID sub-leaf.
>>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> ---
>>    arch/x86/include/asm/cpuid.h | 36 ++++++++++++++++++++++++++++++++++++
>>    arch/x86/kvm/cpuid.c         |  4 ++++
>>    2 files changed, 40 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
>> index 9bee3e7bf973..a25dd00b7c0a 100644
>> --- a/arch/x86/include/asm/cpuid.h
>> +++ b/arch/x86/include/asm/cpuid.h
>> @@ -127,6 +127,42 @@ static inline unsigned int cpuid_edx(unsigned int op)
>>    	return edx;
>>    }
>>    
>> +static inline unsigned int cpuid_count_eax(unsigned int op, int count)
>> +{
>> +	unsigned int eax, ebx, ecx, edx;
>> +
>> +	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
>> +
>> +	return eax;
>> +}
>> +
>> +static inline unsigned int cpuid_count_ebx(unsigned int op, int count)
>> +{
>> +	unsigned int eax, ebx, ecx, edx;
>> +
>> +	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
>> +
>> +	return ebx;
>> +}
>> +
>> +static inline unsigned int cpuid_count_ecx(unsigned int op, int count)
>> +{
>> +	unsigned int eax, ebx, ecx, edx;
>> +
>> +	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
>> +
>> +	return ecx;
>> +}
>> +
>> +static inline unsigned int cpuid_count_edx(unsigned int op, int count)
>> +{
>> +	unsigned int eax, ebx, ecx, edx;
>> +
>> +	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
>> +
>> +	return edx;
>> +}
>> +
>>    static __always_inline bool cpuid_function_is_indexed(u32 function)
>>    {
>>    	switch (function) {
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 5facb8037140..e99b99ebe1fe 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -667,6 +667,10 @@ void kvm_set_cpu_caps(void)
>>    		F(AMX_FP16) | F(AVX_IFMA)
>>    	);
>>    
>> +	/* Set LASS based on hardware capability */
>> +	if (cpuid_count_eax(7, 1) & F(LASS))
>> +		kvm_cpu_cap_set(X86_FEATURE_LASS);
>> +
>>    	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
>>    		F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
>>    	);

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability
  2023-04-20 13:37 ` [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability Zeng Guang
  2023-04-25  2:57   ` Binbin Wu
@ 2023-04-25  7:28   ` Chao Gao
  1 sibling, 0 replies; 25+ messages in thread
From: Chao Gao @ 2023-04-25  7:28 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel

On Thu, Apr 20, 2023 at 09:37:24PM +0800, Zeng Guang wrote:
>Host kernel may clear LASS capability in boot_cpu_data.x86_capability
>besides explicitly using clearcpuid parameter. That will cause guest
>not being able to manage LASS independently. So set KVM LASS directly
>based on hardware capability to eliminate the dependency.

...

>+	/* Set LASS based on hardware capability */
>+	if (cpuid_count_eax(7, 1) & F(LASS))
>+		kvm_cpu_cap_set(X86_FEATURE_LASS);
>+

What if LASS is cleared in boot_cpu_data because not all CPUs support LASS?

In arch/x86/kernel/cpu/common.c, identify_cpu() clears features which are
not supported by all CPUs:

	/*
	 * On SMP, boot_cpu_data holds the common feature set between
	 * all CPUs; so make sure that we indicate which features are
	 * common between the CPUs.  The first time this routine gets
	 * executed, c == &boot_cpu_data.
	 */
	if (c != &boot_cpu_data) {
		/* AND the already accumulated flags with these */
		for (i = 0; i < NCAPINTS; i++)
			boot_cpu_data.x86_capability[i] &= c->x86_capability[i];

LA57 seems to have the same issue. We may need to add some checks for LA57
in KVM's cpu hotplug callback.
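
Something along these lines could cover it (a sketch only; the callback
shape and where it hooks in are assumptions):

	/* reject a hotplugged CPU lacking a feature KVM already
	 * advertises, reading CPUID straight from hardware */
	static int lass_check_processor_compat(void)
	{
		if (kvm_cpu_cap_has(X86_FEATURE_LASS) &&
		    !(cpuid_count_eax(7, 1) & F(LASS)))
			return -EIO;

		return 0;
	}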

> 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
> 		F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
> 	);
>-- 
>2.27.0
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-04-25  3:10   ` Chao Gao
@ 2023-04-25  7:31     ` Zeng Guang
  0 siblings, 0 replies; 25+ messages in thread
From: Zeng Guang @ 2023-04-25  7:31 UTC (permalink / raw)
  To: Gao, Chao
  Cc: Paolo Bonzini, Christopherson,, Sean, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin,
	kvm@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org


On 4/25/2023 11:10 AM, Gao, Chao wrote:
> On Thu, Apr 20, 2023 at 09:37:20PM +0800, Zeng Guang wrote:
>> +/*
>> + * Determine whether an access to the linear address causes a LASS violation.
>> + * LASS protection is only effective in long mode. As a prerequisite, caller
>> + * should make sure VM running in long mode and invoke this api to do LASS
>> + * violation check.
> Could you place the comment above vmx_check_lass()?
>
> And for __vmx_check_lass(), just add:
>
> A variant of vmx_check_lass() without the check for long mode.
>
>> + */
>> +bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
>> +{
>> +	bool user_mode, user_as, rflags_ac;
>> +
>> +	if (!!(flags & KVM_X86_EMULFLAG_SKIP_LASS) ||
>> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
>> +		return false;
>> +
>> +	WARN_ON_ONCE(!is_long_mode(vcpu));
>> +
>> +	user_as = !(la >> 63);
>> +
>
>> +	/*
>> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
>> +	 * accesses a system data structure. For implicit accesses to system
>> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
>> +	 */
>> +	if (access & PFERR_IMPLICIT_ACCESS) {
>> +		user_mode = false;
>> +		rflags_ac = false;
>> +	} else {
>> +		user_mode = vmx_get_cpl(vcpu) == 3;
>> +		if (!user_mode)
>> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
>> +	}
>> +
>> +	if (user_mode != user_as) {
> to reduce one level of indentation, how about:
>
> 	if (user_mode == user_as)
> 		return false;
>
> 	/*
> 	 * Supervisor-mode _data_ accesses to user address space
> 	 * cause LASS violations only if SMAP is enabled.
> 	 */
> 	if (!user_mode && !(access & PFERR_FETCH_MASK))
> 		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;
>
> 	return true;
>
Looks better.


>> +		/*
>> +		 * Supervisor-mode _data_ accesses to user address space
>> +		 * cause LASS violations only if SMAP is enabled.
>> +		 */
>> +		if (!user_mode && !(access & PFERR_FETCH_MASK)) {
>> +			return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) &&
>> +			       !rflags_ac;
>> +		} else {
>> +			return true;
>> +		}
>> +	}
>> +
>> +	return false;
>> +}
>> +
>> +static bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags)
>> +{
>> +	return is_long_mode(vcpu) && __vmx_check_lass(vcpu, access, la, flags);
> Why not request all callers to check if vcpu is in long mode?
>
> e.g.,
> 	return is_long_mode(vcpu) && static_call(kvm_x86_check_lass)(...);
>
> then you can rename __vmx_check_lass() to vmx_check_lass() and drop the
> original one.
By design, __vmx_check_lass() is standalone, used only for the LASS
violation check itself. In some cases the CPU mode is already identified
prior to performing LASS protection; please refer to patch 4. So we
provide two interfaces: vmx_check_lass(), with the CPU mode check
wrapped in, for other modules' usage, e.g. the KVM emulator, and
__vmx_check_lass(), dedicated to VMX.

I will check the feasibility of re-organizing the code to be more optimal.

Thanks.
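
For reference, the two call shapes as split today (fragments taken from
patches 2 and 4, not new code):

	/* inside VMX, where long mode is already established: */
	if (!exn)
		exn = __vmx_check_lass(vcpu, 0, *ret, 0);

	/* from the emulator, via the kvm_x86_ops hook; vmx_check_lass()
	 * performs the long-mode check itself: */
	if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
		goto bad;
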
>> +}
>> +
>> static struct kvm_x86_ops vmx_x86_ops __initdata = {
>> 	.name = "kvm_intel",
>>
>> @@ -8207,6 +8260,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>> 	.complete_emulated_msr = kvm_complete_insn_gp,
>>
>> 	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
>> +
>> +	.check_lass = vmx_check_lass,
>> };
>>
>> static unsigned int vmx_handle_intel_pt_intr(void)
>> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
>> index a3da84f4ea45..6569385a5978 100644
>> --- a/arch/x86/kvm/vmx/vmx.h
>> +++ b/arch/x86/kvm/vmx/vmx.h
>> @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
>> u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
>> u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>>
>> +bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u64 flags);
>> +
> no one uses this function. You can defer exporting it to when the first
> external caller is added.
>
>> static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
>> 					     int type, bool value)
>> {
>> -- 
>> 2.27.0
>>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled
  2023-04-20 13:37 ` [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled Zeng Guang
  2023-04-25  2:52   ` Binbin Wu
@ 2023-04-26  1:31   ` Yuan Yao
  1 sibling, 0 replies; 25+ messages in thread
From: Yuan Yao @ 2023-04-26  1:31 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel, Gao Chao

On Thu, Apr 20, 2023 at 09:37:22PM +0800, Zeng Guang wrote:
> Do LASS violation check for instructions emulated by KVM. Note that for
> instructions executed in the guest directly, hardware will perform the
> check.
>
> Not all instruction emulation leads to accesses to guest linear addresses
> because 1) some instrutions like CPUID, RDMSR, don't take memory as
> operands 2) instruction fetch in most cases is already done inside the
> guest.
>
> Four cases in which kvm may access guest linear addresses are identified
> by code inspection:
> - KVM emulator uses segmented address for instruction fetches or data
>   accesses.
> - For implicit data access, KVM emulator gets address to a system data
>   structure(GDT/LDT/IDT/TR).
> - For VMX instruction emulation, KVM gets the address from "VM-exit
>   instruction information" field in VMCS.
> - For SGX ENCLS instruction emulation, KVM gets the address from registers.
>
> LASS violation check applies to these linear address so as to enforce
> mode-based protections as hardware behaves.
>
> As exceptions, the target memory address of emulation of invlpg, branch
> and call instructions doesn't require LASS violation check.
>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> ---
>  arch/x86/kvm/emulate.c    | 36 +++++++++++++++++++++++++++++++-----
>  arch/x86/kvm/vmx/nested.c |  3 +++
>  arch/x86/kvm/vmx/sgx.c    |  2 ++
>  3 files changed, 36 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 5cc3efa0e21c..a9a022fd712e 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -687,7 +687,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>  				       struct segmented_address addr,
>  				       unsigned *max_size, unsigned size,
>  				       bool write, bool fetch,
> -				       enum x86emul_mode mode, ulong *linear)
> +				       enum x86emul_mode mode, ulong *linear,
> +				       u64 flags)
>  {
>  	struct desc_struct desc;
>  	bool usable;
> @@ -695,6 +696,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>  	u32 lim;
>  	u16 sel;
>  	u8  va_bits;
> +	u64 access = fetch ? PFERR_FETCH_MASK : 0;
>
>  	la = seg_base(ctxt, addr.seg) + addr.ea;
>  	*max_size = 0;
> @@ -740,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>  		}
>  		break;
>  	}
> +
> +	if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
> +		goto bad;
> +
>  	if (la & (insn_alignment(ctxt, size) - 1))
>  		return emulate_gp(ctxt, 0);
>  	return X86EMUL_CONTINUE;
> @@ -757,7 +763,7 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
>  {
>  	unsigned max_size;
>  	return __linearize(ctxt, addr, &max_size, size, write, false,
> -			   ctxt->mode, linear);
> +			   ctxt->mode, linear, 0);
>  }
>
>  static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
> @@ -770,7 +776,10 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>
>  	if (ctxt->op_bytes != sizeof(unsigned long))
>  		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
> -	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
> +
> +	/* LASS doesn't apply to address for branch and call instructions */
> +	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode,
> +	     &linear, KVM_X86_EMULFLAG_SKIP_LASS);

The emulator.c is a common part of x86, so maybe a more common
abstraction like permission_check_before_paging would be better?
Let's also wait for other people's input on this.
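
Presumably something like this is meant (a sketch; the name and shape
are only a guess at the suggestion):

	/* generic pre-paging permission hook in the emulator, so future
	 * checks besides LASS can slot in without new flags */
	static bool check_perm_before_paging(struct x86_emulate_ctxt *ctxt,
					     u64 access, ulong la, u64 flags)
	{
		/* today this only dispatches the LASS check */
		return ctxt->ops->check_lass(ctxt, access, la, flags);
	}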

>  	if (rc == X86EMUL_CONTINUE)
>  		ctxt->_eip = addr.ea;
>  	return rc;
> @@ -845,6 +854,13 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
>  static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
>  			      void *data, unsigned size)
>  {
> +	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
> +		ctxt->exception.vector = GP_VECTOR;
> +		ctxt->exception.error_code = 0;
> +		ctxt->exception.error_code_valid = true;
> +		return X86EMUL_PROPAGATE_FAULT;
> +	}
> +
>  	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
>  }
>
> @@ -852,6 +868,13 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
>  			       ulong linear, void *data,
>  			       unsigned int size)
>  {
> +	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
> +		ctxt->exception.vector = GP_VECTOR;
> +		ctxt->exception.error_code = 0;
> +		ctxt->exception.error_code_valid = true;
> +		return X86EMUL_PROPAGATE_FAULT;
> +	}
> +
>  	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
>  }
>
> @@ -907,7 +930,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
>  	 * against op_size.
>  	 */
>  	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
> -			 &linear);
> +			 &linear, 0);
>  	if (unlikely(rc != X86EMUL_CONTINUE))
>  		return rc;
>
> @@ -3432,8 +3455,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
>  {
>  	int rc;
>  	ulong linear;
> +	unsigned max_size;
>
> -	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
> +	/* LASS doesn't apply to the memory address for invlpg */
> +	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, false, false,
> +	     ctxt->mode, &linear, KVM_X86_EMULFLAG_SKIP_LASS);
>  	if (rc == X86EMUL_CONTINUE)
>  		ctxt->ops->invlpg(ctxt, linear);
>  	/* Disable writeback. */
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index c8ae9d0e59b3..55c88c4593a6 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4974,6 +4974,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
>  		 * destination for long mode!
>  		 */
>  		exn = is_noncanonical_address(*ret, vcpu);
> +
> +		if (!exn)
> +			exn = __vmx_check_lass(vcpu, 0, *ret, 0);
>  	} else {
>  		/*
>  		 * When not in long mode, the virtual/linear address is
> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
> index b12da2a6dec9..30cb5d0980be 100644
> --- a/arch/x86/kvm/vmx/sgx.c
> +++ b/arch/x86/kvm/vmx/sgx.c
> @@ -37,6 +37,8 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
>  		fault = true;
>  	} else if (likely(is_long_mode(vcpu))) {
>  		fault = is_noncanonical_address(*gva, vcpu);
> +		if (!fault)
> +			fault = __vmx_check_lass(vcpu, 0, *gva, 0);
>  	} else {
>  		*gva &= 0xffffffff;
>  		fault = (s.unusable) ||
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-04-25  3:26     ` Zeng Guang
@ 2023-04-26  1:46       ` Binbin Wu
  0 siblings, 0 replies; 25+ messages in thread
From: Binbin Wu @ 2023-04-26  1:46 UTC (permalink / raw)
  To: Zeng Guang, Paolo Bonzini, Christopherson,, Sean, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin,
	kvm@vger.kernel.org
  Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Gao, Chao



On 4/25/2023 11:26 AM, Zeng Guang wrote:
>
> On 4/24/2023 3:43 PM, Binbin Wu wrote:
>>
[...]
>> On 4/20/2023 9:37 PM, Zeng Guang wrote:
>>
>> diff --git a/arch/x86/include/asm/kvm-x86-ops.h 
>> b/arch/x86/include/asm/kvm-x86-ops.h
>> index abccd51dcfca..f76c07f2674b 100644
>> --- a/arch/x86/include/asm/kvm-x86-ops.h
>> +++ b/arch/x86/include/asm/kvm-x86-ops.h
>> @@ -131,6 +131,7 @@ KVM_X86_OP(msr_filter_changed)
>>    KVM_X86_OP(complete_emulated_msr)
>>    KVM_X86_OP(vcpu_deliver_sipi_vector)
>>    KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
>> +KVM_X86_OP_OPTIONAL_RET0(check_lass);
>>       #undef KVM_X86_OP
>>    #undef KVM_X86_OP_OPTIONAL
>> diff --git a/arch/x86/include/asm/kvm_host.h 
>> b/arch/x86/include/asm/kvm_host.h
>> index 8ff89a52ef66..31fb8699a1ff 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -69,6 +69,9 @@
>>    #define KVM_X86_NOTIFY_VMEXIT_VALID_BITS 
>> (KVM_X86_NOTIFY_VMEXIT_ENABLED | \
>>                             KVM_X86_NOTIFY_VMEXIT_USER)
>>    +/* x86-specific emulation flags */
>> +#define KVM_X86_EMULFLAG_SKIP_LASS    _BITULL(1)
>> Do you use the flag outside of emulator?
>> For LAM patch, it's planned to move the flags inside emulator.
> IMO, the detailed flag is implementation specific. Is it necessary to
> bind it to the emulator even though it's only used inside the emulator?
For the rest (i.e., VM-exit handling), the code is already in the
vendor-specific implementations, and the callers know whether to skip
the LASS check or not.

I plan to do a cleanup to consolidate the flags into one parameter for
__linearize(). The consolidated flags value will then be extended for
LAM and other features (e.g. LASS).

I post the proposed patch below; could you help to check whether it is
OK for LASS to follow?



 arch/x86/kvm/emulate.c     | 20 ++++++++++++++------
 arch/x86/kvm/kvm_emulate.h |  4 ++++
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index a20bec931764..5fb516bc5731 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -687,8 +687,8 @@ static unsigned insn_alignment(struct x86_emulate_ctxt *ctxt, unsigned size)
 static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 				       struct segmented_address addr,
 				       unsigned *max_size, unsigned size,
-				       bool write, bool fetch,
-				       enum x86emul_mode mode, ulong *linear)
+				       u64 flags, enum x86emul_mode mode,
+				       ulong *linear)
 {
 	struct desc_struct desc;
 	bool usable;
@@ -696,6 +696,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 	u32 lim;
 	u16 sel;
 	u8  va_bits;
+	bool fetch = !!(flags & X86_EMULFLAG_FETCH);
+	bool write = !!(flags & X86_EMULFLAG_WRITE);

 	la = seg_base(ctxt, addr.seg) + addr.ea;
 	*max_size = 0;
@@ -757,7 +759,12 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
 		     ulong *linear)
 {
 	unsigned max_size;
-	return __linearize(ctxt, addr, &max_size, size, write, false,
+	u64 flags = 0;
+
+	if (write)
+		flags |= X86_EMULFLAG_WRITE;
+
+	return __linearize(ctxt, addr, &max_size, size, flags,
 			   ctxt->mode, linear);
 }

@@ -768,10 +775,11 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
 	unsigned max_size;
 	struct segmented_address addr = { .seg = VCPU_SREG_CS,
 					   .ea = dst };
+	u64 flags = X86_EMULFLAG_FETCH;

 	if (ctxt->op_bytes != sizeof(unsigned long))
 		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
-	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
+	rc = __linearize(ctxt, addr, &max_size, 1, flags, ctxt->mode, &linear);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->_eip = addr.ea;
 	return rc;
@@ -896,6 +904,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
 	int cur_size = ctxt->fetch.end - ctxt->fetch.data;
 	struct segmented_address addr = { .seg = VCPU_SREG_CS,
 					   .ea = ctxt->eip + cur_size };
+	u64 flags = X86_EMULFLAG_FETCH;

 	/*
 	 * We do not know exactly how many bytes will be needed, and
@@ -907,8 +916,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
 	 * boundary check itself.  Instead, we use max_size to check
 	 * against op_size.
 	 */
-	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
-			 &linear);
+	rc = __linearize(ctxt, addr, &max_size, 0, flags, ctxt->mode, &linear);
 	if (unlikely(rc != X86EMUL_CONTINUE))
 		return rc;

diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index ab65f3a47dfd..5451a37f135f 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -88,6 +88,10 @@ struct x86_instruction_info {
 #define X86EMUL_IO_NEEDED       5 /* IO is needed to complete emulation */
 #define X86EMUL_INTERCEPTED     6 /* Intercepted by nested VMCB/VMCS */

+/* x86-specific emulation flags */
+#define X86_EMULFLAG_FETCH	_BITULL(0)
+#define X86_EMULFLAG_WRITE	_BITULL(1)
+
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
 	/*

>>> +
>>>    /* x86-specific vcpu->requests bit members */
>>>    #define KVM_REQ_MIGRATE_TIMER        KVM_ARCH_REQ(0)
>>>    #define KVM_REQ_REPORT_TPR_ACCESS    KVM_ARCH_REQ(1)
>>> @@ -1706,6 +1709,8 @@ struct kvm_x86_ops {
>>>         * Returns vCPU specific APICv inhibit reasons
>>>         */
>>>        unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct 
>>> kvm_vcpu *vcpu);
>>> +
>>> +    bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, 
>>> u64 flags);
>> The flags may be dropped if the caller knows to skip it or not.
> Probably I don't get you right. Do you mean we need to define another
> function without flags?
>
>>>    };
>>>       struct kvm_x86_nested_ops {
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index c923d7599d71..581327ede66a 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -8070,6 +8070,59 @@ static void vmx_vm_destroy(struct kvm *kvm)
>>>        free_pages((unsigned long)kvm_vmx->pid_table, 
>>> vmx_get_pid_table_order(kvm));
>>>    }
>>>    +/*
>>> + * Determine whether an access to the linear address causes a LASS 
>>> violation.
>>> + * LASS protection is only effective in long mode. As a 
>>> prerequisite, caller
>>> + * should make sure VM
>> Should be vCPU?
> Similar meaning, I think. :)
>>> running in long mode and invoke this api to do LASS
>>> + * violation check.
>>> + */
>>> +bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, 
>>> u64 flags)
>>> +{
>>> +    bool user_mode, user_as, rflags_ac;
>>> +
>>> +    if (!!(flags & KVM_X86_EMULFLAG_SKIP_LASS) ||
>>> +        !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
>>> +        return false;
>>> +
>>> +    WARN_ON_ONCE(!is_long_mode(vcpu));
>>> +
>>> +    user_as = !(la >> 63);
>>> +
>>> +    /*
>>> +     * An access is a supervisor-mode access if CPL < 3 or if it 
>>> implicitly
>>> +     * accesses a system data structure. For implicit accesses to 
>>> system
>>> +     * data structure, the processor acts as if RFLAGS.AC is clear.
>>> +     */
>>> +    if (access & PFERR_IMPLICIT_ACCESS) {
>>> +        user_mode = false;
>>> +        rflags_ac = false;
>>> +    } else {
>>> +        user_mode = vmx_get_cpl(vcpu) == 3;
>>> +        if (!user_mode)
>>> +            rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
>>> +    }
>>> +
>>> +    if (user_mode != user_as) {
>>> +        /*
>>> +         * Supervisor-mode _data_ accesses to user address space
>>> +         * cause LASS violations only if SMAP is enabled.
>>> +         */
>>> +        if (!user_mode && !(access & PFERR_FETCH_MASK)) {
>>> +            return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) &&
>>> +                   !rflags_ac;
>>> +        } else {
>>> +            return true;
>>> +        }
>>> +    }
>>> +
>>> +    return false;
>>> +}
>>> +
>>> +static bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 
>>> la, u64 flags)
>>> +{
>>> +    return is_long_mode(vcpu) && __vmx_check_lass(vcpu, access, la, 
>>> flags);
>>> +}
>>> +
>>>    static struct kvm_x86_ops vmx_x86_ops __initdata = {
>>>        .name = "kvm_intel",
>>>    @@ -8207,6 +8260,8 @@ static struct kvm_x86_ops vmx_x86_ops 
>>> __initdata = {
>>>        .complete_emulated_msr = kvm_complete_insn_gp,
>>>           .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
>>> +
>>> +    .check_lass = vmx_check_lass,
>>>    };
>>>       static unsigned int vmx_handle_intel_pt_intr(void)
>>> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
>>> index a3da84f4ea45..6569385a5978 100644
>>> --- a/arch/x86/kvm/vmx/vmx.h
>>> +++ b/arch/x86/kvm/vmx/vmx.h
>>> @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct 
>>> kvm_vcpu *vcpu, u32 msr, int type);
>>>    u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
>>>    u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>>>    +bool __vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, 
>>> u64 flags);
>>> +
>>>    static inline void vmx_set_intercept_for_msr(struct kvm_vcpu 
>>> *vcpu, u32 msr,
>>>                             int type, bool value)
>>>    {


^ permalink raw reply related	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2023-04-26  1:46 UTC | newest]

Thread overview: 25+ messages
2023-04-20 13:37 [PATCH 0/6] LASS KVM virtualization support Zeng Guang
2023-04-20 13:37 ` [PATCH 1/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
2023-04-24  6:45   ` Binbin Wu
2023-04-25  1:52     ` Zeng Guang
2023-04-24  7:32   ` Chao Gao
2023-04-25  2:35     ` Zeng Guang
2023-04-25  3:26       ` Chao Gao
2023-04-20 13:37 ` [PATCH 2/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
2023-04-24  7:43   ` Binbin Wu
2023-04-25  3:26     ` Zeng Guang
2023-04-26  1:46       ` Binbin Wu
2023-04-25  3:10   ` Chao Gao
2023-04-25  7:31     ` Zeng Guang
2023-04-20 13:37 ` [PATCH 3/6] KVM: x86: Add emulator helper " Zeng Guang
2023-04-20 13:37 ` [PATCH 4/6] KVM: x86: LASS protection on KVM emulation when LASS enabled Zeng Guang
2023-04-25  2:52   ` Binbin Wu
2023-04-25  6:40     ` Zeng Guang
2023-04-26  1:31   ` Yuan Yao
2023-04-20 13:37 ` [PATCH 5/6] KVM: x86: Advertise LASS CPUID to user space Zeng Guang
2023-04-20 13:37 ` [PATCH 6/6] KVM: x86: Set KVM LASS based on hardware capability Zeng Guang
2023-04-25  2:57   ` Binbin Wu
2023-04-25  6:47     ` Zeng Guang
2023-04-25  7:28   ` Chao Gao
2023-04-24  1:20 ` [PATCH 0/6] LASS KVM virtualization support Binbin Wu
2023-04-25  1:49   ` Zeng Guang
