linux-kselftest.vger.kernel.org archive mirror
* [PATCH v4 0/5] Add support for the Bus Lock Threshold
@ 2025-03-24 13:02 Manali Shukla
  2025-03-24 13:02 ` [PATCH v4 1/5] KVM: x86: Preparatory patch to move linear_rip out of kvm_pio_request Manali Shukla
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Manali Shukla @ 2025-03-24 13:02 UTC (permalink / raw)
  To: kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, manali.shukla, bp

Misbehaving guests can cause bus locks to degrade the performance of
a system. Non-WB (write-back) and misaligned locked RMW (read-modify-write)
instructions are referred to as "bus locks" and require system-wide
synchronization among all processors to guarantee atomicity. Bus locks
can impose notable performance penalties on all processors within
the system.

Support for the Bus Lock Threshold is indicated by CPUID
Fn8000_000A_EDX[29] BusLockThreshold=1, the VMCB provides a Bus Lock
Threshold enable bit and an unsigned 16-bit Bus Lock Threshold count.

VMCB intercept bit
    VMCB Offset     Bits    Function
    14h             5       Intercept bus lock operations

Bus lock threshold count
    VMCB Offset     Bits    Function
    120h            15:0    Bus lock counter

During VMRUN, the bus lock threshold count is fetched and stored in an
internal count register.  Prior to executing a bus lock within the guest,
the processor checks the count in this register. If the count is
greater than zero, the processor executes the bus lock and decrements the
count. If the count is zero, the bus lock operation is not performed;
instead, a Bus Lock Threshold #VMEXIT is triggered to transfer control to
the Virtual Machine Monitor (VMM).

A Bus Lock Threshold #VMEXIT is reported to the VMM with VMEXIT code 0xA5,
VMEXIT_BUSLOCK. EXITINFO1 and EXITINFO2 are set to 0 on a VMEXIT_BUSLOCK.
On a #VMEXIT, the processor writes the current value of the Bus Lock
Threshold Counter to the VMCB.

Note: Currently, virtualizing the Bus Lock Threshold feature for L1 guests
is not supported.

More details about the Bus Lock Threshold feature can be found in AMD APM
[1].

v3 -> v4
- Incorporated Sean's review comments
  - Added a preparatory patch to move linear_rip out of kvm_pio_request, so
    that it can be used by the bus lock threshold patches.
  - Added complete_userspace_buslock() function to reload bus_lock_counter
    to '1' only if userspace has not changed the RIP.
  - Added changes to continue running bus_lock_counter across the nested
    transitions.

v2 -> v3
- Drop patch to add virt tag in /proc/cpuinfo.
- Incorporated Tom's review comments.

v1 -> v2
- Incorporated misc review comments from Sean.
- Removed bus_lock_counter module parameter.
- Set the value of bus_lock_counter to zero by default and reload the value
  to '1' in the bus lock exit handler.
- Add documentation for the behavioral difference for KVM_EXIT_BUS_LOCK.
- Improved selftest for buslock to work on SVM and VMX.
- Rewrite the commit messages.

Patches are prepared on kvm-next/next (c9ea48bb6ee6).

Testing done:
- Tested the Bus Lock Threshold functionality on normal, SEV, SEV-ES and SEV-SNP guests.
- Tested the Bus Lock Threshold functionality on nested guests.

v1: https://lore.kernel.org/kvm/20240709175145.9986-4-manali.shukla@amd.com/T/
v2: https://lore.kernel.org/kvm/20241001063413.687787-4-manali.shukla@amd.com/T/
v3: https://lore.kernel.org/kvm/20241004053341.5726-1-manali.shukla@amd.com/T/

[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
     Vol 2, 15.14.5 Bus Lock Threshold.
     https://bugzilla.kernel.org/attachment.cgi?id=306250

Manali Shukla (3):
  KVM: x86: Preparatory patch to move linear_rip out of kvm_pio_request
  x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
  KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs

Nikunj A Dadhania (2):
  KVM: SVM: Enable Bus lock threshold exit
  KVM: selftests: Add bus lock exit test

 Documentation/virt/kvm/api.rst                |  19 +++
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/kvm_host.h               |   2 +-
 arch/x86/include/asm/svm.h                    |   5 +-
 arch/x86/include/uapi/asm/svm.h               |   2 +
 arch/x86/kvm/svm/nested.c                     |  42 ++++++
 arch/x86/kvm/svm/svm.c                        |  38 +++++
 arch/x86/kvm/svm/svm.h                        |   2 +
 arch/x86/kvm/x86.c                            |   8 +-
 tools/testing/selftests/kvm/Makefile.kvm      |   1 +
 .../selftests/kvm/x86/kvm_buslock_test.c      | 135 ++++++++++++++++++
 11 files changed, 249 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86/kvm_buslock_test.c


base-commit: c9ea48bb6ee6b28bbc956c1e8af98044618fed5e
-- 
2.34.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v4 1/5] KVM: x86: Preparatory patch to move linear_rip out of kvm_pio_request
  2025-03-24 13:02 [PATCH v4 0/5] Add support for the Bus Lock Threshold Manali Shukla
@ 2025-03-24 13:02 ` Manali Shukla
  2025-04-23 15:22   ` Sean Christopherson
  2025-03-24 13:02 ` [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold Manali Shukla
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Manali Shukla @ 2025-03-24 13:02 UTC (permalink / raw)
  To: kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, manali.shukla, bp

Add a refactoring prep patch to move linear_rip out of kvm_pio_request
and place it next to complete_userspace_io.  There's nothing port I/O
specific about the linear_rip field; it just so happens that port I/O is
the only case where KVM's ABI is to let userspace stuff state (to emulate
RESET) without first completing the I/O instruction.

No functional changes intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Manali Shukla <manali.shukla@amd.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/x86.c              | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d881e7d276b1..f8a564dd5422 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -410,7 +410,6 @@ struct kvm_rmap_head {
 };
 
 struct kvm_pio_request {
-	unsigned long linear_rip;
 	unsigned long count;
 	int in;
 	int port;
@@ -909,6 +908,7 @@ struct kvm_vcpu_arch {
 	bool emulate_regs_need_sync_to_vcpu;
 	bool emulate_regs_need_sync_from_vcpu;
 	int (*complete_userspace_io)(struct kvm_vcpu *vcpu);
+	unsigned long cui_linear_rip;
 
 	gpa_t time;
 	s8  pvclock_tsc_shift;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 69c20a68a3f0..8282ae350eec 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9324,7 +9324,7 @@ static int complete_fast_pio_out(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.pio.count = 0;
 
-	if (unlikely(!kvm_is_linear_rip(vcpu, vcpu->arch.pio.linear_rip)))
+	if (unlikely(!kvm_is_linear_rip(vcpu, vcpu->arch.cui_linear_rip)))
 		return 1;
 
 	return kvm_skip_emulated_instruction(vcpu);
@@ -9349,7 +9349,7 @@ static int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size,
 			complete_fast_pio_out_port_0x7e;
 		kvm_skip_emulated_instruction(vcpu);
 	} else {
-		vcpu->arch.pio.linear_rip = kvm_get_linear_rip(vcpu);
+		vcpu->arch.cui_linear_rip = kvm_get_linear_rip(vcpu);
 		vcpu->arch.complete_userspace_io = complete_fast_pio_out;
 	}
 	return 0;
@@ -9362,7 +9362,7 @@ static int complete_fast_pio_in(struct kvm_vcpu *vcpu)
 	/* We should only ever be called with arch.pio.count equal to 1 */
 	BUG_ON(vcpu->arch.pio.count != 1);
 
-	if (unlikely(!kvm_is_linear_rip(vcpu, vcpu->arch.pio.linear_rip))) {
+	if (unlikely(!kvm_is_linear_rip(vcpu, vcpu->arch.cui_linear_rip))) {
 		vcpu->arch.pio.count = 0;
 		return 1;
 	}
@@ -9391,7 +9391,7 @@ static int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size,
 		return ret;
 	}
 
-	vcpu->arch.pio.linear_rip = kvm_get_linear_rip(vcpu);
+	vcpu->arch.cui_linear_rip = kvm_get_linear_rip(vcpu);
 	vcpu->arch.complete_userspace_io = complete_fast_pio_in;
 
 	return 0;

base-commit: c9ea48bb6ee6b28bbc956c1e8af98044618fed5e
-- 
2.34.1



* [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
  2025-03-24 13:02 [PATCH v4 0/5] Add support for the Bus Lock Threshold Manali Shukla
  2025-03-24 13:02 ` [PATCH v4 1/5] KVM: x86: Preparatory patch to move linear_rip out of kvm_pio_request Manali Shukla
@ 2025-03-24 13:02 ` Manali Shukla
  2025-03-24 21:56   ` Borislav Petkov
  2025-03-24 13:02 ` [PATCH v4 3/5] KVM: SVM: Enable Bus lock threshold exit Manali Shukla
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Manali Shukla @ 2025-03-24 13:02 UTC (permalink / raw)
  To: kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, manali.shukla, bp

Misbehaving guests can cause bus locks to degrade the performance of
the system. The Bus Lock Threshold feature can be used to address this
issue by providing the hypervisor the capability to limit the guest's
ability to generate bus locks, thereby preventing system slowdown due
to performance penalties.

When the Bus Lock Threshold feature is enabled, the processor checks
the bus lock threshold count before executing the bus lock and decides
whether to trigger a bus lock exit.

A bus lock threshold count of '0' generates a bus lock exit; if the
count is greater than '0', the bus lock is executed successfully and
the bus lock threshold count is decremented.

Presence of the Bus Lock threshold feature is indicated via CPUID
function 0x8000000A_EDX[29].

Signed-off-by: Manali Shukla <manali.shukla@amd.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 8f8aaf94dc00..a3ab8d1f5c17 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -386,6 +386,7 @@
 #define X86_FEATURE_V_SPEC_CTRL		(15*32+20) /* "v_spec_ctrl" Virtual SPEC_CTRL */
 #define X86_FEATURE_VNMI		(15*32+25) /* "vnmi" Virtual NMI */
 #define X86_FEATURE_SVME_ADDR_CHK	(15*32+28) /* SVME addr check */
+#define X86_FEATURE_BUS_LOCK_THRESHOLD	(15*32+29) /* "buslock" Bus lock threshold */
 #define X86_FEATURE_IDLE_HLT		(15*32+30) /* IDLE HLT intercept */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), word 16 */
-- 
2.34.1



* [PATCH v4 3/5] KVM: SVM: Enable Bus lock threshold exit
  2025-03-24 13:02 [PATCH v4 0/5] Add support for the Bus Lock Threshold Manali Shukla
  2025-03-24 13:02 ` [PATCH v4 1/5] KVM: x86: Preparatory patch to move linear_rip out of kvm_pio_request Manali Shukla
  2025-03-24 13:02 ` [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold Manali Shukla
@ 2025-03-24 13:02 ` Manali Shukla
  2025-04-16  6:00   ` Xiaoyao Li
  2025-03-24 13:02 ` [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs Manali Shukla
  2025-03-24 13:02 ` [PATCH v4 5/5] KVM: selftests: Add bus lock exit test Manali Shukla
  4 siblings, 1 reply; 22+ messages in thread
From: Manali Shukla @ 2025-03-24 13:02 UTC (permalink / raw)
  To: kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, manali.shukla, bp

From: Nikunj A Dadhania <nikunj@amd.com>

Virtual machines can exploit bus locks to degrade the performance of
the system. Bus locks can be caused by Non-WB (write-back) and
misaligned locked RMW (read-modify-write) instructions and require
system-wide synchronization among all processors, which can result in
significant performance penalties.

To address this issue, the Bus Lock Threshold feature is introduced to
provide the hypervisor the ability to restrict a guest's capability of
initiating multiple bus locks, thereby preventing system slowdowns.

Support for the buslock threshold is indicated via CPUID function
0x8000000A_EDX[29].

On the processors that support the Bus Lock Threshold feature, the
VMCB provides a Bus Lock Threshold enable bit and an unsigned 16-bit
Bus Lock threshold count.

VMCB intercept bit
VMCB Offset     Bits    Function
14h             5       Intercept bus lock operations

Bus lock threshold count
VMCB Offset     Bits    Function
120h            15:0    Bus lock counter

When a VMRUN instruction is executed, the bus lock threshold count is
loaded into an internal count register. Before the processor executes
a bus lock in the guest, it checks the value of this register:

 - If the value is greater than '0', the processor successfully
   executes the bus lock and decrements the count.

 - If the value is '0', the bus lock is not executed, and a #VMEXIT to
   the VMM is taken.

The bus lock threshold #VMEXIT is reported to the VMM with the VMEXIT
code A5h, SVM_EXIT_BUS_LOCK.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Co-developed-by: Manali Shukla <manali.shukla@amd.com>
Signed-off-by: Manali Shukla <manali.shukla@amd.com>
---
 arch/x86/include/asm/svm.h      | 5 ++++-
 arch/x86/include/uapi/asm/svm.h | 2 ++
 arch/x86/kvm/svm/svm.c          | 8 ++++++++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 9b7fa99ae951..9dc54da1835a 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -116,6 +116,7 @@ enum {
 	INTERCEPT_INVPCID,
 	INTERCEPT_MCOMMIT,
 	INTERCEPT_TLBSYNC,
+	INTERCEPT_BUSLOCK,
 	INTERCEPT_IDLE_HLT = 166,
 };
 
@@ -159,7 +160,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 	u64 avic_physical_id;	/* Offset 0xf8 */
 	u8 reserved_7[8];
 	u64 vmsa_pa;		/* Used for an SEV-ES guest */
-	u8 reserved_8[720];
+	u8 reserved_8[16];
+	u16 bus_lock_counter;	/* Offset 0x120 */
+	u8 reserved_9[702];
 	/*
 	 * Offset 0x3e0, 32 bytes reserved
 	 * for use by hypervisor/software.
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index ec1321248dac..9c640a521a67 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -95,6 +95,7 @@
 #define SVM_EXIT_CR14_WRITE_TRAP		0x09e
 #define SVM_EXIT_CR15_WRITE_TRAP		0x09f
 #define SVM_EXIT_INVPCID       0x0a2
+#define SVM_EXIT_BUS_LOCK			0x0a5
 #define SVM_EXIT_IDLE_HLT      0x0a6
 #define SVM_EXIT_NPF           0x400
 #define SVM_EXIT_AVIC_INCOMPLETE_IPI		0x401
@@ -225,6 +226,7 @@
 	{ SVM_EXIT_CR4_WRITE_TRAP,	"write_cr4_trap" }, \
 	{ SVM_EXIT_CR8_WRITE_TRAP,	"write_cr8_trap" }, \
 	{ SVM_EXIT_INVPCID,     "invpcid" }, \
+	{ SVM_EXIT_BUS_LOCK,     "buslock" }, \
 	{ SVM_EXIT_IDLE_HLT,     "idle-halt" }, \
 	{ SVM_EXIT_NPF,         "npf" }, \
 	{ SVM_EXIT_AVIC_INCOMPLETE_IPI,		"avic_incomplete_ipi" }, \
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 8abeab91d329..244e099e7262 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1375,6 +1375,9 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 		svm->vmcb->control.int_ctl |= V_GIF_ENABLE_MASK;
 	}
 
+	if (vcpu->kvm->arch.bus_lock_detection_enabled)
+		svm_set_intercept(svm, INTERCEPT_BUSLOCK);
+
 	if (sev_guest(vcpu->kvm))
 		sev_init_vmcb(svm);
 
@@ -5299,6 +5302,11 @@ static __init void svm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
 	}
 
+	if (cpu_feature_enabled(X86_FEATURE_BUS_LOCK_THRESHOLD)) {
+		pr_info("Bus Lock Threshold supported\n");
+		kvm_caps.has_bus_lock_exit = true;
+	}
+
 	/* CPUID 0x80000008 */
 	if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) ||
 	    boot_cpu_has(X86_FEATURE_AMD_SSBD))
-- 
2.34.1



* [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs
  2025-03-24 13:02 [PATCH v4 0/5] Add support for the Bus Lock Threshold Manali Shukla
                   ` (2 preceding siblings ...)
  2025-03-24 13:02 ` [PATCH v4 3/5] KVM: SVM: Enable Bus lock threshold exit Manali Shukla
@ 2025-03-24 13:02 ` Manali Shukla
  2025-04-16  6:14   ` Xiaoyao Li
  2025-04-23 15:44   ` Sean Christopherson
  2025-03-24 13:02 ` [PATCH v4 5/5] KVM: selftests: Add bus lock exit test Manali Shukla
  4 siblings, 2 replies; 22+ messages in thread
From: Manali Shukla @ 2025-03-24 13:02 UTC (permalink / raw)
  To: kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, manali.shukla, bp

Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs with Bus Lock
Threshold, which is close enough to VMX's Bus Lock Detection VM-Exit to
allow reusing KVM_CAP_X86_BUS_LOCK_EXIT.

The biggest difference between the two features is that Threshold is
fault-like, whereas Detection is trap-like.  To allow the guest to make
forward progress, Threshold provides a per-VMCB counter which is
decremented every time a bus lock occurs, and a VM-Exit is triggered if
and only if the counter is '0'.

To provide Detection-like semantics, initialize the counter to '0', i.e.
exit on every bus lock, and when re-executing the guilty instruction, set
the counter to '1' to effectively step past the instruction.

Note, in the unlikely scenario that re-executing the instruction doesn't
trigger a bus lock, e.g. because the guest has changed memory types or
patched the guilty instruction, the bus lock counter will be left at '1',
i.e. the guest will be able to do a bus lock on a different instruction.
In a perfect world, KVM would ensure the counter is '0' if the guest has
made forward progress, e.g. if RIP has changed.  But trying to close that
hole would incur non-trivial complexity, for marginal benefit; the intent
of KVM_CAP_X86_BUS_LOCK_EXIT is to allow userspace to rate-limit bus locks,
not to allow for precise detection of problematic guest code.  And, it's
simply not feasible to fully close the hole, e.g. if an interrupt arrives
before the original instruction can re-execute, the guest could step past
a different bus lock.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Manali Shukla <manali.shukla@amd.com>
---
 Documentation/virt/kvm/api.rst | 19 +++++++++++++++
 arch/x86/kvm/svm/nested.c      | 42 ++++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c         | 30 ++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.h         |  2 ++
 4 files changed, 93 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 5fe84f2427b5..f7c925aa0c4f 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7909,6 +7909,25 @@ apply some other policy-based mitigation. When exiting to userspace, KVM sets
 KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and conditionally sets the exit_reason
 to KVM_EXIT_X86_BUS_LOCK.
 
+Note! KVM_CAP_X86_BUS_LOCK_EXIT on AMD CPUs with the Bus Lock Threshold is close
+enough to Intel's Bus Lock Detection VM-Exit to allow using
+KVM_CAP_X86_BUS_LOCK_EXIT for AMD CPUs.
+
+The biggest difference between the two features is that Threshold (AMD CPUs) is
+fault-like i.e. the bus lock exit to user space occurs with RIP pointing at the
+offending instruction, whereas Detection (Intel CPUs) is trap-like i.e. the bus
+lock exit to user space occurs with RIP pointing at the instruction right after
+the guilty one.
+
+The bus lock threshold on AMD CPUs provides a per-VMCB counter which is
+decremented every time a bus lock occurs, and a VM-Exit is triggered if and only
+if the bus lock counter is '0'.
+
+To provide Detection-like semantics for AMD CPUs, the bus lock counter is
+initialized to '0', i.e. exit on every bus lock, and when re-executing the
+guilty instruction, the bus lock counter is set to '1' to effectively step
+past the instruction.
+
 Note! Detected bus locks may be coincident with other exits to userspace, i.e.
 KVM_RUN_X86_BUS_LOCK should be checked regardless of the primary exit reason if
 userspace wants to take action on all detected bus locks.
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 834b67672d50..a42ef7dd9143 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -678,6 +678,36 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 	vmcb02->control.iopm_base_pa = vmcb01->control.iopm_base_pa;
 	vmcb02->control.msrpm_base_pa = vmcb01->control.msrpm_base_pa;
 
+	/*
+	 * Stash vmcb02's counter if the guest hasn't moved past the guilty
+	 * instruction; otherwise, reset the counter to '0'.
+	 *
+	 * In order to detect if L2 has made forward progress or not, track the
+	 * RIP at which a bus lock has occurred on a per-vmcb12 basis.  If RIP
+	 * has changed, the guest has clearly made forward progress even though
+	 * bus_lock_counter remained '1', so reset bus_lock_counter to '0'. E.g.
+	 * in the scenario where a bus lock happened in L1 before VMRUN, the bus lock
+	 * firmly happened on an instruction in the past. Even if vmcb01's
+	 * counter is still '1', (because the guilty instruction got patched),
+	 * the vCPU has clearly made forward progress and so KVM should reset
+	 * vmcb02's counter to '0'.
+	 *
+	 * If the RIP hasn't changed, stash the bus lock counter at nested VMRUN
+	 * to prevent the same guilty instruction from triggering a VM-Exit. E.g.
+	 * if userspace rate-limits the vCPU, then it's entirely possible that
+	 * L1's tick interrupt is pending by the time userspace re-runs the
+	 * vCPU.  If KVM unconditionally clears the counter on VMRUN, then when
+	 * L1 re-enters L2, the same instruction will trigger a VM-Exit and the
+	 * entire cycle starts over.
+	 */
+	if (vmcb02->save.rip && (svm->nested.ctl.bus_lock_rip == vmcb02->save.rip)) {
+		vmcb02->control.bus_lock_counter = 1;
+		svm->bus_lock_rip = svm->nested.ctl.bus_lock_rip;
+	} else {
+		vmcb02->control.bus_lock_counter = 0;
+	}
+	svm->nested.ctl.bus_lock_rip = INVALID_GPA;
+
 	/* Done at vmrun: asid.  */
 
 	/* Also overwritten later if necessary.  */
@@ -1039,6 +1069,18 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 
 	}
 
+	/*
+	 * If bus_lock_counter is nonzero and the guest has not moved past the
+	 * guilty instruction, save bus_lock_rip in svm_nested_state. This will
+	 * help determine at nested VMRUN whether to stash vmcb02's counter or
+	 * reset it to '0'.
+	 */
+	if (vmcb02->control.bus_lock_counter &&
+	    svm->bus_lock_rip == vmcb02->save.rip)
+		svm->nested.ctl.bus_lock_rip = svm->bus_lock_rip;
+	else
+		svm->nested.ctl.bus_lock_rip = INVALID_GPA;
+
 	nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
 
 	svm_switch_vmcb(svm, &svm->vmcb01);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 244e099e7262..ea12e93ae983 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3310,6 +3310,35 @@ static int invpcid_interception(struct kvm_vcpu *vcpu)
 	return kvm_handle_invpcid(vcpu, type, gva);
 }
 
+static inline int complete_userspace_buslock(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	/*
+	 * If userspace has NOT changed RIP, then KVM's ABI is to let the guest
+	 * execute the bus-locking instruction.  Set the bus lock counter to '1'
+	 * to effectively step past the bus lock.
+	 */
+	if (kvm_is_linear_rip(vcpu, vcpu->arch.cui_linear_rip))
+		svm->vmcb->control.bus_lock_counter = 1;
+
+	return 1;
+}
+
+static int bus_lock_exit(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	vcpu->run->exit_reason = KVM_EXIT_X86_BUS_LOCK;
+	vcpu->run->flags |= KVM_RUN_X86_BUS_LOCK;
+
+	vcpu->arch.cui_linear_rip = kvm_get_linear_rip(vcpu);
+	svm->bus_lock_rip = vcpu->arch.cui_linear_rip;
+	vcpu->arch.complete_userspace_io = complete_userspace_buslock;
+
+	return 0;
+}
+
 static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[SVM_EXIT_READ_CR0]			= cr_interception,
 	[SVM_EXIT_READ_CR3]			= cr_interception,
@@ -3379,6 +3408,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[SVM_EXIT_INVPCID]                      = invpcid_interception,
 	[SVM_EXIT_IDLE_HLT]			= kvm_emulate_halt,
 	[SVM_EXIT_NPF]				= npf_interception,
+	[SVM_EXIT_BUS_LOCK]                     = bus_lock_exit,
 	[SVM_EXIT_RSM]                          = rsm_interception,
 	[SVM_EXIT_AVIC_INCOMPLETE_IPI]		= avic_incomplete_ipi_interception,
 	[SVM_EXIT_AVIC_UNACCELERATED_ACCESS]	= avic_unaccelerated_access_interception,
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index d4490eaed55d..7a4c5848c952 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -169,6 +169,7 @@ struct vmcb_ctrl_area_cached {
 	u64 nested_cr3;
 	u64 virt_ext;
 	u32 clean;
+	u64 bus_lock_rip;
 	union {
 #if IS_ENABLED(CONFIG_HYPERV) || IS_ENABLED(CONFIG_KVM_HYPERV)
 		struct hv_vmcb_enlightenments hv_enlightenments;
@@ -327,6 +328,7 @@ struct vcpu_svm {
 
 	/* Guest GIF value, used when vGIF is not enabled */
 	bool guest_gif;
+	u64 bus_lock_rip;
 };
 
 struct svm_cpu_data {
-- 
2.34.1



* [PATCH v4 5/5] KVM: selftests: Add bus lock exit test
  2025-03-24 13:02 [PATCH v4 0/5] Add support for the Bus Lock Threshold Manali Shukla
                   ` (3 preceding siblings ...)
  2025-03-24 13:02 ` [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs Manali Shukla
@ 2025-03-24 13:02 ` Manali Shukla
  4 siblings, 0 replies; 22+ messages in thread
From: Manali Shukla @ 2025-03-24 13:02 UTC (permalink / raw)
  To: kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, manali.shukla, bp

From: Nikunj A Dadhania <nikunj@amd.com>

Add a test case to verify the Bus Lock exit feature.

The main thing the selftest verifies is that when a bus lock is
generated in the guest context, KVM_EXIT_X86_BUS_LOCK is triggered
for SVM or VMX when the KVM capability KVM_CAP_X86_BUS_LOCK_EXIT is
enabled.

This test case also verifies the Bus Lock exit in a nested scenario.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Co-developed-by: Manali Shukla <manali.shukla@amd.com>
Signed-off-by: Manali Shukla <manali.shukla@amd.com>
---
 tools/testing/selftests/kvm/Makefile.kvm      |   1 +
 .../selftests/kvm/x86/kvm_buslock_test.c      | 135 ++++++++++++++++++
 2 files changed, 136 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86/kvm_buslock_test.c

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index f773f8f99249..17c77f947ae8 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -68,6 +68,7 @@ TEST_GEN_PROGS_x86 += x86/hyperv_svm_test
 TEST_GEN_PROGS_x86 += x86/hyperv_tlb_flush
 TEST_GEN_PROGS_x86 += x86/kvm_clock_test
 TEST_GEN_PROGS_x86 += x86/kvm_pv_test
+TEST_GEN_PROGS_x86 += x86/kvm_buslock_test
 TEST_GEN_PROGS_x86 += x86/monitor_mwait_test
 TEST_GEN_PROGS_x86 += x86/nested_emulation_test
 TEST_GEN_PROGS_x86 += x86/nested_exceptions_test
diff --git a/tools/testing/selftests/kvm/x86/kvm_buslock_test.c b/tools/testing/selftests/kvm/x86/kvm_buslock_test.c
new file mode 100644
index 000000000000..9c081525ac2a
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86/kvm_buslock_test.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+#include "svm_util.h"
+#include "vmx.h"
+
+#define NR_ITERATIONS 100
+#define L2_GUEST_STACK_SIZE 64
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Waddress-of-packed-member"
+
+struct buslock_test {
+	unsigned char pad[PAGE_SIZE - 2];
+	atomic_long_t val;
+} __packed;
+
+struct buslock_test test __aligned(PAGE_SIZE);
+
+static __always_inline void buslock_atomic_add(int i, atomic_long_t *v)
+{
+	asm volatile(LOCK_PREFIX "addl %1,%0"
+		     : "+m" (v->counter)
+		     : "ir" (i) : "memory");
+}
+
+static void buslock_add(void)
+{
+	/*
+	 * Increment a page unaligned variable atomically.
+	 * This should generate a bus lock exit.
+	 */
+	for (int i = 0; i < NR_ITERATIONS; i++)
+		buslock_atomic_add(2, &test.val);
+}
+
+#pragma GCC diagnostic pop
+
+static void l2_guest_code(void)
+{
+	buslock_add();
+	GUEST_DONE();
+}
+
+static void l1_svm_code(struct svm_test_data *svm)
+{
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	struct vmcb *vmcb = svm->vmcb;
+
+	generic_svm_setup(svm, l2_guest_code,
+			  &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+	run_guest(vmcb, svm->vmcb_gpa);
+	GUEST_ASSERT(vmcb->control.exit_code == SVM_EXIT_VMMCALL);
+	GUEST_DONE();
+}
+
+static void l1_vmx_code(struct vmx_pages *vmx)
+{
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+
+	GUEST_ASSERT_EQ(prepare_for_vmx_operation(vmx), true);
+	GUEST_ASSERT_EQ(load_vmcs(vmx), true);
+
+	prepare_vmcs(vmx, NULL, &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	GUEST_ASSERT(!vmwrite(GUEST_RIP, (u64)l2_guest_code));
+	GUEST_ASSERT(!vmlaunch());
+
+	GUEST_ASSERT_EQ(vmreadz(VM_EXIT_REASON), EXIT_REASON_VMCALL);
+	GUEST_DONE();
+}
+
+static void guest_code(void *test_data)
+{
+	buslock_add();
+
+	if (this_cpu_has(X86_FEATURE_SVM))
+		l1_svm_code(test_data);
+	else
+		l1_vmx_code(test_data);
+}
+
+int main(int argc, char *argv[])
+{
+	struct kvm_vcpu *vcpu;
+	struct kvm_run *run;
+	struct kvm_vm *vm;
+	vm_vaddr_t nested_test_data_gva;
+
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM) || kvm_cpu_has(X86_FEATURE_VMX));
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_X86_BUS_LOCK_EXIT));
+
+	vm = vm_create(1);
+	vm_enable_cap(vm, KVM_CAP_X86_BUS_LOCK_EXIT, KVM_BUS_LOCK_DETECTION_EXIT);
+	vcpu = vm_vcpu_add(vm, 0, guest_code);
+
+	if (kvm_cpu_has(X86_FEATURE_SVM))
+		vcpu_alloc_svm(vm, &nested_test_data_gva);
+	else
+		vcpu_alloc_vmx(vm, &nested_test_data_gva);
+
+	vcpu_args_set(vcpu, 1, nested_test_data_gva);
+
+	run = vcpu->run;
+
+	for (;;) {
+		struct ucall uc;
+
+		vcpu_run(vcpu);
+
+		if (run->exit_reason == KVM_EXIT_IO) {
+			switch (get_ucall(vcpu, &uc)) {
+			case UCALL_ABORT:
+				REPORT_GUEST_ASSERT(uc);
+				/* NOT REACHED */
+			case UCALL_SYNC:
+				continue;
+			case UCALL_DONE:
+				goto done;
+			default:
+				TEST_FAIL("Unknown ucall 0x%lx.", uc.cmd);
+			}
+		}
+
+		TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_X86_BUS_LOCK);
+	}
+done:
+	kvm_vm_free(vm);
+	return 0;
+}
-- 
2.34.1



* Re: [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
  2025-03-24 13:02 ` [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold Manali Shukla
@ 2025-03-24 21:56   ` Borislav Petkov
  2025-04-09  6:00     ` Manali Shukla
  0 siblings, 1 reply; 22+ messages in thread
From: Borislav Petkov @ 2025-03-24 21:56 UTC (permalink / raw)
  To: Manali Shukla
  Cc: kvm, linux-kselftest, seanjc, pbonzini, nikunj, thomas.lendacky

On Mon, Mar 24, 2025 at 01:02:45PM +0000, Manali Shukla wrote:
> Misbehaving guests can cause bus locks to degrade the performance of
> the system. The Bus Lock Threshold feature can be used to address this
> issue by providing the hypervisor the capability to limit the guest's
> ability to generate bus locks, thereby preventing system slowdown due
> to performance penalties.
> 
> When the Bus Lock Threshold feature is enabled, the processor checks
> the bus lock threshold count before executing the bus lock and decides
> whether to trigger a bus lock exit.
> 
> A bus lock threshold count of '0' generates a bus lock exit; if the
> count is greater than '0', the bus lock is executed successfully and
> the bus lock threshold count is decremented.
> 
> Presence of the Bus Lock threshold feature is indicated via CPUID
> function 0x8000000A_EDX[29].
> 
> Signed-off-by: Manali Shukla <manali.shukla@amd.com>
> ---
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 8f8aaf94dc00..a3ab8d1f5c17 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -386,6 +386,7 @@
>  #define X86_FEATURE_V_SPEC_CTRL		(15*32+20) /* "v_spec_ctrl" Virtual SPEC_CTRL */
>  #define X86_FEATURE_VNMI		(15*32+25) /* "vnmi" Virtual NMI */
>  #define X86_FEATURE_SVME_ADDR_CHK	(15*32+28) /* SVME addr check */
> +#define X86_FEATURE_BUS_LOCK_THRESHOLD	(15*32+29) /* "buslock" Bus lock threshold */
							      ^^^^^^^^^

Please read this before you do stuff like that:

Documentation/arch/x86/cpuinfo.rst

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
  2025-03-24 21:56   ` Borislav Petkov
@ 2025-04-09  6:00     ` Manali Shukla
  2025-04-09  9:21       ` Borislav Petkov
  0 siblings, 1 reply; 22+ messages in thread
From: Manali Shukla @ 2025-04-09  6:00 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: kvm, linux-kselftest, seanjc, pbonzini, nikunj, thomas.lendacky

Hi Boris,

On 3/25/2025 3:26 AM, Borislav Petkov wrote:
> On Mon, Mar 24, 2025 at 01:02:45PM +0000, Manali Shukla wrote:
>> Misbehaving guests can cause bus locks to degrade the performance of
>> the system. The Bus Lock Threshold feature can be used to address this
>> issue by giving the hypervisor the capability to limit a guest's
>> ability to generate bus locks, thereby preventing system slowdown due
>> to performance penalties.
>>
>> When the Bus Lock Threshold feature is enabled, the processor checks
>> the bus lock threshold count before executing a bus lock and decides
>> whether to trigger a bus lock exit.
>>
>> A bus lock threshold count of '0' generates bus lock exits; if the
>> value is greater than '0', the bus lock is executed successfully and
>> the bus lock threshold count is decremented.
>>
>> Presence of the Bus Lock threshold feature is indicated via CPUID
>> function 0x8000000A_EDX[29].
>>
>> Signed-off-by: Manali Shukla <manali.shukla@amd.com>
>> ---
>>  arch/x86/include/asm/cpufeatures.h | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
>> index 8f8aaf94dc00..a3ab8d1f5c17 100644
>> --- a/arch/x86/include/asm/cpufeatures.h
>> +++ b/arch/x86/include/asm/cpufeatures.h
>> @@ -386,6 +386,7 @@
>>  #define X86_FEATURE_V_SPEC_CTRL		(15*32+20) /* "v_spec_ctrl" Virtual SPEC_CTRL */
>>  #define X86_FEATURE_VNMI		(15*32+25) /* "vnmi" Virtual NMI */
>>  #define X86_FEATURE_SVME_ADDR_CHK	(15*32+28) /* SVME addr check */
>> +#define X86_FEATURE_BUS_LOCK_THRESHOLD	(15*32+29) /* "buslock" Bus lock threshold */
> 							      ^^^^^^^^^
> 
> Please read this before you do stuff like that:
> 
> Documentation/arch/x86/cpuinfo.rst
> 
> Thx.
> 

Do you have concern with the decision to expose the flag to /proc/cpuinfo?

The decision to expose the flag to /proc/cpuinfo was already discussed in 
[v1]. As suggested in the discussion [v1], I have added "buslock" to be
enumerated in /proc/cpuinfo.

However, if your concern is with the name, I can change the exposed name from
"buslock" to "bus_lock_threshold".

- Manali



* Re: [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
  2025-04-09  6:00     ` Manali Shukla
@ 2025-04-09  9:21       ` Borislav Petkov
  2025-04-10 23:25         ` Sean Christopherson
  0 siblings, 1 reply; 22+ messages in thread
From: Borislav Petkov @ 2025-04-09  9:21 UTC (permalink / raw)
  To: Manali Shukla
  Cc: kvm, linux-kselftest, seanjc, pbonzini, nikunj, thomas.lendacky

On Wed, Apr 09, 2025 at 11:30:56AM +0530, Manali Shukla wrote:
> Do you have concern with the decision to expose the flag to /proc/cpuinfo?
> 
> The decision to expose the flag to /proc/cpuinfo was already discussed in 
> [v1]. As suggested in the discussion [v1], I have added "buslock" to be
> enumerated in /proc/cpuinfo.

If you mean this:

https://lore.kernel.org/kvm/20240709175145.9986-4-manali.shukla@amd.com/T/#m1dc7e08c54dd91d53a6bc5b1ed0a6721b356a756

I don't see any conclusion there.

I see Sean wanting to export information out of KVM to say what it supports
without adding user ABI and I don't know how far that has come but what you're
doing here ain't it.

Just do:

#define X86_FEATURE_BUS_LOCK_THRESHOLD    (15*32+29) /* Bus lock threshold */

and then later KVM can export this however it prefers.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
  2025-04-09  9:21       ` Borislav Petkov
@ 2025-04-10 23:25         ` Sean Christopherson
  2025-04-23  5:58           ` Manali Shukla
  0 siblings, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2025-04-10 23:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Manali Shukla, kvm, linux-kselftest, pbonzini, nikunj,
	thomas.lendacky

On Wed, Apr 09, 2025, Borislav Petkov wrote:
> On Wed, Apr 09, 2025 at 11:30:56AM +0530, Manali Shukla wrote:
> > Do you have concern with the decision to expose the flag to /proc/cpuinfo?
> > 
> > The decision to expose the flag to /proc/cpuinfo was already discussed in 
> > [v1]. As suggested in the discussion [v1], I have added "buslock" to be
> > enumerated in /proc/cpuinfo.
> 
> If you mean this:
> 
> https://lore.kernel.org/kvm/20240709175145.9986-4-manali.shukla@amd.com/T/#m1dc7e08c54dd91d53a6bc5b1ed0a6721b356a756
> 
> I don't see any conclusion there.
> 
> I see Sean wanting to export information out of KVM to say what it supports
> without adding user ABI and I don't know how far that has come

Not very far :-(

> but what you're
> doing here ain't it.
> 
> Just do:
> 
> #define X86_FEATURE_BUS_LOCK_THRESHOLD    (15*32+29) /* Bus lock threshold */
> 
> and then later KVM can export this however it prefers.

+1


* Re: [PATCH v4 3/5] KVM: SVM: Enable Bus lock threshold exit
  2025-03-24 13:02 ` [PATCH v4 3/5] KVM: SVM: Enable Bus lock threshold exit Manali Shukla
@ 2025-04-16  6:00   ` Xiaoyao Li
  2025-04-23  6:15     ` Manali Shukla
  0 siblings, 1 reply; 22+ messages in thread
From: Xiaoyao Li @ 2025-04-16  6:00 UTC (permalink / raw)
  To: Manali Shukla, kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, bp

On 3/24/2025 9:02 PM, Manali Shukla wrote:
> From: Nikunj A Dadhania <nikunj@amd.com>
> 
> Virtual machines can exploit bus locks to degrade the performance of
> the system. Bus locks can be caused by non-WB (write-back) and
> misaligned locked RMW (read-modify-write) instructions and require
> system-wide synchronization among all processors, which can result in
> significant performance penalties.
> 
> To address this issue, the Bus Lock Threshold feature is introduced to
> provide the hypervisor with the ability to restrict guests' capability
> of initiating multiple bus locks, thereby preventing system slowdowns.
> 
> Support for the bus lock threshold is indicated via CPUID function
> 0x8000000A_EDX[29].
> 
> On the processors that support the Bus Lock Threshold feature, the
> VMCB provides a Bus Lock Threshold enable bit and an unsigned 16-bit
> Bus Lock threshold count.
> 
> VMCB intercept bit
> VMCB Offset     Bits    Function
> 14h             5       Intercept bus lock operations
> 
> Bus lock threshold count
> VMCB Offset     Bits    Function
> 120h            15:0    Bus lock counter
> 
> When a VMRUN instruction is executed, the bus lock threshold count is
> loaded into an internal count register. Before the processor executes
> a bus lock in the guest, it checks the value of this register:
> 
>   - If the value is greater than '0', the processor successfully
>     executes the bus lock and decrements the count.
> 
>   - If the value is '0', the bus lock is not executed, and a #VMEXIT to
>     the VMM is taken.
> 
> The bus lock threshold #VMEXIT is reported to the VMM with the VMEXIT
> code A5h, SVM_EXIT_BUS_LOCK.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Co-developed-by: Manali Shukla <manali.shukla@amd.com>
> Signed-off-by: Manali Shukla <manali.shukla@amd.com>
> ---
>   arch/x86/include/asm/svm.h      | 5 ++++-
>   arch/x86/include/uapi/asm/svm.h | 2 ++
>   arch/x86/kvm/svm/svm.c          | 8 ++++++++
>   3 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 9b7fa99ae951..9dc54da1835a 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -116,6 +116,7 @@ enum {
>   	INTERCEPT_INVPCID,
>   	INTERCEPT_MCOMMIT,
>   	INTERCEPT_TLBSYNC,
> +	INTERCEPT_BUSLOCK,
>   	INTERCEPT_IDLE_HLT = 166,
>   };
>   
> @@ -159,7 +160,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
>   	u64 avic_physical_id;	/* Offset 0xf8 */
>   	u8 reserved_7[8];
>   	u64 vmsa_pa;		/* Used for an SEV-ES guest */
> -	u8 reserved_8[720];
> +	u8 reserved_8[16];
> +	u16 bus_lock_counter;	/* Offset 0x120 */
> +	u8 reserved_9[702];
>   	/*
>   	 * Offset 0x3e0, 32 bytes reserved
>   	 * for use by hypervisor/software.
> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
> index ec1321248dac..9c640a521a67 100644
> --- a/arch/x86/include/uapi/asm/svm.h
> +++ b/arch/x86/include/uapi/asm/svm.h
> @@ -95,6 +95,7 @@
>   #define SVM_EXIT_CR14_WRITE_TRAP		0x09e
>   #define SVM_EXIT_CR15_WRITE_TRAP		0x09f
>   #define SVM_EXIT_INVPCID       0x0a2
> +#define SVM_EXIT_BUS_LOCK			0x0a5
>   #define SVM_EXIT_IDLE_HLT      0x0a6
>   #define SVM_EXIT_NPF           0x400
>   #define SVM_EXIT_AVIC_INCOMPLETE_IPI		0x401
> @@ -225,6 +226,7 @@
>   	{ SVM_EXIT_CR4_WRITE_TRAP,	"write_cr4_trap" }, \
>   	{ SVM_EXIT_CR8_WRITE_TRAP,	"write_cr8_trap" }, \
>   	{ SVM_EXIT_INVPCID,     "invpcid" }, \
> +	{ SVM_EXIT_BUS_LOCK,     "buslock" }, \
>   	{ SVM_EXIT_IDLE_HLT,     "idle-halt" }, \
>   	{ SVM_EXIT_NPF,         "npf" }, \
>   	{ SVM_EXIT_AVIC_INCOMPLETE_IPI,		"avic_incomplete_ipi" }, \
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 8abeab91d329..244e099e7262 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1375,6 +1375,9 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
>   		svm->vmcb->control.int_ctl |= V_GIF_ENABLE_MASK;
>   	}
>   
> +	if (vcpu->kvm->arch.bus_lock_detection_enabled)
> +		svm_set_intercept(svm, INTERCEPT_BUSLOCK);
> +
>   	if (sev_guest(vcpu->kvm))
>   		sev_init_vmcb(svm);
>   
> @@ -5299,6 +5302,11 @@ static __init void svm_set_cpu_caps(void)
>   		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>   	}
>   
> +	if (cpu_feature_enabled(X86_FEATURE_BUS_LOCK_THRESHOLD)) {
> +		pr_info("Bus Lock Threshold supported\n");

It will be printed every time the kvm-amd.ko module gets loaded.

I think it's there for development and debugging. Compared to the
existing features in svm_set_cpu_caps(), nothing makes BUS_LOCK_THRESHOLD
special enough to require a kernel message. So I think we can just
remove it.

> +		kvm_caps.has_bus_lock_exit = true;

Besides, this patch doesn't ensure bisectability. It allows
userspace to enable KVM_BUS_LOCK_DETECTION_EXIT and set the
INTERCEPT_BUSLOCK intercept without providing the handler.

So either move the next patch before this one or just merge them.

> +	}
> +
>   	/* CPUID 0x80000008 */
>   	if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) ||
>   	    boot_cpu_has(X86_FEATURE_AMD_SSBD))



* Re: [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs
  2025-03-24 13:02 ` [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs Manali Shukla
@ 2025-04-16  6:14   ` Xiaoyao Li
  2025-04-23 11:26     ` Manali Shukla
  2025-04-23 15:30     ` Sean Christopherson
  2025-04-23 15:44   ` Sean Christopherson
  1 sibling, 2 replies; 22+ messages in thread
From: Xiaoyao Li @ 2025-04-16  6:14 UTC (permalink / raw)
  To: Manali Shukla, kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, bp

On 3/24/2025 9:02 PM, Manali Shukla wrote:
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 5fe84f2427b5..f7c925aa0c4f 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7909,6 +7909,25 @@ apply some other policy-based mitigation. When exiting to userspace, KVM sets
>   KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and conditionally sets the exit_reason
>   to KVM_EXIT_X86_BUS_LOCK.
>   
> +Note! KVM_CAP_X86_BUS_LOCK_EXIT on AMD CPUs with the Bus Lock Threshold is close
> +enough  to INTEL's Bus Lock Detection VM-Exit to allow using
> +KVM_CAP_X86_BUS_LOCK_EXIT for AMD CPUs.
> +
> +The biggest difference between the two features is that Threshold (AMD CPUs) is
> +fault-like i.e. the bus lock exit to user space occurs with RIP pointing at the
> +offending instruction, whereas Detection (Intel CPUs) is trap-like i.e. the bus
> +lock exit to user space occurs with RIP pointing at the instruction right after
> +the guilty one.
>


> +The bus lock threshold on AMD CPUs provides a per-VMCB counter which is
> +decremented every time a bus lock occurs, and a VM-Exit is triggered if and only
> +if the bus lock counter is '0'.
> +
> +To provide Detection-like semantics for AMD CPUs, the bus lock counter has been
> +initialized to '0', i.e. exit on every bus lock, and when re-executing the
> +guilty instruction, the bus lock counter has been set to '1' to effectively step
> +past the instruction.

From the perspective of the API, I don't think the last two paragraphs
matter much to userspace.

It should describe what userspace can/should do. E.g., on an exit to
userspace due to a bus lock on an AMD platform, RIP points at the
instruction that caused the bus lock. Userspace can advance RIP itself
before re-entering the guest to make progress. If userspace doesn't
change RIP, KVM can handle it internally by ensuring that re-execution
of the instruction doesn't trigger another bus lock VM exit, allowing
progress.
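[Editorial note: the counter semantics discussed above can be restated as a tiny state machine. The C sketch below is illustrative only — `struct buslock_state`, `try_bus_lock()`, and `run_one_bus_lock()` are hypothetical names, not KVM or kernel code — but it models the behavior described in this thread: a counter of zero forces SVM_EXIT_BUS_LOCK with RIP still pointing at the offending instruction, after which the VMM (or KVM internally) re-arms the counter to 1 so re-execution makes forward progress.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model (not KVM source) of the Bus Lock Threshold
 * semantics discussed in this thread.  The counter mirrors the
 * unsigned 16-bit VMCB field at offset 0x120 loaded at VMRUN.
 */
struct buslock_state {
	uint16_t counter;	/* bus lock threshold count */
	unsigned long exits;	/* SVM_EXIT_BUS_LOCK events taken */
};

/* Returns true if the guest's bus lock executes, false on #VMEXIT. */
static bool try_bus_lock(struct buslock_state *s)
{
	if (s->counter > 0) {
		s->counter--;	/* lock executes, count is decremented */
		return true;
	}
	s->exits++;		/* counter == 0: exit, RIP unchanged */
	return false;
}

/* Detection-like policy: exit on every bus lock, then step past it. */
static bool run_one_bus_lock(struct buslock_state *s)
{
	if (!try_bus_lock(s))	/* fault-like exit: RIP at the bus lock */
		s->counter = 1;	/* allow exactly one bus lock on re-entry */
	return try_bus_lock(s);	/* re-execution now makes progress */
}
```

With this model, a counter initialized to '0' means the first attempt always exits and the re-execution succeeds while leaving the counter back at '0', i.e. the next bus lock exits again.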


* Re: [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold
  2025-04-10 23:25         ` Sean Christopherson
@ 2025-04-23  5:58           ` Manali Shukla
  0 siblings, 0 replies; 22+ messages in thread
From: Manali Shukla @ 2025-04-23  5:58 UTC (permalink / raw)
  To: Sean Christopherson, Borislav Petkov
  Cc: kvm, linux-kselftest, pbonzini, nikunj, thomas.lendacky

Thank you for reviewing my patches.

On 4/11/2025 4:55 AM, Sean Christopherson wrote:
> On Wed, Apr 09, 2025, Borislav Petkov wrote:
>> On Wed, Apr 09, 2025 at 11:30:56AM +0530, Manali Shukla wrote:
>>> Do you have concern with the decision to expose the flag to /proc/cpuinfo?
>>>
>>> The decision to expose the flag to /proc/cpuinfo was already discussed in 
>>> [v1]. As suggested in the discussion [v1], I have added "buslock" to be
>>> enumerated in /proc/cpuinfo.
>>
>> If you mean this:
>>
>> https://lore.kernel.org/kvm/20240709175145.9986-4-manali.shukla@amd.com/T/#m1dc7e08c54dd91d53a6bc5b1ed0a6721b356a756
>>
>> I don't see any conclusion there.
>>
>> I see Sean wanting to export information out of KVM to say what it supports
>> without adding user ABI and I don't know how far that has come
> 
> Not very far :-(
> 
>> but what you're
>> doing here ain't it.
>>
>> Just do:
>>
>> #define X86_FEATURE_BUS_LOCK_THRESHOLD    (15*32+29) /* Bus lock threshold */
>>
>> and then later KVM can export this however it prefers.
> 
> +1

Ack. I will remove "buslock". 

-Manali


* Re: [PATCH v4 3/5] KVM: SVM: Enable Bus lock threshold exit
  2025-04-16  6:00   ` Xiaoyao Li
@ 2025-04-23  6:15     ` Manali Shukla
  2025-04-23 15:29       ` Sean Christopherson
  0 siblings, 1 reply; 22+ messages in thread
From: Manali Shukla @ 2025-04-23  6:15 UTC (permalink / raw)
  To: Xiaoyao Li, kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, bp

Hi Xiaoyao,

Thank you for reviewing my patches.

On 4/16/2025 11:30 AM, Xiaoyao Li wrote:
> On 3/24/2025 9:02 PM, Manali Shukla wrote:
>> From: Nikunj A Dadhania <nikunj@amd.com>
>>
>> Virtual machines can exploit bus locks to degrade the performance of
>> the system. Bus locks can be caused by non-WB (write-back) and
>> misaligned locked RMW (read-modify-write) instructions and require
>> system-wide synchronization among all processors, which can result in
>> significant performance penalties.
>>
>> To address this issue, the Bus Lock Threshold feature is introduced to
>> provide the hypervisor with the ability to restrict guests' capability
>> of initiating multiple bus locks, thereby preventing system slowdowns.
>>
>> Support for the bus lock threshold is indicated via CPUID function
>> 0x8000000A_EDX[29].
>>
>> On the processors that support the Bus Lock Threshold feature, the
>> VMCB provides a Bus Lock Threshold enable bit and an unsigned 16-bit
>> Bus Lock threshold count.
>>
>> VMCB intercept bit
>> VMCB Offset     Bits    Function
>> 14h             5       Intercept bus lock operations
>>
>> Bus lock threshold count
>> VMCB Offset     Bits    Function
>> 120h            15:0    Bus lock counter
>>
>> When a VMRUN instruction is executed, the bus lock threshold count is
>> loaded into an internal count register. Before the processor executes
>> a bus lock in the guest, it checks the value of this register:
>>
>>   - If the value is greater than '0', the processor successfully
>>     executes the bus lock and decrements the count.
>>
>>   - If the value is '0', the bus lock is not executed, and a #VMEXIT to
>>     the VMM is taken.
>>
>> The bus lock threshold #VMEXIT is reported to the VMM with the VMEXIT
>> code A5h, SVM_EXIT_BUS_LOCK.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> Co-developed-by: Manali Shukla <manali.shukla@amd.com>
>> Signed-off-by: Manali Shukla <manali.shukla@amd.com>
>> ---
>>   arch/x86/include/asm/svm.h      | 5 ++++-
>>   arch/x86/include/uapi/asm/svm.h | 2 ++
>>   arch/x86/kvm/svm/svm.c          | 8 ++++++++
>>   3 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
>> index 9b7fa99ae951..9dc54da1835a 100644
>> --- a/arch/x86/include/asm/svm.h
>> +++ b/arch/x86/include/asm/svm.h
>> @@ -116,6 +116,7 @@ enum {
>>       INTERCEPT_INVPCID,
>>       INTERCEPT_MCOMMIT,
>>       INTERCEPT_TLBSYNC,
>> +    INTERCEPT_BUSLOCK,
>>       INTERCEPT_IDLE_HLT = 166,
>>   };
>>   @@ -159,7 +160,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
>>       u64 avic_physical_id;    /* Offset 0xf8 */
>>       u8 reserved_7[8];
>>       u64 vmsa_pa;        /* Used for an SEV-ES guest */
>> -    u8 reserved_8[720];
>> +    u8 reserved_8[16];
>> +    u16 bus_lock_counter;    /* Offset 0x120 */
>> +    u8 reserved_9[702];
>>       /*
>>        * Offset 0x3e0, 32 bytes reserved
>>        * for use by hypervisor/software.
>> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
>> index ec1321248dac..9c640a521a67 100644
>> --- a/arch/x86/include/uapi/asm/svm.h
>> +++ b/arch/x86/include/uapi/asm/svm.h
>> @@ -95,6 +95,7 @@
>>   #define SVM_EXIT_CR14_WRITE_TRAP        0x09e
>>   #define SVM_EXIT_CR15_WRITE_TRAP        0x09f
>>   #define SVM_EXIT_INVPCID       0x0a2
>> +#define SVM_EXIT_BUS_LOCK            0x0a5
>>   #define SVM_EXIT_IDLE_HLT      0x0a6
>>   #define SVM_EXIT_NPF           0x400
>>   #define SVM_EXIT_AVIC_INCOMPLETE_IPI        0x401
>> @@ -225,6 +226,7 @@
>>       { SVM_EXIT_CR4_WRITE_TRAP,    "write_cr4_trap" }, \
>>       { SVM_EXIT_CR8_WRITE_TRAP,    "write_cr8_trap" }, \
>>       { SVM_EXIT_INVPCID,     "invpcid" }, \
>> +    { SVM_EXIT_BUS_LOCK,     "buslock" }, \
>>       { SVM_EXIT_IDLE_HLT,     "idle-halt" }, \
>>       { SVM_EXIT_NPF,         "npf" }, \
>>       { SVM_EXIT_AVIC_INCOMPLETE_IPI,        "avic_incomplete_ipi" }, \
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index 8abeab91d329..244e099e7262 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -1375,6 +1375,9 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
>>           svm->vmcb->control.int_ctl |= V_GIF_ENABLE_MASK;
>>       }
>>   +    if (vcpu->kvm->arch.bus_lock_detection_enabled)
>> +        svm_set_intercept(svm, INTERCEPT_BUSLOCK);
>> +
>>       if (sev_guest(vcpu->kvm))
>>           sev_init_vmcb(svm);
>>   @@ -5299,6 +5302,11 @@ static __init void svm_set_cpu_caps(void)
>>           kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>>       }
>>   +    if (cpu_feature_enabled(X86_FEATURE_BUS_LOCK_THRESHOLD)) {
>> +        pr_info("Bus Lock Threshold supported\n");
> 
> It will be printed every time the kvm-amd.ko module gets loaded.
> 
> I think it's there for development and debugging. Compared to the existing features in svm_set_cpu_caps(), nothing makes BUS_LOCK_THRESHOLD special enough to require a kernel message. So I think we can just remove it.

I didn't add this for development and debug purposes. I added this pr_info() to make it easy to find from dmesg whether the Bus Lock Threshold is supported.
I can remove it if you think it is not required.
 
> 
>> +        kvm_caps.has_bus_lock_exit = true;
> 
> Besides, this patch doesn't ensure bisectability. It allows userspace to enable KVM_BUS_LOCK_DETECTION_EXIT and set the INTERCEPT_BUSLOCK intercept without providing the handler.
> 
> So either move the next patch before this one or just merge them.
> 

Oh, my bad. I will move the next patch before this one in v5.

>> +    }
>> +
>>       /* CPUID 0x80000008 */
>>       if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) ||
>>           boot_cpu_has(X86_FEATURE_AMD_SSBD))
> 

-Manali


* Re: [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs
  2025-04-16  6:14   ` Xiaoyao Li
@ 2025-04-23 11:26     ` Manali Shukla
  2025-04-23 15:30     ` Sean Christopherson
  1 sibling, 0 replies; 22+ messages in thread
From: Manali Shukla @ 2025-04-23 11:26 UTC (permalink / raw)
  To: Xiaoyao Li, kvm, linux-kselftest
  Cc: seanjc, pbonzini, nikunj, thomas.lendacky, bp

Hi Xiaoyao,

Thanks for reviewing my patches.

On 4/16/2025 11:44 AM, Xiaoyao Li wrote:
> On 3/24/2025 9:02 PM, Manali Shukla wrote:
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index 5fe84f2427b5..f7c925aa0c4f 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -7909,6 +7909,25 @@ apply some other policy-based mitigation. When exiting to userspace, KVM sets
>>   KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and conditionally sets the exit_reason
>>   to KVM_EXIT_X86_BUS_LOCK.
>>   +Note! KVM_CAP_X86_BUS_LOCK_EXIT on AMD CPUs with the Bus Lock Threshold is close
>> +enough  to INTEL's Bus Lock Detection VM-Exit to allow using
>> +KVM_CAP_X86_BUS_LOCK_EXIT for AMD CPUs.
>> +
>> +The biggest difference between the two features is that Threshold (AMD CPUs) is
>> +fault-like i.e. the bus lock exit to user space occurs with RIP pointing at the
>> +offending instruction, whereas Detection (Intel CPUs) is trap-like i.e. the bus
>> +lock exit to user space occurs with RIP pointing at the instruction right after
>> +the guilty one.
>>
> 
> 
>> +The bus lock threshold on AMD CPUs provides a per-VMCB counter which is
>> +decremented every time a bus lock occurs, and a VM-Exit is triggered if and only
>> +if the bus lock counter is '0'.
>> +
>> +To provide Detection-like semantics for AMD CPUs, the bus lock counter has been
>> +initialized to '0', i.e. exit on every bus lock, and when re-executing the
>> +guilty instruction, the bus lock counter has been set to '1' to effectively step
>> +past the instruction.
> 
> From the perspective of the API, I don't think the last two paragraphs matter much to userspace.
> 
> It should describe what userspace can/should do. E.g., on an exit to userspace due to a bus lock on an AMD platform, RIP points at the instruction that caused the bus lock. Userspace can advance RIP itself before re-entering the guest to make progress. If userspace doesn't change RIP, KVM can handle it internally by ensuring that re-execution of the instruction doesn't trigger another bus lock VM exit, allowing progress.

Sure.

-Manali



* Re: [PATCH v4 1/5] KVM: x86: Preparatory patch to move linear_rip out of kvm_pio_request
  2025-03-24 13:02 ` [PATCH v4 1/5] KVM: x86: Preparatory patch to move linear_rip out of kvm_pio_request Manali Shukla
@ 2025-04-23 15:22   ` Sean Christopherson
  0 siblings, 0 replies; 22+ messages in thread
From: Sean Christopherson @ 2025-04-23 15:22 UTC (permalink / raw)
  To: Manali Shukla; +Cc: kvm, linux-kselftest, pbonzini, nikunj, thomas.lendacky, bp

On Mon, Mar 24, 2025, Manali Shukla wrote:
> Add a refactoring prep patch to move linear_rip out of kvm_pio_request
> and place it next to complete_userspace_io.  There's nothing port I/O
> specific about linear_rip field, it just so happens to that port I/O is the
> only case where KVM's ABI is to let userspace stuff state (to emulate
> RESET) without first completing the I/O instruction.

The shortlog+changelog needs to state what the change is, not what the human that
wrote the code is doing.  And the changelog needs to state the motivation.  Yes,
the behavior of linear_rip isn't PIO specific, but the field is obviously specific
to PIO, and has been for years, i.e. there's a very good reason why the field is
in kvm_pio_request.

  KVM: x86: Make kvm_pio_request.linear_rip a common field for user exits

  Move and rename kvm_pio_request.linear_rip to kvm_vcpu_arch.cui_linear_rip
  so that the field can be used by other userspace exit completion flows
  that need to take action if and only if userspace has not modified RIP.


* Re: [PATCH v4 3/5] KVM: SVM: Enable Bus lock threshold exit
  2025-04-23  6:15     ` Manali Shukla
@ 2025-04-23 15:29       ` Sean Christopherson
  2025-04-30 11:15         ` Manali Shukla
  0 siblings, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2025-04-23 15:29 UTC (permalink / raw)
  To: Manali Shukla
  Cc: Xiaoyao Li, kvm, linux-kselftest, pbonzini, nikunj,
	thomas.lendacky, bp

On Wed, Apr 23, 2025, Manali Shukla wrote:
> On 4/16/2025 11:30 AM, Xiaoyao Li wrote:
> >>   +    if (cpu_feature_enabled(X86_FEATURE_BUS_LOCK_THRESHOLD)) {
> >> +        pr_info("Bus Lock Threshold supported\n");
> > 
> > It will be printed every time the kvm-amd.ko module gets loaded.
> > 
> > I think it's there for development and debugging. Compared to the
> > existing features in svm_set_cpu_caps(), nothing makes BUS_LOCK_THRESHOLD
> > special enough to require a kernel message. So I think we can just
> > remove it.
> 
> I didn't add this for development and debug purposes. I added this pr_info()
> to make it easy to find from dmesg whether the Bus Lock Threshold is
> supported.  I can remove it if you think it is not required.

Please remove it.  The user typically doesn't care.

> >> +        kvm_caps.has_bus_lock_exit = true;
> > 
> > Besides, this patch doesn't ensure bisectability. It allows userspace
> > to enable KVM_BUS_LOCK_DETECTION_EXIT and set the
> > INTERCEPT_BUSLOCK intercept without providing the handler.
> > 
> > So either move the next patch before this one or just merge them.
> > 
> 
> Oh, my bad. I will move the next patch before this one in v5.

No, do exactly as I suggested in v3.

 : I vote to split this into two patches: one to add the architectural collateral,
 : with the above as the changelog, and a second to actually implement support in
 : KVM.  Having the above background is useful, but it makes it quite hard to find
 : information on the KVM design and implementation.

I want this (and any other arch collateral I'm missing) in a separate patch so
that the background on what the hardware feature does is captured.  But I see no
reason to split KVM's implementation into multiple patches.

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 2b59b9951c90..d1819c564b1c 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -116,6 +116,7 @@ enum {
        INTERCEPT_INVPCID,
        INTERCEPT_MCOMMIT,
        INTERCEPT_TLBSYNC,
+       INTERCEPT_BUSLOCK,
 };


@@ -158,7 +159,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
        u64 avic_physical_id;   /* Offset 0xf8 */
        u8 reserved_7[8];
        u64 vmsa_pa;            /* Used for an SEV-ES guest */
-       u8 reserved_8[720];
+       u8 reserved_8[16];
+       u16 bus_lock_counter;   /* Offset 0x120 */
+       u8 reserved_9[702];
        /*
         * Offset 0x3e0, 32 bytes reserved
         * for use by hypervisor/software.
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 1814b413fd57..abf6aed88cee 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -95,6 +95,7 @@
 #define SVM_EXIT_CR14_WRITE_TRAP               0x09e
 #define SVM_EXIT_CR15_WRITE_TRAP               0x09f
 #define SVM_EXIT_INVPCID       0x0a2
+#define SVM_EXIT_BUS_LOCK                      0x0a5
 #define SVM_EXIT_NPF           0x400
 #define SVM_EXIT_AVIC_INCOMPLETE_IPI           0x401
 #define SVM_EXIT_AVIC_UNACCELERATED_ACCESS     0x402
@@ -224,6 +225,7 @@
        { SVM_EXIT_CR4_WRITE_TRAP,      "write_cr4_trap" }, \
        { SVM_EXIT_CR8_WRITE_TRAP,      "write_cr8_trap" }, \
        { SVM_EXIT_INVPCID,     "invpcid" }, \
+       { SVM_EXIT_BUS_LOCK,     "buslock" }, \
        { SVM_EXIT_NPF,         "npf" }, \
        { SVM_EXIT_AVIC_INCOMPLETE_IPI,         "avic_incomplete_ipi" }, \
        { SVM_EXIT_AVIC_UNACCELERATED_ACCESS,   "avic_unaccelerated_access" }, \



* Re: [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs
  2025-04-16  6:14   ` Xiaoyao Li
  2025-04-23 11:26     ` Manali Shukla
@ 2025-04-23 15:30     ` Sean Christopherson
  2025-04-30 11:18       ` Manali Shukla
  1 sibling, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2025-04-23 15:30 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Manali Shukla, kvm, linux-kselftest, pbonzini, nikunj,
	thomas.lendacky, bp

On Wed, Apr 16, 2025, Xiaoyao Li wrote:
> On 3/24/2025 9:02 PM, Manali Shukla wrote:
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 5fe84f2427b5..f7c925aa0c4f 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -7909,6 +7909,25 @@ apply some other policy-based mitigation. When exiting to userspace, KVM sets
> >   KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and conditionally sets the exit_reason
> >   to KVM_EXIT_X86_BUS_LOCK.
> > +Note! KVM_CAP_X86_BUS_LOCK_EXIT on AMD CPUs with the Bus Lock Threshold is close
> > +enough  to INTEL's Bus Lock Detection VM-Exit to allow using
> > +KVM_CAP_X86_BUS_LOCK_EXIT for AMD CPUs.
> > +
> > +The biggest difference between the two features is that Threshold (AMD CPUs) is
> > +fault-like i.e. the bus lock exit to user space occurs with RIP pointing at the
> > +offending instruction, whereas Detection (Intel CPUs) is trap-like i.e. the bus
> > +lock exit to user space occurs with RIP pointing at the instruction right after
> > +the guilty one.
> > 
> 
> 
> > +The bus lock threshold on AMD CPUs provides a per-VMCB counter which is
> > +decremented every time a bus lock occurs, and a VM-Exit is triggered if and only
> > +if the bus lock counter is '0'.
> > +
> > +To provide Detection-like semantics for AMD CPUs, the bus lock counter has been
> > +initialized to '0', i.e. exit on every bus lock, and when re-executing the
> > +guilty instruction, the bus lock counter has been set to '1' to effectively step
> > +past the instruction.
> 
> From the perspective of API, I don't think the last two paragraphs matter
> much to userspace.
> 
> It should describe what userspace can/should do. E.g., when the exit to
> userspace is due to a bus lock on an AMD platform, the RIP points at the
> instruction which caused the bus lock. Userspace can advance the RIP itself
> before re-entering the guest to make progress. If userspace doesn't change
> the RIP, KVM can handle it internally by ensuring that re-execution of the
> instruction doesn't trigger a bus lock VM exit, allowing progress.

Agreed.  It's not just the last two paragraphs, it's the entire doc update.

The existing documentation very carefully doesn't say anything about *how* the
feature is implemented on Intel, so I don't see any reason to mention or compare
Bus Lock Threshold vs. Bus Lock Detection.  As Xiaoyao said, simply state what
is different.

And I would definitely not say anything about whether or not userspace can advance
RIP, as doing so will likely crash/corrupt the guest.  KVM sets bus_lock_counter
to allow forward progress, KVM does NOT skip RIP.

All in all, I think the only thing that needs to be called out is that RIP will
point to the next instruction on Intel, but to the offending instruction on AMD.

Unless I'm missing a detail, I think it's just this:

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 5fe84f2427b5..d9788f9152f1 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7909,6 +7909,11 @@ apply some other policy-based mitigation. When exiting to userspace, KVM sets
 KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and conditionally sets the exit_reason
 to KVM_EXIT_X86_BUS_LOCK.
 
+Due to differences in the underlying hardware implementation, the vCPU's RIP at
+the time of exit diverges between Intel and AMD.  On Intel hosts, RIP points at
+the next instruction, i.e. the exit is trap-like.  On AMD hosts, RIP points at
+the offending instruction, i.e. the exit is fault-like.
+
 Note! Detected bus locks may be coincident with other exits to userspace, i.e.
 KVM_RUN_X86_BUS_LOCK should be checked regardless of the primary exit reason if
 userspace wants to take action on all detected bus locks.


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs
  2025-03-24 13:02 ` [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs Manali Shukla
  2025-04-16  6:14   ` Xiaoyao Li
@ 2025-04-23 15:44   ` Sean Christopherson
  2025-04-30 11:30     ` Manali Shukla
  1 sibling, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2025-04-23 15:44 UTC (permalink / raw)
  To: Manali Shukla; +Cc: kvm, linux-kselftest, pbonzini, nikunj, thomas.lendacky, bp

On Mon, Mar 24, 2025, Manali Shukla wrote:
> +	if (vmcb02->save.rip && (svm->nested.ctl.bus_lock_rip == vmcb02->save.rip)) {
> +		vmcb02->control.bus_lock_counter = 1;
> +		svm->bus_lock_rip = svm->nested.ctl.bus_lock_rip;
> +	} else {
> +		vmcb02->control.bus_lock_counter = 0;
> +	}
> +	svm->nested.ctl.bus_lock_rip = INVALID_GPA;
> +
>  	/* Done at vmrun: asid.  */
>  
>  	/* Also overwritten later if necessary.  */
> @@ -1039,6 +1069,18 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>  
>  	}
>  
> +	/*
> +	 * If bus_lock_counter is nonzero and the guest has not moved past the
> +	 * guilty instruction, save bus_lock_rip in svm_nested_state. This will
> +	 * help determine at nested VMRUN whether to stash vmcb02's counter or
> +	 * reset it to '0'.
> +	 */
> +	if (vmcb02->control.bus_lock_counter &&
> +	    svm->bus_lock_rip == vmcb02->save.rip)
> +		svm->nested.ctl.bus_lock_rip = svm->bus_lock_rip;
> +	else
> +		svm->nested.ctl.bus_lock_rip = INVALID_GPA;
> +
>  	nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
>  
>  	svm_switch_vmcb(svm, &svm->vmcb01);

...

> +static int bus_lock_exit(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +
> +	vcpu->run->exit_reason = KVM_EXIT_X86_BUS_LOCK;
> +	vcpu->run->flags |= KVM_RUN_X86_BUS_LOCK;
> +
> +	vcpu->arch.cui_linear_rip = kvm_get_linear_rip(vcpu);
> +	svm->bus_lock_rip = vcpu->arch.cui_linear_rip;
> +	vcpu->arch.complete_userspace_io = complete_userspace_buslock;
> +
> +	return 0;
> +}

> @@ -327,6 +328,7 @@ struct vcpu_svm {
>  
>  	/* Guest GIF value, used when vGIF is not enabled */
>  	bool guest_gif;
> +	u64 bus_lock_rip;

I don't think this field is necessary.  Rather than unconditionally invalidate
on nested VMRUN and then conditionally restore on nested #VMEXIT, just leave
svm->nested.ctl.bus_lock_rip set on VMRUN and conditionally invalidate on #VMEXIT.
And then in bus_lock_exit(), update the field if the exit occurred while L2 is
active.

Completely untested:

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index a42ef7dd9143..98e065a93516 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -700,13 +700,10 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
         * L1 re-enters L2, the same instruction will trigger a VM-Exit and the
         * entire cycle start over.
         */
-       if (vmcb02->save.rip && (svm->nested.ctl.bus_lock_rip == vmcb02->save.rip)) {
+       if (vmcb02->save.rip && (svm->nested.ctl.bus_lock_rip == vmcb02->save.rip))
                vmcb02->control.bus_lock_counter = 1;
-               svm->bus_lock_rip = svm->nested.ctl.bus_lock_rip;
-       } else {
+       else
                vmcb02->control.bus_lock_counter = 0;
-       }
-       svm->nested.ctl.bus_lock_rip = INVALID_GPA;
 
        /* Done at vmrun: asid.  */
 
@@ -1070,15 +1067,10 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
        }
 
        /*
-        * If bus_lock_counter is nonzero and the guest has not moved past the
-        * guilty instruction, save bus_lock_rip in svm_nested_state. This will
-        * help determine at nested VMRUN whether to stash vmcb02's counter or
-        * reset it to '0'.
+        * Invalidate bus_lock_rip unless KVM is still waiting for the guest
+        * to make forward progress before re-enabling bus lock detection.
         */
-       if (vmcb02->control.bus_lock_counter &&
-           svm->bus_lock_rip == vmcb02->save.rip)
-               svm->nested.ctl.bus_lock_rip = svm->bus_lock_rip;
-       else
+       if (!vmcb02->control.bus_lock_counter)
                svm->nested.ctl.bus_lock_rip = INVALID_GPA;
 
        nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ea12e93ae983..11ce031323fd 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3333,9 +3333,10 @@ static int bus_lock_exit(struct kvm_vcpu *vcpu)
        vcpu->run->flags |= KVM_RUN_X86_BUS_LOCK;
 
        vcpu->arch.cui_linear_rip = kvm_get_linear_rip(vcpu);
-       svm->bus_lock_rip = vcpu->arch.cui_linear_rip;
        vcpu->arch.complete_userspace_io = complete_userspace_buslock;
 
+       if (is_guest_mode(vcpu))
+               svm->nested.ctl.bus_lock_rip = vcpu->arch.cui_linear_rip;
        return 0;
 }
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7a4c5848c952..8667faccaedc 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -328,7 +328,6 @@ struct vcpu_svm {
 
        /* Guest GIF value, used when vGIF is not enabled */
        bool guest_gif;
-       u64 bus_lock_rip;
 };
 
 struct svm_cpu_data {


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 3/5] KVM: SVM: Enable Bus lock threshold exit
  2025-04-23 15:29       ` Sean Christopherson
@ 2025-04-30 11:15         ` Manali Shukla
  0 siblings, 0 replies; 22+ messages in thread
From: Manali Shukla @ 2025-04-30 11:15 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Xiaoyao Li, kvm, linux-kselftest, pbonzini, nikunj,
	thomas.lendacky, bp

Hi Sean,

Thank you for reviewing my patches.

On 4/23/2025 8:59 PM, Sean Christopherson wrote:
> On Wed, Apr 23, 2025, Manali Shukla wrote:
>> On 4/16/2025 11:30 AM, Xiaoyao Li wrote:
>>>>   +    if (cpu_feature_enabled(X86_FEATURE_BUS_LOCK_THRESHOLD)) {
>>>> +        pr_info("Bus Lock Threshold supported\n");
>>>
>>> It will be printed every time kvm-amd.ko module gets loaded.
>>>
>>> I think it's for your development and debug purpose. Comparing to the
>>> existing features in svm_set_cpu_caps(), nothing makes it special for
>>> BUS_LOCK_THRESHOLD to require a kernel message. So I think we can just
>>> remove it.
>>
>> I didn't add this for development and debug purpose. I added this pr_info()
>> to make it easy to find whether BUS Lock threshold is supported or not from
>> dmesg.  I can remove it if you think it is not required.
> 
> Please remove it.  The user typically doesn't care.
> 
>>>> +        kvm_caps.has_bus_lock_exit = true;
>>>
>>> Besides, this patch doesn't ensure the bisectability. It allows userspace
>>> to enable KVM_BUS_LOCK_DETECTION_EXIT and set intercept of
>>> INTERCEPT_BUSLOCK but without providing the handler.
>>>
>>> So either move next patch before it or just merge them.
>>>
>>
>> Oh.., my bad, I will move the next patch before this one in v5.
> 
> No, do exactly as I suggested in v3.

Sure. I will remove it from v5.

> 
>  : I vote to split this into two patches: one to add the architectural collateral,
>  : with the above as the changelog, and a second to actually implement support in
>  : KVM.  Having the above background is useful, but it makes it quite hard to find
>  : information on the KVM design and implementation.
> 
> I want this (and any other arch collateral I'm missing) in a separate patch so
> that the background on what the hardware feature does is captured.  But I see no
> reason to split KVM's implementation into multiple patches.
> 
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 2b59b9951c90..d1819c564b1c 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -116,6 +116,7 @@ enum {
>         INTERCEPT_INVPCID,
>         INTERCEPT_MCOMMIT,
>         INTERCEPT_TLBSYNC,
> +       INTERCEPT_BUSLOCK,
>  };
> 
> 
> @@ -158,7 +159,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
>         u64 avic_physical_id;   /* Offset 0xf8 */
>         u8 reserved_7[8];
>         u64 vmsa_pa;            /* Used for an SEV-ES guest */
> -       u8 reserved_8[720];
> +       u8 reserved_8[16];
> +       u16 bus_lock_counter;   /* Offset 0x120 */
> +       u8 reserved_9[702];
>         /*
>          * Offset 0x3e0, 32 bytes reserved
>          * for use by hypervisor/software.
> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
> index 1814b413fd57..abf6aed88cee 100644
> --- a/arch/x86/include/uapi/asm/svm.h
> +++ b/arch/x86/include/uapi/asm/svm.h
> @@ -95,6 +95,7 @@
>  #define SVM_EXIT_CR14_WRITE_TRAP               0x09e
>  #define SVM_EXIT_CR15_WRITE_TRAP               0x09f
>  #define SVM_EXIT_INVPCID       0x0a2
> +#define SVM_EXIT_BUS_LOCK                      0x0a5
>  #define SVM_EXIT_NPF           0x400
>  #define SVM_EXIT_AVIC_INCOMPLETE_IPI           0x401
>  #define SVM_EXIT_AVIC_UNACCELERATED_ACCESS     0x402
> @@ -224,6 +225,7 @@
>         { SVM_EXIT_CR4_WRITE_TRAP,      "write_cr4_trap" }, \
>         { SVM_EXIT_CR8_WRITE_TRAP,      "write_cr8_trap" }, \
>         { SVM_EXIT_INVPCID,     "invpcid" }, \
> +       { SVM_EXIT_BUS_LOCK,     "buslock" }, \
>         { SVM_EXIT_NPF,         "npf" }, \
>         { SVM_EXIT_AVIC_INCOMPLETE_IPI,         "avic_incomplete_ipi" }, \
>         { SVM_EXIT_AVIC_UNACCELERATED_ACCESS,   "avic_unaccelerated_access" }, \
> 
Ack.

-Manali

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs
  2025-04-23 15:30     ` Sean Christopherson
@ 2025-04-30 11:18       ` Manali Shukla
  0 siblings, 0 replies; 22+ messages in thread
From: Manali Shukla @ 2025-04-30 11:18 UTC (permalink / raw)
  To: Sean Christopherson, Xiaoyao Li
  Cc: kvm, linux-kselftest, pbonzini, nikunj, thomas.lendacky, bp

Hi Sean,

Thank you for reviewing my patches.

On 4/23/2025 9:00 PM, Sean Christopherson wrote:
> On Wed, Apr 16, 2025, Xiaoyao Li wrote:
>> On 3/24/2025 9:02 PM, Manali Shukla wrote:
>>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>>> index 5fe84f2427b5..f7c925aa0c4f 100644
>>> --- a/Documentation/virt/kvm/api.rst
>>> +++ b/Documentation/virt/kvm/api.rst
>>> @@ -7909,6 +7909,25 @@ apply some other policy-based mitigation. When exiting to userspace, KVM sets
>>>   KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and conditionally sets the exit_reason
>>>   to KVM_EXIT_X86_BUS_LOCK.
>>> +Note! KVM_CAP_X86_BUS_LOCK_EXIT on AMD CPUs with the Bus Lock Threshold is close
>>> +enough to Intel's Bus Lock Detection VM-Exit to allow using
>>> +KVM_CAP_X86_BUS_LOCK_EXIT for AMD CPUs.
>>> +
>>> +The biggest difference between the two features is that Threshold (AMD CPUs) is
>>> +fault-like i.e. the bus lock exit to user space occurs with RIP pointing at the
>>> +offending instruction, whereas Detection (Intel CPUs) is trap-like i.e. the bus
>>> +lock exit to user space occurs with RIP pointing at the instruction right after
>>> +the guilty one.
>>>
>>
>>
>>> +The bus lock threshold on AMD CPUs provides a per-VMCB counter which is
>>> +decremented every time a bus lock occurs, and a VM-Exit is triggered if and only
>>> +if the bus lock counter is '0'.
>>> +
>>> +To provide Detection-like semantics for AMD CPUs, the bus lock counter has been
>>> +initialized to '0', i.e. exit on every bus lock, and when re-executing the
>>> +guilty instruction, the bus lock counter has been set to '1' to effectively step
>>> +past the instruction.
>>
>> From the perspective of API, I don't think the last two paragraphs matter
>> much to userspace.
>>
>> It should describe what userspace can/should do. E.g., when the exit to
>> userspace is due to a bus lock on an AMD platform, the RIP points at the
>> instruction which caused the bus lock. Userspace can advance the RIP itself
>> before re-entering the guest to make progress. If userspace doesn't change
>> the RIP, KVM can handle it internally by ensuring that re-execution of the
>> instruction doesn't trigger a bus lock VM exit, allowing progress.
> 
> Agreed.  It's not just the last two paragraphs, it's the entire doc update.
> 
> The existing documentation very carefully doesn't say anything about *how* the
> feature is implemented on Intel, so I don't see any reason to mention or compare
> Bus Lock Threshold vs. Bus Lock Detection.  As Xiaoyao said, simply state what
> is different.
> 
> And I would definitely not say anything about whether or not userspace can advance
> RIP, as doing so will likely crash/corrupt the guest.  KVM sets bus_lock_counter
> to allow forward progress, KVM does NOT skip RIP.
> 
> All in all, I think the only thing that needs to be called out is that RIP will
> point to the next instruction on Intel, but to the offending instruction on AMD.
> 
> Unless I'm missing a detail, I think it's just this:
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 5fe84f2427b5..d9788f9152f1 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7909,6 +7909,11 @@ apply some other policy-based mitigation. When exiting to userspace, KVM sets
>  KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and conditionally sets the exit_reason
>  to KVM_EXIT_X86_BUS_LOCK.
>  
> +Due to differences in the underlying hardware implementation, the vCPU's RIP at
> +the time of exit diverges between Intel and AMD.  On Intel hosts, RIP points at
> +the next instruction, i.e. the exit is trap-like.  On AMD hosts, RIP points at
> +the offending instruction, i.e. the exit is fault-like.
> +
>  Note! Detected bus locks may be coincident with other exits to userspace, i.e.
>  KVM_RUN_X86_BUS_LOCK should be checked regardless of the primary exit reason if
>  userspace wants to take action on all detected bus locks.
> 

Will update in V5.

-Manali

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs
  2025-04-23 15:44   ` Sean Christopherson
@ 2025-04-30 11:30     ` Manali Shukla
  0 siblings, 0 replies; 22+ messages in thread
From: Manali Shukla @ 2025-04-30 11:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kselftest, pbonzini, nikunj, thomas.lendacky, bp

Hi Sean,

Thank you for reviewing my patches.

On 4/23/2025 9:14 PM, Sean Christopherson wrote:
> On Mon, Mar 24, 2025, Manali Shukla wrote:
>> +	if (vmcb02->save.rip && (svm->nested.ctl.bus_lock_rip == vmcb02->save.rip)) {
>> +		vmcb02->control.bus_lock_counter = 1;
>> +		svm->bus_lock_rip = svm->nested.ctl.bus_lock_rip;
>> +	} else {
>> +		vmcb02->control.bus_lock_counter = 0;
>> +	}
>> +	svm->nested.ctl.bus_lock_rip = INVALID_GPA;
>> +
>>  	/* Done at vmrun: asid.  */
>>  
>>  	/* Also overwritten later if necessary.  */
>> @@ -1039,6 +1069,18 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>>  
>>  	}
>>  
>> +	/*
>> +	 * If bus_lock_counter is nonzero and the guest has not moved past the
>> +	 * guilty instruction, save bus_lock_rip in svm_nested_state. This will
>> +	 * help determine at nested VMRUN whether to stash vmcb02's counter or
>> +	 * reset it to '0'.
>> +	 */
>> +	if (vmcb02->control.bus_lock_counter &&
>> +	    svm->bus_lock_rip == vmcb02->save.rip)
>> +		svm->nested.ctl.bus_lock_rip = svm->bus_lock_rip;
>> +	else
>> +		svm->nested.ctl.bus_lock_rip = INVALID_GPA;
>> +
>>  	nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
>>  
>>  	svm_switch_vmcb(svm, &svm->vmcb01);
> 
> ...
> 
>> +static int bus_lock_exit(struct kvm_vcpu *vcpu)
>> +{
>> +	struct vcpu_svm *svm = to_svm(vcpu);
>> +
>> +	vcpu->run->exit_reason = KVM_EXIT_X86_BUS_LOCK;
>> +	vcpu->run->flags |= KVM_RUN_X86_BUS_LOCK;
>> +
>> +	vcpu->arch.cui_linear_rip = kvm_get_linear_rip(vcpu);
>> +	svm->bus_lock_rip = vcpu->arch.cui_linear_rip;
>> +	vcpu->arch.complete_userspace_io = complete_userspace_buslock;
>> +
>> +	return 0;
>> +}
> 
>> @@ -327,6 +328,7 @@ struct vcpu_svm {
>>  
>>  	/* Guest GIF value, used when vGIF is not enabled */
>>  	bool guest_gif;
>> +	u64 bus_lock_rip;
> 
> I don't think this field is necessary.  Rather than unconditionally invalidate
> on nested VMRUN and then conditionally restore on nested #VMEXIT, just leave
> svm->nested.ctl.bus_lock_rip set on VMRUN and conditionally invalidate on #VMEXIT.
> And then in bus_lock_exit(), update the field if the exit occurred while L2 is
> active.
> 
> Completely untested:
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index a42ef7dd9143..98e065a93516 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -700,13 +700,10 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
>          * L1 re-enters L2, the same instruction will trigger a VM-Exit and the
>          * entire cycle start over.
>          */
> -       if (vmcb02->save.rip && (svm->nested.ctl.bus_lock_rip == vmcb02->save.rip)) {
> +       if (vmcb02->save.rip && (svm->nested.ctl.bus_lock_rip == vmcb02->save.rip))
>                 vmcb02->control.bus_lock_counter = 1;
> -               svm->bus_lock_rip = svm->nested.ctl.bus_lock_rip;
> -       } else {
> +       else
>                 vmcb02->control.bus_lock_counter = 0;
> -       }
> -       svm->nested.ctl.bus_lock_rip = INVALID_GPA;
>  
>         /* Done at vmrun: asid.  */
>  
> @@ -1070,15 +1067,10 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>         }
>  
>         /*
> -        * If bus_lock_counter is nonzero and the guest has not moved past the
> -        * guilty instruction, save bus_lock_rip in svm_nested_state. This will
> -        * help determine at nested VMRUN whether to stash vmcb02's counter or
> -        * reset it to '0'.
> +        * Invalidate bus_lock_rip unless KVM is still waiting for the guest
> +        * to make forward progress before re-enabling bus lock detection.
>          */
> -       if (vmcb02->control.bus_lock_counter &&
> -           svm->bus_lock_rip == vmcb02->save.rip)
> -               svm->nested.ctl.bus_lock_rip = svm->bus_lock_rip;
> -       else
> +       if (!vmcb02->control.bus_lock_counter)
>                 svm->nested.ctl.bus_lock_rip = INVALID_GPA;
>  
>         nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index ea12e93ae983..11ce031323fd 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3333,9 +3333,10 @@ static int bus_lock_exit(struct kvm_vcpu *vcpu)
>         vcpu->run->flags |= KVM_RUN_X86_BUS_LOCK;
>  
>         vcpu->arch.cui_linear_rip = kvm_get_linear_rip(vcpu);
> -       svm->bus_lock_rip = vcpu->arch.cui_linear_rip;
>         vcpu->arch.complete_userspace_io = complete_userspace_buslock;
>  
> +       if (is_guest_mode(vcpu))
> +               svm->nested.ctl.bus_lock_rip = vcpu->arch.cui_linear_rip;
>         return 0;
>  }
>  
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 7a4c5848c952..8667faccaedc 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -328,7 +328,6 @@ struct vcpu_svm {
>  
>         /* Guest GIF value, used when vGIF is not enabled */
>         bool guest_gif;
> -       u64 bus_lock_rip;
>  };
>  
>  struct svm_cpu_data {
> 

I have added these changes and tested them. Everything looks good to me. I will include them in V5
and send it soon.

-Manali

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2025-04-30 11:30 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-03-24 13:02 [PATCH v4 0/5] Add support for the Bus Lock Threshold Manali Shukla
2025-03-24 13:02 ` [PATCH v4 1/5] KVM: x86: Preparatory patch to move linear_rip out of kvm_pio_request Manali Shukla
2025-04-23 15:22   ` Sean Christopherson
2025-03-24 13:02 ` [PATCH v4 2/5] x86/cpufeatures: Add CPUID feature bit for the Bus Lock Threshold Manali Shukla
2025-03-24 21:56   ` Borislav Petkov
2025-04-09  6:00     ` Manali Shukla
2025-04-09  9:21       ` Borislav Petkov
2025-04-10 23:25         ` Sean Christopherson
2025-04-23  5:58           ` Manali Shukla
2025-03-24 13:02 ` [PATCH v4 3/5] KVM: SVM: Enable Bus lock threshold exit Manali Shukla
2025-04-16  6:00   ` Xiaoyao Li
2025-04-23  6:15     ` Manali Shukla
2025-04-23 15:29       ` Sean Christopherson
2025-04-30 11:15         ` Manali Shukla
2025-03-24 13:02 ` [PATCH v4 4/5] KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs Manali Shukla
2025-04-16  6:14   ` Xiaoyao Li
2025-04-23 11:26     ` Manali Shukla
2025-04-23 15:30     ` Sean Christopherson
2025-04-30 11:18       ` Manali Shukla
2025-04-23 15:44   ` Sean Christopherson
2025-04-30 11:30     ` Manali Shukla
2025-03-24 13:02 ` [PATCH v4 5/5] KVM: selftests: Add bus lock exit test Manali Shukla

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).