* [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD
@ 2024-08-08 6:29 Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 1/4] x86/split_lock: Move Split and Bus lock code to a dedicated file Ravi Bangoria
` (4 more replies)
0 siblings, 5 replies; 15+ messages in thread
From: Ravi Bangoria @ 2024-08-08 6:29 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, seanjc, pbonzini, thomas.lendacky,
jmattson
Cc: ravi.bangoria, hpa, rmk+kernel, peterz, james.morse,
lukas.bulwahn, arjan, j.granados, sibs, nik.borisov, michael.roth,
nikunj.dadhania, babu.moger, x86, kvm, linux-kernel,
santosh.shukla, ananth.narayan, sandipan.das, manali.shukla
Add Bus Lock Detect (called Bus Lock Trap in AMD docs) support for AMD
platforms. Bus Lock Detect is enumerated with CPUID Fn0000_0007_ECX_x0
bit [24 / BUSLOCKTRAP]. It can be enabled through MSR_IA32_DEBUGCTLMSR.
When enabled, hardware clears DR6[11] and raises a #DB exception on
occurrence of Bus Lock if CPL > 0. More detail about the feature can be
found in AMD APM:
AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June
2023, Vol 2, 13.1.3.6 Bus Lock Trap
https://bugzilla.kernel.org/attachment.cgi?id=304653
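For reference, the feature can be exercised from user space with a
LOCK-prefixed read-modify-write that spans a cache line boundary. A
minimal, illustrative test program (not part of this series, shown only
as a sketch):

#include <stdint.h>
#include <stdlib.h>

int main(void)
{
	/*
	 * 64-byte aligned buffer; place a 4-byte variable across the
	 * cache line boundary, at offsets 62..65.
	 */
	char *buf = aligned_alloc(64, 128);
	volatile uint32_t *p = (volatile uint32_t *)(buf + 62);

	/* LOCK-prefixed RMW on a cache-line-split operand -> bus lock. */
	__atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST);
	return 0;
}

On a CPU with only Bus Lock Detect (no #AC-based split lock detection)
and the default split_lock_detect=warn, running the above should raise
#DB and log a "took a bus_lock trap" warning.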
Patches are prepared on tip/master (532361ba3a0e).
v3: https://lore.kernel.org/r/20240806125442.1603-1-ravi.bangoria@amd.com
v3->v4:
- Introduce CONFIG_X86_BUS_LOCK_DETECT, make it depend on
(CONFIG_CPU_SUP_INTEL || CONFIG_CPU_SUP_AMD), and make the Bus Lock Detect
feature dependent on CONFIG_X86_BUS_LOCK_DETECT.
- Update documentation about Bus Lock Detect support on AMD.
Note:
A Qemu fix is also required to handle a corner case where a hardware
instruction or data breakpoint is created by the Qemu remote debugger (gdb)
on the same instruction that also causes a Bus Lock. The Qemu patch to
handle it can be found at:
https://lore.kernel.org/r/20240712095208.1553-1-ravi.bangoria@amd.com
Ravi Bangoria (4):
x86/split_lock: Move Split and Bus lock code to a dedicated file
x86/bus_lock: Add support for AMD
KVM: SVM: Don't advertise Bus Lock Detect to guest if SVM support is
missing
KVM: SVM: Add Bus Lock Detect support
Documentation/arch/x86/buslock.rst | 3 +-
arch/x86/Kconfig | 8 +
arch/x86/include/asm/cpu.h | 11 +-
arch/x86/kernel/cpu/Makefile | 2 +
arch/x86/kernel/cpu/bus_lock.c | 406 ++++++++++++++++++++++++++++
arch/x86/kernel/cpu/common.c | 2 +
arch/x86/kernel/cpu/intel.c | 407 -----------------------------
arch/x86/kvm/svm/nested.c | 3 +-
arch/x86/kvm/svm/svm.c | 16 +-
include/linux/sched.h | 2 +-
kernel/fork.c | 2 +-
11 files changed, 448 insertions(+), 414 deletions(-)
create mode 100644 arch/x86/kernel/cpu/bus_lock.c
--
2.34.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v4 1/4] x86/split_lock: Move Split and Bus lock code to a dedicated file
2024-08-08 6:29 [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD Ravi Bangoria
@ 2024-08-08 6:29 ` Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 2/4] x86/bus_lock: Add support for AMD Ravi Bangoria
` (3 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Ravi Bangoria @ 2024-08-08 6:29 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, seanjc, pbonzini, thomas.lendacky,
jmattson
Cc: ravi.bangoria, hpa, rmk+kernel, peterz, james.morse,
lukas.bulwahn, arjan, j.granados, sibs, nik.borisov, michael.roth,
nikunj.dadhania, babu.moger, x86, kvm, linux-kernel,
santosh.shukla, ananth.narayan, sandipan.das, manali.shukla
Bus Lock Detect functionality on AMD platforms works identically to Intel's.
Move the split_lock and bus_lock specific code from intel.c to a dedicated
file so that it can be compiled and supported on non-Intel platforms.
Also, introduce CONFIG_X86_BUS_LOCK_DETECT, make it depend on
CONFIG_CPU_SUP_INTEL, and build the new bus_lock.c file only when
CONFIG_X86_BUS_LOCK_DETECT is enabled.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/Kconfig | 8 +
arch/x86/include/asm/cpu.h | 11 +-
arch/x86/kernel/cpu/Makefile | 2 +
arch/x86/kernel/cpu/bus_lock.c | 406 +++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/intel.c | 406 ---------------------------------
include/linux/sched.h | 2 +-
kernel/fork.c | 2 +-
7 files changed, 427 insertions(+), 410 deletions(-)
create mode 100644 arch/x86/kernel/cpu/bus_lock.c
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 02775a61b557..db2aad850a8f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2427,6 +2427,14 @@ config CFI_AUTO_DEFAULT
source "kernel/livepatch/Kconfig"
+config X86_BUS_LOCK_DETECT
+ bool "Split Lock Detect and Bus Lock Detect support"
+ depends on CPU_SUP_INTEL
+ default y
+ help
+ Enable Split Lock Detect and Bus Lock Detect functionalities.
+ See <file:Documentation/arch/x86/buslock.rst> for more information.
+
endmenu
config CC_HAS_NAMED_AS
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index aa30fd8cad7f..86cf7952ee86 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -26,12 +26,13 @@ int mwait_usable(const struct cpuinfo_x86 *);
unsigned int x86_family(unsigned int sig);
unsigned int x86_model(unsigned int sig);
unsigned int x86_stepping(unsigned int sig);
-#ifdef CONFIG_CPU_SUP_INTEL
+#ifdef CONFIG_X86_BUS_LOCK_DETECT
extern void __init sld_setup(struct cpuinfo_x86 *c);
extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
extern bool handle_guest_split_lock(unsigned long ip);
extern void handle_bus_lock(struct pt_regs *regs);
-u8 get_this_hybrid_cpu_type(void);
+void split_lock_init(void);
+void bus_lock_init(void);
#else
static inline void __init sld_setup(struct cpuinfo_x86 *c) {}
static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code)
@@ -45,7 +46,13 @@ static inline bool handle_guest_split_lock(unsigned long ip)
}
static inline void handle_bus_lock(struct pt_regs *regs) {}
+static inline void split_lock_init(void) {}
+static inline void bus_lock_init(void) {}
+#endif
+#ifdef CONFIG_CPU_SUP_INTEL
+u8 get_this_hybrid_cpu_type(void);
+#else
static inline u8 get_this_hybrid_cpu_type(void)
{
return 0;
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 5857a0f5d514..4efdf5c2efc8 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -59,6 +59,8 @@ obj-$(CONFIG_ACRN_GUEST) += acrn.o
obj-$(CONFIG_DEBUG_FS) += debugfs.o
+obj-$(CONFIG_X86_BUS_LOCK_DETECT) += bus_lock.o
+
quiet_cmd_mkcapflags = MKCAP $@
cmd_mkcapflags = $(CONFIG_SHELL) $(src)/mkcapflags.sh $@ $^
diff --git a/arch/x86/kernel/cpu/bus_lock.c b/arch/x86/kernel/cpu/bus_lock.c
new file mode 100644
index 000000000000..704e9241b964
--- /dev/null
+++ b/arch/x86/kernel/cpu/bus_lock.c
@@ -0,0 +1,406 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define pr_fmt(fmt) "x86/split lock detection: " fmt
+
+#include <linux/semaphore.h>
+#include <linux/workqueue.h>
+#include <linux/delay.h>
+#include <linux/cpuhotplug.h>
+#include <asm/cpu_device_id.h>
+#include <asm/cmdline.h>
+#include <asm/traps.h>
+#include <asm/cpu.h>
+
+enum split_lock_detect_state {
+ sld_off = 0,
+ sld_warn,
+ sld_fatal,
+ sld_ratelimit,
+};
+
+/*
+ * Default to sld_off because most systems do not support split lock detection.
+ * sld_state_setup() will switch this to sld_warn on systems that support
+ * split lock/bus lock detect, unless there is a command line override.
+ */
+static enum split_lock_detect_state sld_state __ro_after_init = sld_off;
+static u64 msr_test_ctrl_cache __ro_after_init;
+
+/*
+ * With a name like MSR_TEST_CTL it should go without saying, but don't touch
+ * MSR_TEST_CTL unless the CPU is one of the whitelisted models. Writing it
+ * on CPUs that do not support SLD can cause fireworks, even when writing '0'.
+ */
+static bool cpu_model_supports_sld __ro_after_init;
+
+static const struct {
+ const char *option;
+ enum split_lock_detect_state state;
+} sld_options[] __initconst = {
+ { "off", sld_off },
+ { "warn", sld_warn },
+ { "fatal", sld_fatal },
+ { "ratelimit:", sld_ratelimit },
+};
+
+static struct ratelimit_state bld_ratelimit;
+
+static unsigned int sysctl_sld_mitigate = 1;
+static DEFINE_SEMAPHORE(buslock_sem, 1);
+
+#ifdef CONFIG_PROC_SYSCTL
+static struct ctl_table sld_sysctls[] = {
+ {
+ .procname = "split_lock_mitigate",
+ .data = &sysctl_sld_mitigate,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_douintvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+};
+
+static int __init sld_mitigate_sysctl_init(void)
+{
+ register_sysctl_init("kernel", sld_sysctls);
+ return 0;
+}
+
+late_initcall(sld_mitigate_sysctl_init);
+#endif
+
+static inline bool match_option(const char *arg, int arglen, const char *opt)
+{
+ int len = strlen(opt), ratelimit;
+
+ if (strncmp(arg, opt, len))
+ return false;
+
+ /*
+ * Min ratelimit is 1 bus lock/sec.
+ * Max ratelimit is 1000 bus locks/sec.
+ */
+ if (sscanf(arg, "ratelimit:%d", &ratelimit) == 1 &&
+ ratelimit > 0 && ratelimit <= 1000) {
+ ratelimit_state_init(&bld_ratelimit, HZ, ratelimit);
+ ratelimit_set_flags(&bld_ratelimit, RATELIMIT_MSG_ON_RELEASE);
+ return true;
+ }
+
+ return len == arglen;
+}
+
+static bool split_lock_verify_msr(bool on)
+{
+ u64 ctrl, tmp;
+
+ if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
+ return false;
+ if (on)
+ ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+ else
+ ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+ if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
+ return false;
+ rdmsrl(MSR_TEST_CTRL, tmp);
+ return ctrl == tmp;
+}
+
+static void __init sld_state_setup(void)
+{
+ enum split_lock_detect_state state = sld_warn;
+ char arg[20];
+ int i, ret;
+
+ if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
+ !boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
+ return;
+
+ ret = cmdline_find_option(boot_command_line, "split_lock_detect",
+ arg, sizeof(arg));
+ if (ret >= 0) {
+ for (i = 0; i < ARRAY_SIZE(sld_options); i++) {
+ if (match_option(arg, ret, sld_options[i].option)) {
+ state = sld_options[i].state;
+ break;
+ }
+ }
+ }
+ sld_state = state;
+}
+
+static void __init __split_lock_setup(void)
+{
+ if (!split_lock_verify_msr(false)) {
+ pr_info("MSR access failed: Disabled\n");
+ return;
+ }
+
+ rdmsrl(MSR_TEST_CTRL, msr_test_ctrl_cache);
+
+ if (!split_lock_verify_msr(true)) {
+ pr_info("MSR access failed: Disabled\n");
+ return;
+ }
+
+ /* Restore the MSR to its cached value. */
+ wrmsrl(MSR_TEST_CTRL, msr_test_ctrl_cache);
+
+ setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
+}
+
+/*
+ * MSR_TEST_CTRL is per core, but we treat it like a per CPU MSR. Locking
+ * is not implemented as one thread could undo the setting of the other
+ * thread immediately after dropping the lock anyway.
+ */
+static void sld_update_msr(bool on)
+{
+ u64 test_ctrl_val = msr_test_ctrl_cache;
+
+ if (on)
+ test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+
+ wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
+}
+
+void split_lock_init(void)
+{
+ /*
+ * #DB for bus lock handles ratelimit and #AC for split lock is
+ * disabled.
+ */
+ if (sld_state == sld_ratelimit) {
+ split_lock_verify_msr(false);
+ return;
+ }
+
+ if (cpu_model_supports_sld)
+ split_lock_verify_msr(sld_state != sld_off);
+}
+
+static void __split_lock_reenable_unlock(struct work_struct *work)
+{
+ sld_update_msr(true);
+ up(&buslock_sem);
+}
+
+static DECLARE_DELAYED_WORK(sl_reenable_unlock, __split_lock_reenable_unlock);
+
+static void __split_lock_reenable(struct work_struct *work)
+{
+ sld_update_msr(true);
+}
+static DECLARE_DELAYED_WORK(sl_reenable, __split_lock_reenable);
+
+/*
+ * If a CPU goes offline with pending delayed work to re-enable split lock
+ * detection then the delayed work will be executed on some other CPU. That
+ * handles releasing the buslock_sem, but because it executes on a
+ * different CPU probably won't re-enable split lock detection. This is a
+ * problem on HT systems since the sibling CPU on the same core may then be
+ * left running with split lock detection disabled.
+ *
+ * Unconditionally re-enable detection here.
+ */
+static int splitlock_cpu_offline(unsigned int cpu)
+{
+ sld_update_msr(true);
+
+ return 0;
+}
+
+static void split_lock_warn(unsigned long ip)
+{
+ struct delayed_work *work;
+ int cpu;
+
+ if (!current->reported_split_lock)
+ pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n",
+ current->comm, current->pid, ip);
+ current->reported_split_lock = 1;
+
+ if (sysctl_sld_mitigate) {
+ /*
+ * misery factor #1:
+ * sleep 10ms before trying to execute split lock.
+ */
+ if (msleep_interruptible(10) > 0)
+ return;
+ /*
+ * Misery factor #2:
+ * only allow one buslocked disabled core at a time.
+ */
+ if (down_interruptible(&buslock_sem) == -EINTR)
+ return;
+ work = &sl_reenable_unlock;
+ } else {
+ work = &sl_reenable;
+ }
+
+ cpu = get_cpu();
+ schedule_delayed_work_on(cpu, work, 2);
+
+ /* Disable split lock detection on this CPU to make progress */
+ sld_update_msr(false);
+ put_cpu();
+}
+
+bool handle_guest_split_lock(unsigned long ip)
+{
+ if (sld_state == sld_warn) {
+ split_lock_warn(ip);
+ return true;
+ }
+
+ pr_warn_once("#AC: %s/%d %s split_lock trap at address: 0x%lx\n",
+ current->comm, current->pid,
+ sld_state == sld_fatal ? "fatal" : "bogus", ip);
+
+ current->thread.error_code = 0;
+ current->thread.trap_nr = X86_TRAP_AC;
+ force_sig_fault(SIGBUS, BUS_ADRALN, NULL);
+ return false;
+}
+EXPORT_SYMBOL_GPL(handle_guest_split_lock);
+
+void bus_lock_init(void)
+{
+ u64 val;
+
+ if (!boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
+ return;
+
+ rdmsrl(MSR_IA32_DEBUGCTLMSR, val);
+
+ if ((boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
+ (sld_state == sld_warn || sld_state == sld_fatal)) ||
+ sld_state == sld_off) {
+ /*
+ * Warn and fatal are handled by #AC for split lock if #AC for
+ * split lock is supported.
+ */
+ val &= ~DEBUGCTLMSR_BUS_LOCK_DETECT;
+ } else {
+ val |= DEBUGCTLMSR_BUS_LOCK_DETECT;
+ }
+
+ wrmsrl(MSR_IA32_DEBUGCTLMSR, val);
+}
+
+bool handle_user_split_lock(struct pt_regs *regs, long error_code)
+{
+ if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
+ return false;
+ split_lock_warn(regs->ip);
+ return true;
+}
+
+void handle_bus_lock(struct pt_regs *regs)
+{
+ switch (sld_state) {
+ case sld_off:
+ break;
+ case sld_ratelimit:
+ /* Enforce no more than bld_ratelimit bus locks/sec. */
+ while (!__ratelimit(&bld_ratelimit))
+ msleep(20);
+ /* Warn on the bus lock. */
+ fallthrough;
+ case sld_warn:
+ pr_warn_ratelimited("#DB: %s/%d took a bus_lock trap at address: 0x%lx\n",
+ current->comm, current->pid, regs->ip);
+ break;
+ case sld_fatal:
+ force_sig_fault(SIGBUS, BUS_ADRALN, NULL);
+ break;
+ }
+}
+
+/*
+ * CPU models that are known to have the per-core split-lock detection
+ * feature even though they do not enumerate IA32_CORE_CAPABILITIES.
+ */
+static const struct x86_cpu_id split_lock_cpu_ids[] __initconst = {
+ X86_MATCH_VFM(INTEL_ICELAKE_X, 0),
+ X86_MATCH_VFM(INTEL_ICELAKE_L, 0),
+ X86_MATCH_VFM(INTEL_ICELAKE_D, 0),
+ {}
+};
+
+static void __init split_lock_setup(struct cpuinfo_x86 *c)
+{
+ const struct x86_cpu_id *m;
+ u64 ia32_core_caps;
+
+ if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
+ return;
+
+ /* Check for CPUs that have support but do not enumerate it: */
+ m = x86_match_cpu(split_lock_cpu_ids);
+ if (m)
+ goto supported;
+
+ if (!cpu_has(c, X86_FEATURE_CORE_CAPABILITIES))
+ return;
+
+ /*
+ * Not all bits in MSR_IA32_CORE_CAPS are architectural, but
+ * MSR_IA32_CORE_CAPS_SPLIT_LOCK_DETECT is. All CPUs that set
+ * it have split lock detection.
+ */
+ rdmsrl(MSR_IA32_CORE_CAPS, ia32_core_caps);
+ if (ia32_core_caps & MSR_IA32_CORE_CAPS_SPLIT_LOCK_DETECT)
+ goto supported;
+
+ /* CPU is not in the model list and does not have the MSR bit: */
+ return;
+
+supported:
+ cpu_model_supports_sld = true;
+ __split_lock_setup();
+}
+
+static void sld_state_show(void)
+{
+ if (!boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT) &&
+ !boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
+ return;
+
+ switch (sld_state) {
+ case sld_off:
+ pr_info("disabled\n");
+ break;
+ case sld_warn:
+ if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) {
+ pr_info("#AC: crashing the kernel on kernel split_locks and warning on user-space split_locks\n");
+ if (cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "x86/splitlock", NULL, splitlock_cpu_offline) < 0)
+ pr_warn("No splitlock CPU offline handler\n");
+ } else if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT)) {
+ pr_info("#DB: warning on user-space bus_locks\n");
+ }
+ break;
+ case sld_fatal:
+ if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) {
+ pr_info("#AC: crashing the kernel on kernel split_locks and sending SIGBUS on user-space split_locks\n");
+ } else if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT)) {
+ pr_info("#DB: sending SIGBUS on user-space bus_locks%s\n",
+ boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) ?
+ " from non-WB" : "");
+ }
+ break;
+ case sld_ratelimit:
+ if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
+ pr_info("#DB: setting system wide bus lock rate limit to %u/sec\n", bld_ratelimit.burst);
+ break;
+ }
+}
+
+void __init sld_setup(struct cpuinfo_x86 *c)
+{
+ split_lock_setup(c);
+ sld_state_setup();
+ sld_state_show();
+}
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 08b95a35b5cb..8a483f4ad026 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -7,13 +7,9 @@
#include <linux/smp.h>
#include <linux/sched.h>
#include <linux/sched/clock.h>
-#include <linux/semaphore.h>
#include <linux/thread_info.h>
#include <linux/init.h>
#include <linux/uaccess.h>
-#include <linux/workqueue.h>
-#include <linux/delay.h>
-#include <linux/cpuhotplug.h>
#include <asm/cpufeature.h>
#include <asm/msr.h>
@@ -24,8 +20,6 @@
#include <asm/hwcap2.h>
#include <asm/elf.h>
#include <asm/cpu_device_id.h>
-#include <asm/cmdline.h>
-#include <asm/traps.h>
#include <asm/resctrl.h>
#include <asm/numa.h>
#include <asm/thermal.h>
@@ -41,28 +35,6 @@
#include <asm/apic.h>
#endif
-enum split_lock_detect_state {
- sld_off = 0,
- sld_warn,
- sld_fatal,
- sld_ratelimit,
-};
-
-/*
- * Default to sld_off because most systems do not support split lock detection.
- * sld_state_setup() will switch this to sld_warn on systems that support
- * split lock/bus lock detect, unless there is a command line override.
- */
-static enum split_lock_detect_state sld_state __ro_after_init = sld_off;
-static u64 msr_test_ctrl_cache __ro_after_init;
-
-/*
- * With a name like MSR_TEST_CTL it should go without saying, but don't touch
- * MSR_TEST_CTL unless the CPU is one of the whitelisted models. Writing it
- * on CPUs that do not support SLD can cause fireworks, even when writing '0'.
- */
-static bool cpu_model_supports_sld __ro_after_init;
-
/*
* Processors which have self-snooping capability can handle conflicting
* memory type across CPUs by snooping its own cache. However, there exists
@@ -547,9 +519,6 @@ static void init_intel_misc_features(struct cpuinfo_x86 *c)
wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
}
-static void split_lock_init(void);
-static void bus_lock_init(void);
-
static void init_intel(struct cpuinfo_x86 *c)
{
early_init_intel(c);
@@ -907,381 +876,6 @@ static const struct cpu_dev intel_cpu_dev = {
cpu_dev_register(intel_cpu_dev);
-#undef pr_fmt
-#define pr_fmt(fmt) "x86/split lock detection: " fmt
-
-static const struct {
- const char *option;
- enum split_lock_detect_state state;
-} sld_options[] __initconst = {
- { "off", sld_off },
- { "warn", sld_warn },
- { "fatal", sld_fatal },
- { "ratelimit:", sld_ratelimit },
-};
-
-static struct ratelimit_state bld_ratelimit;
-
-static unsigned int sysctl_sld_mitigate = 1;
-static DEFINE_SEMAPHORE(buslock_sem, 1);
-
-#ifdef CONFIG_PROC_SYSCTL
-static struct ctl_table sld_sysctls[] = {
- {
- .procname = "split_lock_mitigate",
- .data = &sysctl_sld_mitigate,
- .maxlen = sizeof(unsigned int),
- .mode = 0644,
- .proc_handler = proc_douintvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
-};
-
-static int __init sld_mitigate_sysctl_init(void)
-{
- register_sysctl_init("kernel", sld_sysctls);
- return 0;
-}
-
-late_initcall(sld_mitigate_sysctl_init);
-#endif
-
-static inline bool match_option(const char *arg, int arglen, const char *opt)
-{
- int len = strlen(opt), ratelimit;
-
- if (strncmp(arg, opt, len))
- return false;
-
- /*
- * Min ratelimit is 1 bus lock/sec.
- * Max ratelimit is 1000 bus locks/sec.
- */
- if (sscanf(arg, "ratelimit:%d", &ratelimit) == 1 &&
- ratelimit > 0 && ratelimit <= 1000) {
- ratelimit_state_init(&bld_ratelimit, HZ, ratelimit);
- ratelimit_set_flags(&bld_ratelimit, RATELIMIT_MSG_ON_RELEASE);
- return true;
- }
-
- return len == arglen;
-}
-
-static bool split_lock_verify_msr(bool on)
-{
- u64 ctrl, tmp;
-
- if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
- return false;
- if (on)
- ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
- else
- ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
- if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
- return false;
- rdmsrl(MSR_TEST_CTRL, tmp);
- return ctrl == tmp;
-}
-
-static void __init sld_state_setup(void)
-{
- enum split_lock_detect_state state = sld_warn;
- char arg[20];
- int i, ret;
-
- if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
- !boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
- return;
-
- ret = cmdline_find_option(boot_command_line, "split_lock_detect",
- arg, sizeof(arg));
- if (ret >= 0) {
- for (i = 0; i < ARRAY_SIZE(sld_options); i++) {
- if (match_option(arg, ret, sld_options[i].option)) {
- state = sld_options[i].state;
- break;
- }
- }
- }
- sld_state = state;
-}
-
-static void __init __split_lock_setup(void)
-{
- if (!split_lock_verify_msr(false)) {
- pr_info("MSR access failed: Disabled\n");
- return;
- }
-
- rdmsrl(MSR_TEST_CTRL, msr_test_ctrl_cache);
-
- if (!split_lock_verify_msr(true)) {
- pr_info("MSR access failed: Disabled\n");
- return;
- }
-
- /* Restore the MSR to its cached value. */
- wrmsrl(MSR_TEST_CTRL, msr_test_ctrl_cache);
-
- setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
-}
-
-/*
- * MSR_TEST_CTRL is per core, but we treat it like a per CPU MSR. Locking
- * is not implemented as one thread could undo the setting of the other
- * thread immediately after dropping the lock anyway.
- */
-static void sld_update_msr(bool on)
-{
- u64 test_ctrl_val = msr_test_ctrl_cache;
-
- if (on)
- test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
-
- wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
-}
-
-static void split_lock_init(void)
-{
- /*
- * #DB for bus lock handles ratelimit and #AC for split lock is
- * disabled.
- */
- if (sld_state == sld_ratelimit) {
- split_lock_verify_msr(false);
- return;
- }
-
- if (cpu_model_supports_sld)
- split_lock_verify_msr(sld_state != sld_off);
-}
-
-static void __split_lock_reenable_unlock(struct work_struct *work)
-{
- sld_update_msr(true);
- up(&buslock_sem);
-}
-
-static DECLARE_DELAYED_WORK(sl_reenable_unlock, __split_lock_reenable_unlock);
-
-static void __split_lock_reenable(struct work_struct *work)
-{
- sld_update_msr(true);
-}
-static DECLARE_DELAYED_WORK(sl_reenable, __split_lock_reenable);
-
-/*
- * If a CPU goes offline with pending delayed work to re-enable split lock
- * detection then the delayed work will be executed on some other CPU. That
- * handles releasing the buslock_sem, but because it executes on a
- * different CPU probably won't re-enable split lock detection. This is a
- * problem on HT systems since the sibling CPU on the same core may then be
- * left running with split lock detection disabled.
- *
- * Unconditionally re-enable detection here.
- */
-static int splitlock_cpu_offline(unsigned int cpu)
-{
- sld_update_msr(true);
-
- return 0;
-}
-
-static void split_lock_warn(unsigned long ip)
-{
- struct delayed_work *work;
- int cpu;
-
- if (!current->reported_split_lock)
- pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n",
- current->comm, current->pid, ip);
- current->reported_split_lock = 1;
-
- if (sysctl_sld_mitigate) {
- /*
- * misery factor #1:
- * sleep 10ms before trying to execute split lock.
- */
- if (msleep_interruptible(10) > 0)
- return;
- /*
- * Misery factor #2:
- * only allow one buslocked disabled core at a time.
- */
- if (down_interruptible(&buslock_sem) == -EINTR)
- return;
- work = &sl_reenable_unlock;
- } else {
- work = &sl_reenable;
- }
-
- cpu = get_cpu();
- schedule_delayed_work_on(cpu, work, 2);
-
- /* Disable split lock detection on this CPU to make progress */
- sld_update_msr(false);
- put_cpu();
-}
-
-bool handle_guest_split_lock(unsigned long ip)
-{
- if (sld_state == sld_warn) {
- split_lock_warn(ip);
- return true;
- }
-
- pr_warn_once("#AC: %s/%d %s split_lock trap at address: 0x%lx\n",
- current->comm, current->pid,
- sld_state == sld_fatal ? "fatal" : "bogus", ip);
-
- current->thread.error_code = 0;
- current->thread.trap_nr = X86_TRAP_AC;
- force_sig_fault(SIGBUS, BUS_ADRALN, NULL);
- return false;
-}
-EXPORT_SYMBOL_GPL(handle_guest_split_lock);
-
-static void bus_lock_init(void)
-{
- u64 val;
-
- if (!boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
- return;
-
- rdmsrl(MSR_IA32_DEBUGCTLMSR, val);
-
- if ((boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
- (sld_state == sld_warn || sld_state == sld_fatal)) ||
- sld_state == sld_off) {
- /*
- * Warn and fatal are handled by #AC for split lock if #AC for
- * split lock is supported.
- */
- val &= ~DEBUGCTLMSR_BUS_LOCK_DETECT;
- } else {
- val |= DEBUGCTLMSR_BUS_LOCK_DETECT;
- }
-
- wrmsrl(MSR_IA32_DEBUGCTLMSR, val);
-}
-
-bool handle_user_split_lock(struct pt_regs *regs, long error_code)
-{
- if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
- return false;
- split_lock_warn(regs->ip);
- return true;
-}
-
-void handle_bus_lock(struct pt_regs *regs)
-{
- switch (sld_state) {
- case sld_off:
- break;
- case sld_ratelimit:
- /* Enforce no more than bld_ratelimit bus locks/sec. */
- while (!__ratelimit(&bld_ratelimit))
- msleep(20);
- /* Warn on the bus lock. */
- fallthrough;
- case sld_warn:
- pr_warn_ratelimited("#DB: %s/%d took a bus_lock trap at address: 0x%lx\n",
- current->comm, current->pid, regs->ip);
- break;
- case sld_fatal:
- force_sig_fault(SIGBUS, BUS_ADRALN, NULL);
- break;
- }
-}
-
-/*
- * CPU models that are known to have the per-core split-lock detection
- * feature even though they do not enumerate IA32_CORE_CAPABILITIES.
- */
-static const struct x86_cpu_id split_lock_cpu_ids[] __initconst = {
- X86_MATCH_VFM(INTEL_ICELAKE_X, 0),
- X86_MATCH_VFM(INTEL_ICELAKE_L, 0),
- X86_MATCH_VFM(INTEL_ICELAKE_D, 0),
- {}
-};
-
-static void __init split_lock_setup(struct cpuinfo_x86 *c)
-{
- const struct x86_cpu_id *m;
- u64 ia32_core_caps;
-
- if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
- return;
-
- /* Check for CPUs that have support but do not enumerate it: */
- m = x86_match_cpu(split_lock_cpu_ids);
- if (m)
- goto supported;
-
- if (!cpu_has(c, X86_FEATURE_CORE_CAPABILITIES))
- return;
-
- /*
- * Not all bits in MSR_IA32_CORE_CAPS are architectural, but
- * MSR_IA32_CORE_CAPS_SPLIT_LOCK_DETECT is. All CPUs that set
- * it have split lock detection.
- */
- rdmsrl(MSR_IA32_CORE_CAPS, ia32_core_caps);
- if (ia32_core_caps & MSR_IA32_CORE_CAPS_SPLIT_LOCK_DETECT)
- goto supported;
-
- /* CPU is not in the model list and does not have the MSR bit: */
- return;
-
-supported:
- cpu_model_supports_sld = true;
- __split_lock_setup();
-}
-
-static void sld_state_show(void)
-{
- if (!boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT) &&
- !boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
- return;
-
- switch (sld_state) {
- case sld_off:
- pr_info("disabled\n");
- break;
- case sld_warn:
- if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) {
- pr_info("#AC: crashing the kernel on kernel split_locks and warning on user-space split_locks\n");
- if (cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
- "x86/splitlock", NULL, splitlock_cpu_offline) < 0)
- pr_warn("No splitlock CPU offline handler\n");
- } else if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT)) {
- pr_info("#DB: warning on user-space bus_locks\n");
- }
- break;
- case sld_fatal:
- if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) {
- pr_info("#AC: crashing the kernel on kernel split_locks and sending SIGBUS on user-space split_locks\n");
- } else if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT)) {
- pr_info("#DB: sending SIGBUS on user-space bus_locks%s\n",
- boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) ?
- " from non-WB" : "");
- }
- break;
- case sld_ratelimit:
- if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
- pr_info("#DB: setting system wide bus lock rate limit to %u/sec\n", bld_ratelimit.burst);
- break;
- }
-}
-
-void __init sld_setup(struct cpuinfo_x86 *c)
-{
- split_lock_setup(c);
- sld_state_setup();
- sld_state_show();
-}
-
#define X86_HYBRID_CPU_TYPE_ID_SHIFT 24
/**
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d4cc144f72a3..2b16e5cfeb36 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -991,7 +991,7 @@ struct task_struct {
#ifdef CONFIG_ARCH_HAS_CPU_PASID
unsigned pasid_activated:1;
#endif
-#ifdef CONFIG_CPU_SUP_INTEL
+#ifdef CONFIG_X86_BUS_LOCK_DETECT
unsigned reported_split_lock:1;
#endif
#ifdef CONFIG_TASK_DELAY_ACCT
diff --git a/kernel/fork.c b/kernel/fork.c
index c1b343cba560..0b71fc9fa750 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1182,7 +1182,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
tsk->active_memcg = NULL;
#endif
-#ifdef CONFIG_CPU_SUP_INTEL
+#ifdef CONFIG_X86_BUS_LOCK_DETECT
tsk->reported_split_lock = 0;
#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v4 2/4] x86/bus_lock: Add support for AMD
2024-08-08 6:29 [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 1/4] x86/split_lock: Move Split and Bus lock code to a dedicated file Ravi Bangoria
@ 2024-08-08 6:29 ` Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 3/4] KVM: SVM: Don't advertise Bus Lock Detect to guest if SVM support is missing Ravi Bangoria
` (2 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Ravi Bangoria @ 2024-08-08 6:29 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, seanjc, pbonzini, thomas.lendacky,
jmattson
Cc: ravi.bangoria, hpa, rmk+kernel, peterz, james.morse,
lukas.bulwahn, arjan, j.granados, sibs, nik.borisov, michael.roth,
nikunj.dadhania, babu.moger, x86, kvm, linux-kernel,
santosh.shukla, ananth.narayan, sandipan.das, manali.shukla
Add Bus Lock Detect (called Bus Lock Trap in AMD docs) support for AMD
platforms. Bus Lock Detect is enumerated with CPUID Fn0000_0007_ECX_x0
bit [24 / BUSLOCKTRAP]. It can be enabled through MSR_IA32_DEBUGCTLMSR.
When enabled, hardware clears DR6[11] and raises a #DB exception on
occurrence of Bus Lock if CPL > 0. More detail about the feature can be
found in AMD APM[1].
[1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June
2023, Vol 2, 13.1.3.6 Bus Lock Trap
https://bugzilla.kernel.org/attachment.cgi?id=304653
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
Documentation/arch/x86/buslock.rst | 3 ++-
arch/x86/Kconfig | 2 +-
arch/x86/kernel/cpu/common.c | 2 ++
arch/x86/kernel/cpu/intel.c | 1 -
4 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/Documentation/arch/x86/buslock.rst b/Documentation/arch/x86/buslock.rst
index 4c5a4822eeb7..31f1bfdff16f 100644
--- a/Documentation/arch/x86/buslock.rst
+++ b/Documentation/arch/x86/buslock.rst
@@ -26,7 +26,8 @@ Detection
=========
Intel processors may support either or both of the following hardware
-mechanisms to detect split locks and bus locks.
+mechanisms to detect split locks and bus locks. Some AMD processors also
+support bus lock detect.
#AC exception for split lock detection
--------------------------------------
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index db2aad850a8f..d422247b2882 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2429,7 +2429,7 @@ source "kernel/livepatch/Kconfig"
config X86_BUS_LOCK_DETECT
bool "Split Lock Detect and Bus Lock Detect support"
- depends on CPU_SUP_INTEL
+ depends on CPU_SUP_INTEL || CPU_SUP_AMD
default y
help
Enable Split Lock Detect and Bus Lock Detect functionalities.
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index d4e539d4e158..a37670e1ab4d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1832,6 +1832,8 @@ static void identify_cpu(struct cpuinfo_x86 *c)
if (this_cpu->c_init)
this_cpu->c_init(c);
+ bus_lock_init();
+
/* Disable the PN if appropriate */
squash_the_stupid_serial_number(c);
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 8a483f4ad026..799f18545c6e 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -610,7 +610,6 @@ static void init_intel(struct cpuinfo_x86 *c)
init_intel_misc_features(c);
split_lock_init();
- bus_lock_init();
intel_init_thermal(c);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v4 3/4] KVM: SVM: Don't advertise Bus Lock Detect to guest if SVM support is missing
2024-08-08 6:29 [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 1/4] x86/split_lock: Move Split and Bus lock code to a dedicated file Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 2/4] x86/bus_lock: Add support for AMD Ravi Bangoria
@ 2024-08-08 6:29 ` Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support Ravi Bangoria
2024-08-23 23:47 ` [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD Sean Christopherson
4 siblings, 0 replies; 15+ messages in thread
From: Ravi Bangoria @ 2024-08-08 6:29 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, seanjc, pbonzini, thomas.lendacky,
jmattson
Cc: ravi.bangoria, hpa, rmk+kernel, peterz, james.morse,
lukas.bulwahn, arjan, j.granados, sibs, nik.borisov, michael.roth,
nikunj.dadhania, babu.moger, x86, kvm, linux-kernel,
santosh.shukla, ananth.narayan, sandipan.das, manali.shukla
If the host supports Bus Lock Detect, KVM advertises it to guests even when
SVM support for it is absent, so the guest cannot actually use the feature
despite its CPUID bit being set. Fix this by unconditionally clearing the
feature bit in the KVM CPU capabilities.
Reported-by: Jim Mattson <jmattson@google.com>
Closes: https://lore.kernel.org/r/CALMp9eRet6+v8Y1Q-i6mqPm4hUow_kJNhmVHfOV8tMfuSS=tVg@mail.gmail.com
Fixes: 76ea438b4afc ("KVM: X86: Expose bus lock debug exception to guest")
Cc: stable@vger.kernel.org
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/kvm/svm/svm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d6f252555ab3..e1b6a16e97c0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5224,6 +5224,9 @@ static __init void svm_set_cpu_caps(void)
/* CPUID 0x8000001F (SME/SEV features) */
sev_set_cpu_caps();
+
+ /* Don't advertise Bus Lock Detect to guest if SVM support is absent */
+ kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
}
static __init int svm_hardware_setup(void)
--
2.34.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2024-08-08 6:29 [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD Ravi Bangoria
` (2 preceding siblings ...)
2024-08-08 6:29 ` [PATCH v4 3/4] KVM: SVM: Don't advertise Bus Lock Detect to guest if SVM support is missing Ravi Bangoria
@ 2024-08-08 6:29 ` Ravi Bangoria
2024-08-17 0:13 ` Sean Christopherson
2024-08-23 23:47 ` [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD Sean Christopherson
4 siblings, 1 reply; 15+ messages in thread
From: Ravi Bangoria @ 2024-08-08 6:29 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, seanjc, pbonzini, thomas.lendacky,
jmattson
Cc: ravi.bangoria, hpa, rmk+kernel, peterz, james.morse,
lukas.bulwahn, arjan, j.granados, sibs, nik.borisov, michael.roth,
nikunj.dadhania, babu.moger, x86, kvm, linux-kernel,
santosh.shukla, ananth.narayan, sandipan.das, manali.shukla
Add Bus Lock Detect support to AMD SVM. Bus Lock Detect is enabled through
MSR_IA32_DEBUGCTLMSR, and MSR_IA32_DEBUGCTLMSR is virtualized only if LBR
Virtualization is enabled. Add this dependency in SVM.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
arch/x86/kvm/svm/nested.c | 3 ++-
arch/x86/kvm/svm/svm.c | 17 ++++++++++++++---
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 6f704c1037e5..1df9158c72c1 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -586,7 +586,8 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
/* These bits will be set properly on the first execution when new_vmc12 is true */
if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_DR))) {
vmcb02->save.dr7 = svm->nested.save.dr7 | DR7_FIXED_1;
- svm->vcpu.arch.dr6 = svm->nested.save.dr6 | DR6_ACTIVE_LOW;
+ /* DR6_RTM is a reserved bit on AMD and as such must be set to 1 */
+ svm->vcpu.arch.dr6 = svm->nested.save.dr6 | DR6_FIXED_1 | DR6_RTM;
vmcb_mark_dirty(vmcb02, VMCB_DR);
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e1b6a16e97c0..9f3d31a5d231 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1047,7 +1047,8 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
- bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
+ u64 dbgctl_buslock_lbr = DEBUGCTLMSR_BUS_LOCK_DETECT | DEBUGCTLMSR_LBR;
+ bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & dbgctl_buslock_lbr) ||
(is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
@@ -3158,6 +3159,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
if (data & DEBUGCTL_RESERVED_BITS)
return 1;
+ if ((data & DEBUGCTLMSR_BUS_LOCK_DETECT) &&
+ !guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
+ return 1;
+
svm_get_lbr_vmcb(svm)->save.dbgctl = data;
svm_update_lbrv(vcpu);
break;
@@ -5225,8 +5230,14 @@ static __init void svm_set_cpu_caps(void)
/* CPUID 0x8000001F (SME/SEV features) */
sev_set_cpu_caps();
- /* Don't advertise Bus Lock Detect to guest if SVM support is absent */
- kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
+ /*
+ * LBR Virtualization must be enabled to support BusLockTrap inside the
+ * guest, since BusLockTrap is enabled through MSR_IA32_DEBUGCTLMSR and
+ * MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
+ * enabled.
+ */
+ if (!lbrv)
+ kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
}
static __init int svm_hardware_setup(void)
--
2.34.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2024-08-08 6:29 ` [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support Ravi Bangoria
@ 2024-08-17 0:13 ` Sean Christopherson
2024-08-20 16:38 ` Ravi Bangoria
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Sean Christopherson @ 2024-08-17 0:13 UTC (permalink / raw)
To: Ravi Bangoria
Cc: tglx, mingo, bp, dave.hansen, pbonzini, thomas.lendacky, jmattson,
hpa, rmk+kernel, peterz, james.morse, lukas.bulwahn, arjan,
j.granados, sibs, nik.borisov, michael.roth, nikunj.dadhania,
babu.moger, x86, kvm, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das, manali.shukla
On Thu, Aug 08, 2024, Ravi Bangoria wrote:
> Add Bus Lock Detect support to AMD SVM. Bus Lock Detect is enabled through
> MSR_IA32_DEBUGCTLMSR, and MSR_IA32_DEBUGCTLMSR is virtualized only if LBR
> Virtualization is enabled. Add this dependency in SVM.
This doesn't depend on the x86 patches that have gone into tip-tree, correct?
In the future, unless there's an actual dependency in code or functionality,
please send separate series for patches that obviously need to be routed through
different trees.
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/kvm/svm/nested.c | 3 ++-
> arch/x86/kvm/svm/svm.c | 17 ++++++++++++++---
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 6f704c1037e5..1df9158c72c1 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -586,7 +586,8 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
> /* These bits will be set properly on the first execution when new_vmc12 is true */
> if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_DR))) {
> vmcb02->save.dr7 = svm->nested.save.dr7 | DR7_FIXED_1;
> - svm->vcpu.arch.dr6 = svm->nested.save.dr6 | DR6_ACTIVE_LOW;
> + /* DR6_RTM is a reserved bit on AMD and as such must be set to 1 */
> + svm->vcpu.arch.dr6 = svm->nested.save.dr6 | DR6_FIXED_1 | DR6_RTM;
> vmcb_mark_dirty(vmcb02, VMCB_DR);
> }
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index e1b6a16e97c0..9f3d31a5d231 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1047,7 +1047,8 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
> - bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
> + u64 dbgctl_buslock_lbr = DEBUGCTLMSR_BUS_LOCK_DETECT | DEBUGCTLMSR_LBR;
> + bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & dbgctl_buslock_lbr) ||
> (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
> (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
Out of sight, but this leads to calling svm_enable_lbrv() even when the guest
just wants to enable BUS_LOCK_DETECT. Ignoring SEV-ES guests, KVM will intercept
writes to DEBUGCTL, so can't KVM defer mucking with the intercepts and
svm_copy_lbrs() until the guest actually wants to use LBRs?
Hmm, and I think the existing code is broken. If L1 passes DEBUGCTL through to
L2, then KVM will handle writes to L1's effective value. And if L1 also passes
through the LBRs, then KVM will fail to update the MSR bitmaps for vmcb02.
Ah, it's just a performance issue though, because KVM will still emulate RDMSR.
Ugh, this code is silly. The LBR MSRs are read-only, yet KVM passes them through
for write.
Anyways, I'm thinking something like this? Note, using msr_write_intercepted()
is wrong, because that'll check L2's bitmap if is_guest_mode(), and the idea is
to use L1's bitmap as the canary.
static void svm_update_passthrough_lbrs(struct kvm_vcpu *vcpu, bool passthrough)
{
struct vcpu_svm *svm = to_svm(vcpu);
KVM_BUG_ON(!passthrough && sev_es_guest(vcpu->kvm), vcpu->kvm);
if (!msr_write_intercepted(vcpu, MSR_IA32_LASTBRANCHFROMIP) == passthrough)
return;
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP, passthrough, 0);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, passthrough, 0);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, passthrough, 0);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, passthrough, 0);
/*
* When enabling, move the LBR msrs to vmcb02 so that L2 can see them,
* and then move them back to vmcb01 when disabling to avoid copying
* them on nested guest entries.
*/
if (is_guest_mode(vcpu)) {
if (passthrough)
svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
else
svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb);
}
}
void svm_enable_lbrv(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
if (WARN_ON_ONCE(!sev_es_guest(vcpu->kvm)))
return;
svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
svm_update_passthrough_lbrs(vcpu, true);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_DEBUGCTLMSR, 1, 1);
}
static struct vmcb *svm_get_lbr_vmcb(struct vcpu_svm *svm)
{
/*
* If LBR virtualization is disabled, the LBR MSRs are always kept in
* vmcb01. If LBR virtualization is enabled and L1 is running VMs of
* its own, the MSRs are moved between vmcb01 and vmcb02 as needed.
*/
return svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK ? svm->vmcb :
svm->vmcb01.ptr;
}
void svm_update_lbrv(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
u64 guest_debugctl = svm_get_lbr_vmcb(svm)->save.dbgctl;
bool enable_lbrv = (guest_debugctl & DEBUGCTLMSR_LBR) ||
(is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
if (enable_lbrv || (guest_debugctl & DEBUGCTLMSR_BUS_LOCK_DETECT))
svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
else
svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
svm_update_passthrough_lbrs(vcpu, enable_lbrv);
}
> @@ -3158,6 +3159,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
> if (data & DEBUGCTL_RESERVED_BITS)
Not your code, but why does DEBUGCTL_RESERVED_BITS = ~0x3f!?!? That means the
introduction of the below check, which is architecturally correct, has the
potential to break guests. *sigh*
I doubt it will cause a problem, but it's something to look out for.
> return 1;
>
> + if ((data & DEBUGCTLMSR_BUS_LOCK_DETECT) &&
> + !guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
> + return 1;
> +
> svm_get_lbr_vmcb(svm)->save.dbgctl = data;
> svm_update_lbrv(vcpu);
> break;
> @@ -5225,8 +5230,14 @@ static __init void svm_set_cpu_caps(void)
> /* CPUID 0x8000001F (SME/SEV features) */
> sev_set_cpu_caps();
>
> - /* Don't advertise Bus Lock Detect to guest if SVM support is absent */
> - kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
> + /*
> + * LBR Virtualization must be enabled to support BusLockTrap inside the
> + * guest, since BusLockTrap is enabled through MSR_IA32_DEBUGCTLMSR and
> + * MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
> + * enabled.
> + */
> + if (!lbrv)
> + kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
> }
>
> static __init int svm_hardware_setup(void)
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2024-08-17 0:13 ` Sean Christopherson
@ 2024-08-20 16:38 ` Ravi Bangoria
2024-09-05 11:40 ` Ravi Bangoria
2024-08-21 5:36 ` Ravi Bangoria
2025-11-11 10:03 ` Shivansh Dhiman
2 siblings, 1 reply; 15+ messages in thread
From: Ravi Bangoria @ 2024-08-20 16:38 UTC (permalink / raw)
To: Sean Christopherson
Cc: tglx, mingo, bp, dave.hansen, pbonzini, thomas.lendacky, jmattson,
hpa, rmk+kernel, peterz, james.morse, lukas.bulwahn, arjan,
j.granados, sibs, nik.borisov, michael.roth, nikunj.dadhania,
babu.moger, x86, kvm, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das, manali.shukla, Ravi Bangoria
Sean,
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index e1b6a16e97c0..9f3d31a5d231 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -1047,7 +1047,8 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
>> {
>> struct vcpu_svm *svm = to_svm(vcpu);
>> bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
>> - bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
>> + u64 dbgctl_buslock_lbr = DEBUGCTLMSR_BUS_LOCK_DETECT | DEBUGCTLMSR_LBR;
>> + bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & dbgctl_buslock_lbr) ||
>> (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
>> (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
>
> Out of sight, but this leads to calling svm_enable_lbrv() even when the guest
> just wants to enable BUS_LOCK_DETECT. Ignoring SEV-ES guests, KVM will intercept
> writes to DEBUGCTL, so can't KVM defer mucking with the intercepts and
> svm_copy_lbrs() until the guest actually wants to use LBRs?
>
> Hmm, and I think the existing code is broken. If L1 passes DEBUGCTL through to
> L2, then KVM will handle writes to L1's effective value. And if L1 also passes
> through the LBRs, then KVM will fail to update the MSR bitmaps for vmcb02.
>
> Ah, it's just a performance issue though, because KVM will still emulate RDMSR.
>
> Ugh, this code is silly. The LBR MSRs are read-only, yet KVM passes them through
> for write.
>
> Anyways, I'm thinking something like this? Note, using msr_write_intercepted()
> is wrong, because that'll check L2's bitmap if is_guest_mode(), and the idea is
> to use L1's bitmap as the canary.
>
> static void svm_update_passthrough_lbrs(struct kvm_vcpu *vcpu, bool passthrough)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
>
> KVM_BUG_ON(!passthrough && sev_es_guest(vcpu->kvm), vcpu->kvm);
>
> if (!msr_write_intercepted(vcpu, MSR_IA32_LASTBRANCHFROMIP) == passthrough)
> return;
>
> set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP, passthrough, 0);
> set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, passthrough, 0);
> set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, passthrough, 0);
> set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, passthrough, 0);
>
> /*
> * When enabling, move the LBR msrs to vmcb02 so that L2 can see them,
> * and then move them back to vmcb01 when disabling to avoid copying
> * them on nested guest entries.
> */
> if (is_guest_mode(vcpu)) {
> if (passthrough)
> svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
> else
> svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb);
> }
> }
>
> void svm_enable_lbrv(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
>
> if (WARN_ON_ONCE(!sev_es_guest(vcpu->kvm)))
> return;
>
> svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
> svm_update_passthrough_lbrs(vcpu, true);
>
> set_msr_interception(vcpu, svm->msrpm, MSR_IA32_DEBUGCTLMSR, 1, 1);
> }
>
> static struct vmcb *svm_get_lbr_vmcb(struct vcpu_svm *svm)
> {
> /*
> * If LBR virtualization is disabled, the LBR MSRs are always kept in
> * vmcb01. If LBR virtualization is enabled and L1 is running VMs of
> * its own, the MSRs are moved between vmcb01 and vmcb02 as needed.
> */
> return svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK ? svm->vmcb :
> svm->vmcb01.ptr;
> }
>
> void svm_update_lbrv(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> u64 guest_debugctl = svm_get_lbr_vmcb(svm)->save.dbgctl;
> bool enable_lbrv = (guest_debugctl & DEBUGCTLMSR_LBR) ||
> (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
> (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
>
> if (enable_lbrv || (guest_debugctl & DEBUGCTLMSR_BUS_LOCK_DETECT))
> svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
> else
> svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
>
> svm_update_passthrough_lbrs(vcpu, enable_lbrv);
> }
This refactored code looks fine. I did some sanity testing with SVM/SEV/SEV-ES
guests and did not see any issues. I'll respin with the above change included.
Thanks for the feedback,
Ravi
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2024-08-17 0:13 ` Sean Christopherson
2024-08-20 16:38 ` Ravi Bangoria
@ 2024-08-21 5:36 ` Ravi Bangoria
2024-08-21 10:40 ` Ravi Bangoria
2025-11-11 10:03 ` Shivansh Dhiman
2 siblings, 1 reply; 15+ messages in thread
From: Ravi Bangoria @ 2024-08-21 5:36 UTC (permalink / raw)
To: Sean Christopherson
Cc: tglx, mingo, bp, dave.hansen, pbonzini, thomas.lendacky, jmattson,
hpa, rmk+kernel, peterz, james.morse, lukas.bulwahn, arjan,
j.granados, sibs, nik.borisov, michael.roth, nikunj.dadhania,
babu.moger, x86, kvm, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das, manali.shukla, Ravi Bangoria
>> @@ -3158,6 +3159,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>> if (data & DEBUGCTL_RESERVED_BITS)
>
> Not your code, but why does DEBUGCTL_RESERVED_BITS = ~0x3f!?!? That means the
> introduction of the below check, which is architecturally correct, has the
> potential to break guests. *sigh*
>
> I doubt it will cause a problem, but it's something to look out for.
This dates back to 2008: https://git.kernel.org/torvalds/c/24e09cbf480a7
The legacy definition[1] of DEBUGCTL MSR is:
5:2 PB: performance monitor pin control. Read-write. Reset: 0h.
This field does not control any hardware.
1 BTF. Read-write. Reset: 0. 1=Enable branch single step.
0 LBR. Read-write. Reset: 0. 1=Enable last branch record.
[1]: https://bugzilla.kernel.org/attachment.cgi?id=287389
Thanks,
Ravi
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2024-08-21 5:36 ` Ravi Bangoria
@ 2024-08-21 10:40 ` Ravi Bangoria
2024-08-26 18:31 ` Sean Christopherson
0 siblings, 1 reply; 15+ messages in thread
From: Ravi Bangoria @ 2024-08-21 10:40 UTC (permalink / raw)
To: Sean Christopherson
Cc: tglx, mingo, bp, dave.hansen, pbonzini, thomas.lendacky, jmattson,
hpa, rmk+kernel, peterz, james.morse, lukas.bulwahn, arjan,
j.granados, sibs, nik.borisov, michael.roth, nikunj.dadhania,
babu.moger, x86, kvm, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das, manali.shukla, Ravi Bangoria
On 21-Aug-24 11:06 AM, Ravi Bangoria wrote:
>>> @@ -3158,6 +3159,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>>> if (data & DEBUGCTL_RESERVED_BITS)
>>
>> Not your code, but why does DEBUGCTL_RESERVED_BITS = ~0x3f!?!? That means the
>> introduction of the below check, which is architecturally correct, has the
>> potential to break guests. *sigh*
>>
>> I doubt it will cause a problem, but it's something to look out for.
> This dates back to 2008: https://git.kernel.org/torvalds/c/24e09cbf480a7
>
> The legacy definition[1] of DEBUGCTL MSR is:
>
> 5:2 PB: performance monitor pin control. Read-write. Reset: 0h.
> This field does not control any hardware.
> 1 BTF. Read-write. Reset: 0. 1=Enable branch single step.
> 0 LBR. Read-write. Reset: 0. 1=Enable last branch record.
>
> [1]: https://bugzilla.kernel.org/attachment.cgi?id=287389
How about adding a cpu_feature_enabled() check:
if (data & DEBUGCTL_RESERVED_BITS)
return 1;
if (cpu_feature_enabled(X86_FEATURE_BUS_LOCK_DETECT) &&
(data & DEBUGCTLMSR_BUS_LOCK_DETECT) &&
!guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
return 1;
Thanks,
Ravi
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD
2024-08-08 6:29 [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD Ravi Bangoria
` (3 preceding siblings ...)
2024-08-08 6:29 ` [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support Ravi Bangoria
@ 2024-08-23 23:47 ` Sean Christopherson
4 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2024-08-23 23:47 UTC (permalink / raw)
To: Sean Christopherson, tglx, mingo, bp, dave.hansen, pbonzini,
thomas.lendacky, jmattson, Ravi Bangoria
Cc: hpa, rmk+kernel, peterz, james.morse, lukas.bulwahn, arjan,
j.granados, sibs, nik.borisov, michael.roth, nikunj.dadhania,
babu.moger, x86, kvm, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das, manali.shukla
On Thu, 08 Aug 2024 06:29:33 +0000, Ravi Bangoria wrote:
> Add Bus Lock Detect (called Bus Lock Trap in AMD docs) support for AMD
> platforms. Bus Lock Detect is enumerated with CPUID Fn0000_0007_ECX_x0
> bit [24 / BUSLOCKTRAP]. It can be enabled through MSR_IA32_DEBUGCTLMSR.
> When enabled, hardware clears DR6[11] and raises a #DB exception on
> occurrence of Bus Lock if CPL > 0. More detail about the feature can be
> found in AMD APM:
>
> [...]
Applied patch 3 to kvm-x86 fixes, thanks!
[3/4] KVM: SVM: Don't advertise Bus Lock Detect to guest if SVM support is missing
https://github.com/kvm-x86/linux/commit/54950bfe2b69
--
https://github.com/kvm-x86/linux/tree/next
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2024-08-21 10:40 ` Ravi Bangoria
@ 2024-08-26 18:31 ` Sean Christopherson
0 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2024-08-26 18:31 UTC (permalink / raw)
To: Ravi Bangoria
Cc: tglx, mingo, bp, dave.hansen, pbonzini, thomas.lendacky, jmattson,
hpa, rmk+kernel, peterz, james.morse, lukas.bulwahn, arjan,
j.granados, sibs, nik.borisov, michael.roth, nikunj.dadhania,
babu.moger, x86, kvm, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das, manali.shukla
On Wed, Aug 21, 2024, Ravi Bangoria wrote:
> On 21-Aug-24 11:06 AM, Ravi Bangoria wrote:
> >>> @@ -3158,6 +3159,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
> >>> if (data & DEBUGCTL_RESERVED_BITS)
> >>
> >> Not your code, but why does DEBUGCTL_RESERVED_BITS = ~0x3f!?!? That means the
> >> introduction of the below check, which is architecturally correct, has the
> >> potential to break guests. *sigh*
> >>
> >> I doubt it will cause a problem, but it's something to look out for.
> > This dates back to 2008: https://git.kernel.org/torvalds/c/24e09cbf480a7
> >
> > The legacy definition[1] of DEBUGCTL MSR is:
> >
> > 5:2 PB: performance monitor pin control. Read-write. Reset: 0h.
> > This field does not control any hardware.
Uh, what? So the CPU provided 4 bits of scratch space? Or is that saying that
5:2 controlled some perfmon stuff on older CPUs, but that Zen deprecated those
bits?
> > 1 BTF. Read-write. Reset: 0. 1=Enable branch single step.
> > 0 LBR. Read-write. Reset: 0. 1=Enable last branch record.
> >
> > [1]: https://bugzilla.kernel.org/attachment.cgi?id=287389
>
> How about adding cpu_feature_enabled() check:
That doesn't fix anything, KVM will still break, just on a smaller set of CPUs.
The only way to avoid breaking guests is to ignore bits 5:2, though we could
quirk that so that userspace can effectively enable what is now the architectural
behavior.
Though I'm very tempted to just add a prep patch to disallow setting bits 5:2 and
see if anyone complains. If they do, then we can add a quirk. And if no one
complains, yay.
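A minimal sketch of what such a prep patch might look like, assuming
DEBUGCTL_RESERVED_BITS is still the ~0x3f value quoted above (illustrative
only, not part of the posted series): only LBR and BTF stay writable, and the
Bus Lock Detect patch would then accept DEBUGCTLMSR_BUS_LOCK_DETECT
separately, gated on the guest's CPUID.

-#define DEBUGCTL_RESERVED_BITS (~(0x3fULL))
+#define DEBUGCTL_RESERVED_BITS (~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF))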
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2024-08-20 16:38 ` Ravi Bangoria
@ 2024-09-05 11:40 ` Ravi Bangoria
0 siblings, 0 replies; 15+ messages in thread
From: Ravi Bangoria @ 2024-09-05 11:40 UTC (permalink / raw)
To: Sean Christopherson
Cc: tglx, mingo, bp, dave.hansen, pbonzini, thomas.lendacky, jmattson,
hpa, rmk+kernel, peterz, james.morse, lukas.bulwahn, arjan,
j.granados, sibs, nik.borisov, michael.roth, nikunj.dadhania,
babu.moger, x86, kvm, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das, manali.shukla, Ravi Bangoria
On 20-Aug-24 10:08 PM, Ravi Bangoria wrote:
> Sean,
>
>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>> index e1b6a16e97c0..9f3d31a5d231 100644
>>> --- a/arch/x86/kvm/svm/svm.c
>>> +++ b/arch/x86/kvm/svm/svm.c
>>> @@ -1047,7 +1047,8 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
>>> {
>>> struct vcpu_svm *svm = to_svm(vcpu);
>>> bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
>>> - bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
>>> + u64 dbgctl_buslock_lbr = DEBUGCTLMSR_BUS_LOCK_DETECT | DEBUGCTLMSR_LBR;
>>> + bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & dbgctl_buslock_lbr) ||
>>> (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
>>> (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
>>
>> Out of sight, but this leads to calling svm_enable_lbrv() even when the guest
>> just wants to enable BUS_LOCK_DETECT. Ignoring SEV-ES guests, KVM will intercept
>> writes to DEBUGCTL, so can't KVM defer mucking with the intercepts and
>> svm_copy_lbrs() until the guest actually wants to use LBRs?
>>
>> Hmm, and I think the existing code is broken. If L1 passes DEBUGCTL through to
>> L2, then KVM will handle writes to L1's effective value. And if L1 also passes
>> through the LBRs, then KVM will fail to update the MSR bitmaps for vmcb02.
>>
>> Ah, it's just a performance issue though, because KVM will still emulate RDMSR.
>>
>> Ugh, this code is silly. The LBR MSRs are read-only, yet KVM passes them through
>> for write.
>>
>> Anyways, I'm thinking something like this? Note, using msr_write_intercepted()
>> is wrong, because that'll check L2's bitmap if is_guest_mode(), and the idea is
>> to use L1's bitmap as the canary.
>>
>> static void svm_update_passthrough_lbrs(struct kvm_vcpu *vcpu, bool passthrough)
>> {
>> 	struct vcpu_svm *svm = to_svm(vcpu);
>>
>> 	KVM_BUG_ON(!passthrough && sev_es_guest(vcpu->kvm), vcpu->kvm);
>>
>> 	if (!msr_write_intercepted(vcpu, MSR_IA32_LASTBRANCHFROMIP) == passthrough)
>> 		return;
>>
>> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP, passthrough, 0);
>> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, passthrough, 0);
>> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, passthrough, 0);
>> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, passthrough, 0);
>>
>> 	/*
>> 	 * When enabling, move the LBR msrs to vmcb02 so that L2 can see them,
>> 	 * and then move them back to vmcb01 when disabling to avoid copying
>> 	 * them on nested guest entries.
>> 	 */
>> 	if (is_guest_mode(vcpu)) {
>> 		if (passthrough)
>> 			svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
>> 		else
>> 			svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb);
>> 	}
>> }
>>
>> void svm_enable_lbrv(struct kvm_vcpu *vcpu)
>> {
>> 	struct vcpu_svm *svm = to_svm(vcpu);
>>
>> 	if (WARN_ON_ONCE(!sev_es_guest(vcpu->kvm)))
>> 		return;
>>
>> 	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
>> 	svm_update_passthrough_lbrs(vcpu, true);
>>
>> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_DEBUGCTLMSR, 1, 1);
>> }
>>
>> static struct vmcb *svm_get_lbr_vmcb(struct vcpu_svm *svm)
>> {
>> 	/*
>> 	 * If LBR virtualization is disabled, the LBR MSRs are always kept in
>> 	 * vmcb01. If LBR virtualization is enabled and L1 is running VMs of
>> 	 * its own, the MSRs are moved between vmcb01 and vmcb02 as needed.
>> 	 */
>> 	return svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK ? svm->vmcb :
>> 								   svm->vmcb01.ptr;
>> }
>>
>> void svm_update_lbrv(struct kvm_vcpu *vcpu)
>> {
>> 	struct vcpu_svm *svm = to_svm(vcpu);
>> 	u64 guest_debugctl = svm_get_lbr_vmcb(svm)->save.dbgctl;
>> 	bool enable_lbrv = (guest_debugctl & DEBUGCTLMSR_LBR) ||
>> 			   (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
>> 			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
>>
>> 	if (enable_lbrv || (guest_debugctl & DEBUGCTLMSR_BUS_LOCK_DETECT))
>> 		svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
>> 	else
>> 		svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
>>
>> 	svm_update_passthrough_lbrs(vcpu, enable_lbrv);
>> }
>
> This refactored code looks fine. I did some sanity testing with SVM/SEV/SEV-ES
> guests and am not seeing any issues. I'll respin with the above change included.
I realised that the KUT LBR tests were failing with this change, and I had
to do the following to fix them:
---
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 0b807099cb19..3dd737db85ef 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -795,6 +795,21 @@ static bool valid_msr_intercept(u32 index)
return direct_access_msr_slot(index) != -ENOENT;
}
+static bool msr_read_intercepted_msrpm(u32 *msrpm, u32 msr)
+{
+ unsigned long tmp;
+ u8 bit_read;
+ u32 offset;
+
+ offset = svm_msrpm_offset(msr);
+ bit_read = 2 * (msr & 0x0f);
+ tmp = msrpm[offset];
+
+ BUG_ON(offset == MSR_INVALID);
+
+ return test_bit(bit_read, &tmp);
+}
+
static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
{
u8 bit_write;
@@ -1000,7 +1015,7 @@ static void svm_update_passthrough_lbrs(struct kvm_vcpu *vcpu, bool passthrough)
KVM_BUG_ON(!passthrough && sev_es_guest(vcpu->kvm), vcpu->kvm);
- if (!msr_write_intercepted(vcpu, MSR_IA32_LASTBRANCHFROMIP) == passthrough)
+ if (!msr_read_intercepted_msrpm(svm->msrpm, MSR_IA32_LASTBRANCHFROMIP) == passthrough)
return;
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP, passthrough, 0);
---
I've added a new API for read interception since the LBR register writes are
always intercepted.
Does this look good?
Thanks,
Ravi
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2024-08-17 0:13 ` Sean Christopherson
2024-08-20 16:38 ` Ravi Bangoria
2024-08-21 5:36 ` Ravi Bangoria
@ 2025-11-11 10:03 ` Shivansh Dhiman
2025-11-11 15:15 ` Sean Christopherson
2 siblings, 1 reply; 15+ messages in thread
From: Shivansh Dhiman @ 2025-11-11 10:03 UTC (permalink / raw)
To: Sean Christopherson, Ravi Bangoria
Cc: tglx, mingo, bp, dave.hansen, pbonzini, thomas.lendacky, jmattson,
hpa, rmk+kernel, peterz, james.morse, lukas.bulwahn, arjan,
j.granados, sibs, nik.borisov, michael.roth, nikunj.dadhania,
babu.moger, x86, kvm, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das, manali.shukla, yosry.ahmed,
Shivansh Dhiman
Hi,
On 17-08-2024 05:43, Sean Christopherson wrote:
> On Thu, Aug 08, 2024, Ravi Bangoria wrote:
>> Add Bus Lock Detect support in AMD SVM. Bus Lock Detect is enabled through
>> MSR_IA32_DEBUGCTLMSR and MSR_IA32_DEBUGCTLMSR is virtualized only if LBR
>> Virtualization is enabled. Add this dependency in the SVM.
>
> This doesn't depend on the x86 patches that have gone into tip-tree, correct?
>
> In the future, unless there's an actual dependency in code or functionality,
> please send separate series for patches that obviously need to be routed through
> different trees.
>
>> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
>> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>> arch/x86/kvm/svm/nested.c | 3 ++-
>> arch/x86/kvm/svm/svm.c | 17 ++++++++++++++---
>> 2 files changed, 16 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
>> index 6f704c1037e5..1df9158c72c1 100644
>> --- a/arch/x86/kvm/svm/nested.c
>> +++ b/arch/x86/kvm/svm/nested.c
>> @@ -586,7 +586,8 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
>> /* These bits will be set properly on the first execution when new_vmc12 is true */
>> if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_DR))) {
>> vmcb02->save.dr7 = svm->nested.save.dr7 | DR7_FIXED_1;
>> - svm->vcpu.arch.dr6 = svm->nested.save.dr6 | DR6_ACTIVE_LOW;
>> + /* DR6_RTM is a reserved bit on AMD and as such must be set to 1 */
>> + svm->vcpu.arch.dr6 = svm->nested.save.dr6 | DR6_FIXED_1 | DR6_RTM;
>> vmcb_mark_dirty(vmcb02, VMCB_DR);
>> }
>>
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index e1b6a16e97c0..9f3d31a5d231 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -1047,7 +1047,8 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
>> {
>> struct vcpu_svm *svm = to_svm(vcpu);
>> bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
>> - bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
>> + u64 dbgctl_buslock_lbr = DEBUGCTLMSR_BUS_LOCK_DETECT | DEBUGCTLMSR_LBR;
>> + bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & dbgctl_buslock_lbr) ||
>> (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
>> (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
>
> Out of sight, but this leads to calling svm_enable_lbrv() even when the guest
> just wants to enable BUS_LOCK_DETECT. Ignoring SEV-ES guests, KVM will intercept
> writes to DEBUGCTL, so can't KVM defer mucking with the intercepts and
> svm_copy_lbrs() until the guest actually wants to use LBRs?
>
> Hmm, and I think the existing code is broken. If L1 passes DEBUGCTL through to
> > L2, then KVM will handle writes to L1's effective value. And if L1 also passes
> through the LBRs, then KVM will fail to update the MSR bitmaps for vmcb02.
>
> Ah, it's just a performance issue though, because KVM will still emulate RDMSR.
>
> Ugh, this code is silly. The LBR MSRs are read-only, yet KVM passes them through
> for write.
>
> Anyways, I'm thinking something like this? Note, using msr_write_intercepted()
> is wrong, because that'll check L2's bitmap if is_guest_mode(), and the idea is
> to use L1's bitmap as the canary.
>
> static void svm_update_passthrough_lbrs(struct kvm_vcpu *vcpu, bool passthrough)
> {
> 	struct vcpu_svm *svm = to_svm(vcpu);
>
> 	KVM_BUG_ON(!passthrough && sev_es_guest(vcpu->kvm), vcpu->kvm);
>
> 	if (!msr_write_intercepted(vcpu, MSR_IA32_LASTBRANCHFROMIP) == passthrough)
> 		return;
>
> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP, passthrough, 0);
> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, passthrough, 0);
> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, passthrough, 0);
> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, passthrough, 0);
>
> 	/*
> 	 * When enabling, move the LBR msrs to vmcb02 so that L2 can see them,
> 	 * and then move them back to vmcb01 when disabling to avoid copying
> 	 * them on nested guest entries.
> 	 */
> 	if (is_guest_mode(vcpu)) {
> 		if (passthrough)
> 			svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
> 		else
> 			svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb);
> 	}
> }
>
> void svm_enable_lbrv(struct kvm_vcpu *vcpu)
> {
> 	struct vcpu_svm *svm = to_svm(vcpu);
>
> 	if (WARN_ON_ONCE(!sev_es_guest(vcpu->kvm)))
> 		return;
>
> 	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
> 	svm_update_passthrough_lbrs(vcpu, true);
>
> 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_DEBUGCTLMSR, 1, 1);
> }
>
> static struct vmcb *svm_get_lbr_vmcb(struct vcpu_svm *svm)
> {
> 	/*
> 	 * If LBR virtualization is disabled, the LBR MSRs are always kept in
> 	 * vmcb01. If LBR virtualization is enabled and L1 is running VMs of
> 	 * its own, the MSRs are moved between vmcb01 and vmcb02 as needed.
> 	 */
> 	return svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK ? svm->vmcb :
> 								   svm->vmcb01.ptr;
> }
>
> void svm_update_lbrv(struct kvm_vcpu *vcpu)
> {
> 	struct vcpu_svm *svm = to_svm(vcpu);
> 	u64 guest_debugctl = svm_get_lbr_vmcb(svm)->save.dbgctl;
> 	bool enable_lbrv = (guest_debugctl & DEBUGCTLMSR_LBR) ||
> 			   (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
> 			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
>
> 	if (enable_lbrv || (guest_debugctl & DEBUGCTLMSR_BUS_LOCK_DETECT))
> 		svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
> 	else
> 		svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
>
> 	svm_update_passthrough_lbrs(vcpu, enable_lbrv);
> }
>
>
>> @@ -3158,6 +3159,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>> if (data & DEBUGCTL_RESERVED_BITS)
>
> Not your code, but why does DEBUGCTL_RESERVED_BITS = ~0x3f!?!? That means the
> introduction of the below check, which is architecturally correct, has the
> potential to break guests. *sigh*
>
> I doubt it will cause a problem, but it's something to look out for.
>
>> return 1;
>>
>> + if ((data & DEBUGCTLMSR_BUS_LOCK_DETECT) &&
>> + !guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
>> + return 1;
>> +
>> svm_get_lbr_vmcb(svm)->save.dbgctl = data;
>> svm_update_lbrv(vcpu);
>> break;
>> @@ -5225,8 +5230,14 @@ static __init void svm_set_cpu_caps(void)
>> /* CPUID 0x8000001F (SME/SEV features) */
>> sev_set_cpu_caps();
>>
>> - /* Don't advertise Bus Lock Detect to guest if SVM support is absent */
>> - kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
>> + /*
>> + * LBR Virtualization must be enabled to support BusLockTrap inside the
>> + * guest, since BusLockTrap is enabled through MSR_IA32_DEBUGCTLMSR and
>> + * MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
>> + * enabled.
>> + */
>> + if (!lbrv)
>> + kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
>> }
>>
>> static __init int svm_hardware_setup(void)
>> --
>> 2.34.1
>>
Thanks Sean for the refactored code. I ported your implementation to the 6.16
kernel and did some testing with KVM unit tests. After instrumentation I found
a couple of issues and potential fixes.
===========================================================
Issue 1: Interception still enabled after enabling LBRV
===========================================================
Using the 6.16 upstream kernel (unpatched) I ran the KUT tests and they passed
when run both from bare metal and from inside an L1 guest. However, for the L2
guest, when looking at the logs I found that RDMSR interception of LBR MSRs is
still enabled even though LBRV is enabled for the L2 guest. Effectively, the
reads are emulated instead of being virtualized, which is not the intended
behaviour. KUT cannot distinguish between emulated and virtualized RDMSR, and
hence the test passes regardless.
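As a minimal sketch of the kind of check behind that observation (hypothetical
helper, not from any posted patch; it assumes the 6.16-era u32 *msrpm layout
and the same offset/bit arithmetic as the msr_read_intercepted_msrpm() helper
posted earlier in the thread):

static bool lbr_read_intercepted(u32 *msrpm, u32 msr)
{
	unsigned long tmp = msrpm[svm_msrpm_offset(msr)];

	/* Each MSR has two bits in the bitmap: read intercept, then write. */
	return test_bit(2 * (msr & 0x0f), &tmp);
}

	/* e.g. logged from svm_update_lbrv() while L2 is active: */
	if (is_guest_mode(vcpu))
		pr_info("L2 LBR from-IP read intercepted: %d\n",
			lbr_read_intercepted(svm->nested.msrpm, MSR_IA32_LASTBRANCHFROMIP));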
===========================================================
Issue 2: Basic LBR KUT fails with Sean's implementation
===========================================================
After using your implementation, all KUTs passed on the bare metal. With LBRV
enabled for L2, RDMSR interception of LBR MSRs is disabled as intended.
However, when running KUT tests inside an L1 guest, the tests fail.
After digging deeper, I found that when L2 attempts to enable LBR via DBGCTL,
L1's LBRs are copied over L2's LBR MSRs from svm_update_passthrough_lbrs().
	/*
	 * When enabling, move the LBR msrs to vmcb02 so that L2 can see them,
	 * and then move them back to vmcb01 when disabling to avoid copying
	 * them on nested guest entries.
	 */
	if (is_guest_mode(vcpu)) {
		if (passthrough)
			svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr); <---- copy happens here
		else
			svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb);
	}
This results in L2's DBGCTL and other LBR MSRs being overwritten by L1's
values. So if L1 has DBGCTL.LBR disabled, L2 won't be able to turn on LBR,
which makes the KUT test fail.
L2> wrmsr(MSR_IA32_DEBUGCTLMSR, DEBUGCTLMSR_LBR);
L2> DO_BRANCH(host_branch0);
L2> dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
L2> wrmsr(MSR_IA32_DEBUGCTLMSR, 0);
L2> TEST_EXPECT_EQ(dbgctl, DEBUGCTLMSR_LBR); <---- This check fails since dbgctl is still 0
Line-by-line code analysis
--------------------------
KVM> # start L1 guest
<L1> # start KUT test svm_lbrv_test0
<L2> wrmsr(MSR_IA32_DEBUGCTLMSR, DEBUGCTLMSR_LBR);
KVM> # inject wrmsr into L1
<L1> vmcb01->control.virt_ext = 1; <---- L1's vmcb01 is vmcb12 wrt KVM
<L1> # turn off intercepts
<L1> vmrun
KVM> vmcb02->control.virt_ext = 1;
KVM> svm_copy_lbrs(vmcb02, vmcb12); <---- Correct value loaded here
KVM> nested.ctl.virt_ext = 1;
KVM> # turn off intercepts
KVM> svm_copy_lbrs(vmcb02, vmcb01); <---- This overwrites the value
KVM> # start L2 guest
<L2> DO_BRANCH(host_branch0);
<L2> dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
...
<L2> wrmsr(MSR_IA32_DEBUGCTLMSR, 0);
...
<L2> TEST_EXPECT_EQ(dbgctl, DEBUGCTLMSR_LBR); <---- This check fails
===========================================================
Potential Solution
===========================================================
A potential fix is to copy the LBRs only when L1 has DBGCTL.LBR enabled.
This prevents the overwrite. I successfully tested it by running KUT at all
levels, viz. bare metal, L1 and L2. Please share your thoughts on this.
	if (is_guest_mode(vcpu) && svm->vmcb01.ptr->save.dbgctl & DEBUGCTLMSR_LBR) {
		if (passthrough)
			svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
		else
			svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb);
	}
If the patch below looks good, I'll split it and send it as a series.
Additionally, I added an MSR read interception API similar to the one
implemented by Ravi.
---
arch/x86/kvm/svm/svm.c | 66 ++++++++++++++++++++++++++++--------------
1 file changed, 44 insertions(+), 22 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d9931c6c4bc6..f0f77199ec12 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -663,6 +663,11 @@ static void clr_dr_intercepts(struct vcpu_svm *svm)
recalc_intercepts(svm);
}
+static bool msr_read_intercepted_msrpm(void *msrpm, u32 msr)
+{
+ return svm_test_msr_bitmap_read(msrpm, msr);
+}
+
static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
{
/*
@@ -864,32 +869,49 @@ void svm_copy_lbrs(struct vmcb *to_vmcb, struct vmcb *from_vmcb)
vmcb_mark_dirty(to_vmcb, VMCB_LBR);
}
-void svm_enable_lbrv(struct kvm_vcpu *vcpu)
+static void svm_update_passthrough_lbrs(struct kvm_vcpu *vcpu, bool passthrough)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ bool to_intercept = !passthrough;
- svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
- svm_recalc_lbr_msr_intercepts(vcpu);
+ KVM_BUG_ON(to_intercept && sev_es_guest(vcpu->kvm), vcpu->kvm);
- /* Move the LBR msrs to the vmcb02 so that the guest can see them. */
- if (is_guest_mode(vcpu))
- svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
+ if (msr_read_intercepted_msrpm(svm->msrpm, MSR_IA32_LASTBRANCHFROMIP) == to_intercept)
+ return;
+
+ svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHFROMIP, MSR_TYPE_R, to_intercept);
+ svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHTOIP, MSR_TYPE_R, to_intercept);
+ svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTINTFROMIP, MSR_TYPE_R, to_intercept);
+ svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTINTTOIP, MSR_TYPE_R, to_intercept);
+
+ /*
+ * When enabling, move the LBR msrs to vmcb02 so that L2 can see them,
+ * and then move them back to vmcb01 when disabling to avoid copying
+ * them on nested guest entries.
+ *
+ * Perform this only when L1 has enabled LBR to prevent the overwrite.
+ */
+ if (is_guest_mode(vcpu) && svm->vmcb01.ptr->save.dbgctl & DEBUGCTLMSR_LBR) {
+ if (passthrough)
+ svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
+ else
+ svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb);
+ }
}
-static void svm_disable_lbrv(struct kvm_vcpu *vcpu)
+void svm_enable_lbrv(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
- KVM_BUG_ON(sev_es_guest(vcpu->kvm), vcpu->kvm);
- svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
- svm_recalc_lbr_msr_intercepts(vcpu);
+ /* Allow the function call from SEV-ES guests only */
+ if (WARN_ON_ONCE(!sev_es_guest(vcpu->kvm)))
+ return;
- /*
- * Move the LBR msrs back to the vmcb01 to avoid copying them
- * on nested guest entries.
- */
- if (is_guest_mode(vcpu))
- svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb);
+ svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
+
+ svm_update_passthrough_lbrs(vcpu, true);
+
+ svm_set_intercept_for_msr(vcpu, MSR_IA32_DEBUGCTLMSR, MSR_TYPE_RW, 0);
}
static struct vmcb *svm_get_lbr_vmcb(struct vcpu_svm *svm)
@@ -906,18 +928,18 @@ static struct vmcb *svm_get_lbr_vmcb(struct vcpu_svm *svm)
void svm_update_lbrv(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
- bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
- bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
+ u64 guest_debugctl = svm_get_lbr_vmcb(svm)->save.dbgctl;
+ bool enable_lbrv = (guest_debugctl & DEBUGCTLMSR_LBR) ||
(is_guest_mode(vcpu) && guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
- if (enable_lbrv == current_enable_lbrv)
- return;
if (enable_lbrv)
- svm_enable_lbrv(vcpu);
+ svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
else
- svm_disable_lbrv(vcpu);
+ svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
+
+ svm_update_passthrough_lbrs(vcpu, enable_lbrv);
}
void disable_nmi_singlestep(struct vcpu_svm *svm)
--
2.43.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2025-11-11 10:03 ` Shivansh Dhiman
@ 2025-11-11 15:15 ` Sean Christopherson
2025-11-14 8:50 ` Shivansh Dhiman
0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2025-11-11 15:15 UTC (permalink / raw)
To: Shivansh Dhiman
Cc: Ravi Bangoria, tglx, mingo, bp, dave.hansen, pbonzini,
thomas.lendacky, jmattson, hpa, rmk+kernel, peterz, james.morse,
lukas.bulwahn, arjan, j.granados, sibs, nik.borisov, michael.roth,
nikunj.dadhania, babu.moger, x86, kvm, linux-kernel,
santosh.shukla, ananth.narayan, sandipan.das, manali.shukla,
yosry.ahmed
On Tue, Nov 11, 2025, Shivansh Dhiman wrote:
> On 17-08-2024 05:43, Sean Christopherson wrote:
> > On Thu, Aug 08, 2024, Ravi Bangoria wrote:
> >> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> >> index e1b6a16e97c0..9f3d31a5d231 100644
> >> --- a/arch/x86/kvm/svm/svm.c
> >> +++ b/arch/x86/kvm/svm/svm.c
> >> @@ -1047,7 +1047,8 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
> >> {
> >> struct vcpu_svm *svm = to_svm(vcpu);
> >> bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
> >> - bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
> >> + u64 dbgctl_buslock_lbr = DEBUGCTLMSR_BUS_LOCK_DETECT | DEBUGCTLMSR_LBR;
> >> + bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & dbgctl_buslock_lbr) ||
> >> (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
> >> (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
> >
> > Out of sight, but this leads to calling svm_enable_lbrv() even when the guest
> > just wants to enable BUS_LOCK_DETECT. Ignoring SEV-ES guests, KVM will intercept
> > writes to DEBUGCTL, so can't KVM defer mucking with the intercepts and
> > svm_copy_lbrs() until the guest actually wants to use LBRs?
> >
> > Hmm, and I think the existing code is broken. If L1 passes DEBUGCTL through to
> > L2, then KVM will handle writes to L1's effective value. And if L1 also passes
> > through the LBRs, then KVM will fail to update the MSR bitmaps for vmcb02.
> >
> > Ah, it's just a performance issue though, because KVM will still emulate RDMSR.
> >
> > Ugh, this code is silly. The LBR MSRs are read-only, yet KVM passes them through
> > for write.
> >
> > Anyways, I'm thinking something like this? Note, using msr_write_intercepted()
> > is wrong, because that'll check L2's bitmap if is_guest_mode(), and the idea is
> > to use L1's bitmap as the canary.
...
> ===========================================================
> Issue 1: Interception still enabled after enabling LBRV
> ===========================================================
> Using the 6.16 upstream kernel (unpatched) I ran the KUT tests and they passed
> when run both from bare metal and from inside an L1 guest. However, for the L2
> guest, when looking at the logs I found that RDMSR interception of LBR MSRs is
> still enabled even though LBRV is enabled for the L2 guest. Effectively, the
> reads are emulated instead of being virtualized, which is not the intended
> behaviour. KUT cannot distinguish between emulated and virtualized RDMSR, and
> hence the test passes regardless.
I haven't looked closely at your patch or at Yosry's patches, but I suspect this
was _just_ fixed:
https://lore.kernel.org/all/20251108004524.1600006-1-yosry.ahmed@linux.dev
> ===========================================================
> Issue 2: Basic LBR KUT fails with Sean's implementation
> ===========================================================
> After using your implementation, all KUTs passed on the bare metal. With LBRV
> enabled for L2, RDMSR interception of LBR MSRs is disabled as intended.
> However, when running KUT tests inside an L1 guest, the tests fail.
Same story here: I haven't had cycles to actually look at code, but Yosry also
posted a pile of changes for KUT:
https://lore.kernel.org/all/20251110232642.633672-1-yosry.ahmed@linux.dev
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support
2025-11-11 15:15 ` Sean Christopherson
@ 2025-11-14 8:50 ` Shivansh Dhiman
0 siblings, 0 replies; 15+ messages in thread
From: Shivansh Dhiman @ 2025-11-14 8:50 UTC (permalink / raw)
To: Sean Christopherson
Cc: Ravi Bangoria, tglx, mingo, bp, dave.hansen, pbonzini,
thomas.lendacky, jmattson, hpa, rmk+kernel, peterz, james.morse,
lukas.bulwahn, arjan, j.granados, sibs, nik.borisov, michael.roth,
nikunj.dadhania, babu.moger, x86, kvm, linux-kernel,
santosh.shukla, ananth.narayan, sandipan.das, manali.shukla,
yosry.ahmed
On 11-11-2025 20:45, Sean Christopherson wrote:
> On Tue, Nov 11, 2025, Shivansh Dhiman wrote:
>> On 17-08-2024 05:43, Sean Christopherson wrote:
>>> On Thu, Aug 08, 2024, Ravi Bangoria wrote:
...
>
>> ===========================================================
>> Issue 1: Interception still enabled after enabling LBRV
>> ===========================================================
>> Using the 6.16 upstream kernel (unpatched) I ran the KUT tests and they passed
>> when run both from bare metal and from inside an L1 guest. However, for the L2
>> guest, when looking at the logs I found that RDMSR interception of LBR MSRs is
>> still enabled even though LBRV is enabled for the L2 guest. Effectively, the
>> reads are emulated instead of being virtualized, which is not the intended
>> behaviour. KUT cannot distinguish between emulated and virtualized RDMSR, and
>> hence the test passes regardless.
>
> I haven't looked closely at your patch or at Yosry's patches, but I suspect this
> was _just_ fixed:
>
> https://lore.kernel.org/all/20251108004524.1600006-1-yosry.ahmed@linux.dev
Thanks Sean. I tested Yosry's patches and they indeed have solved this issue.
>
>> ===========================================================
>> Issue 2: Basic LBR KUT fails with Sean's implementation
>> ===========================================================
>> After using your implementation, all KUTs passed on the bare metal. With LBRV
>> enabled for L2, RDMSR interception of LBR MSRs is disabled as intended.
>> However, when running KUT tests inside an L1 guest, the tests fail.
>
> Same story here: I haven't had cycles to actually look at code, but Yosry also
> posted a pile of changes for KUT:
>
> https://lore.kernel.org/all/20251110232642.633672-1-yosry.ahmed@linux.dev
This issue was also related to the buggy LBRV handling in the kernel. Yosry's
patches have fixed this issue as well, and I've verified it. There was some
flakiness in the KUT which was later fixed by another patch from Yosry. [1]
Thanks,
Shivansh
1. https://lore.kernel.org/all/20251113224639.2916783-1-yosry.ahmed@linux.dev/
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread
Thread overview: 15+ messages
2024-08-08 6:29 [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 1/4] x86/split_lock: Move Split and Bus lock code to a dedicated file Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 2/4] x86/bus_lock: Add support for AMD Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 3/4] KVM: SVM: Don't advertise Bus Lock Detect to guest if SVM support is missing Ravi Bangoria
2024-08-08 6:29 ` [PATCH v4 4/4] KVM: SVM: Add Bus Lock Detect support Ravi Bangoria
2024-08-17 0:13 ` Sean Christopherson
2024-08-20 16:38 ` Ravi Bangoria
2024-09-05 11:40 ` Ravi Bangoria
2024-08-21 5:36 ` Ravi Bangoria
2024-08-21 10:40 ` Ravi Bangoria
2024-08-26 18:31 ` Sean Christopherson
2025-11-11 10:03 ` Shivansh Dhiman
2025-11-11 15:15 ` Sean Christopherson
2025-11-14 8:50 ` Shivansh Dhiman
2024-08-23 23:47 ` [PATCH v4 0/4] x86/cpu: Add Bus Lock Detect support for AMD Sean Christopherson