* [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support
2024-12-02 12:04 [PATCH v2 0/4] x86/bugs: Adjust SRSO mitigation to new features Borislav Petkov
@ 2024-12-02 12:04 ` Borislav Petkov
2024-12-10 6:53 ` Josh Poimboeuf
2024-12-30 17:02 ` [tip: x86/bugs] " tip-bot2 for Borislav Petkov (AMD)
2024-12-02 12:04 ` [PATCH v2 2/4] KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace Borislav Petkov
` (4 subsequent siblings)
5 siblings, 2 replies; 68+ messages in thread
From: Borislav Petkov @ 2024-12-02 12:04 UTC (permalink / raw)
To: Sean Christopherson, X86 ML
Cc: Paolo Bonzini, Josh Poimboeuf, Pawan Gupta, KVM, LKML,
Borislav Petkov (AMD)
From: "Borislav Petkov (AMD)" <bp@alien8.de>
If the machine has:
CPUID Fn8000_0021_EAX[30] (SRSO_USER_KERNEL_NO) -- If this bit is 1,
it indicates the CPU is not subject to the SRSO vulnerability across
user/kernel boundaries.
have it fall back to IBPB on VMEXIT only, in the case it is going to run
VMs:
Speculative Return Stack Overflow: CPU user/kernel transitions protected, falling back to IBPB-on-VMEXIT
Speculative Return Stack Overflow: Mitigation: IBPB on VMEXIT only
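For the curious, the bit can be eyeballed from userspace too; an illustrative
cpuid check (not part of the patch) could look like this:

  #include <cpuid.h>
  #include <stdio.h>

  int main(void)
  {
          unsigned int eax, ebx, ecx, edx;

          /* CPUID Fn8000_0021_EAX[30] == SRSO_USER_KERNEL_NO */
          if (!__get_cpuid_count(0x80000021, 0, &eax, &ebx, &ecx, &edx))
                  return 1;

          printf("SRSO_USER_KERNEL_NO: %u\n", (eax >> 30) & 1);
          return 0;
  }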
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/bugs.c | 6 ++++++
arch/x86/kernel/cpu/common.c | 1 +
3 files changed, 8 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 17b6590748c0..2787227a8b42 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -464,6 +464,7 @@
#define X86_FEATURE_SBPB (20*32+27) /* Selective Branch Prediction Barrier */
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
+#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..8854d9bce2a5 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2615,6 +2615,11 @@ static void __init srso_select_mitigation(void)
break;
case SRSO_CMD_SAFE_RET:
+ if (boot_cpu_has(X86_FEATURE_SRSO_USER_KERNEL_NO)) {
+ pr_notice("CPU user/kernel transitions protected, falling back to IBPB-on-VMEXIT\n");
+ goto ibpb_on_vmexit;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
/*
* Enable the return thunk for generated code
@@ -2658,6 +2663,7 @@ static void __init srso_select_mitigation(void)
}
break;
+ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a5c28975c608..954f9c727f11 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1270,6 +1270,7 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO),
VULNBL_AMD(0x19, SRSO),
+ VULNBL_AMD(0x1a, SRSO),
{}
};
--
2.43.0
* Re: [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support
2024-12-02 12:04 ` [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support Borislav Petkov
@ 2024-12-10 6:53 ` Josh Poimboeuf
2024-12-10 15:37 ` Borislav Petkov
2024-12-30 17:02 ` [tip: x86/bugs] " tip-bot2 for Borislav Petkov (AMD)
1 sibling, 1 reply; 68+ messages in thread
From: Josh Poimboeuf @ 2024-12-10 6:53 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, X86 ML, Paolo Bonzini, Pawan Gupta, KVM,
LKML, Borislav Petkov (AMD)
On Mon, Dec 02, 2024 at 01:04:13PM +0100, Borislav Petkov wrote:
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -2615,6 +2615,11 @@ static void __init srso_select_mitigation(void)
> break;
>
> case SRSO_CMD_SAFE_RET:
> + if (boot_cpu_has(X86_FEATURE_SRSO_USER_KERNEL_NO)) {
> + pr_notice("CPU user/kernel transitions protected, falling back to IBPB-on-VMEXIT\n");
> + goto ibpb_on_vmexit;
The presence of SRSO_USER_KERNEL_NO should indeed change the default,
but if the user requests "safe_ret" specifically, shouldn't we give it
to them? That would be more consistent with how we handle requested
mitigations.
Also it doesn't really make sense to add a printk here as the mitigation
will be printed at the end of the function.
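For reference, the common exit path of the function already prints the chosen
mitigation, something like (from memory, modulo the exact label):

  out:
          pr_info("%s\n", srso_strings[srso_mitigation]);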
--
Josh
* Re: [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support
2024-12-10 6:53 ` Josh Poimboeuf
@ 2024-12-10 15:37 ` Borislav Petkov
2024-12-11 7:53 ` Josh Poimboeuf
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2024-12-10 15:37 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Borislav Petkov, Sean Christopherson, X86 ML, Paolo Bonzini,
Pawan Gupta, KVM, LKML
On Mon, Dec 09, 2024 at 10:53:31PM -0800, Josh Poimboeuf wrote:
> The presence of SRSO_USER_KERNEL_NO should indeed change the default,
> but if the user requests "safe_ret" specifically, shouldn't we give it
> to them?
Hardly a valid use case except for debugging, but if you're doing that, you
can change the kernel too.
Because we just fixed this, and if some poor soul has left
spec_rstack_overflow=safe-ret in her /etc/default/grub (it has happened to me
a bunch of times on my test boxes), she'll never get her performance back, and
that would be a yucky situation.
> That would be more consistent with how we handle requested
> mitigations.
Yeah, it would be, if there were a point to this. We don't really need this
mitigation as there's nothing to mitigate on the user/kernel transition
anymore.
> Also it doesn't really make sense to add a printk here as the mitigation
> will be printed at the end of the function.
This is us letting the user know that we don't need Safe-RET anymore and we're
falling back. But I'm not that hung up on that printk...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support
2024-12-10 15:37 ` Borislav Petkov
@ 2024-12-11 7:53 ` Josh Poimboeuf
2024-12-11 20:38 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Josh Poimboeuf @ 2024-12-11 7:53 UTC (permalink / raw)
To: Borislav Petkov
Cc: Borislav Petkov, Sean Christopherson, X86 ML, Paolo Bonzini,
Pawan Gupta, KVM, LKML
On Tue, Dec 10, 2024 at 04:37:10PM +0100, Borislav Petkov wrote:
> > Also it doesn't really make sense to add a printk here as the mitigation
> > will be printed at the end of the function.
>
> This is us letting the user know that we don't need Safe-RET anymore and we're
> falling back. But I'm not that hung up on that printk...
The printk makes sense when it's actually a fallback from
"spec_rstack_overflow=safe-ret", but if nothing was specified on the
cmdline, it's the default rather than a fallback. In which case I think
the printk would be confusing.
--
Josh
* Re: [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support
2024-12-11 7:53 ` Josh Poimboeuf
@ 2024-12-11 20:38 ` Borislav Petkov
2024-12-11 22:35 ` Sean Christopherson
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2024-12-11 20:38 UTC (permalink / raw)
To: Josh Poimboeuf, Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Pawan Gupta, KVM, LKML
On Tue, Dec 10, 2024 at 11:53:15PM -0800, Josh Poimboeuf wrote:
> The printk makes sense when it's actually a fallback from
> "spec_rstack_overflow=safe-ret"
Well, it kinda is as safe-ret is the default and we're falling back from it...
> but if nothing was specified on the cmdline, it's the default rather than
> a fallback. In which case I think the printk would be confusing.
... but as I said, I'm not hung on that printk - zapped it is.
Btw, Sean, how should we merge this?
Should I take it all through tip and give you an immutable branch?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support
2024-12-11 20:38 ` Borislav Petkov
@ 2024-12-11 22:35 ` Sean Christopherson
2024-12-16 17:21 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2024-12-11 22:35 UTC (permalink / raw)
To: Borislav Petkov
Cc: Josh Poimboeuf, Borislav Petkov, X86 ML, Paolo Bonzini,
Pawan Gupta, KVM, LKML
On Wed, Dec 11, 2024, Borislav Petkov wrote:
> Btw, Sean, how should we merge this?
>
> Should I take it all through tip and give you an immutable branch?
Hmm, that should work. I don't anticipate any conflicts other than patch 2
(Advertise SRSO_USER_KERNEL_NO to userspace), which is amusingly the most trivial
patch.
Patch 2 is going to conflict with the CPUID/cpu_caps rework[*], but the conflict
won't be hard to resolve, and I'm pretty sure that if I merge in your branch after
applying the rework, the merge commit will show an "obviously correct" resolution.
Or if I screw it up, an obviously wrong resolution :-)
Alternatively, take 1, 3, and 4 through tip, and 2 through my tree, but that
seems unnecessarily convoluted.
[*] https://lore.kernel.org/all/20241128013424.4096668-40-seanjc@google.com
* Re: [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support
2024-12-11 22:35 ` Sean Christopherson
@ 2024-12-16 17:21 ` Borislav Petkov
0 siblings, 0 replies; 68+ messages in thread
From: Borislav Petkov @ 2024-12-16 17:21 UTC (permalink / raw)
To: Sean Christopherson
Cc: Josh Poimboeuf, Borislav Petkov, X86 ML, Paolo Bonzini,
Pawan Gupta, KVM, LKML
On Wed, Dec 11, 2024 at 02:35:39PM -0800, Sean Christopherson wrote:
> On Wed, Dec 11, 2024, Borislav Petkov wrote:
> > Btw, Sean, how should we merge this?
> >
> > Should I take it all through tip and give you an immutable branch?
>
> Hmm, that should work. I don't anticipate any conflicts other than patch 2
> (Advertise SRSO_USER_KERNEL_NO to userspace), which is amusingly the most trivial
> patch.
>
> Patch 2 is going to conflict with the CPUID/cpu_caps rework[*], but the conflict
> won't be hard to resolve, and I'm pretty sure that if I merge in your branch after
> applying the rework, the merge commit will show an "obviously correct" resolution.
> Or if I screw it up, an obviously wrong resolution :-)
>
> Alternatively, take 1, 3, and 4 through tip, and 2 through my tree, but that
> seems unnecessarily convoluted.
Ok, lemme queue them all through tip and we'll see what conflicts we encounter
along the way and then sync again.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* [tip: x86/bugs] x86/bugs: Add SRSO_USER_KERNEL_NO support
2024-12-02 12:04 ` [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support Borislav Petkov
2024-12-10 6:53 ` Josh Poimboeuf
@ 2024-12-30 17:02 ` tip-bot2 for Borislav Petkov (AMD)
1 sibling, 0 replies; 68+ messages in thread
From: tip-bot2 for Borislav Petkov (AMD) @ 2024-12-30 17:02 UTC (permalink / raw)
To: linux-tip-commits
Cc: Borislav Petkov (AMD), Nikolay Borisov, x86, linux-kernel
The following commit has been merged into the x86/bugs branch of tip:
Commit-ID: 877818802c3e970f67ccb53012facc78bef5f97a
Gitweb: https://git.kernel.org/tip/877818802c3e970f67ccb53012facc78bef5f97a
Author: Borislav Petkov (AMD) <bp@alien8.de>
AuthorDate: Mon, 11 Nov 2024 17:22:08 +01:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 30 Dec 2024 17:48:33 +01:00
x86/bugs: Add SRSO_USER_KERNEL_NO support
If the machine has:
CPUID Fn8000_0021_EAX[30] (SRSO_USER_KERNEL_NO) -- If this bit is 1,
it indicates the CPU is not subject to the SRSO vulnerability across
user/kernel boundaries.
have it fall back to IBPB on VMEXIT only, in the case it is going to run
VMs:
Speculative Return Stack Overflow: Mitigation: IBPB on VMEXIT only
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20241202120416.6054-2-bp@kernel.org
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/bugs.c | 4 ++++
arch/x86/kernel/cpu/common.c | 1 +
3 files changed, 6 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 645aa36..0e2d817 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -465,6 +465,7 @@
#define X86_FEATURE_SBPB (20*32+27) /* Selective Branch Prediction Barrier */
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
+#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4..5a505aa 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2615,6 +2615,9 @@ static void __init srso_select_mitigation(void)
break;
case SRSO_CMD_SAFE_RET:
+ if (boot_cpu_has(X86_FEATURE_SRSO_USER_KERNEL_NO))
+ goto ibpb_on_vmexit;
+
if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
/*
* Enable the return thunk for generated code
@@ -2658,6 +2661,7 @@ static void __init srso_select_mitigation(void)
}
break;
+ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3e90376..7e8d811 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1270,6 +1270,7 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO),
VULNBL_AMD(0x19, SRSO),
+ VULNBL_AMD(0x1a, SRSO),
{}
};
* [PATCH v2 2/4] KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace
2024-12-02 12:04 [PATCH v2 0/4] x86/bugs: Adjust SRSO mitigation to new features Borislav Petkov
2024-12-02 12:04 ` [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support Borislav Petkov
@ 2024-12-02 12:04 ` Borislav Petkov
2024-12-30 17:02 ` [tip: x86/bugs] " tip-bot2 for Borislav Petkov (AMD)
2024-12-02 12:04 ` [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX Borislav Petkov
` (3 subsequent siblings)
5 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2024-12-02 12:04 UTC (permalink / raw)
To: Sean Christopherson, X86 ML
Cc: Paolo Bonzini, Josh Poimboeuf, Pawan Gupta, KVM, LKML,
Borislav Petkov (AMD)
From: "Borislav Petkov (AMD)" <bp@alien8.de>
SRSO_USER_KERNEL_NO denotes whether the CPU is affected by SRSO across
user/kernel boundaries. Advertise it to guest userspace.
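Once that's in place, a VMM can query the bit via KVM_GET_SUPPORTED_CPUID;
a rough illustration (most error handling elided, not part of the patch):

  #include <fcntl.h>
  #include <linux/kvm.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/ioctl.h>

  int main(void)
  {
          struct kvm_cpuid2 *c = calloc(1, sizeof(*c) +
                                        256 * sizeof(struct kvm_cpuid_entry2));
          int kvm = open("/dev/kvm", O_RDONLY);
          unsigned int i;

          c->nent = 256;
          if (kvm < 0 || ioctl(kvm, KVM_GET_SUPPORTED_CPUID, c) < 0)
                  return 1;

          for (i = 0; i < c->nent; i++)
                  if (c->entries[i].function == 0x80000021)
                          printf("SRSO_USER_KERNEL_NO: %u\n",
                                 (c->entries[i].eax >> 30) & 1);
          return 0;
  }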
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
arch/x86/kvm/cpuid.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 097bdc022d0f..7cf5fa77e399 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -800,7 +800,7 @@ void kvm_set_cpu_caps(void)
kvm_cpu_cap_mask(CPUID_8000_0021_EAX,
F(NO_NESTED_DATA_BP) | F(LFENCE_RDTSC) | 0 /* SmmPgCfgLock */ |
F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */ |
- F(WRMSR_XX_BASE_NS)
+ F(WRMSR_XX_BASE_NS) | F(SRSO_USER_KERNEL_NO)
);
kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
--
2.43.0
* [tip: x86/bugs] KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace
2024-12-02 12:04 ` [PATCH v2 2/4] KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace Borislav Petkov
@ 2024-12-30 17:02 ` tip-bot2 for Borislav Petkov (AMD)
0 siblings, 0 replies; 68+ messages in thread
From: tip-bot2 for Borislav Petkov (AMD) @ 2024-12-30 17:02 UTC (permalink / raw)
To: linux-tip-commits
Cc: Borislav Petkov (AMD), Nikolay Borisov, x86, linux-kernel
The following commit has been merged into the x86/bugs branch of tip:
Commit-ID: 716f86b523d8ec3c17015ee0b03135c7aa6f2f08
Gitweb: https://git.kernel.org/tip/716f86b523d8ec3c17015ee0b03135c7aa6f2f08
Author: Borislav Petkov (AMD) <bp@alien8.de>
AuthorDate: Wed, 13 Nov 2024 13:28:33 +01:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 30 Dec 2024 17:56:00 +01:00
KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace
SRSO_USER_KERNEL_NO denotes whether the CPU is affected by SRSO across
user/kernel boundaries. Advertise it to guest userspace.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20241202120416.6054-3-bp@kernel.org
---
arch/x86/kvm/cpuid.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index ae0b438..f7e2229 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -821,7 +821,7 @@ void kvm_set_cpu_caps(void)
kvm_cpu_cap_mask(CPUID_8000_0021_EAX,
F(NO_NESTED_DATA_BP) | F(LFENCE_RDTSC) | 0 /* SmmPgCfgLock */ |
F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */ |
- F(WRMSR_XX_BASE_NS)
+ F(WRMSR_XX_BASE_NS) | F(SRSO_USER_KERNEL_NO)
);
kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
* [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2024-12-02 12:04 [PATCH v2 0/4] x86/bugs: Adjust SRSO mitigation to new features Borislav Petkov
2024-12-02 12:04 ` [PATCH v2 1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support Borislav Petkov
2024-12-02 12:04 ` [PATCH v2 2/4] KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace Borislav Petkov
@ 2024-12-02 12:04 ` Borislav Petkov
2024-12-11 22:27 ` Sean Christopherson
2024-12-02 12:04 ` [PATCH v2 4/4] Documentation/kernel-parameters: Fix a typo in kvm.enable_virt_at_load text Borislav Petkov
` (2 subsequent siblings)
5 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2024-12-02 12:04 UTC (permalink / raw)
To: Sean Christopherson, X86 ML
Cc: Paolo Bonzini, Josh Poimboeuf, Pawan Gupta, KVM, LKML,
Borislav Petkov (AMD)
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it
indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate
SRSO.
Enable this BpSpecReduce bit to mitigate SRSO across guest/host
boundaries.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
v2: Add some doc blurb about the modalities of the mitigation.
Documentation/admin-guide/hw-vuln/srso.rst | 10 ++++++++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/bugs.c | 10 +++++++++-
arch/x86/kvm/svm/svm.c | 6 ++++++
arch/x86/lib/msr.c | 2 ++
6 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c88..79a8f7dea06d 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,7 +104,17 @@ The possible values in this file are:
(spec_rstack_overflow=ibpb-vmexit)
+ * 'Mitigation: Reduced Speculation':
+ This mitigation gets automatically enabled when the above one "IBPB on
+ VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+ Currently, the mitigation is automatically enabled when KVM enables
+ virtualization and can incur some cost. If no VMs will run on the system,
+ you can either disable virtualization or set kvm.enable_virt_at_load=0 so
+ that virtualization, and with it this mitigation, gets enabled only when
+ a VM is actually started, i.e. only when really needed. See the text in
+ Documentation/admin-guide/kernel-parameters.txt on this parameter for more
+ details.
In order to exploit the vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 2787227a8b42..94582c0ed9f2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -465,6 +465,7 @@
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_MSR_FIX (20*32+31) /* MSR BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..1372a569fb58 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -717,6 +717,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 8854d9bce2a5..a2eb7c0700da 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2523,6 +2523,7 @@ enum srso_mitigation {
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
+ SRSO_MITIGATION_BP_SPEC_REDUCE,
};
enum srso_mitigation_cmd {
@@ -2540,7 +2541,8 @@ static const char * const srso_strings[] = {
[SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
- [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
+ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
+ [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2665,6 +2667,12 @@ static void __init srso_select_mitigation(void)
ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
+ if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index dd15cc635655..e4fad330cd25 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
kvm_cpu_svm_disable();
amd_pmu_disable_virt();
+
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+ msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}
static int svm_enable_virtualization_cpu(void)
@@ -685,6 +688,9 @@ static int svm_enable_virtualization_cpu(void)
rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
}
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+ msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+
return 0;
}
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 4bf4fad5b148..5a18ecc04a6c 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -103,6 +103,7 @@ int msr_set_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, true);
}
+EXPORT_SYMBOL_GPL(msr_set_bit);
/**
* msr_clear_bit - Clear @bit in a MSR @msr.
@@ -118,6 +119,7 @@ int msr_clear_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, false);
}
+EXPORT_SYMBOL_GPL(msr_clear_bit);
#ifdef CONFIG_TRACEPOINTS
void do_trace_write_msr(unsigned int msr, u64 val, int failed)
--
2.43.0
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2024-12-02 12:04 ` [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX Borislav Petkov
@ 2024-12-11 22:27 ` Sean Christopherson
2024-12-16 17:31 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2024-12-11 22:27 UTC (permalink / raw)
To: Borislav Petkov
Cc: X86 ML, Paolo Bonzini, Josh Poimboeuf, Pawan Gupta, KVM, LKML,
Borislav Petkov (AMD)
On Mon, Dec 02, 2024, Borislav Petkov wrote:
> diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
> index 2ad1c05b8c88..79a8f7dea06d 100644
> --- a/Documentation/admin-guide/hw-vuln/srso.rst
> +++ b/Documentation/admin-guide/hw-vuln/srso.rst
> @@ -104,7 +104,17 @@ The possible values in this file are:
>
> (spec_rstack_overflow=ibpb-vmexit)
>
> + * 'Mitigation: Reduced Speculation':
>
> + This mitigation gets automatically enabled when the above one "IBPB on
> + VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
> +
> + Currently, the mitigation is automatically enabled when KVM enables
> + virtualization and can incur some cost.
How much cost are we talking?
> static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
> @@ -2665,6 +2667,12 @@ static void __init srso_select_mitigation(void)
>
> ibpb_on_vmexit:
> case SRSO_CMD_IBPB_ON_VMEXIT:
> + if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX)) {
> + pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
> + srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
> + break;
> + }
> +
> if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
> if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
> setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index dd15cc635655..e4fad330cd25 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
> kvm_cpu_svm_disable();
>
> amd_pmu_disable_virt();
> +
> + if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> + msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> }
>
> static int svm_enable_virtualization_cpu(void)
> @@ -685,6 +688,9 @@ static int svm_enable_virtualization_cpu(void)
> rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
> }
>
> + if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> + msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
IIUC, this magic bit reduces how much the CPU is allowed to speculate in order
to mitigate potential VM=>host attacks, and that reducing speculation also reduces
overall performance.
If that's correct, then enabling the magic bit needs to be gated by an appropriate
mitigation being enabled, not forced on automatically just because the CPU supports
X86_FEATURE_SRSO_MSR_FIX.
And depending on the cost, it might also make sense to set the bit on-demand, and
then clean up when KVM disables virtualization. E.g. wait to set the bit until
entry to a guest is imminent.
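A rough sketch of that gating - hypothetical, assuming the existing
srso_cmd/SRSO_CMD_OFF machinery in srso_select_mitigation():

  /* Sketch only: when the mitigation is off, clear the feature flag so
   * KVM's cpu_feature_enabled() checks never touch the MSR. */
  if (srso_cmd == SRSO_CMD_OFF) {
          setup_clear_cpu_cap(X86_FEATURE_SRSO_MSR_FIX);
          return;
  }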
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2024-12-11 22:27 ` Sean Christopherson
@ 2024-12-16 17:31 ` Borislav Petkov
2024-12-16 18:51 ` Sean Christopherson
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2024-12-16 17:31 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Wed, Dec 11, 2024 at 02:27:42PM -0800, Sean Christopherson wrote:
> How much cost are we talking?
Likely 1-2%.
That's why I'm simply enabling it by default.
> IIUC, this magic bit reduces how much the CPU is allowed to speculate in order
> to mitigate potential VM=>host attacks, and that reducing speculation also reduces
> overall performance.
>
> If that's correct, then enabling the magic bit needs to be gated by an appropriate
> mitigation being enabled, not forced on automatically just because the CPU supports
> X86_FEATURE_SRSO_MSR_FIX.
Well, in the default case we have safe-RET, but since it is not needed
anymore, we fall back to this thing, which kicks in only when the
mitigation is enabled.
That's why it also is in the SRSO_CMD_IBPB_ON_VMEXIT case as it is part of the
spec_rstack_overflow=ibpb-vmexit mitigation option.
So it kinda already does that. When you disable the mitigation, this one won't
get enabled either.
> And depending on the cost, it might also make sense to set the bit on-demand, and
> then clean up when KVM disables virtualization. E.g. wait to set the bit until
> entry to a guest is imminent.
So the "when to set that bit" discussion kinda remained unfinished the last
time. Here's gist:
You:
| "It's not strictly KVM module load, it's when KVM enables virtualization. E.g.
| if userspace clears enable_virt_at_load, the MSR will be toggled every time the
| number of VMs goes from 0=>1 and 1=>0.
|
| But why do this in KVM? E.g. why not set-and-forget in init_amd_zen4()?"
I:
| "Because there's no need to impose an unnecessary - albeit small - perf impact
| on users who don't do virt.
|
| I'm currently gravitating towards the MSR toggling thing, i.e., only when the
| VMs number goes 0=>1 but I'm not sure. If udev rules *always* load kvm.ko then
| yes, the toggling thing sounds better. I.e., set it only when really needed."
So to answer your current question, it sounds like the user can control the
on-demand thing with enable_virt_at_load=0, right?
Or do you mean something different?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2024-12-16 17:31 ` Borislav Petkov
@ 2024-12-16 18:51 ` Sean Christopherson
2024-12-17 9:34 ` Borislav Petkov
2024-12-30 11:14 ` Borislav Petkov
0 siblings, 2 replies; 68+ messages in thread
From: Sean Christopherson @ 2024-12-16 18:51 UTC (permalink / raw)
To: Borislav Petkov
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Mon, Dec 16, 2024, Borislav Petkov wrote:
> On Wed, Dec 11, 2024 at 02:27:42PM -0800, Sean Christopherson wrote:
> > How much cost are we talking?
>
> Likely 1-2%.
>
> That's why I'm simply enabling it by default.
>
> > IIUC, this magic bit reduces how much the CPU is allowed to speculate in order
> > to mitigate potential VM=>host attacks, and that reducing speculation also reduces
> > overall performance.
> >
> > If that's correct, then enabling the magic bit needs to be gated by an appropriate
> > mitagation being enabled, not forced on automatically just because the CPU supports
> > X86_FEATURE_SRSO_MSR_FIX.
>
> Well, in the default case we have safe-RET, but since it is not needed
> anymore, we fall back to this thing, which kicks in only when the
> mitigation is enabled.
>
> That's why it also is in the SRSO_CMD_IBPB_ON_VMEXIT case as it is part of the
> spec_rstack_overflow=ibpb-vmexit mitigation option.
>
> So it kinda already does that. When you disable the mitigation, this one won't
> get enabled either.
But it's a hardware-defined feature flag, so won't it be set by this code?
if (c->extended_cpuid_level >= 0x8000001f)
c->x86_capability[CPUID_8000_001F_EAX] = cpuid_eax(0x8000001f);
srso_select_mitigation() checks the flag for SRSO_CMD_IBPB_ON_VMEXIT
case SRSO_CMD_IBPB_ON_VMEXIT:
if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX)) {
pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
break;
}
but I don't see any code that would clear X86_FEATURE_SRSO_MSR_FIX. Am I missing
something?
> > And depending on the cost, it might also make sense to set the bit on-demand, and
> > then clean up when KVM disables virtualization. E.g. wait to set the bit until
> > entry to a guest is imminent.
>
> So the "when to set that bit" discussion kinda remained unfinished the last
> time.
Gah, sorry. I suspect I got thinking about how best to "set it only when really
needed", and got lost in analysis paralysis.
> Here's the gist:
>
> You:
>
> | "It's not strictly KVM module load, it's when KVM enables virtualization. E.g.
> | if userspace clears enable_virt_at_load, the MSR will be toggled every time the
> | number of VMs goes from 0=>1 and 1=>0.
> |
> | But why do this in KVM? E.g. why not set-and-forget in init_amd_zen4()?"
>
> I:
>
> | "Because there's no need to impose an unnecessary - albeit small - perf impact
> | on users who don't do virt.
> |
> | I'm currently gravitating towards the MSR toggling thing, i.e., only when the
> | VMs number goes 0=>1 but I'm not sure. If udev rules *always* load kvm.ko then
> | yes, the toggling thing sounds better. I.e., set it only when really needed."
>
> So to answer your current question, it sounds like the user can control the
> on-demand thing with enable_virt_at_load=0, right?
To some extent. But I strongly suspect that the vast, vast majority of end users
will end up with systems that automatically load kvm.ko, but don't run VMs the
majority of the time. Expecting non-KVM users to detect a 1-2% regression and
track down enable_virt_at_load doesn't seem like a winning strategy.
> Or do you mean something else different?
The other possibility would be to wait to set the bit until a CPU is actually
going to do VMRUN. If we use KVM's "user-return MSR" framework, the bit would
be cleared when the CPU returns to userspace. The only downside to that is KVM
would toggle the bit on CPUs running vCPUs on every exit to userspace, e.g. to
emulate MMIO/IO and other things.
But, userspace exits are relatively slow paths, so if the below is a wash for
performance when running VMs, i.e. the cost of the WRMSRs is either in the noise
or is offset by the regained 1-2% performance for userspace, then I think it's a
no-brainer.
Enabling "full" speculation on return to usersepace means non-KVM tasks won't be
affected, and there's no "sticky" behavior. E.g. another idea would be to defer
setting the bit until VMRUN is imminent, but then wait to clear the bit until
virtualization is disabled. But that has the downside of the bit being set on all
CPUs over time, especially if enable_virt_at_load is true.
Compile tested only...
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e4fad330cd25..061ac5940432 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -256,6 +256,7 @@ DEFINE_PER_CPU(struct svm_cpu_data, svm_data);
* defer the restoration of TSC_AUX until the CPU returns to userspace.
*/
static int tsc_aux_uret_slot __read_mostly = -1;
+static int zen4_bp_cfg_uret_slot __ro_after_init = -1;
static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
@@ -608,9 +609,6 @@ static void svm_disable_virtualization_cpu(void)
kvm_cpu_svm_disable();
amd_pmu_disable_virt();
-
- if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
- msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}
static int svm_enable_virtualization_cpu(void)
@@ -688,9 +686,6 @@ static int svm_enable_virtualization_cpu(void)
rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
}
- if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
- msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
-
return 0;
}
@@ -1547,6 +1542,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
(!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+ kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
+ MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT,
+ MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+
svm->guest_state_loaded = true;
}
@@ -5313,6 +5313,14 @@ static __init int svm_hardware_setup(void)
tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX)) {
+ zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
+ if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot) < 0) {
+ r = -EIO;
+ goto err;
+ }
+ }
+
if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
kvm_enable_efer_bits(EFER_AUTOIBRS);
^ permalink raw reply related [flat|nested] 68+ messages in thread* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2024-12-16 18:51 ` Sean Christopherson
@ 2024-12-17 9:34 ` Borislav Petkov
2024-12-30 11:14 ` Borislav Petkov
1 sibling, 0 replies; 68+ messages in thread
From: Borislav Petkov @ 2024-12-17 9:34 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Mon, Dec 16, 2024 at 10:51:13AM -0800, Sean Christopherson wrote:
> but I don't see any code that would clear X86_FEATURE_SRSO_MSR_FIX. Am I missing
> something?
Ah, you want the toggles in svm_{enable,disable}_virtualization_cpu() to not
happen when the mitigation is disabled. Yeah, I guess we should clear the flag
when the mitigation is disabled...
> Gah, sorry. I suspect I got thinking about how best to "set it only when really
> needed", and got lost in analysis paralysis.
I know *exactly* what you're talking about :-P
> To some extent. But I strongly suspect that the vast, vast majority of end users
> will end up with systems that automatically load kvm.ko, but don't run VMs the
> majority of the time. Expecting non-KVM users to detect a 1-2% regression and
> track down enable_virt_at_load doesn't seem like a winning strategy.
Yap, that was my fear too. Frankly, I don't have a really good answer to
that yet.
> The other possibility would be to wait to set the bit until a CPU is actually
> going to do VMRUN. If we use KVM's "user-return MSR" framework, the bit would
> be cleared when the CPU returns to userspace. The only downside to that is KVM
> would toggle the bit on CPUs running vCPUs on every exit to userspace, e.g. to
> emulate MMIO/IO and other things.
>
> But, userspace exits are relatively slow paths, so if the below is a wash for
> performance when running VMs, i.e. the cost of the WRMSRs is either in the noise
> or is offset by the regained 1-2% performance for userspace, then I think it's a
> no-brainer.
>
> Enabling "full" speculation on return to usersepace means non-KVM tasks won't be
> affected, and there's no "sticky" behavior. E.g. another idea would be to defer
> setting the bit until VMRUN is imminent, but then wait to clear the bit until
> virtualization is disabled. But that has the downside of the bit being set on all
> CPUs over time, especially if enable_virt_at_load is true.
Yeah, I think we should keep it simple initially and only do anything more
involved when it turns out that we really need it.
> Compile tested only...
Thanks, looks good, I'll run it after I get back from vacation to make sure
we're good and then we'll talk again. :)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2024-12-16 18:51 ` Sean Christopherson
2024-12-17 9:34 ` Borislav Petkov
@ 2024-12-30 11:14 ` Borislav Petkov
2025-01-08 13:38 ` Sean Christopherson
1 sibling, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2024-12-30 11:14 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Mon, Dec 16, 2024 at 10:51:13AM -0800, Sean Christopherson wrote:
> @@ -1547,6 +1542,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
> (!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
> kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
>
> + if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> + kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
> + MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT,
> + MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
I think this needs to be:
if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));
I.e., value and mask are both the 4th bit: (1 << 4)
> svm->guest_state_loaded = true;
> }
>
> @@ -5313,6 +5313,14 @@ static __init int svm_hardware_setup(void)
>
> tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
>
> + if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX)) {
> + zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
> + if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot) < 0) {
> + r = -EIO;
> + goto err;
> + }
And this needs to be:
if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX)) {
zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot < 0)) {
r = -EIO;
goto err;
}
}
Note the WARN_ON_ONCE bracketing. But I know you're doing this on purpose - to
see if I'm paying attention and not taking your patch blindly :-P
With that fixed, this approach still doesn't look sane to me: before I start
the guest I have all SPEC_REDUCE bits correctly clear:
# rdmsr -a 0xc001102e | uniq -c
128 420000
... start a guest, shut it down cleanly, qemu exits properly...
# rdmsr -a 0xc001102e | uniq -c
1 420010
6 420000
1 420010
3 420000
1 420010
3 420000
1 420010
1 420000
1 420010
6 420000
1 420010
1 420000
1 420010
6 420000
1 420010
5 420000
1 420010
18 420000
1 420010
5 420000
1 420010
6 420000
1 420010
3 420000
1 420010
3 420000
1 420010
1 420000
1 420010
6 420000
1 420010
1 420000
1 420010
6 420000
1 420010
5 420000
1 420010
18 420000
1 420010
5 420000
so SPEC_REDUCE (bit 4, i.e. the 0x10 in 420010) remains set on some cores. Not good since I'm not running VMs
anymore.
# rmmod kvm_amd kvm
# rdmsr -a 0xc001102e | uniq -c
128 420000
that looks more like it.
Also, this user-return MSR toggling does show up higher in the profile:
4.31% qemu-system-x86 [kvm] [k] 0x000000000000d23f
2.44% qemu-system-x86 [kernel.kallsyms] [k] read_tsc
1.66% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr
1.50% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr_safe
vs
1.01% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr
0.81% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr_safe
so it really is noticeable.
So I wanna say, let's do the below and be done with it. My expectation is that
this won't be needed in the future anymore either so it'll be a noop on most
machines...
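For reference, msr_set_bit()/msr_clear_bit() take the bit *number*, not
a mask, and per __flip_bit() they return < 0 if the MSR access faulted,
0 if the bit already had the desired value (no write issued) and 1 if the
MSR was actually written. A caller that cares could thus do:

  if (msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT) < 0)
          pr_warn("Could not set BP_CFG[BpSpecReduce]\n");

though the hunk below simply ignores the return value.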
---
@@ -609,6 +609,9 @@ static void svm_disable_virtualization_cpu(void)
kvm_cpu_svm_disable();
amd_pmu_disable_virt();
+
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+ msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}
static int svm_enable_virtualization_cpu(void)
@@ -686,6 +689,9 @@ static int svm_enable_virtualization_cpu(void)
rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
}
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+ msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+
return 0;
}
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2024-12-30 11:14 ` Borislav Petkov
@ 2025-01-08 13:38 ` Sean Christopherson
2025-01-08 15:49 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2025-01-08 13:38 UTC (permalink / raw)
To: Borislav Petkov
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Mon, Dec 30, 2024, Borislav Petkov wrote:
> On Mon, Dec 16, 2024 at 10:51:13AM -0800, Sean Christopherson wrote:
> Note the WARN_ON_ONCE bracketing. But I know you're doing this on purpose - to
> see if I'm paying attention and not taking your patch blindly :-P
LOL, yeah, totally on purpose.
> With that fixed, this approach still doesn't look sane to me: before I start
> the guest I have all SPEC_REDUCE bits correctly clear:
>
> # rdmsr -a 0xc001102e | uniq -c
> 128 420000
>
> ... start a guest, shut it down cleanly, qemu exits properly...
>
> # rdmsr -a 0xc001102e | uniq -c
...
> so SPEC_REDUCE (bit 4, i.e. the 0x10 in 420010) remains set on some cores. Not good since I'm not running VMs
> anymore.
>
> # rmmod kvm_amd kvm
> # rdmsr -a 0xc001102e | uniq -c
> 128 420000
>
> that looks more like it.
The "host" value will only be restored when the CPU exits to userspace, so if
there are no userspace tasks running on those CPUs, i.e. nothing that forces them
back to userspace, then it's expected for them to have the "guest" value loaded,
even after the guest is long gone. Unloading KVM effectively forces KVM to simulate
a return to userspace and thus restore the host values.
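For context, a simplified sketch of that machinery (from memory, details
elided) - the guest value is loaded into the MSR and a notifier is armed
which restores the host value on the next return to userspace:

  int kvm_set_user_return_msr(unsigned int slot, u64 value, u64 mask)
  {
          struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);

          /* Mix the requested bits into the saved host value and load it. */
          value = (value & mask) | (msrs->values[slot].host & ~mask);
          if (value == msrs->values[slot].curr)
                  return 0;
          if (wrmsrl_safe(kvm_uret_msrs_list[slot], value))
                  return 1;

          msrs->values[slot].curr = value;

          /* Restore the host value on the next return to userspace. */
          if (!msrs->registered) {
                  msrs->urn.on_user_return = kvm_on_user_return;
                  user_return_notifier_register(&msrs->urn);
                  msrs->registered = true;
          }
          return 0;
  }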
It seems unlikely that someone would care deeply about the performance of a CPU
that is only running kernel code, but I agree it's odd and not exactly desirable.
> Also, this user-return MSR toggling does show up higher in the profile:
>
> 4.31% qemu-system-x86 [kvm] [k] 0x000000000000d23f
> 2.44% qemu-system-x86 [kernel.kallsyms] [k] read_tsc
> 1.66% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr
> 1.50% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr_safe
>
> vs
>
> 1.01% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr
> 0.81% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr_safe
>
> so it really is noticeable.
Hmm, mostly out of curiosity, what's the "workload"? And do you know what 0xd23f
corresponds to?
For most setups, exits all the way to userspace are relatively uncommon. There
are scenarios where the number of userspace exits is quite high, e.g. if the guest
is spamming its emulated serial console, but I wouldn't expect switching the MSR
on user entry/exit to be that noticeable.
> So I wanna say, let's do the below and be done with it. My expectation is that
> this won't be needed in the future anymore either so it'll be a noop on most
> machines...
Yeah, especially if this is all an improvement over the existing mitigation.
Though since it can impact non-virtualization workloads, maybe it should be a
separately selectable mitigation? I.e. not piggybacked on top of ibpb-vmexit?
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-08 13:38 ` Sean Christopherson
@ 2025-01-08 15:49 ` Borislav Petkov
2025-01-08 17:18 ` Sean Christopherson
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-01-08 15:49 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Wed, Jan 08, 2025 at 05:38:39AM -0800, Sean Christopherson wrote:
> The "host" value will only be restored when the CPU exits to userspace, so if
> there are no userspace tasks running on those CPUs, i.e. nothing that forces them
> back to userspace, then it's expected for them to have the "guest" value loaded,
> even after the guest is long gone. Unloading KVM effectively forces KVM to simulate
> a return to userspace and thus restore the host values.
Aha, makes sense.
> Hmm, mostly out of curiosity, what's the "workload"?
Oh, very very exciting: booting a guest! :-P
> And do you know what 0xd23f corresponds to?
How's that:
$ objdump -D arch/x86/kvm/kvm.ko
...
000000000000d1a0 <kvm_vcpu_halt>:
d1a0: e8 00 00 00 00 call d1a5 <kvm_vcpu_halt+0x5>
d1a5: 55 push %rbp
...
d232: e8 09 93 ff ff call 6540 <kvm_vcpu_check_block>
d237: 85 c0 test %eax,%eax
d239: 0f 88 f6 01 00 00 js d435 <kvm_vcpu_halt+0x295>
d23f: f3 90 pause
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
d241: e8 00 00 00 00 call d246 <kvm_vcpu_halt+0xa6>
d246: 48 89 c3 mov %rax,%rbx
d249: e8 00 00 00 00 call d24e <kvm_vcpu_halt+0xae>
d24e: 84 c0 test %al,%al
Which makes sense :-)
> Yeah, especially if this is all an improvement over the existing mitigation.
> Though since it can impact non-virtualization workloads, maybe it should be a
> separately selectable mitigation? I.e. not piggybacked on top of ibpb-vmexit?
Well, ibpb-on-vmexit is your typical cloud provider scenario where you address
the VM/VM attack vector by doing an IBPB on VMEXIT. This SRSO_MSR_FIX thing
protects the *host* from a malicious guest so you need both enabled for full
protection on the guest/host vector.
(And yeah, I'm talking about attack vectors because this way of thinking about
the mitigations will simplify stuff a lot:
https://lore.kernel.org/r/20241105215455.359471-1-david.kaplan@amd.com
)
Ok, lemme send a proper patch...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-08 15:49 ` Borislav Petkov
@ 2025-01-08 17:18 ` Sean Christopherson
2025-01-08 18:14 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2025-01-08 17:18 UTC (permalink / raw)
To: Borislav Petkov
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Wed, Jan 08, 2025, Borislav Petkov wrote:
> > And do you know what 0xd23f corresponds to?
>
> How's that:
>
> $ objdump -D arch/x86/kvm/kvm.ko
> ...
> 000000000000d1a0 <kvm_vcpu_halt>:
> d1a0: e8 00 00 00 00 call d1a5 <kvm_vcpu_halt+0x5>
> d1a5: 55 push %rbp
> ...
>
> d232: e8 09 93 ff ff call 6540 <kvm_vcpu_check_block>
> d237: 85 c0 test %eax,%eax
> d239: 0f 88 f6 01 00 00 js d435 <kvm_vcpu_halt+0x295>
> d23f: f3 90 pause
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> d241: e8 00 00 00 00 call d246 <kvm_vcpu_halt+0xa6>
> d246: 48 89 c3 mov %rax,%rbx
> d249: e8 00 00 00 00 call d24e <kvm_vcpu_halt+0xae>
> d24e: 84 c0 test %al,%al
>
>
> Which makes sense :-)
Ooh, it's just the MSR writes that increased. I misinterpreted the profile
statement and thought that something in KVM was jumping from ~0% to 4.31%. If
the cost really is just this:
1.66% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr
1.50% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr_safe
vs
1.01% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr
0.81% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr_safe
then my vote is to go with the user_return approach. It's unfortunate that
restoring full speculation may be delayed until a CPU exits to userspace or KVM
is unloaded, but given that enable_virt_at_load is enabled by default, in practice
it's likely still far better than effectively always running the host with reduced
speculation.
> > Yeah, especially if this is all an improvement over the existing mitigation.
> > Though since it can impact non-virtualization workloads, maybe it should be a
> > separately selectable mitigation? I.e. not piggybacked on top of ibpb-vmexit?
>
> Well, ibpb-on-vmexit is your typical cloud provider scenario where you address
> the VM/VM attack vector by doing an IBPB on VMEXIT.
No? svm_vcpu_load() emits IBPB when switching VMCBs, i.e. when switching between
vCPUs that may live in separate security contexts. That IBPB is skipped when
X86_FEATURE_IBPB_ON_VMEXIT is enabled, because the host is trusted to not attack
its guests.
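i.e., roughly this in svm_vcpu_load(), if memory serves:

  if (sd->current_vmcb != svm->vmcb) {
          sd->current_vmcb = svm->vmcb;

          /* No barrier needed if an IBPB is done on VMEXIT anyway. */
          if (!cpu_feature_enabled(X86_FEATURE_IBPB_ON_VMEXIT))
                  indirect_branch_prediction_barrier();
  }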
> This SRSO_MSR_FIX thing protects the *host* from a malicious guest so you
> need both enabled for full protection on the guest/host vector.
If reducing speculation protects the host, why wouldn't that also protect other
guests? The CPU needs to bounce through the host before entering a different
guest.
And if for some reason reducing speculation doesn't suffice, wouldn't it be
better to fall back to doing IBPB only when switching VMCBs?
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-08 17:18 ` Sean Christopherson
@ 2025-01-08 18:14 ` Borislav Petkov
2025-01-08 18:37 ` Jim Mattson
2025-01-11 12:52 ` [PATCH] " Borislav Petkov
0 siblings, 2 replies; 68+ messages in thread
From: Borislav Petkov @ 2025-01-08 18:14 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Wed, Jan 08, 2025 at 09:18:17AM -0800, Sean Christopherson wrote:
> then my vote is to go with the user_return approach. It's unfortunate that
> restoring full speculation may be delayed until a CPU exits to userspace or KVM
> is unloaded, but given that enable_virt_at_load is enabled by default, in practice
> it's likely still far better than effectively always running the host with reduced
> speculation.
I guess. Kaplan just said something to that effect, so we can start with that
and then see who complains, and address it if she cries loud enough.
:-P
> No? svm_vcpu_load() emits IBPB when switching VMCBs, i.e. when switching between
Bah, nevermind. I got confused by our own whitepaper. /facepalm.
So here's the deal:
The machine has SRSO_USER_KERNEL_NO=1, which means you don't need safe-RET.
So we fall back to ibpb-on-vmexit.
Now, if the machine sports BpSpecReduce, we do that and that covers all the
vectors. Otherwise, IBPB-on-VMEXIT it is.
The VM/VM attack vector the paper is talking about and having to IBPB is for
the Spectre v2 side of things. Not SRSO.
Yeah, lemme document that while it is fresh in my head. This is exactly why
I wanted Josh to start mitigation documentation - exactly for such nasties.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-08 18:14 ` Borislav Petkov
@ 2025-01-08 18:37 ` Jim Mattson
2025-01-08 19:14 ` Borislav Petkov
2025-01-11 12:52 ` [PATCH] " Borislav Petkov
1 sibling, 1 reply; 68+ messages in thread
From: Jim Mattson @ 2025-01-08 18:37 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Borislav Petkov, X86 ML, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, KVM, LKML
On Wed, Jan 8, 2025 at 10:15 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Jan 08, 2025 at 09:18:17AM -0800, Sean Christopherson wrote:
> > then my vote is to go with the user_return approach. It's unfortunate that
> > restoring full speculation may be delayed until a CPU exits to userspace or KVM
> > is unloaded, but given that enable_virt_at_load is enabled by default, in practice
> > it's likely still far better than effectively always running the host with reduced
> > speculation.
>
> I guess. Kaplan just said something to that effect, so we can start with that
> and then see who complains, and address it if she cries loud enough.
> :-P
>
> > No? svm_vcpu_load() emits IBPB when switching VMCBs, i.e. when switching between
>
> Bah, nevermind. I got confused by our own whitepaper. /facepalm.
>
> So here's the deal:
>
> The machine has SRSO_USER_KERNEL_NO=1, which means you don't need safe-RET.
> So we fall back to ibpb-on-vmexit.
Surely, IBPB-on-VMexit is worse for performance than safe-RET?!?
(But, maybe I have a warped perspective, since I only care about VM
performance).
> Now, if the machine sports BpSpecReduce, we do that and that covers all the
> vectors. Otherwise, IBPB-on-VMEXIT it is.
>
> The VM/VM attack vector the paper is talking about and having to IBPB is for
> the Spectre v2 side of things. Not SRSO.
>
> Yeah, lemme document that while it is fresh in my head. This is exactly why
> I wanted Josh to start mitigation documentation - exactly for such nasties.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-08 18:37 ` Jim Mattson
@ 2025-01-08 19:14 ` Borislav Petkov
2025-01-08 19:43 ` Jim Mattson
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-01-08 19:14 UTC (permalink / raw)
To: Jim Mattson
Cc: Sean Christopherson, Borislav Petkov, X86 ML, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, KVM, LKML
On Wed, Jan 08, 2025 at 10:37:57AM -0800, Jim Mattson wrote:
> Surely, IBPB-on-VMexit is worse for performance than safe-RET?!?
We don't need safe-RET with SRSO_USER_KERNEL_NO=1. And there's no safe-RET for
virt only. So IBPB-on-VMEXIT is the next best thing. The good thing is, those
machines have BpSpecReduce too so you won't be doing IBPB-on-VMEXIT either but
what we're talking about here - BpSpecReduce.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-08 19:14 ` Borislav Petkov
@ 2025-01-08 19:43 ` Jim Mattson
2025-01-08 19:45 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Jim Mattson @ 2025-01-08 19:43 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Borislav Petkov, X86 ML, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, KVM, LKML
On Wed, Jan 8, 2025 at 11:15 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Jan 08, 2025 at 10:37:57AM -0800, Jim Mattson wrote:
> > Surely, IBPB-on-VMexit is worse for performance than safe-RET?!?
>
> We don't need safe-RET with SRSO_USER_KERNEL_NO=1, and there's no virt-only
> safe-RET. So IBPB-on-VMEXIT is the next best thing. The good thing is, those
> machines have BpSpecReduce too, so you won't actually be doing IBPB-on-VMEXIT
> either, but rather what we're talking about here - BpSpecReduce.
I'm suggesting that IBPB-on-VMexit is probably the *worst* thing. If
it weren't for BpSpecReduce, I would want safe-RET for virt only.
(Well, if it weren't for ASI, I would want that, anyway.)
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-08 19:43 ` Jim Mattson
@ 2025-01-08 19:45 ` Borislav Petkov
0 siblings, 0 replies; 68+ messages in thread
From: Borislav Petkov @ 2025-01-08 19:45 UTC (permalink / raw)
To: Jim Mattson
Cc: Sean Christopherson, Borislav Petkov, X86 ML, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, KVM, LKML
On Wed, Jan 08, 2025 at 11:43:14AM -0800, Jim Mattson wrote:
> > We don't need safe-RET with SRSO_USER_KERNEL_NO=1, and there's no virt-only
> > safe-RET. So IBPB-on-VMEXIT is the next best thing. The good thing is, those
> > machines have BpSpecReduce too, so you won't actually be doing IBPB-on-VMEXIT
> > either, but rather what we're talking about here - BpSpecReduce.
>
> I'm suggesting that IBPB-on-VMexit is probably the *worst* thing.
I know what you're suggesting and I don't think so but I'd need to look at the
numbers.
> If it weren't for BpSpecReduce, I would want safe-RET for virt only.
Read my reply again.
> (Well, if it weren't for ASI, I would want that, anyway.)
I'm sure Brendan is waiting for reviews there.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-08 18:14 ` Borislav Petkov
2025-01-08 18:37 ` Jim Mattson
@ 2025-01-11 12:52 ` Borislav Petkov
2025-01-17 18:56 ` Sean Christopherson
1 sibling, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-01-11 12:52 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
Ok,
here's a new version, I think I've addressed all outstanding review comments.
Lemme know how we should proceed here, you take it or I? Judging by the
diffstat probably I should and you ack it or so.
Also, lemme know if I should add your Co-developed-by for the user_return
portion.
Thx.
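For reference, a minimal sketch of the user_return flow the patch relies
on - the kvm_add_user_return_msr()/kvm_set_user_return_msr() calls are the
real ones from the diff below, the surrounding context is hypothetical:

	static int bp_cfg_slot = -1;

	/* once, at hardware setup time */
	bp_cfg_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);

	/* before entering the guest; value == mask, so only the
	 * BpSpecReduce bit is touched, and the previous MSR contents
	 * are restored when the CPU returns to host userspace */
	kvm_set_user_return_msr(bp_cfg_slot,
				BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
				BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));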
---
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it
indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate
SRSO.
Enable this BpSpecReduce bit to mitigate SRSO across guest/host
boundaries.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
Documentation/admin-guide/hw-vuln/srso.rst | 21 +++++++++++++++++++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/bugs.c | 16 +++++++++++++++-
arch/x86/kvm/svm/svm.c | 14 ++++++++++++++
arch/x86/lib/msr.c | 2 ++
7 files changed, 55 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c88..b856538083a2 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,6 +104,27 @@ The possible values in this file are:
(spec_rstack_overflow=ibpb-vmexit)
+ * 'Mitigation: Reduced Speculation':
+
+ This mitigation gets automatically enabled when the above one "IBPB on
+ VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+ It gets automatically enabled on machines which have the
+ SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
+ to the above =ibpb-vmexit mitigation because the user/kernel boundary is
+ not affected anymore and thus "safe RET" is not needed.
+
+ After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
+ is detected (functionality present on all such machines) and that
+ practically overrides IBPB on VMEXIT as it has a lot less performance
+ impact and takes care of the guest->host attack vector too.
+
+ Currently, the mitigation uses KVM's user_return approach
+ (kvm_set_user_return_msr()) to set the BpSpecReduce bit when a vCPU runs
+ a guest and reset it upon return to host userspace or when the KVM module
+ is unloaded. The intent being, the small perf impact of BpSpecReduce should
+ be incurred only when really necessary.
+
In order to exploit vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 508c0dad116b..471447a31605 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -468,6 +468,7 @@
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_MSR_FIX (20*32+31) /* MSR BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3f3e2bc99162..4cbd461081a1 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -719,6 +719,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c0cd10182e90..a956cd578df6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -762,6 +762,7 @@ enum mds_mitigations {
};
extern bool gds_ucode_mitigated(void);
+extern bool srso_spec_reduce_enabled(void);
/*
* Make previous memory operations globally visible before
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 5a505aa65489..07c04b3844fc 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2523,6 +2523,7 @@ enum srso_mitigation {
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
+ SRSO_MITIGATION_BP_SPEC_REDUCE,
};
enum srso_mitigation_cmd {
@@ -2540,12 +2541,19 @@ static const char * const srso_strings[] = {
[SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
- [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
+ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
+ [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
static enum srso_mitigation_cmd srso_cmd __ro_after_init = SRSO_CMD_SAFE_RET;
+bool srso_spec_reduce_enabled(void)
+{
+ return srso_mitigation == SRSO_MITIGATION_BP_SPEC_REDUCE;
+}
+EXPORT_SYMBOL_GPL(srso_spec_reduce_enabled);
+
static int __init srso_parse_cmdline(char *str)
{
if (!str)
@@ -2663,6 +2671,12 @@ static void __init srso_select_mitigation(void)
ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
+ if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 21dacd312779..59656fd51d57 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -256,6 +256,7 @@ DEFINE_PER_CPU(struct svm_cpu_data, svm_data);
* defer the restoration of TSC_AUX until the CPU returns to userspace.
*/
static int tsc_aux_uret_slot __read_mostly = -1;
+static int zen4_bp_cfg_uret_slot __ro_after_init = -1;
static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
@@ -1541,6 +1542,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
(!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
+ if (srso_spec_reduce_enabled())
+ kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));
+
svm->guest_state_loaded = true;
}
@@ -5298,6 +5304,14 @@ static __init int svm_hardware_setup(void)
tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
+ if (srso_spec_reduce_enabled()) {
+ zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
+ if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot < 0)) {
+ r = -EIO;
+ goto err;
+ }
+ }
+
if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
kvm_enable_efer_bits(EFER_AUTOIBRS);
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 4bf4fad5b148..5a18ecc04a6c 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -103,6 +103,7 @@ int msr_set_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, true);
}
+EXPORT_SYMBOL_GPL(msr_set_bit);
/**
* msr_clear_bit - Clear @bit in a MSR @msr.
@@ -118,6 +119,7 @@ int msr_clear_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, false);
}
+EXPORT_SYMBOL_GPL(msr_clear_bit);
#ifdef CONFIG_TRACEPOINTS
void do_trace_write_msr(unsigned int msr, u64 val, int failed)
--
2.43.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-11 12:52 ` [PATCH] " Borislav Petkov
@ 2025-01-17 18:56 ` Sean Christopherson
2025-01-18 15:26 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2025-01-17 18:56 UTC (permalink / raw)
To: Borislav Petkov
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Sat, Jan 11, 2025, Borislav Petkov wrote:
> Ok,
>
> here's a new version, I think I've addressed all outstanding review comments.
>
> Lemme know how we should proceed here, you take it or I? Judging by the
> diffstat probably I should and you ack it or so.
No preference, either way works for me.
> Also, lemme know if I should add your Co-developed-by for the user_return
> portion.
Heh, no preference again. In case you want to add Co-developed-by, here's my SoB:
Signed-off-by: Sean Christopherson <seanjc@google.com>
> In order to exploit vulnerability, an attacker needs to:
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 508c0dad116b..471447a31605 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -468,6 +468,7 @@
> #define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
> #define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
> #define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
> +#define X86_FEATURE_SRSO_MSR_FIX (20*32+31) /* MSR BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs */
Any objection to calling this X86_FEATURE_SRSO_BP_SPEC_REDUCE? More below.
> @@ -2540,12 +2541,19 @@ static const char * const srso_strings[] = {
> [SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
> [SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
> [SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
> - [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
> + [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
> + [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
> };
>
> static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
> static enum srso_mitigation_cmd srso_cmd __ro_after_init = SRSO_CMD_SAFE_RET;
>
> +bool srso_spec_reduce_enabled(void)
> +{
> + return srso_mitigation == SRSO_MITIGATION_BP_SPEC_REDUCE;
At the risk of getting too clever, what if we use the X86_FEATURE to communicate
that KVM should toggle the magic MSR? That'd avoid the helper+export, and up to
this point "srso_mitigation" has been used only for documentation purposes.
The flag just needs to be cleared in two locations, which doesn't seem too awful.
Though if we go that route, SRSO_MSR_FIX is a pretty crappy name :-)
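To spell out the contrast (a hedged sketch using the names from this
thread, not a final form):

	/* current patch: dedicated helper, exported from bugs.c for kvm-amd.ko */
	bool srso_spec_reduce_enabled(void)
	{
		return srso_mitigation == SRSO_MITIGATION_BP_SPEC_REDUCE;
	}
	EXPORT_SYMBOL_GPL(srso_spec_reduce_enabled);

	/* proposed: bugs.c clears the feature bit when the mitigation is not
	 * selected, and KVM just tests the flag - no helper, no export */
	if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
		kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
					BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
					BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));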
> diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
> index 4bf4fad5b148..5a18ecc04a6c 100644
> --- a/arch/x86/lib/msr.c
> +++ b/arch/x86/lib/msr.c
> @@ -103,6 +103,7 @@ int msr_set_bit(u32 msr, u8 bit)
> {
> return __flip_bit(msr, bit, true);
> }
> +EXPORT_SYMBOL_GPL(msr_set_bit);
>
> /**
> * msr_clear_bit - Clear @bit in a MSR @msr.
> @@ -118,6 +119,7 @@ int msr_clear_bit(u32 msr, u8 bit)
> {
> return __flip_bit(msr, bit, false);
> }
> +EXPORT_SYMBOL_GPL(msr_clear_bit);
These exports are no longer necessary - with the user_return approach,
nothing in KVM calls msr_set_bit()/msr_clear_bit().
Compile tested only...
---
Documentation/admin-guide/hw-vuln/srso.rst | 21 +++++++++++++++++++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/bugs.c | 15 ++++++++++++++-
arch/x86/kvm/svm/svm.c | 14 ++++++++++++++
5 files changed, 51 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c88..b856538083a2 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,6 +104,27 @@ The possible values in this file are:
(spec_rstack_overflow=ibpb-vmexit)
+ * 'Mitigation: Reduced Speculation':
+
+ This mitigation gets automatically enabled when the above one "IBPB on
+ VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+ It gets automatically enabled on machines which have the
+ SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
+ to the above =ibpb-vmexit mitigation because the user/kernel boundary is
+ not affected anymore and thus "safe RET" is not needed.
+
+ After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
+ is detected (functionality present on all such machines) and that
+ practically overrides IBPB on VMEXIT as it has a lot less performance
+ impact and takes care of the guest->host attack vector too.
+
+ Currently, the mitigation uses KVM's user_return approach
+ (kvm_set_user_return_msr()) to set the BpSpecReduce bit when a vCPU runs
+ a guest and reset it upon return to host userspace or when the KVM module
+ is unloaded. The intent being, the small perf impact of BpSpecReduce should
+ be incurred only when really necessary.
+
In order to exploit vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 0e2d81763615..00f2649c36ab 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -466,6 +466,7 @@
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_BP_SPEC_REDUCE (20*32+31) /* MSR BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..1372a569fb58 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -717,6 +717,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 8854d9bce2a5..971d3373eeaf 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2523,6 +2523,7 @@ enum srso_mitigation {
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
+ SRSO_MITIGATION_BP_SPEC_REDUCE,
};
enum srso_mitigation_cmd {
@@ -2540,7 +2541,8 @@ static const char * const srso_strings[] = {
[SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
- [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
+ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
+ [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2577,6 +2579,8 @@ static void __init srso_select_mitigation(void)
if (!boot_cpu_has_bug(X86_BUG_SRSO) ||
cpu_mitigations_off() ||
srso_cmd == SRSO_CMD_OFF) {
+ setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
+
if (boot_cpu_has(X86_FEATURE_SBPB))
x86_pred_cmd = PRED_CMD_SBPB;
return;
@@ -2665,6 +2669,12 @@ static void __init srso_select_mitigation(void)
ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
+ if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
@@ -2686,6 +2696,9 @@ static void __init srso_select_mitigation(void)
}
out:
+ if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
+ setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
+
pr_info("%s\n", srso_strings[srso_mitigation]);
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7640a84e554a..6ea3632af580 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -257,6 +257,7 @@ DEFINE_PER_CPU(struct svm_cpu_data, svm_data);
* defer the restoration of TSC_AUX until the CPU returns to userspace.
*/
static int tsc_aux_uret_slot __read_mostly = -1;
+static int zen4_bp_cfg_uret_slot __ro_after_init = -1;
static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
@@ -1540,6 +1541,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
(!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));
+
svm->guest_state_loaded = true;
}
@@ -5306,6 +5312,14 @@ static __init int svm_hardware_setup(void)
tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
+ if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot < 0)) {
+ r = -EIO;
+ goto err;
+ }
+ }
+
if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
kvm_enable_efer_bits(EFER_AUTOIBRS);
base-commit: 9bec4a1f4ea92b99f96894e0c51c192b5345aa91
--
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-17 18:56 ` Sean Christopherson
@ 2025-01-18 15:26 ` Borislav Petkov
2025-01-23 16:25 ` Sean Christopherson
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-01-18 15:26 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Fri, Jan 17, 2025 at 10:56:15AM -0800, Sean Christopherson wrote:
> No preference, either way works for me.
Yeah, too late now. Will queue it after -rc1.
> Heh, no preference again. In case you want to add Co-developed-by, here's my SoB:
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Thanks. Added.
> Any objection to calling this X86_FEATURE_SRSO_BP_SPEC_REDUCE? More below.
It sure is a better name. Let's leave SRSO_MSR_FIX in the comment so that
grepping can find it, as SRSO_MSR_FIX is what is in the official doc. IOW:
+#define X86_FEATURE_SRSO_BP_SPEC_REDUCE (20*32+31) /*
+ * BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs.
+ * (SRSO_MSR_FIX in the official doc).
+ */
> At the risk of getting too clever, what if we use the X86_FEATURE to communicate
> that KVM should toggle the magic MSR?
Not too clever at all. I myself have used exactly this scheme to communicate
settings to other code and actually I should've thought about it... :-\
Definitely better than the silly export.
> That'd avoid the helper+export, and up to this point "srso_mitigation" has
> been used only for documentation purposes.
>
> The flag just needs to be cleared in two locations, which doesn't seem too
> awful. Though if we go that route, SRSO_MSR_FIX is a pretty crappy name :-)
Yap, most def.
> These exports are no longer necessary.
Doh.
> Compile tested only...
Tested on hw.
Thanks!
---
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Date: Wed, 13 Nov 2024 13:41:10 +0100
Subject: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it
indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate
SRSO.
Enable this BpSpecReduce bit to mitigate SRSO across guest/host
boundaries.
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
Documentation/admin-guide/hw-vuln/srso.rst | 21 +++++++++++++++++++++
arch/x86/include/asm/cpufeatures.h | 4 ++++
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/bugs.c | 14 +++++++++++++-
arch/x86/kvm/svm/svm.c | 14 ++++++++++++++
5 files changed, 53 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c88..b856538083a2 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,6 +104,27 @@ The possible values in this file are:
(spec_rstack_overflow=ibpb-vmexit)
+ * 'Mitigation: Reduced Speculation':
+
+ This mitigation gets automatically enabled when the above one "IBPB on
+ VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+ It gets automatically enabled on machines which have the
+ SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
+ to the above =ibpb-vmexit mitigation because the user/kernel boundary is
+ not affected anymore and thus "safe RET" is not needed.
+
+ After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
+ is detected (functionality present on all such machines) and that
+ practically overrides IBPB on VMEXIT as it has a lot less performance
+ impact and takes care of the guest->host attack vector too.
+
+ Currently, the mitigation uses KVM's user_return approach
+ (kvm_set_user_return_msr()) to set the BpSpecReduce bit when a vCPU runs
+ a guest and reset it upon return to host userspace or when the KVM module
+ is unloaded. The intent being, the small perf impact of BpSpecReduce should
+ be incurred only when really necessary.
+
In order to exploit vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 508c0dad116b..43653f2704c9 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -468,6 +468,10 @@
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_BP_SPEC_REDUCE (20*32+31) /*
+ * BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs.
+ * (SRSO_MSR_FIX in the official doc).
+ */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3f3e2bc99162..4cbd461081a1 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -719,6 +719,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 5a505aa65489..9e3ea7f1b358 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2523,6 +2523,7 @@ enum srso_mitigation {
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
+ SRSO_MITIGATION_BP_SPEC_REDUCE,
};
enum srso_mitigation_cmd {
@@ -2540,7 +2541,8 @@ static const char * const srso_strings[] = {
[SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
- [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
+ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
+ [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2663,6 +2665,12 @@ static void __init srso_select_mitigation(void)
ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
+ if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
@@ -2684,6 +2692,10 @@ static void __init srso_select_mitigation(void)
}
out:
+
+ if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
+ setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
+
pr_info("%s\n", srso_strings[srso_mitigation]);
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 21dacd312779..ee14833f00e5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -256,6 +256,7 @@ DEFINE_PER_CPU(struct svm_cpu_data, svm_data);
* defer the restoration of TSC_AUX until the CPU returns to userspace.
*/
static int tsc_aux_uret_slot __read_mostly = -1;
+static int zen4_bp_cfg_uret_slot __ro_after_init = -1;
static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
@@ -1541,6 +1542,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
(!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));
+
svm->guest_state_loaded = true;
}
@@ -5298,6 +5304,14 @@ static __init int svm_hardware_setup(void)
tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
+ if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot < 0)) {
+ r = -EIO;
+ goto err;
+ }
+ }
+
if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
kvm_enable_efer_bits(EFER_AUTOIBRS);
--
2.43.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-18 15:26 ` Borislav Petkov
@ 2025-01-23 16:25 ` Sean Christopherson
2025-01-23 17:01 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2025-01-23 16:25 UTC (permalink / raw)
To: Borislav Petkov
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Sat, Jan 18, 2025, Borislav Petkov wrote:
> static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
> @@ -2663,6 +2665,12 @@ static void __init srso_select_mitigation(void)
Unless I'm missing something, the cpu_mitigations_off() and "srso_cmd == SRSO_CMD_OFF"
cases need to clear the feature:
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 9e3ea7f1b3587..3939a8dee27d4 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2581,6 +2581,7 @@ static void __init srso_select_mitigation(void)
srso_cmd == SRSO_CMD_OFF) {
if (boot_cpu_has(X86_FEATURE_SBPB))
x86_pred_cmd = PRED_CMD_SBPB;
+ setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
return;
}
There's also the Zen1/Zen2 ucode+!SMT path, which I assume is irrelevant in
practice:
if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) {
setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
return;
}
But if we wanted to catch all paths, wrap the guts and clear the feature in the
outer layer?
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 9e3ea7f1b3587..0501e31971421 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2572,7 +2572,7 @@ early_param("spec_rstack_overflow", srso_parse_cmdline);
#define SRSO_NOTICE "WARNING: See https://kernel.org/doc/html/latest/admin-guide/hw-vuln/srso.html for mitigation options."
-static void __init srso_select_mitigation(void)
+static void __init __srso_select_mitigation(void)
{
bool has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE);
@@ -2692,11 +2692,15 @@ static void __init srso_select_mitigation(void)
}
out:
+ pr_info("%s\n", srso_strings[srso_mitigation]);
+}
+
+static void __init srso_select_mitigation(void)
+{
+ __srso_select_mitigation();
if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
-
- pr_info("%s\n", srso_strings[srso_mitigation]);
}
#undef pr_fmt
> ibpb_on_vmexit:
> case SRSO_CMD_IBPB_ON_VMEXIT:
> + if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
> + pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
> + srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
> + break;
> + }
> +
> if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
> if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
> setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
> @@ -2684,6 +2692,10 @@ static void __init srso_select_mitigation(void)
> }
>
> out:
> +
Spurious newlines.
> + if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
> + setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
> +
> pr_info("%s\n", srso_strings[srso_mitigation]);
> }
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-23 16:25 ` Sean Christopherson
@ 2025-01-23 17:01 ` Borislav Petkov
2025-01-23 18:04 ` Sean Christopherson
2025-02-13 10:53 ` Patrick Bellasi
0 siblings, 2 replies; 68+ messages in thread
From: Borislav Petkov @ 2025-01-23 17:01 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Thu, Jan 23, 2025 at 08:25:17AM -0800, Sean Christopherson wrote:
> But if we wanted to catch all paths, wrap the guts and clear the feature in the
> outer layer?
Yap, all valid points, thanks for catching those.
> +static void __init srso_select_mitigation(void)
> +{
> + __srso_select_mitigation();
>
> if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
> setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
> -
> - pr_info("%s\n", srso_strings[srso_mitigation]);
> }
What I'd like here, though, is to not dance around setting this srso_mitigation
variable in __srso_select_mitigation() and then rely on knowing that the __
function did modify it before we can eval it.
I'd like for the __ function to return it like __ssb_select_mitigation() does.
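Schematically, something like this (a sketch of the shape, not actual
patch code):

	static enum srso_mitigation __init __srso_select_mitigation(void)
	{
		enum srso_mitigation mitigation = SRSO_MITIGATION_NONE;

		/* ... selection logic; every path returns its verdict ... */

		return mitigation;
	}

	static void __init srso_select_mitigation(void)
	{
		srso_mitigation = __srso_select_mitigation();

		if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
			setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);

		pr_info("%s\n", srso_strings[srso_mitigation]);
	}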
But then if we do that, we'll have to make the same changes and turn the returns
into "goto out" where all the paths converge. And I'd prefer that those paths
converged anyway and didn't have some "early escapes" like those returns which
I completely overlooked. :-\
And that code is going to change soon anyway after David's attack vectors
series.
So, long story short, I guess the simplest thing would be to simply do
the below.
I *think*.
I'll stare at it later with a clear head and test all cases to make sure
nothing's escaping anymore.
Thx!
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 9e3ea7f1b358..11cafe293c29 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2581,7 +2581,7 @@ static void __init srso_select_mitigation(void)
srso_cmd == SRSO_CMD_OFF) {
if (boot_cpu_has(X86_FEATURE_SBPB))
x86_pred_cmd = PRED_CMD_SBPB;
- return;
+ goto out;
}
if (has_microcode) {
@@ -2593,7 +2593,7 @@ static void __init srso_select_mitigation(void)
*/
if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) {
setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
- return;
+ goto out;
}
if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
@@ -2692,11 +2692,11 @@ static void __init srso_select_mitigation(void)
}
out:
-
if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
- pr_info("%s\n", srso_strings[srso_mitigation]);
+ if (srso_mitigation != SRSO_MITIGATION_NONE)
+ pr_info("%s\n", srso_strings[srso_mitigation]);
}
#undef pr_fmt
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-23 17:01 ` Borislav Petkov
@ 2025-01-23 18:04 ` Sean Christopherson
2025-01-24 12:58 ` Borislav Petkov
2025-02-13 10:53 ` Patrick Bellasi
1 sibling, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2025-01-23 18:04 UTC (permalink / raw)
To: Borislav Petkov
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Thu, Jan 23, 2025, Borislav Petkov wrote:
> On Thu, Jan 23, 2025 at 08:25:17AM -0800, Sean Christopherson wrote:
> > But if we wanted to catch all paths, wrap the guts and clear the feature in the
> > outer layer?
>
> Yap, all valid points, thanks for catching those.
>
> > +static void __init srso_select_mitigation(void)
> > +{
> > + __srso_select_mitigation();
> >
> > if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
> > setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
> > -
> > - pr_info("%s\n", srso_strings[srso_mitigation]);
> > }
>
> What I'd like, though, here is to not dance around this srso_mitigation
> variable setting in __srso_select_mitigation() and then know that the __
> function did modify it and now we can eval it.
>
> I'd like for the __ function to return it like __ssb_select_mitigation() does.
>
> But then if we do that, we'll have to do the same changes and turn the returns
> to "goto out" where all the paths converge. And I'd prefer if those paths
> converged anyway and not have some "early escapes" like those returns which
> I completely overlooked. :-\
>
> And that code is going to change soon anyway after David's attack vectors
> series.
>
> So, long story short, I guess the simplest thing to do would be to simply do
> the below.
I almost proposed that as well; the only reason I didn't is that I wasn't sure
what to do with the pr_info() at the end.
^ permalink raw reply [flat|nested] 68+ messages in thread* Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-23 18:04 ` Sean Christopherson
@ 2025-01-24 12:58 ` Borislav Petkov
2025-02-11 19:19 ` Jim Mattson
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-01-24 12:58 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, X86 ML, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, KVM, LKML
On Thu, Jan 23, 2025 at 10:04:17AM -0800, Sean Christopherson wrote:
> I almost proposed that as well, the only reason I didn't is because I wasn't
> sure what to do with the pr_info() at the end.
Yeah, that came up too while looking at David's patches. We're not really
consistent about when we issue the mitigation strings, depending on the
mitigation or on whether we're affected or not...
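(E.g., both of these variants appear in hunks in this very thread:

	pr_info("%s\n", srso_strings[srso_mitigation]);

vs.

	if (srso_mitigation != SRSO_MITIGATION_NONE)
		pr_info("%s\n", srso_strings[srso_mitigation]);

and the other *_select_mitigation() functions differ in the same way.)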
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-24 12:58 ` Borislav Petkov
@ 2025-02-11 19:19 ` Jim Mattson
2025-02-11 20:51 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Jim Mattson @ 2025-02-11 19:19 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Borislav Petkov, X86 ML, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, KVM, LKML, Yosry Ahmed,
Brendan Jackman
Borislav,
What is the performance impact of BpSpecReduce on Turin?
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-11 19:19 ` Jim Mattson
@ 2025-02-11 20:51 ` Borislav Petkov
0 siblings, 0 replies; 68+ messages in thread
From: Borislav Petkov @ 2025-02-11 20:51 UTC (permalink / raw)
To: Jim Mattson
Cc: Sean Christopherson, Borislav Petkov, X86 ML, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, KVM, LKML, Yosry Ahmed,
Brendan Jackman
On Tue, Feb 11, 2025 at 11:19:25AM -0800, Jim Mattson wrote:
> What is the performance impact of BpSpecReduce on Turin?
See upthread:
https://lore.kernel.org/all/20241216173142.GDZ2Bj_uPBG3TTPYd_@fat_crate.local
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-01-23 17:01 ` Borislav Petkov
2025-01-23 18:04 ` Sean Christopherson
@ 2025-02-13 10:53 ` Patrick Bellasi
2025-02-13 13:44 ` Patrick Bellasi
1 sibling, 1 reply; 68+ messages in thread
From: Patrick Bellasi @ 2025-02-13 10:53 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Paolo Bonzini, Josh Poimboeuf, Pawan Gupta,
x86, kvm, linux-kernel, Patrick Bellasi
FWIW, this should be the updated version of the patch with all the review
comments posted so far addressed.
Posting here just to have an overall view of what the new patch should look like.
This is also based on today's Linus master branch.
Compile tested only...
Best,
Patrick
---
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it
indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate
SRSO.
Enable this BpSpecReduce bit to mitigate SRSO across guest/host
boundaries.
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
Documentation/admin-guide/hw-vuln/srso.rst | 20 ++++++++++++++++++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/bugs.c | 21 +++++++++++++++++----
arch/x86/kvm/svm/svm.c | 14 ++++++++++++++
tools/arch/x86/include/asm/msr-index.h | 1 +
5 files changed, 53 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c883..49680ab99c393 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,6 +104,26 @@ The possible values in this file are:
(spec_rstack_overflow=ibpb-vmexit)
+ * 'Mitigation: Reduced Speculation':
+
+ This mitigation gets automatically enabled when the above one "IBPB on
+ VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+ It gets automatically enabled on machines which have the
+ SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
+ to the above =ibpb-vmexit mitigation because the user/kernel boundary is
+ not affected anymore and thus "safe RET" is not needed.
+
+ After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
+ is detected (functionality present on all such machines) and that
+ practically overrides IBPB on VMEXIT as it has a lot less performance
+ impact and takes care of the guest->host attack vector too.
+
+ Currently, the mitigation uses KVM's user_return approach
+ (kvm_set_user_return_msr()) to set the BpSpecReduce bit when a vCPU runs
+ a guest and reset it upon return to host userspace or when the KVM module
+ is unloaded. The intent being, the small perf impact of BpSpecReduce should
+ be incurred only when really necessary.
In order to exploit vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 508c0dad116bc..c46754298507b 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -468,6 +468,7 @@
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_BP_SPEC_REDUCE (20*32+31) /* BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs (SRSO_MSR_FIX in AMD docs). */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index a5d0998d76049..d2007dbfcc1cc 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2522,6 +2522,7 @@ enum srso_mitigation {
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
+ SRSO_MITIGATION_BP_SPEC_REDUCE,
};
enum srso_mitigation_cmd {
@@ -2539,7 +2540,8 @@ static const char * const srso_strings[] = {
[SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
- [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
+ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
+ [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2578,7 +2580,7 @@ static void __init srso_select_mitigation(void)
srso_cmd == SRSO_CMD_OFF) {
if (boot_cpu_has(X86_FEATURE_SBPB))
x86_pred_cmd = PRED_CMD_SBPB;
- return;
+ goto out;
}
if (has_microcode) {
@@ -2590,7 +2592,7 @@ static void __init srso_select_mitigation(void)
*/
if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) {
setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
- return;
+ goto out;
}
if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
@@ -2670,6 +2672,12 @@ static void __init srso_select_mitigation(void)
ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
+ if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {
if (has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
@@ -2691,7 +2699,12 @@ static void __init srso_select_mitigation(void)
}
out:
- pr_info("%s\n", srso_strings[srso_mitigation]);
+
+ if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
+ setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
+
+ if (srso_mitigation != SRSO_MITIGATION_NONE)
+ pr_info("%s\n", srso_strings[srso_mitigation]);
}
#undef pr_fmt
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7640a84e554a6..6ea3632af5807 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -257,6 +257,7 @@ DEFINE_PER_CPU(struct svm_cpu_data, svm_data);
* defer the restoration of TSC_AUX until the CPU returns to userspace.
*/
static int tsc_aux_uret_slot __read_mostly = -1;
+static int zen4_bp_cfg_uret_slot __ro_after_init = -1;
static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
@@ -1540,6 +1541,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
(!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));
+
svm->guest_state_loaded = true;
}
@@ -5306,6 +5312,14 @@ static __init int svm_hardware_setup(void)
tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
+ if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot < 0)) {
+ r = -EIO;
+ goto err;
+ }
+ }
+
if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
kvm_enable_efer_bits(EFER_AUTOIBRS);
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6db..1372a569fb585 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -717,6 +717,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-13 10:53 ` Patrick Bellasi
@ 2025-02-13 13:44 ` Patrick Bellasi
2025-02-13 14:28 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Patrick Bellasi @ 2025-02-13 13:44 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Paolo Bonzini, Josh Poimboeuf, Pawan Gupta,
x86, kvm, linux-kernel, Patrick Bellasi
And, of course, this bit:
> diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
> index 3ae84c3b8e6db..1372a569fb585 100644
> --- a/tools/arch/x86/include/asm/msr-index.h
> +++ b/tools/arch/x86/include/asm/msr-index.h
> @@ -717,6 +717,7 @@
> >
> /* Zen4 */
> #define MSR_ZEN4_BP_CFG 0xc001102e
> +#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
has to be added to arch/x86/include/asm/msr-index.h as well.
Following is (yet another) updated version accounting for this.
Best,
Patrick
---
From: Borislav Petkov <bp@alien8.de>
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it
indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate
SRSO.
Enable this BpSpecReduce bit to mitigate SRSO across guest/host
boundaries.
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Patrick Bellasi <derkling@google.com>
---
Documentation/admin-guide/hw-vuln/srso.rst | 20 ++++++++++++++++++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/bugs.c | 21 +++++++++++++++++----
arch/x86/kvm/svm/svm.c | 14 ++++++++++++++
tools/arch/x86/include/asm/msr-index.h | 1 +
6 files changed, 54 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c883..49680ab99c393 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,6 +104,26 @@ The possible values in this file are:
(spec_rstack_overflow=ibpb-vmexit)
+ * 'Mitigation: Reduced Speculation':
+
+ This mitigation gets automatically enabled when the above one "IBPB on
+ VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+ It gets automatically enabled on machines which have the
+ SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
+ to the above =ibpb-vmexit mitigation because the user/kernel boundary is
+ not affected anymore and thus "safe RET" is not needed.
+
+ After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
+ is detected (functionality present on all such machines) and that
+ practically overrides IBPB on VMEXIT as it has a lot less performance
+ impact and takes care of the guest->host attack vector too.
+
+ Currently, the mitigation uses KVM's user_return approach
+ (kvm_set_user_return_msr()) to set the BpSpecReduce bit when a vCPU runs
+ a guest and reset it upon return to host userspace or when the KVM module
+ is unloaded. The intent being, the small perf impact of BpSpecReduce should
+ be incurred only when really necessary.
In order to exploit vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 508c0dad116bc..c46754298507b 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -468,6 +468,7 @@
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_BP_SPEC_REDUCE (20*32+31) /* BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs (SRSO_MSR_FIX in AMD docs). */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 9a71880eec070..6bbc8836d6766 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -720,6 +720,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index a5d0998d76049..d2007dbfcc1cc 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2522,6 +2522,7 @@ enum srso_mitigation {
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
+ SRSO_MITIGATION_BP_SPEC_REDUCE,
};
enum srso_mitigation_cmd {
@@ -2539,7 +2540,8 @@ static const char * const srso_strings[] = {
[SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
- [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
+ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
+ [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2578,7 +2580,7 @@ static void __init srso_select_mitigation(void)
srso_cmd == SRSO_CMD_OFF) {
if (boot_cpu_has(X86_FEATURE_SBPB))
x86_pred_cmd = PRED_CMD_SBPB;
- return;
+ goto out;
}
if (has_microcode) {
@@ -2590,7 +2592,7 @@ static void __init srso_select_mitigation(void)
*/
if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) {
setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
- return;
+ goto out;
}
if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
@@ -2670,6 +2672,12 @@ static void __init srso_select_mitigation(void)
ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
+ if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {
if (has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
@@ -2691,7 +2699,12 @@ static void __init srso_select_mitigation(void)
}
out:
- pr_info("%s\n", srso_strings[srso_mitigation]);
+
+ if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
+ setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
+
+ if (srso_mitigation != SRSO_MITIGATION_NONE)
+ pr_info("%s\n", srso_strings[srso_mitigation]);
}
#undef pr_fmt
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7640a84e554a6..6ea3632af5807 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -257,6 +257,7 @@ DEFINE_PER_CPU(struct svm_cpu_data, svm_data);
* defer the restoration of TSC_AUX until the CPU returns to userspace.
*/
static int tsc_aux_uret_slot __read_mostly = -1;
+static int zen4_bp_cfg_uret_slot __ro_after_init = -1;
static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
@@ -1540,6 +1541,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
(!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));
+
svm->guest_state_loaded = true;
}
@@ -5306,6 +5312,14 @@ static __init int svm_hardware_setup(void)
tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
+ if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot < 0)) {
+ r = -EIO;
+ goto err;
+ }
+ }
+
if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
kvm_enable_efer_bits(EFER_AUTOIBRS);
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6db..1372a569fb585 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -717,6 +717,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-13 13:44 ` Patrick Bellasi
@ 2025-02-13 14:28 ` Borislav Petkov
2025-02-13 17:50 ` Patrick Bellasi
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-02-13 14:28 UTC (permalink / raw)
To: Patrick Bellasi, Sean Christopherson
Cc: Paolo Bonzini, Josh Poimboeuf, Pawan Gupta, x86, kvm,
linux-kernel, Patrick Bellasi, Brendan Jackman
+ bjackman.
On Thu, Feb 13, 2025 at 01:44:08PM +0000, Patrick Bellasi wrote:
> Following (yet another) updated versions accounting for this.
Here's a tested one. You could've simply waited, as I told you I was working
on it.
I'm going to queue this one next week into tip once Patrick's fix becomes part
of -rc3.
Thx.
---
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Date: Wed, 13 Nov 2024 13:41:10 +0100
Subject: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it
indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate
SRSO.
Enable BpSpecReduce to mitigate SRSO across guest/host boundaries.
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
Documentation/admin-guide/hw-vuln/srso.rst | 21 +++++++++++++++++++
arch/x86/include/asm/cpufeatures.h | 4 ++++
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/bugs.c | 24 ++++++++++++++++++----
arch/x86/kvm/svm/svm.c | 14 +++++++++++++
5 files changed, 60 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c88..b856538083a2 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,6 +104,27 @@ The possible values in this file are:
(spec_rstack_overflow=ibpb-vmexit)
+ * 'Mitigation: Reduced Speculation':
+
+ This mitigation gets automatically enabled when the above one "IBPB on
+ VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+ It gets automatically enabled on machines which have the
+ SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
+ to the above =ibpb-vmexit mitigation because the user/kernel boundary is
+ not affected anymore and thus "safe RET" is not needed.
+
+ After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
+ is detected (functionality present on all such machines) and that
+ practically overrides IBPB on VMEXIT as it has a lot less performance
+ impact and takes care of the guest->host attack vector too.
+
+ Currently, the mitigation uses KVM's user_return approach
+ (kvm_set_user_return_msr()) to set the BpSpecReduce bit when a vCPU runs
+ a guest and reset it upon return to host userspace or when the KVM module
+ is unloaded. The intent being, the small perf impact of BpSpecReduce should
+ be incurred only when really necessary.
+
In order to exploit vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 508c0dad116b..43653f2704c9 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -468,6 +468,10 @@
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_BP_SPEC_REDUCE (20*32+31) /*
+ * BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs.
+ * (SRSO_MSR_FIX in the official doc).
+ */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 72765b2fe0d8..d35519b337ba 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -721,6 +721,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index a5d0998d7604..1d7afc40f227 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2522,6 +2522,7 @@ enum srso_mitigation {
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
+ SRSO_MITIGATION_BP_SPEC_REDUCE,
};
enum srso_mitigation_cmd {
@@ -2539,7 +2540,8 @@ static const char * const srso_strings[] = {
[SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
- [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
+ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
+ [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2578,7 +2580,7 @@ static void __init srso_select_mitigation(void)
srso_cmd == SRSO_CMD_OFF) {
if (boot_cpu_has(X86_FEATURE_SBPB))
x86_pred_cmd = PRED_CMD_SBPB;
- return;
+ goto out;
}
if (has_microcode) {
@@ -2590,7 +2592,7 @@ static void __init srso_select_mitigation(void)
*/
if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) {
setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
- return;
+ goto out;
}
if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
@@ -2670,6 +2672,12 @@ static void __init srso_select_mitigation(void)
ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
+ if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {
if (has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
@@ -2691,7 +2699,15 @@ static void __init srso_select_mitigation(void)
}
out:
- pr_info("%s\n", srso_strings[srso_mitigation]);
+ /*
+ * Clear the feature flag if this mitigation is not selected as that
+ * feature flag controls the BpSpecReduce MSR bit toggling in KVM.
+ */
+ if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
+ setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
+
+ if (srso_mitigation != SRSO_MITIGATION_NONE)
+ pr_info("%s\n", srso_strings[srso_mitigation]);
}
#undef pr_fmt
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7640a84e554a..6ea3632af580 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -257,6 +257,7 @@ DEFINE_PER_CPU(struct svm_cpu_data, svm_data);
* defer the restoration of TSC_AUX until the CPU returns to userspace.
*/
static int tsc_aux_uret_slot __read_mostly = -1;
+static int zen4_bp_cfg_uret_slot __ro_after_init = -1;
static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
@@ -1540,6 +1541,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
(!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
+ BIT_ULL(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));
+
svm->guest_state_loaded = true;
}
@@ -5306,6 +5312,14 @@ static __init int svm_hardware_setup(void)
tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
+ if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot < 0)) {
+ r = -EIO;
+ goto err;
+ }
+ }
+
if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
kvm_enable_efer_bits(EFER_AUTOIBRS);
--
2.43.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-13 14:28 ` Borislav Petkov
@ 2025-02-13 17:50 ` Patrick Bellasi
2025-02-14 20:10 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Patrick Bellasi @ 2025-02-13 17:50 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Paolo Bonzini, Josh Poimboeuf, Pawan Gupta,
x86, kvm, linux-kernel, Patrick Bellasi, Brendan Jackman,
Yosry Ahmed
> On Thu, Feb 13, 2025 at 01:44:08PM +0000, Patrick Bellasi wrote:
> > Following is (yet another) updated version accounting for this.
> >
> Here's a tested one. You could've simply waited as I told you I'm working on
> it.
Thanks Borislav, no problem for me; I'm just looking into these SRSO fixes and
what I did was mostly a way to ramp up.
> I'm going to queue this one next week into tip once Patrick's fix becomes part
> of -rc3.
Thanks for all the efforts.
Meanwhile I've a couple of questions about the approach proposed here.
> + * 'Mitigation: Reduced Speculation':
> +
> + This mitigation gets automatically enabled when the above one "IBPB on
> + VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
> +
> + It gets automatically enabled on machines which have the
> + SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
> + to the above =ibpb-vmexit mitigation because the user/kernel boundary is
> + not affected anymore and thus "safe RET" is not needed.
> +
> + After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
> + is detected (functionality present on all such machines) and that
> + practically overrides IBPB on VMEXIT as it has a lot less performance
> + impact and takes care of the guest->host attack vector too.
> +
> + Currently, the mitigation uses KVM's user_return approach
> + (kvm_set_user_return_msr()) to set the BpSpecReduce bit when a vCPU runs
> + a guest and reset it upon return to host userspace or when the KVM module
> + is unloaded. The intent being, the small perf impact of BpSpecReduce should
> + be incurred only when really necessary.
According to [1], the AMD Whitepaper on SRSO says:
Processors which set SRSO_MSR_FIX=1 support an MSR bit which mitigates SRSO
across guest/host boundaries. Software may enable this by setting bit 4
(BpSpecReduce) of MSR C001_102E. This bit can be set once during boot and
should be set identically across all processors in the system.
The "should be set identically across all processors in the system" makes me
wondering if using the "KVM's user_return approach" proposed here is robust
enough. Could this not lead to the bit being possibly set only on some CPU
but not others?
Am I missing something? Is the MSR used something "shared" across all the cores
of a CPU, i.e. as long as a vCPU is running, the entire system is still protected?
Maybe there is even a more recent whitepaper online (which I did not find) with
a different recommendation?
Also, is setting the MSR bit only while the guest is running sufficient to
prevent both speculation and training?
Think about this scenario:
(a) Guest runs with BpSpecReduce MSR set by KVM and performs training.
(b) We exit into userspace and clear BpSpecReduce.
(c) We go back into the kernel with BpSpecReduce cleared and the guest
training intact.
If (a) can be done (i.e. the MSR does not prevent training), don't we end up in
a vulnerable state in (c)?
If BpSpecReduce does not prevent training, but only prevents the training from
being used, shouldn't we keep it consistently set after a guest has run, or
until an IBPB is executed?
Best,
Patrick
[1] https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-13 17:50 ` Patrick Bellasi
@ 2025-02-14 20:10 ` Borislav Petkov
2025-02-15 0:57 ` Yosry Ahmed
2025-02-15 12:53 ` Borislav Petkov
0 siblings, 2 replies; 68+ messages in thread
From: Borislav Petkov @ 2025-02-14 20:10 UTC (permalink / raw)
To: Patrick Bellasi
Cc: Sean Christopherson, Paolo Bonzini, Josh Poimboeuf, Pawan Gupta,
x86, kvm, linux-kernel, Patrick Bellasi, Brendan Jackman,
Yosry Ahmed
On Thu, Feb 13, 2025 at 05:50:57PM +0000, Patrick Bellasi wrote:
> The "should be set identically across all processors in the system" makes me
> wondering if using the "KVM's user_return approach" proposed here is robust
> enough. Could this not lead to the bit being possibly set only on some CPU
> but not others?
That's fine, we should update that paper.
> If BpSpecReduce does not prevent training, but only prevents the training from
> being used, shouldn't we keep it consistently set after a guest has run, or
> until an IBPB is executed?
After talking with folks internally, you're probably right. We should slap an
IBPB before clearing. Which means, I cannot use the MSR return slots anymore.
I will have to resurrect some of the other solutions we had lined up...
Stay tuned.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-14 20:10 ` Borislav Petkov
@ 2025-02-15 0:57 ` Yosry Ahmed
2025-02-15 9:15 ` Borislav Petkov
2025-02-15 12:53 ` Borislav Petkov
1 sibling, 1 reply; 68+ messages in thread
From: Yosry Ahmed @ 2025-02-15 0:57 UTC (permalink / raw)
To: Borislav Petkov
Cc: Patrick Bellasi, Sean Christopherson, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86, kvm, linux-kernel,
Patrick Bellasi, Brendan Jackman
On Fri, Feb 14, 2025 at 09:10:05PM +0100, Borislav Petkov wrote:
> On Thu, Feb 13, 2025 at 05:50:57PM +0000, Patrick Bellasi wrote:
> > The "should be set identically across all processors in the system" makes me
> > wondering if using the "KVM's user_return approach" proposed here is robust
> > enough. Could this not lead to the bit being possibly set only on some CPU
> > but not others?
>
> That's fine, we should update that paper.
>
> > If BpSpecReduce does not prevent training, but only prevents the training from
> > being used, shouldn't we keep it consistently set after a guest has run, or
> > until an IBPB is executed?
>
> After talking with folks internally, you're probably right. We should slap an
> IBPB before clearing. Which means, I cannot use the MSR return slots anymore.
> I will have to resurrect some of the other solutions we had lined up...
>
> Stay tuned.
Thanks for working on this!
Should this patch (and the two previously merged patches) be backported
to stable? I noticed they did not have CC:stable.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-15 0:57 ` Yosry Ahmed
@ 2025-02-15 9:15 ` Borislav Petkov
2025-02-17 5:47 ` Yosry Ahmed
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-02-15 9:15 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Patrick Bellasi, Sean Christopherson, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86, kvm, linux-kernel,
Patrick Bellasi, Brendan Jackman
On Sat, Feb 15, 2025 at 12:57:07AM +0000, Yosry Ahmed wrote:
> Should this patch (and the two previously merged patches) be backported
> to stable? I noticed they did not have CC:stable.
Because a stable backport would require manual backporting, depending on where
it goes. And Brendan, Patrick or I are willing to do that - it's a question of
who gets there first. :-)
And folks, whoever wants to backport, pls tell the others so that we don't
duplicate work.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-15 9:15 ` Borislav Petkov
@ 2025-02-17 5:47 ` Yosry Ahmed
2025-02-17 15:26 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Yosry Ahmed @ 2025-02-17 5:47 UTC (permalink / raw)
To: Borislav Petkov
Cc: Patrick Bellasi, Sean Christopherson, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86, kvm, linux-kernel,
Patrick Bellasi, Brendan Jackman
On Sat, Feb 15, 2025 at 10:15:27AM +0100, Borislav Petkov wrote:
> On Sat, Feb 15, 2025 at 12:57:07AM +0000, Yosry Ahmed wrote:
> > Should this patch (and the two previously merged patches) be backported
> > to stable? I noticed they did not have CC:stable.
>
> Because a stable backport would require manual backporting, depending on where
> it goes. And Brendan, Patrick or I are willing to do that - it's a question of
> who gets there first. :-)
Thanks for the context, didn't realize a manual backport was needed.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-17 5:47 ` Yosry Ahmed
@ 2025-02-17 15:26 ` Borislav Petkov
0 siblings, 0 replies; 68+ messages in thread
From: Borislav Petkov @ 2025-02-17 15:26 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Patrick Bellasi, Sean Christopherson, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86, kvm, linux-kernel,
Patrick Bellasi, Brendan Jackman
On Sun, Feb 16, 2025 at 09:47:56PM -0800, Yosry Ahmed wrote:
> Thanks for the context, didn't realize a manual backport was needed.
Feb 17 gregkh@linuxfoundation.org (:9.1K|) FAILED: patch "[PATCH] x86/cpu/kvm: SRSO: Fix possible missing IBPB on VM-Exit" failed to apply to 6.6-stable tree
Feb 17 gregkh@linuxfoundation.org (:9.1K|) FAILED: patch "[PATCH] x86/cpu/kvm: SRSO: Fix possible missing IBPB on VM-Exit" failed to apply to 6.1-stable tree
Feb 17 gregkh@linuxfoundation.org (:9.1K|) FAILED: patch "[PATCH] x86/cpu/kvm: SRSO: Fix possible missing IBPB on VM-Exit" failed to apply to 5.15-stable tree
Feb 17 gregkh@linuxfoundation.org (:9.2K|) FAILED: patch "[PATCH] x86/cpu/kvm: SRSO: Fix possible missing IBPB on VM-Exit" failed to apply to 5.10-stable tree
Feb 17 gregkh@linuxfoundation.org (:9.5K|) FAILED: patch "[PATCH] x86/cpu/kvm: SRSO: Fix possible missing IBPB on VM-Exit" failed to apply to 5.4-stable tree
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-14 20:10 ` Borislav Petkov
2025-02-15 0:57 ` Yosry Ahmed
@ 2025-02-15 12:53 ` Borislav Petkov
2025-02-17 5:59 ` Yosry Ahmed
1 sibling, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-02-15 12:53 UTC (permalink / raw)
To: Sean Christopherson
Cc: Patrick Bellasi, Paolo Bonzini, Josh Poimboeuf, Pawan Gupta, x86,
kvm, linux-kernel, Patrick Bellasi, Brendan Jackman, Yosry Ahmed
On Fri, Feb 14, 2025 at 09:10:05PM +0100, Borislav Petkov wrote:
> After talking with folks internally, you're probably right. We should slap an
> IBPB before clearing. Which means, I cannot use the MSR return slots anymore.
> I will have to resurrect some of the other solutions we had lined up...
So I'm thinking about this (totally untested ofc) but I'm doing it in the CLGI
region so no need to worry about migration etc.
Sean, that ok this way?
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 6ea3632af580..dcc4e5935b82 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4272,8 +4272,16 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
x86_spec_ctrl_set_guest(svm->virt_spec_ctrl);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+
svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ indirect_branch_prediction_barrier();
+ msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+ }
+
if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
x86_spec_ctrl_restore_host(svm->virt_spec_ctrl);
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-15 12:53 ` Borislav Petkov
@ 2025-02-17 5:59 ` Yosry Ahmed
2025-02-17 16:07 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Yosry Ahmed @ 2025-02-17 5:59 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86, kvm, linux-kernel,
Patrick Bellasi, Brendan Jackman
On Sat, Feb 15, 2025 at 01:53:07PM +0100, Borislav Petkov wrote:
> On Fri, Feb 14, 2025 at 09:10:05PM +0100, Borislav Petkov wrote:
> > After talking with folks internally, you're probably right. We should slap an
> > IBPB before clearing. Which means, I cannot use the MSR return slots anymore.
> > I will have to resurrect some of the other solutions we had lined up...
>
> So I'm thinking about this (totally untested ofc) but I'm doing it in the CLGI
> region so no need to worry about migration etc.
>
> Sean, that ok this way?
I am no Sean, but I have some questions about this approach if that's
okay :)
First of all, the use of indirect_branch_prediction_barrier() is
interesting to me because it only actually performs an IBPB if
X86_FEATURE_USE_IBPB is set, which is set according to the spectre_v2
mitigation. It seems to me that indirect_branch_prediction_barrier() was
originally intended for use only in switch_mm_irqs_off() ->
cond_mitigation(), where the spectre_v2 mitigations are executed, then
made its way to other places like KVM.
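For reference, that helper is roughly the following (quoting from memory, so
possibly slightly paraphrased, from arch/x86/include/asm/nospec-branch.h):

	static inline void indirect_branch_prediction_barrier(void)
	{
		u64 val = PRED_CMD_IBPB;

		/* The MSR write is patched in only if X86_FEATURE_USE_IBPB is set. */
		alternative_msr_write(MSR_IA32_PRED_CMD, val, X86_FEATURE_USE_IBPB);
	}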
Second, and probably more importantly, it seems like with this approach
we will essentially be doing an IBPB on every VM-exit AND running the
guest with reduced speculation. At least on the surface, this looks
worse than an IBPB on VM-exit. My understanding is that this MSR is
intended to be a more efficient mitigation than IBPB on VM-exit.
This probably performs considerably worse than the previous approaches,
so I am wondering which approach the 1-2% regression figure is
associated with.
If 1-2% is the cost of keeping the MSR enabled at all times, I wonder
if we should just do that for simplicity, and make it its own
mitigation option (chosen by the cmdline).
Alternatively, if that's too expensive, perhaps we can choose another
boundary to clear the MSR at and perform an IBPB. I can think of two
places:
- Upon return to userspace (similar to your previous proposal). In this
case we run userspace with the MSR cleared, and only perform an IBPB
in the exit-to-userspace path.
- In the switch_mm() path (around cond_mitigation()). Basically we keep
the MSR bit set until we go to a different process, at which point we
clear it and do an IBPB. The MSR will be set while the VMM is
running, but other processes in the system won't take the hit. We can
even be smarter and only do the MSR clear + IBPB if we're going from
a process using KVM to a process that isn't. We'll need more bookkeeping
for that though; see the rough sketch right after this list.
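Rough, completely untested sketch of that switch_mm() variant; mm_ran_guest()
is a made-up helper standing in for whatever per-mm bookkeeping we'd need:

	static void cond_srso_bp_spec_reduce_clear(struct mm_struct *prev,
						   struct mm_struct *next)
	{
		if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
			return;

		/* Keep the bit set while processes that run guests are scheduled. */
		if (!mm_ran_guest(prev) || mm_ran_guest(next))
			return;

		/* IBPB first, so the host never runs on guest-trained predictors. */
		indirect_branch_prediction_barrier();
		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
	}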
Any thoughts? Am I completely off base?
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 6ea3632af580..dcc4e5935b82 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4272,8 +4272,16 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
> if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
> x86_spec_ctrl_set_guest(svm->virt_spec_ctrl);
>
> + if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> + msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> +
> svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
>
> + if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
> + indirect_branch_prediction_barrier();
> + msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> + }
> +
> if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
> x86_spec_ctrl_restore_host(svm->virt_spec_ctrl);
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-17 5:59 ` Yosry Ahmed
@ 2025-02-17 16:07 ` Borislav Petkov
2025-02-17 19:56 ` Yosry Ahmed
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-02-17 16:07 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Sean Christopherson, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86, kvm, linux-kernel,
Patrick Bellasi, Brendan Jackman
On Sun, Feb 16, 2025 at 09:59:59PM -0800, Yosry Ahmed wrote:
> If 1-2% is the cost of keeping the MSR enabled at all times, I wonder
> if we should just do that for simplicity, and make it its own
> mitigation option (chosen by the cmdline).
Yes, that's what David and I think we should do initially.
Then we can chase some more involved scheme like setting the bit before the
first VM runs and then clearing it when the last one exits. I *think* I've
seen something like that in KVM but I need to stare in detail.
> - Upon return to userspace (similar to your previous proposal). In this
> case we run userspace with the MSR cleared, and only perform an IBPB
> in the exit-to-userspace path.
You want to IBPB before clearing the MSR as otherwise host kernel will be
running with the mistrained gunk from the guest.
> Any thoughts?
Yes, let's keep it simple and do anything more involved *only* if it is really
necessary.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-17 16:07 ` Borislav Petkov
@ 2025-02-17 19:56 ` Yosry Ahmed
2025-02-17 20:20 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Yosry Ahmed @ 2025-02-17 19:56 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86, kvm, linux-kernel,
Patrick Bellasi, Brendan Jackman
On Mon, Feb 17, 2025 at 05:07:28PM +0100, Borislav Petkov wrote:
> On Sun, Feb 16, 2025 at 09:59:59PM -0800, Yosry Ahmed wrote:
> > If 1-2% is the cost of keeping the MSR enabled at all times, I wonder
> > if we should just do that for simplicity, and make it its own
> > mitigation option (chosen by the cmdline).
>
> Yes, that's what David and I think we should do initially.
>
> Then we can chase some more involved scheme like setting the bit before the
> first VM runs and then clearing it when the last one exits. I *think* I've
> seen something like that in KVM but I need to stare in detail.
>
> > - Upon return to userspace (similar to your previous proposal). In this
> > case we run userspace with the MSR cleared, and only perform an IBPB
> > in the exit-to-userspace path.
>
> You want to IBPB before clearing the MSR as otherwise host kernel will be
> running with the mistrained gunk from the guest.
I meant IBPB + MSR clear before going to userspace, or IBPB + MSR clear
before a context switch.
>
> > Any thoughts?
>
> Yes, let's keep it simple and do anything more involved *only* if it is really
> necessary.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Re: Re: Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-17 19:56 ` Yosry Ahmed
@ 2025-02-17 20:20 ` Borislav Petkov
2025-02-17 20:32 ` Yosry Ahmed
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-02-17 20:20 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Sean Christopherson, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86, kvm, linux-kernel,
Patrick Bellasi, Brendan Jackman
On Mon, Feb 17, 2025 at 11:56:22AM -0800, Yosry Ahmed wrote:
> I meant IBPB + MSR clear before going to userspace, or IBPB + MSR clear
> before a context switch.
Basically what I said already:
"Yes, let's keep it simple and do anything more involved *only* if it is
really necessary."
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-17 20:20 ` Borislav Petkov
@ 2025-02-17 20:32 ` Yosry Ahmed
2025-02-18 11:13 ` [PATCH final?] " Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Yosry Ahmed @ 2025-02-17 20:32 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86, kvm, linux-kernel,
Patrick Bellasi, Brendan Jackman
February 17, 2025 at 12:20 PM, "Borislav Petkov" <bp@alien8.de> wrote:
>
> On Mon, Feb 17, 2025 at 11:56:22AM -0800, Yosry Ahmed wrote:
>
> >
> > I meant IBPB + MSR clear before going to userspace, or IBPB + MSR clear
> > before a context switch.
> >
>
> Basically what I said already:
>
> "Yes, let's keep it simple and do anything more involved *only* if it is
> really necessary."
Right, I agree, as I mentioned earlier (assuming the perf hit is 1-2%); I just wanted to clarify what I meant.
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH final?] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-17 20:32 ` Yosry Ahmed
@ 2025-02-18 11:13 ` Borislav Petkov
2025-02-18 14:42 ` Patrick Bellasi
2025-04-29 13:25 ` x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar Borislav Petkov
0 siblings, 2 replies; 68+ messages in thread
From: Borislav Petkov @ 2025-02-18 11:13 UTC (permalink / raw)
To: Sean Christopherson
Cc: Yosry Ahmed, Patrick Bellasi, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86, kvm, linux-kernel, Patrick Bellasi,
Brendan Jackman, David Kaplan
So,
in the interest of finally making some progress here I'd like to commit this
below (will test it one more time just in case but it should work :-P). It is
simple and straight-forward and doesn't need an IBPB when the bit gets
cleared.
A potential future improvement is David's suggestion that there could be a way
of tracking when the first guest gets started: we set the bit then, we make
sure the bit gets set on each logical CPU when the guests migrate across the
machine, and when the *last* guest exits, that bit gets cleared again.
It all depends on how much machinery is additionally needed, how ugly it
becomes, and how noticeable the perf impact even is from simply setting
that bit blindly on every CPU...
But something we can chase later, once *some* fix is there first.
Thx.
---
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it
indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate
SRSO.
Enable BpSpecReduce to mitigate SRSO across guest/host boundaries.
Switch back to enabling the bit when virtualization is enabled and to
clearing the bit when virtualization is disabled, because using an MSR slot
would clear the bit when the guest exits, and any training the guest
has done would potentially influence the host kernel when execution
enters the kernel but hasn't VMRUN the guest yet.
More detail on the public thread in Link.
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20241202120416.6054-1-bp@kernel.org
---
Documentation/admin-guide/hw-vuln/srso.rst | 13 ++++++++++++
arch/x86/include/asm/cpufeatures.h | 4 ++++
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/bugs.c | 24 ++++++++++++++++++----
arch/x86/kvm/svm/svm.c | 6 ++++++
arch/x86/lib/msr.c | 2 ++
6 files changed, 46 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c88..66af95251a3d 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,7 +104,20 @@ The possible values in this file are:
(spec_rstack_overflow=ibpb-vmexit)
+ * 'Mitigation: Reduced Speculation':
+ This mitigation gets automatically enabled when the above one "IBPB on
+ VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+ It gets automatically enabled on machines which have the
+ SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
+ to the above =ibpb-vmexit mitigation because the user/kernel boundary is
+ not affected anymore and thus "safe RET" is not needed.
+
+ After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
+ is detected (functionality present on all such machines) and that
+ practically overrides IBPB on VMEXIT as it has a lot less performance
+ impact and takes care of the guest->host attack vector too.
In order to exploit vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 508c0dad116b..43653f2704c9 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -468,6 +468,10 @@
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_BP_SPEC_REDUCE (20*32+31) /*
+ * BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs.
+ * (SRSO_MSR_FIX in the official doc).
+ */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 72765b2fe0d8..d35519b337ba 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -721,6 +721,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index a5d0998d7604..1d7afc40f227 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2522,6 +2522,7 @@ enum srso_mitigation {
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
+ SRSO_MITIGATION_BP_SPEC_REDUCE,
};
enum srso_mitigation_cmd {
@@ -2539,7 +2540,8 @@ static const char * const srso_strings[] = {
[SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
- [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
+ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
+ [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2578,7 +2580,7 @@ static void __init srso_select_mitigation(void)
srso_cmd == SRSO_CMD_OFF) {
if (boot_cpu_has(X86_FEATURE_SBPB))
x86_pred_cmd = PRED_CMD_SBPB;
- return;
+ goto out;
}
if (has_microcode) {
@@ -2590,7 +2592,7 @@ static void __init srso_select_mitigation(void)
*/
if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) {
setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
- return;
+ goto out;
}
if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
@@ -2670,6 +2672,12 @@ static void __init srso_select_mitigation(void)
ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
+ if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {
if (has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
@@ -2691,7 +2699,15 @@ static void __init srso_select_mitigation(void)
}
out:
- pr_info("%s\n", srso_strings[srso_mitigation]);
+ /*
+ * Clear the feature flag if this mitigation is not selected as that
+ * feature flag controls the BpSpecReduce MSR bit toggling in KVM.
+ */
+ if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
+ setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
+
+ if (srso_mitigation != SRSO_MITIGATION_NONE)
+ pr_info("%s\n", srso_strings[srso_mitigation]);
}
#undef pr_fmt
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a713c803a3a3..77ab66c5bb96 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -607,6 +607,9 @@ static void svm_disable_virtualization_cpu(void)
kvm_cpu_svm_disable();
amd_pmu_disable_virt();
+
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}
static int svm_enable_virtualization_cpu(void)
@@ -684,6 +687,9 @@ static int svm_enable_virtualization_cpu(void)
rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
}
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+
return 0;
}
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 4bf4fad5b148..5a18ecc04a6c 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -103,6 +103,7 @@ int msr_set_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, true);
}
+EXPORT_SYMBOL_GPL(msr_set_bit);
/**
* msr_clear_bit - Clear @bit in a MSR @msr.
@@ -118,6 +119,7 @@ int msr_clear_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, false);
}
+EXPORT_SYMBOL_GPL(msr_clear_bit);
#ifdef CONFIG_TRACEPOINTS
void do_trace_write_msr(unsigned int msr, u64 val, int failed)
--
2.43.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH final?] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-18 11:13 ` [PATCH final?] " Borislav Petkov
@ 2025-02-18 14:42 ` Patrick Bellasi
2025-02-18 15:34 ` Borislav Petkov
2025-04-29 13:25 ` x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar Borislav Petkov
1 sibling, 1 reply; 68+ messages in thread
From: Patrick Bellasi @ 2025-02-18 14:42 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Yosry Ahmed, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86, kvm, linux-kernel, Patrick Bellasi,
Brendan Jackman, David Kaplan
> in the interest of finally making some progress here I'd like to commit this
> below (will test it one more time just in case but it should work :-P). It is
> simple and straight-forward and doesn't need an IBPB when the bit gets
> cleared.
That's indeed simple and straight-forward for the time being.
Maybe a small improvement we could add on top is to have a separate and
dedicated cmdline option?
Indeed, with `X86_FEATURE_SRSO_USER_KERNEL_NO` we are not effectively using an
IBPB on VM-Exit anymore. Something like the diff down below?
Best,
Patrick
---
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 1d7afc40f2272..7609d80eda123 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2531,6 +2531,7 @@ enum srso_mitigation_cmd {
SRSO_CMD_SAFE_RET,
SRSO_CMD_IBPB,
SRSO_CMD_IBPB_ON_VMEXIT,
+ SRSO_CMD_BP_SPEC_REDUCE,
};
static const char * const srso_strings[] = {
@@ -2562,6 +2563,8 @@ static int __init srso_parse_cmdline(char *str)
srso_cmd = SRSO_CMD_IBPB;
else if (!strcmp(str, "ibpb-vmexit"))
srso_cmd = SRSO_CMD_IBPB_ON_VMEXIT;
+ else if (!strcmp(str, "spec-reduce"))
+ srso_cmd = SRSO_CMD_BP_SPEC_REDUCE;
else
pr_err("Ignoring unknown SRSO option (%s).", str);
@@ -2617,7 +2620,7 @@ static void __init srso_select_mitigation(void)
case SRSO_CMD_SAFE_RET:
if (boot_cpu_has(X86_FEATURE_SRSO_USER_KERNEL_NO))
- goto ibpb_on_vmexit;
+ goto spec_reduce;
if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
/*
@@ -2670,14 +2673,7 @@ static void __init srso_select_mitigation(void)
}
break;
-ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
- if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
- pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
- srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
- break;
- }
-
if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {
if (has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
@@ -2694,6 +2690,14 @@ static void __init srso_select_mitigation(void)
pr_err("WARNING: kernel not compiled with MITIGATION_IBPB_ENTRY.\n");
}
break;
+
+spec_reduce:
+ case SRSO_CMD_BP_SPEC_REDUCE:
+ if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
default:
break;
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH final?] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2025-02-18 14:42 ` Patrick Bellasi
@ 2025-02-18 15:34 ` Borislav Petkov
0 siblings, 0 replies; 68+ messages in thread
From: Borislav Petkov @ 2025-02-18 15:34 UTC (permalink / raw)
To: Patrick Bellasi
Cc: Sean Christopherson, Yosry Ahmed, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86, kvm, linux-kernel, Patrick Bellasi,
Brendan Jackman, David Kaplan
On Tue, Feb 18, 2025 at 02:42:57PM +0000, Patrick Bellasi wrote:
> Maybe a small improvement we could add on top is to have a separate and
> dedicated cmdline option?
>
> Indeed, with `X86_FEATURE_SRSO_USER_KERNEL_NO` we are not effectively using an
> IBPB on VM-Exit anymore. Something like the diff down below?
Except that I don't see the point of yet one more cmdline option. Our
mitigation options space is a nightmare. Why do we want to add another one?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-02-18 11:13 ` [PATCH final?] " Borislav Petkov
2025-02-18 14:42 ` Patrick Bellasi
@ 2025-04-29 13:25 ` Borislav Petkov
2025-04-30 23:33 ` Sean Christopherson
1 sibling, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-04-29 13:25 UTC (permalink / raw)
To: Sean Christopherson
Cc: Yosry Ahmed, Patrick Bellasi, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86, kvm, linux-kernel, Patrick Bellasi,
Brendan Jackman, David Kaplan, Michael Larabel
On Tue, Feb 18, 2025 at 12:13:33PM +0100, Borislav Petkov wrote:
> So,
>
> in the interest of finally making some progress here I'd like to commit this
> below (will test it one more time just in case but it should work :-P). It is
> simple and straight-forward and doesn't need an IBPB when the bit gets
> cleared.
>
> > A potential future improvement is David's suggestion that there could be a way
> > of tracking when the first guest gets started: we set the bit then, we make
> > sure the bit gets set on each logical CPU when the guests migrate across the
> > machine, and when the *last* guest exits, that bit gets cleared again.
Well, that "simplicity" was short-lived:
https://www.phoronix.com/review/linux-615-amd-regression
Sean, how about this below?
It is hacky and RFC-ish - i.e., don't look too hard at it - but basically
I'm pushing down into arch code the decision whether to enable virt on load.
And it has no effect on anything but machines which have this
SRSO_MSR_FIX (Zen5).
And it seems to work here - the MSR is set only when I create a VM - i.e., as
expected.
Thoughts? Better ideas?
Thx.
---
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 823c0434bbad..6cc8698df1a5 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -16,6 +16,7 @@ BUILD_BUG_ON(1)
KVM_X86_OP(check_processor_compatibility)
KVM_X86_OP(enable_virtualization_cpu)
KVM_X86_OP(disable_virtualization_cpu)
+KVM_X86_OP_OPTIONAL(enable_virt_on_load)
KVM_X86_OP(hardware_unsetup)
KVM_X86_OP(has_emulated_msr)
KVM_X86_OP(vcpu_after_set_cpuid)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3131abcac4f1..c1a29d7fee45 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1664,6 +1664,7 @@ struct kvm_x86_ops {
int (*enable_virtualization_cpu)(void);
void (*disable_virtualization_cpu)(void);
+ bool (*enable_virt_on_load)(void);
cpu_emergency_virt_cb *emergency_disable_virtualization_cpu;
void (*hardware_unsetup)(void);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 67657b3a36ce..dcbba55cb949 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -693,6 +693,16 @@ static int svm_enable_virtualization_cpu(void)
return 0;
}
+static bool svm_enable_virt_on_load(void)
+{
+ bool ret = true;
+
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ ret = false;
+
+ return ret;
+}
+
static void svm_cpu_uninit(int cpu)
{
struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu);
@@ -5082,6 +5092,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
.hardware_unsetup = svm_hardware_unsetup,
.enable_virtualization_cpu = svm_enable_virtualization_cpu,
.disable_virtualization_cpu = svm_disable_virtualization_cpu,
+ .enable_virt_on_load = svm_enable_virt_on_load,
.emergency_disable_virtualization_cpu = svm_emergency_disable_virtualization_cpu,
.has_emulated_msr = svm_has_emulated_msr,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4c6553985e75..a09dc8cbd59f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12576,9 +12576,15 @@ void kvm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
}
EXPORT_SYMBOL_GPL(kvm_vcpu_deliver_sipi_vector);
-void kvm_arch_enable_virtualization(void)
+bool kvm_arch_enable_virtualization(bool allow_arch_override)
{
+ if (allow_arch_override)
+ if (!kvm_x86_call(enable_virt_on_load)())
+ return false;
+
cpu_emergency_register_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
+
+ return true;
}
void kvm_arch_disable_virtualization(void)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 291d49b9bf05..4353ef54d45d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1599,7 +1599,7 @@ static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
* kvm_usage_count, i.e. at the beginning of the generic hardware enabling
* sequence, and at the end of the generic hardware disabling sequence.
*/
-void kvm_arch_enable_virtualization(void);
+bool kvm_arch_enable_virtualization(bool);
void kvm_arch_disable_virtualization(void);
/*
* kvm_arch_{enable,disable}_virtualization_cpu() are called on "every" CPU to
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e85b33a92624..0009661dee1d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -143,8 +143,8 @@ static int kvm_no_compat_open(struct inode *inode, struct file *file)
#define KVM_COMPAT(c) .compat_ioctl = kvm_no_compat_ioctl, \
.open = kvm_no_compat_open
#endif
-static int kvm_enable_virtualization(void);
-static void kvm_disable_virtualization(void);
+static int kvm_enable_virtualization(bool allow_arch_override);
+static void kvm_disable_virtualization(bool allow_arch_override);
static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
@@ -1187,7 +1187,7 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
if (r)
goto out_err_no_arch_destroy_vm;
- r = kvm_enable_virtualization();
+ r = kvm_enable_virtualization(false);
if (r)
goto out_err_no_disable;
@@ -1224,7 +1224,7 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
mmu_notifier_unregister(&kvm->mmu_notifier, current->mm);
#endif
out_err_no_mmu_notifier:
- kvm_disable_virtualization();
+ kvm_disable_virtualization(false);
out_err_no_disable:
kvm_arch_destroy_vm(kvm);
out_err_no_arch_destroy_vm:
@@ -1320,7 +1320,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
#endif
kvm_arch_free_vm(kvm);
preempt_notifier_dec();
- kvm_disable_virtualization();
+ kvm_disable_virtualization(false);
mmdrop(mm);
}
@@ -5489,9 +5489,9 @@ static DEFINE_PER_CPU(bool, virtualization_enabled);
static DEFINE_MUTEX(kvm_usage_lock);
static int kvm_usage_count;
-__weak void kvm_arch_enable_virtualization(void)
+__weak bool kvm_arch_enable_virtualization(bool allow_arch_override)
{
-
+ return false;
}
__weak void kvm_arch_disable_virtualization(void)
@@ -5589,8 +5589,9 @@ static struct syscore_ops kvm_syscore_ops = {
.shutdown = kvm_shutdown,
};
-static int kvm_enable_virtualization(void)
+static int kvm_enable_virtualization(bool allow_arch_override)
{
+ bool do_init;
int r;
guard(mutex)(&kvm_usage_lock);
@@ -5598,7 +5599,9 @@ static int kvm_enable_virtualization(void)
if (kvm_usage_count++)
return 0;
- kvm_arch_enable_virtualization();
+ do_init = kvm_arch_enable_virtualization(allow_arch_override);
+ if (!do_init)
+ goto out;
r = cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
kvm_online_cpu, kvm_offline_cpu);
@@ -5631,11 +5634,13 @@ static int kvm_enable_virtualization(void)
cpuhp_remove_state(CPUHP_AP_KVM_ONLINE);
err_cpuhp:
kvm_arch_disable_virtualization();
+
+out:
--kvm_usage_count;
return r;
}
-static void kvm_disable_virtualization(void)
+static void kvm_disable_virtualization(bool allow_arch_override)
{
guard(mutex)(&kvm_usage_lock);
@@ -5650,7 +5655,7 @@ static void kvm_disable_virtualization(void)
static int kvm_init_virtualization(void)
{
if (enable_virt_at_load)
- return kvm_enable_virtualization();
+ return kvm_enable_virtualization(true);
return 0;
}
@@ -5658,10 +5663,10 @@ static int kvm_init_virtualization(void)
static void kvm_uninit_virtualization(void)
{
if (enable_virt_at_load)
- kvm_disable_virtualization();
+ kvm_disable_virtualization(true);
}
#else /* CONFIG_KVM_GENERIC_HARDWARE_ENABLING */
-static int kvm_enable_virtualization(void)
+static int kvm_enable_virtualization(bool allow_arch_override)
{
return 0;
}
@@ -5671,7 +5676,7 @@ static int kvm_init_virtualization(void)
return 0;
}
-static void kvm_disable_virtualization(void)
+static void kvm_disable_virtualization(bool allow_arch_override)
{
}
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-04-29 13:25 ` x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar Borislav Petkov
@ 2025-04-30 23:33 ` Sean Christopherson
2025-05-01 0:42 ` Michael Larabel
2025-05-01 8:19 ` Borislav Petkov
0 siblings, 2 replies; 68+ messages in thread
From: Sean Christopherson @ 2025-04-30 23:33 UTC (permalink / raw)
To: Borislav Petkov
Cc: Yosry Ahmed, Patrick Bellasi, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86, kvm, linux-kernel, Patrick Bellasi,
Brendan Jackman, David Kaplan, Michael Larabel
On Tue, Apr 29, 2025, Borislav Petkov wrote:
> On Tue, Feb 18, 2025 at 12:13:33PM +0100, Borislav Petkov wrote:
> > So,
> >
> > in the interest of finally making some progress here I'd like to commit this
> > below (will test it one more time just in case but it should work :-P). It is
> > simple and straight-forward and doesn't need an IBPB when the bit gets
> > cleared.
> >
> > A potential future improvement is David's suggestion that there could be a way
> > of tracking when the first guest gets started: we set the bit then, we make
> > sure the bit gets set on each logical CPU when the guests migrate across the
> > machine, and when the *last* guest exits, that bit gets cleared again.
>
> Well, that "simplicity" was short-lived:
>
> https://www.phoronix.com/review/linux-615-amd-regression
LOL.
> Sean, how about this below?
Eww. That's quite painful, and completely disallowing enable_virt_on_load is
undesirable, e.g. for use cases where the host is (almost) exclusively running
VMs.
Best idea I have is to throw in the towel on getting fancy, and just maintain a
dedicated count in SVM.
Alternatively, we could plumb an arch hook into kvm_create_vm() and kvm_destroy_vm()
that's called when KVM adds/deletes a VM from vm_list, and key off vm_list being
empty. But that adds a lot of boilerplate just to avoid a mutex+count.
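For illustration only, roughly what that would look like; the
kvm_arch_vm_{created,destroyed}() hooks are made up (they don't exist today)
and would be called by generic KVM under kvm_lock, and
svm_toggle_srso_spec_reduce() is the helper from the patch below:

	void kvm_arch_vm_created(struct kvm *kvm)
	{
		if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
			return;

		/* @kvm is already on vm_list, so a singular list means "first VM". */
		if (list_is_singular(&vm_list))
			on_each_cpu(svm_toggle_srso_spec_reduce, (void *)1, 1);
	}

	void kvm_arch_vm_destroyed(struct kvm *kvm)
	{
		if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
			return;

		/* @kvm is already gone from vm_list, so empty means "last VM". */
		if (list_empty(&vm_list))
			on_each_cpu(svm_toggle_srso_spec_reduce, NULL, 1);
	}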
I haven't tested on a system with X86_FEATURE_SRSO_BP_SPEC_REDUCE, but did verify
the mechanics by inverting the flag.
--
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 30 Apr 2025 15:34:50 -0700
Subject: [PATCH] KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count
transitions
Set the magic BP_SPEC_REDUCE bit to mitigate SRSO when running VMs if and
only if KVM has at least one active VM. Leaving the bit set at all times
unfortunately degrades performance by a wee bit more than expected.
Use a dedicated mutex and counter instead of hooking virtualization
enablement, as changing the behavior of kvm.enable_virt_at_load based on
SRSO_BP_SPEC_REDUCE is painful, and has its own drawbacks, e.g. could
result in performance issues for flows that are sensitive to VM creation
latency.
Fixes: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
Reported-by: Michael Larabel <Michael@michaellarabel.com>
Closes: https://www.phoronix.com/review/linux-615-amd-regression
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/svm.c | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d5d0c5c3300b..fe8866572218 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -607,9 +607,6 @@ static void svm_disable_virtualization_cpu(void)
kvm_cpu_svm_disable();
amd_pmu_disable_virt();
-
- if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
- msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}
static int svm_enable_virtualization_cpu(void)
@@ -687,9 +684,6 @@ static int svm_enable_virtualization_cpu(void)
rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
}
- if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
- msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
-
return 0;
}
@@ -5032,10 +5026,42 @@ static void svm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
sev_vcpu_deliver_sipi_vector(vcpu, vector);
}
+static DEFINE_MUTEX(srso_lock);
+static int srso_nr_vms;
+
+static void svm_toggle_srso_spec_reduce(void *set)
+{
+ if (set)
+ msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+ else
+ msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+}
+
+static void svm_srso_add_remove_vm(int count)
+{
+ bool set;
+
+ if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ return;
+
+ guard(mutex)(&srso_lock);
+
+ set = !srso_nr_vms;
+ srso_nr_vms += count;
+
+ WARN_ON_ONCE(srso_nr_vms < 0);
+ if (!set && srso_nr_vms)
+ return;
+
+ on_each_cpu(svm_toggle_srso_spec_reduce, (void *)set, 1);
+}
+
static void svm_vm_destroy(struct kvm *kvm)
{
avic_vm_destroy(kvm);
sev_vm_destroy(kvm);
+
+ svm_srso_add_remove_vm(-1);
}
static int svm_vm_init(struct kvm *kvm)
@@ -5061,6 +5087,7 @@ static int svm_vm_init(struct kvm *kvm)
return ret;
}
+ svm_srso_add_remove_vm(1);
return 0;
}
base-commit: f158e1b145f73aae1d3b7e756eb129a15b2b7a90
--
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-04-30 23:33 ` Sean Christopherson
@ 2025-05-01 0:42 ` Michael Larabel
2025-05-01 8:19 ` Borislav Petkov
1 sibling, 0 replies; 68+ messages in thread
From: Michael Larabel @ 2025-05-01 0:42 UTC (permalink / raw)
To: Sean Christopherson, Borislav Petkov
Cc: Yosry Ahmed, Patrick Bellasi, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86, kvm, linux-kernel, Patrick Bellasi,
Brendan Jackman, David Kaplan
On 4/30/25 6:33 PM, Sean Christopherson wrote:
> On Tue, Apr 29, 2025, Borislav Petkov wrote:
>> On Tue, Feb 18, 2025 at 12:13:33PM +0100, Borislav Petkov wrote:
>>> So,
>>>
>>> in the interest of finally making some progress here I'd like to commit this
>>> below (will test it one more time just in case but it should work :-P). It is
>>> simple and straight-forward and doesn't need an IBPB when the bit gets
>>> cleared.
>>>
>>> A potential future improvement is David's suggestion that there could be a way
>>> of tracking when the first guest gets started: we set the bit then, we make
>>> sure the bit gets set on each logical CPU when the guests migrate across the
>>> machine, and when the *last* guest exits, that bit gets cleared again.
>> Well, that "simplicity" was short-lived:
>>
>> https://www.phoronix.com/review/linux-615-amd-regression
> LOL.
>
>> Sean, how about this below?
> Eww. That's quite painful, and completely disallowing enable_virt_on_load is
> undesirable, e.g. for use cases where the host is (almost) exclusively running
> VMs.
>
> Best idea I have is to throw in the towel on getting fancy, and just maintain a
> dedicated count in SVM.
>
> Alternatively, we could plumb an arch hook into kvm_create_vm() and kvm_destroy_vm()
> that's called when KVM adds/deletes a VM from vm_list, and key off vm_list being
> empty. But that adds a lot of boilerplate just to avoid a mutex+count.
>
> I haven't tested on a system with X86_FEATURE_SRSO_BP_SPEC_REDUCE, but did verify
> the mechanics by inverting the flag.
Testing this patch on the same EPYC Turin server as my original tests, I
can confirm that on a clean boot without any VMs running, the
performance is back to where it was on v6.14. :)
Thanks,
Michael
>
> --
> From: Sean Christopherson <seanjc@google.com>
> Date: Wed, 30 Apr 2025 15:34:50 -0700
> Subject: [PATCH] KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count
> transitions
>
> Set the magic BP_SPEC_REDUCE bit to mitigate SRSO when running VMs if and
> only if KVM has at least one active VM. Leaving the bit set at all times
> unfortunately degrades performance by a wee bit more than expected.
>
> Use a dedicated mutex and counter instead of hooking virtualization
> enablement, as changing the behavior of kvm.enable_virt_at_load based on
> SRSO_BP_SPEC_REDUCE is painful, and has its own drawbacks, e.g. could
> result in performance issues for flows that are sensitive to VM creation
> latency.
>
> Fixes: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
> Reported-by: Michael Larabel <Michael@michaellarabel.com>
> Closes: https://www.phoronix.com/review/linux-615-amd-regression
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/svm.c | 39 +++++++++++++++++++++++++++++++++------
> 1 file changed, 33 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index d5d0c5c3300b..fe8866572218 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -607,9 +607,6 @@ static void svm_disable_virtualization_cpu(void)
> kvm_cpu_svm_disable();
>
> amd_pmu_disable_virt();
> -
> - if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> - msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> }
>
> static int svm_enable_virtualization_cpu(void)
> @@ -687,9 +684,6 @@ static int svm_enable_virtualization_cpu(void)
> rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
> }
>
> - if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> - msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> -
> return 0;
> }
>
> @@ -5032,10 +5026,42 @@ static void svm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
> sev_vcpu_deliver_sipi_vector(vcpu, vector);
> }
>
> +static DEFINE_MUTEX(srso_lock);
> +static int srso_nr_vms;
> +
> +static void svm_toggle_srso_spec_reduce(void *set)
> +{
> + if (set)
> + msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> + else
> + msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> +}
> +
> +static void svm_srso_add_remove_vm(int count)
> +{
> + bool set;
> +
> + if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> + return;
> +
> + guard(mutex)(&srso_lock);
> +
> + set = !srso_nr_vms;
> + srso_nr_vms += count;
> +
> + WARN_ON_ONCE(srso_nr_vms < 0);
> + if (!set && srso_nr_vms)
> + return;
> +
> + on_each_cpu(svm_toggle_srso_spec_reduce, (void *)set, 1);
> +}
> +
> static void svm_vm_destroy(struct kvm *kvm)
> {
> avic_vm_destroy(kvm);
> sev_vm_destroy(kvm);
> +
> + svm_srso_add_remove_vm(-1);
> }
>
> static int svm_vm_init(struct kvm *kvm)
> @@ -5061,6 +5087,7 @@ static int svm_vm_init(struct kvm *kvm)
> return ret;
> }
>
> + svm_srso_add_remove_vm(1);
> return 0;
> }
>
>
> base-commit: f158e1b145f73aae1d3b7e756eb129a15b2b7a90
> --
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-04-30 23:33 ` Sean Christopherson
2025-05-01 0:42 ` Michael Larabel
@ 2025-05-01 8:19 ` Borislav Petkov
2025-05-01 16:56 ` Sean Christopherson
1 sibling, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-05-01 8:19 UTC (permalink / raw)
To: Sean Christopherson
Cc: Yosry Ahmed, Patrick Bellasi, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86, kvm, linux-kernel, Patrick Bellasi,
Brendan Jackman, David Kaplan, Michael Larabel
On Wed, Apr 30, 2025 at 04:33:19PM -0700, Sean Christopherson wrote:
> Eww. That's quite painful, and completely disallowing enable_virt_on_load is
> undesirable, e.g. for use cases where the host is (almost) exclusively running
> VMs.
I wanted to stay generic... :-)
> Best idea I have is to throw in the towel on getting fancy, and just maintain a
> dedicated count in SVM.
>
> Alternatively, we could plumb an arch hook into kvm_create_vm() and kvm_destroy_vm()
> that's called when KVM adds/deletes a VM from vm_list, and key off vm_list being
> empty. But that adds a lot of boilerplate just to avoid a mutex+count.
FWIW, that was Tom's idea.
> +static void svm_srso_add_remove_vm(int count)
> +{
> + bool set;
> +
> + if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> + return;
> +
> + guard(mutex)(&srso_lock);
> +
> + set = !srso_nr_vms;
> + srso_nr_vms += count;
> +
> + WARN_ON_ONCE(srso_nr_vms < 0);
> + if (!set && srso_nr_vms)
> + return;
So instead of doing this "by-foot", I would've used any of those
atomic_inc_return() and atomic_dec_and_test() and act upon the value when it
becomes 0 or !0 instead of passing 1 and -1. Because the count is kinda
implicit...
But yeah, not a biggie - that solves the issue too.
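For illustration, the shape being described might look like the sketch
below -- a hypothetical only, reusing srso_nr_vms and
svm_toggle_srso_spec_reduce() from the patch above (and, as the follow-up
below shows, the naive 0=>1 side of this form is racy):

static atomic_t srso_nr_vms = ATOMIC_INIT(0);

static void svm_srso_vm_init(void)
{
	/* 0=>1 transition: first VM, set BP_SPEC_REDUCE on all CPUs. */
	if (atomic_inc_return(&srso_nr_vms) == 1)
		on_each_cpu(svm_toggle_srso_spec_reduce, (void *)1UL, 1);
}

static void svm_srso_vm_destroy(void)
{
	/* 1=>0 transition: last VM gone, clear the bit everywhere. */
	if (atomic_dec_and_test(&srso_nr_vms))
		on_each_cpu(svm_toggle_srso_spec_reduce, NULL, 1);
}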
Thanks!
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-05-01 8:19 ` Borislav Petkov
@ 2025-05-01 16:56 ` Sean Christopherson
2025-05-05 15:25 ` Borislav Petkov
0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2025-05-01 16:56 UTC (permalink / raw)
To: Borislav Petkov
Cc: Yosry Ahmed, Patrick Bellasi, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86, kvm, linux-kernel, Patrick Bellasi,
Brendan Jackman, David Kaplan, Michael Larabel
On Thu, May 01, 2025, Borislav Petkov wrote:
> On Wed, Apr 30, 2025 at 04:33:19PM -0700, Sean Christopherson wrote:
> > +static void svm_srso_add_remove_vm(int count)
> > +{
> > + bool set;
> > +
> > + if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> > + return;
> > +
> > + guard(mutex)(&srso_lock);
> > +
> > + set = !srso_nr_vms;
> > + srso_nr_vms += count;
> > +
> > + WARN_ON_ONCE(srso_nr_vms < 0);
> > + if (!set && srso_nr_vms)
> > + return;
>
> So instead of doing this "by-foot", I would've used any of those
> atomic_inc_return() and atomic_dec_and_test() and act upon the value when it
> becomes 0 or !0 instead of passing 1 and -1. Because the count is kinda
> implicit...
Heh, I considered that, and even tried it this morning because I thought it wouldn't
be as tricky as I first thought, but turns out, yeah, it's tricky. The complication
is that KVM needs to ensure BP_SPEC_REDUCE=1 on all CPUs before any VM is created.
I thought it wouldn't be _that_ tricky once I realized the 1=>0 case doesn't require
ordering, e.g. running host code while other CPUs have BP_SPEC_REDUCE=1 is totally
fine, KVM just needs to ensure no guest code is executed with BP_SPEC_REDUCE=0.
But guarding against all the possible edge cases is comically difficult.
For giggles, I did get it working, but it's a rather absurd amount of complexity
for very little gain. In practice, when srso_nr_vms >= 1, CPUs will hold the lock
for only a handful of cycles. The latency of VM creation is somewhat important,
but it's certainly not that latency sensitive, e.g. KVM already serializes VM
creation on kvm_lock (which is a mutex) in order to add VMs to vm_list.
The only truly slow path is the 0<=>1 transitions, and those *must* be slow to
get the ordering correct.
One thing I can/will do is change the mutex into a spinlock, so that on non-RT
systems there's zero chance of VM creation getting bogged down because a task
happens to get preempted at just the wrong time. I was thinking it's illegal to
use on_each_cpu() in a spinlock, but that's fine; it's using it with IRQs disabled
that's problematic.
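A hedged sketch of that swap -- same counting logic as the earlier patch,
just with the lock flavor that keeps IRQs enabled (illustrative, not the
posted code):

static DEFINE_SPINLOCK(srso_lock);
static int srso_nr_vms;

static void svm_srso_add_remove_vm(int count)
{
	bool set;

	if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
		return;

	/*
	 * Plain spin_lock() disables preemption but leaves IRQs enabled,
	 * so the IPIs issued by on_each_cpu() can still be delivered and
	 * waited on.  spin_lock_irqsave() would be the problematic
	 * variant: waiting on remote CPUs with IRQs disabled can deadlock.
	 */
	spin_lock(&srso_lock);

	set = !srso_nr_vms;
	srso_nr_vms += count;
	if (set || !srso_nr_vms)
		on_each_cpu(svm_toggle_srso_spec_reduce,
			    (void *)(unsigned long)set, 1);

	spin_unlock(&srso_lock);
}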
I'll post a proper patch with the spinlock and CONFIG_CPU_MITIGATIONS #ifdef,
assuming testing comes back clean.
As for all the problematic scenarios... If KVM increments the counter before
sending IPIs, then racing VM creation can result in the second VM skipping the
mutex and doing KVM_RUN with BP_SPEC_REDUCE=0.
VMs MSR CPU-0 CPU-1
-----------------------------
0 0 CREATE
0 0 lock()
1 0 inc()
1 0 CREATE
2 0 inc()
2 0 KVM_RUN :-(
2 0 IPI
2 1 WRMSR WRMSR
But simply incrementing the count after sending IPIs obviously doesn't work.
VMs MSR CPU-0 CPU-1
-----------------------------
0 0 CREATE
0 0 lock()
0 0 IPI
0 0 WRMSR WRMSR
1 0 inc()
1 0 KVM_RUN :-(
And passing in set/clear (or using separate IPI handlers) doesn't handle the case
where the IPI from destroy arrives after the IPI from create.
VMs MSR CPU-0 CPU-1
-----------------------------
1 1 DESTROY
0 1 CREATE dec()
0 1 lock()
0 1 IPI(1)
0 1 WRMSR WRMSR
0 1 IPI(0)
0 0 WRMSR WRMSR
1 0 inc()
1 0 KVM_RUN :-(
Addressing that by adding a global flag to track that SRSO needs to be set
almost works, but there's still a race with a destroy IPI if the callback only
checks the "set" flag.
VMs MSR CPU-0 CPU-1
-----------------------------
1 1 DESTROY
0 1 CREATE dec()
0 1 lock()
0 1 set=1
0 1 IPI
0 1 WRMSR WRMSR
1 0 inc()
1 1 set=0
1 1 IPI
1 0 WRMSR WRMSR
1 0 KVM_RUN :-(
To address all of those, I ended up with the below. It's not actually that much
code, but the amount of documentation needed to explain everything is ugly.
#ifdef CONFIG_CPU_MITIGATIONS
static DEFINE_MUTEX(srso_add_vm_lock);
static atomic_t srso_nr_vms;
static bool srso_set;
static void svm_toggle_srso_spec_reduce(void *ign)
{
/*
* Read both srso_set and the count (and don't pass in set/clear!) so
* that BP_SPEC_REDUCE isn't incorrectly cleared if IPIs from destroying
* the last VM arrive after IPIs from creating the first VM (in the new
* "generation").
*/
if (READ_ONCE(srso_set) || atomic_read(&srso_nr_vms))
msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
else
msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}
static void svm_srso_vm_init(void)
{
if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
return;
/*
* Acquire the mutex on a 0=>1 transition to ensure BP_SPEC_REDUCE is
* set before any VM is fully created.
*/
if (atomic_inc_not_zero(&srso_nr_vms))
return;
guard(mutex)(&srso_add_vm_lock);
/*
* Re-check the count before sending IPIs, only the first task needs to
* toggle BP_SPEC_REDUCE, other tasks just need to wait. For the 0=>1
* case, update the count *after* BP_SPEC_REDUCE is set on all CPUs to
* ensure creating multiple VMs concurrently doesn't result in a task
* skipping the mutex before BP_SPEC_REDUCE is set.
*
* Atomically increment the count in all cases as the mutex only guards
* 0=>1 transitions, e.g. another task can decrement the count if a VM
* was created (0=>1) *and* destroyed (1=>0) between observing a count
* of '0' and acquiring the mutex, and another task can increment the
* count if the count is already >= 1.
*/
if (!atomic_inc_not_zero(&srso_nr_vms)) {
WRITE_ONCE(srso_set, true);
on_each_cpu(svm_toggle_srso_spec_reduce, NULL, 1);
atomic_inc(&srso_nr_vms);
smp_mb__after_atomic();
WRITE_ONCE(srso_set, false);
}
}
static void svm_srso_vm_destroy(void)
{
if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
return;
/*
* If the last VM is being destroyed, clear BP_SPEC_REDUCE on all CPUs.
* Unlike the creation case, there is no need to wait on other CPUs as
* running code with BP_SPEC_REDUCE=1 is always safe, KVM just needs to
* ensure guest code never runs with BP_SPEC_REDUCE=0.
*/
if (atomic_dec_and_test(&srso_nr_vms))
on_each_cpu(svm_toggle_srso_spec_reduce, NULL, 0);
}
#else
static void svm_srso_vm_init(void) { }
static void svm_srso_vm_destroy(void) { }
#endif /* CONFIG_CPU_MITIGATIONS */
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-05-01 16:56 ` Sean Christopherson
@ 2025-05-05 15:25 ` Borislav Petkov
2025-05-05 15:40 ` Kaplan, David
0 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2025-05-05 15:25 UTC (permalink / raw)
To: Sean Christopherson
Cc: Yosry Ahmed, Patrick Bellasi, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86, kvm, linux-kernel, Patrick Bellasi,
Brendan Jackman, David Kaplan, Michael Larabel
On Thu, May 01, 2025 at 09:56:44AM -0700, Sean Christopherson wrote:
> Heh, I considered that, and even tried it this morning because I thought it wouldn't
> be as tricky as I first thought, but turns out, yeah, it's tricky. The complication
> is that KVM needs to ensure BP_SPEC_REDUCE=1 on all CPUs before any VM is created.
>
> I thought it wouldn't be _that_ tricky once I realized the 1=>0 case doesn't require
> ordering, e.g. running host code while other CPUs have BP_SPEC_REDUCE=1 is totally
> fine, KVM just needs to ensure no guest code is executed with BP_SPEC_REDUCE=0.
> But guarding against all the possible edge cases is comically difficult.
>
> For giggles, I did get it working, but it's a rather absurd amount of complexity
Thanks for taking the time to explain - that's, well, funky. :-\
Btw, in talking about this, David had this other idea which sounds
interesting:
How about we do a per-CPU var which tracks whether BP_SPEC_REDUCE is
enabled on the CPU?
It'll toggle the MSR bit before VMRUN on the CPU when num VMs goes 0=>1. This
way you avoid the IPIs and you set the bit on time.
You'd still need to do an IPI on VMEXIT when VM count does 1=>0 but that's
easy.
Dunno, there probably already is a per-CPU setting in KVM so you could add
that to it...
Anyway, something along those lines...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-05-05 15:25 ` Borislav Petkov
@ 2025-05-05 15:40 ` Kaplan, David
2025-05-05 15:47 ` Borislav Petkov
2025-05-05 16:30 ` Sean Christopherson
0 siblings, 2 replies; 68+ messages in thread
From: Kaplan, David @ 2025-05-05 15:40 UTC (permalink / raw)
To: Borislav Petkov, Sean Christopherson
Cc: Yosry Ahmed, Patrick Bellasi, Paolo Bonzini, Josh Poimboeuf,
Pawan Gupta, x86@kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, Patrick Bellasi, Brendan Jackman,
Michael Larabel
> -----Original Message-----
> From: Borislav Petkov <bp@alien8.de>
> Sent: Monday, May 5, 2025 10:26 AM
> To: Sean Christopherson <seanjc@google.com>
> Cc: Yosry Ahmed <yosry.ahmed@linux.dev>; Patrick Bellasi
> <derkling@google.com>; Paolo Bonzini <pbonzini@redhat.com>; Josh Poimboeuf
> <jpoimboe@redhat.com>; Pawan Gupta <pawan.kumar.gupta@linux.intel.com>;
> x86@kernel.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; Patrick Bellasi
> <derkling@matbug.net>; Brendan Jackman <jackmanb@google.com>; Kaplan,
> David <David.Kaplan@amd.com>; Michael Larabel <Michael@michaellarabel.com>
> Subject: Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
>
> On Thu, May 01, 2025 at 09:56:44AM -0700, Sean Christopherson wrote:
> > Heh, I considered that, and even tried it this morning because I
> > thought it wouldn't be as tricky as I first thought, but turns out,
> > yeah, it's tricky. The complication is that KVM needs to ensure
> BP_SPEC_REDUCE=1 on all CPUs before any VM is created.
> >
> > I thought it wouldn't be _that_ tricky once I realized the 1=>0 case
> > doesn't require ordering, e.g. running host code while other CPUs have
> > BP_SPEC_REDUCE=1 is totally fine, KVM just needs to ensure no guest code is
> executed with BP_SPEC_REDUCE=0.
> > But guarding against all the possible edge cases is comically difficult.
> >
> > For giggles, I did get it working, but it's a rather absurd amount of
> > complexity
>
> Thanks for taking the time to explain - that's, well, funky. :-\
>
> Btw, in talking about this, David had this other idea which sounds
> interesting:
>
> How about we do a per-CPU var which holds down whether BP_SPEC_REDUCE is
> enabled on the CPU?
>
> It'll toggle the MSR bit before VMRUN on the CPU when num VMs goes 0=>1. This
> way you avoid the IPIs and you set the bit on time.
Almost. My thought was that kvm_run could do something like:
if (!this_cpu_read(bp_spec_reduce_is_set)) {
	msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
	this_cpu_write(bp_spec_reduce_is_set, true);
}
That ensures the bit is set for your core before VMRUN. And as noted below, you can clear the bit when the count drops to 0, but that path is safe from race conditions.
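A minimal sketch of the clear side of that idea, assuming a hypothetical
per-CPU bp_spec_reduce_is_set flag and a global VM count (names are
illustrative, not from the posted patches):

static DEFINE_PER_CPU(bool, bp_spec_reduce_is_set);
static atomic_t srso_nr_vms;

/* IPI callback: clear the MSR bit and the per-CPU cache together. */
static void svm_srso_clear_spec_reduce(void *ign)
{
	msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
	this_cpu_write(bp_spec_reduce_is_set, false);
}

static void svm_srso_vm_destroy(void)
{
	/* Only the 1=>0 transition needs to touch other CPUs. */
	if (atomic_dec_and_test(&srso_nr_vms))
		on_each_cpu(svm_srso_clear_spec_reduce, NULL, 1);
}

Because the IPI handler clears the cached flag too, a CPU that re-enters
the guest afterwards fails the this_cpu_read() check above and sets the
bit again before VMRUN.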
>
> You'd still need to do an IPI on VMEXIT when VM count does 1=>0 but that's easy.
>
> Dunno, there probably already is a per-CPU setting in KVM so you could add that to
> it...
>
> Anyway, something along those lines...
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-05-05 15:40 ` Kaplan, David
@ 2025-05-05 15:47 ` Borislav Petkov
2025-05-05 16:30 ` Sean Christopherson
1 sibling, 0 replies; 68+ messages in thread
From: Borislav Petkov @ 2025-05-05 15:47 UTC (permalink / raw)
To: Kaplan, David
Cc: Sean Christopherson, Yosry Ahmed, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86@kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, Patrick Bellasi, Brendan Jackman,
Michael Larabel
On Mon, May 05, 2025 at 03:40:20PM +0000, Kaplan, David wrote:
> Almost. My thought was that kvm_run could do something like:
>
> If (!this_cpu_read(bp_spec_reduce_is_set)) {
> wrmsrl to set BP_SEC_REDUCE
> this_cpu_write(bp_spec_reduce_is_set, 1)
> }
That's what I meant too. :-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-05-05 15:40 ` Kaplan, David
2025-05-05 15:47 ` Borislav Petkov
@ 2025-05-05 16:30 ` Sean Christopherson
2025-05-05 16:42 ` Kaplan, David
1 sibling, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2025-05-05 16:30 UTC (permalink / raw)
To: David Kaplan
Cc: Borislav Petkov, Yosry Ahmed, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86@kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, Patrick Bellasi, Brendan Jackman,
Michael Larabel
On Mon, May 05, 2025, David Kaplan wrote:
> > On Thu, May 01, 2025 at 09:56:44AM -0700, Sean Christopherson wrote:
> > > Heh, I considered that, and even tried it this morning because I
> > > thought it wouldn't be as tricky as I first thought, but turns out,
> > > yeah, it's tricky. The complication is that KVM needs to ensure
> > BP_SPEC_REDUCE=1 on all CPUs before any VM is created.
> > >
> > > I thought it wouldn't be _that_ tricky once I realized the 1=>0 case
> > > doesn't require ordering, e.g. running host code while other CPUs have
> > > BP_SPEC_REDUCE=1 is totally fine, KVM just needs to ensure no guest code is
> > executed with BP_SPEC_REDUCE=0.
> > > But guarding against all the possible edge cases is comically difficult.
> > >
> > > For giggles, I did get it working, but it's a rather absurd amount of
> > > complexity
> >
> > Thanks for taking the time to explain - that's, well, funky. :-\
> >
> > Btw, in talking about this, David had this other idea which sounds
> > interesting:
> >
> > How about we do a per-CPU var which holds down whether BP_SPEC_REDUCE is
> > enabled on the CPU?
> >
> > It'll toggle the MSR bit before VMRUN on the CPU when num VMs goes 0=>1. This
> > way you avoid the IPIs and you set the bit on time.
>
> Almost. My thought was that kvm_run could do something like:
>
> If (!this_cpu_read(bp_spec_reduce_is_set)) {
> wrmsrl to set BP_SEC_REDUCE
> this_cpu_write(bp_spec_reduce_is_set, 1)
> }
>
> That ensures the bit is set for your core before VMRUN. And as noted below,
> you can clear the bit when the count drops to 0 but that one is safe from
> race conditions.
/facepalm
I keep inverting the scenario in my head. I'm so used to KVM needing to ensure
it doesn't run with guest state that I keep forgetting that running with
BP_SPEC_REDUCE=1 is fine, just a bit slower.
With that in mind, the best blend of simplicity and performance is likely to hook
svm_prepare_switch_to_guest() and svm_prepare_host_switch(). switch_to_guest()
is called when KVM is about to do VMRUN, and host_switch() is called when the
vCPU is put, i.e. when the task is scheduled out or when KVM_RUN exits to
userspace.
The existing svm->guest_state_loaded guard avoids toggling the bit when KVM
handles a VM-Exit and re-enters the guest. The kernel may run a non-trivial
amount of code with BP_SPEC_REDUCE set, e.g. if #NPF triggers swap-in, an IRQ arrives
while handling the exit, etc., but that's all fine from a security perspective.
IIUC, per Boris[*] an IBPB is needed when toggling BP_SPEC_REDUCE on-demand:
: You want to IBPB before clearing the MSR as otherwise host kernel will be
: running with the mistrained gunk from the guest.
[*] https://lore.kernel.org/all/20250217160728.GFZ7NewJHpMaWdiX2M@fat_crate.local
Assuming that's the case...
Compile-tested only. If this looks/sounds sane, I'll test the mechanics and
write a changelog.
---
arch/x86/include/asm/msr-index.h | 2 +-
arch/x86/kvm/svm/svm.c | 26 +++++++++++++++++++-------
arch/x86/kvm/x86.h | 1 +
arch/x86/lib/msr.c | 2 --
4 files changed, 21 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index e6134ef2263d..0cc9267b872e 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -725,7 +725,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
-#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE BIT(4)
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index cc1c721ba067..2d87ec216811 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -607,9 +607,6 @@ static void svm_disable_virtualization_cpu(void)
kvm_cpu_svm_disable();
amd_pmu_disable_virt();
-
- if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
- msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}
static int svm_enable_virtualization_cpu(void)
@@ -687,9 +684,6 @@ static int svm_enable_virtualization_cpu(void)
rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
}
- if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
- msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
-
return 0;
}
@@ -1550,12 +1544,25 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
(!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ wrmsrl(MSR_ZEN4_BP_CFG, kvm_host.zen4_bp_cfg |
+ MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE);
+
svm->guest_state_loaded = true;
}
static void svm_prepare_host_switch(struct kvm_vcpu *vcpu)
{
- to_svm(vcpu)->guest_state_loaded = false;
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (!svm->guest_state_loaded)
+ return;
+
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ indirect_branch_prediction_barrier();
+ rdmsrl(MSR_ZEN4_BP_CFG, kvm_host.zen4_bp_cfg);
+ }
+ svm->guest_state_loaded = false;
}
static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
@@ -5364,6 +5371,11 @@ static __init int svm_hardware_setup(void)
init_msrpm_offsets();
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ rdmsrl(MSR_ZEN4_BP_CFG, kvm_host.zen4_bp_cfg);
+ WARN_ON(kvm_host.zen4_bp_cfg & MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE);
+ }
+
kvm_caps.supported_xcr0 &= ~(XFEATURE_MASK_BNDREGS |
XFEATURE_MASK_BNDCSR);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 88a9475899c8..629eae9e4f59 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -51,6 +51,7 @@ struct kvm_host_values {
u64 xcr0;
u64 xss;
u64 arch_capabilities;
+ u64 zen4_bp_cfg;
};
void kvm_spurious_fault(void);
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 5a18ecc04a6c..4bf4fad5b148 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -103,7 +103,6 @@ int msr_set_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, true);
}
-EXPORT_SYMBOL_GPL(msr_set_bit);
/**
* msr_clear_bit - Clear @bit in a MSR @msr.
@@ -119,7 +118,6 @@ int msr_clear_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, false);
}
-EXPORT_SYMBOL_GPL(msr_clear_bit);
#ifdef CONFIG_TRACEPOINTS
void do_trace_write_msr(unsigned int msr, u64 val, int failed)
base-commit: 45eb29140e68ffe8e93a5471006858a018480a45
--
^ permalink raw reply related [flat|nested] 68+ messages in thread* RE: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-05-05 16:30 ` Sean Christopherson
@ 2025-05-05 16:42 ` Kaplan, David
2025-05-05 18:03 ` Sean Christopherson
0 siblings, 1 reply; 68+ messages in thread
From: Kaplan, David @ 2025-05-05 16:42 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, Yosry Ahmed, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86@kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, Patrick Bellasi, Brendan Jackman,
Michael Larabel
> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Monday, May 5, 2025 11:30 AM
> To: Kaplan, David <David.Kaplan@amd.com>
> Cc: Borislav Petkov <bp@alien8.de>; Yosry Ahmed <yosry.ahmed@linux.dev>;
> Patrick Bellasi <derkling@google.com>; Paolo Bonzini <pbonzini@redhat.com>;
> Josh Poimboeuf <jpoimboe@redhat.com>; Pawan Gupta
> <pawan.kumar.gupta@linux.intel.com>; x86@kernel.org; kvm@vger.kernel.org;
> linux-kernel@vger.kernel.org; Patrick Bellasi <derkling@matbug.net>; Brendan
> Jackman <jackmanb@google.com>; Michael Larabel
> <Michael@michaellarabel.com>
> Subject: Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
>
> On Mon, May 05, 2025, David Kaplan wrote:
> > > On Thu, May 01, 2025 at 09:56:44AM -0700, Sean Christopherson wrote:
> > > > Heh, I considered that, and even tried it this morning because I
> > > > thought it wouldn't be as tricky as I first thought, but turns
> > > > out, yeah, it's tricky. The complication is that KVM needs to
> > > > ensure
> > > BP_SPEC_REDUCE=1 on all CPUs before any VM is created.
> > > >
> > > > I thought it wouldn't be _that_ tricky once I realized the 1=>0
> > > > case doesn't require ordering, e.g. running host code while other
> > > > CPUs have
> > > > BP_SPEC_REDUCE=1 is totally fine, KVM just needs to ensure no
> > > > guest code is
> > > executed with BP_SPEC_REDUCE=0.
> > > > But guarding against all the possible edge cases is comically difficult.
> > > >
> > > > For giggles, I did get it working, but it's a rather absurd amount
> > > > of complexity
> > >
> > > Thanks for taking the time to explain - that's, well, funky. :-\
> > >
> > > Btw, in talking about this, David had this other idea which sounds
> > > interesting:
> > >
> > > How about we do a per-CPU var which holds down whether
> > > BP_SPEC_REDUCE is enabled on the CPU?
> > >
> > > It'll toggle the MSR bit before VMRUN on the CPU when num VMs goes
> > > 0=>1. This way you avoid the IPIs and you set the bit on time.
> >
> > Almost. My thought was that kvm_run could do something like:
> >
> > If (!this_cpu_read(bp_spec_reduce_is_set)) {
> > wrmsrl to set BP_SEC_REDUCE
> > this_cpu_write(bp_spec_reduce_is_set, 1) }
> >
> > That ensures the bit is set for your core before VMRUN. And as noted
> > below, you can clear the bit when the count drops to 0 but that one is
> > safe from race conditions.
>
> /facepalm
>
> I keep inverting the scenario in my head. I'm so used to KVM needing to ensure it
> doesn't run with guest state that I keep forgetting that running with
> BP_SPEC_REDUCE=1 is fine, just a bit slower.
>
> With that in mind, the best blend of simplicity and performance is likely to hook
> svm_prepare_switch_to_guest() and svm_prepare_host_switch().
> switch_to_guest() is called when KVM is about to do VMRUN, and host_switch() is
> called when the vCPU is put, i.e. when the task is scheduled out or when KVM_RUN
> exits to userspace.
>
> The existing svm->guest_state_loaded guard avoids toggling the bit when KVM
> handles a VM-Exit and re-enters the guest. The kernel may run a non-trivial
> amount of code with BP_SPEC_REDUCE, e.g. if #NPF triggers swap-in, an IRQ
> arrives while handling the exit, etc., but that's all fine from a security perspective.
>
> IIUC, per Boris[*] an IBPB is needed when toggling BP_SPEC_REDUCE on-
> demand:
>
> : You want to IBPB before clearing the MSR as otherwise host kernel will be
> : running with the mistrained gunk from the guest.
>
> [*]
> https://lore.kernel.org/all/20250217160728.GFZ7NewJHpMaWdiX2M@fat_crate.loc
> al
>
> Assuming that's the case...
>
> Compile-tested only. If this looks/sounds sane, I'll test the mechanics and write a
> changelog.
I'm having trouble following the patch...where do you clear the MSR bit?
I thought a per-cpu "cache" of the MSR bit might be good to avoid having to issue slow RDMSRs, if these paths are 'hot'. I don't know if that's the case or not.
Also keep in mind this is a shared (per-core) MSR bit. You can't just clear it when you exit one guest because the sibling thread might be running another guest.
So I think you still want the global count of VMs. You clear the MSR bit when that count drops to 0. But you can set the bit on your core before you enter guest mode if needed.
There is still a race condition I suppose where the count drops to 0, and you go to send the IPI to clear the MSR bit, but in the meantime somebody else starts a new VM. However, that probably works naturally: in order to process the IPI you exit guest mode, and on the next VMRUN you'd check, see your MSR bit isn't set, and go set it.
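In the notation of Sean's tables above, and assuming the IPI handler also
clears the hypothetical per-CPU flag, that benign race resolves itself:

VMs  MSR   CPU-0                    CPU-1
------------------------------------------
 1    1                             DESTROY
 0    1                             dec()
 1    1    CREATE, inc()
 1    1                             IPI (clear)
 1    0    WRMSR, flag=0            WRMSR, flag=0
 1    1    flag==0 => WRMSR(set)
 1    1    KVM_RUN :-)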
--David Kaplan
>
> ---
> arch/x86/include/asm/msr-index.h | 2 +-
> arch/x86/kvm/svm/svm.c | 26 +++++++++++++++++++-------
> arch/x86/kvm/x86.h | 1 +
> arch/x86/lib/msr.c | 2 --
> 4 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index e6134ef2263d..0cc9267b872e 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -725,7 +725,7 @@
>
> /* Zen4 */
> #define MSR_ZEN4_BP_CFG 0xc001102e
> -#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
> +#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE BIT(4)
> #define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
>
> /* Fam 19h MSRs */
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index
> cc1c721ba067..2d87ec216811 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -607,9 +607,6 @@ static void svm_disable_virtualization_cpu(void)
> kvm_cpu_svm_disable();
>
> amd_pmu_disable_virt();
> -
> - if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> - msr_clear_bit(MSR_ZEN4_BP_CFG,
> MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> }
>
> static int svm_enable_virtualization_cpu(void)
> @@ -687,9 +684,6 @@ static int svm_enable_virtualization_cpu(void)
> rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
> }
>
> - if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> - msr_set_bit(MSR_ZEN4_BP_CFG,
> MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> -
> return 0;
> }
>
> @@ -1550,12 +1544,25 @@ static void svm_prepare_switch_to_guest(struct
> kvm_vcpu *vcpu)
> (!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu-
> >kvm)))
> kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
>
> + if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> + wrmsrl(MSR_ZEN4_BP_CFG, kvm_host.zen4_bp_cfg |
> + MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE);
> +
> svm->guest_state_loaded = true;
> }
>
> static void svm_prepare_host_switch(struct kvm_vcpu *vcpu) {
> - to_svm(vcpu)->guest_state_loaded = false;
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> + if (!svm->guest_state_loaded)
> + return;
> +
> + if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
> + indirect_branch_prediction_barrier();
> + rdmsrl(MSR_ZEN4_BP_CFG, kvm_host.zen4_bp_cfg);
> + }
> + svm->guest_state_loaded = false;
> }
>
> static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu) @@ -5364,6 +5371,11
> @@ static __init int svm_hardware_setup(void)
>
> init_msrpm_offsets();
>
> + if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
> + rdmsrl(MSR_ZEN4_BP_CFG, kvm_host.zen4_bp_cfg);
> + WARN_ON(kvm_host.zen4_bp_cfg &
> MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE);
> + }
> +
> kvm_caps.supported_xcr0 &= ~(XFEATURE_MASK_BNDREGS |
> XFEATURE_MASK_BNDCSR);
>
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index
> 88a9475899c8..629eae9e4f59 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -51,6 +51,7 @@ struct kvm_host_values {
> u64 xcr0;
> u64 xss;
> u64 arch_capabilities;
> + u64 zen4_bp_cfg;
> };
>
> void kvm_spurious_fault(void);
> diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c index
> 5a18ecc04a6c..4bf4fad5b148 100644
> --- a/arch/x86/lib/msr.c
> +++ b/arch/x86/lib/msr.c
> @@ -103,7 +103,6 @@ int msr_set_bit(u32 msr, u8 bit) {
> return __flip_bit(msr, bit, true); } -EXPORT_SYMBOL_GPL(msr_set_bit);
>
> /**
> * msr_clear_bit - Clear @bit in a MSR @msr.
> @@ -119,7 +118,6 @@ int msr_clear_bit(u32 msr, u8 bit) {
> return __flip_bit(msr, bit, false); } -EXPORT_SYMBOL_GPL(msr_clear_bit);
>
> #ifdef CONFIG_TRACEPOINTS
> void do_trace_write_msr(unsigned int msr, u64 val, int failed)
>
> base-commit: 45eb29140e68ffe8e93a5471006858a018480a45
> --
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-05-05 16:42 ` Kaplan, David
@ 2025-05-05 18:03 ` Sean Christopherson
2025-05-05 18:25 ` Kaplan, David
0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2025-05-05 18:03 UTC (permalink / raw)
To: David Kaplan
Cc: Borislav Petkov, Yosry Ahmed, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86@kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, Patrick Bellasi, Brendan Jackman,
Michael Larabel
On Mon, May 05, 2025, David Kaplan wrote:
> > > Almost. My thought was that kvm_run could do something like:
> > >
> > > If (!this_cpu_read(bp_spec_reduce_is_set)) {
> > > wrmsrl to set BP_SEC_REDUCE
> > > this_cpu_write(bp_spec_reduce_is_set, 1) }
> > >
> > > That ensures the bit is set for your core before VMRUN. And as noted
> > > below, you can clear the bit when the count drops to 0 but that one is
> > > safe from race conditions.
> >
> > /facepalm
> >
> > I keep inverting the scenario in my head. I'm so used to KVM needing to ensure it
> > doesn't run with guest state that I keep forgetting that running with
> > BP_SPEC_REDUCE=1 is fine, just a bit slower.
> >
> > With that in mind, the best blend of simplicity and performance is likely to hook
> > svm_prepare_switch_to_guest() and svm_prepare_host_switch().
> > switch_to_guest() is called when KVM is about to do VMRUN, and host_switch() is
> > called when the vCPU is put, i.e. when the task is scheduled out or when KVM_RUN
> > exits to userspace.
> >
> > The existing svm->guest_state_loaded guard avoids toggling the bit when KVM
> > handles a VM-Exit and re-enters the guest. The kernel may run a non-trivial
> > amount of code with BP_SPEC_REDUCE, e.g. if #NPF triggers swap-in, an IRQ
> > arrives while handling the exit, etc., but that's all fine from a security perspective.
> >
> > IIUC, per Boris[*] an IBPB is needed when toggling BP_SPEC_REDUCE on-
> > demand:
> >
> > : You want to IBPB before clearing the MSR as otherwise host kernel will be
> > : running with the mistrained gunk from the guest.
> >
> > [*]
> > https://lore.kernel.org/all/20250217160728.GFZ7NewJHpMaWdiX2M@fat_crate.loc
> > al
> >
> > Assuming that's the case...
> >
> > Compile-tested only. If this looks/sounds sane, I'll test the mechanics and write a
> > changelog.
>
> I'm having trouble following the patch...where do you clear the MSR bit?
Gah, the rdmsrl() in svm_prepare_host_switch() should be a wrmsrl().
> I thought a per-cpu "cache" of the MSR bit might be good to avoid having to
> issue slow RDMSRs, if these paths are 'hot'. I don't know if that's the case
> or not.
Oh, you were only proposing deferring setting BP_SPEC_REDUCE. I like it. That
avoids setting BP_SPEC_REDUCE on CPUs that aren't running VMs, e.g. housekeeping
CPUs, and makes it a heck of a lot easier to elide the lock on 1<=>N transitions.
I posted v2 using that approach (when lore catches up):
https://lore.kernel.org/all/20250505180300.973137-1-seanjc@google.com
> Also keep in mind this is a shared (per-core) MSR bit. You can't just clear
> it when you exit one guest because the sibling thread might be running
> another guest.
Heh, well then never mind. I'm not touching SMT coordination.
> > static void svm_prepare_host_switch(struct kvm_vcpu *vcpu) {
> > - to_svm(vcpu)->guest_state_loaded = false;
> > + struct vcpu_svm *svm = to_svm(vcpu);
> > +
> > + if (!svm->guest_state_loaded)
> > + return;
> > +
> > + if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
> > + indirect_branch_prediction_barrier();
> > + rdmsrl(MSR_ZEN4_BP_CFG, kvm_host.zen4_bp_cfg);
As above, this is supposed to be wrmsrl().
> > + }
> > + svm->guest_state_loaded = false;
> > }
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
2025-05-05 18:03 ` Sean Christopherson
@ 2025-05-05 18:25 ` Kaplan, David
0 siblings, 0 replies; 68+ messages in thread
From: Kaplan, David @ 2025-05-05 18:25 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, Yosry Ahmed, Patrick Bellasi, Paolo Bonzini,
Josh Poimboeuf, Pawan Gupta, x86@kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, Patrick Bellasi, Brendan Jackman,
Michael Larabel
> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Monday, May 5, 2025 1:04 PM
> To: Kaplan, David <David.Kaplan@amd.com>
> Cc: Borislav Petkov <bp@alien8.de>; Yosry Ahmed <yosry.ahmed@linux.dev>;
> Patrick Bellasi <derkling@google.com>; Paolo Bonzini <pbonzini@redhat.com>;
> Josh Poimboeuf <jpoimboe@redhat.com>; Pawan Gupta
> <pawan.kumar.gupta@linux.intel.com>; x86@kernel.org; kvm@vger.kernel.org;
> linux-kernel@vger.kernel.org; Patrick Bellasi <derkling@matbug.net>; Brendan
> Jackman <jackmanb@google.com>; Michael Larabel
> <Michael@michaellarabel.com>
> Subject: Re: x86/bugs: KVM: Add support for SRSO_MSR_FIX, back for moar
>
> On Mon, May 05, 2025, David Kaplan wrote:
> > > > Almost. My thought was that kvm_run could do something like:
> > > >
> > > > If (!this_cpu_read(bp_spec_reduce_is_set)) {
> > > > wrmsrl to set BP_SEC_REDUCE
> > > > this_cpu_write(bp_spec_reduce_is_set, 1) }
> > > >
> > > > That ensures the bit is set for your core before VMRUN. And as
> > > > noted below, you can clear the bit when the count drops to 0 but
> > > > that one is safe from race conditions.
> > >
> > > /facepalm
> > >
> > > I keep inverting the scenario in my head. I'm so used to KVM
> > > needing to ensure it doesn't run with guest state that I keep
> > > forgetting that running with
> > > BP_SPEC_REDUCE=1 is fine, just a bit slower.
> > >
> > > With that in mind, the best blend of simplicity and performance is
> > > likely to hook
> > > svm_prepare_switch_to_guest() and svm_prepare_host_switch().
> > > switch_to_guest() is called when KVM is about to do VMRUN, and
> > > host_switch() is called when the vCPU is put, i.e. when the task is
> > > scheduled out or when KVM_RUN exits to userspace.
> > >
> > > The existing svm->guest_state_loaded guard avoids toggling the bit
> > > when KVM handles a VM-Exit and re-enters the guest. The kernel may
> > > run a non-trivial amount of code with BP_SPEC_REDUCE, e.g. if #NPF
> > > triggers swap-in, an IRQ arrives while handling the exit, etc., but that's all fine
> from a security perspective.
> > >
> > > IIUC, per Boris[*] an IBPB is needed when toggling BP_SPEC_REDUCE
> > > on-
> > > demand:
> > >
> > > : You want to IBPB before clearing the MSR as otherwise host kernel
> > > will be
> > > : running with the mistrained gunk from the guest.
> > >
> > > [*]
> > > https://lore.kernel.org/all/20250217160728.GFZ7NewJHpMaWdiX2M@fat_cr
> > > ate.loc
> > > al
> > >
> > > Assuming that's the case...
> > >
> > > Compile-tested only. If this looks/sounds sane, I'll test the
> > > mechanics and write a changelog.
> >
> > I'm having trouble following the patch...where do you clear the MSR bit?
>
> Gah, the rdmsrl() in svm_prepare_host_switch() should be a wrmsrl().
>
> > I thought a per-cpu "cache" of the MSR bit might be good to avoid
> > having to issue slow RDMSRs, if these paths are 'hot'. I don't know
> > if that's the case or not.
>
> Oh, you were only proposing deferring setting BP_SPEC_REDUCE. I like it. That
> avoids setting BP_SPEC_REDUCE on CPUs that aren't running VMs, e.g.
> housekeeping CPUs, and makes it a heck of a lot easier to elide the lock on 1<=>N
> transitions.
>
> I posted v2 using that approach (when lore catches up):
>
> https://lore.kernel.org/all/20250505180300.973137-1-seanjc@google.com
>
LGTM.
Thanks --David Kaplan
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 4/4] Documentation/kernel-parameters: Fix a typo in kvm.enable_virt_at_load text
2024-12-02 12:04 [PATCH v2 0/4] x86/bugs: Adjust SRSO mitigation to new features Borislav Petkov
` (2 preceding siblings ...)
2024-12-02 12:04 ` [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX Borislav Petkov
@ 2024-12-02 12:04 ` Borislav Petkov
2024-12-30 17:21 ` [tip: x86/cleanups] " tip-bot2 for Borislav Petkov (AMD)
2024-12-03 14:30 ` [PATCH v2 0/4] x86/bugs: Adjust SRSO mitigation to new features Nikolay Borisov
2025-02-26 14:32 ` [tip: x86/bugs] x86/bugs: KVM: Add support for SRSO_MSR_FIX tip-bot2 for Borislav Petkov
5 siblings, 1 reply; 68+ messages in thread
From: Borislav Petkov @ 2024-12-02 12:04 UTC (permalink / raw)
To: Sean Christopherson, X86 ML
Cc: Paolo Bonzini, Josh Poimboeuf, Pawan Gupta, KVM, LKML,
Borislav Petkov (AMD)
From: "Borislav Petkov (AMD)" <bp@alien8.de>
s/lode/load/
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
Documentation/admin-guide/kernel-parameters.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index dc663c0ca670..e623e2b53be2 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2695,7 +2695,7 @@
VMs, i.e. on the 0=>1 and 1=>0 transitions of the
number of VMs.
- Enabling virtualization at module lode avoids potential
+ Enabling virtualization at module load avoids potential
latency for creation of the 0=>1 VM, as KVM serializes
virtualization enabling across all online CPUs. The
"cost" of enabling virtualization when KVM is loaded,
--
2.43.0
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [tip: x86/cleanups] Documentation/kernel-parameters: Fix a typo in kvm.enable_virt_at_load text
2024-12-02 12:04 ` [PATCH v2 4/4] Documentation/kernel-parameters: Fix a typo in kvm.enable_virt_at_load text Borislav Petkov
@ 2024-12-30 17:21 ` tip-bot2 for Borislav Petkov (AMD)
0 siblings, 0 replies; 68+ messages in thread
From: tip-bot2 for Borislav Petkov (AMD) @ 2024-12-30 17:21 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Borislav Petkov (AMD), x86, linux-kernel
The following commit has been merged into the x86/cleanups branch of tip:
Commit-ID: 1146f7429f610d51b886402f1f7a43faa08d814a
Gitweb: https://git.kernel.org/tip/1146f7429f610d51b886402f1f7a43faa08d814a
Author: Borislav Petkov (AMD) <bp@alien8.de>
AuthorDate: Mon, 02 Dec 2024 13:04:16 +01:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 30 Dec 2024 17:58:58 +01:00
Documentation/kernel-parameters: Fix a typo in kvm.enable_virt_at_load text
s/lode/load/
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20241202120416.6054-5-bp@kernel.org
---
Documentation/admin-guide/kernel-parameters.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 316cccf..262a946 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2769,7 +2769,7 @@
VMs, i.e. on the 0=>1 and 1=>0 transitions of the
number of VMs.
- Enabling virtualization at module lode avoids potential
+ Enabling virtualization at module load avoids potential
latency for creation of the 0=>1 VM, as KVM serializes
virtualization enabling across all online CPUs. The
"cost" of enabling virtualization when KVM is loaded,
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH v2 0/4] x86/bugs: Adjust SRSO mitigation to new features
2024-12-02 12:04 [PATCH v2 0/4] x86/bugs: Adjust SRSO mitigation to new features Borislav Petkov
` (3 preceding siblings ...)
2024-12-02 12:04 ` [PATCH v2 4/4] Documentation/kernel-parameters: Fix a typo in kvm.enable_virt_at_load text Borislav Petkov
@ 2024-12-03 14:30 ` Nikolay Borisov
2025-02-26 14:32 ` [tip: x86/bugs] x86/bugs: KVM: Add support for SRSO_MSR_FIX tip-bot2 for Borislav Petkov
5 siblings, 0 replies; 68+ messages in thread
From: Nikolay Borisov @ 2024-12-03 14:30 UTC (permalink / raw)
To: Borislav Petkov, Sean Christopherson, X86 ML
Cc: Paolo Bonzini, Josh Poimboeuf, Pawan Gupta, KVM, LKML,
Borislav Petkov (AMD)
On 2.12.24 at 14:04, Borislav Petkov wrote:
> From: "Borislav Petkov (AMD)" <bp@alien8.de>
>
> Hi,
>
> here's the next revision, with hopefully all review feedback addressed.
>
> Changelog:
> v1:
>
> https://lore.kernel.org/r/20241104101543.31885-1-bp@kernel.org
The series is pretty self-explanatory. So:
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
>
> Thx.
>
> Borislav Petkov (AMD) (4):
> x86/bugs: Add SRSO_USER_KERNEL_NO support
> KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace
> x86/bugs: KVM: Add support for SRSO_MSR_FIX
> Documentation/kernel-parameters: Fix a typo in kvm.enable_virt_at_load
> text
>
> Documentation/admin-guide/hw-vuln/srso.rst | 10 ++++++++++
> Documentation/admin-guide/kernel-parameters.txt | 2 +-
> arch/x86/include/asm/cpufeatures.h | 2 ++
> arch/x86/include/asm/msr-index.h | 1 +
> arch/x86/kernel/cpu/bugs.c | 16 +++++++++++++++-
> arch/x86/kernel/cpu/common.c | 1 +
> arch/x86/kvm/cpuid.c | 2 +-
> arch/x86/kvm/svm/svm.c | 6 ++++++
> arch/x86/lib/msr.c | 2 ++
> 9 files changed, 39 insertions(+), 3 deletions(-)
>
>
> base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
^ permalink raw reply [flat|nested] 68+ messages in thread
* [tip: x86/bugs] x86/bugs: KVM: Add support for SRSO_MSR_FIX
2024-12-02 12:04 [PATCH v2 0/4] x86/bugs: Adjust SRSO mitigation to new features Borislav Petkov
` (4 preceding siblings ...)
2024-12-03 14:30 ` [PATCH v2 0/4] x86/bugs: Adjust SRSO mitigation to new features Nikolay Borisov
@ 2025-02-26 14:32 ` tip-bot2 for Borislav Petkov
5 siblings, 0 replies; 68+ messages in thread
From: tip-bot2 for Borislav Petkov @ 2025-02-26 14:32 UTC (permalink / raw)
To: linux-tip-commits
Cc: Sean Christopherson, Borislav Petkov (AMD), x86, linux-kernel
The following commit has been merged into the x86/bugs branch of tip:
Commit-ID: 8442df2b49ed9bcd67833ad4f091d15ac91efd00
Gitweb: https://git.kernel.org/tip/8442df2b49ed9bcd67833ad4f091d15ac91efd00
Author: Borislav Petkov <bp@alien8.de>
AuthorDate: Tue, 18 Feb 2025 12:13:33 +01:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 26 Feb 2025 15:13:06 +01:00
x86/bugs: KVM: Add support for SRSO_MSR_FIX
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it
indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate
SRSO.
Enable BpSpecReduce to mitigate SRSO across guest/host boundaries.
Switch back to setting the bit when virtualization is enabled and
clearing it when virtualization is disabled: using an MSR slot instead
would clear the bit on guest exit, and any predictor training the guest
has done could then influence the host kernel once execution enters the
kernel but before the next VMRUN.
More detail on the public thread in Link below.
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20241202120416.6054-1-bp@kernel.org
---
Documentation/admin-guide/hw-vuln/srso.rst | 13 +++++++++++-
arch/x86/include/asm/cpufeatures.h | 4 ++++-
arch/x86/include/asm/msr-index.h | 1 +-
arch/x86/kernel/cpu/bugs.c | 24 +++++++++++++++++----
arch/x86/kvm/svm/svm.c | 6 +++++-
arch/x86/lib/msr.c | 2 ++-
6 files changed, 46 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05..66af952 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,7 +104,20 @@ The possible values in this file are:
(spec_rstack_overflow=ibpb-vmexit)
+ * 'Mitigation: Reduced Speculation':
+ This mitigation gets automatically enabled when the above one "IBPB on
+ VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+ It gets automatically enabled on machines which have the
+ SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
+ to the above =ibpb-vmexit mitigation because the user/kernel boundary is
+ not affected anymore and thus "safe RET" is not needed.
+
+ After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
+ is detected (functionality present on all such machines) and that
+ practically overrides IBPB on VMEXIT as it has a lot less performance
+ impact and takes care of the guest->host attack vector too.
In order to exploit vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 508c0da..43653f2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -468,6 +468,10 @@
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
#define X86_FEATURE_SRSO_USER_KERNEL_NO (20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_BP_SPEC_REDUCE (20*32+31) /*
+ * BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs.
+ * (SRSO_MSR_FIX in the official doc).
+ */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 72765b2..d35519b 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -721,6 +721,7 @@
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
/* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index a5d0998..1d7afc4 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2522,6 +2522,7 @@ enum srso_mitigation {
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
+ SRSO_MITIGATION_BP_SPEC_REDUCE,
};
enum srso_mitigation_cmd {
@@ -2539,7 +2540,8 @@ static const char * const srso_strings[] = {
[SRSO_MITIGATION_MICROCODE] = "Vulnerable: Microcode, no safe RET",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: Safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
- [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
+ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only",
+ [SRSO_MITIGATION_BP_SPEC_REDUCE] = "Mitigation: Reduced Speculation"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2578,7 +2580,7 @@ static void __init srso_select_mitigation(void)
srso_cmd == SRSO_CMD_OFF) {
if (boot_cpu_has(X86_FEATURE_SBPB))
x86_pred_cmd = PRED_CMD_SBPB;
- return;
+ goto out;
}
if (has_microcode) {
@@ -2590,7 +2592,7 @@ static void __init srso_select_mitigation(void)
*/
if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) {
setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
- return;
+ goto out;
}
if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
@@ -2670,6 +2672,12 @@ static void __init srso_select_mitigation(void)
ibpb_on_vmexit:
case SRSO_CMD_IBPB_ON_VMEXIT:
+ if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+ pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+ srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+ break;
+ }
+
if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {
if (has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
@@ -2691,7 +2699,15 @@ ibpb_on_vmexit:
}
out:
- pr_info("%s\n", srso_strings[srso_mitigation]);
+ /*
+ * Clear the feature flag if this mitigation is not selected as that
+ * feature flag controls the BpSpecReduce MSR bit toggling in KVM.
+ */
+ if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
+ setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
+
+ if (srso_mitigation != SRSO_MITIGATION_NONE)
+ pr_info("%s\n", srso_strings[srso_mitigation]);
}
#undef pr_fmt
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a713c80..77ab66c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -607,6 +607,9 @@ static void svm_disable_virtualization_cpu(void)
kvm_cpu_svm_disable();
amd_pmu_disable_virt();
+
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}
static int svm_enable_virtualization_cpu(void)
@@ -684,6 +687,9 @@ static int svm_enable_virtualization_cpu(void)
rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
}
+ if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+ msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+
return 0;
}
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 4bf4fad..5a18ecc 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -103,6 +103,7 @@ int msr_set_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, true);
}
+EXPORT_SYMBOL_GPL(msr_set_bit);
/**
* msr_clear_bit - Clear @bit in a MSR @msr.
@@ -118,6 +119,7 @@ int msr_clear_bit(u32 msr, u8 bit)
{
return __flip_bit(msr, bit, false);
}
+EXPORT_SYMBOL_GPL(msr_clear_bit);
#ifdef CONFIG_TRACEPOINTS
void do_trace_write_msr(unsigned int msr, u64 val, int failed)
^ permalink raw reply related [flat|nested] 68+ messages in thread