* [RFC PATCH v2 0/3] Add support for the ERAPS feature
@ 2024-11-11 16:39 Amit Shah
2024-11-11 16:39 ` [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD Amit Shah
` (2 more replies)
0 siblings, 3 replies; 34+ messages in thread
From: Amit Shah @ 2024-11-11 16:39 UTC (permalink / raw)
To: linux-kernel, kvm, x86, linux-doc
Cc: amit.shah, thomas.lendacky, bp, tglx, peterz, jpoimboe,
pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa, seanjc,
pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw, andrew.cooper3,
Amit Shah
Newer AMD CPUs (Zen5+) have the ERAPS feature bit that allows us to remove the
RSB filling loops required during context switches and VM exits.
This patchset implements the feature to:
* remove the need for RSB filling on context switches and VMEXITs in host and
guests
* allow KVM guests to use the full default RSB stack
The feature isn't yet covered by an APM update detailing how it works, so
this series is still tagged as RFC.
The v1 posting resulted in some questions on patch 1 -- I've included the
context and comments in the commit text for patch 1 in this posting.
v2:
* reword comments to highlight context switch as the main trigger for RSB
flushes in hardware (Dave Hansen)
* Split out outdated comment updates in (v1) patch1 to be a standalone
patch1 in this series, to reinforce RSB filling is only required for RSB
poisoning cases for AMD
* Remove mentions of BTC/BTC_NO (Andrew Cooper)
* Add braces in case stmt (kernel test robot)
* s/boot_cpu_has/cpu_feature_enabled (Boris Petkov)
Amit Shah (3):
x86: cpu/bugs: update SpectreRSB comments for AMD
x86: cpu/bugs: add support for AMD ERAPS feature
x86: kvm: svm: add support for ERAPS and FLUSH_RAP_ON_VMRUN
Documentation/admin-guide/hw-vuln/spectre.rst | 5 ++-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/nospec-branch.h | 12 +++++
arch/x86/include/asm/svm.h | 6 ++-
arch/x86/kernel/cpu/bugs.c | 34 ++++++++------
arch/x86/kvm/cpuid.c | 18 +++++++-
arch/x86/kvm/svm/svm.c | 44 +++++++++++++++++++
arch/x86/kvm/svm/svm.h | 15 +++++++
8 files changed, 117 insertions(+), 18 deletions(-)
--
2.47.0
^ permalink raw reply [flat|nested] 34+ messages in thread
* [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-11 16:39 [RFC PATCH v2 0/3] Add support for the ERAPS feature Amit Shah
@ 2024-11-11 16:39 ` Amit Shah
2024-11-11 17:01 ` Dave Hansen
2024-11-11 19:33 ` Josh Poimboeuf
2024-11-11 16:39 ` [RFC PATCH v2 2/3] x86: cpu/bugs: add support for AMD ERAPS feature Amit Shah
2024-11-11 16:39 ` [RFC PATCH v2 3/3] x86: kvm: svm: add support for ERAPS and FLUSH_RAP_ON_VMRUN Amit Shah
2 siblings, 2 replies; 34+ messages in thread
From: Amit Shah @ 2024-11-11 16:39 UTC (permalink / raw)
To: linux-kernel, kvm, x86, linux-doc
Cc: amit.shah, thomas.lendacky, bp, tglx, peterz, jpoimboe,
pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa, seanjc,
pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw, andrew.cooper3
From: Amit Shah <amit.shah@amd.com>
AMD CPUs do not fall back to the BTB for RET address speculation when the
RSB underflows, so AMD CPUs have never needed to stuff the RSB for
underflow conditions.
The RSB poisoning case is addressed by RSB filling - clean up the FIXME
comment about it.
Signed-off-by: Amit Shah <amit.shah@amd.com>
---
arch/x86/kernel/cpu/bugs.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..0aa629b5537d 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1828,9 +1828,6 @@ static void __init spectre_v2_select_mitigation(void)
* speculated return targets may come from the branch predictor,
* which could have a user-poisoned BTB or BHB entry.
*
- * AMD has it even worse: *all* returns are speculated from the BTB,
- * regardless of the state of the RSB.
- *
* When IBRS or eIBRS is enabled, the "user -> kernel" attack
* scenario is mitigated by the IBRS branch prediction isolation
* properties, so the RSB buffer filling wouldn't be necessary to
@@ -1852,8 +1849,6 @@ static void __init spectre_v2_select_mitigation(void)
*
* So to mitigate all cases, unconditionally fill RSB on context
* switches.
- *
- * FIXME: Is this pointless for retbleed-affected AMD?
*/
setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
--
2.47.0
* [RFC PATCH v2 2/3] x86: cpu/bugs: add support for AMD ERAPS feature
2024-11-11 16:39 [RFC PATCH v2 0/3] Add support for the ERAPS feature Amit Shah
2024-11-11 16:39 ` [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD Amit Shah
@ 2024-11-11 16:39 ` Amit Shah
2024-11-11 18:57 ` Dave Hansen
2024-11-11 16:39 ` [RFC PATCH v2 3/3] x86: kvm: svm: add support for ERAPS and FLUSH_RAP_ON_VMRUN Amit Shah
2 siblings, 1 reply; 34+ messages in thread
From: Amit Shah @ 2024-11-11 16:39 UTC (permalink / raw)
To: linux-kernel, kvm, x86, linux-doc
Cc: amit.shah, thomas.lendacky, bp, tglx, peterz, jpoimboe,
pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa, seanjc,
pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw, andrew.cooper3
From: Amit Shah <amit.shah@amd.com>
Remove explicit RET stuffing / filling on VMEXITs and context
switches on AMD CPUs with the ERAPS feature (Zen5+).
With the Enhanced Return Address Prediction Security feature, any of
the following conditions triggers a flush of the RSB (aka RAP in AMD
manuals) in hardware:
* context switch (e.g., move to CR3)
* TLB flush
* some writes to CR4
The feature also explicitly tags host and guest addresses in the RSB -
eliminating the need for explicit flushing of the RSB on VMEXIT.
[RFC note: We'll wait for the APM to be updated with the real wording,
but assuming it's going to say the ERAPS feature works the way described
above, let's continue the discussion re: when the kernel currently calls
FILL_RETURN_BUFFER, and what dropping it entirely means.
Dave Hansen pointed out __switch_to_asm fills the RSB each time it's
called, so let's address the cases there:
1. user->kernel switch: Switching from userspace to kernelspace, and
then using user-stuffed RSB entries in the kernel is not possible
thanks to SMEP. We can safely drop the FILL_RETURN_BUFFER call for
this case. In fact, this is how the original code was when dwmw2
added it originally in c995efd5a. So while this case currently
triggers an RSB flush (and will not after this ERAPS patch), the
current flush isn't necessary for AMD systems with SMEP anyway.
2. user->user or kernel->kernel: If a user->user switch does not result
in a CR3 change, it's a different thread in the same process context.
That's the same case for kernel->kernel switch. In this case, the
RSB entries are still valid in that context, just not the correct
ones for the new thread's context. It's difficult to imagine this
being a security risk. The current code clears the RSB here; this patch
not doing so for AMD-with-ERAPS isn't a concern as far as I can see.
]
The feature is mentioned in AMD PPR 57238. The series will be resubmitted
once the APM update is public - which I'm told is imminent.
Signed-off-by: Amit Shah <amit.shah@amd.com>
---
Documentation/admin-guide/hw-vuln/spectre.rst | 5 ++--
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/nospec-branch.h | 12 ++++++++
arch/x86/kernel/cpu/bugs.c | 29 ++++++++++++++-----
4 files changed, 37 insertions(+), 10 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
index 132e0bc6007e..647c10c0307a 100644
--- a/Documentation/admin-guide/hw-vuln/spectre.rst
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -417,9 +417,10 @@ The possible values in this file are:
- Return stack buffer (RSB) protection status:
- ============= ===========================================
+ ============= ========================================================
'RSB filling' Protection of RSB on context switch enabled
- ============= ===========================================
+ 'ERAPS' Hardware RSB flush on context switches + guest/host tags
+ ============= ========================================================
- EIBRS Post-barrier Return Stack Buffer (PBRSB) protection status:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 913fd3a7bac6..665032b12871 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -458,6 +458,7 @@
#define X86_FEATURE_AUTOIBRS (20*32+ 8) /* Automatic IBRS */
#define X86_FEATURE_NO_SMM_CTL_MSR (20*32+ 9) /* SMM_CTL MSR is not present */
+#define X86_FEATURE_ERAPS (20*32+24) /* Enhanced RAP / RSB / RAS Security */
#define X86_FEATURE_SBPB (20*32+27) /* Selective Branch Prediction Barrier */
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* CPU is not affected by SRSO */
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 96b410b1d4e8..f5ee7fc71db5 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -117,6 +117,18 @@
* We define a CPP macro such that it can be used from both .S files and
* inline assembly. It's possible to do a .macro and then include that
* from C via asm(".include <asm/nospec-branch.h>") but let's not go there.
+ *
+ * AMD CPUs with the ERAPS feature may have a larger default RSB. These CPUs
+ * use the default number of entries on a host, and can optionally (based on
+ * hypervisor setup) use 32 (old) or the new default in a guest. The number
+ * of default entries is reflected in CPUID 8000_0021:EBX[23:16].
+ *
+ * With the ERAPS feature, RSB filling is not necessary anymore: the RSB is
+ * auto-cleared by hardware on context switches, TLB flushes, or some CR4
+ * writes. Adapting the value of RSB_CLEAR_LOOPS below for ERAPS would change
+ * it to a runtime variable instead of the current compile-time constant, so
+ * leave it as-is, as this works for both older CPUs, as well as newer ones
+ * with ERAPS.
*/
#define RETPOLINE_THUNK_SIZE 32
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 0aa629b5537d..02446815b0de 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1818,9 +1818,12 @@ static void __init spectre_v2_select_mitigation(void)
pr_info("%s\n", spectre_v2_strings[mode]);
/*
- * If Spectre v2 protection has been enabled, fill the RSB during a
- * context switch. In general there are two types of RSB attacks
- * across context switches, for which the CALLs/RETs may be unbalanced.
+ * If Spectre v2 protection has been enabled, the RSB needs to be
+ * cleared during a context switch. Either do it in software by
+ * filling the RSB, or in hardware via ERAPS.
+ *
+ * In general there are two types of RSB attacks across context
+ * switches, for which the CALLs/RETs may be unbalanced.
*
* 1) RSB underflow
*
@@ -1848,12 +1851,21 @@ static void __init spectre_v2_select_mitigation(void)
* RSB clearing.
*
* So to mitigate all cases, unconditionally fill RSB on context
- * switches.
+ * switches when ERAPS is not present.
*/
- setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
- pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
+ if (!boot_cpu_has(X86_FEATURE_ERAPS)) {
+ setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
+ pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
- spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
+ /*
+ * For guest -> host (or vice versa) RSB poisoning scenarios,
+ * determine the mitigation mode here. With ERAPS, RSB
+ * entries are tagged as host or guest - ensuring that neither
+ * the host nor the guest have to clear or fill RSB entries to
+ * avoid poisoning: skip RSB filling at VMEXIT in that case.
+ */
+ spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
+ }
/*
* Retpoline protects the kernel, but doesn't protect firmware. IBRS
@@ -2866,7 +2878,7 @@ static ssize_t spectre_v2_show_state(char *buf)
spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
return sysfs_emit(buf, "Vulnerable: eIBRS+LFENCE with unprivileged eBPF and SMT\n");
- return sysfs_emit(buf, "%s%s%s%s%s%s%s%s\n",
+ return sysfs_emit(buf, "%s%s%s%s%s%s%s%s%s\n",
spectre_v2_strings[spectre_v2_enabled],
ibpb_state(),
boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? "; IBRS_FW" : "",
@@ -2874,6 +2886,7 @@ static ssize_t spectre_v2_show_state(char *buf)
boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? "; RSB filling" : "",
pbrsb_eibrs_state(),
spectre_bhi_state(),
+ boot_cpu_has(X86_FEATURE_ERAPS) ? "; ERAPS hardware RSB flush" : "",
/* this should always be at the end */
spectre_v2_module_string());
}
--
2.47.0
* [RFC PATCH v2 3/3] x86: kvm: svm: add support for ERAPS and FLUSH_RAP_ON_VMRUN
2024-11-11 16:39 [RFC PATCH v2 0/3] Add support for the ERAPS feature Amit Shah
2024-11-11 16:39 ` [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD Amit Shah
2024-11-11 16:39 ` [RFC PATCH v2 2/3] x86: cpu/bugs: add support for AMD ERAPS feature Amit Shah
@ 2024-11-11 16:39 ` Amit Shah
2024-11-11 19:02 ` Dave Hansen
2 siblings, 1 reply; 34+ messages in thread
From: Amit Shah @ 2024-11-11 16:39 UTC (permalink / raw)
To: linux-kernel, kvm, x86, linux-doc
Cc: amit.shah, thomas.lendacky, bp, tglx, peterz, jpoimboe,
pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa, seanjc,
pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw, andrew.cooper3
From: Amit Shah <amit.shah@amd.com>
AMD CPUs with the ERAPS feature (Zen5+) have a larger RSB (aka RAP).
While the host uses the new default RSB size without any software changes,
RSB usage for guests is limited to the older size (32 entries) for
backwards compatibility. With this patch, KVM lets guests also use the
default number of entries by setting the new ALLOW_LARGER_RAP bit in the
VMCB.
The two backward-compatibility cases that need special handling are nested
guests, and guests running with shadow paging (i.e., when NPT is disabled):
For nested guests: the ERAPS feature adds host/guest tagging to entries
in the RSB, but does not distinguish between ASIDs. On a nested exit,
the L0 hypervisor instructs the microcode (via another new VMCB bit,
FLUSH_RAP_ON_VMRUN) to flush the RSB on the next VMRUN to prevent RSB
poisoning attacks from an L2 guest to an L1 guest. With that in place,
this feature can be exposed to guests.
For shadow paging guests: only expose this feature when nested paging is
enabled. With NPT, context switches within guests trigger TLB flushes on
the CPU -- and hence also flush guest RSB entries. With shadow paging,
the CPU's CR3 does not hold the guest process's CR3, so guest context
switches cannot trigger the hardware flush, and the guest cannot benefit
from this feature.
Signed-off-by: Amit Shah <amit.shah@amd.com>
---
arch/x86/include/asm/svm.h | 6 +++++-
arch/x86/kvm/cpuid.c | 18 ++++++++++++++--
arch/x86/kvm/svm/svm.c | 44 ++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/svm/svm.h | 15 +++++++++++++
4 files changed, 80 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 2b59b9951c90..f8584a63c859 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -129,7 +129,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
u64 tsc_offset;
u32 asid;
u8 tlb_ctl;
- u8 reserved_2[3];
+ u8 erap_ctl;
+ u8 reserved_2[2];
u32 int_ctl;
u32 int_vector;
u32 int_state;
@@ -175,6 +176,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define TLB_CONTROL_FLUSH_ASID 3
#define TLB_CONTROL_FLUSH_ASID_LOCAL 7
+#define ERAP_CONTROL_ALLOW_LARGER_RAP 0
+#define ERAP_CONTROL_FLUSH_RAP 1
+
#define V_TPR_MASK 0x0f
#define V_IRQ_SHIFT 8
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 41786b834b16..b432fe3a9f49 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -797,6 +797,8 @@ void kvm_set_cpu_caps(void)
F(WRMSR_XX_BASE_NS)
);
+ if (tdp_enabled)
+ kvm_cpu_cap_check_and_set(X86_FEATURE_ERAPS);
kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
kvm_cpu_cap_check_and_set(X86_FEATURE_IBPB_BRTYPE);
kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_NO);
@@ -1356,10 +1358,22 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
case 0x80000020:
entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
break;
- case 0x80000021:
- entry->ebx = entry->ecx = entry->edx = 0;
+ case 0x80000021: {
+ unsigned int ebx_mask = 0;
+
+ entry->ecx = entry->edx = 0;
cpuid_entry_override(entry, CPUID_8000_0021_EAX);
+
+ /*
+ * Bits 23:16 in EBX indicate the size of the RSB.
+ * Expose the value in the hardware to the guest.
+ */
+ if (kvm_cpu_cap_has(X86_FEATURE_ERAPS))
+ ebx_mask |= GENMASK(23, 16);
+
+ entry->ebx &= ebx_mask;
break;
+ }
/* AMD Extended Performance Monitoring and Debug */
case 0x80000022: {
union cpuid_0x80000022_ebx ebx;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9df3e1e5ae81..c98ae5ee3646 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1360,6 +1360,28 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
if (boot_cpu_has(X86_FEATURE_V_SPEC_CTRL))
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_SPEC_CTRL, 1, 1);
+ /*
+ * If the hardware has a larger RSB, use it in the guest context as
+ * well.
+ *
+ * When running nested guests: the hardware tags host and guest RSB
+ * entries, but the entries are ASID agnostic. Differentiating L1 and
+ * L2 guests isn't possible in hardware. To prevent L2->L1 RSB
+ * poisoning attacks in this case, the L0 hypervisor must set
+ * FLUSH_RAP_ON_VMRUN in the L1's VMCB on a nested #VMEXIT to ensure
+ * the next VMRUN flushes the RSB.
+ *
+ * For shadow paging / NPT disabled case: the CPU's CR3 does not
+ * contain the CR3 of the running guest process, and hence intra-guest
+ * context switches will not cause a hardware TLB flush, which in turn
+ * does not result in a guest RSB flush that the ERAPS feature
+ * provides. Do not expose ERAPS or the larger RSB to the guest in
+ * this case, so the guest continues implementing software mitigations
+ * as well as only sees 32 entries for the RSB.
+ */
+ if (cpu_feature_enabled(X86_FEATURE_ERAPS) && npt_enabled)
+ vmcb_set_larger_rap(svm->vmcb);
+
if (kvm_vcpu_apicv_active(vcpu))
avic_init_vmcb(svm, vmcb);
@@ -3393,6 +3415,7 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
pr_err("%-20s%016llx\n", "tsc_offset:", control->tsc_offset);
pr_err("%-20s%d\n", "asid:", control->asid);
pr_err("%-20s%d\n", "tlb_ctl:", control->tlb_ctl);
+ pr_err("%-20s%d\n", "erap_ctl:", control->erap_ctl);
pr_err("%-20s%08x\n", "int_ctl:", control->int_ctl);
pr_err("%-20s%08x\n", "int_vector:", control->int_vector);
pr_err("%-20s%08x\n", "int_state:", control->int_state);
@@ -3559,6 +3582,27 @@ static int svm_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
trace_kvm_nested_vmexit(vcpu, KVM_ISA_SVM);
+ if (boot_cpu_has(X86_FEATURE_ERAPS)
+ && vmcb_is_larger_rap(svm->vmcb01.ptr)) {
+ /*
+ * XXX a few further optimizations can be made:
+ *
+ * 1. In pre_svm_run() we can reset this bit when a hw
+ * TLB flush has happened - any context switch on a
+ * CPU (which causes a TLB flush) auto-flushes the RSB
+ * - eg when this vCPU is scheduled on a different
+ * pCPU.
+ *
+ * 2. This is also not needed in the case where the
+ * vCPU is being scheduled on the same pCPU, but there
+ * was a context switch between the #VMEXIT and VMRUN.
+ *
+ * 3. If the guest returns to L2 again after this
+ * #VMEXIT, there's no need to flush the RSB.
+ */
+ vmcb_set_flush_rap(svm->vmcb01.ptr);
+ }
+
vmexit = nested_svm_exit_special(svm);
if (vmexit == NESTED_EXIT_CONTINUE)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 43fa6a16eb19..8a7877f46dc5 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -500,6 +500,21 @@ static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
return vmcb_is_intercept(&svm->vmcb->control, bit);
}
+static inline void vmcb_set_flush_rap(struct vmcb *vmcb)
+{
+ __set_bit(ERAP_CONTROL_FLUSH_RAP, (unsigned long *)&vmcb->control.erap_ctl);
+}
+
+static inline void vmcb_set_larger_rap(struct vmcb *vmcb)
+{
+ __set_bit(ERAP_CONTROL_ALLOW_LARGER_RAP, (unsigned long *)&vmcb->control.erap_ctl);
+}
+
+static inline bool vmcb_is_larger_rap(struct vmcb *vmcb)
+{
+ return test_bit(ERAP_CONTROL_ALLOW_LARGER_RAP, (unsigned long *)&vmcb->control.erap_ctl);
+}
+
static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
{
return guest_can_use(&svm->vcpu, X86_FEATURE_VGIF) &&
--
2.47.0
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-11 16:39 ` [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD Amit Shah
@ 2024-11-11 17:01 ` Dave Hansen
2024-11-11 19:33 ` Josh Poimboeuf
1 sibling, 0 replies; 34+ messages in thread
From: Dave Hansen @ 2024-11-11 17:01 UTC (permalink / raw)
To: Amit Shah, linux-kernel, kvm, x86, linux-doc
Cc: amit.shah, thomas.lendacky, bp, tglx, peterz, jpoimboe,
pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa, seanjc,
pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw, andrew.cooper3
On 11/11/24 08:39, Amit Shah wrote:
> From: Amit Shah <amit.shah@amd.com>
>
> AMD CPUs do not fall back to the BTB when the RSB underflows for RET
> address speculation. AMD CPUs have not needed to stuff the RSB for
> underflow conditions.
>
> The RSB poisoning case is addressed by RSB filling - clean up the FIXME
> comment about it.
This amounts to "Josh was wrong" in commit 9756bba284. Before moving
forward with this, it would be great to get his ack on this to make sure
you two are on the same page.
* Re: [RFC PATCH v2 2/3] x86: cpu/bugs: add support for AMD ERAPS feature
2024-11-11 16:39 ` [RFC PATCH v2 2/3] x86: cpu/bugs: add support for AMD ERAPS feature Amit Shah
@ 2024-11-11 18:57 ` Dave Hansen
2024-11-13 10:33 ` Shah, Amit
0 siblings, 1 reply; 34+ messages in thread
From: Dave Hansen @ 2024-11-11 18:57 UTC (permalink / raw)
To: Amit Shah, linux-kernel, kvm, x86, linux-doc
Cc: amit.shah, thomas.lendacky, bp, tglx, peterz, jpoimboe,
pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa, seanjc,
pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw, andrew.cooper3
On 11/11/24 08:39, Amit Shah wrote:
> Remove explicit RET stuffing / filling on VMEXITs and context
> switches on AMD CPUs with the ERAPS feature (Zen5+).
I'm not a super big fan of how this changelog is constructed. I
personally really like the "background, problem, solution" form. This
starts with the solution before even telling us what it is.
> With the Enhanced Return Address Prediction Security feature, any of
> the following conditions triggers a flush of the RSB (aka RAP in AMD
> manuals) in hardware:
> * context switch (e.g., move to CR3)
IMHO, context switching is a broader topic than process-to-process
switches. A user=>kernel thread switch is definitely a context switch,
but doesn't involve a CR3 write. A user=>user switch within the same mm is
also arguably a context switch and doesn't involve a CR3 write. There are
userspace threading libraries that do "context switches".
This is further muddled by the very comments you are editing in this
patch like: "unconditionally fill RSB on context switches".
Please be very, very precise in the language that's chosen here.
> * TLB flush
> * some writes to CR4
... and only one of those is relevant for this patch. Aren't the others
just noise?
> The feature also explicitly tags host and guest addresses in the RSB -
> eliminating the need for explicit flushing of the RSB on VMEXIT.
>
> [RFC note: We'll wait for the APM to be updated with the real wording,
> but assuming it's going to say the ERAPS feature works the way described
> above, let's continue the discussion re: when the kernel currently calls
> FILL_RETURN_BUFFER, and what dropping it entirely means.
>
> Dave Hansen pointed out __switch_to_asm fills the RSB each time it's
> called, so let's address the cases there:
>
> 1. user->kernel switch: Switching from userspace to kernelspace, and
> then using user-stuffed RSB entries in the kernel is not possible
> thanks to SMEP. We can safely drop the FILL_RETURN_BUFFER call for
> this case. In fact, this is how the original code was when dwmw2
> added it originally in c995efd5a. So while this case currently
> triggers an RSB flush (and will not after this ERAPS patch), the
> current flush isn't necessary for AMD systems with SMEP anyway.
This description makes me uneasy. It basically boils down to, "SMEP
guarantees that the kernel can't consume user-placed RSB entries."
It makes me uneasy because I recall that not holding true on some Intel
CPUs. I believe some CPUs have a partial-width RSB. Kernel consumption
of a user-placed entry would induce the kernel to speculate to a
*kernel* address so SMEP is rather ineffective.
So, instead of just saying, "SMEP magically fixes everything, trust us",
could we please step through a few of the properties of the hardware and
software that make SMEP effective?
First, I think that you're saying that AMD hardware has a full-width,
precise RSB. That ensures that speculation always happens back
precisely to the original address, not some weird alias. Second, ERAPS
guarantees that the address still has the same stuff mapped there.
Any change to the address space involves either a CR3 write or a TLB
flush, both of which would have flushed any user-placed content in the RSB.
Aside: Even the kernel-only text poking mm or EFI mm would "just
work" in this scenario since they have their own mm_structs,
page tables and root CR3 values.
> 2. user->user or kernel->kernel: If a user->user switch does not result
> in a CR3 change, it's a different thread in the same process context.
> That's the same case for kernel->kernel switch. In this case, the
> RSB entries are still valid in that context, just not the correct
> ones in the new thread's context. It's difficult to imagine this
> being a security risk. The current code clearing it, and this patch
> not doing so for AMD-with-ERAPS, isn't a concern as far as I see.
> ]
The current rules are dirt simple: if the kernel stack gets switched,
the RSB is flushed. It's kinda hard to have a mismatched stack if it's
never switched in the first place. The problem (stack switching) is
dirt simple to correlate 1:1 with the fix (RSB stuffing).
This patch proposes separating the problem (stack switching) from the
mitigations (implicit new microcode actions). That's a hell of a lot
more complicated and harder to audit than the existing rules. Agreed?
So, how are the rules relaxed?
First, it opens up the case where user threads consume RSB entries from
other threads in the same process. Threads are usually at the same
privilege level. Instead of using a speculative attack, an attacker
could just read the data directly. The only place I can imagine this
even remotely being a problem would be if threads were relying on
protection keys to keep secrets from each other.
The kernel consuming RSB entries from another kernel thread seems less
clear. I disagree with the idea that a valid entry in a given context
is a _safe_ entry though. Spectre v2 in _general_ involves nasty
speculation to otherwise perfectly safe code. A problematic scenario
would be where kswapd wakes up after waiting for I/O and starts
speculating back along the return path of the userspace thread that
kswapd interrupted. Userspace has some level of control over both stacks
and it doesn't seem super far fetched to think that there could be some
disclosure gadgets in there.
In short: the user-consumes-user-RSB-entry attacks seem fundamentally
improbable because user threads are unlikely to have secrets from each
other.
The kernel-consumes-kernel-RSB-entry attacks seem hard rather than
fundamentally improbable. I can't even count how many times our "oh
that attack would be too hard to pull off" optimism has gone wrong.
What does that all _mean_? ERAPS sucks? Nah. Maybe we just make sure
that the existing spectre_v2=whatever controls can be used to stop
relying on it if asked.
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index 96b410b1d4e8..f5ee7fc71db5 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -117,6 +117,18 @@
> * We define a CPP macro such that it can be used from both .S files and
> * inline assembly. It's possible to do a .macro and then include that
> * from C via asm(".include <asm/nospec-branch.h>") but let's not go there.
> + *
> + * AMD CPUs with the ERAPS feature may have a larger default RSB. These CPUs
> + * use the default number of entries on a host, and can optionally (based on
> + * hypervisor setup) use 32 (old) or the new default in a guest. The number
> + * of default entries is reflected in CPUID 8000_0021:EBX[23:16].
> + *
> + * With the ERAPS feature, RSB filling is not necessary anymore: the RSB is
> + * auto-cleared by hardware on context switches, TLB flushes, or some CR4
> + * writes. Adapting the value of RSB_CLEAR_LOOPS below for ERAPS would change
> + * it to a runtime variable instead of the current compile-time constant, so
> + * leave it as-is, as this works for both older CPUs, as well as newer ones
> + * with ERAPS.
> */
This comment feels out of place and noisy to me. Why the heck would we
need to document the RSB size enumeration here and not use it?
Then this goes on to explain why this patch *didn't* bother to change
RSB_CLEAR_LOOPS. That seems much more like changelog material than a
code comment to me.
To me, RSB_CLEAR_LOOPS, is for, well, clearing the RSB. The existing
comments say that some CPUs don't need to clear the RSB. I don't think
we need further explanation of why one specific CPU doesn't need to
clear the RSB. The new CPU isn't special.
Something slightly more useful would be to actually read CPUID
8000_0021:EBX[23:16] and compare it to RSB_CLEAR_LOOPS:
void amd_check_rsb_clearing(void)
{
	/*
	 * Systems with both these features off
	 * do not perform RSB clearing:
	 */
	if (!boot_cpu_has(X86_FEATURE_RSB_CTXSW) &&
	    !boot_cpu_has(X86_FEATURE_RSB_VMEXIT))
		return;

	// with actual helpers and mask generation, not this mess:
	unsigned int rsb_size = (cpuid_ebx(0x80000021) >> 16) & 0xff;

	WARN_ON_ONCE(rsb_size > RSB_CLEAR_LOOPS);
}
That's almost shorter than the proposed comment.
> #define RETPOLINE_THUNK_SIZE 32
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 0aa629b5537d..02446815b0de 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1818,9 +1818,12 @@ static void __init spectre_v2_select_mitigation(void)
> pr_info("%s\n", spectre_v2_strings[mode]);
>
> /*
> - * If Spectre v2 protection has been enabled, fill the RSB during a
> - * context switch. In general there are two types of RSB attacks
> - * across context switches, for which the CALLs/RETs may be unbalanced.
> + * If Spectre v2 protection has been enabled, the RSB needs to be
> + * cleared during a context switch. Either do it in software by
> + * filling the RSB, or in hardware via ERAPS.
I'd move this comment about using ERAPS down to the ERAPS check itself.
> + * In general there are two types of RSB attacks across context
> + * switches, for which the CALLs/RETs may be unbalanced.
> *
> * 1) RSB underflow
> *
> @@ -1848,12 +1851,21 @@ static void __init spectre_v2_select_mitigation(void)
> * RSB clearing.
> *
> * So to mitigate all cases, unconditionally fill RSB on context
> - * switches.
> + * switches when ERAPS is not present.
> */
> - setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> - pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
> + if (!boot_cpu_has(X86_FEATURE_ERAPS)) {
> + setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> + pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
>
> - spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
> + /*
> + * For guest -> host (or vice versa) RSB poisoning scenarios,
> + * determine the mitigation mode here. With ERAPS, RSB
> + * entries are tagged as host or guest - ensuring that neither
> + * the host nor the guest have to clear or fill RSB entries to
> + * avoid poisoning: skip RSB filling at VMEXIT in that case.
> + */
> + spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
> + }
This seems to be following a pattern of putting excessive amounts of
very AMD-specific commenting right in the middle of common code.
There's a *HUGE* comment in
spectre_v2_determine_rsb_fill_type_at_vmexit(). Wouldn't this be more
at home there?
> /*
> * Retpoline protects the kernel, but doesn't protect firmware. IBRS
> @@ -2866,7 +2878,7 @@ static ssize_t spectre_v2_show_state(char *buf)
> spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
> return sysfs_emit(buf, "Vulnerable: eIBRS+LFENCE with unprivileged eBPF and SMT\n");
>
> - return sysfs_emit(buf, "%s%s%s%s%s%s%s%s\n",
> + return sysfs_emit(buf, "%s%s%s%s%s%s%s%s%s\n",
> spectre_v2_strings[spectre_v2_enabled],
> ibpb_state(),
> boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? "; IBRS_FW" : "",
> @@ -2874,6 +2886,7 @@ static ssize_t spectre_v2_show_state(char *buf)
> boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? "; RSB filling" : "",
> pbrsb_eibrs_state(),
> spectre_bhi_state(),
> + boot_cpu_has(X86_FEATURE_ERAPS) ? "; ERAPS hardware RSB flush" : "",
> /* this should always be at the end */
> spectre_v2_module_string());
> }
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH v2 3/3] x86: kvm: svm: add support for ERAPS and FLUSH_RAP_ON_VMRUN
2024-11-11 16:39 ` [RFC PATCH v2 3/3] x86: kvm: svm: add support for ERAPS and FLUSH_RAP_ON_VMRUN Amit Shah
@ 2024-11-11 19:02 ` Dave Hansen
0 siblings, 0 replies; 34+ messages in thread
From: Dave Hansen @ 2024-11-11 19:02 UTC (permalink / raw)
To: Amit Shah, linux-kernel, kvm, x86, linux-doc
Cc: amit.shah, thomas.lendacky, bp, tglx, peterz, jpoimboe,
pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa, seanjc,
pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw, andrew.cooper3
On 11/11/24 08:39, Amit Shah wrote:
...
> For shadow paging guests: do not expose this feature to guests; only
> expose if nested paging is enabled, to ensure context switches within
> guests trigger TLB flushes on the CPU
Just a note that the need for "context switch" disambiguation is
pervasive across this series. Please clarify the wording everywhere.
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-11 16:39 ` [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD Amit Shah
2024-11-11 17:01 ` Dave Hansen
@ 2024-11-11 19:33 ` Josh Poimboeuf
2024-11-11 19:58 ` Kaplan, David
2024-11-12 0:29 ` Andrew Cooper
1 sibling, 2 replies; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-11 19:33 UTC (permalink / raw)
To: Amit Shah
Cc: linux-kernel, kvm, x86, linux-doc, amit.shah, thomas.lendacky, bp,
tglx, peterz, pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa,
seanjc, pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw, andrew.cooper3
On Mon, Nov 11, 2024 at 05:39:11PM +0100, Amit Shah wrote:
> From: Amit Shah <amit.shah@amd.com>
>
> AMD CPUs do not fall back to the BTB when the RSB underflows for RET
> address speculation. AMD CPUs have not needed to stuff the RSB for
> underflow conditions.
>
> The RSB poisoning case is addressed by RSB filling - clean up the FIXME
> comment about it.
I'm thinking the comments need more clarification in light of BTC and
SRSO.
This:
> - * AMD has it even worse: *all* returns are speculated from the BTB,
> - * regardless of the state of the RSB.
is still true (mostly: "all" should be "some"), though it doesn't belong
in the "RSB underflow" section.
Also the RSB stuffing not only mitigates RET, it mitigates any other
instruction which happens to be predicted as a RET. Which is presumably
why it's still needed even when SRSO is enabled.
Something like below?
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..e95d3aa14259 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1828,9 +1828,6 @@ static void __init spectre_v2_select_mitigation(void)
* speculated return targets may come from the branch predictor,
* which could have a user-poisoned BTB or BHB entry.
*
- * AMD has it even worse: *all* returns are speculated from the BTB,
- * regardless of the state of the RSB.
- *
* When IBRS or eIBRS is enabled, the "user -> kernel" attack
* scenario is mitigated by the IBRS branch prediction isolation
* properties, so the RSB buffer filling wouldn't be necessary to
@@ -1850,10 +1847,22 @@ static void __init spectre_v2_select_mitigation(void)
* The "user -> user" scenario, also known as SpectreBHB, requires
* RSB clearing.
*
+ * AMD Branch Type Confusion (aka "AMD retbleed") adds some
+ * additional wrinkles:
+ *
+ * - A RET can be mispredicted as a direct or indirect branch,
+ * causing the CPU to speculatively branch to a BTB target, in
+ * which case the RSB filling obviously doesn't help. That case
+ * is mitigated by removing all the RETs (SRSO mitigation).
+ *
+ * - The RSB is not only used for architectural RET instructions,
+ * it may also be used for other instructions which happen to
+ * get mispredicted as RETs. Therefore RSB filling is still
+ * needed even when the RETs have all been removed by the SRSO
+ * mitigation.
+ *
* So to mitigate all cases, unconditionally fill RSB on context
* switches.
- *
- * FIXME: Is this pointless for retbleed-affected AMD?
*/
setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
* RE: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-11 19:33 ` Josh Poimboeuf
@ 2024-11-11 19:58 ` Kaplan, David
2024-11-11 20:39 ` Josh Poimboeuf
2024-11-12 0:29 ` Andrew Cooper
1 sibling, 1 reply; 34+ messages in thread
From: Kaplan, David @ 2024-11-11 19:58 UTC (permalink / raw)
To: Josh Poimboeuf, Amit Shah
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, x86@kernel.org,
linux-doc@vger.kernel.org, Shah, Amit, Lendacky, Thomas,
bp@alien8.de, tglx@linutronix.de, peterz@infradead.org,
pawan.kumar.gupta@linux.intel.com, corbet@lwn.net,
mingo@redhat.com, dave.hansen@linux.intel.com, hpa@zytor.com,
seanjc@google.com, pbonzini@redhat.com,
daniel.sneddon@linux.intel.com, kai.huang@intel.com,
Das1, Sandipan, boris.ostrovsky@oracle.com, Moger, Babu,
dwmw@amazon.co.uk, andrew.cooper3@citrix.com
> On Mon, Nov 11, 2024 at 05:39:11PM +0100, Amit Shah wrote:
> > From: Amit Shah <amit.shah@amd.com>
> >
> > AMD CPUs do not fall back to the BTB when the RSB underflows for RET
> > address speculation. AMD CPUs have not needed to stuff the RSB for
> > underflow conditions.
> >
> > The RSB poisoning case is addressed by RSB filling - clean up the
> > FIXME comment about it.
>
> I'm thinking the comments need more clarification in light of BTC and SRSO.
>
> This:
>
> > - * AMD has it even worse: *all* returns are speculated from the BTB,
> > - * regardless of the state of the RSB.
>
> is still true (mostly: "all" should be "some"), though it doesn't belong in the "RSB
> underflow" section.
>
> Also the RSB stuffing not only mitigates RET, it mitigates any other instruction
> which happen to be predicted as a RET. Which is presumably why it's still needed
> even when SRSO is enabled.
>
While that's partly true, I'm not sure I'm understanding which scenario you're concerned with. As noted in the AMD BTC whitepaper, there are various types of potential mis-speculation depending on what the actual branch is. The 'late redirect' ones are the most concerning since those have enough of a speculation window to be able to do a load-op-load gadget.
The only 'late redirect' case involving an instruction being incorrectly predicted as a RET is when the actual instruction is an indirect JMP/CALL. But those instructions are either removed entirely (due to retpoline) or protected with IBRS. The cases of other instructions (like Jcc) being predicted as a RET are subject to the 'early redirect' behavior, which is much more limited (and note that these can also be predicted as indirect branches, for which there is no special protection).
So I'm not sure why BTC specifically would necessitate needing to stuff the RSB here.
Also, BTC only affects some AMD parts (not Family 19h and later for instance).
--David Kaplan
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-11 19:58 ` Kaplan, David
@ 2024-11-11 20:39 ` Josh Poimboeuf
2024-11-11 20:40 ` Josh Poimboeuf
0 siblings, 1 reply; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-11 20:39 UTC (permalink / raw)
To: Kaplan, David
Cc: Amit Shah, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
x86@kernel.org, linux-doc@vger.kernel.org, Shah, Amit,
Lendacky, Thomas, bp@alien8.de, tglx@linutronix.de,
peterz@infradead.org, pawan.kumar.gupta@linux.intel.com,
corbet@lwn.net, mingo@redhat.com, dave.hansen@linux.intel.com,
hpa@zytor.com, seanjc@google.com, pbonzini@redhat.com,
daniel.sneddon@linux.intel.com, kai.huang@intel.com,
Das1, Sandipan, boris.ostrovsky@oracle.com, Moger, Babu,
dwmw@amazon.co.uk, andrew.cooper3@citrix.com
On Mon, Nov 11, 2024 at 07:58:03PM +0000, Kaplan, David wrote:
> > I'm thinking the comments need more clarification in light of BTC and SRSO.
> >
> > This:
> >
> > > - * AMD has it even worse: *all* returns are speculated from the BTB,
> > > - * regardless of the state of the RSB.
> >
> > is still true (mostly: "all" should be "some"), though it doesn't belong in the "RSB
> > underflow" section.
> >
> > Also the RSB stuffing not only mitigates RET, it mitigates any other instruction
> > which happen to be predicted as a RET. Which is presumably why it's still needed
> > even when SRSO is enabled.
> >
>
> While that's partly true, I'm not sure I'm understanding which
> scenario you're concerned with. As noted in the AMD BTC whitepaper,
> there are various types of potential mis-speculation depending on what
> the actual branch is. The 'late redirect' ones are the most concerning
> since those have enough of a speculation window to be able to do a
> load-op-load gadget. The only 'late redirect' case involving an
> instruction being incorrectly predicted as a RET is when the actual
> instruction is an indirect JMP/CALL. But those instructions are
> either removed entirely (due to retpoline) or being protected with
> IBRS. The cases of other instructions (like Jcc) being predicted as a
> RET are subject to the 'early redirect' behavior which is much more
> limited (and note that these can also be predicted as indirect
> branches for which there is no special protection). So I'm not sure
> why BTC specifically would necessitate needing to stuff the RSB here.
>
> Also, BTC only affects some AMD parts (not Family 19h and later for
> instance).
This is why it's important to spell out all the different cases in the
comments. I was attempting to document the justifications for the
existing behavior.
You make some good points, though backing up a bit, I realize my comment
was flawed for another reason: the return thunks only protect the
kernel, but RSB filling on context switch is meant to protect user
space.
So, never mind...
--
Josh
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-11 20:39 ` Josh Poimboeuf
@ 2024-11-11 20:40 ` Josh Poimboeuf
2024-11-12 0:30 ` Josh Poimboeuf
0 siblings, 1 reply; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-11 20:40 UTC (permalink / raw)
To: Kaplan, David
Cc: Amit Shah, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
x86@kernel.org, linux-doc@vger.kernel.org, Shah, Amit,
Lendacky, Thomas, bp@alien8.de, tglx@linutronix.de,
peterz@infradead.org, pawan.kumar.gupta@linux.intel.com,
corbet@lwn.net, mingo@redhat.com, dave.hansen@linux.intel.com,
hpa@zytor.com, seanjc@google.com, pbonzini@redhat.com,
daniel.sneddon@linux.intel.com, kai.huang@intel.com,
Das1, Sandipan, boris.ostrovsky@oracle.com, Moger, Babu,
dwmw@amazon.co.uk, andrew.cooper3@citrix.com
On Mon, Nov 11, 2024 at 12:39:09PM -0800, Josh Poimboeuf wrote:
> This is why it's important to spell out all the different cases in the
> comments. I was attempting to document the justifications for the
> existing behavior.
>
> You make some good points, though backing up a bit, I realize my comment
> was flawed for another reason: the return thunks only protect the
> kernel, but RSB filling on context switch is meant to protect user
> space.
>
> So, never mind...
That said, I still think the comments need an update. I'll try to come
up with something later.
--
Josh
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-11 19:33 ` Josh Poimboeuf
2024-11-11 19:58 ` Kaplan, David
@ 2024-11-12 0:29 ` Andrew Cooper
2024-11-12 1:46 ` Josh Poimboeuf
1 sibling, 1 reply; 34+ messages in thread
From: Andrew Cooper @ 2024-11-12 0:29 UTC (permalink / raw)
To: Josh Poimboeuf, Amit Shah
Cc: linux-kernel, kvm, x86, linux-doc, amit.shah, thomas.lendacky, bp,
tglx, peterz, pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa,
seanjc, pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On 11/11/2024 7:33 pm, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 05:39:11PM +0100, Amit Shah wrote:
>> From: Amit Shah <amit.shah@amd.com>
>>
>> AMD CPUs do not fall back to the BTB when the RSB underflows for RET
>> address speculation. AMD CPUs have not needed to stuff the RSB for
>> underflow conditions.
>>
>> The RSB poisoning case is addressed by RSB filling - clean up the FIXME
>> comment about it.
> I'm thinking the comments need more clarification in light of BTC and
> SRSO.
>
> This:
>
>> - * AMD has it even worse: *all* returns are speculated from the BTB,
>> - * regardless of the state of the RSB.
> is still true (mostly: "all" should be "some"), though it doesn't belong
> in the "RSB underflow" section.
>
> Also the RSB stuffing not only mitigates RET, it mitigates any other
> instruction which happen to be predicted as a RET. Which is presumably
> why it's still needed even when SRSO is enabled.
>
> Something like below?
>
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 47a01d4028f6..e95d3aa14259 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1828,9 +1828,6 @@ static void __init spectre_v2_select_mitigation(void)
> * speculated return targets may come from the branch predictor,
> * which could have a user-poisoned BTB or BHB entry.
> *
> - * AMD has it even worse: *all* returns are speculated from the BTB,
> - * regardless of the state of the RSB.
> - *
> * When IBRS or eIBRS is enabled, the "user -> kernel" attack
> * scenario is mitigated by the IBRS branch prediction isolation
> * properties, so the RSB buffer filling wouldn't be necessary to
> @@ -1850,10 +1847,22 @@ static void __init spectre_v2_select_mitigation(void)
> * The "user -> user" scenario, also known as SpectreBHB, requires
> * RSB clearing.
> *
> + * AMD Branch Type Confusion (aka "AMD retbleed") adds some
> + * additional wrinkles:
> + *
> + * - A RET can be mispredicted as a direct or indirect branch,
> + * causing the CPU to speculatively branch to a BTB target, in
> + * which case the RSB filling obviously doesn't help. That case
> + * is mitigated by removing all the RETs (SRSO mitigation).
> + *
> + * - The RSB is not only used for architectural RET instructions,
> + * it may also be used for other instructions which happen to
> + * get mispredicted as RETs. Therefore RSB filling is still
> + * needed even when the RETs have all been removed by the SRSO
> + * mitigation.
This is my take. On AMD CPUs, there are two unrelated issues to take
into account:
1) SRSO
Affects anything which doesn't enumerate SRSO_NO, which is all parts to
date including Zen5.
SRSO ends up overflowing the RAS with arbitrary BTB targets, such that a
subsequent genuine RET follows a prediction which never came from a real
CALL instruction.
Mitigations for SRSO are either safe-ret or IBPB-on-entry. Parts using
IBPB-on-entry without IBPB_RET need to manually flush the RAS.
Importantly, SMEP does not protect you against SRSO across the
user->kernel boundary, because the bad RAS entries are arbitrary. New
in Zen5 is the SRSO_U/S_NO bit which says this case can't occur any
more. So on Zen5, you can in principle get away without a RAS flush on
entry.
2) BTC
Affects anything which doesn't enumerate BTC_NO, which is Zen2 and older
(Fam17h for AMD, Fam18h for Hygon).
Attacker can forge any branch type prediction, and the most dangerous
one is RET-mispredicted-as-INDIRECT. This causes a genuine RET
instruction to follow a prediction that was believed to be an indirect
branch.
All CPUs which suffer BTC also suffer SRSO, so while jmp2ret is a
mitigation for BTC, its utility became 0 when SRSO was discovered.
(Which is a shame, because it's equal parts beautiful and terrifying.)
Mitigations for BTC are therefore safe-ret or IBPB-on-entry.
Flushing the RAS has no effect on BTC, because the whole problem with
BTC is that the prediction comes from the "wrong" predictor, but you
need to do it for other reasons.
~Andrew
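[The enumeration rules laid out above can be condensed into a small
decision sketch. This is illustrative only: the booleans stand in for
the CPUID bits named in the mail (SRSO_NO, IBPB_RET, BTC_NO) rather
than real kernel feature flags, and the policy merely transcribes the
reasoning here, not a verified implementation.]

```c
#include <assert.h>
#include <stdbool.h>

/* SRSO affects anything that does not enumerate SRSO_NO. */
static bool srso_affected(bool srso_no)
{
	return !srso_no;
}

/*
 * Per the mail above: a part mitigating SRSO via IBPB-on-entry that
 * lacks IBPB_RET must flush the RAS manually.
 */
static bool need_manual_ras_flush(bool srso_no, bool ibpb_on_entry,
				  bool ibpb_ret)
{
	return srso_affected(srso_no) && ibpb_on_entry && !ibpb_ret;
}

/* BTC affects anything that does not enumerate BTC_NO. */
static bool btc_affected(bool btc_no)
{
	return !btc_no;
}
```

A RAS flush is deliberately absent from the BTC helper: as noted
above, flushing the RAS has no effect on BTC itself.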
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-11 20:40 ` Josh Poimboeuf
@ 2024-11-12 0:30 ` Josh Poimboeuf
0 siblings, 0 replies; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-12 0:30 UTC (permalink / raw)
To: Kaplan, David
Cc: Amit Shah, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
x86@kernel.org, linux-doc@vger.kernel.org, Shah, Amit,
Lendacky, Thomas, bp@alien8.de, tglx@linutronix.de,
peterz@infradead.org, pawan.kumar.gupta@linux.intel.com,
corbet@lwn.net, mingo@redhat.com, dave.hansen@linux.intel.com,
hpa@zytor.com, seanjc@google.com, pbonzini@redhat.com,
daniel.sneddon@linux.intel.com, kai.huang@intel.com,
Das1, Sandipan, boris.ostrovsky@oracle.com, Moger, Babu,
dwmw@amazon.co.uk, andrew.cooper3@citrix.com
On Mon, Nov 11, 2024 at 12:40:41PM -0800, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 12:39:09PM -0800, Josh Poimboeuf wrote:
> > This is why it's important to spell out all the different cases in the
> > comments. I was attempting to document the justifications for the
> > existing behavior.
> >
> > You make some good points, though backing up a bit, I realize my comment
> > was flawed for another reason: the return thunks only protect the
> > kernel, but RSB filling on context switch is meant to protect user
> > space.
> >
> > So, never mind...
>
> That said, I still think the comments need an update. I'll try to come
> up with something later.
Here are some clarifications to the comments. Amit, feel free to
include this in your next revision.
----8<----
From: Josh Poimboeuf <jpoimboe@kernel.org>
Subject: [PATCH] x86/bugs: Update insanely long comment about RSB attacks
The long comment above the setting of X86_FEATURE_RSB_CTXSW is a bit
confusing. It starts out being about context switching specifically,
but then goes on to describe "user -> kernel" mitigations, which aren't
necessarily limited to context switches.
Clarify that it's about *all* RSB attacks and their mitigations.
For consistency, add the "guest -> host" mitigations as well. Then the
comment above spectre_v2_determine_rsb_fill_type_at_vmexit() can be
removed and the overall line count is reduced.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/kernel/cpu/bugs.c | 59 ++++++++++++--------------------------
1 file changed, 19 insertions(+), 40 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..fbdfa151b7a9 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1581,26 +1581,6 @@ static void __init spec_ctrl_disable_kernel_rrsba(void)
static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
{
- /*
- * Similar to context switches, there are two types of RSB attacks
- * after VM exit:
- *
- * 1) RSB underflow
- *
- * 2) Poisoned RSB entry
- *
- * When retpoline is enabled, both are mitigated by filling/clearing
- * the RSB.
- *
- * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
- * prediction isolation protections, RSB still needs to be cleared
- * because of #2. Note that SMEP provides no protection here, unlike
- * user-space-poisoned RSB entries.
- *
- * eIBRS should protect against RSB poisoning, but if the EIBRS_PBRSB
- * bug is present then a LITE version of RSB protection is required,
- * just a single call needs to retire before a RET is executed.
- */
switch (mode) {
case SPECTRE_V2_NONE:
return;
@@ -1818,43 +1798,42 @@ static void __init spectre_v2_select_mitigation(void)
pr_info("%s\n", spectre_v2_strings[mode]);
/*
- * If Spectre v2 protection has been enabled, fill the RSB during a
- * context switch. In general there are two types of RSB attacks
- * across context switches, for which the CALLs/RETs may be unbalanced.
+ * In general there are two types of RSB attacks:
*
- * 1) RSB underflow
+ * 1) RSB underflow ("Intel Retbleed")
*
* Some Intel parts have "bottomless RSB". When the RSB is empty,
* speculated return targets may come from the branch predictor,
* which could have a user-poisoned BTB or BHB entry.
*
- * AMD has it even worse: *all* returns are speculated from the BTB,
- * regardless of the state of the RSB.
+ * When IBRS or eIBRS is enabled, the "user -> kernel" attack is
+ * mitigated by the IBRS branch prediction isolation properties, so
+ * the RSB buffer filling wouldn't be necessary to protect against
+ * this type of attack.
*
- * When IBRS or eIBRS is enabled, the "user -> kernel" attack
- * scenario is mitigated by the IBRS branch prediction isolation
- * properties, so the RSB buffer filling wouldn't be necessary to
- * protect against this type of attack.
+ * The "user -> user" attack is mitigated by RSB filling on context
+ * switch.
*
- * The "user -> user" attack scenario is mitigated by RSB filling.
+ * The "guest -> host" attack is mitigated by IBRS or eIBRS.
*
* 2) Poisoned RSB entry
*
* If the 'next' in-kernel return stack is shorter than 'prev',
* 'next' could be tricked into speculating with a user-poisoned RSB
- * entry.
+ * entry. Speculative Type Confusion ("AMD retbleed") can also
+ * create poisoned RSB entries.
*
- * The "user -> kernel" attack scenario is mitigated by SMEP and
- * eIBRS.
+ * The "user -> kernel" attack is mitigated by SMEP and eIBRS.
*
- * The "user -> user" scenario, also known as SpectreBHB, requires
- * RSB clearing.
+ * The "user -> user" attack, also known as SpectreBHB, requires RSB
+ * clearing.
*
- * So to mitigate all cases, unconditionally fill RSB on context
- * switches.
- *
- * FIXME: Is this pointless for retbleed-affected AMD?
+ * The "guest -> host" attack is mitigated by eIBRS (not IBRS!) or
+ * RSB clearing on vmexit. Note that eIBRS implementations with
+ * X86_BUG_EIBRS_PBRSB still need "lite" RSB clearing which retires
+ * a single CALL before the first RET.
*/
+
setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
--
2.47.0
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-12 0:29 ` Andrew Cooper
@ 2024-11-12 1:46 ` Josh Poimboeuf
2024-11-12 11:58 ` Borislav Petkov
` (2 more replies)
0 siblings, 3 replies; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-12 1:46 UTC (permalink / raw)
To: Andrew Cooper
Cc: Amit Shah, linux-kernel, kvm, x86, linux-doc, amit.shah,
thomas.lendacky, bp, tglx, peterz, pawan.kumar.gupta, corbet,
mingo, dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon,
kai.huang, sandipan.das, boris.ostrovsky, Babu.Moger,
david.kaplan, dwmw
On Tue, Nov 12, 2024 at 12:29:28AM +0000, Andrew Cooper wrote:
> This is my take. On AMD CPUs, there are two unrelated issues to take
> into account:
>
> 1) SRSO
>
> Affects anything which doesn't enumerate SRSO_NO, which is all parts to
> date including Zen5.
>
> SRSO ends up overflowing the RAS with arbitrary BTB targets, such that a
> subsequent genuine RET follows a prediction which never came from a real
> CALL instruction.
>
> Mitigations for SRSO are either safe-ret, or IBPB-on-entry. Parts
> without IBPB_RET using IBPB-on-entry need to manually flush the RAS.
>
> Importantly, SMEP does not protection you against SRSO across the
> user->kernel boundary, because the bad RAS entries are arbitrary. New
> in Zen5 is the SRSO_U/S_NO bit which says this case can't occur any
> more. So on Zen5, you can in principle get away without a RAS flush on
> entry.
Updated to mention SRSO:
/*
* In general there are two types of RSB attacks:
*
* 1) RSB underflow ("Intel Retbleed")
*
* Some Intel parts have "bottomless RSB". When the RSB is empty,
* speculated return targets may come from the branch predictor,
* which could have a user-poisoned BTB or BHB entry.
*
* When IBRS or eIBRS is enabled, the "user -> kernel" attack is
* mitigated by the IBRS branch prediction isolation properties, so
* the RSB buffer filling wouldn't be necessary to protect against
* this type of attack.
*
* The "user -> user" attack is mitigated by RSB filling on context
* switch.
*
* The "guest -> host" attack is mitigated by IBRS or eIBRS.
*
* 2) Poisoned RSB entry
*
* If the 'next' in-kernel return stack is shorter than 'prev',
* 'next' could be tricked into speculating with a user-poisoned RSB
* entry. Poisoned RSB entries can also be created by Branch Type
* Confusion ("AMD retbleed") or SRSO.
*
* The "user -> kernel" attack is mitigated by SMEP and eIBRS. AMD
* without SRSO_NO also needs the SRSO mitigation.
*
* The "user -> user" attack, also known as SpectreBHB, requires RSB
* clearing.
*
* The "guest -> host" attack is mitigated by either eIBRS (not
* IBRS!) or RSB clearing on vmexit. Note that eIBRS
* implementations with X86_BUG_EIBRS_PBRSB still need "lite" RSB
* clearing which retires a single CALL before the first RET.
*/
---8<---
From: Josh Poimboeuf <jpoimboe@kernel.org>
Subject: [PATCH] x86/bugs: Update insanely long comment about RSB attacks
The long comment above the setting of X86_FEATURE_RSB_CTXSW is a bit
confusing. It starts out being about context switching specifically,
but then goes on to describe "user -> kernel" mitigations, which aren't
necessarily limited to context switches.
Clarify that it's about *all* RSB attacks and their mitigations.
For consistency, add the "guest -> host" mitigations as well. Then the
comment above spectre_v2_determine_rsb_fill_type_at_vmexit() can be
removed and the overall line count is reduced.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/kernel/cpu/bugs.c | 60 +++++++++++++-------------------------
1 file changed, 20 insertions(+), 40 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..3dd1e504d706 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1581,26 +1581,6 @@ static void __init spec_ctrl_disable_kernel_rrsba(void)
static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
{
- /*
- * Similar to context switches, there are two types of RSB attacks
- * after VM exit:
- *
- * 1) RSB underflow
- *
- * 2) Poisoned RSB entry
- *
- * When retpoline is enabled, both are mitigated by filling/clearing
- * the RSB.
- *
- * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
- * prediction isolation protections, RSB still needs to be cleared
- * because of #2. Note that SMEP provides no protection here, unlike
- * user-space-poisoned RSB entries.
- *
- * eIBRS should protect against RSB poisoning, but if the EIBRS_PBRSB
- * bug is present then a LITE version of RSB protection is required,
- * just a single call needs to retire before a RET is executed.
- */
switch (mode) {
case SPECTRE_V2_NONE:
return;
@@ -1818,43 +1798,43 @@ static void __init spectre_v2_select_mitigation(void)
pr_info("%s\n", spectre_v2_strings[mode]);
/*
- * If Spectre v2 protection has been enabled, fill the RSB during a
- * context switch. In general there are two types of RSB attacks
- * across context switches, for which the CALLs/RETs may be unbalanced.
+ * In general there are two types of RSB attacks:
*
- * 1) RSB underflow
+ * 1) RSB underflow ("Intel Retbleed")
*
* Some Intel parts have "bottomless RSB". When the RSB is empty,
* speculated return targets may come from the branch predictor,
* which could have a user-poisoned BTB or BHB entry.
*
- * AMD has it even worse: *all* returns are speculated from the BTB,
- * regardless of the state of the RSB.
+ * When IBRS or eIBRS is enabled, the "user -> kernel" attack is
+ * mitigated by the IBRS branch prediction isolation properties, so
+ * the RSB buffer filling wouldn't be necessary to protect against
+ * this type of attack.
*
- * When IBRS or eIBRS is enabled, the "user -> kernel" attack
- * scenario is mitigated by the IBRS branch prediction isolation
- * properties, so the RSB buffer filling wouldn't be necessary to
- * protect against this type of attack.
+ * The "user -> user" attack is mitigated by RSB filling on context
+ * switch.
*
- * The "user -> user" attack scenario is mitigated by RSB filling.
+ * The "guest -> host" attack is mitigated by IBRS or eIBRS.
*
* 2) Poisoned RSB entry
*
* If the 'next' in-kernel return stack is shorter than 'prev',
* 'next' could be tricked into speculating with a user-poisoned RSB
- * entry.
+ * entry. Poisoned RSB entries can also be created by Branch Type
+ * Confusion ("AMD retbleed") or SRSO.
*
- * The "user -> kernel" attack scenario is mitigated by SMEP and
- * eIBRS.
+ * The "user -> kernel" attack is mitigated by SMEP and eIBRS. AMD
+ * without SRSO_NO also needs the SRSO mitigation.
*
- * The "user -> user" scenario, also known as SpectreBHB, requires
- * RSB clearing.
+ * The "user -> user" attack, also known as SpectreBHB, requires RSB
+ * clearing.
*
- * So to mitigate all cases, unconditionally fill RSB on context
- * switches.
- *
- * FIXME: Is this pointless for retbleed-affected AMD?
+ * The "guest -> host" attack is mitigated by either eIBRS (not
+ * IBRS!) or RSB clearing on vmexit. Note that eIBRS
+ * implementations with X86_BUG_EIBRS_PBRSB still need "lite" RSB
+ * clearing which retires a single CALL before the first RET.
*/
+
setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
--
2.47.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-12 1:46 ` Josh Poimboeuf
@ 2024-11-12 11:58 ` Borislav Petkov
2024-11-13 21:24 ` Josh Poimboeuf
2024-11-12 15:16 ` Shah, Amit
2024-11-12 21:43 ` Pawan Gupta
2 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2024-11-12 11:58 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, tglx, peterz, pawan.kumar.gupta,
corbet, mingo, dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon,
kai.huang, sandipan.das, boris.ostrovsky, Babu.Moger,
david.kaplan, dwmw
On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> Subject: [PATCH] x86/bugs: Update insanely long comment about RSB attacks
Why don't you stick this insanely long comment in
Documentation/admin-guide/hw-vuln/ while at it?
Its place is hardly in the code. You can point to it from the code tho...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-12 1:46 ` Josh Poimboeuf
2024-11-12 11:58 ` Borislav Petkov
@ 2024-11-12 15:16 ` Shah, Amit
2024-11-12 21:43 ` Pawan Gupta
2 siblings, 0 replies; 34+ messages in thread
From: Shah, Amit @ 2024-11-12 15:16 UTC (permalink / raw)
To: jpoimboe@kernel.org, amit@kernel.org, andrew.cooper3@citrix.com
Cc: corbet@lwn.net, pawan.kumar.gupta@linux.intel.com,
kai.huang@intel.com, kvm@vger.kernel.org,
dave.hansen@linux.intel.com, daniel.sneddon@linux.intel.com,
Lendacky, Thomas, boris.ostrovsky@oracle.com,
linux-kernel@vger.kernel.org, seanjc@google.com, mingo@redhat.com,
pbonzini@redhat.com, tglx@linutronix.de, Moger, Babu,
Das1, Sandipan, linux-doc@vger.kernel.org, dwmw@amazon.co.uk,
hpa@zytor.com, peterz@infradead.org, bp@alien8.de, Kaplan, David,
x86@kernel.org
On Mon, 2024-11-11 at 17:46 -0800, Josh Poimboeuf wrote:
> On Tue, Nov 12, 2024 at 12:29:28AM +0000, Andrew Cooper wrote:
> > This is my take. On AMD CPUs, there are two unrelated issues to take
> > into account:
> >
> > 1) SRSO
> >
> > Affects anything which doesn't enumerate SRSO_NO, which is all parts
> > to date including Zen5.
> >
> > SRSO ends up overflowing the RAS with arbitrary BTB targets, such
> > that a subsequent genuine RET follows a prediction which never came
> > from a real CALL instruction.
> >
> > Mitigations for SRSO are either safe-ret, or IBPB-on-entry. Parts
> > without IBPB_RET using IBPB-on-entry need to manually flush the RAS.
> >
> > Importantly, SMEP does not protect you against SRSO across the
> > user->kernel boundary, because the bad RAS entries are arbitrary. New
> > in Zen5 is the SRSO_U/S_NO bit which says this case can't occur any
> > more. So on Zen5, you can in principle get away without a RAS flush
> > on entry.
>
> Updated to mention SRSO:
>
> /*
>  * In general there are two types of RSB attacks:
>  *
>  * 1) RSB underflow ("Intel Retbleed")
>  *
>  * Some Intel parts have "bottomless RSB". When the RSB is empty,
>  * speculated return targets may come from the branch predictor,
>  * which could have a user-poisoned BTB or BHB entry.
>  *
>  * When IBRS or eIBRS is enabled, the "user -> kernel" attack is
>  * mitigated by the IBRS branch prediction isolation properties, so
>  * the RSB buffer filling wouldn't be necessary to protect against
>  * this type of attack.
>  *
>  * The "user -> user" attack is mitigated by RSB filling on context
>  * switch.
>  *
>  * The "guest -> host" attack is mitigated by IBRS or eIBRS.
>  *
>  * 2) Poisoned RSB entry
>  *
>  * If the 'next' in-kernel return stack is shorter than 'prev',
>  * 'next' could be tricked into speculating with a user-poisoned RSB
>  * entry. Poisoned RSB entries can also be created by Branch Type
>  * Confusion ("AMD retbleed") or SRSO.
>  *
>  * The "user -> kernel" attack is mitigated by SMEP and eIBRS. AMD
>  * without SRSO_NO also needs the SRSO mitigation.
>  *
>  * The "user -> user" attack, also known as SpectreBHB, requires RSB
>  * clearing.
>  *
>  * The "guest -> host" attack is mitigated by either eIBRS (not
>  * IBRS!) or RSB clearing on vmexit. Note that eIBRS
>  * implementations with X86_BUG_EIBRS_PBRSB still need "lite" RSB
>  * clearing which retires a single CALL before the first RET.
>  */
Thanks, Josh and Andrew. This reads well to me. In the context of
ERAPS, I'll end up adding a couple more sentences there as well.
Amit
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-12 1:46 ` Josh Poimboeuf
2024-11-12 11:58 ` Borislav Petkov
2024-11-12 15:16 ` Shah, Amit
@ 2024-11-12 21:43 ` Pawan Gupta
2024-11-13 20:21 ` Josh Poimboeuf
2 siblings, 1 reply; 34+ messages in thread
From: Pawan Gupta @ 2024-11-12 21:43 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, bp, tglx, peterz, corbet, mingo,
dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon, kai.huang,
sandipan.das, boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> + * 1) RSB underflow ("Intel Retbleed")
> *
> * Some Intel parts have "bottomless RSB". When the RSB is empty,
> * speculated return targets may come from the branch predictor,
> * which could have a user-poisoned BTB or BHB entry.
> *
> - * AMD has it even worse: *all* returns are speculated from the BTB,
> - * regardless of the state of the RSB.
> + * When IBRS or eIBRS is enabled, the "user -> kernel" attack is
> + * mitigated by the IBRS branch prediction isolation properties, so
> + * the RSB buffer filling wouldn't be necessary to protect against
> + * this type of attack.
> *
> - * When IBRS or eIBRS is enabled, the "user -> kernel" attack
> - * scenario is mitigated by the IBRS branch prediction isolation
> - * properties, so the RSB buffer filling wouldn't be necessary to
> - * protect against this type of attack.
> + * The "user -> user" attack is mitigated by RSB filling on context
> + * switch.
user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
unnecessary when an IBPB is issued. Also, when an application has not
opted in to IBPB at context switch, spectre-v2 for that app is not
mitigated, so filling the RSB is only a half measure in that case.
Is RSB filling really serving any purpose for userspace?
* Re: [RFC PATCH v2 2/3] x86: cpu/bugs: add support for AMD ERAPS feature
2024-11-11 18:57 ` Dave Hansen
@ 2024-11-13 10:33 ` Shah, Amit
0 siblings, 0 replies; 34+ messages in thread
From: Shah, Amit @ 2024-11-13 10:33 UTC (permalink / raw)
To: kvm@vger.kernel.org, dave.hansen@intel.com,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
x86@kernel.org
Cc: corbet@lwn.net, pawan.kumar.gupta@linux.intel.com,
kai.huang@intel.com, jpoimboe@kernel.org, amit@kernel.org,
dave.hansen@linux.intel.com, daniel.sneddon@linux.intel.com,
Lendacky, Thomas, boris.ostrovsky@oracle.com, seanjc@google.com,
mingo@redhat.com, pbonzini@redhat.com, tglx@linutronix.de,
Moger, Babu, Das1, Sandipan, dwmw@amazon.co.uk, hpa@zytor.com,
peterz@infradead.org, andrew.cooper3@citrix.com, bp@alien8.de,
Kaplan, David
On Mon, 2024-11-11 at 10:57 -0800, Dave Hansen wrote:
> On 11/11/24 08:39, Amit Shah wrote:
> > Remove explicit RET stuffing / filling on VMEXITs and context
> > switches on AMD CPUs with the ERAPS feature (Zen5+).
>
> I'm not a super big fan of how this changelog is constructed. I
> personally really like the "background, problem, solution" form. This
> starts with the solution before even telling us what it is.
The subject line is usually the one that states the impact; in this
case I couldn't fit all of it (add a new feature + highlight that we're
getting rid of existing mitigation mechanisms). If you have ideas for a
better subject line, let me know.
> > With the Enhanced Return Address Prediction Security feature, any of
> > the following conditions triggers a flush of the RSB (aka RAP in AMD
> > manuals) in hardware:
> > * context switch (e.g., move to CR3)
>
> IMNHO, context switching is a broader topic than process-to-process
> switches. A user=>kernel thread switch is definitely a context
> switch, but doesn't involve a CR3 write. A user=>user switch in the
> same mm is also highly arguable to be a context switch and doesn't
> involve a CR3 write. There are userspace threading libraries that do
> "context switches".
>
> This is further muddled by the very comments you are editing in this
> patch like: "unconditionally fill RSB on context switches".
>
> Please be very, very precise in the language that's chosen here.
I'm not saying I disagree - I'm saying the same thing: a context
switch is broader than just a CR3 update. But I take the point you're
really making: limit this description to "CR3 update", because that
really is the case we're targeting with this feature in this patch,
along with the arguments why the other cases (like a user->kernel
switch, or a user->user switch in the same mm) are not affected by the
series.
> > * TLB flush
> > * some writes to CR4
>
> ... and only one of those is relevant for this patch. Aren't the
> others just noise?
I wouldn't say so - for the purposes of this patch, yes. But if some
other condition requires Linux to flush the RSB on a non-CR3-write
event in the future, whoever works on it will find this commit. That
leads to fewer "this is unknown, let's assume the worst" situations,
somewhat like in the comments in patch 1.
> > The feature also explicitly tags host and guest addresses in the
> > RSB -
> > eliminating the need for explicit flushing of the RSB on VMEXIT.
> >
> > [RFC note: We'll wait for the APM to be updated with the real
> > wording, but assuming it's going to say the ERAPS feature works the
> > way described above, let's continue the discussion re: when the
> > kernel currently calls FILL_RETURN_BUFFER, and what dropping it
> > entirely means.
> >
> > Dave Hansen pointed out __switch_to_asm fills the RSB each time
> > it's called, so let's address the cases there:
> >
> > 1. user->kernel switch: Switching from userspace to kernelspace, and
> > then using user-stuffed RSB entries in the kernel is not possible
> > thanks to SMEP. We can safely drop the FILL_RETURN_BUFFER call for
> > this case. In fact, this is how the original code was when dwmw2
> > added it originally in c995efd5a. So while this case currently
> > triggers an RSB flush (and will not after this ERAPS patch), the
> > current flush isn't necessary for AMD systems with SMEP anyway.
>
> This description makes me uneasy. It basically boils down to, "SMEP
> guarantees that the kernel can't consume user-placed RSB entries."
>
> It makes me uneasy because I recall that not holding true on some
> Intel CPUs. I believe some CPUs have a partial-width RSB. Kernel
> consumption of a user-placed entry would induce the kernel to
> speculate to a *kernel* address, so SMEP is rather ineffective.
>
> So, instead of just saying, "SMEP magically fixes everything, trust
> us", could we please step through a few of the properties of the
> hardware and software that make SMEP effective?
That's fair. And that's really the reason why I highlighted the first
line in the commit message: "we're really getting rid of software
mitigations here - focus on that please". Though I'll note here that
none of the questions raised in this series have come as a surprise to
me - I've put those exact questions to hardware engineers internally
and had them answered to my satisfaction, and hence this patchset.
> First, I think that you're saying that AMD hardware has a full-width,
> precise RSB. That ensures that speculation always happens back
> precisely to the original address, not some weird alias. Second,
> ERAPS guarantees that the address still has the same stuff mapped
> there. Any change to the address space involves either a CR3 write or
> a TLB flush, both of which would have flushed any user-placed content
> in the RSB.
>
> Aside: Even the kernel-only text poking mm or EFI mm would "just
> work" in this scenario since they have their own mm_structs,
> page tables and root CR3 values.
All true. David (Kaplan), please say if that's not right.
> > 2. user->user or kernel->kernel: If a user->user switch does not
> > result in a CR3 change, it's a different thread in the same process
> > context. That's the same case for a kernel->kernel switch. In this
> > case, the RSB entries are still valid in that context, just not the
> > correct ones in the new thread's context. It's difficult to imagine
> > this being a security risk. The current code clearing it, and this
> > patch not doing so for AMD-with-ERAPS, isn't a concern as far as I
> > see.]
>
> The current rules are dirt simple: if the kernel stack gets switched,
> the RSB is flushed. It's kinda hard to have a mismatched stack if it's
> never switched in the first place. That makes the current rules dirt
> simple. The problem (stack switching) is dirt simple to correlate 1:1
> with the fix (RSB stuffing).
... and it could be argued that's overkill. These are security
mitigations -- and clearing the RSB on every stack switch doesn't buy
any extra security. Perhaps we're inviting performance penalties by
doing more than what's required for security, and working against the
hardware rather than with it.
> This patch proposes separating the problem (stack switching) and the
> mitigations (implicit new microcode actions). That's a hell of a lot
> more complicated and harder to audit than the existing rules.
> Agreed?
Partially. We've got to trust the hardware and tools we've been given
- if the hardware says it behaves in a particular way on a given input
or event, we've got to trust that. I'm building on top of what's been
given.
But: I like this thought experiment. Let's continue.
> So, how are the rules relaxed?
>
> First, it opens up the case where user threads consume RSB entries
> from other threads in the same process. Threads are usually at the same
> privilege level. Instead of using a speculative attack, an attacker
> could just read the data directly. The only place I can imagine this
> even remotely being a problem would be if threads were relying on
> protection keys to keep secrets from each other.
It's a valid scenario, albeit one involving really badly designed
security/crypto software. I would imagine it's difficult to exploit
even then.
> The kernel consuming RSB entries from another kernel thread seems
> less clear. I disagree with the idea that a valid entry in a given
> context is a _safe_ entry though. Spectre v2 in _general_ involves
> nasty speculation to otherwise perfectly safe code. A problematic
> scenario would be where kswapd wakes up after waiting for I/O and
> starts speculating back along the return path of the userspace thread
> that kswapd interrupted. Userspace has some level of control over both
> stacks, and it doesn't seem super far fetched to think that there
> could be some disclosure gadgets in there.
>
> In short: the user-consumes-user-RSB-entry attacks seem fundamentally
> improbable because user threads are unlikely to have secrets from
> each other.
>
> The kernel-consumes-kernel-RSB-entry attacks seem hard rather than
> fundamentally improbable. I can't even count how many times our "oh
> that attack would be too hard to pull off" optimism has gone wrong.
>
> What does that all _mean_? ERAPS sucks? Nah. Maybe we just make sure
> that the existing spectre_v2=whatever controls can be used to stop
> relying on it if asked.
Hm, not yet convinced - but documenting this seems useful. I can stick
this email verbatim in Documentation/ for the future, if you don't
mind?
Beyond that - there's the "mov cr3, cr3" we could plug into
__switch_to_asm when ERAPS is active. I don't like it - but at least
that's 1:1.
> > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> > index 96b410b1d4e8..f5ee7fc71db5 100644
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -117,6 +117,18 @@
> >  * We define a CPP macro such that it can be used from both .S files and
> >  * inline assembly. It's possible to do a .macro and then include that
> >  * from C via asm(".include <asm/nospec-branch.h>") but let's not go there.
> > + *
> > + * AMD CPUs with the ERAPS feature may have a larger default RSB. These CPUs
> > + * use the default number of entries on a host, and can optionally (based on
> > + * hypervisor setup) use 32 (old) or the new default in a guest. The number
> > + * of default entries is reflected in CPUID 8000_0021:EBX[23:16].
> > + *
> > + * With the ERAPS feature, RSB filling is not necessary anymore: the RSB is
> > + * auto-cleared by hardware on context switches, TLB flushes, or some CR4
> > + * writes. Adapting the value of RSB_CLEAR_LOOPS below for ERAPS would change
> > + * it to a runtime variable instead of the current compile-time constant, so
> > + * leave it as-is, as this works for both older CPUs, as well as newer ones
> > + * with ERAPS.
> >  */
>
> This comment feels out of place and noisy to me. Why the heck would
> we need to document the RSB size enumeration here and not use it?
>
> Then this goes on to explain why this patch *didn't* bother to change
> RSB_CLEAR_LOOPS. That seems much more like changelog material than a
> code comment to me.
>
> To me, RSB_CLEAR_LOOPS is for, well, clearing the RSB. The existing
> comments say that some CPUs don't need to clear the RSB. I don't
> think we need further explanation of why one specific CPU doesn't
> need to clear the RSB. The new CPU isn't special.
>
> Something slightly more useful would be to actually read CPUID
> 8000_0021:EBX[23:16] and compare it to RSB_CLEAR_LOOPS:
>
> void amd_check_rsb_clearing(void)
> {
> /*
> * Systems with both these features off
> * do not perform RSB clearing:
> */
> if (!boot_cpu_has(X86_FEATURE_RSB_CTXSW) &&
> !boot_cpu_has(X86_FEATURE_RSB_VMEXIT))
> return;
>
> // with actual helpers and mask generation, not this mess:
> rsb_size = (cpuid_ebx(0x80000021) >> 16) & 0xff;
>
> WARN_ON_ONCE(rsb_size > RSB_CLEAR_LOOPS);
> }
>
> That's almost shorter than the proposed comment.
That's a good idea.
The comment-vs-commit-log choice is because the commit message would
otherwise live in a different file than the #define, and people reading
the code can miss it entirely. I just find grep+read+git workflows much
better for my archaeology.
> > #define RETPOLINE_THUNK_SIZE 32
> > diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> > index 0aa629b5537d..02446815b0de 100644
> > --- a/arch/x86/kernel/cpu/bugs.c
> > +++ b/arch/x86/kernel/cpu/bugs.c
> > @@ -1818,9 +1818,12 @@ static void __init spectre_v2_select_mitigation(void)
> >  	pr_info("%s\n", spectre_v2_strings[mode]);
> >
> >  	/*
> > -	 * If Spectre v2 protection has been enabled, fill the RSB during a
> > -	 * context switch. In general there are two types of RSB attacks
> > -	 * across context switches, for which the CALLs/RETs may be unbalanced.
> > +	 * If Spectre v2 protection has been enabled, the RSB needs to be
> > +	 * cleared during a context switch. Either do it in software by
> > +	 * filling the RSB, or in hardware via ERAPS.
>
> I'd move this comment about using ERAPS down to the ERAPS check
> itself.
>
> > +	 * In general there are two types of RSB attacks across context
> > +	 * switches, for which the CALLs/RETs may be unbalanced.
> >  	 *
> >  	 * 1) RSB underflow
> >  	 *
> > @@ -1848,12 +1851,21 @@ static void __init spectre_v2_select_mitigation(void)
> >  	 * RSB clearing.
> >  	 *
> >  	 * So to mitigate all cases, unconditionally fill RSB on context
> > -	 * switches.
> > +	 * switches when ERAPS is not present.
> >  	 */
> > -	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> > -	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
> > +	if (!boot_cpu_has(X86_FEATURE_ERAPS)) {
> > +		setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> > +		pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
> >
> > -	spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
> > +		/*
> > +		 * For guest -> host (or vice versa) RSB poisoning scenarios,
> > +		 * determine the mitigation mode here. With ERAPS, RSB
> > +		 * entries are tagged as host or guest - ensuring that neither
> > +		 * the host nor the guest have to clear or fill RSB entries to
> > +		 * avoid poisoning: skip RSB filling at VMEXIT in that case.
> > +		 */
> > +		spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
> > +	}
>
> This seems to be following a pattern of putting excessive amounts of
> very AMD-specific commenting right in the middle of common code.
> There's a *HUGE* comment in
> spectre_v2_determine_rsb_fill_type_at_vmexit(). Wouldn't this be
> more at home there?
I think that with Josh's reworded comments, and with the comments
stripped out of spectre_v2_determine_...(), this flows much better -
so yeah, will update.
Thank you for the thorough review, Dave!
Amit
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-12 21:43 ` Pawan Gupta
@ 2024-11-13 20:21 ` Josh Poimboeuf
2024-11-14 1:55 ` Pawan Gupta
0 siblings, 1 reply; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-13 20:21 UTC (permalink / raw)
To: Pawan Gupta
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, bp, tglx, peterz, corbet, mingo,
dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon, kai.huang,
sandipan.das, boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On Tue, Nov 12, 2024 at 01:43:48PM -0800, Pawan Gupta wrote:
> On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> > + * 1) RSB underflow ("Intel Retbleed")
> > *
> > * Some Intel parts have "bottomless RSB". When the RSB is empty,
> > * speculated return targets may come from the branch predictor,
> > * which could have a user-poisoned BTB or BHB entry.
> > *
> > - * AMD has it even worse: *all* returns are speculated from the BTB,
> > - * regardless of the state of the RSB.
> > + * When IBRS or eIBRS is enabled, the "user -> kernel" attack is
> > + * mitigated by the IBRS branch prediction isolation properties, so
> > + * the RSB buffer filling wouldn't be necessary to protect against
> > + * this type of attack.
> > *
> > - * When IBRS or eIBRS is enabled, the "user -> kernel" attack
> > - * scenario is mitigated by the IBRS branch prediction isolation
> > - * properties, so the RSB buffer filling wouldn't be necessary to
> > - * protect against this type of attack.
> > + * The "user -> user" attack is mitigated by RSB filling on context
> > + * switch.
>
> user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
> unnecessary when an IBPB is issued. Also, when an application has not
> opted in to IBPB at context switch, spectre-v2 for that app is not
> mitigated, so filling the RSB is only a half measure in that case.
>
> Is RSB filling really serving any purpose for userspace?
Indeed...
If we don't need to flush RSB for user->user, we'd only need to worry
about protecting the kernel. Something like so?
- eIBRS+!PBRSB: no flush
- eIBRS+PBRSB: lite flush
- everything else: full flush
i.e., same logic as spectre_v2_determine_rsb_fill_type_at_vmexit(), but
also for context switches.
--
Josh
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-12 11:58 ` Borislav Petkov
@ 2024-11-13 21:24 ` Josh Poimboeuf
2024-11-13 21:37 ` Borislav Petkov
0 siblings, 1 reply; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-13 21:24 UTC (permalink / raw)
To: Borislav Petkov
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, tglx, peterz, pawan.kumar.gupta,
corbet, mingo, dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon,
kai.huang, sandipan.das, boris.ostrovsky, Babu.Moger,
david.kaplan, dwmw
On Tue, Nov 12, 2024 at 12:58:11PM +0100, Borislav Petkov wrote:
> On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> > Subject: [PATCH] x86/bugs: Update insanely long comment about RSB attacks
>
> Why don't you stick this insanely long comment in
> Documentation/admin-guide/hw-vuln/ while at it?
>
> Its place is hardly in the code. You can point to it from the code tho...
There are a lot of subtle details to this $#!tstorm, and IMO we probably
wouldn't be having these discussions in the first place if the comment
lived in the docs, as most people seem to ignore them...
--
Josh
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-13 21:24 ` Josh Poimboeuf
@ 2024-11-13 21:37 ` Borislav Petkov
2024-11-14 0:43 ` Josh Poimboeuf
0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2024-11-13 21:37 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, tglx, peterz, pawan.kumar.gupta,
corbet, mingo, dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon,
kai.huang, sandipan.das, boris.ostrovsky, Babu.Moger,
david.kaplan, dwmw
On Wed, Nov 13, 2024 at 01:24:40PM -0800, Josh Poimboeuf wrote:
> There are a lot of subtle details to this $#!tstorm, and IMO we probably
> wouldn't be having these discussions in the first place if the comment
> lived in the docs, as most people seem to ignore them...
That's why I'm saying point to the docs from the code. You can't have a big
fat comment in the code about this but everything else in the hw-vuln docs.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-13 21:37 ` Borislav Petkov
@ 2024-11-14 0:43 ` Josh Poimboeuf
2024-11-14 7:47 ` Borislav Petkov
0 siblings, 1 reply; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-14 0:43 UTC (permalink / raw)
To: Borislav Petkov
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, tglx, peterz, pawan.kumar.gupta,
corbet, mingo, dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon,
kai.huang, sandipan.das, boris.ostrovsky, Babu.Moger,
david.kaplan, dwmw
On Wed, Nov 13, 2024 at 10:37:24PM +0100, Borislav Petkov wrote:
> On Wed, Nov 13, 2024 at 01:24:40PM -0800, Josh Poimboeuf wrote:
> > There are a lot of subtle details to this $#!tstorm, and IMO we probably
> > wouldn't be having these discussions in the first place if the comment
> > lived in the docs, as most people seem to ignore them...
>
> That's why I'm saying point to the docs from the code. You can't have a big
> fat comment in the code about this but everything else in the hw-vuln docs.
But those docs are user facing, describing the "what" for each
vulnerability individually. They're basically historical documents
which don't evolve over time unless we tweak an interface or add a new
mitigation.
This comment relates to the "why" for the code itself (and its poor
confused developers), taking all the RSB-related vulnerabilities into
account.
--
Josh
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-13 20:21 ` Josh Poimboeuf
@ 2024-11-14 1:55 ` Pawan Gupta
2024-11-14 2:31 ` Josh Poimboeuf
0 siblings, 1 reply; 34+ messages in thread
From: Pawan Gupta @ 2024-11-14 1:55 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, bp, tglx, peterz, corbet, mingo,
dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon, kai.huang,
sandipan.das, boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On Wed, Nov 13, 2024 at 12:21:05PM -0800, Josh Poimboeuf wrote:
> On Tue, Nov 12, 2024 at 01:43:48PM -0800, Pawan Gupta wrote:
> > On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> > > + * 1) RSB underflow ("Intel Retbleed")
> > > *
> > > * Some Intel parts have "bottomless RSB". When the RSB is empty,
> > > * speculated return targets may come from the branch predictor,
> > > * which could have a user-poisoned BTB or BHB entry.
> > > *
> > > - * AMD has it even worse: *all* returns are speculated from the BTB,
> > > - * regardless of the state of the RSB.
> > > + * When IBRS or eIBRS is enabled, the "user -> kernel" attack is
> > > + * mitigated by the IBRS branch prediction isolation properties, so
> > > + * the RSB buffer filling wouldn't be necessary to protect against
> > > + * this type of attack.
> > > *
> > > - * When IBRS or eIBRS is enabled, the "user -> kernel" attack
> > > - * scenario is mitigated by the IBRS branch prediction isolation
> > > - * properties, so the RSB buffer filling wouldn't be necessary to
> > > - * protect against this type of attack.
> > > + * The "user -> user" attack is mitigated by RSB filling on context
> > > + * switch.
> >
> > user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
> > unnecessary when an IBPB is issued. Also, when an application has not
> > opted in to IBPB at context switch, spectre-v2 for that app is not
> > mitigated, so filling the RSB is only a half measure in that case.
> >
> > Is RSB filling really serving any purpose for userspace?
>
> Indeed...
>
> If we don't need to flush RSB for user->user, we'd only need to worry
> about protecting the kernel. Something like so?
>
> - eIBRS+!PBRSB: no flush
> - eIBRS+PBRSB: lite flush
Yes for VMexit, but not at kernel entry. PBRSB requires an unbalanced RET,
and it is only a problem until the first retired CALL. At VMexit we do have
an unbalanced RET, but not at kernel entry.
> - everything else: full flush
> i.e., same logic as spectre_v2_determine_rsb_fill_type_at_vmexit(), but
> also for context switches.
Yes, assuming you mean user->kernel switch, and not process context switch.
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-14 1:55 ` Pawan Gupta
@ 2024-11-14 2:31 ` Josh Poimboeuf
2024-11-14 8:01 ` Pawan Gupta
0 siblings, 1 reply; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-14 2:31 UTC (permalink / raw)
To: Pawan Gupta
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, bp, tglx, peterz, corbet, mingo,
dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon, kai.huang,
sandipan.das, boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On Wed, Nov 13, 2024 at 05:55:05PM -0800, Pawan Gupta wrote:
> > > user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
> > > unnecessary when an IBPB is issued. Also, when an application has not
> > > opted in to IBPB at context switch, spectre-v2 for that app is not
> > > mitigated, so filling the RSB is only a half measure in that case.
> > >
> > > Is RSB filling really serving any purpose for userspace?
> >
> > Indeed...
> >
> > If we don't need to flush RSB for user->user, we'd only need to worry
> > about protecting the kernel. Something like so?
> >
> > - eIBRS+!PBRSB: no flush
> > - eIBRS+PBRSB: lite flush
>
> Yes for VMexit, but not at kernel entry. PBRSB requires an unbalanced RET,
> and it is only a problem until the first retired CALL. At VMexit we do have
> unbalanced RET but not at kernel entry.
>
> > - everything else: full flush
>
> > i.e., same logic as spectre_v2_determine_rsb_fill_type_at_vmexit(), but
> > also for context switches.
>
> Yes, assuming you mean user->kernel switch, and not process context switch.
Actually I did mean context switch. AFAIK we don't need to flush RSB at
kernel entry.
If user->user RSB is already mitigated by IBPB, then at context switch
we only have to worry about user->kernel. e.g., if 'next' has more (in
kernel) RETs than 'prev' had (in kernel) CALLs, the user could trigger
RSB underflow or corruption inside the kernel after the context switch.
Doesn't eIBRS already protect against that?
For PBRSB, I guess we don't need to worry about that since there would
be at least one kernel CALL before context switch.
--
Josh
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-14 0:43 ` Josh Poimboeuf
@ 2024-11-14 7:47 ` Borislav Petkov
2024-11-14 9:47 ` Ingo Molnar
0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2024-11-14 7:47 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, tglx, peterz, pawan.kumar.gupta,
corbet, mingo, dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon,
kai.huang, sandipan.das, boris.ostrovsky, Babu.Moger,
david.kaplan, dwmw
On Wed, Nov 13, 2024 at 04:43:58PM -0800, Josh Poimboeuf wrote:
> This comment relates to the "why" for the code itself (and its poor
> confused developers), taking all the RSB-related vulnerabilities into
> account.
So use Documentation/arch/x86/.
This is exactly the reason why we need more "why" documentation - because
every time we have to swap the whole bugs.c horror back in, we're poor confused
developers. And we have the "why" spread out across commit messages and other
folklore, which means every time we have to change stuff, the git archeology
starts. :-\ "err, do you remember why we're doing this?!" And so go the
conversations on IRC.
So having an implementation document explaining clearly why we did things is
long overdue.
But it's fine - I can move it later when the dust settles here.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-14 2:31 ` Josh Poimboeuf
@ 2024-11-14 8:01 ` Pawan Gupta
2024-11-15 5:48 ` Josh Poimboeuf
0 siblings, 1 reply; 34+ messages in thread
From: Pawan Gupta @ 2024-11-14 8:01 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, bp, tglx, peterz, corbet, mingo,
dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon, kai.huang,
sandipan.das, boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On Wed, Nov 13, 2024 at 06:31:41PM -0800, Josh Poimboeuf wrote:
> On Wed, Nov 13, 2024 at 05:55:05PM -0800, Pawan Gupta wrote:
> > > > user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
> > > > unnecessary when IBPB is issued. Also, when an application has not opted in
> > > > to IBPB at context switch, spectre-v2 for that app is not mitigated, so
> > > > filling the RSB is only a half measure in that case.
> > > >
> > > > Is RSB filling really serving any purpose for userspace?
> > >
> > > Indeed...
> > >
> > > If we don't need to flush RSB for user->user, we'd only need to worry
> > > about protecting the kernel. Something like so?
> > >
> > > - eIBRS+!PBRSB: no flush
> > > - eIBRS+PBRSB: lite flush
> >
> > Yes for VMexit, but not at kernel entry. PBRSB requires an unbalanced RET,
> > and it is only a problem until the first retired CALL. At VMexit we do have
> > unbalanced RET but not at kernel entry.
> >
> > > - everything else: full flush
> >
> > > i.e., same logic as spectre_v2_determine_rsb_fill_type_at_vmexit(), but
> > > also for context switches.
> >
> > Yes, assuming you mean user->kernel switch, and not process context switch.
>
> Actually I did mean context switch. AFAIK we don't need to flush RSB at
> kernel entry.
>
> If user->user RSB is already mitigated by IBPB, then at context switch
> we only have to worry about user->kernel. e.g., if 'next' has more (in
> kernel) RETs then 'prev' had (in kernel) CALLs, the user could trigger
> RSB underflow or corruption inside the kernel after the context switch.
Yes, this condition can cause RSB underflow, but that is not enough. More
importantly, an attacker also needs to control the target of the RET.
> Doesn't eIBRS already protect against that?
Yes, eIBRS does protect against that, because the alternate predictor (TA)
is isolated by eIBRS from user influence.
> For PBRSB, I guess we don't need to worry about that since there would
> be at least one kernel CALL before context switch.
Right. So the case where we need RSB filling at context switch is
retpoline+CDT mitigation.
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-14 7:47 ` Borislav Petkov
@ 2024-11-14 9:47 ` Ingo Molnar
2024-11-14 9:54 ` Borislav Petkov
0 siblings, 1 reply; 34+ messages in thread
From: Ingo Molnar @ 2024-11-14 9:47 UTC (permalink / raw)
To: Borislav Petkov
Cc: Josh Poimboeuf, Andrew Cooper, Amit Shah, linux-kernel, kvm, x86,
linux-doc, amit.shah, thomas.lendacky, tglx, peterz,
pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa, seanjc,
pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
* Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Nov 13, 2024 at 04:43:58PM -0800, Josh Poimboeuf wrote:
> > This comment relates to the "why" for the code itself (and its poor
> > confused developers), taking all the RSB-related vulnerabilities into
> > account.
>
> So use Documentation/arch/x86/.
>
> This is exactly the reason why we need more "why" documentation - because
> every time we have to swap the whole bugs.c horror back in, we're poor confused
> developers. And we have the "why" spread out across commit messages and other
> folklore, which means every time we have to change stuff, the git archeology
> starts. :-\ "err, do you remember why we're doing this?!" And so go the
> conversations on IRC.
>
> So having an implementation document explaining clearly why we did things is
> long overdue.
>
> But it's fine - I can move it later when the dust settles here.
I think in-line documentation is better in this case: the primary defense
against mistakes and misunderstandings is in the source code itself.
And "it's too long" is an argument *against* moving it out into some obscure
place 99% of developers aren't even aware of...
Thanks,
Ingo
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-14 9:47 ` Ingo Molnar
@ 2024-11-14 9:54 ` Borislav Petkov
0 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2024-11-14 9:54 UTC (permalink / raw)
To: Ingo Molnar
Cc: Josh Poimboeuf, Andrew Cooper, Amit Shah, linux-kernel, kvm, x86,
linux-doc, amit.shah, thomas.lendacky, tglx, peterz,
pawan.kumar.gupta, corbet, mingo, dave.hansen, hpa, seanjc,
pbonzini, daniel.sneddon, kai.huang, sandipan.das,
boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On Thu, Nov 14, 2024 at 10:47:23AM +0100, Ingo Molnar wrote:
> I think in-line documentation is better in this case: the primary defense
> against mistakes and misunderstandings is in the source code itself.
>
> And "it's too long" is an argument *against* moving it out into some obscure
> place 99% of developers aren't even aware of...
You mean developers can't even read?
/*
* See Documentation/arch/x86/ for details on this mitigation
* implementation.
*/
And if we want to expand the "why" and do proper documentation on the
implementation decisions of each mitigation, we still keep it there in the
code?
Or we do one part in Documentation/ and another part in the code?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-14 8:01 ` Pawan Gupta
@ 2024-11-15 5:48 ` Josh Poimboeuf
2024-11-15 14:44 ` Kaplan, David
2024-11-15 17:50 ` Pawan Gupta
0 siblings, 2 replies; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-15 5:48 UTC (permalink / raw)
To: Pawan Gupta
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, bp, tglx, peterz, corbet, mingo,
dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon, kai.huang,
sandipan.das, boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On Thu, Nov 14, 2024 at 12:01:16AM -0800, Pawan Gupta wrote:
> > For PBRSB, I guess we don't need to worry about that since there would
> > be at least one kernel CALL before context switch.
>
> Right. So the case where we need RSB filling at context switch is
> retpoline+CDT mitigation.
According to the docs, classic IBRS also needs RSB filling at context
switch to protect against corrupt RSB entries (as opposed to RSB
underflow).
Something like so...
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..7b9c0a21e478 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1579,27 +1579,44 @@ static void __init spec_ctrl_disable_kernel_rrsba(void)
rrsba_disabled = true;
}
-static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
+static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
{
/*
- * Similar to context switches, there are two types of RSB attacks
- * after VM exit:
+ * In general there are two types of RSB attacks:
*
- * 1) RSB underflow
+ * 1) RSB underflow ("Intel Retbleed")
+ *
+ * Some Intel parts have "bottomless RSB". When the RSB is empty,
+ * speculated return targets may come from the branch predictor,
+ * which could have a user-poisoned BTB or BHB entry.
+ *
+ * user->user attacks are mitigated by IBPB on context switch.
+ *
+ * user->kernel attacks via context switch are mitigated by IBRS,
+ * eIBRS, or RSB filling.
+ *
+ * user->kernel attacks via kernel entry are mitigated by IBRS,
+ * eIBRS, or call depth tracking.
+ *
+ * On VMEXIT, guest->host attacks are mitigated by IBRS, eIBRS, or
+ * RSB filling.
*
* 2) Poisoned RSB entry
*
- * When retpoline is enabled, both are mitigated by filling/clearing
- * the RSB.
+ * On a context switch, the previous task can poison RSB entries
+ * used by the next task, controlling its speculative return
+ * targets. Poisoned RSB entries can also be created by "AMD
+ * Retbleed" or SRSO.
*
- * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
- * prediction isolation protections, RSB still needs to be cleared
- * because of #2. Note that SMEP provides no protection here, unlike
- * user-space-poisoned RSB entries.
+ * user->user attacks are mitigated by IBPB on context switch.
*
- * eIBRS should protect against RSB poisoning, but if the EIBRS_PBRSB
- * bug is present then a LITE version of RSB protection is required,
- * just a single call needs to retire before a RET is executed.
+ * user->kernel attacks via context switch are prevented by
+ * SMEP+eIBRS+SRSO mitigations, or RSB clearing.
+ *
+ * guest->host attacks are mitigated by eIBRS or RSB clearing on
+ * VMEXIT. eIBRS implementations with X86_BUG_EIBRS_PBRSB still
+ * need "lite" RSB filling which retires a CALL before the first
+ * RET.
*/
switch (mode) {
case SPECTRE_V2_NONE:
@@ -1608,8 +1625,8 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
case SPECTRE_V2_EIBRS_LFENCE:
case SPECTRE_V2_EIBRS:
if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
- setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT_LITE);
pr_info("Spectre v2 / PBRSB-eIBRS: Retire a single CALL on VMEXIT\n");
+ setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT_LITE);
}
return;
@@ -1617,12 +1634,13 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
case SPECTRE_V2_RETPOLINE:
case SPECTRE_V2_LFENCE:
case SPECTRE_V2_IBRS:
+ pr_info("Spectre v2 / SpectreRSB : Filling RSB on context switch and VMEXIT\n");
+ setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT);
- pr_info("Spectre v2 / SpectreRSB : Filling RSB on VMEXIT\n");
return;
}
- pr_warn_once("Unknown Spectre v2 mode, disabling RSB mitigation at VM exit");
+ pr_warn_once("Unknown Spectre v2 mode, disabling RSB mitigation\n");
dump_stack();
}
@@ -1817,48 +1835,7 @@ static void __init spectre_v2_select_mitigation(void)
spectre_v2_enabled = mode;
pr_info("%s\n", spectre_v2_strings[mode]);
- /*
- * If Spectre v2 protection has been enabled, fill the RSB during a
- * context switch. In general there are two types of RSB attacks
- * across context switches, for which the CALLs/RETs may be unbalanced.
- *
- * 1) RSB underflow
- *
- * Some Intel parts have "bottomless RSB". When the RSB is empty,
- * speculated return targets may come from the branch predictor,
- * which could have a user-poisoned BTB or BHB entry.
- *
- * AMD has it even worse: *all* returns are speculated from the BTB,
- * regardless of the state of the RSB.
- *
- * When IBRS or eIBRS is enabled, the "user -> kernel" attack
- * scenario is mitigated by the IBRS branch prediction isolation
- * properties, so the RSB buffer filling wouldn't be necessary to
- * protect against this type of attack.
- *
- * The "user -> user" attack scenario is mitigated by RSB filling.
- *
- * 2) Poisoned RSB entry
- *
- * If the 'next' in-kernel return stack is shorter than 'prev',
- * 'next' could be tricked into speculating with a user-poisoned RSB
- * entry.
- *
- * The "user -> kernel" attack scenario is mitigated by SMEP and
- * eIBRS.
- *
- * The "user -> user" scenario, also known as SpectreBHB, requires
- * RSB clearing.
- *
- * So to mitigate all cases, unconditionally fill RSB on context
- * switches.
- *
- * FIXME: Is this pointless for retbleed-affected AMD?
- */
- setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
- pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
-
- spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
+ spectre_v2_mitigate_rsb(mode);
/*
* Retpoline protects the kernel, but doesn't protect firmware. IBRS
* RE: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-15 5:48 ` Josh Poimboeuf
@ 2024-11-15 14:44 ` Kaplan, David
2024-11-15 17:05 ` Josh Poimboeuf
2024-11-15 17:50 ` Pawan Gupta
1 sibling, 1 reply; 34+ messages in thread
From: Kaplan, David @ 2024-11-15 14:44 UTC (permalink / raw)
To: Josh Poimboeuf, Pawan Gupta
Cc: Andrew Cooper, Amit Shah, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, x86@kernel.org, linux-doc@vger.kernel.org,
Shah, Amit, Lendacky, Thomas, bp@alien8.de, tglx@linutronix.de,
peterz@infradead.org, corbet@lwn.net, mingo@redhat.com,
dave.hansen@linux.intel.com, hpa@zytor.com, seanjc@google.com,
pbonzini@redhat.com, daniel.sneddon@linux.intel.com,
kai.huang@intel.com, Das1, Sandipan, boris.ostrovsky@oracle.com,
Moger, Babu, dwmw@amazon.co.uk
> -----Original Message-----
> From: Josh Poimboeuf <jpoimboe@kernel.org>
> Sent: Thursday, November 14, 2024 11:49 PM
> To: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Amit Shah
> <amit@kernel.org>; linux-kernel@vger.kernel.org; kvm@vger.kernel.org;
> x86@kernel.org; linux-doc@vger.kernel.org; Shah, Amit
> <Amit.Shah@amd.com>; Lendacky, Thomas <Thomas.Lendacky@amd.com>;
> bp@alien8.de; tglx@linutronix.de; peterz@infradead.org; corbet@lwn.net;
> mingo@redhat.com; dave.hansen@linux.intel.com; hpa@zytor.com;
> seanjc@google.com; pbonzini@redhat.com;
> daniel.sneddon@linux.intel.com; kai.huang@intel.com; Das1, Sandipan
> <Sandipan.Das@amd.com>; boris.ostrovsky@oracle.com; Moger, Babu
> <Babu.Moger@amd.com>; Kaplan, David <David.Kaplan@amd.com>;
> dwmw@amazon.co.uk
> Subject: Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments
> for AMD
>
> On Thu, Nov 14, 2024 at 12:01:16AM -0800, Pawan Gupta wrote:
> > > For PBRSB, I guess we don't need to worry about that since there
> > > would be at least one kernel CALL before context switch.
> >
> > Right. So the case where we need RSB filling at context switch is
> > retpoline+CDT mitigation.
>
> According to the docs, classic IBRS also needs RSB filling at context switch to
> protect against corrupt RSB entries (as opposed to RSB underflow).
Which docs are those? Classic IBRS doesn't do anything with returns (at least on AMD). The AMD docs say that if you want to prevent earlier instructions from influencing later RETs, you need to do the 32-CALL sequence. But I'm not sure what corrupt RSB entries mean here, or how that relates to IBRS?
--David Kaplan
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-15 14:44 ` Kaplan, David
@ 2024-11-15 17:05 ` Josh Poimboeuf
0 siblings, 0 replies; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-15 17:05 UTC (permalink / raw)
To: Kaplan, David
Cc: Pawan Gupta, Andrew Cooper, Amit Shah,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, x86@kernel.org,
linux-doc@vger.kernel.org, Shah, Amit, Lendacky, Thomas,
bp@alien8.de, tglx@linutronix.de, peterz@infradead.org,
corbet@lwn.net, mingo@redhat.com, dave.hansen@linux.intel.com,
hpa@zytor.com, seanjc@google.com, pbonzini@redhat.com,
daniel.sneddon@linux.intel.com, kai.huang@intel.com,
Das1, Sandipan, boris.ostrovsky@oracle.com, Moger, Babu,
dwmw@amazon.co.uk
On Fri, Nov 15, 2024 at 02:44:12PM +0000, Kaplan, David wrote:
> > On Thu, Nov 14, 2024 at 12:01:16AM -0800, Pawan Gupta wrote:
> > > > For PBRSB, I guess we don't need to worry about that since there
> > > > would be at least one kernel CALL before context switch.
> > >
> > > Right. So the case where we need RSB filling at context switch is
> > > retpoline+CDT mitigation.
> >
> > According to the docs, classic IBRS also needs RSB filling at context switch to
> > protect against corrupt RSB entries (as opposed to RSB underflow).
>
> Which docs are that? Classic IBRS doesn't do anything with returns
> (at least on AMD). The AMD docs say that if you want to prevent
> earlier instructions from influencing later RETs, you need to do the
> 32 CALL sequence. But I'm not sure what corrupt RSB entries mean
> here, and how it relates to IBRS?
Sorry, by "corrupt", I meant poisoned. I think we are saying the same
thing. Classic IBRS doesn't protect against RSB poisoning.
--
Josh
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-15 5:48 ` Josh Poimboeuf
2024-11-15 14:44 ` Kaplan, David
@ 2024-11-15 17:50 ` Pawan Gupta
2024-11-15 18:10 ` Josh Poimboeuf
1 sibling, 1 reply; 34+ messages in thread
From: Pawan Gupta @ 2024-11-15 17:50 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, bp, tglx, peterz, corbet, mingo,
dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon, kai.huang,
sandipan.das, boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On Thu, Nov 14, 2024 at 09:48:36PM -0800, Josh Poimboeuf wrote:
> According to the docs, classic IBRS also needs RSB filling at context
> switch to protect against corrupt RSB entries (as opposed to RSB
> underflow).
Correct.
> Something like so...
>
>
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 47a01d4028f6..7b9c0a21e478 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1579,27 +1579,44 @@ static void __init spec_ctrl_disable_kernel_rrsba(void)
> rrsba_disabled = true;
> }
>
> -static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
> +static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
> {
> /*
> - * Similar to context switches, there are two types of RSB attacks
> - * after VM exit:
> + * In general there are two types of RSB attacks:
> *
> - * 1) RSB underflow
> + * 1) RSB underflow ("Intel Retbleed")
> + *
> + * Some Intel parts have "bottomless RSB". When the RSB is empty,
> + * speculated return targets may come from the branch predictor,
> + * which could have a user-poisoned BTB or BHB entry.
> + *
> + * user->user attacks are mitigated by IBPB on context switch.
> + *
> + * user->kernel attacks via context switch are mitigated by IBRS,
> + * eIBRS, or RSB filling.
> + *
> + * user->kernel attacks via kernel entry are mitigated by IBRS,
> + * eIBRS, or call depth tracking.
> + *
> + * On VMEXIT, guest->host attacks are mitigated by IBRS, eIBRS, or
> + * RSB filling.
> *
> * 2) Poisoned RSB entry
> *
> - * When retpoline is enabled, both are mitigated by filling/clearing
> - * the RSB.
> + * On a context switch, the previous task can poison RSB entries
> + * used by the next task, controlling its speculative return
> + * targets. Poisoned RSB entries can also be created by "AMD
> + * Retbleed" or SRSO.
> *
> - * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
> - * prediction isolation protections, RSB still needs to be cleared
> - * because of #2. Note that SMEP provides no protection here, unlike
> - * user-space-poisoned RSB entries.
> + * user->user attacks are mitigated by IBPB on context switch.
> *
> - * eIBRS should protect against RSB poisoning, but if the EIBRS_PBRSB
> - * bug is present then a LITE version of RSB protection is required,
> - * just a single call needs to retire before a RET is executed.
> + * user->kernel attacks via context switch are prevented by
> + * SMEP+eIBRS+SRSO mitigations, or RSB clearing.
> + *
> + * guest->host attacks are mitigated by eIBRS or RSB clearing on
> + * VMEXIT. eIBRS implementations with X86_BUG_EIBRS_PBRSB still
> + * need "lite" RSB filling which retires a CALL before the first
> + * RET.
> */
> switch (mode) {
> case SPECTRE_V2_NONE:
> @@ -1608,8 +1625,8 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
> case SPECTRE_V2_EIBRS_LFENCE:
> case SPECTRE_V2_EIBRS:
> if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
> - setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT_LITE);
> pr_info("Spectre v2 / PBRSB-eIBRS: Retire a single CALL on VMEXIT\n");
> + setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT_LITE);
> }
> return;
>
> @@ -1617,12 +1634,13 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
> case SPECTRE_V2_RETPOLINE:
> case SPECTRE_V2_LFENCE:
> case SPECTRE_V2_IBRS:
> + pr_info("Spectre v2 / SpectreRSB : Filling RSB on context switch and VMEXIT\n");
> + setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT);
> - pr_info("Spectre v2 / SpectreRSB : Filling RSB on VMEXIT\n");
> return;
> }
>
> - pr_warn_once("Unknown Spectre v2 mode, disabling RSB mitigation at VM exit");
> + pr_warn_once("Unknown Spectre v2 mode, disabling RSB mitigation\n");
> dump_stack();
> }
>
> @@ -1817,48 +1835,7 @@ static void __init spectre_v2_select_mitigation(void)
> spectre_v2_enabled = mode;
> pr_info("%s\n", spectre_v2_strings[mode]);
>
> - /*
> - * If Spectre v2 protection has been enabled, fill the RSB during a
> - * context switch. In general there are two types of RSB attacks
> - * across context switches, for which the CALLs/RETs may be unbalanced.
> - *
> - * 1) RSB underflow
> - *
> - * Some Intel parts have "bottomless RSB". When the RSB is empty,
> - * speculated return targets may come from the branch predictor,
> - * which could have a user-poisoned BTB or BHB entry.
> - *
> - * AMD has it even worse: *all* returns are speculated from the BTB,
> - * regardless of the state of the RSB.
> - *
> - * When IBRS or eIBRS is enabled, the "user -> kernel" attack
> - * scenario is mitigated by the IBRS branch prediction isolation
> - * properties, so the RSB buffer filling wouldn't be necessary to
> - * protect against this type of attack.
> - *
> - * The "user -> user" attack scenario is mitigated by RSB filling.
> - *
> - * 2) Poisoned RSB entry
> - *
> - * If the 'next' in-kernel return stack is shorter than 'prev',
> - * 'next' could be tricked into speculating with a user-poisoned RSB
> - * entry.
> - *
> - * The "user -> kernel" attack scenario is mitigated by SMEP and
> - * eIBRS.
> - *
> - * The "user -> user" scenario, also known as SpectreBHB, requires
> - * RSB clearing.
> - *
> - * So to mitigate all cases, unconditionally fill RSB on context
> - * switches.
> - *
> - * FIXME: Is this pointless for retbleed-affected AMD?
> - */
> - setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> - pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
> -
> - spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
> + spectre_v2_mitigate_rsb(mode);
>
> /*
> * Retpoline protects the kernel, but doesn't protect firmware. IBRS
This LGTM.
I think SPECTRE_V2_EIBRS_RETPOLINE is placed in the wrong leg: it
doesn't need RSB filling on context switch and only needs VMEXIT_LITE.
Does below change on top of your patch look okay?
---
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 7b9c0a21e478..d3b9a0d7a2b5 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1622,6 +1622,7 @@ static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
case SPECTRE_V2_NONE:
return;
+ case SPECTRE_V2_EIBRS_RETPOLINE:
case SPECTRE_V2_EIBRS_LFENCE:
case SPECTRE_V2_EIBRS:
if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
@@ -1630,7 +1631,6 @@ static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
}
return;
- case SPECTRE_V2_EIBRS_RETPOLINE:
case SPECTRE_V2_RETPOLINE:
case SPECTRE_V2_LFENCE:
case SPECTRE_V2_IBRS:
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-15 17:50 ` Pawan Gupta
@ 2024-11-15 18:10 ` Josh Poimboeuf
2024-11-18 10:15 ` Shah, Amit
0 siblings, 1 reply; 34+ messages in thread
From: Josh Poimboeuf @ 2024-11-15 18:10 UTC (permalink / raw)
To: Pawan Gupta
Cc: Andrew Cooper, Amit Shah, linux-kernel, kvm, x86, linux-doc,
amit.shah, thomas.lendacky, bp, tglx, peterz, corbet, mingo,
dave.hansen, hpa, seanjc, pbonzini, daniel.sneddon, kai.huang,
sandipan.das, boris.ostrovsky, Babu.Moger, david.kaplan, dwmw
On Fri, Nov 15, 2024 at 09:50:47AM -0800, Pawan Gupta wrote:
> This LGTM.
>
> I think SPECTRE_V2_EIBRS_RETPOLINE is placed in the wrong leg, it
> doesn't need RSB filling on context switch, and only needs VMEXIT_LITE.
> Does below change on top of your patch look okay?
Yeah, I was wondering about that too. Since it changes existing
VMEXIT_LITE behavior I'll make it a separate patch. And I'll probably
do the comment changes in a separate patch as well.
> ---
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 7b9c0a21e478..d3b9a0d7a2b5 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1622,6 +1622,7 @@ static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
> case SPECTRE_V2_NONE:
> return;
>
> + case SPECTRE_V2_EIBRS_RETPOLINE:
> case SPECTRE_V2_EIBRS_LFENCE:
> case SPECTRE_V2_EIBRS:
> if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
> @@ -1630,7 +1631,6 @@ static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
> }
> return;
>
> - case SPECTRE_V2_EIBRS_RETPOLINE:
> case SPECTRE_V2_RETPOLINE:
> case SPECTRE_V2_LFENCE:
> case SPECTRE_V2_IBRS:
--
Josh
* Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD
2024-11-15 18:10 ` Josh Poimboeuf
@ 2024-11-18 10:15 ` Shah, Amit
0 siblings, 0 replies; 34+ messages in thread
From: Shah, Amit @ 2024-11-18 10:15 UTC (permalink / raw)
To: pawan.kumar.gupta@linux.intel.com, jpoimboe@kernel.org,
amit@kernel.org
Cc: corbet@lwn.net, kvm@vger.kernel.org, andrew.cooper3@citrix.com,
kai.huang@intel.com, dave.hansen@linux.intel.com,
Lendacky, Thomas, daniel.sneddon@linux.intel.com,
boris.ostrovsky@oracle.com, linux-kernel@vger.kernel.org,
seanjc@google.com, mingo@redhat.com, pbonzini@redhat.com,
tglx@linutronix.de, Moger, Babu, Das1, Sandipan,
linux-doc@vger.kernel.org, dwmw@amazon.co.uk, hpa@zytor.com,
peterz@infradead.org, bp@alien8.de, Kaplan, David, x86@kernel.org
On Fri, 2024-11-15 at 10:10 -0800, Josh Poimboeuf wrote:
> On Fri, Nov 15, 2024 at 09:50:47AM -0800, Pawan Gupta wrote:
> > This LGTM.
> >
> > I think SPECTRE_V2_EIBRS_RETPOLINE is placed in the wrong leg, it
> > doesn't need RSB filling on context switch, and only needs
> > VMEXIT_LITE.
> > Does below change on top of your patch look okay?
>
> Yeah, I was wondering about that too. Since it changes existing
> VMEXIT_LITE behavior I'll make it a separate patch. And I'll
> probably
> do the comment changes in a separate patch as well.
So all of that looks good to me as well. I think a standalone series
makes sense - maybe even for 6.13. I'll base my patches on top of
yours.
Thanks!
Amit
end of thread, other threads: [~2024-11-18 10:16 UTC | newest]
Thread overview: 34+ messages
2024-11-11 16:39 [RFC PATCH v2 0/3] Add support for the ERAPS feature Amit Shah
2024-11-11 16:39 ` [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD Amit Shah
2024-11-11 17:01 ` Dave Hansen
2024-11-11 19:33 ` Josh Poimboeuf
2024-11-11 19:58 ` Kaplan, David
2024-11-11 20:39 ` Josh Poimboeuf
2024-11-11 20:40 ` Josh Poimboeuf
2024-11-12 0:30 ` Josh Poimboeuf
2024-11-12 0:29 ` Andrew Cooper
2024-11-12 1:46 ` Josh Poimboeuf
2024-11-12 11:58 ` Borislav Petkov
2024-11-13 21:24 ` Josh Poimboeuf
2024-11-13 21:37 ` Borislav Petkov
2024-11-14 0:43 ` Josh Poimboeuf
2024-11-14 7:47 ` Borislav Petkov
2024-11-14 9:47 ` Ingo Molnar
2024-11-14 9:54 ` Borislav Petkov
2024-11-12 15:16 ` Shah, Amit
2024-11-12 21:43 ` Pawan Gupta
2024-11-13 20:21 ` Josh Poimboeuf
2024-11-14 1:55 ` Pawan Gupta
2024-11-14 2:31 ` Josh Poimboeuf
2024-11-14 8:01 ` Pawan Gupta
2024-11-15 5:48 ` Josh Poimboeuf
2024-11-15 14:44 ` Kaplan, David
2024-11-15 17:05 ` Josh Poimboeuf
2024-11-15 17:50 ` Pawan Gupta
2024-11-15 18:10 ` Josh Poimboeuf
2024-11-18 10:15 ` Shah, Amit
2024-11-11 16:39 ` [RFC PATCH v2 2/3] x86: cpu/bugs: add support for AMD ERAPS feature Amit Shah
2024-11-11 18:57 ` Dave Hansen
2024-11-13 10:33 ` Shah, Amit
2024-11-11 16:39 ` [RFC PATCH v2 3/3] x86: kvm: svm: add support for ERAPS and FLUSH_RAP_ON_VMRUN Amit Shah
2024-11-11 19:02 ` Dave Hansen