From: Suraj Jitindar Singh <surajjs@amazon.com>
To: <stable@vger.kernel.org>
Cc: <surajjs@amazon.com>, <sjitindarsingh@gmail.com>,
<cascardo@canonical.com>, <kvm@vger.kernel.org>,
<pbonzini@redhat.com>, <jpoimboe@kernel.org>,
<peterz@infradead.org>, <x86@kernel.org>
Subject: [PATCH 4.14 17/34] entel_idle: Disable IBRS during long idle
Date: Thu, 27 Oct 2022 13:55:09 -0700 [thread overview]
Message-ID: <20221027205512.17684-1-surajjs@amazon.com> (raw)
In-Reply-To: <20221027204801.13146-1-surajjs@amazon.com>
From: Peter Zijlstra <peterz@infradead.org>
commit bf5835bcdb9635c97f85120dba9bfa21e111130f upstream.
Having IBRS enabled while the SMT sibling is idle unnecessarily slows
down the running sibling. OTOH, disabling IBRS around idle takes two
MSR writes, which will increase the idle latency.
Therefore, only disable IBRS around deeper idle states. Shallow idle
states are bounded by the tick in duration, since NOHZ is not allowed
for them by virtue of their short target residency.
Only do this for mwait-driven idle, since that keeps interrupts disabled
across idle, which makes disabling IBRS vs IRQ-entry a non-issue.
Note: C6 is a random threshold, most importantly C1 probably shouldn't
disable IBRS, benchmarking needed.
Suggested-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
[cascardo: no CPUIDLE_FLAG_IRQ_ENABLE]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ bp: Adjust context ]
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
---
arch/x86/include/asm/nospec-branch.h | 1 +
arch/x86/kernel/cpu/bugs.c | 6 ++++
drivers/idle/intel_idle.c | 45 ++++++++++++++++++++++++----
3 files changed, 46 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 73f592d28396..3a1e3062f244 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -293,6 +293,7 @@ static inline void indirect_branch_prediction_barrier(void)
/* The Intel SPEC CTRL MSR base value cache */
extern u64 x86_spec_ctrl_base;
extern void write_spec_ctrl_current(u64 val, bool force);
+extern u64 spec_ctrl_current(void);
/*
* With retpoline, we must use IBRS to restrict branch prediction
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 01604ea478b0..43a258e592d0 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -77,6 +77,12 @@ void write_spec_ctrl_current(u64 val, bool force)
wrmsrl(MSR_IA32_SPEC_CTRL, val);
}
+u64 spec_ctrl_current(void)
+{
+ return this_cpu_read(x86_spec_ctrl_current);
+}
+EXPORT_SYMBOL_GPL(spec_ctrl_current);
+
/*
* The vendor and possibly platform specific bits which can be modified in
* x86_spec_ctrl_base.
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 31f54a334b58..51cc492c2e35 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -58,11 +58,13 @@
#include <linux/tick.h>
#include <trace/events/power.h>
#include <linux/sched.h>
+#include <linux/sched/smt.h>
#include <linux/notifier.h>
#include <linux/cpu.h>
#include <linux/moduleparam.h>
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>
+#include <asm/nospec-branch.h>
#include <asm/mwait.h>
#include <asm/msr.h>
@@ -97,6 +99,8 @@ static const struct idle_cpu *icpu;
static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
static int intel_idle(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index);
+static int intel_idle_ibrs(struct cpuidle_device *dev,
+ struct cpuidle_driver *drv, int index);
static void intel_idle_s2idle(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index);
static struct cpuidle_state *cpuidle_state_table;
@@ -109,6 +113,12 @@ static struct cpuidle_state *cpuidle_state_table;
*/
#define CPUIDLE_FLAG_TLB_FLUSHED 0x10000
+/*
+ * Disable IBRS across idle (when KERNEL_IBRS), is exclusive vs IRQ_ENABLE
+ * above.
+ */
+#define CPUIDLE_FLAG_IBRS BIT(16)
+
/*
* MWAIT takes an 8-bit "hint" in EAX "suggesting"
* the C-state (top nibble) and sub-state (bottom nibble)
@@ -617,7 +627,7 @@ static struct cpuidle_state skl_cstates[] = {
{
.name = "C6",
.desc = "MWAIT 0x20",
- .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
.exit_latency = 85,
.target_residency = 200,
.enter = &intel_idle,
@@ -625,7 +635,7 @@ static struct cpuidle_state skl_cstates[] = {
{
.name = "C7s",
.desc = "MWAIT 0x33",
- .flags = MWAIT2flg(0x33) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .flags = MWAIT2flg(0x33) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
.exit_latency = 124,
.target_residency = 800,
.enter = &intel_idle,
@@ -633,7 +643,7 @@ static struct cpuidle_state skl_cstates[] = {
{
.name = "C8",
.desc = "MWAIT 0x40",
- .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
.exit_latency = 200,
.target_residency = 800,
.enter = &intel_idle,
@@ -641,7 +651,7 @@ static struct cpuidle_state skl_cstates[] = {
{
.name = "C9",
.desc = "MWAIT 0x50",
- .flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
.exit_latency = 480,
.target_residency = 5000,
.enter = &intel_idle,
@@ -649,7 +659,7 @@ static struct cpuidle_state skl_cstates[] = {
{
.name = "C10",
.desc = "MWAIT 0x60",
- .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
.exit_latency = 890,
.target_residency = 5000,
.enter = &intel_idle,
@@ -678,7 +688,7 @@ static struct cpuidle_state skx_cstates[] = {
{
.name = "C6",
.desc = "MWAIT 0x20",
- .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
.exit_latency = 133,
.target_residency = 600,
.enter = &intel_idle,
@@ -935,6 +945,24 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
return index;
}
+static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
+ struct cpuidle_driver *drv, int index)
+{
+ bool smt_active = sched_smt_active();
+ u64 spec_ctrl = spec_ctrl_current();
+ int ret;
+
+ if (smt_active)
+ wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+
+ ret = intel_idle(dev, drv, index);
+
+ if (smt_active)
+ wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
+
+ return ret;
+}
+
/**
* intel_idle_s2idle - simplified "enter" callback routine for suspend-to-idle
* @dev: cpuidle_device
@@ -1375,6 +1403,11 @@ static void __init intel_idle_cpuidle_driver_init(void)
mark_tsc_unstable("TSC halts in idle"
" states deeper than C2");
+ if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) &&
+ cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IBRS) {
+ drv->states[drv->state_count].enter = intel_idle_ibrs;
+ }
+
drv->states[drv->state_count] = /* structure copy */
cpuidle_state_table[cstate];
--
2.17.1
next prev parent reply other threads:[~2022-10-27 21:03 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-27 20:48 [PATCH 4.14 00/34] Retbleed & PBRSB Mitigations Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 01/34] Revert "x86/cpu: Add a steppings field to struct x86_cpu_id" Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 02/34] x86/cpufeature: Add facility to check for min microcode revisions Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 03/34] x86/cpufeature: Fix various quality problems in the <asm/cpu_device_hd.h> header Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 04/34] x86/devicetable: Move x86 specific macro out of generic code Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 05/34] x86/cpu: Add consistent CPU match macros Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 06/34] x86/cpu: Add a steppings field to struct x86_cpu_id Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 07/34] x86/entry: Remove skip_r11rcx Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 08/34] x86/cpufeatures: Move RETPOLINE flags to word 11 Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 09/34] x86/bugs: Report AMD retbleed vulnerability Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 10/34] x86/bugs: Add AMD retbleed= boot parameter Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 11/34] x86/bugs: Keep a per-CPU IA32_SPEC_CTRL value Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 12/34] x86/entry: Add kernel IBRS implementation Suraj Jitindar Singh
2022-10-27 20:54 ` [PATCH 4.14 13/34] x86/bugs: Optimize SPEC_CTRL MSR writes Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 14/34] x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 15/34] x86/bugs: Split spectre_v2_select_mitigation() and spectre_v2_user_select_mitigation() Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 16/34] x86/bugs: Report Intel retbleed vulnerability Suraj Jitindar Singh
2022-10-27 20:55 ` Suraj Jitindar Singh [this message]
2022-10-27 20:55 ` [PATCH 4.14 18/34] x86/speculation: Change FILL_RETURN_BUFFER to work with objtool Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 19/34] x86/speculation: Add LFENCE to RSB fill sequence Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 20/34] x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 21/34] x86/speculation: Fix firmware entry SPEC_CTRL handling Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 22/34] x86/speculation: Fix SPEC_CTRL write on SMT state change Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 23/34] x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 24/34] x86/speculation: Remove x86_spec_ctrl_mask Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 25/34] KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 26/34] KVM: VMX: Fix IBRS handling after vmexit Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 27/34] x86/speculation: Fill RSB on vmexit for IBRS Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 28/34] x86/common: Stamp out the stepping madness Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 29/34] x86/cpu/amd: Enumerate BTC_NO Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 30/34] x86/bugs: Add Cannon lake to RETBleed affected CPU list Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 31/34] x86/speculation: Disable RRSBA behavior Suraj Jitindar Singh
2022-10-27 20:55 ` [PATCH 4.14 32/34] x86/speculation: Use DECLARE_PER_CPU for x86_spec_ctrl_current Suraj Jitindar Singh
2022-10-27 20:56 ` [PATCH 4.14 33/34] x86/bugs: Warn when "ibrs" mitigation is selected on Enhanced IBRS parts Suraj Jitindar Singh
2022-10-27 20:56 ` [PATCH 4.14 34/34] x86/speculation: Add RSB VM Exit protections Suraj Jitindar Singh
2022-10-31 7:00 ` [PATCH 4.14 00/34] Retbleed & PBRSB Mitigations Greg KH
-- strict thread matches above, loose matches on Subject: below --
2022-10-31 7:02 [PATCH 4.14 00/34] 4.14.297-rc1 review Greg Kroah-Hartman
2022-10-31 7:02 ` [PATCH 4.14 17/34] entel_idle: Disable IBRS during long idle Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221027205512.17684-1-surajjs@amazon.com \
--to=surajjs@amazon.com \
--cc=cascardo@canonical.com \
--cc=jpoimboe@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=sjitindarsingh@gmail.com \
--cc=stable@vger.kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox