From: Jack Allister <jalliste@amazon.com>
Cc: Jack Allister <jalliste@amazon.com>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Paul Durrant <pdurrant@amazon.com>, Jue Wang <juew@amazon.com>,
	Usama Arif <usama.arif@bytedance.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>, <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Randy Dunlap <rdunlap@infradead.org>, Tejun Heo <tj@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Yan-Jie Wang <yanjiewtw@gmail.com>,
	Hans de Goede <hdegoede@redhat.com>, <linux-doc@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: [PATCH v6] x86: intel_epb: Add earlyparam option to keep bias at performance
Date: Thu, 4 Jan 2024 09:05:48 +0000
Message-ID: <20240104090551.46251-1-jalliste@amazon.com>
In-Reply-To: <ff3a0382-734d-4f46-bd35-ffa1f53a3ac3@intel.com>

Buggy BIOSes may not set a sane boot-time Energy Performance Bias (EPB).
A result of this may be overheating or excess power usage. The kernel
overrides any boot-time EPB "performance" bias to "normal" to avoid this.

When used in data centers it is preferable to keep the EPB at "performance"
when performing a live-update of the host kernel via a kexec to the new
kernel. This is because boot time is critical during the kexec, as running
guest VMs will perceive it as latency or downtime.

On Intel Xeon Ice Lake platforms it has been observed that a combination of
EPB being set to "normal" alongside HWP (Intel Hardware P-states) being
enabled/configured during or close to the kexec increases the
live-update/kexec downtime by 7 times compared to when the EPB is set to
"performance".

Introduce a command-line parameter, "intel_epb=preserve", to skip the
"performance" -> "normal" override/workaround. This maintains prior
functionality when no parameter is set, but adds in the ability to stay at
performance for a speedy kexec if a user wishes.

Signed-off-by: Jack Allister <jalliste@amazon.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: Paul Durrant <pdurrant@amazon.com>
Cc: Jue Wang <juew@amazon.com>
Cc: Usama Arif <usama.arif@bytedance.com>
---
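The behaviour described in the commit message can be tried out in
isolation with the minimal userspace sketch below. It is not the kernel
code: parse_epb_param(), boot_epb(), epb_selfcheck() and no_override are
hypothetical stand-ins for the patched functions and flag, and the
encodings used (0 for 'performance', 6 for 'normal', a 4-bit EPB field)
are assumptions taken from the comment in intel_epb.c and the existing
EPB definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed EPB encodings: 0 is 'performance', 6 is 'normal'. */
#define EPB_PERFORMANCE	0u
#define EPB_NORMAL	6u
/* Assumed width of the EPB field (low 4 bits of the MSR). */
#define EPB_MASK	0x0fu

/* Hypothetical userspace stand-in for intel_epb_no_override. */
bool no_override;

/* Models the new early_param handler: only "preserve" changes anything. */
int parse_epb_param(const char *str)
{
	if (!str)
		return 0;

	if (!strcmp(str, "preserve"))
		no_override = true;

	/* "auto" and unknown strings fall through to the default. */
	return 0;
}

/* Models the boot-time decision in intel_epb_restore(). */
unsigned int boot_epb(unsigned int hw_epb)
{
	unsigned int val = hw_epb & EPB_MASK;

	if (!no_override && val == EPB_PERFORMANCE)
		val = EPB_NORMAL;

	return val;
}

/* Walks through both modes; call it from main() to exercise the sketch. */
void epb_selfcheck(void)
{
	no_override = false;
	parse_epb_param("auto");
	assert(boot_epb(EPB_PERFORMANCE) == EPB_NORMAL);

	parse_epb_param("preserve");
	assert(boot_epb(EPB_PERFORMANCE) == EPB_PERFORMANCE);
	assert(boot_epb(EPB_NORMAL) == EPB_NORMAL);
}
```

In the sketch, as in the patch, values other than 'performance' are
never rewritten, so "preserve" only matters when the previous kernel or
firmware left the bias at 'performance'.
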
 .../admin-guide/kernel-parameters.txt         |  9 ++++++++
 arch/x86/kernel/cpu/intel_epb.c               | 22 +++++++++++++++++--
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 65731b060e3f..d28f2fc41c0c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2148,6 +2148,15 @@
 			0	disables intel_idle and fall back on acpi_idle.
 			1 to 9	specify maximum depth of C-state.
 
+	intel_epb=	[X86]
+			auto (default)
+			  Work around buggy BIOSes to avoid excess power usage
+			  by forcing the performance bias to "normal" at boot-time.
+			preserve
+			  Do not override the existing performance bias setting.
+			  Useful if a previous kernel or bootloader's setting is
+			  more desirable than "normal".
+
 	intel_pstate=	[X86]
 			disable
 			  Do not enable intel_pstate as the default
diff --git a/arch/x86/kernel/cpu/intel_epb.c b/arch/x86/kernel/cpu/intel_epb.c
index e4c3ba91321c..01d406177751 100644
--- a/arch/x86/kernel/cpu/intel_epb.c
+++ b/arch/x86/kernel/cpu/intel_epb.c
@@ -50,7 +50,8 @@
  * the OS will do that anyway.  That sometimes is problematic, as it may cause
  * the system battery to drain too fast, for example, so it is better to adjust
  * it on CPU bring-up and if the initial EPB value for a given CPU is 0, the
- * kernel changes it to 6 ('normal').
+ * kernel changes it to 6 ('normal'). However, if it is desirable to retain the
+ * original initial EPB value, intel_epb=preserve can be set to enforce it.
  */
 
 static DEFINE_PER_CPU(u8, saved_epb);
@@ -75,6 +76,8 @@ static u8 energ_perf_values[] = {
 	[EPB_INDEX_POWERSAVE] = ENERGY_PERF_BIAS_POWERSAVE,
 };
 
+static bool intel_epb_no_override __read_mostly;
+
 static int intel_epb_save(void)
 {
 	u64 epb;
@@ -106,7 +109,7 @@ static void intel_epb_restore(void)
 		 * ('normal').
 		 */
 		val = epb & EPB_MASK;
-		if (val == ENERGY_PERF_BIAS_PERFORMANCE) {
+		if (!intel_epb_no_override && val == ENERGY_PERF_BIAS_PERFORMANCE) {
 			val = energ_perf_values[EPB_INDEX_NORMAL];
 			pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n");
 		}
@@ -213,6 +216,21 @@ static const struct x86_cpu_id intel_epb_normal[] = {
 	{}
 };
 
+static __init int parse_intel_epb(char *str)
+{
+	if (!str)
+		return 0;
+
+	/* "intel_epb=preserve" prevents PERFORMANCE->NORMAL on restore. */
+	if (!strcmp(str, "preserve"))
+		intel_epb_no_override = true;
+
+	/* "intel_epb=auto" not explicitly checked as default behaviour. */
+	return 0;
+}
+
+early_param("intel_epb", parse_intel_epb);
+
 static __init int intel_epb_init(void)
 {
 	const struct x86_cpu_id *id = x86_match_cpu(intel_epb_normal);
-- 
2.40.1
