* [PATCH v2 00/17] Prepare for new Intel Family numbers
From: Sohil Mehta @ 2025-02-11 19:43 UTC
  To: x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	Sohil Mehta, linux-perf-users, linux-kernel, linux-acpi, linux-pm,
	linux-hwmon

---Summary---
Mainstream Intel processors have been using Family 6 for a couple of decades.
Audit all the Intel Family-model checks to get ready for the upcoming Family 18
and 19 models.

Patch 1: Preparatory cleanup in smpboot.
Patch 2-7: Fixes in arch/x86 and drivers.
Patch 8-17: Cleanups in arch/x86 to convert x86_model checks to VFM ones.

This series does not include cleanups in drivers/.

Please feel free to pick up whichever patches seem ready. Most of the patches
can be applied out of order, except patches 1 and 2, which should be applied
together.

---v2 changes---

* Improve commit messages.
* Split and reorder patches for better readability.
* Add a cleanup patch at the beginning.

RFC-v1: https://lore.kernel.org/lkml/20241220213711.1892696-1-sohil.mehta@intel.com/

---Background---
The last mainstream Intel processors that deviated from Family 6 were the
NetBurst-based Pentium 4s, which had a Family number of 15. Intel has recently
started to introduce extended Family numbers greater than 15, as seen in
commit d1fb034b75a8 ("x86/cpu: Add two Intel CPU model numbers"). Though newer
CPUs can have any Family number, the currently planned CPUs lie in Families 18
and 19.

Some kernel code assumes that the Family number will always remain 6. There
are checks that apply to recent Family 6 models such as Lunar Lake and
Clearwater Forest but don't automatically extend to Family 19 models such as
Diamond Rapids. This series aims to fix and clean up all such Intel-specific
checks (mainly in arch/x86) to avoid such issues in the future. It also
converts almost all of the x86_model checks in arch/x86 to the new VFM ones.

OTOH, x86_model usage in drivers/ is a huge mess. Some drivers might need to be
completely rewritten to make them future-proof. I have attempted a couple of
fixes in cpufreq and hwmon, but they are mostly superficial. A more thorough
cleanup of drivers is needed to replace all x86_model usage with the new VFM
checks.

---Assumptions and Trade-offs---
Newer CPUs will have model numbers only in Family 6 or after Family 15. No new
processors will be added between Family 6 and 15.

As a convention, Intel Family numbers are referenced using decimals (Family 15,
19, etc.) even though the AMD-specific code might prefer hexadecimals in
similar situations.

Simpler, more maintainable checks are preferred even if, in rare situations,
they change the behavior on really old platforms. If someone reports an issue,
the code can be fixed later.
For example, the check,
	c->x86_vfm >= INTEL_PENTIUM_PRO
is preferred over,
	c->x86 == 6 || c->x86 > 15
if the likelihood of adversely affecting Family 15 is low.
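
To make the trade-off concrete, here is a minimal user-space sketch of the two
forms. It assumes a VFM value packs the vendor above the Family and the Family
above the model (vendor << 16 | family << 8 | model, with Intel as vendor 0),
so that a single ordered comparison sorts by Family first. All SKETCH_* names
and both helper functions are made up for illustration and are not kernel
identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: vendor in bits 23:16, family in bits 15:8, model in bits 7:0. */
#define SKETCH_VENDOR_INTEL 0u
#define SKETCH_VFM(vendor, family, model) \
	(((uint32_t)(vendor) << 16) | ((uint32_t)(family) << 8) | (uint32_t)(model))
#define SKETCH_IFM(family, model) SKETCH_VFM(SKETCH_VENDOR_INTEL, family, model)

/* Stand-in for INTEL_PENTIUM_PRO, i.e. IFM(6, 0x01). */
#define SKETCH_INTEL_PENTIUM_PRO SKETCH_IFM(6, 0x01)

/* The simpler form: one ordered comparison on the packed value. */
static bool vfm_check(uint32_t vfm)
{
	return vfm >= SKETCH_INTEL_PENTIUM_PRO;
}

/* The precise form: matches Family 6 and Families above 15 only. */
static bool family_check(unsigned int family)
{
	return family == 6 || family > 15;
}
```

Under these assumptions the two forms agree for Family 5, Family 6 (model 1 and
later) and Families 18/19; they differ for Family 15, which the VFM comparison
also matches.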

Also, the CPU defines in intel-family.h have been added as and when needed to
make reviewing and applying patches out of order easier.

Sohil Mehta (17):
  x86/smpboot: Remove confusing quirk usage in INIT delay
  x86/smpboot: Fix INIT delay optimization for extended Intel Families
  x86/apic: Fix 32-bit APIC initialization for extended Intel Families
  x86/cpu/intel: Fix the movsl alignment preference for extended
    Families
  x86/cpu/intel: Fix page copy performance for extended Families
  cpufreq: Fix the efficient idle check for Intel extended Families
  hwmon: Fix Intel Family-model checks to include extended Families
  x86/microcode: Update the Intel processor flag scan check
  x86/mtrr: Modify a x86_model check to an Intel VFM check
  x86/cpu/intel: Replace early Family 6 checks with VFM ones
  x86/cpu/intel: Replace Family 15 checks with VFM ones
  x86/cpu/intel: Replace Family 5 model checks with VFM ones
  x86/pat: Replace Intel x86_model checks with VFM ones
  x86/acpi/cstate: Improve Intel Family model checks
  x86/cpu/intel: Bound the non-architectural constant_tsc model checks
  perf/x86: Simplify P6 PMU initialization
  perf/x86/p4: Replace Pentium 4 model checks with VFM ones

 arch/x86/events/intel/p4.c            |  7 +--
 arch/x86/events/intel/p6.c            | 28 +++--------
 arch/x86/include/asm/intel-family.h   | 21 ++++++++-
 arch/x86/kernel/acpi/cstate.c         |  8 ++--
 arch/x86/kernel/apic/apic.c           |  4 +-
 arch/x86/kernel/cpu/intel.c           | 68 +++++++++++++--------------
 arch/x86/kernel/cpu/microcode/intel.c |  2 +-
 arch/x86/kernel/cpu/mtrr/generic.c    |  4 +-
 arch/x86/kernel/smpboot.c             | 17 ++++---
 arch/x86/mm/pat/memtype.c             |  7 +--
 drivers/cpufreq/cpufreq_ondemand.c    | 15 +++---
 drivers/hwmon/coretemp.c              | 26 ++++++----
 12 files changed, 111 insertions(+), 96 deletions(-)

-- 
2.43.0



* [PATCH v2 01/17] x86/smpboot: Remove confusing quirk usage in INIT delay
From: Sohil Mehta @ 2025-02-11 19:43 UTC

The usage of the "quirk" wording while setting the INIT assert -
de-assert delay is misleading. The comments suggest that modern
processors need the quirk (to clear init_udelay) while legacy processors
don't need the quirk (to use the default init_udelay).

Now that modern processors far outnumber legacy ones, the wording should
be inverted, if it is needed at all. Instead, simplify the comments and
the code by getting rid of "quirk" usage altogether.

No functional change.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: New patch

---
 arch/x86/kernel/smpboot.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c10850ae6f09..eb91ed0f2a06 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -654,10 +654,9 @@ static void impress_friends(void)
  * But that slows boot and resume on modern processors, which include
  * many cores and don't require that delay.
  *
- * Cmdline "init_cpu_udelay=" is available to over-ride this delay.
- * Modern processor families are quirked to remove the delay entirely.
+ * Cmdline "cpu_init_udelay=" is available to override this delay.
  */
-#define UDELAY_10MS_DEFAULT 10000
+#define UDELAY_10MS_LEGACY 10000
 
 static unsigned int init_udelay = UINT_MAX;
 
@@ -669,7 +668,7 @@ static int __init cpu_init_udelay(char *str)
 }
 early_param("cpu_init_udelay", cpu_init_udelay);
 
-static void __init smp_quirk_init_udelay(void)
+static void __init smp_set_init_udelay(void)
 {
 	/* if cmdline changed it from default, leave it alone */
 	if (init_udelay != UINT_MAX)
@@ -683,7 +682,7 @@ static void __init smp_quirk_init_udelay(void)
 		return;
 	}
 	/* else, use legacy delay */
-	init_udelay = UDELAY_10MS_DEFAULT;
+	init_udelay = UDELAY_10MS_LEGACY;
 }
 
 /*
@@ -1094,7 +1093,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 
 	uv_system_init();
 
-	smp_quirk_init_udelay();
+	smp_set_init_udelay();
 
 	speculative_store_bypass_ht_init();
 
-- 
2.43.0



* [PATCH v2 02/17] x86/smpboot: Fix INIT delay optimization for extended Intel Families
From: Sohil Mehta @ 2025-02-11 19:43 UTC

Currently, only Family 6 is considered modern and avoids the 10 msec
INIT delay. The optimization doesn't extend to the upcoming Family 18/19
models.

Also, omitting Family 15 (Pentium 4s) seems like an oversight; it should
probably be covered by the modern check as well.

Choose a simpler check and extend the optimization to all Intel
processors Family 6 and beyond.
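
The resulting selection logic can be sketched in user space as follows. This
is only an illustration: the sketch_vendor enum, pick_init_udelay() and the
SKETCH_* constants are hypothetical names, and the Family/model packing
(family << 8 | model) is an assumption about how the VFM comparison orders
CPUs.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical vendor tags; the kernel uses X86_VENDOR_* instead. */
enum sketch_vendor { SKETCH_INTEL, SKETCH_AMD, SKETCH_HYGON, SKETCH_OTHER };

#define SKETCH_UDELAY_10MS_LEGACY 10000u
/* Assumed packing of Family and model: Pentium Pro is Family 6, model 1. */
#define SKETCH_INTEL_PENTIUM_PRO ((6u << 8) | 0x01u)

/*
 * cmdline_udelay is UINT_MAX when "cpu_init_udelay=" was not given,
 * mirroring the init_udelay default in smpboot.c.
 */
static unsigned int pick_init_udelay(unsigned int cmdline_udelay,
				     enum sketch_vendor vendor,
				     unsigned int family, unsigned int model)
{
	uint32_t fm = ((uint32_t)family << 8) | model;

	/* If the command line changed it from the default, leave it alone. */
	if (cmdline_udelay != UINT_MAX)
		return cmdline_udelay;

	/* Modern processors need no delay. */
	if ((vendor == SKETCH_INTEL && fm >= SKETCH_INTEL_PENTIUM_PRO) ||
	    (vendor == SKETCH_HYGON && family >= 0x18) ||
	    (vendor == SKETCH_AMD && family >= 0xF))
		return 0;

	/* Everything else keeps the legacy 10 msec delay. */
	return SKETCH_UDELAY_10MS_LEGACY;
}
```

With this shape, an Intel Family 19 part and a Pentium 4 both get no delay,
while a Family 5 Pentium keeps the legacy value.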

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: Make the changelog more precise

---
 arch/x86/kernel/smpboot.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index eb91ed0f2a06..871c61df4edb 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -675,9 +675,9 @@ static void __init smp_set_init_udelay(void)
 		return;
 
 	/* if modern processor, use no delay */
-	if (((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) && (boot_cpu_data.x86 == 6)) ||
-	    ((boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) && (boot_cpu_data.x86 >= 0x18)) ||
-	    ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) && (boot_cpu_data.x86 >= 0xF))) {
+	if ((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && boot_cpu_data.x86_vfm >= INTEL_PENTIUM_PRO) ||
+	    (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON && boot_cpu_data.x86 >= 0x18) ||
+	    (boot_cpu_data.x86_vendor == X86_VENDOR_AMD && boot_cpu_data.x86 >= 0xF)) {
 		init_udelay = 0;
 		return;
 	}
-- 
2.43.0



* [PATCH v2 03/17] x86/apic: Fix 32-bit APIC initialization for extended Intel Families
From: Sohil Mehta @ 2025-02-11 19:43 UTC

APIC detection is currently limited to a few specific Families and will
not match the upcoming Families >=18.

Extend the check to include all Families 6 or greater. Also convert it
to a VFM check to make it simpler.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
---

v2: Update commit message to make it more precise

---
 arch/x86/kernel/apic/apic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index e893dc6f11c1..4d99bd65faf5 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2014,8 +2014,8 @@ static bool __init detect_init_APIC(void)
 	case X86_VENDOR_HYGON:
 		break;
 	case X86_VENDOR_INTEL:
-		if (boot_cpu_data.x86 == 6 || boot_cpu_data.x86 == 15 ||
-		    (boot_cpu_data.x86 == 5 && boot_cpu_has(X86_FEATURE_APIC)))
+		if ((boot_cpu_data.x86 == 5 && boot_cpu_has(X86_FEATURE_APIC)) ||
+		    boot_cpu_data.x86_vfm >= INTEL_PENTIUM_PRO)
 			break;
 		goto no_apic;
 	default:
-- 
2.43.0



* [PATCH v2 04/17] x86/cpu/intel: Fix the movsl alignment preference for extended Families
From: Sohil Mehta @ 2025-02-11 19:43 UTC

The alignment preference for 32-bit movsl based bulk memory move has
been 8-byte for a long time. However, this preference is only set for
Family 6 and 15 processors.

Extend the preference to upcoming Family numbers 18 and 19 to maintain
legacy behavior. Also, use a VFM based check instead of switching based
on Family numbers. Refresh the comment to reflect the new check.
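
As a rough illustration of the behavior change, the old switch and the new
check can be compared side by side in user space. The function names and
SKETCH_INTEL_PENTIUM_PRO are made up for this sketch, the family << 8 | model
packing is an assumption, and a return value of 0 stands for "mask left at its
default".

```c
#include <stdint.h>

/* Assumed packing of Family and model: Pentium Pro is Family 6, model 1. */
#define SKETCH_INTEL_PENTIUM_PRO ((6u << 8) | 0x01u)

/* Old behavior: the mask is only set for Family 6 and Family 15. */
static unsigned int movsl_mask_old(unsigned int family)
{
	switch (family) {
	case 6:		/* PII/PIII only like movsl with 8-byte alignment */
	case 15:	/* P4 is OK down to 8-byte alignment */
		return 7;
	default:	/* 486, old Pentia and everything else: untouched */
		return 0;
	}
}

/* New behavior: one ordered check covering Pentium Pro and newer. */
static unsigned int movsl_mask_new(unsigned int family, unsigned int model)
{
	uint32_t fm = ((uint32_t)family << 8) | model;

	return fm >= SKETCH_INTEL_PENTIUM_PRO ? 7 : 0;
}
```

In this sketch Family 4, 5, 6 and 15 behave as before, while Family 18 and 19
now pick up the 8-byte preference as well.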

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: Split the patch into two parts. Update commit message.

---
 arch/x86/kernel/cpu/intel.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 3dce22f00dc3..e5f34a90963e 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -449,23 +449,16 @@ static void intel_workarounds(struct cpuinfo_x86 *c)
 	    (c->x86_stepping < 0x6 || c->x86_stepping == 0xb))
 		set_cpu_bug(c, X86_BUG_11AP);
 
-
 #ifdef CONFIG_X86_INTEL_USERCOPY
 	/*
-	 * Set up the preferred alignment for movsl bulk memory moves
+	 * movsl bulk memory moves can be slow when source and dest are not
+	 * both 8-byte aligned. PII/PIII only like movsl with 8-byte alignment.
+	 *
+	 * Set the preferred alignment for Pentium Pro and newer processors, as
+	 * it has only been tested on these.
 	 */
-	switch (c->x86) {
-	case 4:		/* 486: untested */
-		break;
-	case 5:		/* Old Pentia: untested */
-		break;
-	case 6:		/* PII/PIII only like movsl with 8-byte alignment */
+	if (c->x86_vfm >= INTEL_PENTIUM_PRO)
 		movsl_mask.mask = 7;
-		break;
-	case 15:	/* P4 is OK down to 8-byte alignment */
-		movsl_mask.mask = 7;
-		break;
-	}
 #endif
 
 	intel_smp_check(c);
-- 
2.43.0



* [PATCH v2 05/17] x86/cpu/intel: Fix page copy performance for extended Families
From: Sohil Mehta @ 2025-02-11 19:43 UTC

X86_FEATURE_REP_GOOD is a Linux-defined feature flag to track whether
fast string operations should be used for copy_page(). It is also used
as a backup alternative for clear_page() if enhanced fast string
operations (ERMS) are not available.

Currently, the flag is only set for Family 6 processors. Extend the
check to include upcoming processors in Family 18 and 19.

It is uncertain whether X86_FEATURE_REP_GOOD should be set for Family 15
(Pentium 4) as well. Commit 185f3b9da24c ("x86: make intel.c have 64-bit
support code") that originally set the flag also set the
x86_cache_alignment preference for Family 15 processors in the same
commit. The omission of Family 15 may have been intentional.

Also, move the check before a related check in early_init_intel() to
avoid resetting the flag.
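
The ordering matters because the later check clears the flag when fast strings
are disabled. A simplified user-space sketch of the intended sequence is shown
here; the capability bits, SKETCH_MISC_ENABLE_FAST_STRING and early_caps() are
illustrative stand-ins, and the real code has additional conditions around the
MSR read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative capability bits, not the kernel's feature numbering. */
#define SKETCH_CAP_REP_GOOD		(1u << 0)
#define SKETCH_CAP_ERMS			(1u << 1)
/* Stand-in for the FAST_STRING enable bit in IA32_MISC_ENABLE. */
#define SKETCH_MISC_ENABLE_FAST_STRING	(1ull << 0)

static uint32_t early_caps(uint32_t caps, unsigned int family, bool is_64bit,
			   uint64_t misc_enable)
{
	/* Step 1: assume a sane fast string implementation on modern CPUs. */
	if (is_64bit && (family == 6 || family > 15))
		caps |= SKETCH_CAP_REP_GOOD;

	/* Step 2: honor a BIOS that disabled fast strings. */
	if (!(misc_enable & SKETCH_MISC_ENABLE_FAST_STRING))
		caps &= ~(SKETCH_CAP_REP_GOOD | SKETCH_CAP_ERMS);

	return caps;
}
```

If the two steps ran in the opposite order, step 1 would set the flag again
after step 2 had cleared it.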

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: Separate out the REP_GOOD (copy page) specific change into a
separate commit.

From the archives, it wasn't exactly clear why the set_cpu_cap() and
clear_cpu_cap() calls for X86_FEATURE_REP_GOOD are in distinct
locations. It is also unclear why 32-bit and 64-bit differ. Any insight
there would be useful. For now, I have kept the change minimal based on
my limited understanding.

---
 arch/x86/kernel/cpu/intel.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index e5f34a90963e..4f8b02cbe8c5 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -297,6 +297,14 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 	    c->x86_vfm <= INTEL_CORE_YONAH)
 		clear_cpu_cap(c, X86_FEATURE_PAT);
 
+	/*
+	 * Modern CPUs are generally expected to have a sane fast string
+	 * implementation. However, the BIOS may disable it on certain CPUs
+	 * via the architectural FAST_STRING bit.
+	 */
+	if (IS_ENABLED(CONFIG_X86_64) && (c->x86 == 6 || c->x86 > 15))
+		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
+
 	/*
 	 * If fast string is not enabled in IA32_MISC_ENABLE for any reason,
 	 * clear the fast string and enhanced fast string CPU capabilities.
@@ -556,8 +564,6 @@ static void init_intel(struct cpuinfo_x86 *c)
 #ifdef CONFIG_X86_64
 	if (c->x86 == 15)
 		c->x86_cache_alignment = c->x86_clflush_size * 2;
-	if (c->x86 == 6)
-		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
 #else
 	/*
 	 * Names for the Pentium II/Celeron processors
-- 
2.43.0



* [PATCH v2 06/17] cpufreq: Fix the efficient idle check for Intel extended Families
From: Sohil Mehta @ 2025-02-11 19:43 UTC

IO time is considered busy by default for modern Intel processors.
However, the check doesn't include the upcoming Family 18 and 19
processors. Also, Arjan van de Ven says the current nature of the check
was mainly due to lack of testing on old systems. He suggests
considering all Intel processors as having efficient idle.

Extend the IO busy classification to all Intel processors starting with
Family 6.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: Improve commit message and code comments.

---
 drivers/cpufreq/cpufreq_ondemand.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index a7c38b8b3e78..b13f197707f4 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -15,6 +15,10 @@
 #include <linux/tick.h>
 #include <linux/sched/cpufreq.h>
 
+#ifdef CONFIG_X86
+#include <asm/cpu_device_id.h>
+#endif
+
 #include "cpufreq_ondemand.h"
 
 /* On-demand governor macros */
@@ -32,21 +36,20 @@ static unsigned int default_powersave_bias;
 /*
  * Not all CPUs want IO time to be accounted as busy; this depends on how
  * efficient idling at a higher frequency/voltage is.
- * Pavel Machek says this is not so for various generations of AMD and old
- * Intel systems.
+ * Pavel Machek says this is not so for various generations of AMD.
  * Mike Chan (android.com) claims this is also not true for ARM.
- * Because of this, whitelist specific known (series) of CPUs by default, and
+ * Because of this, select known series of CPUs by default, and
  * leave all others up to the user.
  */
 static int should_io_be_busy(void)
 {
 #if defined(CONFIG_X86)
 	/*
-	 * For Intel, Core 2 (model 15) and later have an efficient idle.
+	 * Starting with Family 6 consider all Intel CPUs to have an
+	 * efficient idle.
 	 */
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
-			boot_cpu_data.x86 == 6 &&
-			boot_cpu_data.x86_model >= 15)
+	    boot_cpu_data.x86_vfm >= INTEL_PENTIUM_PRO)
 		return 1;
 #endif
 	return 0;
-- 
2.43.0



* [PATCH v2 07/17] hwmon: Fix Intel Family-model checks to include extended Families
From: Sohil Mehta @ 2025-02-11 19:43 UTC

The current Intel Family-model checks in the coretemp driver seem to
implicitly assume Family 6. Extend the checks to include the extended
Family numbers 18 and 19 as well.

Also, add explicit checks for Family 6 in places where it is assumed
implicitly.
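
For illustration, the effect on the TjMax capability check can be sketched in
user space as below. The helper names are made up for this sketch; the model
list is the one used by the driver's cpu_has_tjmax().

```c
#include <stdbool.h>
#include <stdint.h>

/* The Family 6 model list, as used by the driver's model-only check. */
static bool model_has_tjmax(uint8_t model)
{
	return model > 0xe &&
	       model != 0x1c &&
	       model != 0x26 &&
	       model != 0x27 &&
	       model != 0x35 &&
	       model != 0x36;
}

/* Old behavior: the Family is ignored, so Family 6 is assumed. */
static bool has_tjmax_old(uint8_t family, uint8_t model)
{
	(void)family;
	return model_has_tjmax(model);
}

/* New behavior: extended Families always qualify, Family 6 uses the list. */
static bool has_tjmax_new(uint8_t family, uint8_t model)
{
	return family > 15 || (family == 6 && model_has_tjmax(model));
}
```

A Family 19 CPU with a low model number such as 0x01 fails the old check only
because of its model number, and passes the new one.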

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>

---

v2: No change. Pickup Ack from Guenter Roeck.

---
 drivers/hwmon/coretemp.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 1b9203b20d70..1aa67a2b5f18 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -185,6 +185,13 @@ static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev)
 			return tjmax_table[i].tjmax;
 	}
 
+	/*
+	 * Return without adjustment if the Family isn't 6.
+	 * The rest of the function assumes Family 6.
+	 */
+	if (c->x86 != 6)
+		return tjmax;
+
 	for (i = 0; i < ARRAY_SIZE(tjmax_model_table); i++) {
 		const struct tjmax_model *tm = &tjmax_model_table[i];
 		if (c->x86_model == tm->model &&
@@ -260,14 +267,17 @@ static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev)
 
 static bool cpu_has_tjmax(struct cpuinfo_x86 *c)
 {
+	u8 family = c->x86;
 	u8 model = c->x86_model;
 
-	return model > 0xe &&
-	       model != 0x1c &&
-	       model != 0x26 &&
-	       model != 0x27 &&
-	       model != 0x35 &&
-	       model != 0x36;
+	return family > 15 ||
+	       (family == 6 &&
+		model > 0xe &&
+		model != 0x1c &&
+		model != 0x26 &&
+		model != 0x27 &&
+		model != 0x35 &&
+		model != 0x36);
 }
 
 static int get_tjmax(struct temp_data *tdata, struct device *dev)
@@ -460,7 +470,7 @@ static int chk_ucode_version(unsigned int cpu)
 	 * Readings might stop update when processor visited too deep sleep,
 	 * fixed for stepping D0 (6EC).
 	 */
-	if (c->x86_model == 0xe && c->x86_stepping < 0xc && c->microcode < 0x39) {
+	if (c->x86 == 6 && c->x86_model == 0xe && c->x86_stepping < 0xc && c->microcode < 0x39) {
 		pr_err("Errata AE18 not fixed, update BIOS or microcode of the CPU!\n");
 		return -ENODEV;
 	}
@@ -580,7 +590,7 @@ static int create_core_data(struct platform_device *pdev, unsigned int cpu,
 	 * MSR_IA32_TEMPERATURE_TARGET register. Atoms don't have the register
 	 * at all.
 	 */
-	if (c->x86_model > 0xe && c->x86_model != 0x1c)
+	if (c->x86 > 15 || (c->x86 == 6 && c->x86_model > 0xe && c->x86_model != 0x1c))
 		if (get_ttarget(tdata, &pdev->dev) >= 0)
 			tdata->attr_size++;
 
-- 
2.43.0



* [PATCH v2 08/17] x86/microcode: Update the Intel processor flag scan check
From: Sohil Mehta @ 2025-02-11 19:43 UTC

The Family model check to read the processor flag MSR is misleading and
potentially incorrect. It doesn't consider Family while comparing the
model number. The original check did have a Family number but it got
lost/moved during refactoring.

intel_collect_cpu_info() is called through multiple paths such as early
initialization, CPU hotplug as well as IFS image load. Some of these
flows would be error prone due to the ambiguous check.

Correct the processor flag scan check to use a Family number and update
it to a VFM based one to make it more readable.
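
A user-space sketch of why the old check is ambiguous is shown here. It decodes
Family and model from a CPUID leaf 1 EAX signature the way the kernel's
x86_family() and x86_model() helpers are understood to do it, and packs them
as family << 8 | model, which is an assumption about the VFM ordering. All
names in the sketch are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Family: base Family in bits 11:8, plus extended Family (27:20) if 0xf. */
static unsigned int sig_family(uint32_t sig)
{
	unsigned int fam = (sig >> 8) & 0xf;

	if (fam == 0xf)
		fam += (sig >> 20) & 0xff;
	return fam;
}

/* Model: base model in bits 7:4, plus extended model (19:16) shifted up. */
static unsigned int sig_model(uint32_t sig)
{
	unsigned int model = (sig >> 4) & 0xf;

	if (sig_family(sig) >= 0x6)
		model += ((sig >> 16) & 0xf) << 4;
	return model;
}

#define SKETCH_FM(family, model) \
	(((uint32_t)(family) << 8) | (uint32_t)(model))

/* Old check: the model is compared without looking at the Family. */
static bool scan_pf_old(uint32_t sig)
{
	return sig_model(sig) >= 5 || sig_family(sig) > 6;
}

/* New check: ordered comparison against Family 6, model 5. */
static bool scan_pf_new(uint32_t sig)
{
	return SKETCH_FM(sig_family(sig), sig_model(sig)) >= SKETCH_FM(6, 0x05);
}
```

For example, a Family 5, model 8 signature such as 0x0581 satisfies the old
check purely on its model number, while the new check rejects it. A Family 19
signature such as 0x00400F10 satisfies both.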

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: Use a VFM check instead of hardcoded numbers.

I evaluated whether CPUID can be avoided in intel_collect_cpu_info(). But
the answer seems a bit more complex than I expected.

* On the BSP, intel_collect_cpu_info() can be called very early
  via load_ucode_bsp() even before cpu_data[] has been populated.

* In the hotplug path, based on section II.c. of
  Documentation/power/suspend-and-cpuhotplug.rst rescanning of FMS
  during ucode load might be intentional.

Maybe this can be resolved by updating the Intel ucode load flows to
pass the CPU information or the cpuid_eax information around. But it is
beyond the scope of this series. Also, I am not sure whether the
effort/risk would be worth saving a single cpuid() call in an uncommon
path. If this is desired, I can work on it in a separate patch.

---
 arch/x86/include/asm/intel-family.h   | 1 +
 arch/x86/kernel/cpu/microcode/intel.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index 6d7b04ffc5fd..cccc932d761e 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -46,6 +46,7 @@
 #define INTEL_ANY			IFM(X86_FAMILY_ANY, X86_MODEL_ANY)
 
 #define INTEL_PENTIUM_PRO		IFM(6, 0x01)
+#define INTEL_PENTIUM_III_DESCHUTES	IFM(6, 0x05)
 
 #define INTEL_CORE_YONAH		IFM(6, 0x0E)
 
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index f3d534807d91..819199bc0119 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -74,7 +74,7 @@ void intel_collect_cpu_info(struct cpu_signature *sig)
 	sig->pf = 0;
 	sig->rev = intel_get_microcode_revision();
 
-	if (x86_model(sig->sig) >= 5 || x86_family(sig->sig) > 6) {
+	if (IFM(x86_family(sig->sig), x86_model(sig->sig)) >= INTEL_PENTIUM_III_DESCHUTES) {
 		unsigned int val[2];
 
 		/* get processor flags from MSR 0x17 */
-- 
2.43.0



* [PATCH v2 09/17] x86/mtrr: Modify a x86_model check to an Intel VFM check
From: Sohil Mehta @ 2025-02-11 19:43 UTC

Simplify one of the last few Intel x86_model checks in arch/x86 by
substituting it with a VFM one.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: No change.
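
For readers new to VFM, here is a minimal user-space sketch of why a
single x86_vfm comparison can replace separate family and model tests.
The struct, the SKETCH_* names and the bit layout (family in bits 15:8,
model in bits 7:0, Intel vendor ID taken as 0) are assumptions made for
illustration. They are not copied from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, illustration only: family in 15:8, model in 7:0. */
#define SKETCH_IFM(fam, model)	(((uint32_t)(fam) << 8) | (uint32_t)(model))
#define SKETCH_PENTIUM_PRO	SKETCH_IFM(6, 0x01)

/* Hypothetical stand-in for the few cpuinfo_x86 fields used here. */
struct sketch_cpu {
	uint8_t  family;
	uint8_t  model;
	uint8_t  stepping;
	uint32_t vfm;
};

struct sketch_cpu sketch_cpu_make(unsigned int family, unsigned int model,
				  unsigned int stepping)
{
	struct sketch_cpu c;

	/* Family and model are assumed to fit 8 bits, stepping 4 bits. */
	assert(family <= 0xFF && model <= 0xFF && stepping <= 0xF);
	c.family = (uint8_t)family;
	c.model = (uint8_t)model;
	c.stepping = (uint8_t)stepping;
	c.vfm = SKETCH_IFM(family, model);
	return c;
}

/* Old style: family and model are tested separately. */
bool ppro_check_old(const struct sketch_cpu *c)
{
	return c->family == 6 && c->model == 1 && c->stepping <= 7;
}

/* New style: one compare covers family and model at once. */
bool ppro_check_vfm(const struct sketch_cpu *c)
{
	return c->vfm == SKETCH_PENTIUM_PRO && c->stepping <= 7;
}
```

Both helpers should agree for any CPU built with sketch_cpu_make(), for
example a Family 6 model 1 part at stepping 7 (match) or 8 (no match).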

---
 arch/x86/kernel/cpu/mtrr/generic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 2fdfda2b60e4..826b8cff33cf 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -12,6 +12,7 @@
 #include <asm/processor-flags.h>
 #include <asm/cacheinfo.h>
 #include <asm/cpufeature.h>
+#include <asm/cpu_device_id.h>
 #include <asm/hypervisor.h>
 #include <asm/mshyperv.h>
 #include <asm/tlbflush.h>
@@ -1025,8 +1026,7 @@ int generic_validate_add_page(unsigned long base, unsigned long size,
 	 * For Intel PPro stepping <= 7
 	 * must be 4 MiB aligned and not touch 0x70000000 -> 0x7003FFFF
 	 */
-	if (mtrr_if == &generic_mtrr_ops && boot_cpu_data.x86 == 6 &&
-	    boot_cpu_data.x86_model == 1 &&
+	if (mtrr_if == &generic_mtrr_ops && boot_cpu_data.x86_vfm == INTEL_PENTIUM_PRO &&
 	    boot_cpu_data.x86_stepping <= 7) {
 		if (base & ((1 << (22 - PAGE_SHIFT)) - 1)) {
 			pr_warn("mtrr: base(0x%lx000) is not 4 MiB aligned\n", base);
-- 
2.43.0



* [PATCH v2 10/17] x86/cpu/intel: Replace early Family 6 checks with VFM ones
  2025-02-11 19:43 [PATCH v2 00/17] Prepare for new Intel Family numbers Sohil Mehta
                   ` (8 preceding siblings ...)
  2025-02-11 19:43 ` [PATCH v2 09/17] x86/mtrr: Modify a x86_model check to an Intel VFM check Sohil Mehta
@ 2025-02-11 19:44 ` Sohil Mehta
  2025-02-11 21:03   ` Dave Hansen
  2025-02-11 19:44 ` [PATCH v2 11/17] x86/cpu/intel: Replace Family 15 " Sohil Mehta
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 19:44 UTC (permalink / raw)
  To: x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	Sohil Mehta, linux-perf-users, linux-kernel, linux-acpi, linux-pm,
	linux-hwmon

Introduce names for some old Pentium models and replace the x86_model
checks with VFM ones.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: No change
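
The least obvious conversion in this patch is the SEP check, where the
old code packs family, model and stepping into one 0xFMS key. The
sketch below is a user-space approximation that compares the packed
form against the VFM form over all 4-bit family, model and stepping
values. The SKETCH_* macro and the function names are hypothetical, and
the bit layout (family in bits 15:8, model in bits 7:0) is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IFM(fam, model)		(((uint32_t)(fam) << 8) | (uint32_t)(model))
#define SKETCH_PENTIUM_II_KLAMATH	SKETCH_IFM(6, 0x03)

/* Old form: family, model and stepping packed into one 0xFMS key. */
bool sep_bug_old(unsigned int fam, unsigned int model, unsigned int step)
{
	/* The packed key is only well formed for nibble-sized fields. */
	assert(model <= 0xF && step <= 0xF);
	return (fam << 8 | model << 4 | step) < 0x633;
}

/* New form: compare the VFM first, then the stepping on Klamath only. */
bool sep_bug_vfm(unsigned int fam, unsigned int model, unsigned int step)
{
	uint32_t vfm = SKETCH_IFM(fam, model);

	return (vfm == SKETCH_PENTIUM_II_KLAMATH && step < 3) ||
	       vfm < SKETCH_PENTIUM_II_KLAMATH;
}

/* Count disagreements over all 4-bit family/model/stepping values. */
unsigned int sep_mismatches(void)
{
	unsigned int fam, model, step, bad = 0;

	for (fam = 0; fam <= 0xF; fam++)
		for (model = 0; model <= 0xF; model++)
			for (step = 0; step <= 0xF; step++)
				if (sep_bug_old(fam, model, step) !=
				    sep_bug_vfm(fam, model, step))
					bad++;
	return bad;
}
```

Within that range sep_mismatches() is expected to return 0. With 8-bit
model numbers the packed key would spill into the family nibble, which
is one reason the VFM form is easier to audit.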

---
 arch/x86/include/asm/intel-family.h |  4 ++++
 arch/x86/kernel/cpu/intel.c         | 13 ++++++-------
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index cccc932d761e..c1a081585fcb 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -45,8 +45,12 @@
 /* Wildcard match so X86_MATCH_VFM(ANY) works */
 #define INTEL_ANY			IFM(X86_FAMILY_ANY, X86_MODEL_ANY)
 
+/* Family 6 */
 #define INTEL_PENTIUM_PRO		IFM(6, 0x01)
+#define INTEL_PENTIUM_II_KLAMATH	IFM(6, 0x03)
 #define INTEL_PENTIUM_III_DESCHUTES	IFM(6, 0x05)
+#define INTEL_PENTIUM_III_TUALATIN	IFM(6, 0x0B)
+#define INTEL_PENTIUM_M_DOTHAN		IFM(6, 0x0D)
 
 #define INTEL_CORE_YONAH		IFM(6, 0x0E)
 
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 4f8b02cbe8c5..1b6e077a037a 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -195,7 +195,7 @@ void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c)
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
 		return;
 
-	if (c->x86 < 6 || (c->x86 == 6 && c->x86_model < 0xd))
+	if (c->x86_vfm < INTEL_PENTIUM_M_DOTHAN)
 		return;
 
 	/*
@@ -309,7 +309,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 	 * If fast string is not enabled in IA32_MISC_ENABLE for any reason,
 	 * clear the fast string and enhanced fast string CPU capabilities.
 	 */
-	if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
+	if (c->x86_vfm >= INTEL_PENTIUM_M_DOTHAN) {
 		rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
 		if (!(misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING)) {
 			pr_info("Disabled fast string operations\n");
@@ -358,9 +358,7 @@ static void bsp_init_intel(struct cpuinfo_x86 *c)
 int ppro_with_ram_bug(void)
 {
 	/* Uses data from early_cpu_detect now */
-	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
-	    boot_cpu_data.x86 == 6 &&
-	    boot_cpu_data.x86_model == 1 &&
+	if (boot_cpu_data.x86_vfm == INTEL_PENTIUM_PRO &&
 	    boot_cpu_data.x86_stepping < 8) {
 		pr_info("Pentium Pro with Errata#50 detected. Taking evasive action.\n");
 		return 1;
@@ -421,7 +419,8 @@ static void intel_workarounds(struct cpuinfo_x86 *c)
 	 * SEP CPUID bug: Pentium Pro reports SEP but doesn't have it until
 	 * model 3 mask 3
 	 */
-	if ((c->x86<<8 | c->x86_model<<4 | c->x86_stepping) < 0x633)
+	if ((c->x86_vfm == INTEL_PENTIUM_II_KLAMATH && c->x86_stepping < 3) ||
+	    c->x86_vfm < INTEL_PENTIUM_II_KLAMATH)
 		clear_cpu_cap(c, X86_FEATURE_SEP);
 
 	/*
@@ -621,7 +620,7 @@ static unsigned int intel_size_cache(struct cpuinfo_x86 *c, unsigned int size)
 	 * to determine which, so we use a boottime override
 	 * for the 512kb model, and assume 256 otherwise.
 	 */
-	if ((c->x86 == 6) && (c->x86_model == 11) && (size == 0))
+	if (c->x86_vfm == INTEL_PENTIUM_III_TUALATIN && size == 0)
 		size = 256;
 
 	/*
-- 
2.43.0



* [PATCH v2 11/17] x86/cpu/intel: Replace Family 15 checks with VFM ones
  2025-02-11 19:43 [PATCH v2 00/17] Prepare for new Intel Family numbers Sohil Mehta
                   ` (9 preceding siblings ...)
  2025-02-11 19:44 ` [PATCH v2 10/17] x86/cpu/intel: Replace early Family 6 checks with VFM ones Sohil Mehta
@ 2025-02-11 19:44 ` Sohil Mehta
  2025-02-11 21:03   ` Dave Hansen
  2025-02-11 19:44 ` [PATCH v2 12/17] x86/cpu/intel: Replace Family 5 model " Sohil Mehta
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 19:44 UTC (permalink / raw)
  To: x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	Sohil Mehta, linux-perf-users, linux-kernel, linux-acpi, linux-pm,
	linux-hwmon

Introduce names for some old Pentium 4 models and replace the x86_model
checks with VFM ones.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: No change.

---
 arch/x86/include/asm/intel-family.h | 4 ++++
 arch/x86/kernel/cpu/intel.c         | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index c1a081585fcb..f509061b8c7e 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -184,6 +184,10 @@
 /* Family 5 */
 #define INTEL_QUARK_X1000		IFM(5, 0x09) /* Quark X1000 SoC */
 
+/* Family 15 - NetBurst */
+#define INTEL_P4_WILLAMETTE		IFM(15, 0x01) /* Also Xeon Foster */
+#define INTEL_P4_PRESCOTT		IFM(15, 0x03)
+
 /* Family 19 */
 #define INTEL_PANTHERCOVE_X		IFM(19, 0x01) /* Diamond Rapids */
 
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 1b6e077a037a..507cb4c6d587 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -256,8 +256,8 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 #endif
 
 	/* CPUID workaround for 0F33/0F34 CPU */
-	if (c->x86 == 0xF && c->x86_model == 0x3
-	    && (c->x86_stepping == 0x3 || c->x86_stepping == 0x4))
+	if (c->x86_vfm == INTEL_P4_PRESCOTT &&
+	    (c->x86_stepping == 0x3 || c->x86_stepping == 0x4))
 		c->x86_phys_bits = 36;
 
 	/*
@@ -438,7 +438,7 @@ static void intel_workarounds(struct cpuinfo_x86 *c)
 	 * P4 Xeon erratum 037 workaround.
 	 * Hardware prefetcher may cause stale data to be loaded into the cache.
 	 */
-	if ((c->x86 == 15) && (c->x86_model == 1) && (c->x86_stepping == 1)) {
+	if (c->x86_vfm == INTEL_P4_WILLAMETTE && c->x86_stepping == 1) {
 		if (msr_set_bit(MSR_IA32_MISC_ENABLE,
 				MSR_IA32_MISC_ENABLE_PREFETCH_DISABLE_BIT) > 0) {
 			pr_info("CPU: C0 stepping P4 Xeon detected.\n");
-- 
2.43.0



* [PATCH v2 12/17] x86/cpu/intel: Replace Family 5 model checks with VFM ones
  2025-02-11 19:43 [PATCH v2 00/17] Prepare for new Intel Family numbers Sohil Mehta
                   ` (10 preceding siblings ...)
  2025-02-11 19:44 ` [PATCH v2 11/17] x86/cpu/intel: Replace Family 15 " Sohil Mehta
@ 2025-02-11 19:44 ` Sohil Mehta
  2025-02-11 21:06   ` Dave Hansen
  2025-02-11 19:44 ` [PATCH v2 13/17] x86/pat: Replace Intel x86_model " Sohil Mehta
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 19:44 UTC (permalink / raw)
  To: x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	Sohil Mehta, linux-perf-users, linux-kernel, linux-acpi, linux-pm,
	linux-hwmon

Introduce names for some Family 5 models and convert some of the checks
to be VFM based.

Also, to keep the file sorted by family, move Family 5 to the top of the
header file.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: Reorder the Family 5 models to be at the top of the file.
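
A rough user-space sketch of how the INTEL_FAM5_START marker turns a
"family == 5 && model < N" test into a half-open VFM range. The
SKETCH_* macros and helper names are made up for this note, and the bit
layout (family in bits 15:8, model in bits 7:0) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IFM(fam, model)	(((uint32_t)(fam) << 8) | (uint32_t)(model))
#define SKETCH_FAM5_START	SKETCH_IFM(5, 0x00)
#define SKETCH_PENTIUM_MMX	SKETCH_IFM(5, 0x04)
#define SKETCH_QUARK_X1000	SKETCH_IFM(5, 0x09)

/* Build a VFM key; both fields must fit the assumed 8-bit width. */
uint32_t sketch_vfm(unsigned int fam, unsigned int model)
{
	assert(fam <= 0xFF && model <= 0xFF);
	return SKETCH_IFM(fam, model);
}

/* F00F bug: old family/model test versus the VFM range. */
bool f00f_old(unsigned int fam, unsigned int model)
{
	return fam == 5 && model < 9;
}

bool f00f_vfm(unsigned int fam, unsigned int model)
{
	uint32_t vfm = sketch_vfm(fam, model);

	return vfm >= SKETCH_FAM5_START && vfm < SKETCH_QUARK_X1000;
}

/* "Mask B" Pentium: old test versus the VFM range. */
bool mask_b_old(unsigned int fam, unsigned int model, unsigned int step)
{
	return fam == 5 && step >= 1 && step <= 4 && model <= 3;
}

bool mask_b_vfm(unsigned int fam, unsigned int model, unsigned int step)
{
	uint32_t vfm = sketch_vfm(fam, model);

	return vfm >= SKETCH_FAM5_START && vfm < SKETCH_PENTIUM_MMX &&
	       step >= 1 && step <= 4;
}

/* Count disagreements over families 0..31, 8-bit models, 4-bit steppings. */
unsigned int fam5_mismatches(void)
{
	unsigned int fam, model, step, bad = 0;

	for (fam = 0; fam <= 0x1F; fam++)
		for (model = 0; model <= 0xFF; model++) {
			if (f00f_old(fam, model) != f00f_vfm(fam, model))
				bad++;
			for (step = 0; step <= 0xF; step++)
				if (mask_b_old(fam, model, step) !=
				    mask_b_vfm(fam, model, step))
					bad++;
		}
	return bad;
}
```

The lower bound is what keeps Family 4 and older parts out of the
range. Without it, "vfm < INTEL_QUARK_X1000" alone would also match
them.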

---
 arch/x86/include/asm/intel-family.h |  9 ++++++---
 arch/x86/kernel/cpu/intel.c         | 11 +++++------
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index f509061b8c7e..9e6a13f03f0e 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -45,6 +45,12 @@
 /* Wildcard match so X86_MATCH_VFM(ANY) works */
 #define INTEL_ANY			IFM(X86_FAMILY_ANY, X86_MODEL_ANY)
 
+/* Family 5 */
+#define INTEL_FAM5_START		IFM(5, 0x00) /* Notational marker, also P5 A-step */
+#define INTEL_PENTIUM_75		IFM(5, 0x02) /* P54C */
+#define INTEL_PENTIUM_MMX		IFM(5, 0x04) /* P55C */
+#define INTEL_QUARK_X1000		IFM(5, 0x09) /* Quark X1000 SoC */
+
 /* Family 6 */
 #define INTEL_PENTIUM_PRO		IFM(6, 0x01)
 #define INTEL_PENTIUM_II_KLAMATH	IFM(6, 0x03)
@@ -181,9 +187,6 @@
 #define INTEL_XEON_PHI_KNL		IFM(6, 0x57) /* Knights Landing */
 #define INTEL_XEON_PHI_KNM		IFM(6, 0x85) /* Knights Mill */
 
-/* Family 5 */
-#define INTEL_QUARK_X1000		IFM(5, 0x09) /* Quark X1000 SoC */
-
 /* Family 15 - NetBurst */
 #define INTEL_P4_WILLAMETTE		IFM(15, 0x01) /* Also Xeon Foster */
 #define INTEL_P4_PRESCOTT		IFM(15, 0x03)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 507cb4c6d587..1b01ef4dfda2 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -375,9 +375,8 @@ static void intel_smp_check(struct cpuinfo_x86 *c)
 	/*
 	 * Mask B, Pentium, but not Pentium MMX
 	 */
-	if (c->x86 == 5 &&
-	    c->x86_stepping >= 1 && c->x86_stepping <= 4 &&
-	    c->x86_model <= 3) {
+	if (c->x86_vfm >= INTEL_FAM5_START && c->x86_vfm < INTEL_PENTIUM_MMX &&
+	    c->x86_stepping >= 1 && c->x86_stepping <= 4) {
 		/*
 		 * Remember we have B step Pentia with bugs
 		 */
@@ -404,7 +403,7 @@ static void intel_workarounds(struct cpuinfo_x86 *c)
 	 * The Quark is also family 5, but does not have the same bug.
 	 */
 	clear_cpu_bug(c, X86_BUG_F00F);
-	if (c->x86 == 5 && c->x86_model < 9) {
+	if (c->x86_vfm >= INTEL_FAM5_START && c->x86_vfm < INTEL_QUARK_X1000) {
 		static int f00f_workaround_enabled;
 
 		set_cpu_bug(c, X86_BUG_F00F);
@@ -452,7 +451,7 @@ static void intel_workarounds(struct cpuinfo_x86 *c)
 	 * integrated APIC (see 11AP erratum in "Pentium Processor
 	 * Specification Update").
 	 */
-	if (boot_cpu_has(X86_FEATURE_APIC) && (c->x86<<8 | c->x86_model<<4) == 0x520 &&
+	if (boot_cpu_has(X86_FEATURE_APIC) && c->x86_vfm == INTEL_PENTIUM_75 &&
 	    (c->x86_stepping < 0x6 || c->x86_stepping == 0xb))
 		set_cpu_bug(c, X86_BUG_11AP);
 
@@ -627,7 +626,7 @@ static unsigned int intel_size_cache(struct cpuinfo_x86 *c, unsigned int size)
 	 * Intel Quark SoC X1000 contains a 4-way set associative
 	 * 16K cache with a 16 byte cache line and 256 lines per tag
 	 */
-	if ((c->x86 == 5) && (c->x86_model == 9))
+	if (c->x86_vfm == INTEL_QUARK_X1000)
 		size = 16;
 	return size;
 }
-- 
2.43.0



* [PATCH v2 13/17] x86/pat: Replace Intel x86_model checks with VFM ones
  2025-02-11 19:43 [PATCH v2 00/17] Prepare for new Intel Family numbers Sohil Mehta
                   ` (11 preceding siblings ...)
  2025-02-11 19:44 ` [PATCH v2 12/17] x86/cpu/intel: Replace Family 5 model " Sohil Mehta
@ 2025-02-11 19:44 ` Sohil Mehta
  2025-02-11 21:09   ` Dave Hansen
  2025-02-11 19:44 ` [PATCH v2 14/17] x86/acpi/cstate: Improve Intel Family model checks Sohil Mehta
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 19:44 UTC (permalink / raw)
  To: x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	Sohil Mehta, linux-perf-users, linux-kernel, linux-acpi, linux-pm,
	linux-hwmon

Introduce markers and names for some Family 6 and Family 15 models and
replace x86_model checks with VFM ones.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: Get rid of the INTEL_FAM15_START IFM(15, 0x00) define.
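
One boundary detail when turning "model <= N" into a closed VFM range:
the old test has no lower bound on the model, while a range starting at
a named model does. The sketch below is a user-space approximation
(SKETCH_* names are hypothetical, family assumed in bits 15:8 and model
in bits 7:0) that counts the (family, model) pairs on which the two
forms of the PAT errata check disagree. Whether any model 0 parts exist
in these families is outside the scope of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IFM(fam, model)		(((uint32_t)(fam) << 8) | (uint32_t)(model))
#define SKETCH_PENTIUM_PRO		SKETCH_IFM(6, 0x01)
#define SKETCH_PENTIUM_M_DOTHAN		SKETCH_IFM(6, 0x0D)
#define SKETCH_P4_WILLAMETTE		SKETCH_IFM(15, 0x01)
#define SKETCH_P4_CEDARMILL		SKETCH_IFM(15, 0x06)

/* Old form: per-family upper bound on the model, no lower bound. */
bool pat_errata_old(unsigned int fam, unsigned int model)
{
	return (fam == 0x6 && model <= 0xd) ||
	       (fam == 0xf && model <= 0x6);
}

/* New form: two closed VFM ranges that start at named models. */
bool pat_errata_vfm(unsigned int fam, unsigned int model)
{
	uint32_t vfm;

	assert(fam <= 0xFF && model <= 0xFF);
	vfm = SKETCH_IFM(fam, model);
	return (vfm >= SKETCH_PENTIUM_PRO && vfm <= SKETCH_PENTIUM_M_DOTHAN) ||
	       (vfm >= SKETCH_P4_WILLAMETTE && vfm <= SKETCH_P4_CEDARMILL);
}

/* Count (family, model) pairs where the two forms give different answers. */
unsigned int pat_mismatches(void)
{
	unsigned int fam, model, bad = 0;

	for (fam = 0; fam <= 0x1F; fam++)
		for (model = 0; model <= 0xFF; model++)
			if (pat_errata_old(fam, model) !=
			    pat_errata_vfm(fam, model))
				bad++;
	return bad;
}
```

In this stand-in the only differences are model 0 of Family 6 and
model 0 of Family 15, so pat_mismatches() should come out as 2.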

---
 arch/x86/include/asm/intel-family.h | 1 +
 arch/x86/mm/pat/memtype.c           | 7 ++++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index 9e6a13f03f0e..300dac505d7f 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -190,6 +190,7 @@
 /* Family 15 - NetBurst */
 #define INTEL_P4_WILLAMETTE		IFM(15, 0x01) /* Also Xeon Foster */
 #define INTEL_P4_PRESCOTT		IFM(15, 0x03)
+#define INTEL_P4_CEDARMILL		IFM(15, 0x06) /* Also Xeon Dempsey */
 
 /* Family 19 */
 #define INTEL_PANTHERCOVE_X		IFM(19, 0x01) /* Diamond Rapids */
diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index feb8cc6a12bf..25a8ecbad3a2 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -43,6 +43,7 @@
 #include <linux/fs.h>
 #include <linux/rbtree.h>
 
+#include <asm/cpu_device_id.h>
 #include <asm/cacheflush.h>
 #include <asm/cacheinfo.h>
 #include <asm/processor.h>
@@ -290,9 +291,9 @@ void __init pat_bp_init(void)
 		return;
 	}
 
-	if ((c->x86_vendor == X86_VENDOR_INTEL) &&
-	    (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||
-	     ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) {
+	if (c->x86_vendor == X86_VENDOR_INTEL &&
+	    ((c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm <= INTEL_PENTIUM_M_DOTHAN) ||
+	    (c->x86_vfm >= INTEL_P4_WILLAMETTE && c->x86_vfm <= INTEL_P4_CEDARMILL))) {
 		/*
 		 * PAT support with the lower four entries. Intel Pentium 2,
 		 * 3, M, and 4 are affected by PAT errata, which makes the
-- 
2.43.0



* [PATCH v2 14/17] x86/acpi/cstate: Improve Intel Family model checks
  2025-02-11 19:43 [PATCH v2 00/17] Prepare for new Intel Family numbers Sohil Mehta
                   ` (12 preceding siblings ...)
  2025-02-11 19:44 ` [PATCH v2 13/17] x86/pat: Replace Intel x86_model " Sohil Mehta
@ 2025-02-11 19:44 ` Sohil Mehta
  2025-02-11 21:20   ` Dave Hansen
  2025-02-11 19:44 ` [PATCH v2 15/17] x86/cpu/intel: Bound the non-architectural constant_tsc " Sohil Mehta
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 19:44 UTC (permalink / raw)
  To: x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	Sohil Mehta, linux-perf-users, linux-kernel, linux-acpi, linux-pm,
	linux-hwmon

Update the Intel Family checks to consistently use Family 15 instead of
Family 0xF. Also, get rid of one of the last usages of x86_model by
using the new VFM checks.

Update the incorrect comment since the check has changed[1][2] since the
initial commit ee1ca48fae7e ("ACPI: Disable ARB_DISABLE on platforms
where it is not needed").

[1]: commit 3e2ada5867b7 ("ACPI: fix Compaq Evo N800c (Pentium 4m) boot
hang regression") removed the P4 - Family 15.

[2]: commit 03a05ed11529 ("ACPI: Use the ARB_DISABLE for the CPU which
model id is less than 0x0f.") got rid of CORE_YONAH - Family 6, model E.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: Improve commit message.
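
To illustrate why the range needs the INTEL_FAM6_LAST upper bound,
here is a small user-space sketch. The SKETCH_* macros and function
names are hypothetical, the bit layout (family in bits 15:8, model in
bits 7:0) is an assumption, and model 0x0F for Merom is taken from my
reading of intel-family.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IFM(fam, model)	(((uint32_t)(fam) << 8) | (uint32_t)(model))
#define SKETCH_CORE2_MEROM	SKETCH_IFM(6, 0x0F)	/* assumed value */
#define SKETCH_FAM6_LAST	SKETCH_IFM(6, 0xFF)

/* Old form of the ARB_DISABLE check. */
bool bm_nop_old(unsigned int fam, unsigned int model)
{
	return fam > 0xf || (fam == 6 && model >= 0x0f);
}

/* New form: Family 6 range closed off by the FAM6_LAST marker. */
bool bm_nop_vfm(unsigned int fam, unsigned int model)
{
	uint32_t vfm;

	assert(fam <= 0xFF && model <= 0xFF);
	vfm = SKETCH_IFM(fam, model);
	return fam > 15 ||
	       (vfm >= SKETCH_CORE2_MEROM && vfm <= SKETCH_FAM6_LAST);
}

/* What would happen without the upper bound: Family 15 gets matched. */
bool bm_nop_unbounded(unsigned int fam, unsigned int model)
{
	return fam > 15 || SKETCH_IFM(fam, model) >= SKETCH_CORE2_MEROM;
}

/* Count disagreements between old and new over 8-bit models. */
unsigned int bm_mismatches(void)
{
	unsigned int fam, model, bad = 0;

	for (fam = 0; fam <= 0x1F; fam++)
		for (model = 0; model <= 0xFF; model++)
			if (bm_nop_old(fam, model) != bm_nop_vfm(fam, model))
				bad++;
	return bad;
}
```

bm_mismatches() is expected to be 0, while bm_nop_unbounded() would
wrongly include Family 15, which reference [1] in the commit message
deliberately excluded.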

---
 arch/x86/include/asm/intel-family.h | 3 +++
 arch/x86/kernel/acpi/cstate.c       | 8 ++++----
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index 300dac505d7f..fae52a15d9b9 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -187,6 +187,9 @@
 #define INTEL_XEON_PHI_KNL		IFM(6, 0x57) /* Knights Landing */
 #define INTEL_XEON_PHI_KNM		IFM(6, 0x85) /* Knights Mill */
 
+/* Notational marker denoting the last Family 6 model */
+#define INTEL_FAM6_LAST			IFM(6, 0xFF)
+
 /* Family 15 - NetBurst */
 #define INTEL_P4_WILLAMETTE		IFM(15, 0x01) /* Also Xeon Foster */
 #define INTEL_P4_PRESCOTT		IFM(15, 0x03)
diff --git a/arch/x86/kernel/acpi/cstate.c b/arch/x86/kernel/acpi/cstate.c
index 5854f0b8f0f1..444602a0a3dd 100644
--- a/arch/x86/kernel/acpi/cstate.c
+++ b/arch/x86/kernel/acpi/cstate.c
@@ -13,6 +13,7 @@
 #include <linux/sched.h>
 
 #include <acpi/processor.h>
+#include <asm/cpu_device_id.h>
 #include <asm/cpuid.h>
 #include <asm/mwait.h>
 #include <asm/special_insns.h>
@@ -47,12 +48,11 @@ void acpi_processor_power_init_bm_check(struct acpi_processor_flags *flags,
 	/*
 	 * On all recent Intel platforms, ARB_DISABLE is a nop.
 	 * So, set bm_control to zero to indicate that ARB_DISABLE
-	 * is not required while entering C3 type state on
-	 * P4, Core and beyond CPUs
+	 * is not required while entering C3 type state.
 	 */
 	if (c->x86_vendor == X86_VENDOR_INTEL &&
-	    (c->x86 > 0xf || (c->x86 == 6 && c->x86_model >= 0x0f)))
-			flags->bm_control = 0;
+	    (c->x86 > 15 || (c->x86_vfm >= INTEL_CORE2_MEROM && c->x86_vfm <= INTEL_FAM6_LAST)))
+		flags->bm_control = 0;
 
 	if (c->x86_vendor == X86_VENDOR_CENTAUR) {
 		if (c->x86 > 6 || (c->x86 == 6 && c->x86_model == 0x0f &&
-- 
2.43.0



* [PATCH v2 15/17] x86/cpu/intel: Bound the non-architectural constant_tsc model checks
  2025-02-11 19:43 [PATCH v2 00/17] Prepare for new Intel Family numbers Sohil Mehta
                   ` (13 preceding siblings ...)
  2025-02-11 19:44 ` [PATCH v2 14/17] x86/acpi/cstate: Improve Intel Family model checks Sohil Mehta
@ 2025-02-11 19:44 ` Sohil Mehta
  2025-02-11 21:41   ` Dave Hansen
  2025-02-11 19:44 ` [PATCH v2 16/17] perf/x86: Simplify P6 PMU initialization Sohil Mehta
  2025-02-11 19:44 ` [PATCH v2 17/17] perf/x86/p4: Replace Pentium 4 model checks with VFM ones Sohil Mehta
  16 siblings, 1 reply; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 19:44 UTC (permalink / raw)
  To: x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	Sohil Mehta, linux-perf-users, linux-kernel, linux-acpi, linux-pm,
	linux-hwmon

Constant TSC has been architectural on Intel CPUs for a while. Supported
CPUs use the architectural Invariant TSC bit in CPUID.80000007. A
Family-model check is not required for these CPUs.

Prevent unnecessary confusion by restricting the model-specific checks
to the CPUs that need them and moving them closer to the architectural
check.

Invariant TSC was likely introduced around the Nehalem timeframe on the
Xeon side and the Saltwell timeframe on the Atom side. Because the model
numbers are interspersed, extend the non-architectural capability
setting until Ivybridge to be safe.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: No change.
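
A user-space sketch of the bounded check described above, next to the
old open-ended one. Everything here is a stand-in: the SKETCH_* names
are hypothetical, the bit layout (family in bits 15:8, model in bits
7:0) is assumed, Cedarmill is assumed as the upper bound of the
Family 15 range, and the Ivy Bridge model number (0x3A) is from my
reading of intel-family.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IFM(fam, model)	(((uint32_t)(fam) << 8) | (uint32_t)(model))
#define SKETCH_P4_PRESCOTT	SKETCH_IFM(15, 0x03)
#define SKETCH_P4_CEDARMILL	SKETCH_IFM(15, 0x06)	/* assumed upper bound */
#define SKETCH_CORE_YONAH	SKETCH_IFM(6, 0x0E)
#define SKETCH_IVYBRIDGE	SKETCH_IFM(6, 0x3A)	/* assumed value */

/* Closed range test; matches nothing if lo > hi. */
bool vfm_in_range(uint32_t vfm, uint32_t lo, uint32_t hi)
{
	return vfm >= lo && vfm <= hi;
}

/* Old form: open-ended per-family lower bounds. */
bool const_tsc_old(unsigned int fam, unsigned int model)
{
	return (fam == 0xf && model >= 0x03) ||
	       (fam == 0x6 && model >= 0x0e);
}

/* Bounded form: only the models that may lack the architectural bit. */
bool const_tsc_bounded(unsigned int fam, unsigned int model)
{
	uint32_t vfm;

	assert(fam <= 0xFF && model <= 0xFF);
	vfm = SKETCH_IFM(fam, model);
	return vfm_in_range(vfm, SKETCH_P4_PRESCOTT, SKETCH_P4_CEDARMILL) ||
	       vfm_in_range(vfm, SKETCH_CORE_YONAH, SKETCH_IVYBRIDGE);
}
```

With these stand-ins, Family 6 model 0x3A is matched by both forms,
while a later Family 6 model such as 0x3C is matched only by the old
one and would have to rely on the architectural bit in
CPUID.80000007. Note that vfm_in_range() matches nothing when the
lower bound is larger than the upper bound, so the bounds have to be
given in ascending order.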

---
 arch/x86/kernel/cpu/intel.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 1b01ef4dfda2..ab195dcea50b 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -210,10 +210,6 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 {
 	u64 misc_enable;
 
-	if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
-		(c->x86 == 0x6 && c->x86_model >= 0x0e))
-		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
-
 	if (c->x86 >= 6 && !cpu_has(c, X86_FEATURE_IA64))
 		c->microcode = intel_get_microcode_revision();
 
@@ -272,6 +268,11 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 		set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
 	}
 
+	/* Some older CPUs have invariant TSC but may not report it architecturally via 8000_0007 */
+	if ((c->x86_vfm >= INTEL_P4_PRESCOTT && c->x86_vfm <= INTEL_P4_CEDARMILL) ||
+	    (c->x86_vfm >= INTEL_CORE_YONAH && c->x86_vfm <= INTEL_IVYBRIDGE))
+		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
+
 	/* Penwell and Cloverview have the TSC which doesn't sleep on S3 */
 	switch (c->x86_vfm) {
 	case INTEL_ATOM_SALTWELL_MID:
-- 
2.43.0



* [PATCH v2 16/17] perf/x86: Simplify P6 PMU initialization
  2025-02-11 19:43 [PATCH v2 00/17] Prepare for new Intel Family numbers Sohil Mehta
                   ` (14 preceding siblings ...)
  2025-02-11 19:44 ` [PATCH v2 15/17] x86/cpu/intel: Bound the non-architectural constant_tsc " Sohil Mehta
@ 2025-02-11 19:44 ` Sohil Mehta
  2025-02-11 19:44 ` [PATCH v2 17/17] perf/x86/p4: Replace Pentium 4 model checks with VFM ones Sohil Mehta
  16 siblings, 0 replies; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 19:44 UTC (permalink / raw)
  To: x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	Sohil Mehta, linux-perf-users, linux-kernel, linux-acpi, linux-pm,
	linux-hwmon

A switch case is unnecessary when only a single case matters. Also, the
gaps in the case numbers are due to no CPU with those model numbers
being released.

Avoid the switch case and combine the cases into simpler VFM checks.
Also, this gets rid of one of the last few Intel x86_model comparisons.

No functional change intended.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: No change.
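
For reviewers who want to see what "no functional change intended"
amounts to, the sketch below compares the old switch against the new
range check for Family 6 in user space. The SKETCH_* macros and
function names are hypothetical and the bit layout (family in bits
15:8, model in bits 7:0) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IFM(fam, model)	(((uint32_t)(fam) << 8) | (uint32_t)(model))
#define SKETCH_CORE_YONAH	SKETCH_IFM(6, 0x0E)

/* Old form: explicit list of supported Family 6 models. */
bool p6_supported_old(unsigned int model)
{
	switch (model) {
	case 1:				/* Pentium Pro */
	case 3: case 5: case 6:		/* Pentium II */
	case 7: case 8: case 10: case 11: /* Pentium III */
	case 9: case 13:		/* Pentium M */
		return true;
	default:
		return false;
	}
}

/* New form: everything in Family 6 below Yonah is accepted. */
bool p6_supported_vfm(unsigned int model)
{
	assert(model <= 0xFF);
	return SKETCH_IFM(6, model) < SKETCH_CORE_YONAH;
}

/* Family 6 models accepted only by the new check. */
unsigned int p6_newly_accepted(void)
{
	unsigned int model, n = 0;

	for (model = 0; model <= 0xFF; model++)
		if (p6_supported_vfm(model) && !p6_supported_old(model))
			n++;
	return n;
}

/* Family 6 models accepted only by the old switch. */
unsigned int p6_newly_rejected(void)
{
	unsigned int model, n = 0;

	for (model = 0; model <= 0xFF; model++)
		if (p6_supported_old(model) && !p6_supported_vfm(model))
			n++;
	return n;
}
```

The only models that change category are 0, 2, 4 and 12, which the
commit message describes as gaps with no released CPU. Nothing that the
old switch accepted is rejected by the new check.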

---
 arch/x86/events/intel/p6.c | 28 +++++++---------------------
 1 file changed, 7 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/intel/p6.c b/arch/x86/events/intel/p6.c
index a6cffb4f4ef5..37e3beb6d633 100644
--- a/arch/x86/events/intel/p6.c
+++ b/arch/x86/events/intel/p6.c
@@ -2,6 +2,8 @@
 #include <linux/perf_event.h>
 #include <linux/types.h>
 
+#include <asm/cpu_device_id.h>
+
 #include "../perf_event.h"
 
 /*
@@ -244,35 +246,19 @@ static __init void p6_pmu_rdpmc_quirk(void)
 	}
 }
 
+/* Only called for Family 6 CPUs without X86_FEATURE_ARCH_PERFMON */
 __init int p6_pmu_init(void)
 {
 	x86_pmu = p6_pmu;
 
-	switch (boot_cpu_data.x86_model) {
-	case  1: /* Pentium Pro */
-		x86_add_quirk(p6_pmu_rdpmc_quirk);
-		break;
-
-	case  3: /* Pentium II - Klamath */
-	case  5: /* Pentium II - Deschutes */
-	case  6: /* Pentium II - Mendocino */
-		break;
-
-	case  7: /* Pentium III - Katmai */
-	case  8: /* Pentium III - Coppermine */
-	case 10: /* Pentium III Xeon */
-	case 11: /* Pentium III - Tualatin */
-		break;
-
-	case  9: /* Pentium M - Banias */
-	case 13: /* Pentium M - Dothan */
-		break;
-
-	default:
+	if (boot_cpu_data.x86_vfm >= INTEL_CORE_YONAH) {
 		pr_cont("unsupported p6 CPU model %d ", boot_cpu_data.x86_model);
 		return -ENODEV;
 	}
 
+	if (boot_cpu_data.x86_vfm == INTEL_PENTIUM_PRO)
+		x86_add_quirk(p6_pmu_rdpmc_quirk);
+
 	memcpy(hw_cache_event_ids, p6_hw_cache_event_ids,
 		sizeof(hw_cache_event_ids));
 
-- 
2.43.0



* [PATCH v2 17/17] perf/x86/p4: Replace Pentium 4 model checks with VFM ones
  2025-02-11 19:43 [PATCH v2 00/17] Prepare for new Intel Family numbers Sohil Mehta
                   ` (15 preceding siblings ...)
  2025-02-11 19:44 ` [PATCH v2 16/17] perf/x86: Simplify P6 PMU initialization Sohil Mehta
@ 2025-02-11 19:44 ` Sohil Mehta
  16 siblings, 0 replies; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 19:44 UTC (permalink / raw)
  To: x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	Sohil Mehta, linux-perf-users, linux-kernel, linux-acpi, linux-pm,
	linux-hwmon

Introduce names for some old Pentium 4 models and replace x86_model
checks with VFM ones.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

---

v2: No change.

---
 arch/x86/events/intel/p4.c          | 7 ++++---
 arch/x86/include/asm/intel-family.h | 1 +
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
index 844bc4fc4724..fb726c6fc6e7 100644
--- a/arch/x86/events/intel/p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -10,6 +10,7 @@
 #include <linux/perf_event.h>
 
 #include <asm/perf_event_p4.h>
+#include <asm/cpu_device_id.h>
 #include <asm/hardirq.h>
 #include <asm/apic.h>
 
@@ -732,9 +733,9 @@ static bool p4_event_match_cpu_model(unsigned int event_idx)
 {
 	/* INSTR_COMPLETED event only exist for model 3, 4, 6 (Prescott) */
 	if (event_idx == P4_EVENT_INSTR_COMPLETED) {
-		if (boot_cpu_data.x86_model != 3 &&
-			boot_cpu_data.x86_model != 4 &&
-			boot_cpu_data.x86_model != 6)
+		if (boot_cpu_data.x86_vfm != INTEL_P4_PRESCOTT &&
+		    boot_cpu_data.x86_vfm != INTEL_P4_PRESCOTT_2M &&
+		    boot_cpu_data.x86_vfm != INTEL_P4_CEDARMILL)
 			return false;
 	}
 
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index fae52a15d9b9..af2979850d62 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -193,6 +193,7 @@
 /* Family 15 - NetBurst */
 #define INTEL_P4_WILLAMETTE		IFM(15, 0x01) /* Also Xeon Foster */
 #define INTEL_P4_PRESCOTT		IFM(15, 0x03)
+#define INTEL_P4_PRESCOTT_2M		IFM(15, 0x04)
 #define INTEL_P4_CEDARMILL		IFM(15, 0x06) /* Also Xeon Dempsey */
 
 /* Family 19 */
-- 
2.43.0



* Re: [PATCH v2 02/17] x86/smpboot: Fix INIT delay optimization for extended Intel Families
  2025-02-11 19:43 ` [PATCH v2 02/17] x86/smpboot: Fix INIT delay optimization for extended Intel Families Sohil Mehta
@ 2025-02-11 20:10   ` Dave Hansen
  2025-02-11 20:20     ` Sohil Mehta
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 20:10 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:43, Sohil Mehta wrote:
> Currently only Family 6 is considered as modern and avoids the 10 msec
> INIT delay. The optimization doesn't extend to the upcoming Family 18/19
> models.

This doesn't quite parse correctly to me.

Let's say it this way:

	Some old crusty CPUs need an extra delay that slows down
	booting. See the comment above 'init_udelay' for details. Newer
	CPUs don't need the delay.

	Right now, for Intel, Family 6 and only Family 6 skips the
	delay. That leaves out both the Family 15 (Pentium 4s) and brand
	new Family 18/19 models.

	The omission of Family 15 (Pentium 4s) seems like an oversight
	and 18/19 do not need the delay.

	Skip the delay on all Intel processors Family 6 and beyond.

Is there anything wrong there?


* Re: [PATCH v2 02/17] x86/smpboot: Fix INIT delay optimization for extended Intel Families
  2025-02-11 20:10   ` Dave Hansen
@ 2025-02-11 20:20     ` Sohil Mehta
  0 siblings, 0 replies; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 20:20 UTC (permalink / raw)
  To: Dave Hansen, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Jean Delvare, Guenter Roeck,
	Zhang Rui, Andrew Cooper, David Laight, linux-perf-users,
	linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/2025 12:10 PM, Dave Hansen wrote:
> On 2/11/25 11:43, Sohil Mehta wrote:
>> Currently only Family 6 is considered as modern and avoids the 10 msec
>> INIT delay. The optimization doesn't extend to the upcoming Family 18/19
>> models.
> 
> This doesn't quite parse correctly to me.
> 
> Let's say it this way:
> 
> 	Some old crusty CPUs need an extra delay that slows down
> 	booting. See the comment above 'init_udelay' for details. Newer
> 	CPUs don't need the delay.
> 
> 	Right now, for Intel, Family 6 and only Family 6 skips the
> 	delay. That leaves out both the Family 15 (Pentium 4s) and brand
> 	new Family 18/19 models.
> 
> 	The omission of Family 15 (Pentium 4s) seems like an oversight
> 	and 18/19 do not need the delay.
> 
> 	Skip the delay on all Intel processors Family 6 and beyond.
> 
> Is there anything wrong there?

No, it is accurate.


* Re: [PATCH v2 04/17] x86/cpu/intel: Fix the movsl alignment preference for extended Families
  2025-02-11 19:43 ` [PATCH v2 04/17] x86/cpu/intel: Fix the movsl alignment preference for extended Families Sohil Mehta
@ 2025-02-11 20:26   ` Dave Hansen
  2025-02-11 21:45     ` David Laight
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 20:26 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

We should really rename intel_workarounds() to make it more clear that
it's 32-bit only. But I digress...

On 2/11/25 11:43, Sohil Mehta wrote:
> The alignment preference for 32-bit movsl based bulk memory move has
> been 8-byte for a long time. However this preference is only set for
> Family 6 and 15 processors.
> 
> Extend the preference to upcoming Family numbers 18 and 19 to maintain
> legacy behavior. Also, use a VFM based check instead of switching based
> on Family numbers. Refresh the comment to reflect the new check.

"Legacy behavior" is not important here. If anyone is running 32-bit
kernel binaries on their brand new CPUs they (as far as I know) have a
few screws loose. They don't care about performance or security and we
shouldn't care _for_ them.

If the code yielded the "wrong" movsl_mask.mask for 18/19, it wouldn't
matter one bit.

The thing that _does_ matter is someone auditing to figure out whether
the code comprehends families>15 or whether it would break in horrible
ways. The new check is shorter and it's more obvious that it will work
forever.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 05/17] x86/cpu/intel: Fix page copy performance for extended Families
  2025-02-11 19:43 ` [PATCH v2 05/17] x86/cpu/intel: Fix page copy performance " Sohil Mehta
@ 2025-02-11 20:53   ` Dave Hansen
  2025-02-12  0:54     ` Andrew Cooper
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 20:53 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:43, Sohil Mehta wrote:
> +	/*
> +	 * Modern CPUs are generally expected to have a sane fast string
> +	 * implementation. However, the BIOS may disable it on certain CPUs
> +	 * via the architectural FAST_STRING bit.
> +	 */
> +	if (IS_ENABLED(CONFIG_X86_64) && (c->x86 == 6 || c->x86 > 15))
> +		set_cpu_cap(c, X86_FEATURE_REP_GOOD);

I'm not sure the BIOS comment is helpful here.

Also, at this point, let's just make the check >=6 (or the >=PPRO
equivalent).

It will only matter if *all* of these are true:
1. Someone has a 64-bit capable P4 that powers on
2. They're running a 64-bit mainline kernel
3. String copy is *actually* slower than the alternative
4. They are performance sensitive enough to notice

We don't even know the answer to #3 for sure. Let's just say what we're
doing in a comment:

	/* Assume that any 64-bit CPU has a good implementation */

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 07/17] hwmon: Fix Intel Family-model checks to include extended Families
  2025-02-11 19:43 ` [PATCH v2 07/17] hwmon: Fix Intel Family-model checks to include " Sohil Mehta
@ 2025-02-11 20:58   ` Dave Hansen
  2025-02-11 21:38     ` Sohil Mehta
  2025-02-12 13:10     ` Zhang, Rui
  0 siblings, 2 replies; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 20:58 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:43, Sohil Mehta wrote:
> +	/*
> +	 * Return without adjustment if the Family isn't 6.
> +	 * The rest of the function assumes Family 6.
> +	 */
> +	if (c->x86 != 6)
> +		return tjmax;

Shouldn't we be converting this over to the vfm matches?

This is kinda icky:

> +	return family > 15 ||
> +	       (family == 6 &&
> +		model > 0xe &&
> +		model != 0x1c &&
> +		model != 0x26 &&
> +		model != 0x27 &&
> +		model != 0x35 &&
> +		model != 0x36);
>  }

I'm not sure how this escaped so far. Probably because it's not in arch/x86.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 08/17] x86/microcode: Update the Intel processor flag scan check
  2025-02-11 19:43 ` [PATCH v2 08/17] x86/microcode: Update the Intel processor flag scan check Sohil Mehta
@ 2025-02-11 21:00   ` Dave Hansen
  0 siblings, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 21:00 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:43, Sohil Mehta wrote:
> The Family model check to read the processor flag MSR is misleading and
> potentially incorrect. It doesn't consider Family while comparing the
> model number. The original check did have a Family number but it got
> lost/moved during refactoring.
> 
> intel_collect_cpu_info() is called through multiple paths such as early
> initialization, CPU hotplug as well as IFS image load. Some of these
> flows would be error prone due to the ambiguous check.
> 
> Correct the processor flag scan check to use a Family number and update
> it to a VFM based one to make it more readable.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 09/17] x86/mtrr: Modify a x86_model check to an Intel VFM check
  2025-02-11 19:43 ` [PATCH v2 09/17] x86/mtrr: Modify a x86_model check to an Intel VFM check Sohil Mehta
@ 2025-02-11 21:00   ` Dave Hansen
  0 siblings, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 21:00 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:43, Sohil Mehta wrote:
> Simplify one of the last few Intel x86_model checks in arch/x86 by
> substituting it with a VFM one.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 10/17] x86/cpu/intel: Replace early Family 6 checks with VFM ones
  2025-02-11 19:44 ` [PATCH v2 10/17] x86/cpu/intel: Replace early Family 6 checks with VFM ones Sohil Mehta
@ 2025-02-11 21:03   ` Dave Hansen
  0 siblings, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 21:03 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:44, Sohil Mehta wrote:
> Introduce names for some old pentium models and replace the x86_model
> checks with VFM ones.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

and...

> -	if ((c->x86<<8 | c->x86_model<<4 | c->x86_stepping) < 0x633)
> +	if ((c->x86_vfm == INTEL_PENTIUM_II_KLAMATH && c->x86_stepping < 3) ||
> +	    c->x86_vfm < INTEL_PENTIUM_II_KLAMATH)
>  		clear_cpu_cap(c, X86_FEATURE_SEP);

Ewwwwww. Good riddance on that one. :)
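
For illustration, a minimal userspace sketch of why the old packed compare
and the new VFM compare agree. It assumes the VFM value is laid out as
vendor << 16 | family << 8 | model with Intel as vendor 0, and that
INTEL_PENTIUM_II_KLAMATH is Family 6, model 3. The SKETCH_* names and both
helpers are made up for this example and are not kernel code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed VFM layout: vendor << 16 | family << 8 | model (Intel is vendor 0). */
#define SKETCH_VFM(vendor, family, model) \
	(((uint32_t)(vendor) << 16) | ((uint32_t)(family) << 8) | (uint32_t)(model))

/* Assumed value: Family 6, model 3. */
#define SKETCH_PENTIUM_II_KLAMATH	SKETCH_VFM(0, 6, 0x03)

/* Old form: model and stepping get one nibble each below the family. */
static bool sep_broken_old(unsigned int family, unsigned int model,
			   unsigned int stepping)
{
	return ((family << 8) | (model << 4) | stepping) < 0x633;
}

/* New form: compare the VFM first, then the stepping for the boundary model. */
static bool sep_broken_new(unsigned int family, unsigned int model,
			   unsigned int stepping)
{
	uint32_t vfm = SKETCH_VFM(0, family, model);

	return (vfm == SKETCH_PENTIUM_II_KLAMATH && stepping < 3) ||
	       vfm < SKETCH_PENTIUM_II_KLAMATH;
}
```

Both helpers give the same answer for every family up to 15 as long as model
and stepping are below 16, which is the range the nibble packing can describe.
For larger model numbers the old expression spills model bits into the family
field, which the VFM form avoids.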

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 11/17] x86/cpu/intel: Replace Family 15 checks with VFM ones
  2025-02-11 19:44 ` [PATCH v2 11/17] x86/cpu/intel: Replace Family 15 " Sohil Mehta
@ 2025-02-11 21:03   ` Dave Hansen
  0 siblings, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 21:03 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:44, Sohil Mehta wrote:
> Introduce names for some old pentium 4 models and replace the x86_model
> checks with VFM ones.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 12/17] x86/cpu/intel: Replace Family 5 model checks with VFM ones
  2025-02-11 19:44 ` [PATCH v2 12/17] x86/cpu/intel: Replace Family 5 model " Sohil Mehta
@ 2025-02-11 21:06   ` Dave Hansen
  0 siblings, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 21:06 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:44, Sohil Mehta wrote:
> Introduce names for some Family 5 models and convert some of the checks
> to be VFM based.
> 
> Also, to keep the file sorted by family, move Family 5 to the top of the
> header file.

It seems a little crazy to be naming all these old CPUs, but it does
make the code a lot more readable. Seeing INTEL_PENTIUM_MMX actually in
the code also *SCREAMS* how horribly ancient these CPUs are. :)

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 13/17] x86/pat: Replace Intel x86_model checks with VFM ones
  2025-02-11 19:44 ` [PATCH v2 13/17] x86/pat: Replace Intel x86_model " Sohil Mehta
@ 2025-02-11 21:09   ` Dave Hansen
  2025-02-11 21:42     ` Sohil Mehta
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 21:09 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:44, Sohil Mehta wrote:
> +	if (c->x86_vendor == X86_VENDOR_INTEL &&
> +	    ((c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm <= INTEL_PENTIUM_M_DOTHAN) ||
> +	    (c->x86_vfm >= INTEL_P4_WILLAMETTE && c->x86_vfm <= INTEL_P4_CEDARMILL))) {

Since these are both closed checks and not open-ended, is the

	if (c->x86_vendor == X86_VENDOR_INTEL &&

bit needed or superfluous?

Also, super nit, can you vertically align the two range checks, please?

	    ((c->x86_vfm >= INTEL_PENTIUM_PRO   && c->x86_vfm <= INTEL_PENTIUM_M_DOTHAN) ||
	     (c->x86_vfm >= INTEL_P4_WILLAMETTE && c->x86_vfm <= INTEL_P4_CEDARMILL))) {
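
To illustrate why closed ranges would make the vendor test redundant, here is
a small userspace sketch. It assumes the VFM value packs the vendor above the
family and model (vendor << 16 | family << 8 | model), with Intel as vendor 0
and AMD as vendor 2. The SKETCH_* names, the model numbers and the helper
names are assumptions for illustration only:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_VENDOR_INTEL	0
#define SKETCH_VENDOR_AMD	2

#define SKETCH_VFM(vendor, family, model) \
	(((uint32_t)(vendor) << 16) | ((uint32_t)(family) << 8) | (uint32_t)(model))

/* Assumed model numbers for the range endpoints. */
#define SKETCH_PENTIUM_PRO	SKETCH_VFM(SKETCH_VENDOR_INTEL, 6, 0x01)
#define SKETCH_PENTIUM_M_DOTHAN	SKETCH_VFM(SKETCH_VENDOR_INTEL, 6, 0x0D)
#define SKETCH_P4_WILLAMETTE	SKETCH_VFM(SKETCH_VENDOR_INTEL, 15, 0x01)
#define SKETCH_P4_CEDARMILL	SKETCH_VFM(SKETCH_VENDOR_INTEL, 15, 0x06)

/* Closed on both ends: a nonzero vendor field lands above every upper bound. */
static bool in_closed_ranges(uint32_t vfm)
{
	return (vfm >= SKETCH_PENTIUM_PRO   && vfm <= SKETCH_PENTIUM_M_DOTHAN) ||
	       (vfm >= SKETCH_P4_WILLAMETTE && vfm <= SKETCH_P4_CEDARMILL);
}

/* Open-ended: other vendors pass too, so this form needs a vendor test. */
static bool in_open_range(uint32_t vfm)
{
	return vfm >= SKETCH_PENTIUM_PRO;
}
```

Under these assumptions an AMD Family 6 value is rejected by
in_closed_ranges() but accepted by in_open_range(), which is the difference
between the two kinds of check.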



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 14/17] x86/acpi/cstate: Improve Intel Family model checks
  2025-02-11 19:44 ` [PATCH v2 14/17] x86/acpi/cstate: Improve Intel Family model checks Sohil Mehta
@ 2025-02-11 21:20   ` Dave Hansen
  0 siblings, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 21:20 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:44, Sohil Mehta wrote:
> Update the Intel Family checks to consistently use Family 15 instead of
> Family 0xF. Also, get rid of one of the last usages of x86_model by using
> the new VFM checks.
> 
> Update the incorrect comment since the check has changed[1][2] since the
> initial commit ee1ca48fae7e ("ACPI: Disable ARB_DISABLE on platforms
> where it is not needed").
> 
> [1]: commit 3e2ada5867b7 ("ACPI: fix Compaq Evo N800c (Pentium 4m) boot
> hang regression") removed the P4 - Family 15.
> 
> [2]: commit 03a05ed11529 ("ACPI: Use the ARB_DISABLE for the CPU which
> model id is less than 0x0f.") got rid of CORE_YONAH - Family 6, model E.
> 
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 07/17] hwmon: Fix Intel Family-model checks to include extended Families
  2025-02-11 20:58   ` Dave Hansen
@ 2025-02-11 21:38     ` Sohil Mehta
  2025-02-12 13:43       ` Zhang, Rui
  2025-02-12 13:10     ` Zhang, Rui
  1 sibling, 1 reply; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 21:38 UTC (permalink / raw)
  To: Dave Hansen, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Jean Delvare, Guenter Roeck,
	Zhang Rui, Andrew Cooper, David Laight, linux-perf-users,
	linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/2025 12:58 PM, Dave Hansen wrote:
> On 2/11/25 11:43, Sohil Mehta wrote:
>> +	/*
>> +	 * Return without adjustment if the Family isn't 6.
>> +	 * The rest of the function assumes Family 6.
>> +	 */
>> +	if (c->x86 != 6)
>> +		return tjmax;
> 
> Shouldn't we be converting this over to the vfm matches?
> 

For drivers/, I mainly focused on fixes instead of cleanups.

Converting drivers over to VFM checks is significant work. There are a
lot of such comparisons and switch cases (probably more than 50) across
drivers/cpufreq/ and drivers/hwmon/.

Some of the functions might need significant refactoring and rewrites. I
think someone with expertise in that particular driver should probably
do it. I did start with it initially but it is beyond my bandwidth at
the moment.

> This is kinda icky:
> 
>> +	return family > 15 ||
>> +	       (family == 6 &&
>> +		model > 0xe &&
>> +		model != 0x1c &&
>> +		model != 0x26 &&
>> +		model != 0x27 &&
>> +		model != 0x35 &&
>> +		model != 0x36);
>>  }
> 
> I'm not sure how this escaped so far. Probably because it's not in arch/x86.
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 15/17] x86/cpu/intel: Bound the non-architectural constant_tsc model checks
  2025-02-11 19:44 ` [PATCH v2 15/17] x86/cpu/intel: Bound the non-architectural constant_tsc " Sohil Mehta
@ 2025-02-11 21:41   ` Dave Hansen
  2025-02-12  0:45     ` Sohil Mehta
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2025-02-11 21:41 UTC (permalink / raw)
  To: Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/25 11:44, Sohil Mehta wrote:
> Constant TSC has been architectural on Intel CPUs for a while. Supported
> CPUs use the architectural Invariant TSC bit in CPUID.80000007. A
> Family-model check is not required for these CPUs.
> 
> Prevent unnecessary confusion by restricting the model specific checks
> to CPUs that need it and moving it closer to the architectural check.
> 
> Invariant TSC was likely introduced around the Nehalem timeframe on the
> Xeon side and Saltwell timeframe on the Atom side.  Due to interspersed
> model numbers, extend the non-architectural capability setting until
> Ivybridge to be safe.

How about:

X86_FEATURE_CONSTANT_TSC is a Linux-defined, synthesized feature flag.
It is used across several vendors. Intel CPUs will set the feature when
the architectural CPUID.80000007.EDX[8] bit is set. There are also some
Intel CPUs that have the X86_FEATURE_CONSTANT_TSC behavior but don't
enumerate it with the architectural bit.  Those currently have a model
range check.

Today, virtually all of the CPUs that have the CPUID bit *also* match
the "model >= 0x0e" check. This is confusing. Instead of an open-ended
check, pick some models (INTEL_IVYBRIDGE and P4_WILLAMETTE) as the end
of goofy CPUs that should enumerate the bit but don't.  These models are
a relatively arbitrary but conservative pick for this.

This makes it obvious that later CPUs (like family 18+) no longer need
to synthesize X86_FEATURE_CONSTANT_TSC.


> +	/* Some older CPUs have invariant TSC but may not report it architecturally via 8000_0007 */
> +	if ((c->x86_vfm >= INTEL_P4_PRESCOTT && c->x86_vfm <= INTEL_P4_WILLAMETTE) ||
> +	    (c->x86_vfm >= INTEL_CORE_YONAH && c->x86_vfm <= INTEL_IVYBRIDGE))
> +		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);

Please do vertically align this too.

Would it make logical sense to do:

        if (c->x86_power & (1 << 8)) {
                set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
                set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
        } else if ((c->x86_vfm >= INTEL_P4_PRESCOTT ...

?

That would make it *totally* clear that it's an either/or situation.  Right?
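
As a standalone sketch of that either/or structure (not the real function):
the struct, the SKETCH_* names and the flag bits are made up, the model
numbers for Yonah (Family 6, model 0x0E) and Ivy Bridge (Family 6, model
0x3A) and the family << 8 | model layout are assumptions, and the P4 range is
left out to keep it short:

```c
#include <stdint.h>

/* Assumed layout with vendor 0 (Intel): family << 8 | model. */
#define SKETCH_VFM(family, model)	(((uint32_t)(family) << 8) | (uint32_t)(model))
#define SKETCH_CORE_YONAH	SKETCH_VFM(6, 0x0E)	/* assumed model number */
#define SKETCH_IVYBRIDGE	SKETCH_VFM(6, 0x3A)	/* assumed model number */

/* Stand-ins for the two feature flags. */
#define SKETCH_CONSTANT_TSC	(1u << 0)
#define SKETCH_NONSTOP_TSC	(1u << 1)

struct sketch_cpu {
	uint32_t vfm;	/* family << 8 | model */
	uint32_t power;	/* CPUID.80000007 EDX, like c->x86_power */
	uint32_t caps;	/* synthesized feature bits */
};

static void sketch_set_tsc_caps(struct sketch_cpu *c)
{
	if (c->power & (1u << 8)) {
		/* Architectural enumeration: no model check needed. */
		c->caps |= SKETCH_CONSTANT_TSC | SKETCH_NONSTOP_TSC;
	} else if (c->vfm >= SKETCH_CORE_YONAH && c->vfm <= SKETCH_IVYBRIDGE) {
		/* Bounded legacy range: synthesize only the constant TSC flag. */
		c->caps |= SKETCH_CONSTANT_TSC;
	}
}
```

With this shape a Family 18 part that lacks the CPUID bit gets neither flag,
because the model range is only consulted when the architectural bit is clear
and it is closed at the top.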



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 13/17] x86/pat: Replace Intel x86_model checks with VFM ones
  2025-02-11 21:09   ` Dave Hansen
@ 2025-02-11 21:42     ` Sohil Mehta
  0 siblings, 0 replies; 46+ messages in thread
From: Sohil Mehta @ 2025-02-11 21:42 UTC (permalink / raw)
  To: Dave Hansen, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/2025 1:09 PM, Dave Hansen wrote:
> On 2/11/25 11:44, Sohil Mehta wrote:
>> +	if (c->x86_vendor == X86_VENDOR_INTEL &&
>> +	    ((c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm <= INTEL_PENTIUM_M_DOTHAN) ||
>> +	    (c->x86_vfm >= INTEL_P4_WILLAMETTE && c->x86_vfm <= INTEL_P4_CEDARMILL))) {
> 
> Since these are both closed checks and not open-ended, is the
> 
> 	if (c->x86_vendor == X86_VENDOR_INTEL &&
> 
> bit needed or superfluous?
> 

You are right. Since it is closed on both sides, we should be able
to remove the X86_VENDOR_INTEL check.

I was wondering whether we should leave it there to avoid confusion. But,
INTEL_* in the VFM string is a good enough hint that the checks are
Intel specific. Also, it's not like this check is going to be modified
frequently.

> Also, super nit, can you vertically align the two range checks, please?
> 
> 	    ((c->x86_vfm >= INTEL_PENTIUM_PRO   && c->x86_vfm <= INTEL_PENTIUM_M_DOTHAN) ||
> 	     (c->x86_vfm >= INTEL_P4_WILLAMETTE && c->x86_vfm <= INTEL_P4_CEDARMILL))) {
> 
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 04/17] x86/cpu/intel: Fix the movsl alignment preference for extended Families
  2025-02-11 20:26   ` Dave Hansen
@ 2025-02-11 21:45     ` David Laight
  0 siblings, 0 replies; 46+ messages in thread
From: David Laight @ 2025-02-11 21:45 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Sohil Mehta, x86, Dave Hansen, Tony Luck, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Kan Liang, Thomas Gleixner, Borislav Petkov, H . Peter Anvin,
	Rafael J . Wysocki, Len Brown, Andy Lutomirski, Viresh Kumar,
	Fenghua Yu, Jean Delvare, Guenter Roeck, Zhang Rui, Andrew Cooper,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On Tue, 11 Feb 2025 12:26:48 -0800
Dave Hansen <dave.hansen@intel.com> wrote:

> We should really rename intel_workarounds() to make it more clear that
> it's 32-bit only. But I digress...
> 
> On 2/11/25 11:43, Sohil Mehta wrote:
> > The alignment preference for 32-bit movsl based bulk memory move has
> > been 8-byte for a long time. However this preference is only set for
> > Family 6 and 15 processors.
> > 
> > Extend the preference to upcoming Family numbers 18 and 19 to maintain
> > legacy behavior. Also, use a VFM based check instead of switching based
> > on Family numbers. Refresh the comment to reflect the new check.  
> "Legacy behavior" is not important here. If anyone is running 32-bit
> kernel binaries on their brand new CPUs they (as far as I know) have a
> few screws loose. They don't care about performance or security and we
> shouldn't care _for_ them.
> 
> If the code yielded the "wrong" movsl_mask.mask for 18/19, it wouldn't
> matter one bit.
> 
> The thing that _does_ matter is someone auditing to figure out whether
> the code comprehends families>15 or whether it would break in horrible
> ways. The new check is shorter and it's more obvious that it will work
> forever.

For any Intel non-Atom processor since Ivy Bridge, the only alignment
that makes a real difference is aligning the destination to a 32 byte boundary.
That does make it twice as fast (32 bytes/clock rather than 16).
The source alignment never matters.
(I've got access to one of the later 64-bit 8 core atoms - but can't
remember how it behaves.)

For short (IIRC 1..32) byte transfers the cost is constant.
The cost depends on the cpu, Ivy bridge is something like 40 clocks.
Lower for later cpu.
(Unlike the P4 where the overhead is some 163 clocks.)
It also makes no difference whether you do 'rep movsb' or 'rep movsq'.

For any of those cpu I'm not sure it is ever worth using anything
other than 'rep movsb' unless the length is known to be very short,
likely a multiple of 4/8 and preferably constant.
Doing a function call and one or two mispredictable branches will
soon eat into the overhead. Not to mention displacing code from the I-cache.
Unless you are micro-optimising a very hot path it really isn't worth
doing anything else.

OTOH even some recent AMD cpu are reported not to have FSRM and will
execute 'rep movsb' slowly.

I did 'discover' that code at the weekend, just the memory load to
get the mask is going to slow things down.
Running a benchmark test it'll be in cache and the branch predictor
will remember what you are doing.
Come in 'cold cache' and (IIRC) Intel cpu have a 50% chance of predicting
a branch taken (no static predict - eg backward taken).

Even for normal memory accesses I've not seen any significant slowdown
for misaligned memory accesses.
Ones that cross a cache line might end up being 2 uops, but the cpu
can do two reads/clock (with a following wind) and it is hard to write
a code loop that gets close to sustaining that.

I'll have tested the IP checksum (adc loop) code with misaligned buffers.
I don't even remember a significant slowdown for the version that does
three memory reads every two clocks (which seems to be the limit).

I actually suspect that any copies that matter are aligned so the cost
of the check far outweighs the benefit across all the calls.

One optimisation that seems to be absent is that if you are doing a
register copy loop, then any trailing bytes can be copied by doing
a misaligned copy of the last word (and I mean word, not 16 bits)
of the buffer - copying a few bytes twice.
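
A minimal sketch of that trailing-bytes trick in plain C, assuming the two
buffers do not overlap. copy_with_overlapping_tail() is a hypothetical helper,
not an existing kernel routine, and memcpy() of a fixed 8 bytes stands in for
a single (possibly misaligned) load or store:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy len bytes between non-overlapping buffers, one 8-byte word at a time. */
static void copy_with_overlapping_tail(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	const size_t w = sizeof(uint64_t);
	uint64_t tmp;
	size_t i;

	if (len < w) {
		/* Shorter than one word: the trick does not apply, copy bytes. */
		for (i = 0; i < len; i++)
			d[i] = s[i];
		return;
	}

	/* Whole words from the start of the buffer. */
	for (i = 0; i + w <= len; i += w) {
		memcpy(&tmp, s + i, w);
		memcpy(d + i, &tmp, w);
	}

	/*
	 * Trailing bytes: copy the last word of the buffer again. It overlaps
	 * the previous word, so a few bytes are written twice.
	 */
	if (i < len) {
		memcpy(&tmp, s + len - w, w);
		memcpy(d + len - w, &tmp, w);
	}
}
```

The tail costs one extra load/store pair and one predictable branch instead
of a byte loop, and it never touches memory outside the len bytes.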

	David



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 15/17] x86/cpu/intel: Bound the non-architectural constant_tsc model checks
  2025-02-11 21:41   ` Dave Hansen
@ 2025-02-12  0:45     ` Sohil Mehta
  0 siblings, 0 replies; 46+ messages in thread
From: Sohil Mehta @ 2025-02-12  0:45 UTC (permalink / raw)
  To: Dave Hansen, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, Andrew Cooper, David Laight,
	linux-perf-users, linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/2025 1:41 PM, Dave Hansen wrote:
> On 2/11/25 11:44, Sohil Mehta wrote:
>> Constant TSC has been architectural on Intel CPUs for a while. Supported
>> CPUs use the architectural Invariant TSC bit in CPUID.80000007. A
>> Family-model check is not required for these CPUs.
>>
>> Prevent unnecessary confusion by restricting the model specific checks
>> to CPUs that need it and moving it closer to the architectural check.
>>
>> Invariant TSC was likely introduced around the Nehalem timeframe on the
>> Xeon side and Saltwell timeframe on the Atom side.  Due to interspersed
>> model numbers, extend the non-architectural capability setting until
>> Ivybridge to be safe.
> 
> How about:
> 
> X86_FEATURE_CONSTANT_TSC is a Linux-defined, synthesized feature flag.
> It is used across several vendors. Intel CPUs will set the feature when
> the architectural CPUID.80000007.EDX[8] bit is set. There are also some
> Intel CPUs that have the X86_FEATURE_CONSTANT_TSC behavior but don't
> enumerate it with the architectural bit.  Those currently have a model
> range check.
> 
> Today, virtually all of the CPUs that have the CPUID bit *also* match
> the "model >= 0x0e" check. This is confusing. Instead of an open-ended
> check, pick some models (INTEL_IVYBRIDGE and P4_WILLAMETTE) as the end
> of goofy CPUs that should enumerate the bit but don't.  These models are
> a relatively arbitrary but conservative pick for this.
> 
> This makes it obvious that later CPUs (like family 18+) no longer need
> to synthesize X86_FEATURE_CONSTANT_TSC.
> 

Looks much better.

>> +	/* Some older CPUs have invariant TSC but may not report it architecturally via 8000_0007 */
>> +	if ((c->x86_vfm >= INTEL_P4_PRESCOTT && c->x86_vfm <= INTEL_P4_WILLAMETTE) ||
>> +	    (c->x86_vfm >= INTEL_CORE_YONAH && c->x86_vfm <= INTEL_IVYBRIDGE))
>> +		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
> 
> Please do vertically align this too.
> 
> Would it make logical sense to do:
> 
>         if (c->x86_power & (1 << 8)) {
>                 set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
>                 set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
>         } else if ((c->x86_vfm >= INTEL_P4_PRESCOTT ...
> 
> ?
> 
> That would make it *totally* clear that it's an either/or situation.  Right?
> 

Yup, will change it.

> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 05/17] x86/cpu/intel: Fix page copy performance for extended Families
  2025-02-11 20:53   ` Dave Hansen
@ 2025-02-12  0:54     ` Andrew Cooper
  2025-02-12 21:19       ` Sohil Mehta
  0 siblings, 1 reply; 46+ messages in thread
From: Andrew Cooper @ 2025-02-12  0:54 UTC (permalink / raw)
  To: Dave Hansen, Sohil Mehta, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, David Laight, linux-perf-users,
	linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 11/02/2025 8:53 pm, Dave Hansen wrote:
> On 2/11/25 11:43, Sohil Mehta wrote:
>> +	/*
>> +	 * Modern CPUs are generally expected to have a sane fast string
>> +	 * implementation. However, the BIOS may disable it on certain CPUs
>> +	 * via the architectural FAST_STRING bit.
>> +	 */
>> +	if (IS_ENABLED(CONFIG_X86_64) && (c->x86 == 6 || c->x86 > 15))
>> +		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
> I'm not sure the BIOS comment is helpful here.
>
> Also, at this point, let's just make the check >=6 (or the >=PPRO
> equivalent).
>
> It will only matter if *all* of these are true:
> 1. Someone has a 64-bit capable P4 that powers on
> 2. They're running a 64-bit mainline kernel
> 3. String copy is *actually* slower than the alternative
> 4. They are performance sensitive enough to notice
>
> We don't even know the answer to #3 for sure. Let's just say what we're
> doing in a comment:
>
> 	/* Assume that any 64-bit CPU has a good implementation */

If you're going to override the BIOS setting, then you need to
explicitly set MSR_MISC_ENABLE.FAST_STRINGS.

Otherwise you're claiming to Linux that REP is good even when hardware
is prohibited from using optimisations.

~Andrew


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 06/17] cpufreq: Fix the efficient idle check for Intel extended Families
  2025-02-11 19:43 ` [PATCH v2 06/17] cpufreq: Fix the efficient idle check for Intel " Sohil Mehta
@ 2025-02-12  5:35   ` Zhang, Rui
  2025-02-13 18:49     ` Sohil Mehta
  0 siblings, 1 reply; 46+ messages in thread
From: Zhang, Rui @ 2025-02-12  5:35 UTC (permalink / raw)
  To: Mehta, Sohil, Luck, Tony, x86@kernel.org,
	dave.hansen@linux.intel.com
  Cc: linux-pm@vger.kernel.org, viresh.kumar@linaro.org,
	andrew.cooper3@citrix.com, alexander.shishkin@linux.intel.com,
	luto@kernel.org, david.laight.linux@gmail.com,
	linux-hwmon@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Hunter, Adrian, jdelvare@suse.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, irogers@google.com, tglx@linutronix.de,
	fenghua.yu@intel.com, lenb@kernel.org, kan.liang@linux.intel.com,
	linux@roeck-us.net, hpa@zytor.com, peterz@infradead.org,
	mark.rutland@arm.com, bp@alien8.de, acme@kernel.org,
	rafael@kernel.org, jolsa@kernel.org, linux-acpi@vger.kernel.org,
	namhyung@kernel.org

On Tue, 2025-02-11 at 19:43 +0000, Sohil Mehta wrote:
> IO time is considered as busy by default for modern Intel processors.
> However the check doesn't include the upcoming Family 18 and 19
> processors. Also, Arjan van de Ven says the current nature of the check
> was mainly due to lack of testing on old systems. He suggests
> considering all Intel processors as having efficient idle.
> 
> Extend the IO busy classification to all Intel processors starting with
> Family 6.
> 
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
> 
> ---
> 
> v2: Improve commit message and code comments.
> 
> ---
>  drivers/cpufreq/cpufreq_ondemand.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c
> b/drivers/cpufreq/cpufreq_ondemand.c
> index a7c38b8b3e78..b13f197707f4 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -15,6 +15,10 @@
>  #include <linux/tick.h>
>  #include <linux/sched/cpufreq.h>
>  
> +#ifdef CONFIG_X86
> +#include <asm/cpu_device_id.h>
> +#endif
> +
>  #include "cpufreq_ondemand.h"
>  
>  /* On-demand governor macros */
> @@ -32,21 +36,20 @@ static unsigned int default_powersave_bias;
>  /*
>   * Not all CPUs want IO time to be accounted as busy; this depends
> on how
>   * efficient idling at a higher frequency/voltage is.
> - * Pavel Machek says this is not so for various generations of AMD
> and old
> - * Intel systems.
> + * Pavel Machek says this is not so for various generations of AMD.
>   * Mike Chan (android.com) claims this is also not true for ARM.
> - * Because of this, whitelist specific known (series) of CPUs by
> default, and
> + * Because of this, select known series of CPUs by default, and
>   * leave all others up to the user.
>   */
>  static int should_io_be_busy(void)
>  {
>  #if defined(CONFIG_X86)
>  	/*
> -	 * For Intel, Core 2 (model 15) and later have an efficient
> idle.
> +	 * Starting with Family 6 consider all Intel CPUs to have an
> +	 * efficient idle.
>  	 */
>  	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> -			boot_cpu_data.x86 == 6 &&
> -			boot_cpu_data.x86_model >= 15)
> +	    boot_cpu_data.x86_vfm >= INTEL_PENTIUM_PRO)

This is "Starting from P4" rather than "Starting from Family 6", right?

thanks,
rui
>  		return 1;
>  #endif
>  	return 0;



* Re: [PATCH v2 07/17] hwmon: Fix Intel Family-model checks to include extended Families
  2025-02-11 20:58   ` Dave Hansen
  2025-02-11 21:38     ` Sohil Mehta
@ 2025-02-12 13:10     ` Zhang, Rui
  1 sibling, 0 replies; 46+ messages in thread
From: Zhang, Rui @ 2025-02-12 13:10 UTC (permalink / raw)
  To: Mehta, Sohil, Luck, Tony, Hansen, Dave, x86@kernel.org,
	dave.hansen@linux.intel.com
  Cc: linux-pm@vger.kernel.org, viresh.kumar@linaro.org,
	andrew.cooper3@citrix.com, alexander.shishkin@linux.intel.com,
	luto@kernel.org, david.laight.linux@gmail.com,
	linux-hwmon@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Hunter, Adrian, jdelvare@suse.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, irogers@google.com, tglx@linutronix.de,
	fenghua.yu@intel.com, lenb@kernel.org, kan.liang@linux.intel.com,
	linux@roeck-us.net, hpa@zytor.com, peterz@infradead.org,
	mark.rutland@arm.com, bp@alien8.de, acme@kernel.org,
	rafael@kernel.org, jolsa@kernel.org, linux-acpi@vger.kernel.org,
	namhyung@kernel.org

On Tue, 2025-02-11 at 12:58 -0800, Dave Hansen wrote:
> On 2/11/25 11:43, Sohil Mehta wrote:
> > +       /*
> > +        * Return without adjustment if the Family isn't 6.
> > +        * The rest of the function assumes Family 6.
> > +        */
> > +       if (c->x86 != 6)
> > +               return tjmax;
> 
> Shouldn't we be converting this over to the vfm matches?
> 
> This is kinda icky:
> 
> > +       return family > 15 ||
> > +              (family == 6 &&
> > +               model > 0xe &&
> > +               model != 0x1c &&
> > +               model != 0x26 &&
> > +               model != 0x27 &&
> > +               model != 0x35 &&
> > +               model != 0x36);
> >  }
> 
> I'm not sure how this escaped so far. Probably because it's not in
> arch/x86.
> 
This code was introduced 10+ years ago, and its only effect is a warning
message when reading MSR_IA32_TEMPERATURE_TARGET fails.
So probably no one has ever checked it.
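
For reference, the predicate quoted above can be exercised outside the
kernel. The following is a minimal sketch, assuming family and model are
passed as plain integers instead of a struct cpuinfo_x86; the function
name is hypothetical and the conditions are copied from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical standalone version of the quoted check: true when a
 * failed read of the temperature target MSR would be worth a warning.
 */
static bool sketch_cpu_has_tjmax(unsigned int family, unsigned int model)
{
	/* The model field is assumed to fit in one byte, as in the u8 original. */
	assert(model <= 0xff);

	return family > 15 ||
	       (family == 6 &&
		model > 0xe &&
		model != 0x1c &&
		model != 0x26 &&
		model != 0x27 &&
		model != 0x35 &&
		model != 0x36);
}
```

With this form, Family 15 never matches, while any Family above 15
matches regardless of the model number.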

thanks,
rui


* Re: [PATCH v2 07/17] hwmon: Fix Intel Family-model checks to include extended Families
  2025-02-11 21:38     ` Sohil Mehta
@ 2025-02-12 13:43       ` Zhang, Rui
  2025-02-12 16:57         ` Dave Hansen
  0 siblings, 1 reply; 46+ messages in thread
From: Zhang, Rui @ 2025-02-12 13:43 UTC (permalink / raw)
  To: Mehta, Sohil, Luck, Tony, Hansen, Dave, x86@kernel.org,
	dave.hansen@linux.intel.com
  Cc: linux-pm@vger.kernel.org, viresh.kumar@linaro.org,
	andrew.cooper3@citrix.com, alexander.shishkin@linux.intel.com,
	luto@kernel.org, david.laight.linux@gmail.com,
	linux-hwmon@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Hunter, Adrian, jdelvare@suse.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, irogers@google.com, tglx@linutronix.de,
	linux@roeck-us.net, lenb@kernel.org, kan.liang@linux.intel.com,
	hpa@zytor.com, peterz@infradead.org, mark.rutland@arm.com,
	bp@alien8.de, acme@kernel.org, rafael@kernel.org,
	jolsa@kernel.org, linux-acpi@vger.kernel.org, namhyung@kernel.org

On Tue, 2025-02-11 at 13:38 -0800, Sohil Mehta wrote:
> On 2/11/2025 12:58 PM, Dave Hansen wrote:
> > On 2/11/25 11:43, Sohil Mehta wrote:
> > > +       /*
> > > +        * Return without adjustment if the Family isn't 6.
> > > +        * The rest of the function assumes Family 6.
> > > +        */
> > > +       if (c->x86 != 6)
> > > +               return tjmax;
> > 
> > Shouldn't we be converting this over to the vfm matches?
> > 
> 
> For drivers/, I mainly focused on fixes instead of cleanups.
> 
> Converting drivers over to VFM checks is significant work. There are
> a
> lot of such comparisons and switch cases (probably more than 50)
> across
> drivers/cpufreq/ and drivers/hwmon/.
> 
> Some of the functions might need significant refactoring and
> rewrites. I
> think someone with expertise in that particular driver should
> probably
> do it. I did start with it initially but it is beyond my bandwidth at
> the moment.
> 
I agree.
adjust_tjmax() contains a list of quirks based on PCI-
ID/x86_vendor_id/x86_model/x86_stepping. The common problem is that all
the quirks are for Fam6 processors but the family id is not checked. So
the fix is sufficient. In fact, I think it is better to move the check
to the very beginning of adjust_tjmax().

Beyond that, I think we can do more cleanups on top:
1. rename adjust_tjmax() to adjust_tjmax_for_fam6()
2. group all model-specific quirks together and avoid model checks in
the main functions.
3. for processors newer than fam6, the driver should fail to probe
rather than using a hardcoded value when reading
MSR_IA32_TEMPERATURE_TARGET fails.

Maybe I can start with something like the patch below.

---
 drivers/hwmon/coretemp.c | 98 +++++++++++++++++++++++-----------------
 1 file changed, 57 insertions(+), 41 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 1aa67a2b5f18..fc2cf607aa36 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -99,6 +99,7 @@ struct platform_data {
 	struct device_attribute name_attr;
 };
 
+/* Beginning of Model specific quirks */
 struct tjmax_pci {
 	unsigned int device;
 	int tjmax;
@@ -147,12 +148,11 @@ static const struct tjmax_model tjmax_model_table[] = {
 				 */
 };
 
-static bool is_pkg_temp_data(struct temp_data *tdata)
-{
-	return tdata->index < 0;
-}
-
-static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev)
+/*
+ * Adjust tjmax value for early Fam6 CPUs with unreadable MSR_IA32_TEMPERATURE_TARGET
+ * NOTE: the calculated value may not be correct.
+ */
+static int adjust_tjmax_for_fam6(struct cpuinfo_x86 *c, u32 id, struct device *dev)
 {
 	/* The 100C is default for both mobile and non mobile CPUs */
 
@@ -163,8 +163,16 @@ static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev)
 	u32 eax, edx;
 	int i;
 	u16 devfn = PCI_DEVFN(0, 0);
-	struct pci_dev *host_bridge = pci_get_domain_bus_and_slot(0, 0, devfn);
+	struct pci_dev *host_bridge;
+
+	/*
+	 * Return without adjustment if the Family isn't 6.
+	 * The rest of the function assumes Family 6.
+	 */
+	if (c->x86 != 6)
+		return tjmax;
 
+	host_bridge = pci_get_domain_bus_and_slot(0, 0, devfn);
 	/*
 	 * Explicit tjmax table entries override heuristics.
 	 * First try PCI host bridge IDs, followed by model ID strings
@@ -185,12 +193,6 @@ static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev)
 			return tjmax_table[i].tjmax;
 	}
 
-	/*
-	 * Return without adjustment if the Family isn't 6.
-	 * The rest of the function assumes Family 6.
-	 */
-	if (c->x86 != 6)
-		return tjmax;
 
 	for (i = 0; i < ARRAY_SIZE(tjmax_model_table); i++) {
 		const struct tjmax_model *tm = &tjmax_model_table[i];
@@ -280,6 +282,37 @@ static bool cpu_has_tjmax(struct cpuinfo_x86 *c)
 		model != 0x36);
 }
 
+static bool cpu_has_ttarget(struct temp_data *tdata)
+{
+	struct cpuinfo_x86 *c = &cpu_data(tdata->cpu);
+
+	/*
+	 * The target temperature is available on older CPUs but not in the
+	 * MSR_IA32_TEMPERATURE_TARGET register. Atoms don't have the register
+	 * at all.
+	 */
+	if (c->x86 > 15 || (c->x86 == 6 && c->x86_model > 0xe && c->x86_model != 0x1c))
+		return true;
+	return false;
+}
+
+static bool cpu_has_broken_ucode(unsigned int cpu)
+{
+	struct cpuinfo_x86 *c = &cpu_data(cpu);
+
+	/*
+	 * Check if we have problem with errata AE18 of Core processors:
+	 * Readings might stop update when processor visited too deep sleep,
+	 * fixed for stepping D0 (6EC).
+	 */
+	if (c->x86 == 6 && c->x86_model == 0xe && c->x86_stepping < 0xc && c->microcode < 0x39) {
+		pr_err("Errata AE18 not fixed, update BIOS or microcode of the CPU!\n");
+		return true;
+	}
+	return false;
+}
+/* End of Model specific quirks */
+
 static int get_tjmax(struct temp_data *tdata, struct device *dev)
 {
 	struct cpuinfo_x86 *c = &cpu_data(tdata->cpu);
@@ -312,9 +345,8 @@ static int get_tjmax(struct temp_data *tdata, struct device *dev)
 	} else {
 		/*
 		 * An assumption is made for early CPUs and unreadable MSR.
-		 * NOTE: the calculated value may not be correct.
 		 */
-		tdata->tjmax = adjust_tjmax(c, tdata->cpu, dev);
+		tdata->tjmax = adjust_tjmax_for_fam6(c, tdata->cpu, dev);
 	}
 	return tdata->tjmax;
 }
@@ -324,6 +356,8 @@ static int get_ttarget(struct temp_data *tdata, struct device *dev)
 	u32 eax, edx;
 	int tjmax, ttarget_offset, ret;
 
+	if (!cpu_has_ttarget(tdata))
+		return -ENODEV;
 	/*
 	 * ttarget is valid only if tjmax can be retrieved from
 	 * MSR_IA32_TEMPERATURE_TARGET
@@ -348,6 +382,11 @@ static int max_zones __read_mostly;
 /* Array of zone pointers. Serialized by cpu hotplug lock */
 static struct platform_device **zone_devices;
 
+static bool is_pkg_temp_data(struct temp_data *tdata)
+{
+	return tdata->index < 0;
+}
+
 static ssize_t show_label(struct device *dev,
 				struct device_attribute *devattr, char *buf)
 {
@@ -460,23 +499,6 @@ static int create_core_attrs(struct temp_data *tdata, struct device *dev)
 	return sysfs_create_group(&dev->kobj, &tdata->attr_group);
 }
 
-
-static int chk_ucode_version(unsigned int cpu)
-{
-	struct cpuinfo_x86 *c = &cpu_data(cpu);
-
-	/*
-	 * Check if we have problem with errata AE18 of Core processors:
-	 * Readings might stop update when processor visited too deep sleep,
-	 * fixed for stepping D0 (6EC).
-	 */
-	if (c->x86 == 6 && c->x86_model == 0xe && c->x86_stepping < 0xc && c->microcode < 0x39) {
-		pr_err("Errata AE18 not fixed, update BIOS or microcode of the CPU!\n");
-		return -ENODEV;
-	}
-	return 0;
-}
-
 static struct platform_device *coretemp_get_pdev(unsigned int cpu)
 {
 	int id = topology_logical_die_id(cpu);
@@ -585,14 +607,8 @@ static int create_core_data(struct platform_device *pdev, unsigned int cpu,
 	/* Make sure tdata->tjmax is a valid indicator for dynamic/static tjmax */
 	get_tjmax(tdata, &pdev->dev);
 
-	/*
-	 * The target temperature is available on older CPUs but not in the
-	 * MSR_IA32_TEMPERATURE_TARGET register. Atoms don't have the register
-	 * at all.
-	 */
-	if (c->x86 > 15 || (c->x86 == 6 && c->x86_model > 0xe && c->x86_model != 0x1c))
-		if (get_ttarget(tdata, &pdev->dev) >= 0)
-			tdata->attr_size++;
+	if (get_ttarget(tdata, &pdev->dev) >= 0)
+		tdata->attr_size++;
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, pdata->hwmon_dev);
@@ -696,7 +712,7 @@ static int coretemp_cpu_online(unsigned int cpu)
 		struct device *hwmon;
 
 		/* Check the microcode version of the CPU */
-		if (chk_ucode_version(cpu))
+		if (cpu_has_broken_ucode(cpu))
 			return -EINVAL;
 
 		/*
-- 
2.43.0


* Re: [PATCH v2 07/17] hwmon: Fix Intel Family-model checks to include extended Families
  2025-02-12 13:43       ` Zhang, Rui
@ 2025-02-12 16:57         ` Dave Hansen
  2025-02-14  2:23           ` Zhang, Rui
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2025-02-12 16:57 UTC (permalink / raw)
  To: Zhang, Rui, Mehta, Sohil, Luck, Tony, x86@kernel.org,
	dave.hansen@linux.intel.com
  Cc: linux-pm@vger.kernel.org, viresh.kumar@linaro.org,
	andrew.cooper3@citrix.com, alexander.shishkin@linux.intel.com,
	luto@kernel.org, david.laight.linux@gmail.com,
	linux-hwmon@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Hunter, Adrian, jdelvare@suse.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, irogers@google.com, tglx@linutronix.de,
	linux@roeck-us.net, lenb@kernel.org, kan.liang@linux.intel.com,
	hpa@zytor.com, peterz@infradead.org, mark.rutland@arm.com,
	bp@alien8.de, acme@kernel.org, rafael@kernel.org,
	jolsa@kernel.org, linux-acpi@vger.kernel.org, namhyung@kernel.org

[-- Attachment #1: Type: text/plain, Size: 514 bytes --]

On 2/12/25 05:43, Zhang, Rui wrote:
> I agree.
> adjust_tjmax() contains a list of quirks based on PCI-
> ID/x86_vendor_id/x86_model/x86_stepping. The common problem is that all
> the quirks are for Fam6 processors but the family id is not checked. So
> the fix is sufficient. In fact, I think it is better to move the check
> to the very beginning of adjust_tjmax().

Or, heck, just remove the model list. dev_warn_once() if the rdmsr
fails. Who cares about one more line in dmesg?

Why not do the attached patch?

[-- Attachment #2: coretemp-1.patch --]
[-- Type: text/x-patch, Size: 1146 bytes --]



---

 b/drivers/hwmon/coretemp.c |   15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff -puN drivers/hwmon/coretemp.c~coretemp-1 drivers/hwmon/coretemp.c
--- a/drivers/hwmon/coretemp.c~coretemp-1	2025-02-12 08:52:48.782731226 -0800
+++ b/drivers/hwmon/coretemp.c	2025-02-12 08:53:43.867617505 -0800
@@ -258,18 +258,6 @@ static int adjust_tjmax(struct cpuinfo_x
 	return tjmax;
 }
 
-static bool cpu_has_tjmax(struct cpuinfo_x86 *c)
-{
-	u8 model = c->x86_model;
-
-	return model > 0xe &&
-	       model != 0x1c &&
-	       model != 0x26 &&
-	       model != 0x27 &&
-	       model != 0x35 &&
-	       model != 0x36;
-}
-
 static int get_tjmax(struct temp_data *tdata, struct device *dev)
 {
 	struct cpuinfo_x86 *c = &cpu_data(tdata->cpu);
@@ -287,8 +275,7 @@ static int get_tjmax(struct temp_data *t
 	 */
 	err = rdmsr_safe_on_cpu(tdata->cpu, MSR_IA32_TEMPERATURE_TARGET, &eax, &edx);
 	if (err) {
-		if (cpu_has_tjmax(c))
-			dev_warn(dev, "Unable to read TjMax from CPU %u\n", tdata->cpu);
+		dev_warn_once(dev, "Unable to read TjMax from CPU %u\n", tdata->cpu);
 	} else {
 		val = (eax >> 16) & 0xff;
 		if (val)
_


* Re: [PATCH v2 05/17] x86/cpu/intel: Fix page copy performance for extended Families
  2025-02-12  0:54     ` Andrew Cooper
@ 2025-02-12 21:19       ` Sohil Mehta
  2025-02-13 23:02         ` Andrew Cooper
  0 siblings, 1 reply; 46+ messages in thread
From: Sohil Mehta @ 2025-02-12 21:19 UTC (permalink / raw)
  To: Andrew Cooper, Dave Hansen, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, David Laight, linux-perf-users,
	linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/11/2025 4:54 PM, Andrew Cooper wrote:

> If you're going to override the BIOS setting, then you need to
> explicitly set MSR_MISC_ENABLE.FAST_STRINGS.
> 
> Otherwise you're claiming to Linux that REP is good even when hardware
> is prohibited from using optimisations.
> 

I think the current checks have unnecessary overlap which makes them
confusing. We should be fine if we rely only on the architectural
MSR_MISC_ENABLE.FAST_STRINGS bit, that is, on the BIOS setting alone. My
justification is below.

The simplified version of the current checks is as follows:

Check 1 (Based on Family Model numbers):
> /*
>  * Unconditionally set REP_GOOD on early Family 6 processors
>  */
> if (IS_ENABLED(CONFIG_X86_64) &&
>     (c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm < INTEL_PENTIUM_M_DOTHAN))
> 	set_cpu_cap(c, X86_FEATURE_REP_GOOD);

This check is mostly redundant since it is targeted for 64 bit and very
few if any of those CPUs support 64 bit processing. I suggest that we
get rid of this check completely. The risk here is fairly limited as well.

Check 2 (Based on MISC_ENABLE.FAST_STRING):
> /*
>  * If fast string is not enabled in IA32_MISC_ENABLE for any reason,
>  * clear the fast string and enhanced fast string CPU capabilities.
>  */
> if (c->x86_vfm >= INTEL_PENTIUM_M_DOTHAN) {
> 	rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
> 	if (misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING) {
> 		/* X86_FEATURE_ERMS will be automatically set based on CPUID */
> 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
> 	} else {
> 		pr_info("Disabled fast string operations\n");
> 		setup_clear_cpu_cap(X86_FEATURE_REP_GOOD);
> 		setup_clear_cpu_cap(X86_FEATURE_ERMS);
> 	}
> }

This is the only real check that is needed and should likely suffice in
all meaningful scenarios.

Comments?




* Re: [PATCH v2 06/17] cpufreq: Fix the efficient idle check for Intel extended Families
  2025-02-12  5:35   ` Zhang, Rui
@ 2025-02-13 18:49     ` Sohil Mehta
  2025-02-14  2:03       ` Zhang, Rui
  0 siblings, 1 reply; 46+ messages in thread
From: Sohil Mehta @ 2025-02-13 18:49 UTC (permalink / raw)
  To: Zhang, Rui, Luck, Tony, x86@kernel.org,
	dave.hansen@linux.intel.com
  Cc: linux-pm@vger.kernel.org, viresh.kumar@linaro.org,
	andrew.cooper3@citrix.com, alexander.shishkin@linux.intel.com,
	luto@kernel.org, david.laight.linux@gmail.com,
	linux-hwmon@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Hunter, Adrian, jdelvare@suse.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, irogers@google.com, tglx@linutronix.de,
	fenghua.yu@intel.com, lenb@kernel.org, kan.liang@linux.intel.com,
	linux@roeck-us.net, hpa@zytor.com, peterz@infradead.org,
	mark.rutland@arm.com, bp@alien8.de, acme@kernel.org,
	rafael@kernel.org, jolsa@kernel.org, linux-acpi@vger.kernel.org,
	namhyung@kernel.org

On 2/11/2025 9:35 PM, Zhang, Rui wrote:
>>  static int should_io_be_busy(void)
>>  {
>>  #if defined(CONFIG_X86)
>>  	/*
>> -	 * For Intel, Core 2 (model 15) and later have an efficient
>> idle.
>> +	 * Starting with Family 6 consider all Intel CPUs to have an
>> +	 * efficient idle.
>>  	 */
>>  	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
>> -			boot_cpu_data.x86 == 6 &&
>> -			boot_cpu_data.x86_model >= 15)
>> +	    boot_cpu_data.x86_vfm >= INTEL_PENTIUM_PRO)
> 
> This is "Starting from P4" rather than "Starting from Family 6", right?
> 

As described in the commit message, we are extending this to all
relevant Intel processors. That would include Family 6, Family 15 and
the upcoming Family > 15 processors as well.

A VFM check starting at INTEL_PENTIUM_PRO (Family 6, Model 1) is just a
way to simplify that.
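
To illustrate why a single comparison covers Family 6, Family 15 and the
extended Families, here is a minimal standalone sketch. It assumes the
VFM value packs the fields as (vendor << 16) | (family << 8) | model and
that the Intel vendor ID is 0, which is my understanding of the layout in
<asm/cpu_device_id.h>. All names below are hypothetical stand-ins, not
the kernel's, and the vendor check from the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in; assumed to match the Intel vendor ID of 0. */
#define SKETCH_VENDOR_INTEL 0u

/* Pack vendor, family and model so that higher families compare greater. */
static uint32_t sketch_vfm(uint32_t vendor, uint32_t family, uint32_t model)
{
	/* Family and model each occupy one byte in the assumed layout. */
	assert(family <= 0xff && model <= 0xff);

	return (vendor << 16) | (family << 8) | model;
}

/* Mirrors "x86_vfm >= INTEL_PENTIUM_PRO" with Pentium Pro as Family 6, Model 1. */
static bool sketch_io_is_busy(uint32_t family, uint32_t model)
{
	uint32_t vfm = sketch_vfm(SKETCH_VENDOR_INTEL, family, model);

	return vfm >= sketch_vfm(SKETCH_VENDOR_INTEL, 6, 0x01);
}
```

Because the family sits in higher bits than the model, any Family 15, 18
or 19 value compares greater than every Family 6 value, and Family 5
compares lower.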




* Re: [PATCH v2 05/17] x86/cpu/intel: Fix page copy performance for extended Families
  2025-02-12 21:19       ` Sohil Mehta
@ 2025-02-13 23:02         ` Andrew Cooper
  2025-02-14  0:29           ` Sohil Mehta
  0 siblings, 1 reply; 46+ messages in thread
From: Andrew Cooper @ 2025-02-13 23:02 UTC (permalink / raw)
  To: Sohil Mehta, Dave Hansen, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, David Laight, linux-perf-users,
	linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 12/02/2025 9:19 pm, Sohil Mehta wrote:
> Check 1 (Based on Family Model numbers):
>> /*
>>  * Unconditionally set REP_GOOD on early Family 6 processors
>>  */
>> if (IS_ENABLED(CONFIG_X86_64) &&
>>     (c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm < INTEL_PENTIUM_M_DOTHAN))
>> 	set_cpu_cap(c, X86_FEATURE_REP_GOOD);
> This check is mostly redundant since it is targeted for 64 bit and very
> few if any of those CPUs support 64 bit processing. I suggest that we
> get rid of this check completely. The risk here is fairly limited as well.

PENTIUM_PRO is model 0x1.  M_DOTHAN isn't introduced until patch 10, but
is model 0xd.

And model 0xf (Merom) is the first 64bit capable fam6 CPU, so this is
dead code given the CONFIG_X86_64 which the compiler can't actually
optimise out.

>
> Check 2 (Based on MISC_ENABLE.FAST_STRING):
>> /*
>>  * If fast string is not enabled in IA32_MISC_ENABLE for any reason,
>>  * clear the fast string and enhanced fast string CPU capabilities.

I'd suggest that a better way of phrasing this is:

/* BIOSes typically have a knob for Fast Strings.  Honour the user's
wishes. */

>>  */
>> if (c->x86_vfm >= INTEL_PENTIUM_M_DOTHAN) {
>> 	rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
>> 	if (misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING) {
>> 		/* X86_FEATURE_ERMS will be automatically set based on CPUID */
>> 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
>> 	} else {
>> 		pr_info("Disabled fast string operations\n");
>> 		setup_clear_cpu_cap(X86_FEATURE_REP_GOOD);
>> 		setup_clear_cpu_cap(X86_FEATURE_ERMS);
>> 	}
>> }

MSR_MISC_ENABLE exists on all 64bit CPUs, and some 32bit ones too. 
Therefore, this section alone seems to suffice in order to set up
REP_GOOD properly.
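
A minimal sketch of that reduced logic, modelled outside the kernel: the
capability flags are a plain bitmask and the MSR value is passed in as an
argument. It assumes the Fast-Strings enable is bit 0 of IA32_MISC_ENABLE;
all names are hypothetical stand-ins for the kernel's, and the Family-model
gate from Check 2 is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit position of the Fast-Strings enable in IA32_MISC_ENABLE. */
#define SKETCH_MISC_ENABLE_FAST_STRING	(1ULL << 0)

/* Hypothetical capability bits standing in for X86_FEATURE_*. */
#define SKETCH_FEATURE_REP_GOOD		(1u << 0)
#define SKETCH_FEATURE_ERMS		(1u << 1)

/* Return the capability mask after honouring the BIOS Fast-Strings knob. */
static uint32_t sketch_apply_fast_string(uint64_t misc_enable, uint32_t caps)
{
	/* Only the two modelled capability bits are meaningful here. */
	assert((caps & ~(SKETCH_FEATURE_REP_GOOD | SKETCH_FEATURE_ERMS)) == 0);

	/* Enabled: advertise REP_GOOD and leave ERMS as CPUID reported it. */
	if (misc_enable & SKETCH_MISC_ENABLE_FAST_STRING)
		return caps | SKETCH_FEATURE_REP_GOOD;

	/* Disabled: clear both the fast and enhanced fast string capabilities. */
	return caps & ~(SKETCH_FEATURE_REP_GOOD | SKETCH_FEATURE_ERMS);
}
```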

~Andrew


* Re: [PATCH v2 05/17] x86/cpu/intel: Fix page copy performance for extended Families
  2025-02-13 23:02         ` Andrew Cooper
@ 2025-02-14  0:29           ` Sohil Mehta
  0 siblings, 0 replies; 46+ messages in thread
From: Sohil Mehta @ 2025-02-14  0:29 UTC (permalink / raw)
  To: Andrew Cooper, Dave Hansen, x86, Dave Hansen, Tony Luck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Thomas Gleixner,
	Borislav Petkov, H . Peter Anvin, Rafael J . Wysocki, Len Brown,
	Andy Lutomirski, Viresh Kumar, Fenghua Yu, Jean Delvare,
	Guenter Roeck, Zhang Rui, David Laight, linux-perf-users,
	linux-kernel, linux-acpi, linux-pm, linux-hwmon

On 2/13/2025 3:02 PM, Andrew Cooper wrote:
> On 12/02/2025 9:19 pm, Sohil Mehta wrote:
>> Check 1 (Based on Family Model numbers):
>>> /*
>>>  * Unconditionally set REP_GOOD on early Family 6 processors
>>>  */
>>> if (IS_ENABLED(CONFIG_X86_64) &&
>>>     (c->x86_vfm >= INTEL_PENTIUM_PRO && c->x86_vfm < INTEL_PENTIUM_M_DOTHAN))
>>> 	set_cpu_cap(c, X86_FEATURE_REP_GOOD);
>> This check is mostly redundant since it is targeted for 64 bit and very
>> few if any of those CPUs support 64 bit processing. I suggest that we
>> get rid of this check completely. The risk here is fairly limited as well.
> 
> PENTIUM_PRO is model 0x1.  M_DOTHAN isn't introduced until patch 10, but
> is model 0xd.
> 
> And model 0xf (Merom) is the first 64bit capable fam6 CPU, so this is
> dead code given the CONFIG_X86_64 which the compiler can't actually
> optimise out.
> 

Thanks for confirming. I figured this is likely dead code but wasn't
completely sure.




* Re: [PATCH v2 06/17] cpufreq: Fix the efficient idle check for Intel extended Families
  2025-02-13 18:49     ` Sohil Mehta
@ 2025-02-14  2:03       ` Zhang, Rui
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang, Rui @ 2025-02-14  2:03 UTC (permalink / raw)
  To: Mehta, Sohil, Luck, Tony, x86@kernel.org,
	dave.hansen@linux.intel.com
  Cc: viresh.kumar@linaro.org, andrew.cooper3@citrix.com,
	luto@kernel.org, linux-hwmon@vger.kernel.org,
	david.laight.linux@gmail.com, alexander.shishkin@linux.intel.com,
	Hunter, Adrian, jdelvare@suse.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, linux-perf-users@vger.kernel.org,
	irogers@google.com, tglx@linutronix.de, fenghua.yu@intel.com,
	lenb@kernel.org, kan.liang@linux.intel.com, linux@roeck-us.net,
	hpa@zytor.com, peterz@infradead.org, mark.rutland@arm.com,
	bp@alien8.de, linux-pm@vger.kernel.org, acme@kernel.org,
	rafael@kernel.org, jolsa@kernel.org, linux-acpi@vger.kernel.org,
	namhyung@kernel.org

On Thu, 2025-02-13 at 10:49 -0800, Sohil Mehta wrote:
> On 2/11/2025 9:35 PM, Zhang, Rui wrote:
> > >  static int should_io_be_busy(void)
> > >  {
> > >  #if defined(CONFIG_X86)
> > >         /*
> > > -        * For Intel, Core 2 (model 15) and later have an
> > > efficient
> > > idle.
> > > +        * Starting with Family 6 consider all Intel CPUs to have
> > > an
> > > +        * efficient idle.
> > >          */
> > >         if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> > > -                       boot_cpu_data.x86 == 6 &&
> > > -                       boot_cpu_data.x86_model >= 15)
> > > +           boot_cpu_data.x86_vfm >= INTEL_PENTIUM_PRO)
> > 
> > This is "Starting from P4" rather than "Starting from Family 6",
> > right?
> > 
> 
> As described in the commit message, we are extending this to all
> relevant Intel processors. That would include Family 6, Family 15 and
> the upcoming Family > 15 processors as well.
> 
> A VFM check starting at INTEL_PENTIUM_PRO (Family 6, Model 1) is just
> a
> way to simplify that.
> 
You're right. I had assumed that the Fam15 processors came before
Fam6.

-rui


* Re: [PATCH v2 07/17] hwmon: Fix Intel Family-model checks to include extended Families
  2025-02-12 16:57         ` Dave Hansen
@ 2025-02-14  2:23           ` Zhang, Rui
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang, Rui @ 2025-02-14  2:23 UTC (permalink / raw)
  To: Mehta, Sohil, Luck, Tony, Hansen, Dave, x86@kernel.org,
	dave.hansen@linux.intel.com
  Cc: viresh.kumar@linaro.org, andrew.cooper3@citrix.com,
	luto@kernel.org, linux-hwmon@vger.kernel.org,
	david.laight.linux@gmail.com, alexander.shishkin@linux.intel.com,
	Hunter, Adrian, jdelvare@suse.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, linux-perf-users@vger.kernel.org,
	irogers@google.com, tglx@linutronix.de, linux@roeck-us.net,
	lenb@kernel.org, kan.liang@linux.intel.com, hpa@zytor.com,
	peterz@infradead.org, mark.rutland@arm.com, bp@alien8.de,
	linux-pm@vger.kernel.org, acme@kernel.org, rafael@kernel.org,
	jolsa@kernel.org, linux-acpi@vger.kernel.org, namhyung@kernel.org

On Wed, 2025-02-12 at 08:57 -0800, Dave Hansen wrote:
> On 2/12/25 05:43, Zhang, Rui wrote:
> > I agree.
> > adjust_tjmax() contains a list of quirks based on PCI-
> > ID/x86_vendor_id/x86_model/x86_stepping. The common problem is that
> > all
> > the quirks are for Fam6 processors but the family id is not
> > checked. So
> > the fix is sufficient. In fact, I think it is better to move the
> > check
> > to the very beginning of adjust_tjmax().
> 
> Or, heck, just remove the model list. dev_warn_once() if the rdmsr
> fails. Who cares about one more line in dmesg?
> 
> Why not do the attached patch?

The patch looks good to me.

-rui


Thread overview: 46+ messages
2025-02-11 19:43 [PATCH v2 00/17] Prepare for new Intel Family numbers Sohil Mehta
2025-02-11 19:43 ` [PATCH v2 01/17] x86/smpboot: Remove confusing quirk usage in INIT delay Sohil Mehta
2025-02-11 19:43 ` [PATCH v2 02/17] x86/smpboot: Fix INIT delay optimization for extended Intel Families Sohil Mehta
2025-02-11 20:10   ` Dave Hansen
2025-02-11 20:20     ` Sohil Mehta
2025-02-11 19:43 ` [PATCH v2 03/17] x86/apic: Fix 32-bit APIC initialization " Sohil Mehta
2025-02-11 19:43 ` [PATCH v2 04/17] x86/cpu/intel: Fix the movsl alignment preference for extended Families Sohil Mehta
2025-02-11 20:26   ` Dave Hansen
2025-02-11 21:45     ` David Laight
2025-02-11 19:43 ` [PATCH v2 05/17] x86/cpu/intel: Fix page copy performance " Sohil Mehta
2025-02-11 20:53   ` Dave Hansen
2025-02-12  0:54     ` Andrew Cooper
2025-02-12 21:19       ` Sohil Mehta
2025-02-13 23:02         ` Andrew Cooper
2025-02-14  0:29           ` Sohil Mehta
2025-02-11 19:43 ` [PATCH v2 06/17] cpufreq: Fix the efficient idle check for Intel " Sohil Mehta
2025-02-12  5:35   ` Zhang, Rui
2025-02-13 18:49     ` Sohil Mehta
2025-02-14  2:03       ` Zhang, Rui
2025-02-11 19:43 ` [PATCH v2 07/17] hwmon: Fix Intel Family-model checks to include " Sohil Mehta
2025-02-11 20:58   ` Dave Hansen
2025-02-11 21:38     ` Sohil Mehta
2025-02-12 13:43       ` Zhang, Rui
2025-02-12 16:57         ` Dave Hansen
2025-02-14  2:23           ` Zhang, Rui
2025-02-12 13:10     ` Zhang, Rui
2025-02-11 19:43 ` [PATCH v2 08/17] x86/microcode: Update the Intel processor flag scan check Sohil Mehta
2025-02-11 21:00   ` Dave Hansen
2025-02-11 19:43 ` [PATCH v2 09/17] x86/mtrr: Modify a x86_model check to an Intel VFM check Sohil Mehta
2025-02-11 21:00   ` Dave Hansen
2025-02-11 19:44 ` [PATCH v2 10/17] x86/cpu/intel: Replace early Family 6 checks with VFM ones Sohil Mehta
2025-02-11 21:03   ` Dave Hansen
2025-02-11 19:44 ` [PATCH v2 11/17] x86/cpu/intel: Replace Family 15 " Sohil Mehta
2025-02-11 21:03   ` Dave Hansen
2025-02-11 19:44 ` [PATCH v2 12/17] x86/cpu/intel: Replace Family 5 model " Sohil Mehta
2025-02-11 21:06   ` Dave Hansen
2025-02-11 19:44 ` [PATCH v2 13/17] x86/pat: Replace Intel x86_model " Sohil Mehta
2025-02-11 21:09   ` Dave Hansen
2025-02-11 21:42     ` Sohil Mehta
2025-02-11 19:44 ` [PATCH v2 14/17] x86/acpi/cstate: Improve Intel Family model checks Sohil Mehta
2025-02-11 21:20   ` Dave Hansen
2025-02-11 19:44 ` [PATCH v2 15/17] x86/cpu/intel: Bound the non-architectural constant_tsc " Sohil Mehta
2025-02-11 21:41   ` Dave Hansen
2025-02-12  0:45     ` Sohil Mehta
2025-02-11 19:44 ` [PATCH v2 16/17] perf/x86: Simplify P6 PMU initialization Sohil Mehta
2025-02-11 19:44 ` [PATCH v2 17/17] perf/x86/p4: Replace Pentium 4 model checks with VFM ones Sohil Mehta
