public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly
  2024-03-22 17:56 [PATCH 0/4] x86: Add CPUID region helper and clarify Xen startup Dave Hansen
@ 2024-03-22 17:56 ` Dave Hansen
  0 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2024-03-22 17:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: jgross, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

xen_start_kernel() is some of the first C code run "Xen PV" systems.
It precedes things like setup_arch() or processor initialization code
that would be thought of as "very early".

That means that 'boot_cpu_data' is garbage.  It has not even
established the utter basics like if the CPU supports the CPUID
instruction.  Unfortunately get_cpu_cap() requires this exact
information.

Nevertheless, xen_start_kernel() calls get_cpu_cap().  But it
works out in practice because it's looking for the NX bit which
comes from an extended CPUID leaf that doesn't depend on
c->cpuid_level being set.

Stop calling get_cpu_cap().  Use CPUID directly.  Add a little
helper to avoid cluttering up the xen_start_kernel() flow.

This removes the risky, barely-working call to get_cpu_cap().

Note: The '& 31' strips the "capability" word off of an
      X86_FEATURE_* value.  It's a magic number, but there are
      a few of them sparsely scattered around and it's way
      shorter than any helper.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
---

 b/arch/x86/xen/enlighten_pv.c |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff -puN arch/x86/xen/enlighten_pv.c~xen-explain1 arch/x86/xen/enlighten_pv.c
--- a/arch/x86/xen/enlighten_pv.c~xen-explain1	2024-03-22 09:30:25.499892614 -0700
+++ b/arch/x86/xen/enlighten_pv.c	2024-03-22 09:30:25.503892742 -0700
@@ -1306,6 +1306,21 @@ static void __init xen_domu_set_legacy_f
 
 extern void early_xen_iret_patch(void);
 
+/*
+ * It is too early for boot_cpu_has() and friends to work.
+ * Use CPUID to directly enumerate NX support.
+ */
+static inline void xen_configure_nx(void)
+{
+	bool nx_supported;
+	u32 eax = 0;
+
+	get_cpuid_region_leaf(0x80000001, CPUID_EDX, &eax);
+
+	nx_supported = eax & (X86_FEATURE_NX & 31);
+	x86_configure_nx(nx_supported);
+}
+
 /* First C function to be called on Xen boot */
 asmlinkage __visible void __init xen_start_kernel(struct start_info *si)
 {
@@ -1369,9 +1384,7 @@ asmlinkage __visible void __init xen_sta
 	/* Get mfn list */
 	xen_build_dynamic_phys_to_machine();
 
-	/* Work out if we support NX */
-	get_cpu_cap(&boot_cpu_data);
-	x86_configure_nx(boot_cpu_has(X86_FEATURE_NX));
+	xen_configure_nx();
 
 	/*
 	 * Set up kernel GDT and segment registers, mainly so that
_

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 0/4] [v2] x86: Add CPUID region helper and clarify Xen startup
@ 2024-04-03 15:35 Dave Hansen
  2024-04-03 15:35 ` [PATCH 1/4] x86/cpu: Add and use new CPUID region helper Dave Hansen
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Dave Hansen @ 2024-04-03 15:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: jgross, tglx, x86, bp, Dave Hansen

Changes from v1:
 - Typos and whitepace damage noted by Ingo.

--

This peels off a patches from the top of another longer series
where Borislav suggested actually fixing a problem rather than
complaining about it:

	https://lore.kernel.org/all/20240307162752.GAZenrCDqs0lMTjmu1@fat_crate.local/

There's a little detour here to explain some mysterious (to me)
CPUID prodding.

The end result gets rid of some fragile Xen boot code.  It also clears
the way for a future series to bring some order to the early boot code
and remove some of the random tinkering that is there today.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/4] x86/cpu: Add and use new CPUID region helper
  2024-04-03 15:35 [PATCH 0/4] [v2] x86: Add CPUID region helper and clarify Xen startup Dave Hansen
@ 2024-04-03 15:35 ` Dave Hansen
  2024-04-04  9:04   ` Jürgen Groß
  2024-04-03 15:35 ` [PATCH 2/4] perf/x86/ibs: Use " Dave Hansen
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Dave Hansen @ 2024-04-03 15:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: jgross, tglx, x86, bp, Dave Hansen, kai.huang


From: Dave Hansen <dave.hansen@linux.intel.com>

There are some (before now) unwritten rules about CPUID "regions".
Basically, there is a 32-bit address space of CPUID leaves.  The
top 16 bits address a "region" and the first leaf in a region
is special.

The kernel only has a few spots that care about this, but it's
rather hard to make sense of the code as is.

Add a helper that explains regions.  Use it where applicable.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Cc: Juergen Gross <jgross@suse.com>

--

changes from v1:
 * Fix typo comment and whitespace gunk noted by Ingo

---

 b/arch/x86/include/asm/cpuid.h    |   59 ++++++++++++++++++++++++++++++++++++++
 b/arch/x86/kernel/cpu/common.c    |   13 +++-----
 b/arch/x86/kernel/cpu/transmeta.c |    9 +----
 b/arch/x86/xen/enlighten_pv.c     |    9 +----
 4 files changed, 69 insertions(+), 21 deletions(-)

diff -puN arch/x86/include/asm/cpuid.h~cpuid-regions arch/x86/include/asm/cpuid.h
--- a/arch/x86/include/asm/cpuid.h~cpuid-regions	2024-04-02 15:22:58.838912075 -0700
+++ b/arch/x86/include/asm/cpuid.h	2024-04-02 15:22:58.846912085 -0700
@@ -168,4 +168,63 @@ static inline uint32_t hypervisor_cpuid_
 	return 0;
 }
 
+/*
+ * By convention, CPUID is broken up into regions which each
+ * have 2^16 leaves.  EAX in the first leaf of each valid
+ * region returns the maximum valid leaf in that region.
+ *
+ * The regions can be thought of as being vendor-specific
+ * areas of CPUID, but that's imprecise because everybody
+ * implements the "Intel" region and Intel implements the
+ * AMD region.  There are a few well-known regions:
+ *  - Intel	(0x0000)
+ *  - AMD	(0x8000)
+ *  - Transmeta	(0x8086)
+ *  - Centaur	(0xC000)
+ *
+ * Consider a CPU where the maximum leaf in the Transmeta
+ * region is 2.  On such a CPU, leaf 0x80860000 would contain:
+ * EAX==0x80860002.
+ * region-^^^^
+ *   max leaf-^^^^
+ */
+static inline u32 cpuid_region_max_leaf(u16 region)
+{
+	u32 eax = cpuid_eax(region << 16);
+
+	/*
+	 * An unsupported region may return data from the last
+	 * "basic" leaf, which is essentially garbage.  Avoid
+	 * mistaking basic leaf data for region data.
+	 *
+	 * Note: this is not perfect.  It is theoretically
+	 * possible for the last basic leaf to _resemble_ a
+	 * valid first leaf from a region that doesn't exist.
+	 * But Intel at least seems to pad out the basic region
+	 * with 0's, possibly to avoid this.
+	 */
+	if ((eax >> 16) != region)
+		return 0;
+
+	return eax;
+}
+
+/* Returns true if the leaf exists and @value was populated */
+static inline bool get_cpuid_region_leaf(u32 leaf, enum cpuid_regs_idx reg,
+					 u32 *value)
+{
+	u16 region = leaf >> 16;
+	u32 regs[4];
+
+	if (cpuid_region_max_leaf(region) < leaf)
+		return false;
+
+	cpuid(leaf, &regs[CPUID_EAX], &regs[CPUID_EBX],
+	            &regs[CPUID_ECX], &regs[CPUID_EDX]);
+
+	*value = regs[reg];
+
+	return true;
+}
+
 #endif /* _ASM_X86_CPUID_H */
diff -puN arch/x86/kernel/cpu/common.c~cpuid-regions arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~cpuid-regions	2024-04-02 15:22:58.842912079 -0700
+++ b/arch/x86/kernel/cpu/common.c	2024-04-02 15:22:58.846912085 -0700
@@ -1049,16 +1049,13 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 	}
 
 	/* AMD-defined flags: level 0x80000001 */
-	eax = cpuid_eax(0x80000000);
-	c->extended_cpuid_level = eax;
+	c->extended_cpuid_level = cpuid_region_max_leaf(0x8000);
 
-	if ((eax & 0xffff0000) == 0x80000000) {
-		if (eax >= 0x80000001) {
-			cpuid(0x80000001, &eax, &ebx, &ecx, &edx);
+	if (c->extended_cpuid_level >= 0x80000001) {
+		cpuid(0x80000001, &eax, &ebx, &ecx, &edx);
 
-			c->x86_capability[CPUID_8000_0001_ECX] = ecx;
-			c->x86_capability[CPUID_8000_0001_EDX] = edx;
-		}
+		c->x86_capability[CPUID_8000_0001_ECX] = ecx;
+		c->x86_capability[CPUID_8000_0001_EDX] = edx;
 	}
 
 	if (c->extended_cpuid_level >= 0x80000007) {
diff -puN arch/x86/kernel/cpu/transmeta.c~cpuid-regions arch/x86/kernel/cpu/transmeta.c
--- a/arch/x86/kernel/cpu/transmeta.c~cpuid-regions	2024-04-02 15:22:58.842912079 -0700
+++ b/arch/x86/kernel/cpu/transmeta.c	2024-04-02 15:22:58.846912085 -0700
@@ -9,14 +9,9 @@
 
 static void early_init_transmeta(struct cpuinfo_x86 *c)
 {
-	u32 xlvl;
-
 	/* Transmeta-defined flags: level 0x80860001 */
-	xlvl = cpuid_eax(0x80860000);
-	if ((xlvl & 0xffff0000) == 0x80860000) {
-		if (xlvl >= 0x80860001)
-			c->x86_capability[CPUID_8086_0001_EDX] = cpuid_edx(0x80860001);
-	}
+	get_cpuid_region_leaf(0x80860001, CPUID_EDX,
+			  &c->x86_capability[CPUID_8086_0001_EDX]);
 }
 
 static void init_transmeta(struct cpuinfo_x86 *c)
diff -puN arch/x86/xen/enlighten_pv.c~cpuid-regions arch/x86/xen/enlighten_pv.c
--- a/arch/x86/xen/enlighten_pv.c~cpuid-regions	2024-04-02 15:22:58.842912079 -0700
+++ b/arch/x86/xen/enlighten_pv.c	2024-04-03 08:34:28.221534043 -0700
@@ -141,16 +141,13 @@ static void __init xen_set_mtrr_data(voi
 	};
 	unsigned int reg;
 	unsigned long mask;
-	uint32_t eax, width;
+	uint32_t width;
 	static struct mtrr_var_range var[MTRR_MAX_VAR_RANGES] __initdata;
 
 	/* Get physical address width (only 64-bit cpus supported). */
 	width = 36;
-	eax = cpuid_eax(0x80000000);
-	if ((eax >> 16) == 0x8000 && eax >= 0x80000008) {
-		eax = cpuid_eax(0x80000008);
-		width = eax & 0xff;
-	}
+	/* Will overwrite 'width' if available in CPUID: */
+	get_cpuid_region_leaf(0x80000008, CPUID_EAX, &width);
 
 	for (reg = 0; reg < MTRR_MAX_VAR_RANGES; reg++) {
 		op.u.read_memtype.reg = reg;
_

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-03 15:35 [PATCH 0/4] [v2] x86: Add CPUID region helper and clarify Xen startup Dave Hansen
  2024-04-03 15:35 ` [PATCH 1/4] x86/cpu: Add and use new CPUID region helper Dave Hansen
@ 2024-04-03 15:35 ` Dave Hansen
  2024-04-16 15:12   ` Borislav Petkov
  2024-04-03 15:35 ` [PATCH 3/4] x86/boot: Explicitly pass NX enabling status Dave Hansen
  2024-04-03 15:35 ` [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly Dave Hansen
  3 siblings, 1 reply; 24+ messages in thread
From: Dave Hansen @ 2024-04-03 15:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: jgross, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

IBS details are enumerated in an extended CPUID leaf.  But
the support has an open-coded CPUID region check.  Use the
new helper to trim down the code.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
--

Note: this cleanup could take another form:

        if (boot_cpu_data->extended_cpuid_level >= IBS_CPUID_FEATURES)
                caps = cpuid_eax(IBS_CPUID_FEATURES);

that would be one fewer CPUID invocations, but one more
line of code.
---

 b/arch/x86/events/amd/ibs.c |    9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff -puN arch/x86/events/amd/ibs.c~ibs-region-helpers arch/x86/events/amd/ibs.c
--- a/arch/x86/events/amd/ibs.c~ibs-region-helpers	2024-04-02 15:22:59.262912595 -0700
+++ b/arch/x86/events/amd/ibs.c	2024-04-02 15:22:59.262912595 -0700
@@ -1278,18 +1278,13 @@ static __init int perf_event_ibs_init(vo
 
 static __init u32 __get_ibs_caps(void)
 {
-	u32 caps;
-	unsigned int max_level;
+	u32 caps = 0;
 
 	if (!boot_cpu_has(X86_FEATURE_IBS))
 		return 0;
 
-	/* check IBS cpuid feature flags */
-	max_level = cpuid_eax(0x80000000);
-	if (max_level < IBS_CPUID_FEATURES)
-		return IBS_CAPS_DEFAULT;
+	get_cpuid_region_leaf(IBS_CPUID_FEATURES, CPUID_EAX, &caps);
 
-	caps = cpuid_eax(IBS_CPUID_FEATURES);
 	if (!(caps & IBS_CAPS_AVAIL))
 		/* cpuid flags not valid */
 		return IBS_CAPS_DEFAULT;
_

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/4] x86/boot: Explicitly pass NX enabling status
  2024-04-03 15:35 [PATCH 0/4] [v2] x86: Add CPUID region helper and clarify Xen startup Dave Hansen
  2024-04-03 15:35 ` [PATCH 1/4] x86/cpu: Add and use new CPUID region helper Dave Hansen
  2024-04-03 15:35 ` [PATCH 2/4] perf/x86/ibs: Use " Dave Hansen
@ 2024-04-03 15:35 ` Dave Hansen
  2024-04-04 10:33   ` Jürgen Groß
  2024-04-03 15:35 ` [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly Dave Hansen
  3 siblings, 1 reply; 24+ messages in thread
From: Dave Hansen @ 2024-04-03 15:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: jgross, tglx, x86, bp, Dave Hansen, kai.huang


From: Dave Hansen <dave.hansen@linux.intel.com>

The kernel sometimes needs to mask unsupported bits out of page
table entries.  It does that with a mask: '__supported_pte_mask'.

That mask can obviously only contain the No-eXecute bit (_PAGE_NX)
on hardware where NX is supported.  x86_configure_nx() checks the
boot CPU's NX support and adjusts the mask appropriately.

But it doesn't check support directly.  It uses the venerable
'boot_cpu_data' which is a software approximation of the actual CPU
support.  Unfortunately, Xen wants to set up '__supported_pte_mask'
before 'boot_cpu_data' has been initialized.  It hacks around this
problem by repeating some of the 'boot_cpu_data' setup *just* for
NX.

Have x86_configure_nx() stop consulting 'boot_cpu_data' and move
the NX detection to the caller.

No functional change.  That will come later.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
---

 b/arch/x86/include/asm/proto.h |    2 +-
 b/arch/x86/kernel/setup.c      |    6 +++---
 b/arch/x86/xen/enlighten_pv.c  |    2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff -puN arch/x86/include/asm/proto.h~x86_configure_nx-arg arch/x86/include/asm/proto.h
--- a/arch/x86/include/asm/proto.h~x86_configure_nx-arg	2024-04-02 15:22:59.638913056 -0700
+++ b/arch/x86/include/asm/proto.h	2024-04-02 15:22:59.642913062 -0700
@@ -37,7 +37,7 @@ void entry_SYSRETL_compat_end(void);
 #define entry_SYSENTER_compat NULL
 #endif
 
-void x86_configure_nx(void);
+void x86_configure_nx(bool nx_supported);
 
 extern int reboot_force;
 
diff -puN arch/x86/kernel/setup.c~x86_configure_nx-arg arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~x86_configure_nx-arg	2024-04-02 15:22:59.638913056 -0700
+++ b/arch/x86/kernel/setup.c	2024-04-02 15:22:59.642913062 -0700
@@ -687,9 +687,9 @@ dump_kernel_offset(struct notifier_block
 	return 0;
 }
 
-void x86_configure_nx(void)
+void x86_configure_nx(bool nx_supported)
 {
-	if (boot_cpu_has(X86_FEATURE_NX))
+	if (nx_supported)
 		__supported_pte_mask |= _PAGE_NX;
 	else
 		__supported_pte_mask &= ~_PAGE_NX;
@@ -853,7 +853,7 @@ void __init setup_arch(char **cmdline_p)
 	 * whether hardware doesn't support NX (so that the early EHCI debug
 	 * console setup can safely call set_fixmap()).
 	 */
-	x86_configure_nx();
+	x86_configure_nx(boot_cpu_has(X86_FEATURE_NX));
 
 	parse_early_param();
 
diff -puN arch/x86/xen/enlighten_pv.c~x86_configure_nx-arg arch/x86/xen/enlighten_pv.c
--- a/arch/x86/xen/enlighten_pv.c~x86_configure_nx-arg	2024-04-02 15:22:59.638913056 -0700
+++ b/arch/x86/xen/enlighten_pv.c	2024-04-02 15:22:59.642913062 -0700
@@ -1371,7 +1371,7 @@ asmlinkage __visible void __init xen_sta
 
 	/* Work out if we support NX */
 	get_cpu_cap(&boot_cpu_data);
-	x86_configure_nx();
+	x86_configure_nx(boot_cpu_has(X86_FEATURE_NX));
 
 	/*
 	 * Set up kernel GDT and segment registers, mainly so that
_

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly
  2024-04-03 15:35 [PATCH 0/4] [v2] x86: Add CPUID region helper and clarify Xen startup Dave Hansen
                   ` (2 preceding siblings ...)
  2024-04-03 15:35 ` [PATCH 3/4] x86/boot: Explicitly pass NX enabling status Dave Hansen
@ 2024-04-03 15:35 ` Dave Hansen
  2024-04-04 10:44   ` Jürgen Groß
  2024-04-04 15:05   ` Sean Christopherson
  3 siblings, 2 replies; 24+ messages in thread
From: Dave Hansen @ 2024-04-03 15:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: jgross, tglx, x86, bp, Dave Hansen


From: Dave Hansen <dave.hansen@linux.intel.com>

xen_start_kernel() is some of the first C code run "Xen PV" systems.
It precedes things like setup_arch() or processor initialization code
that would be thought of as "very early".

That means that 'boot_cpu_data' is garbage.  It has not even
established the utter basics like if the CPU supports the CPUID
instruction.  Unfortunately get_cpu_cap() requires this exact
information.

Nevertheless, xen_start_kernel() calls get_cpu_cap().  But it
works out in practice because it's looking for the NX bit which
comes from an extended CPUID leaf that doesn't depend on
c->cpuid_level being set.

Stop calling get_cpu_cap().  Use CPUID directly.  Add a little
helper to avoid cluttering up the xen_start_kernel() flow.

This removes the risky, barely-working call to get_cpu_cap().

Note: The '& 31' strips the "capability" word off of an
      X86_FEATURE_* value.  It's a magic number, but there are
      a few of them sparsely scattered around and it's way
      shorter than any helper.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Juergen Gross <jgross@suse.com>
---

 b/arch/x86/xen/enlighten_pv.c |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff -puN arch/x86/xen/enlighten_pv.c~xen-explain1 arch/x86/xen/enlighten_pv.c
--- a/arch/x86/xen/enlighten_pv.c~xen-explain1	2024-04-02 15:23:00.046913557 -0700
+++ b/arch/x86/xen/enlighten_pv.c	2024-04-02 15:23:00.050913562 -0700
@@ -1306,6 +1306,21 @@ static void __init xen_domu_set_legacy_f
 
 extern void early_xen_iret_patch(void);
 
+/*
+ * It is too early for boot_cpu_has() and friends to work.
+ * Use CPUID to directly enumerate NX support.
+ */
+static inline void xen_configure_nx(void)
+{
+	bool nx_supported;
+	u32 eax = 0;
+
+	get_cpuid_region_leaf(0x80000001, CPUID_EDX, &eax);
+
+	nx_supported = eax & (X86_FEATURE_NX & 31);
+	x86_configure_nx(nx_supported);
+}
+
 /* First C function to be called on Xen boot */
 asmlinkage __visible void __init xen_start_kernel(struct start_info *si)
 {
@@ -1369,9 +1384,7 @@ asmlinkage __visible void __init xen_sta
 	/* Get mfn list */
 	xen_build_dynamic_phys_to_machine();
 
-	/* Work out if we support NX */
-	get_cpu_cap(&boot_cpu_data);
-	x86_configure_nx(boot_cpu_has(X86_FEATURE_NX));
+	xen_configure_nx();
 
 	/*
 	 * Set up kernel GDT and segment registers, mainly so that
_

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] x86/cpu: Add and use new CPUID region helper
  2024-04-03 15:35 ` [PATCH 1/4] x86/cpu: Add and use new CPUID region helper Dave Hansen
@ 2024-04-04  9:04   ` Jürgen Groß
  2024-04-08 16:25     ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Jürgen Groß @ 2024-04-04  9:04 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel; +Cc: tglx, x86, bp, kai.huang

On 03.04.24 17:35, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> There are some (before now) unwritten rules about CPUID "regions".
> Basically, there is a 32-bit address space of CPUID leaves.  The
> top 16 bits address a "region" and the first leaf in a region
> is special.
> 
> The kernel only has a few spots that care about this, but it's
> rather hard to make sense of the code as is.
> 
> Add a helper that explains regions.  Use it where applicable.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Reviewed-by: Kai Huang <kai.huang@intel.com>
> Cc: Juergen Gross <jgross@suse.com>
> 
> --
> 
> changes from v1:
>   * Fix typo comment and whitespace gunk noted by Ingo
> 
> ---
> 
>   b/arch/x86/include/asm/cpuid.h    |   59 ++++++++++++++++++++++++++++++++++++++
>   b/arch/x86/kernel/cpu/common.c    |   13 +++-----
>   b/arch/x86/kernel/cpu/transmeta.c |    9 +----
>   b/arch/x86/xen/enlighten_pv.c     |    9 +----
>   4 files changed, 69 insertions(+), 21 deletions(-)
> 
> diff -puN arch/x86/include/asm/cpuid.h~cpuid-regions arch/x86/include/asm/cpuid.h
> --- a/arch/x86/include/asm/cpuid.h~cpuid-regions	2024-04-02 15:22:58.838912075 -0700
> +++ b/arch/x86/include/asm/cpuid.h	2024-04-02 15:22:58.846912085 -0700
> @@ -168,4 +168,63 @@ static inline uint32_t hypervisor_cpuid_
>   	return 0;
>   }
>   
> +/*
> + * By convention, CPUID is broken up into regions which each
> + * have 2^16 leaves.  EAX in the first leaf of each valid
> + * region returns the maximum valid leaf in that region.
> + *
> + * The regions can be thought of as being vendor-specific
> + * areas of CPUID, but that's imprecise because everybody
> + * implements the "Intel" region and Intel implements the
> + * AMD region.  There are a few well-known regions:
> + *  - Intel	(0x0000)
> + *  - AMD	(0x8000)
> + *  - Transmeta	(0x8086)
> + *  - Centaur	(0xC000)
> + *
> + * Consider a CPU where the maximum leaf in the Transmeta
> + * region is 2.  On such a CPU, leaf 0x80860000 would contain:
> + * EAX==0x80860002.
> + * region-^^^^
> + *   max leaf-^^^^
> + */
> +static inline u32 cpuid_region_max_leaf(u16 region)
> +{
> +	u32 eax = cpuid_eax(region << 16);
> +
> +	/*
> +	 * An unsupported region may return data from the last
> +	 * "basic" leaf, which is essentially garbage.  Avoid
> +	 * mistaking basic leaf data for region data.
> +	 *
> +	 * Note: this is not perfect.  It is theoretically
> +	 * possible for the last basic leaf to _resemble_ a
> +	 * valid first leaf from a region that doesn't exist.
> +	 * But Intel at least seems to pad out the basic region
> +	 * with 0's, possibly to avoid this.
> +	 */
> +	if ((eax >> 16) != region)
> +		return 0;
> +
> +	return eax;
> +}
> +
> +/* Returns true if the leaf exists and @value was populated */
> +static inline bool get_cpuid_region_leaf(u32 leaf, enum cpuid_regs_idx reg,
> +					 u32 *value)
> +{
> +	u16 region = leaf >> 16;
> +	u32 regs[4];
> +
> +	if (cpuid_region_max_leaf(region) < leaf)
> +		return false;
> +
> +	cpuid(leaf, &regs[CPUID_EAX], &regs[CPUID_EBX],
> +	            &regs[CPUID_ECX], &regs[CPUID_EDX]);
> +
> +	*value = regs[reg];
> +
> +	return true;
> +}
> +
>   #endif /* _ASM_X86_CPUID_H */
> diff -puN arch/x86/kernel/cpu/common.c~cpuid-regions arch/x86/kernel/cpu/common.c
> --- a/arch/x86/kernel/cpu/common.c~cpuid-regions	2024-04-02 15:22:58.842912079 -0700
> +++ b/arch/x86/kernel/cpu/common.c	2024-04-02 15:22:58.846912085 -0700
> @@ -1049,16 +1049,13 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
>   	}
>   
>   	/* AMD-defined flags: level 0x80000001 */
> -	eax = cpuid_eax(0x80000000);
> -	c->extended_cpuid_level = eax;
> +	c->extended_cpuid_level = cpuid_region_max_leaf(0x8000);
>   
> -	if ((eax & 0xffff0000) == 0x80000000) {
> -		if (eax >= 0x80000001) {
> -			cpuid(0x80000001, &eax, &ebx, &ecx, &edx);
> +	if (c->extended_cpuid_level >= 0x80000001) {
> +		cpuid(0x80000001, &eax, &ebx, &ecx, &edx);
>   
> -			c->x86_capability[CPUID_8000_0001_ECX] = ecx;
> -			c->x86_capability[CPUID_8000_0001_EDX] = edx;
> -		}
> +		c->x86_capability[CPUID_8000_0001_ECX] = ecx;
> +		c->x86_capability[CPUID_8000_0001_EDX] = edx;
>   	}
>   
>   	if (c->extended_cpuid_level >= 0x80000007) {
> diff -puN arch/x86/kernel/cpu/transmeta.c~cpuid-regions arch/x86/kernel/cpu/transmeta.c
> --- a/arch/x86/kernel/cpu/transmeta.c~cpuid-regions	2024-04-02 15:22:58.842912079 -0700
> +++ b/arch/x86/kernel/cpu/transmeta.c	2024-04-02 15:22:58.846912085 -0700
> @@ -9,14 +9,9 @@
>   
>   static void early_init_transmeta(struct cpuinfo_x86 *c)
>   {
> -	u32 xlvl;
> -
>   	/* Transmeta-defined flags: level 0x80860001 */
> -	xlvl = cpuid_eax(0x80860000);
> -	if ((xlvl & 0xffff0000) == 0x80860000) {
> -		if (xlvl >= 0x80860001)
> -			c->x86_capability[CPUID_8086_0001_EDX] = cpuid_edx(0x80860001);
> -	}
> +	get_cpuid_region_leaf(0x80860001, CPUID_EDX,
> +			  &c->x86_capability[CPUID_8086_0001_EDX]);
>   }
>   
>   static void init_transmeta(struct cpuinfo_x86 *c)
> diff -puN arch/x86/xen/enlighten_pv.c~cpuid-regions arch/x86/xen/enlighten_pv.c
> --- a/arch/x86/xen/enlighten_pv.c~cpuid-regions	2024-04-02 15:22:58.842912079 -0700
> +++ b/arch/x86/xen/enlighten_pv.c	2024-04-03 08:34:28.221534043 -0700
> @@ -141,16 +141,13 @@ static void __init xen_set_mtrr_data(voi
>   	};
>   	unsigned int reg;
>   	unsigned long mask;
> -	uint32_t eax, width;
> +	uint32_t width;
>   	static struct mtrr_var_range var[MTRR_MAX_VAR_RANGES] __initdata;
>   
>   	/* Get physical address width (only 64-bit cpus supported). */
>   	width = 36;
> -	eax = cpuid_eax(0x80000000);
> -	if ((eax >> 16) == 0x8000 && eax >= 0x80000008) {
> -		eax = cpuid_eax(0x80000008);
> -		width = eax & 0xff;
> -	}
> +	/* Will overwrite 'width' if available in CPUID: */
> +	get_cpuid_region_leaf(0x80000008, CPUID_EAX, &width);

Aren't you missing "width &= 0xff;" here? get_cpuid_region_leaf() doesn't
truncate the returned value, which has the linear address width at bits 8-15.


Juergen

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] x86/boot: Explicitly pass NX enabling status
  2024-04-03 15:35 ` [PATCH 3/4] x86/boot: Explicitly pass NX enabling status Dave Hansen
@ 2024-04-04 10:33   ` Jürgen Groß
  0 siblings, 0 replies; 24+ messages in thread
From: Jürgen Groß @ 2024-04-04 10:33 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel; +Cc: tglx, x86, bp, kai.huang

On 03.04.24 17:35, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> The kernel sometimes needs to mask unsupported bits out of page
> table entries.  It does that with a mask: '__supported_pte_mask'.
> 
> That mask can obviously only contain the No-eXecute bit (_PAGE_NX)
> on hardware where NX is supported.  x86_configure_nx() checks the
> boot CPU's NX support and adjusts the mask appropriately.
> 
> But it doesn't check support directly.  It uses the venerable
> 'boot_cpu_data' which is a software approximation of the actual CPU
> support.  Unfortunately, Xen wants to set up '__supported_pte_mask'
> before 'boot_cpu_data' has been initialized.  It hacks around this
> problem by repeating some of the 'boot_cpu_data' setup *just* for
> NX.
> 
> Have x86_configure_nx() stop consulting 'boot_cpu_data' and move
> the NX detection to the caller.
> 
> No functional change.  That will come later.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Reviewed-by: Kai Huang <kai.huang@intel.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly
  2024-04-03 15:35 ` [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly Dave Hansen
@ 2024-04-04 10:44   ` Jürgen Groß
  2024-04-04 14:24     ` Dave Hansen
  2024-04-04 15:05   ` Sean Christopherson
  1 sibling, 1 reply; 24+ messages in thread
From: Jürgen Groß @ 2024-04-04 10:44 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel; +Cc: tglx, x86, bp

On 03.04.24 17:35, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> xen_start_kernel() is some of the first C code run "Xen PV" systems.

Nit: s/run/run on/

> It precedes things like setup_arch() or processor initialization code
> that would be thought of as "very early".
> 
> That means that 'boot_cpu_data' is garbage.  It has not even
> established the utter basics like if the CPU supports the CPUID
> instruction.  Unfortunately get_cpu_cap() requires this exact
> information.
> 
> Nevertheless, xen_start_kernel() calls get_cpu_cap().  But it
> works out in practice because it's looking for the NX bit which
> comes from an extended CPUID leaf that doesn't depend on
> c->cpuid_level being set.
> 
> Stop calling get_cpu_cap().  Use CPUID directly.  Add a little
> helper to avoid cluttering up the xen_start_kernel() flow.
> 
> This removes the risky, barely-working call to get_cpu_cap().
> 
> Note: The '& 31' strips the "capability" word off of an
>        X86_FEATURE_* value.  It's a magic number, but there are
>        a few of them sparsely scattered around and it's way
>        shorter than any helper.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Juergen Gross <jgross@suse.com>
> ---
> 
>   b/arch/x86/xen/enlighten_pv.c |   19 ++++++++++++++++---
>   1 file changed, 16 insertions(+), 3 deletions(-)
> 
> diff -puN arch/x86/xen/enlighten_pv.c~xen-explain1 arch/x86/xen/enlighten_pv.c
> --- a/arch/x86/xen/enlighten_pv.c~xen-explain1	2024-04-02 15:23:00.046913557 -0700
> +++ b/arch/x86/xen/enlighten_pv.c	2024-04-02 15:23:00.050913562 -0700
> @@ -1306,6 +1306,21 @@ static void __init xen_domu_set_legacy_f
>   
>   extern void early_xen_iret_patch(void);
>   
> +/*
> + * It is too early for boot_cpu_has() and friends to work.
> + * Use CPUID to directly enumerate NX support.
> + */
> +static inline void xen_configure_nx(void)
> +{
> +	bool nx_supported;
> +	u32 eax = 0;

I'd prefer to name this variable edx.

> +
> +	get_cpuid_region_leaf(0x80000001, CPUID_EDX, &eax);
> +
> +	nx_supported = eax & (X86_FEATURE_NX & 31);
> +	x86_configure_nx(nx_supported);
> +}
> +
>   /* First C function to be called on Xen boot */
>   asmlinkage __visible void __init xen_start_kernel(struct start_info *si)
>   {
> @@ -1369,9 +1384,7 @@ asmlinkage __visible void __init xen_sta
>   	/* Get mfn list */
>   	xen_build_dynamic_phys_to_machine();
>   
> -	/* Work out if we support NX */
> -	get_cpu_cap(&boot_cpu_data);
> -	x86_configure_nx(boot_cpu_has(X86_FEATURE_NX));
> +	xen_configure_nx();
>   
>   	/*
>   	 * Set up kernel GDT and segment registers, mainly so that
> _


Juergen

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly
  2024-04-04 10:44   ` Jürgen Groß
@ 2024-04-04 14:24     ` Dave Hansen
  0 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2024-04-04 14:24 UTC (permalink / raw)
  To: Jürgen Groß, Dave Hansen, linux-kernel; +Cc: tglx, x86, bp

On 4/4/24 03:44, Jürgen Groß wrote:
>> @@ -1306,6 +1306,21 @@ static void __init xen_domu_set_legacy_f
>>     extern void early_xen_iret_patch(void);
>>   +/*
>> + * It is too early for boot_cpu_has() and friends to work.
>> + * Use CPUID to directly enumerate NX support.
>> + */
>> +static inline void xen_configure_nx(void)
>> +{
>> +    bool nx_supported;
>> +    u32 eax = 0;
> 
> I'd prefer to name this variable edx.
> 
>> +
>> +    get_cpuid_region_leaf(0x80000001, CPUID_EDX, &eax); 

Heh.  That's a nice way to say it. :)

I'll fix it up.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly
  2024-04-03 15:35 ` [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly Dave Hansen
  2024-04-04 10:44   ` Jürgen Groß
@ 2024-04-04 15:05   ` Sean Christopherson
  2024-04-22 17:44     ` Borislav Petkov
  1 sibling, 1 reply; 24+ messages in thread
From: Sean Christopherson @ 2024-04-04 15:05 UTC (permalink / raw)
  To: Dave Hansen; +Cc: linux-kernel, jgross, tglx, x86, bp

On Wed, Apr 03, 2024, Dave Hansen wrote:
> +/*
> + * It is too early for boot_cpu_has() and friends to work.
> + * Use CPUID to directly enumerate NX support.
> + */
> +static inline void xen_configure_nx(void)
> +{
> +	bool nx_supported;
> +	u32 eax = 0;
> +
> +	get_cpuid_region_leaf(0x80000001, CPUID_EDX, &eax);
> +
> +	nx_supported = eax & (X86_FEATURE_NX & 31);

Heh, I wasn't looking to review this, but this is wrong, and the fly-by comment
I was going to make is relevant (and got really long once I started typing...).

X86_FEATURE_NX isn't a bit, it's bit position: (1*32+20) & 0x31 == 20.  I.e. the
code would actually need to be

	nx_supported = edx & BIT(X86_FEATURE_NX & 31);

The TL;DR version of comment I came to give, is that KVM has a lot of infrastructure
for manipulating raw CPUID values using X86_FEATURE_* flags, and that I think we
should seriously consider moving much of KVM's infrastructure to core x86. 

KVM has several macros for doing this sort of thing, and we solve the "way shorter
than any helper" conundrum is by pasting tokens, e.g.

	#define feature_bit(name)  __feature_bit(X86_FEATURE_##name)

which yields fairly readable code, e.g. this would be

	nx_supported = edx & feature_bit(NX);

and if you want to go really terse with a macro that is local to a .c file

	#define F feature_bit

though I doubt that's necessary outside of KVM (KVM has ~200 uses in a single
function that computes the support guest CPUID).

Grabbing feature_bit() directly from KVM would be cumbersome, as __feature_bit()
pulls in a _lot_ of dependencies that aren't strictly necessary.  But if you do
do want to add a generic macro, I definitely think it's worth moving KVM's stuff
to common code, because all of the dependencies are compile-time assertions to
guard against incorrect usage.  E.g. using the "& 31" trick on scattered flags
for raw CPUID will give the wrong result, because the bit position for scattered
flags doesn't match the bit position for hardware-defined CPUID.

But if we went a step further, KVM's code can also programatically generate the
inputs to CPUID.  x86_feature_cpuid() returns a cpuid_reg structure, which gives
the function, index, and register:

	struct cpuid_reg {
		u32 function;
		u32 index;
		int reg;
	};

E.g. if we move the KVM stuff to common code, we could have a helper like:

	static __always_inline bool x86_cpuid_has(unsigned int feature)
	{
		struct cpuid_reg cpuid = x86_feature_cpuid(feature);
		u32 val;

		if (!get_cpuid_region_leaf(cpuid.function, cpuid.reg, &val))
			return false;

		return val & __feature_bit(feature);
	}

and then the NX Xen code is more simply

	x86_configure_nx(x86_cpuid_has(X86_FEATURE_NX));

If we wanted to go really crazy, I bet we could figure out a way to use an
alternative-like implementation to automagically redirect boot_cpuid_has() to
a raw CPUID-based check if it's invoked before boot_cpu_data() is populated,
without increasing the code footprint.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] x86/cpu: Add and use new CPUID region helper
  2024-04-04  9:04   ` Jürgen Groß
@ 2024-04-08 16:25     ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2024-04-08 16:25 UTC (permalink / raw)
  To: Jürgen Groß; +Cc: Dave Hansen, linux-kernel, tglx, x86, kai.huang

On Thu, Apr 04, 2024 at 11:04:49AM +0200, Jürgen Groß wrote:
> Aren't you missing "width &= 0xff;" here? get_cpuid_region_leaf() doesn't
> truncate the returned value, which has the linear address width at bits 8-15.

Yeah.

Other than that, the intention looks ok as it documents what that code
magic was.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-03 15:35 ` [PATCH 2/4] perf/x86/ibs: Use " Dave Hansen
@ 2024-04-16 15:12   ` Borislav Petkov
  2024-04-16 15:23     ` Dave Hansen
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2024-04-16 15:12 UTC (permalink / raw)
  To: Dave Hansen, Robert Richter; +Cc: linux-kernel, jgross, tglx, x86

On Wed, Apr 03, 2024 at 08:35:11AM -0700, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> IBS details are enumerated in an extended CPUID leaf.  But
> the support has an open-coded CPUID region check.  Use the
> new helper to trim down the code.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> --
> 
> Note: this cleanup could take another form:
> 
>         if (boot_cpu_data->extended_cpuid_level >= IBS_CPUID_FEATURES)
>                 caps = cpuid_eax(IBS_CPUID_FEATURES);
> 
> that would be one fewer CPUID invocations, but one more
> line of code.
> ---
> 
>  b/arch/x86/events/amd/ibs.c |    9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff -puN arch/x86/events/amd/ibs.c~ibs-region-helpers arch/x86/events/amd/ibs.c
> --- a/arch/x86/events/amd/ibs.c~ibs-region-helpers	2024-04-02 15:22:59.262912595 -0700
> +++ b/arch/x86/events/amd/ibs.c	2024-04-02 15:22:59.262912595 -0700
> @@ -1278,18 +1278,13 @@ static __init int perf_event_ibs_init(vo
>  
>  static __init u32 __get_ibs_caps(void)
>  {
> -	u32 caps;
> -	unsigned int max_level;
> +	u32 caps = 0;
>  
>  	if (!boot_cpu_has(X86_FEATURE_IBS))
>  		return 0;
>  
> -	/* check IBS cpuid feature flags */
> -	max_level = cpuid_eax(0x80000000);
> -	if (max_level < IBS_CPUID_FEATURES)
> -		return IBS_CAPS_DEFAULT;
> +	get_cpuid_region_leaf(IBS_CPUID_FEATURES, CPUID_EAX, &caps);

I wanna say all this checking of max level is worthless because if you
have X86_FEATURE_IBS, then it is a given that you also have that
0x8000001b CPUID leaf.

Right, Bob?

Unless there was some weird thing back then with the CPUID leafs...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-16 15:12   ` Borislav Petkov
@ 2024-04-16 15:23     ` Dave Hansen
  2024-04-16 17:48       ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Hansen @ 2024-04-16 15:23 UTC (permalink / raw)
  To: Borislav Petkov, Dave Hansen, Robert Richter
  Cc: linux-kernel, jgross, tglx, x86

On 4/16/24 08:12, Borislav Petkov wrote:
>>  	if (!boot_cpu_has(X86_FEATURE_IBS))
>>  		return 0;
>>  
>> -	/* check IBS cpuid feature flags */
>> -	max_level = cpuid_eax(0x80000000);
>> -	if (max_level < IBS_CPUID_FEATURES)
>> -		return IBS_CAPS_DEFAULT;
>> +	get_cpuid_region_leaf(IBS_CPUID_FEATURES, CPUID_EAX, &caps);
> I wanna say all this checking of max level is worthless because if you
> have X86_FEATURE_IBS, then it is a given that you also have that
> 0x8000001b CPUID leaf.
> 
> Right, Bob?
> 
> Unless there was some weird thing back then with the CPUID leafs...

When I was looking at it, I (maybe wrongly) assumed that 0x8000001b and
X86_FEATURE_IBS were independent for a reason.  Because, oddly enough:

	#define IBS_CAPS_DEFAULT         (IBS_CAPS_AVAIL         \
                                         | IBS_CAPS_FETCHSAM    \
                                         | IBS_CAPS_OPSAM)

So, if the CPU enumerates X86_FEATURE_IBS but has a
max_level<IBS_CPUID_FEATURES then it assumes there's a working IBS
because the software-inserted IBS_CAPS_DEFAULT has IBS_CAPS_AVAIL set.

Actually, that even means that the new code should probably be:

	u32 caps = IBS_CAPS_DEFAULT;
	
 	if (!boot_cpu_has(X86_FEATURE_IBS))
 		return 0;	

	get_cpuid_region_leaf(IBS_CPUID_FEATURES, CPUID_EAX, &caps);

to match the old.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-16 15:23     ` Dave Hansen
@ 2024-04-16 17:48       ` Borislav Petkov
  2024-04-18 12:05         ` Robert Richter
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2024-04-16 17:48 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Dave Hansen, Robert Richter, linux-kernel, jgross, tglx, x86

On Tue, Apr 16, 2024 at 08:23:58AM -0700, Dave Hansen wrote:
> When I was looking at it, I (maybe wrongly) assumed that 0x8000001b and
> X86_FEATURE_IBS were independent for a reason.  Because, oddly enough:
> 
> 	#define IBS_CAPS_DEFAULT         (IBS_CAPS_AVAIL         \
>                                          | IBS_CAPS_FETCHSAM    \
>                                          | IBS_CAPS_OPSAM)
> 
> So, if the CPU enumerates X86_FEATURE_IBS but has a
> max_level<IBS_CPUID_FEATURES then it assumes there's a working IBS
> because the software-inserted IBS_CAPS_DEFAULT has IBS_CAPS_AVAIL set.

Right, that's why I added Robert. I found this in a F10h doc (old
Greyhound CPU rust):

"CPUID Fn8000_0001_ECX Feature Identifiers

...

10: IBS: Instruction Based Sampling = 1.

...

CPUID Fn8000_001B Instruction Based Sampling Identifiers

...

IBSFFV. IBS feature flags valid. Revision B = 0. Revision C = 1."

which makes this look like some hack to fix broken CPUID IBS reporting.

And if it is that, I don't think we care, frankly, because revB is
ooold. Mine is somewhere in the basement on some old board which got
bricked so I don't know even if I could use it anymore.

And I'm not even planing to - that CPU is almost 20 years old and no one
cares whether it can even do IBS.

So I wouldn't mind at all if we simplify this code for the sake of it.
I don't think anyone would care or notice.

But let's see what Robert says first...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-16 17:48       ` Borislav Petkov
@ 2024-04-18 12:05         ` Robert Richter
  2024-04-22 17:20           ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Robert Richter @ 2024-04-18 12:05 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dave Hansen, Dave Hansen, linux-kernel, jgross, tglx, x86,
	Kim Phillips, Robert Richter

On 16.04.24 19:48:50, Borislav Petkov wrote:
> On Tue, Apr 16, 2024 at 08:23:58AM -0700, Dave Hansen wrote:
> > When I was looking at it, I (maybe wrongly) assumed that 0x8000001b and
> > X86_FEATURE_IBS were independent for a reason.  Because, oddly enough:
> > 
> > 	#define IBS_CAPS_DEFAULT         (IBS_CAPS_AVAIL         \
> >                                          | IBS_CAPS_FETCHSAM    \
> >                                          | IBS_CAPS_OPSAM)
> > 
> > So, if the CPU enumerates X86_FEATURE_IBS but has a
> > max_level<IBS_CPUID_FEATURES then it assumes there's a working IBS
> > because the software-inserted IBS_CAPS_DEFAULT has IBS_CAPS_AVAIL set.
> 
> Right, that's why I added Robert. I found this in a F10h doc (old
> Greyhound CPU rust):
> 
> "CPUID Fn8000_0001_ECX Feature Identifiers
> 
> ...
> 
> 10: IBS: Instruction Based Sampling = 1.
> 
> ...
> 
> CPUID Fn8000_001B Instruction Based Sampling Identifiers
> 
> ...
> 
> IBSFFV. IBS feature flags valid. Revision B = 0. Revision C = 1."
> 
> which makes this look like some hack to fix broken CPUID IBS reporting.

There is the X86_FEATURE_IBS bit (Fn8000_0001_ECX, bit 10) which is
available from the beginning of IBS (all Fam10h production releases,
revB onwards).

And right, IBS_CPUID_FEATURES (CPUID Fn8000_001B) was introduced with
revC. The capabilities of revB are set in IBS_CAPS_DEFAULT.

It doesn't look broken to me, simply the ibs caps field was introduced
later which can be determined checking the return code of
get_cpuid_region_leaf().

> 
> And if it is that, I don't think we care, frankly, because revB is
> ooold. Mine is somewhere in the basement on some old board which got
> bricked so I don't know even if I could use it anymore.
> 
> And I'm not even planing to - that CPU is almost 20 years old and no one
> cares whether it can even do IBS.
> 
> So I wouldn't mind at all if we simplify this code for the sake of it.
> I don't think anyone would care or notice.
> 
> But let's see what Robert says first...

My preference would be:

	[...]
	if (!get_cpuid_region_leaf(IBS_CPUID_FEATURES, CPUID_EAX, &caps))
		return IBS_CAPS_DEFAULT;

	if (caps & IBS_CAPS_AVAIL)
		return caps;

	return 0;
	[...]

Not too complex?

This slightly modifies the functionality so that 0 is return if
!IBS_CAPS_AVAIL (instead of IBS_CAPS_DEFAULT). The feature could be
disabled then by overriding IBS_CAPS_AVAIL by just clearing the ibs
leaf, valid even for newer systems.

Thanks,

-Robert

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-18 12:05         ` Robert Richter
@ 2024-04-22 17:20           ` Borislav Petkov
  2024-04-22 20:09             ` Robert Richter
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2024-04-22 17:20 UTC (permalink / raw)
  To: Robert Richter
  Cc: Dave Hansen, Dave Hansen, linux-kernel, jgross, tglx, x86,
	Kim Phillips, Robert Richter

On Thu, Apr 18, 2024 at 02:05:49PM +0200, Robert Richter wrote:
> There is the X86_FEATURE_IBS bit (Fn8000_0001_ECX, bit 10) which is
> available from the beginning of IBS (all Fam10h production releases,
> revB onwards).
> 
> And right, IBS_CPUID_FEATURES (CPUID Fn8000_001B) was introduced with
> revC. The capabilities of revB are set in IBS_CAPS_DEFAULT.
> 
> It doesn't look broken to me, simply the ibs caps field was introduced
> later which can be determined checking the return code of
> get_cpuid_region_leaf().

Right.

> My preference would be:
> 
> 	[...]
> 	if (!get_cpuid_region_leaf(IBS_CPUID_FEATURES, CPUID_EAX, &caps))

Right, checking get_cpuid_region_leaf() retval should happen.

> 		return IBS_CAPS_DEFAULT;
> 
> 	if (caps & IBS_CAPS_AVAIL)
> 		return caps;
> 
> 	return 0;
> 	[...]
> 
> Not too complex?

I'm wondering should we even support those old things? All revBs should
be probably dead by now. But ok, it's not like it is a lot of code or
so...

> This slightly modifies the functionality so that 0 is return if
> !IBS_CAPS_AVAIL (instead of IBS_CAPS_DEFAULT).

If !IBS_CAPS_AVAIL, then this is revB. But then you want to return
IBS_CAPS_DEFAULT there.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly
  2024-04-04 15:05   ` Sean Christopherson
@ 2024-04-22 17:44     ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2024-04-22 17:44 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Dave Hansen, linux-kernel, jgross, tglx, x86

On Thu, Apr 04, 2024 at 08:05:52AM -0700, Sean Christopherson wrote:
> ...
> and then the NX Xen code is more simply
> 
> 	x86_configure_nx(x86_cpuid_has(X86_FEATURE_NX));

I can't say that I haven't looked at the KVM features code and haven't
thought that, yap, some of the bits we could make generic. And I was
going to even suggest that but then, there's this bigger CPUID feature
rework that has been brewing on the horizon for a loong while now and
where we want to read out the CPUID features once and have everything
else query those instead of everybody doing their own thing...

... and some of that work is here:
https://lpc.events/event/17/contributions/1511/ ...

and tglx wanted to get this thing moving faster...

So yeah, I think we should finally unify all the feature bits handling
and reading so that there's a single set of APIs which are used by
everybody instead of the crazy "fun" we have now.

And then my hope is that you could kill/merge some of the KVM infra into
generic code.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-22 17:20           ` Borislav Petkov
@ 2024-04-22 20:09             ` Robert Richter
  2024-04-22 20:41               ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Robert Richter @ 2024-04-22 20:09 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dave Hansen, Dave Hansen, linux-kernel, jgross, tglx, x86,
	Kim Phillips, Robert Richter

On 22.04.24 19:20:55, Borislav Petkov wrote:
> On Thu, Apr 18, 2024 at 02:05:49PM +0200, Robert Richter wrote:
> > There is the X86_FEATURE_IBS bit (Fn8000_0001_ECX, bit 10) which is
> > available from the beginning of IBS (all Fam10h production releases,
> > revB onwards).
> > 
> > And right, IBS_CPUID_FEATURES (CPUID Fn8000_001B) was introduced with
> > revC. The capabilities of revB are set in IBS_CAPS_DEFAULT.
> > 
> > It doesn't look broken to me, simply the ibs caps field was introduced
> > later which can be determined checking the return code of
> > get_cpuid_region_leaf().
> 
> Right.
> 
> > My preference would be:
> > 
> > 	[...]
> > 	if (!get_cpuid_region_leaf(IBS_CPUID_FEATURES, CPUID_EAX, &caps))
> 
> Right, checking get_cpuid_region_leaf() retval should happen.
> 
> > 		return IBS_CAPS_DEFAULT;
> > 
> > 	if (caps & IBS_CAPS_AVAIL)
> > 		return caps;
> > 
> > 	return 0;
> > 	[...]
> > 
> > Not too complex?
> 
> I'm wondering should we even support those old things? All revBs should
> be probably dead by now. But ok, it's not like it is a lot of code or
> so...

Since we check get_cpuid_region_leaf()'s return code here we know if
the cpud leaf exists and return IBS_CAPS_DEFAULT if not. That would
not change the refB behaviour.

Though I think that case is rare or even not existing, I would just
keep the implementation like that and as it was for for years.

> 
> > This slightly modifies the functionality so that 0 is return if
> > !IBS_CAPS_AVAIL (instead of IBS_CAPS_DEFAULT).
> 
> If !IBS_CAPS_AVAIL, then this is revB. But then you want to return
> IBS_CAPS_DEFAULT there.

No, on a rebB get_cpuid_region_leaf() would be false, meaning the
cpuid leaf is missing, function returns with IBS_CAPS_DEFAULT then.

-Robert

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-22 20:09             ` Robert Richter
@ 2024-04-22 20:41               ` Borislav Petkov
  2024-04-22 21:30                 ` Robert Richter
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2024-04-22 20:41 UTC (permalink / raw)
  To: Robert Richter
  Cc: Dave Hansen, Dave Hansen, linux-kernel, jgross, tglx, x86,
	Kim Phillips, Robert Richter

On Mon, Apr 22, 2024 at 10:09:57PM +0200, Robert Richter wrote:
> Since we check get_cpuid_region_leaf()'s return code here we know if
> the cpud leaf exists and return IBS_CAPS_DEFAULT if not. That would
> not change the refB behaviour.

Yes.

> Though I think that case is rare or even not existing, I would just
> keep the implementation like that and as it was for for years.

Yes.

> > > This slightly modifies the functionality so that 0 is return if
> > > !IBS_CAPS_AVAIL (instead of IBS_CAPS_DEFAULT).
> > 
> > If !IBS_CAPS_AVAIL, then this is revB. But then you want to return
> > IBS_CAPS_DEFAULT there.
> 
> No, on a rebB get_cpuid_region_leaf() would be false, meaning the
> cpuid leaf is missing, function returns with IBS_CAPS_DEFAULT then.

So what functionality modification do you mean then?

When will IBS_CAPS_AVAIL be not set?

GH BKDG says about that bit:

"IBSFFV. IBS feature flags valid. Revision B = 0. Revision C = 1."

so that has been set ever since on >= revC.

And on revB we'll return IBS_CAPS_DEFAULT which has IBS_CAPS_AVAIL.

IOW, I don't see how we'll return 0 if !IBS_CAPS_AVAIL because latter
doesn't happen practically with that flow.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-22 20:41               ` Borislav Petkov
@ 2024-04-22 21:30                 ` Robert Richter
  2024-04-22 21:45                   ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Robert Richter @ 2024-04-22 21:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dave Hansen, Dave Hansen, linux-kernel, jgross, tglx, x86,
	Kim Phillips, Robert Richter

On 22.04.24 22:41:46, Borislav Petkov wrote:
> On Mon, Apr 22, 2024 at 10:09:57PM +0200, Robert Richter wrote:
> > Since we check get_cpuid_region_leaf()'s return code here we know if
> > the cpud leaf exists and return IBS_CAPS_DEFAULT if not. That would
> > not change the refB behaviour.
> 
> Yes.
> 
> > Though I think that case is rare or even not existing, I would just
> > keep the implementation like that and as it was for for years.
> 
> Yes.
> 
> > > > This slightly modifies the functionality so that 0 is return if
> > > > !IBS_CAPS_AVAIL (instead of IBS_CAPS_DEFAULT).
> > > 
> > > If !IBS_CAPS_AVAIL, then this is revB. But then you want to return
> > > IBS_CAPS_DEFAULT there.
> > 
> > No, on a rebB get_cpuid_region_leaf() would be false, meaning the
> > cpuid leaf is missing, function returns with IBS_CAPS_DEFAULT then.
> 
> So what functionality modification do you mean then?
> 
> When will IBS_CAPS_AVAIL be not set?

I mean the case where the cpuid leaf exists but IBS_CAPS_AVAIL is
clear. That could be possible with some cpuid override e.g. in virt
envs.

> 
> GH BKDG says about that bit:
> 
> "IBSFFV. IBS feature flags valid. Revision B = 0. Revision C = 1."
> 
> so that has been set ever since on >= revC.
> 
> And on revB we'll return IBS_CAPS_DEFAULT which has IBS_CAPS_AVAIL.

Yes, all this is correct.

> 
> IOW, I don't see how we'll return 0 if !IBS_CAPS_AVAIL because latter
> doesn't happen practically with that flow.

Not on real hardware and if future systems not decide to enable IBS
feature bit and clear IBS_CAPS_AVAIL, which could be a valid case IMO.

Thanks,

-Robert

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-22 21:30                 ` Robert Richter
@ 2024-04-22 21:45                   ` Borislav Petkov
  2024-04-23  7:45                     ` Robert Richter
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2024-04-22 21:45 UTC (permalink / raw)
  To: Robert Richter
  Cc: Dave Hansen, Dave Hansen, linux-kernel, jgross, tglx, x86,
	Kim Phillips, Robert Richter

On Mon, Apr 22, 2024 at 11:30:24PM +0200, Robert Richter wrote:
> I mean the case where the cpuid leaf exists but IBS_CAPS_AVAIL is
> clear. That could be possible with some cpuid override e.g. in virt
> envs.

Until there is a valid use case, I don't care.

> Not on real hardware and if future systems not decide to enable IBS
> feature bit and clear IBS_CAPS_AVAIL, which could be a valid case IMO.

Then they get what they ordered and get to keep the pieces.

We don't support every insane configuration virt comes up with.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-22 21:45                   ` Borislav Petkov
@ 2024-04-23  7:45                     ` Robert Richter
  2024-04-23  8:33                       ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Robert Richter @ 2024-04-23  7:45 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dave Hansen, Dave Hansen, linux-kernel, jgross, tglx, x86,
	Kim Phillips, Robert Richter

On 22.04.24 23:45:03, Borislav Petkov wrote:
> On Mon, Apr 22, 2024 at 11:30:24PM +0200, Robert Richter wrote:
> > I mean the case where the cpuid leaf exists but IBS_CAPS_AVAIL is
> > clear. That could be possible with some cpuid override e.g. in virt
> > envs.
> 
> Until there is a valid use case, I don't care.
> 
> > Not on real hardware and if future systems not decide to enable IBS
> > feature bit and clear IBS_CAPS_AVAIL, which could be a valid case IMO.
> 
> Then they get what they ordered and get to keep the pieces.
> 
> We don't support every insane configuration virt comes up with.

I think we can assume/agree that max cpuid leaf will not decrease.
That is, future implementations will contain the IBS leaf (offset
1Bh). Second, if IBS feature bit is cleared, IBS is switched
off. Third, use IBS_CAPS_DEFAULT if the IBS leaf is missing.

Now, we just need to decide one of those for the case where the IBS
cpuid leaf exists and IBS_CAPS_AVAIL is cleared:

1) Apply 0 to caps and entirely disable IBS (my proposal).

2) Apply IBS_CAPS_DEFAULT (originally intended for GH revB) and
   enable IBS with the limited feature set for GH revB (current
   kernel implementation).

I prefer 1) as this applies IBS_CAPS_DEFAULT only if the leaf is
missing which was the original intention of IBS_CAPS_DEFAULT but can
live with 2) as it is implemented now.

Thanks,

-Robert

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] perf/x86/ibs: Use CPUID region helper
  2024-04-23  7:45                     ` Robert Richter
@ 2024-04-23  8:33                       ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2024-04-23  8:33 UTC (permalink / raw)
  To: Robert Richter
  Cc: Dave Hansen, Dave Hansen, linux-kernel, jgross, tglx, x86,
	Kim Phillips, Robert Richter

On Tue, Apr 23, 2024 at 09:45:10AM +0200, Robert Richter wrote:
> I prefer 1) as this applies IBS_CAPS_DEFAULT only if the leaf is
> missing which was the original intention of IBS_CAPS_DEFAULT but can
> live with 2) as it is implemented now.

As implemented now and be done with it. We already wasted too much time
with some hypothetical and there's bigger fish to fry so let's leave it
like it is and move on.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2024-04-23  8:33 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-03 15:35 [PATCH 0/4] [v2] x86: Add CPUID region helper and clarify Xen startup Dave Hansen
2024-04-03 15:35 ` [PATCH 1/4] x86/cpu: Add and use new CPUID region helper Dave Hansen
2024-04-04  9:04   ` Jürgen Groß
2024-04-08 16:25     ` Borislav Petkov
2024-04-03 15:35 ` [PATCH 2/4] perf/x86/ibs: Use " Dave Hansen
2024-04-16 15:12   ` Borislav Petkov
2024-04-16 15:23     ` Dave Hansen
2024-04-16 17:48       ` Borislav Petkov
2024-04-18 12:05         ` Robert Richter
2024-04-22 17:20           ` Borislav Petkov
2024-04-22 20:09             ` Robert Richter
2024-04-22 20:41               ` Borislav Petkov
2024-04-22 21:30                 ` Robert Richter
2024-04-22 21:45                   ` Borislav Petkov
2024-04-23  7:45                     ` Robert Richter
2024-04-23  8:33                       ` Borislav Petkov
2024-04-03 15:35 ` [PATCH 3/4] x86/boot: Explicitly pass NX enabling status Dave Hansen
2024-04-04 10:33   ` Jürgen Groß
2024-04-03 15:35 ` [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly Dave Hansen
2024-04-04 10:44   ` Jürgen Groß
2024-04-04 14:24     ` Dave Hansen
2024-04-04 15:05   ` Sean Christopherson
2024-04-22 17:44     ` Borislav Petkov
  -- strict thread matches above, loose matches on Subject: below --
2024-03-22 17:56 [PATCH 0/4] x86: Add CPUID region helper and clarify Xen startup Dave Hansen
2024-03-22 17:56 ` [PATCH 4/4] x86/xen: Enumerate NX from CPUID directly Dave Hansen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox