linux-pm.vger.kernel.org archive mirror
* [PATCH v5 00/16] x86: make PAT and MTRR independent from each other
@ 2022-11-02  7:46 Juergen Gross
  2022-11-02  7:47 ` [PATCH v5 12/16] x86/mtrr: add a stop_machine() handler calling only cache_cpu_init() Juergen Gross
  2022-11-02 18:04 ` [PATCH v5 00/16] x86: make PAT and MTRR independent from each other Borislav Petkov
  0 siblings, 2 replies; 7+ messages in thread
From: Juergen Gross @ 2022-11-02  7:46 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pm
  Cc: Juergen Gross, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Rafael J. Wysocki, Pavel Machek,
	Andy Lutomirski, Peter Zijlstra

Today PAT can't be used without MTRR being available, unless MTRR is at
least configured via CONFIG_MTRR and the system is running as a Xen PV
guest. In that case PAT is automatically available via the hypervisor,
but the PAT MSR can't be modified by the kernel and MTRR is disabled.

The same applies to a kernel built without MTRR support: it won't allow
the PAT MSR to be used, even though there is no technical reason for
that, other than that setting up PAT the same way on all CPUs (a
requirement of the processor's cache management) relies on some
MTRR-specific code.

Fix all of that by:

- moving the function needed by PAT from MTRR specific code one level
  up
- reworking the init sequences of MTRR and PAT to be more similar to
  each other without calling PAT from MTRR code
- removing the dependency of PAT on MTRR

Some additional cleanup reducing code size is included as well.
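
For illustration, the common entry point for the boot CPU ends up
looking like this (condensed from patch 12 of this series, with
comments added here):

void __init cache_bp_init(void)
{
	/* MTRR specific setup; a stub if CONFIG_MTRR is disabled */
	mtrr_bp_init();

	/* Program the caching configuration if MTRR and/or PAT need it */
	if (memory_caching_control)
		cache_cpu_init();
}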

Note that patches 1+2 have already been applied to tip.git x86/cpu.
They are included in this series only for reference.

Changes in V5:
- addressed comments

Changes in V4:
- new patches 10, 14, 15, 16
- split up old patch 4 into 3 patches
- addressed comments

Changes in V3:
- replace patch 1 by just adding a comment

Changes in V2:
- complete rework of the patches based on comments by Boris Petkov
- added several patches to the series

Juergen Gross (16):
  x86/mtrr: add comment for set_mtrr_state() serialization
  x86/mtrr: remove unused cyrix_set_all() function
  x86/mtrr: replace use_intel() with a local flag
  x86/mtrr: rename prepare_set() and post_set()
  x86/mtrr: split MTRR specific handling from cache dis/enabling
  x86: move some code out of arch/x86/kernel/cpu/mtrr
  x86/mtrr: Disentangle MTRR init from PAT init.
  x86/mtrr: remove set_all callback from struct mtrr_ops
  x86/mtrr: simplify mtrr_bp_init()
  x86/mtrr: get rid of __mtrr_enabled bool
  x86/mtrr: let cache_aps_delayed_init replace mtrr_aps_delayed_init
  x86/mtrr: add a stop_machine() handler calling only cache_cpu_init()
  x86: decouple PAT and MTRR handling
  x86: switch cache_ap_init() to hotplug callback
  x86: do MTRR/PAT setup on all secondary CPUs in parallel
  x86/mtrr: simplify mtrr_ops initialization

 arch/x86/include/asm/cacheinfo.h   |  19 ++++
 arch/x86/include/asm/memtype.h     |   5 +-
 arch/x86/include/asm/mtrr.h        |  17 +--
 arch/x86/kernel/cpu/cacheinfo.c    | 173 +++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/common.c       |   2 +-
 arch/x86/kernel/cpu/mtrr/amd.c     |   8 +-
 arch/x86/kernel/cpu/mtrr/centaur.c |   8 +-
 arch/x86/kernel/cpu/mtrr/cyrix.c   |  42 +------
 arch/x86/kernel/cpu/mtrr/generic.c | 127 ++++-----------------
 arch/x86/kernel/cpu/mtrr/mtrr.c    | 171 ++++------------------------
 arch/x86/kernel/cpu/mtrr/mtrr.h    |  15 +--
 arch/x86/kernel/setup.c            |  14 +--
 arch/x86/kernel/smpboot.c          |   9 +-
 arch/x86/mm/pat/memtype.c          | 152 +++++++++----------------
 arch/x86/power/cpu.c               |   3 +-
 include/linux/cpuhotplug.h         |   1 +
 16 files changed, 308 insertions(+), 458 deletions(-)

-- 
2.35.3


* [PATCH v5 12/16] x86/mtrr: add a stop_machine() handler calling only cache_cpu_init()
  2022-11-02  7:46 [PATCH v5 00/16] x86: make PAT and MTRR independent from each other Juergen Gross
@ 2022-11-02  7:47 ` Juergen Gross
  2022-11-02 18:04 ` [PATCH v5 00/16] x86: make PAT and MTRR independent from each other Borislav Petkov
  1 sibling, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2022-11-02  7:47 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pm
  Cc: Juergen Gross, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Rafael J. Wysocki, Pavel Machek

Instead of having a stop_machine() handler for either a specific MTRR
register or all state at once, add a handler just for calling
cache_cpu_init() if appropriate.

Add functions for calling stop_machine() with this handler as well.

Add a generic replacement for mtrr_bp_restore() and a wrapper for
mtrr_bp_init().

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- completely new replacement of former patch 2
V5:
- add a missing mtrr_bp_init() stub (Borislav Petkov)
---
 arch/x86/include/asm/cacheinfo.h |  5 +-
 arch/x86/include/asm/mtrr.h      |  8 +--
 arch/x86/kernel/cpu/cacheinfo.c  | 59 ++++++++++++++++++++-
 arch/x86/kernel/cpu/common.c     |  3 +-
 arch/x86/kernel/cpu/mtrr/mtrr.c  | 88 +-------------------------------
 arch/x86/kernel/setup.c          |  3 +-
 arch/x86/kernel/smpboot.c        |  4 +-
 arch/x86/power/cpu.c             |  3 +-
 8 files changed, 74 insertions(+), 99 deletions(-)

diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h
index e443fcc1f045..a0ef46e9f453 100644
--- a/arch/x86/include/asm/cacheinfo.h
+++ b/arch/x86/include/asm/cacheinfo.h
@@ -12,8 +12,11 @@ void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu);
 
 void cache_disable(void);
 void cache_enable(void);
-void cache_cpu_init(void);
 void set_cache_aps_delayed_init(bool val);
 bool get_cache_aps_delayed_init(void);
+void cache_bp_init(void);
+void cache_bp_restore(void);
+void cache_ap_init(void);
+void cache_aps_init(void);
 
 #endif /* _ASM_X86_CACHEINFO_H */
diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 5d31219c8529..f0eeaf6e5f5f 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -25,13 +25,12 @@
 
 #include <uapi/asm/mtrr.h>
 
-void mtrr_bp_init(void);
-
 /*
  * The following functions are for use by other drivers that cannot use
  * arch_phys_wc_add and arch_phys_wc_del.
  */
 # ifdef CONFIG_MTRR
+void mtrr_bp_init(void);
 extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
 extern void mtrr_save_fixed_ranges(void *);
 extern void mtrr_save_state(void);
@@ -42,8 +41,6 @@ extern int mtrr_add_page(unsigned long base, unsigned long size,
 extern int mtrr_del(int reg, unsigned long base, unsigned long size);
 extern int mtrr_del_page(int reg, unsigned long base, unsigned long size);
 extern void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi);
-extern void mtrr_ap_init(void);
-extern void mtrr_aps_init(void);
 extern void mtrr_bp_restore(void);
 extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
 extern int amd_special_default_mtrr(void);
@@ -85,8 +82,7 @@ static inline int mtrr_trim_uncached_memory(unsigned long end_pfn)
 static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi)
 {
 }
-#define mtrr_ap_init() do {} while (0)
-#define mtrr_aps_init() do {} while (0)
+#define mtrr_bp_init() do {} while (0)
 #define mtrr_bp_restore() do {} while (0)
 #define mtrr_disable() do {} while (0)
 #define mtrr_enable() do {} while (0)
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 931ba3fb1363..a92099569617 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -15,6 +15,7 @@
 #include <linux/capability.h>
 #include <linux/sysfs.h>
 #include <linux/pci.h>
+#include <linux/stop_machine.h>
 
 #include <asm/cpufeature.h>
 #include <asm/cacheinfo.h>
@@ -1121,7 +1122,7 @@ void cache_enable(void) __releases(cache_disable_lock)
 	raw_spin_unlock(&cache_disable_lock);
 }
 
-void cache_cpu_init(void)
+static void cache_cpu_init(void)
 {
 	unsigned long flags;
 
@@ -1149,3 +1150,59 @@ bool get_cache_aps_delayed_init(void)
 {
 	return cache_aps_delayed_init;
 }
+
+static int cache_rendezvous_handler(void *unused)
+{
+	if (get_cache_aps_delayed_init() || !cpu_online(smp_processor_id()))
+		cache_cpu_init();
+
+	return 0;
+}
+
+void __init cache_bp_init(void)
+{
+	mtrr_bp_init();
+
+	if (memory_caching_control)
+		cache_cpu_init();
+}
+
+void cache_bp_restore(void)
+{
+	if (memory_caching_control)
+		cache_cpu_init();
+}
+
+void cache_ap_init(void)
+{
+	if (!memory_caching_control || get_cache_aps_delayed_init())
+		return;
+
+	/*
+	 * Ideally we should hold mtrr_mutex here to avoid MTRR entries
+	 * changed, but this routine will be called in CPU boot time,
+	 * holding the lock breaks it.
+	 *
+	 * This routine is called in two cases:
+	 *
+	 *   1. very early time of software resume, when there absolutely
+	 *      isn't MTRR entry changes;
+	 *
+	 *   2. CPU hotadd time. We let mtrr_add/del_page hold cpuhotplug
+	 *      lock to prevent MTRR entry changes
+	 */
+	stop_machine_from_inactive_cpu(cache_rendezvous_handler, NULL,
+				       cpu_callout_mask);
+}
+
+/*
+ * Delayed cache initialization for all AP's
+ */
+void cache_aps_init(void)
+{
+	if (!memory_caching_control || !get_cache_aps_delayed_init())
+		return;
+
+	stop_machine(cache_rendezvous_handler, NULL, cpu_online_mask);
+	set_cache_aps_delayed_init(false);
+}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3e508f239098..fd058b547f8d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -52,6 +52,7 @@
 #include <asm/cpu.h>
 #include <asm/mce.h>
 #include <asm/msr.h>
+#include <asm/cacheinfo.h>
 #include <asm/memtype.h>
 #include <asm/microcode.h>
 #include <asm/microcode_intel.h>
@@ -1948,7 +1949,7 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c)
 #ifdef CONFIG_X86_32
 	enable_sep_cpu();
 #endif
-	mtrr_ap_init();
+	cache_ap_init();
 	validate_apic_and_package_id(c);
 	x86_spec_ctrl_setup_ap();
 	update_srbds_msr();
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.c b/arch/x86/kernel/cpu/mtrr/mtrr.c
index 15ee6d72fb1f..99b6973a69b4 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.c
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.c
@@ -73,9 +73,6 @@ static const struct mtrr_ops *mtrr_ops[X86_VENDOR_NUM] __ro_after_init;
 
 const struct mtrr_ops *mtrr_if;
 
-static void set_mtrr(unsigned int reg, unsigned long base,
-		     unsigned long size, mtrr_type type);
-
 void __init set_mtrr_ops(const struct mtrr_ops *ops)
 {
 	if (ops->vendor && ops->vendor < X86_VENDOR_NUM)
@@ -158,26 +155,8 @@ static int mtrr_rendezvous_handler(void *info)
 {
 	struct set_mtrr_data *data = info;
 
-	/*
-	 * We use this same function to initialize the mtrrs during boot,
-	 * resume, runtime cpu online and on an explicit request to set a
-	 * specific MTRR.
-	 *
-	 * During boot or suspend, the state of the boot cpu's mtrrs has been
-	 * saved, and we want to replicate that across all the cpus that come
-	 * online (either at the end of boot or resume or during a runtime cpu
-	 * online). If we're doing that, @reg is set to something special and on
-	 * all the CPUs we do cache_cpu_init() (On the logical CPU that
-	 * started the boot/resume sequence, this might be a duplicate
-	 * cache_cpu_init()).
-	 */
-	if (data->smp_reg != ~0U) {
-		mtrr_if->set(data->smp_reg, data->smp_base,
-			     data->smp_size, data->smp_type);
-	} else if (get_cache_aps_delayed_init() ||
-		   !cpu_online(smp_processor_id())) {
-		cache_cpu_init();
-	}
+	mtrr_if->set(data->smp_reg, data->smp_base,
+		     data->smp_size, data->smp_type);
 	return 0;
 }
 
@@ -247,19 +226,6 @@ static void set_mtrr_cpuslocked(unsigned int reg, unsigned long base,
 	stop_machine_cpuslocked(mtrr_rendezvous_handler, &data, cpu_online_mask);
 }
 
-static void set_mtrr_from_inactive_cpu(unsigned int reg, unsigned long base,
-				      unsigned long size, mtrr_type type)
-{
-	struct set_mtrr_data data = { .smp_reg = reg,
-				      .smp_base = base,
-				      .smp_size = size,
-				      .smp_type = type
-				    };
-
-	stop_machine_from_inactive_cpu(mtrr_rendezvous_handler, &data,
-				       cpu_callout_mask);
-}
-
 /**
  * mtrr_add_page - Add a memory type region
  * @base: Physical base address of region in pages (in units of 4 kB!)
@@ -761,7 +727,6 @@ void __init mtrr_bp_init(void)
 			if (get_mtrr_state()) {
 				memory_caching_control |= CACHE_MTRR | CACHE_PAT;
 				changed_by_mtrr_cleanup = mtrr_cleanup(phys_addr);
-				cache_cpu_init();
 			} else {
 				mtrr_if = NULL;
 			}
@@ -780,27 +745,6 @@ void __init mtrr_bp_init(void)
 	}
 }
 
-void mtrr_ap_init(void)
-{
-	if (!memory_caching_control || get_cache_aps_delayed_init())
-		return;
-
-	/*
-	 * Ideally we should hold mtrr_mutex here to avoid mtrr entries
-	 * changed, but this routine will be called in cpu boot time,
-	 * holding the lock breaks it.
-	 *
-	 * This routine is called in two cases:
-	 *
-	 *   1. very early time of software resume, when there absolutely
-	 *      isn't mtrr entry changes;
-	 *
-	 *   2. cpu hotadd time. We let mtrr_add/del_page hold cpuhotplug
-	 *      lock to prevent mtrr entry changes
-	 */
-	set_mtrr_from_inactive_cpu(~0U, 0, 0, 0);
-}
-
 /**
  * mtrr_save_state - Save current fixed-range MTRR state of the first
  *	cpu in cpu_online_mask.
@@ -816,34 +760,6 @@ void mtrr_save_state(void)
 	smp_call_function_single(first_cpu, mtrr_save_fixed_ranges, NULL, 1);
 }
 
-/*
- * Delayed MTRR initialization for all AP's
- */
-void mtrr_aps_init(void)
-{
-	if (!memory_caching_control)
-		return;
-
-	/*
-	 * Check if someone has requested the delay of AP MTRR initialization,
-	 * by doing set_mtrr_aps_delayed_init(), prior to this point. If not,
-	 * then we are done.
-	 */
-	if (!get_cache_aps_delayed_init())
-		return;
-
-	set_mtrr(~0U, 0, 0, 0);
-	set_cache_aps_delayed_init(false);
-}
-
-void mtrr_bp_restore(void)
-{
-	if (!memory_caching_control)
-		return;
-
-	cache_cpu_init();
-}
-
 static int __init mtrr_init_finialize(void)
 {
 	if (!mtrr_enabled())
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 216fee7144ee..e0e185ee0229 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -34,6 +34,7 @@
 #include <asm/numa.h>
 #include <asm/bios_ebda.h>
 #include <asm/bugs.h>
+#include <asm/cacheinfo.h>
 #include <asm/cpu.h>
 #include <asm/efi.h>
 #include <asm/gart.h>
@@ -1075,7 +1076,7 @@ void __init setup_arch(char **cmdline_p)
 
 	/* update e820 for memory not covered by WB MTRRs */
 	if (IS_ENABLED(CONFIG_MTRR))
-		mtrr_bp_init();
+		cache_bp_init();
 	else
 		pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
 
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 13c71ab29d84..1b61a480c966 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1445,7 +1445,7 @@ void arch_thaw_secondary_cpus_begin(void)
 
 void arch_thaw_secondary_cpus_end(void)
 {
-	mtrr_aps_init();
+	cache_aps_init();
 }
 
 /*
@@ -1488,7 +1488,7 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
 
 	nmi_selftest();
 	impress_friends();
-	mtrr_aps_init();
+	cache_aps_init();
 }
 
 static int __initdata setup_possible_cpus = -1;
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index bb176c72891c..754221c9a1c3 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -23,6 +23,7 @@
 #include <asm/fpu/api.h>
 #include <asm/debugreg.h>
 #include <asm/cpu.h>
+#include <asm/cacheinfo.h>
 #include <asm/mmu_context.h>
 #include <asm/cpu_device_id.h>
 #include <asm/microcode.h>
@@ -261,7 +262,7 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
 	do_fpu_end();
 	tsc_verify_tsc_adjust(true);
 	x86_platform.restore_sched_clock_state();
-	mtrr_bp_restore();
+	cache_bp_restore();
 	perf_restore_debug_store();
 
 	c = &cpu_data(smp_processor_id());
-- 
2.35.3


* Re: [PATCH v5 00/16] x86: make PAT and MTRR independent from each other
  2022-11-02  7:46 [PATCH v5 00/16] x86: make PAT and MTRR independent from each other Juergen Gross
  2022-11-02  7:47 ` [PATCH v5 12/16] x86/mtrr: add a stop_machine() handler calling only cache_cpu_init() Juergen Gross
@ 2022-11-02 18:04 ` Borislav Petkov
  2022-11-03  8:40   ` Juergen Gross
  1 sibling, 1 reply; 7+ messages in thread
From: Borislav Petkov @ 2022-11-02 18:04 UTC (permalink / raw)
  To: Juergen Gross
  Cc: linux-kernel, x86, linux-pm, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H. Peter Anvin, Rafael J. Wysocki, Pavel Machek,
	Andy Lutomirski, Peter Zijlstra

On Wed, Nov 02, 2022 at 08:46:57AM +0100, Juergen Gross wrote:
> Today PAT can't be used without MTRR being available, unless MTRR is at
> least configured via CONFIG_MTRR and the system is running as a Xen PV
> guest. In that case PAT is automatically available via the hypervisor,
> but the PAT MSR can't be modified by the kernel and MTRR is disabled.
> 
> The same applies to a kernel built without MTRR support: it won't allow
> the PAT MSR to be used, even though there is no technical reason for
> that, other than that setting up PAT the same way on all CPUs (a
> requirement of the processor's cache management) relies on some
> MTRR-specific code.
> 
> Fix all of that by:

One of the AMD test boxes here says with this:

...
[    0.863466] PCI: not using MMCONFIG
[    0.863475] PCI: Using configuration type 1 for base access
[    0.863478] PCI: Using configuration type 1 for extended access
[    0.866733] mtrr: your CPUs had inconsistent MTRRdefType settings
[    0.866737] mtrr: probably your BIOS does not setup all CPUs.
[    0.866740] mtrr: corrected configuration.
[    0.869350] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
...

Previous logs don't have it:

PCI: not using MMCONFIG
PCI: Using configuration type 1 for base access
PCI: Using configuration type 1 for extended access
kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH v5 00/16] x86: make PAT and MTRR independent from each other
  2022-11-02 18:04 ` [PATCH v5 00/16] x86: make PAT and MTRR independent from each other Borislav Petkov
@ 2022-11-03  8:40   ` Juergen Gross
  2022-11-03 16:15     ` Borislav Petkov
  0 siblings, 1 reply; 7+ messages in thread
From: Juergen Gross @ 2022-11-03  8:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, linux-pm, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H. Peter Anvin, Rafael J. Wysocki, Pavel Machek,
	Andy Lutomirski, Peter Zijlstra


On 02.11.22 19:04, Borislav Petkov wrote:
> On Wed, Nov 02, 2022 at 08:46:57AM +0100, Juergen Gross wrote:
>> Today PAT can't be used without MTRR being available, unless MTRR is at
>> least configured via CONFIG_MTRR and the system is running as a Xen PV
>> guest. In that case PAT is automatically available via the hypervisor,
>> but the PAT MSR can't be modified by the kernel and MTRR is disabled.
>>
>> The same applies to a kernel built without MTRR support: it won't allow
>> the PAT MSR to be used, even though there is no technical reason for
>> that, other than that setting up PAT the same way on all CPUs (a
>> requirement of the processor's cache management) relies on some
>> MTRR-specific code.
>>
>> Fix all of that by:
> 
> One of the AMD test boxes here says with this:
> 
> ...
> [    0.863466] PCI: not using MMCONFIG
> [    0.863475] PCI: Using configuration type 1 for base access
> [    0.863478] PCI: Using configuration type 1 for extended access
> [    0.866733] mtrr: your CPUs had inconsistent MTRRdefType settings
> [    0.866737] mtrr: probably your BIOS does not setup all CPUs.
> [    0.866740] mtrr: corrected configuration.
> [    0.869350] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
> ...
> 
> Previous logs don't have it:
> 
> PCI: not using MMCONFIG
> PCI: Using configuration type 1 for base access
> PCI: Using configuration type 1 for extended access
> kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
> 

Weird. I can't spot any modification which could have caused that.

Would it be possible to identify the patch causing that?


Juergen

* Re: [PATCH v5 00/16] x86: make PAT and MTRR independent from each other
  2022-11-03  8:40   ` Juergen Gross
@ 2022-11-03 16:15     ` Borislav Petkov
  2022-11-07 19:25       ` Borislav Petkov
  0 siblings, 1 reply; 7+ messages in thread
From: Borislav Petkov @ 2022-11-03 16:15 UTC (permalink / raw)
  To: Juergen Gross
  Cc: linux-kernel, x86, linux-pm, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H. Peter Anvin, Rafael J. Wysocki, Pavel Machek,
	Andy Lutomirski, Peter Zijlstra

On Thu, Nov 03, 2022 at 09:40:32AM +0100, Juergen Gross wrote:
> Would it be possible to identify the patch causing that?

Lemme try to find a smaller box which shows that too - that one is a
pain to bisect on.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH v5 00/16] x86: make PAT and MTRR independent from each other
  2022-11-03 16:15     ` Borislav Petkov
@ 2022-11-07 19:25       ` Borislav Petkov
  2022-11-08  7:30         ` Juergen Gross
  0 siblings, 1 reply; 7+ messages in thread
From: Borislav Petkov @ 2022-11-07 19:25 UTC (permalink / raw)
  To: Juergen Gross
  Cc: linux-kernel, x86, linux-pm, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H. Peter Anvin, Rafael J. Wysocki, Pavel Machek,
	Andy Lutomirski, Peter Zijlstra

On Thu, Nov 03, 2022 at 05:15:52PM +0100, Borislav Petkov wrote:
> Lemme try to find a smaller box which shows that too - that one is a
> pain to bisect on.

Ok, couldn't find a smaller one (or maybe it had to be a big one to
tickle this out).

So I think it is the parallel setup thing:

x86/mtrr: Do MTRR/PAT setup on all secondary CPUs in parallel

Note that before it, it would do the configuration sequentially on each
CPU:

[    0.759239] MTRR: prepare_set: CPU83, MSR_MTRRdefType: 0x0, read: (0xc00:0)
[    0.759239] MTRR: set_mtrr_state: CPU83, mtrr_deftype_lo: 0xc00, mtrr_state.def_type: 0, mtrr_state.enabled: 3
[    0.760794] MTRR: post_set: CPU83, MSR_MTRRdefType will write: (0xc00:0)
[    0.761151] MTRR: prepare_set: CPU70, MSR_MTRRdefType: 0x0, read: (0xc00:0)
[    0.761151] MTRR: set_mtrr_state: CPU70, mtrr_deftype_lo: 0xc00, mtrr_state.def_type: 0, mtrr_state.enabled: 3
[    0.761151] MTRR: post_set: CPU70, MSR_MTRRdefType will write: (0xc00:0)
...

and so on.

Now, it would do it all in parallel:

[    0.762006] MTRR: mtrr_disable: CPU70, MSR_MTRRdefType: 0x0, read: (0xc00:0)
[    0.761916] MTRR: mtrr_disable: CPU18, MSR_MTRRdefType: 0x0, read: (0xc00:0)
[    0.761808] MTRR: mtrr_disable: CPU82, MSR_MTRRdefType: 0x0, read: (0xc00:0)
[    0.762593] MTRR: mtrr_disable: CPU6, MSR_MTRRdefType: 0x0, read: (0x0:0)
								      ^^^^^^

Note that last thing. That comes from (with debug output added):

void mtrr_disable(struct cache_state *state)
{
        unsigned int cpu = smp_processor_id();
        u64 msrval;

        /* Save MTRR state */
        rdmsr(MSR_MTRRdefType, state->mtrr_deftype_lo, state->mtrr_deftype_hi);

        /* Disable MTRRs, and set the default type to uncached */
        mtrr_wrmsr(MSR_MTRRdefType, state->mtrr_deftype_lo & ~0xcff,
                   state->mtrr_deftype_hi);

        rdmsrl(MSR_MTRRdefType, msrval);

        pr_info("%s: CPU%d, MSR_MTRRdefType: 0x%llx, read: (0x%x:%x)\n",
                __func__, cpu, msrval, state->mtrr_deftype_lo, state->mtrr_deftype_hi);
}

The "read: (0x0:0)" basically says that

	state->mtrr_deftype_lo, state->mtrr_deftype_hi

are both 0 already. BUT(!), they should NOT be. The low piece is 0xc00 on most
cores except a handful and it means that MTRRs and Fixed Range are
enabled. In total, they're these cores here:

[    0.762593] MTRR: mtrr_disable: CPU6, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762247] MTRR: mtrr_disable: CPU26, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762685] MTRR: mtrr_disable: CPU68, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762725] MTRR: mtrr_disable: CPU17, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762685] MTRR: mtrr_disable: CPU69, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762800] MTRR: mtrr_disable: CPU1, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762734] MTRR: mtrr_disable: CPU13, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762720] MTRR: mtrr_disable: CPU24, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762696] MTRR: mtrr_disable: CPU66, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762716] MTRR: mtrr_disable: CPU48, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762693] MTRR: mtrr_disable: CPU57, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762519] MTRR: mtrr_disable: CPU87, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762532] MTRR: mtrr_disable: CPU58, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762755] MTRR: mtrr_disable: CPU32, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762693] MTRR: mtrr_disable: CPU52, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762861] MTRR: mtrr_disable: CPU0, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762724] MTRR: mtrr_disable: CPU21, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762640] MTRR: mtrr_disable: CPU15, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762615] MTRR: mtrr_disable: CPU50, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762741] MTRR: mtrr_disable: CPU40, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762738] MTRR: mtrr_disable: CPU37, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762716] MTRR: mtrr_disable: CPU25, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762512] MTRR: mtrr_disable: CPU59, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762721] MTRR: mtrr_disable: CPU45, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762682] MTRR: mtrr_disable: CPU56, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762583] MTRR: mtrr_disable: CPU124, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762751] MTRR: mtrr_disable: CPU12, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762741] MTRR: mtrr_disable: CPU9, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762575] MTRR: mtrr_disable: CPU51, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762632] MTRR: mtrr_disable: CPU100, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762688] MTRR: mtrr_disable: CPU61, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762610] MTRR: mtrr_disable: CPU105, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762721] MTRR: mtrr_disable: CPU20, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.762583] MTRR: mtrr_disable: CPU47, MSR_MTRRdefType: 0x0, read: (0x0:0)
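
For reference, 0xc00 decodes as follows (MTRRdefType layout per the
SDM; the macro names below are only illustrative, not the kernel's):

/*
 * MSR_MTRRdefType (IA32_MTRR_DEF_TYPE, MSR 0x2ff), low 32 bits:
 *   bits 2:0 - default memory type
 *   bit  10  - FE: fixed-range MTRRs enable
 *   bit  11  - E:  MTRRs enable
 *
 * 0xc00 = E | FE with default type 0 (UC); the "& ~0xcff" in
 * mtrr_disable() above clears E, FE and the default type, i.e.
 * it disables MTRRs and makes everything default to uncached.
 */
#define MTRR_DEFTYPE_FE		(1u << 10)	/* illustrative */
#define MTRR_DEFTYPE_E		(1u << 11)	/* illustrative */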

Now, if I add MFENCEs around those RDMSRs:

void mtrr_disable(struct cache_state *state)
{
        unsigned int cpu = smp_processor_id();
        u64 msrval;

        /* Save MTRR state */
        rdmsr(MSR_MTRRdefType, state->mtrr_deftype_lo, state->mtrr_deftype_hi);

        __mb();

        /* Disable MTRRs, and set the default type to uncached */
        mtrr_wrmsr(MSR_MTRRdefType, state->mtrr_deftype_lo & ~0xcff,
                   state->mtrr_deftype_hi);

        __mb();

        rdmsrl(MSR_MTRRdefType, msrval);

        pr_info("%s: CPU%d, MSR_MTRRdefType: 0x%llx, read: (0x%x:%x)\n",
                __func__, cpu, msrval, state->mtrr_deftype_lo, state->mtrr_deftype_hi);

        __mb();
}

the number of affected cores goes down:

[    0.765260] MTRR: mtrr_disable: CPU6, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765462] MTRR: mtrr_disable: CPU5, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765242] MTRR: mtrr_disable: CPU22, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765522] MTRR: mtrr_disable: CPU0, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765474] MTRR: mtrr_disable: CPU1, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765207] MTRR: mtrr_disable: CPU54, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765225] MTRR: mtrr_disable: CPU8, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765282] MTRR: mtrr_disable: CPU88, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765150] MTRR: mtrr_disable: CPU119, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765370] MTRR: mtrr_disable: CPU49, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765395] MTRR: mtrr_disable: CPU16, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765348] MTRR: mtrr_disable: CPU52, MSR_MTRRdefType: 0x0, read: (0x0:0)
[    0.765270] MTRR: mtrr_disable: CPU58, MSR_MTRRdefType: 0x0, read: (0x0:0)

which basically hints at some speculative fun where we end up reading
the MSR *after* the write to it has already happened. After this thing:

        /* Disable MTRRs, and set the default type to uncached */
        mtrr_wrmsr(MSR_MTRRdefType, state->mtrr_deftype_lo & ~0xcff,
                   state->mtrr_deftype_hi);

and thus when we read it, we already read the disabled state. But this
is only a conjecture because I still have no clear idea how TF would
that even happen?!?

Needless to say, this fixes it, ofc:

diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 3805a6d32d37..4a685898caf3 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -1116,12 +1116,14 @@ void cache_enable(struct cache_state *state)
                __write_cr4(state->cr4);
 }
 
+static DEFINE_RAW_SPINLOCK(set_atomicity_lock);
+
 static void cache_cpu_init(void)
 {
        unsigned long flags;
        struct cache_state state = { };
 
-       local_irq_save(flags);
+       raw_spin_lock_irqsave(&set_atomicity_lock, flags);
        cache_disable(&state);
 
        if (memory_caching_control & CACHE_MTRR)
@@ -1131,7 +1133,7 @@ static void cache_cpu_init(void)
                pat_cpu_init();
 
        cache_enable(&state);
-       local_irq_restore(flags);
+       raw_spin_unlock_irqrestore(&set_atomicity_lock, flags);
 }
 
 static bool cache_aps_delayed_init = true;

---

and frankly, considering how we have bigger fish to fry, I'd say we do
it the old way and leave that can'o'worms half-opened.

Unless you wanna continue poking at it. I can give you access to that
box at work...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH v5 00/16] x86: make PAT and MTRR independent from each other
  2022-11-07 19:25       ` Borislav Petkov
@ 2022-11-08  7:30         ` Juergen Gross
  0 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2022-11-08  7:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, linux-pm, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H. Peter Anvin, Rafael J. Wysocki, Pavel Machek,
	Andy Lutomirski, Peter Zijlstra


On 07.11.22 20:25, Borislav Petkov wrote:
> On Thu, Nov 03, 2022 at 05:15:52PM +0100, Borislav Petkov wrote:
>> Lemme try to find a smaller box which shows that too - that one is a
>> pain to bisect on.
> 
> Ok, couldn't find a smaller one (or maybe it had to be a big one to
> tickle this out).
> 
> So I think it is the parallel setup thing:
> 
> x86/mtrr: Do MTRR/PAT setup on all secondary CPUs in parallel
> 
> Note that before it, it would do the configuration sequentially on each
> CPU:
> 
> [    0.759239] MTRR: prepare_set: CPU83, MSR_MTRRdefType: 0x0, read: (0xc00:0)
> [    0.759239] MTRR: set_mtrr_state: CPU83, mtrr_deftype_lo: 0xc00, mtrr_state.def_type: 0, mtrr_state.enabled: 3
> [    0.760794] MTRR: post_set: CPU83, MSR_MTRRdefType will write: (0xc00:0)
> [    0.761151] MTRR: prepare_set: CPU70, MSR_MTRRdefType: 0x0, read: (0xc00:0)
> [    0.761151] MTRR: set_mtrr_state: CPU70, mtrr_deftype_lo: 0xc00, mtrr_state.def_type: 0, mtrr_state.enabled: 3
> [    0.761151] MTRR: post_set: CPU70, MSR_MTRRdefType will write: (0xc00:0)
> ...
> 
> and so on.
> 
> Now, it would do it all in parallel:
> 
> [    0.762006] MTRR: mtrr_disable: CPU70, MSR_MTRRdefType: 0x0, read: (0xc00:0)
> [    0.761916] MTRR: mtrr_disable: CPU18, MSR_MTRRdefType: 0x0, read: (0xc00:0)
> [    0.761808] MTRR: mtrr_disable: CPU82, MSR_MTRRdefType: 0x0, read: (0xc00:0)
> [    0.762593] MTRR: mtrr_disable: CPU6, MSR_MTRRdefType: 0x0, read: (0x0:0)
> 								      ^^^^^^
> 
> Note that last thing. That comes from (with debug output added):
> 
> void mtrr_disable(struct cache_state *state)
> {
>          unsigned int cpu = smp_processor_id();
>          u64 msrval;
> 
>          /* Save MTRR state */
>          rdmsr(MSR_MTRRdefType, state->mtrr_deftype_lo, state->mtrr_deftype_hi);
> 
>          /* Disable MTRRs, and set the default type to uncached */
>          mtrr_wrmsr(MSR_MTRRdefType, state->mtrr_deftype_lo & ~0xcff,
>                     state->mtrr_deftype_hi);
> 
>          rdmsrl(MSR_MTRRdefType, msrval);
> 
>          pr_info("%s: CPU%d, MSR_MTRRdefType: 0x%llx, read: (0x%x:%x)\n",
>                  __func__, cpu, msrval, state->mtrr_deftype_lo, state->mtrr_deftype_hi);
> }
> 
> The "read: (0x0:0)" basically says that
> 
> 	state->mtrr_deftype_lo, state->mtrr_deftype_hi
> 
> are both 0 already. BUT(!), they should NOT be. The low piece is 0xc00 on most
> cores except a handful and it means that MTRRs and Fixed Range are
> enabled. In total, they're these cores here:
> 
> [    0.762593] MTRR: mtrr_disable: CPU6, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762247] MTRR: mtrr_disable: CPU26, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762685] MTRR: mtrr_disable: CPU68, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762725] MTRR: mtrr_disable: CPU17, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762685] MTRR: mtrr_disable: CPU69, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762800] MTRR: mtrr_disable: CPU1, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762734] MTRR: mtrr_disable: CPU13, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762720] MTRR: mtrr_disable: CPU24, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762696] MTRR: mtrr_disable: CPU66, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762716] MTRR: mtrr_disable: CPU48, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762693] MTRR: mtrr_disable: CPU57, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762519] MTRR: mtrr_disable: CPU87, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762532] MTRR: mtrr_disable: CPU58, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762755] MTRR: mtrr_disable: CPU32, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762693] MTRR: mtrr_disable: CPU52, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762861] MTRR: mtrr_disable: CPU0, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762724] MTRR: mtrr_disable: CPU21, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762640] MTRR: mtrr_disable: CPU15, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762615] MTRR: mtrr_disable: CPU50, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762741] MTRR: mtrr_disable: CPU40, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762738] MTRR: mtrr_disable: CPU37, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762716] MTRR: mtrr_disable: CPU25, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762512] MTRR: mtrr_disable: CPU59, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762721] MTRR: mtrr_disable: CPU45, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762682] MTRR: mtrr_disable: CPU56, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762583] MTRR: mtrr_disable: CPU124, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762751] MTRR: mtrr_disable: CPU12, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762741] MTRR: mtrr_disable: CPU9, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762575] MTRR: mtrr_disable: CPU51, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762632] MTRR: mtrr_disable: CPU100, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762688] MTRR: mtrr_disable: CPU61, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762610] MTRR: mtrr_disable: CPU105, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762721] MTRR: mtrr_disable: CPU20, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.762583] MTRR: mtrr_disable: CPU47, MSR_MTRRdefType: 0x0, read: (0x0:0)
> 
> Now, if I add MFENCEs around those RDMSRs:
> 
> void mtrr_disable(struct cache_state *state)
> {
>          unsigned int cpu = smp_processor_id();
>          u64 msrval;
> 
>          /* Save MTRR state */
>          rdmsr(MSR_MTRRdefType, state->mtrr_deftype_lo, state->mtrr_deftype_hi);
> 
>          __mb();
> 
>          /* Disable MTRRs, and set the default type to uncached */
>          mtrr_wrmsr(MSR_MTRRdefType, state->mtrr_deftype_lo & ~0xcff,
>                     state->mtrr_deftype_hi);
> 
>          __mb();
> 
>          rdmsrl(MSR_MTRRdefType, msrval);
> 
>          pr_info("%s: CPU%d, MSR_MTRRdefType: 0x%llx, read: (0x%x:%x)\n",
>                  __func__, cpu, msrval, state->mtrr_deftype_lo, state->mtrr_deftype_hi);
> 
>          __mb();
> }
> 
> the number of affected cores goes down:

Probably not because of the fencing, but because of the different timing.

> 
> [    0.765260] MTRR: mtrr_disable: CPU6, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765462] MTRR: mtrr_disable: CPU5, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765242] MTRR: mtrr_disable: CPU22, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765522] MTRR: mtrr_disable: CPU0, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765474] MTRR: mtrr_disable: CPU1, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765207] MTRR: mtrr_disable: CPU54, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765225] MTRR: mtrr_disable: CPU8, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765282] MTRR: mtrr_disable: CPU88, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765150] MTRR: mtrr_disable: CPU119, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765370] MTRR: mtrr_disable: CPU49, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765395] MTRR: mtrr_disable: CPU16, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765348] MTRR: mtrr_disable: CPU52, MSR_MTRRdefType: 0x0, read: (0x0:0)
> [    0.765270] MTRR: mtrr_disable: CPU58, MSR_MTRRdefType: 0x0, read: (0x0:0)
> 
> which basically hints at some speculative fun where we end up reading
> the MSR *after* the write to it has already happened. After this thing:
> 
>          /* Disable MTRRs, and set the default type to uncached */
>          mtrr_wrmsr(MSR_MTRRdefType, state->mtrr_deftype_lo & ~0xcff,
>                     state->mtrr_deftype_hi);
> 
> and thus when we read it, we already read the disabled state. But this
> is only a conjecture because I still have no clear idea how TF would
> that even happen?!?

Yeah, and why doesn't it happen when we handle only one cpu at a time?

There might be some interaction between the cpus ...

> 
> Needless to say, this fixes it, ofc:
> 
> diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
> index 3805a6d32d37..4a685898caf3 100644
> --- a/arch/x86/kernel/cpu/cacheinfo.c
> +++ b/arch/x86/kernel/cpu/cacheinfo.c
> @@ -1116,12 +1116,14 @@ void cache_enable(struct cache_state *state)
>                  __write_cr4(state->cr4);
>   }
>   
> +static DEFINE_RAW_SPINLOCK(set_atomicity_lock);
> +
>   static void cache_cpu_init(void)
>   {
>          unsigned long flags;
>          struct cache_state state = { };
>   
> -       local_irq_save(flags);
> +       raw_spin_lock_irqsave(&set_atomicity_lock, flags);
>          cache_disable(&state);
>   
>          if (memory_caching_control & CACHE_MTRR)
> @@ -1131,7 +1133,7 @@ static void cache_cpu_init(void)
>                  pat_cpu_init();
>   
>          cache_enable(&state);
> -       local_irq_restore(flags);
> +       raw_spin_unlock_irqrestore(&set_atomicity_lock, flags);
>   }
>   
>   static bool cache_aps_delayed_init = true;
> 
> ---
> 
> and frankly, considering how we have bigger fish to fry, I'd say we do
> it the old way and leave that can'o'worms half-opened.

I agree to keep this patch out of the series for now.

> 
> Unless you wanna continue poking at it. I can give you access to that
> box at work...

Yes, please. I suspect there are some additional requirements for updating
MTRR in parallel, or this is "just" a cpu bug.


Juergen

