* [PATCH 1/3] x86: move some code out of arch/x86/kernel/cpu/mtrr
2022-07-15 14:25 [PATCH 0/3] x86: make pat and mtrr independent from each other Juergen Gross
@ 2022-07-15 14:25 ` Juergen Gross
2022-07-18 12:20 ` Borislav Petkov
2022-07-15 14:25 ` [PATCH 2/3] x86: add wrapper functions for mtrr functions handling also pat Juergen Gross
` (3 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Juergen Gross @ 2022-07-15 14:25 UTC (permalink / raw)
To: xen-devel, x86, linux-kernel
Cc: brchuckz, jbeulich, Juergen Gross, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, stable
Prepare making PAT and MTRR support independent from each other by
moving some code needed by both out of the MTRR specific sources.
Cc: <stable@vger.kernel.org> # 5.17
Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
arch/x86/include/asm/mtrr.h | 4 ++
arch/x86/include/asm/processor.h | 3 ++
arch/x86/kernel/cpu/common.c | 76 ++++++++++++++++++++++++++++
arch/x86/kernel/cpu/mtrr/generic.c | 80 +++---------------------------
4 files changed, 91 insertions(+), 72 deletions(-)
diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 76d726074c16..12a16caed395 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -48,6 +48,8 @@ extern void mtrr_aps_init(void);
extern void mtrr_bp_restore(void);
extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
extern int amd_special_default_mtrr(void);
+void mtrr_disable(void);
+void mtrr_enable(void);
# else
static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
{
@@ -87,6 +89,8 @@ static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi)
#define set_mtrr_aps_delayed_init() do {} while (0)
#define mtrr_aps_init() do {} while (0)
#define mtrr_bp_restore() do {} while (0)
+#define mtrr_disable() do {} while (0)
+#define mtrr_enable() do {} while (0)
# endif
#ifdef CONFIG_COMPAT
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 356308c73951..5c934b922450 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -865,4 +865,7 @@ bool arch_is_platform_page(u64 paddr);
#define arch_is_platform_page arch_is_platform_page
#endif
+void cache_disable(void);
+void cache_enable(void);
+
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 736262a76a12..e43322f8a4ef 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -61,6 +61,7 @@
#include <asm/sigframe.h>
#include <asm/traps.h>
#include <asm/sev.h>
+#include <asm/mtrr.h>
#include "cpu.h"
@@ -2327,3 +2328,78 @@ void arch_smt_update(void)
/* Check whether IPI broadcasting can be enabled */
apic_smt_update();
}
+
+/*
+ * Disable and enable caches. Needed for changing MTRRs and the PAT MSR.
+ *
+ * Since we are disabling the cache don't allow any interrupts,
+ * they would run extremely slow and would only increase the pain.
+ *
+ * The caller must ensure that local interrupts are disabled and
+ * are reenabled after cache_enable() has been called.
+ */
+static unsigned long saved_cr4;
+static DEFINE_RAW_SPINLOCK(cache_disable_lock);
+
+void cache_disable(void) __acquires(cache_disable_lock)
+{
+ unsigned long cr0;
+
+ /*
+ * Note that this is not ideal
+ * since the cache is only flushed/disabled for this CPU while the
+ * MTRRs are changed, but changing this requires more invasive
+ * changes to the way the kernel boots
+ */
+
+ raw_spin_lock(&cache_disable_lock);
+
+ /* Enter the no-fill (CD=1, NW=0) cache mode and flush caches. */
+ cr0 = read_cr0() | X86_CR0_CD;
+ write_cr0(cr0);
+
+ /*
+ * Cache flushing is the most time-consuming step when programming
+ * the MTRRs. Fortunately, as per the Intel Software Development
+ * Manual, we can skip it if the processor supports cache self-
+ * snooping.
+ */
+ if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
+ wbinvd();
+
+ /* Save value of CR4 and clear Page Global Enable (bit 7) */
+ if (boot_cpu_has(X86_FEATURE_PGE)) {
+ saved_cr4 = __read_cr4();
+ __write_cr4(saved_cr4 & ~X86_CR4_PGE);
+ }
+
+ /* Flush all TLBs via a mov %cr3, %reg; mov %reg, %cr3 */
+ count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
+ flush_tlb_local();
+
+ if (boot_cpu_has(X86_FEATURE_MTRR))
+ mtrr_disable();
+
+ /* Again, only flush caches if we have to. */
+ if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
+ wbinvd();
+}
+
+void cache_enable(void) __releases(cache_disable_lock)
+{
+ /* Flush TLBs (no need to flush caches - they are disabled) */
+ count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
+ flush_tlb_local();
+
+ if (boot_cpu_has(X86_FEATURE_MTRR))
+ mtrr_enable();
+
+ /* Enable caches */
+ write_cr0(read_cr0() & ~X86_CR0_CD);
+
+ /* Restore value of CR4 */
+ if (boot_cpu_has(X86_FEATURE_PGE))
+ __write_cr4(saved_cr4);
+
+ raw_spin_unlock(&cache_disable_lock);
+}
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 558108296f3c..84732215b61d 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -396,9 +396,6 @@ print_fixed(unsigned base, unsigned step, const mtrr_type *types)
}
}
-static void prepare_set(void);
-static void post_set(void);
-
static void __init print_mtrr_state(void)
{
unsigned int i;
@@ -450,11 +447,11 @@ void __init mtrr_bp_pat_init(void)
unsigned long flags;
local_irq_save(flags);
- prepare_set();
+ cache_disable();
pat_init();
- post_set();
+ cache_enable();
local_irq_restore(flags);
}
@@ -715,80 +712,19 @@ static unsigned long set_mtrr_state(void)
return change_mask;
}
-
-static unsigned long cr4;
-static DEFINE_RAW_SPINLOCK(set_atomicity_lock);
-
-/*
- * Since we are disabling the cache don't allow any interrupts,
- * they would run extremely slow and would only increase the pain.
- *
- * The caller must ensure that local interrupts are disabled and
- * are reenabled after post_set() has been called.
- */
-static void prepare_set(void) __acquires(set_atomicity_lock)
+void mtrr_disable(void)
{
- unsigned long cr0;
-
- /*
- * Note that this is not ideal
- * since the cache is only flushed/disabled for this CPU while the
- * MTRRs are changed, but changing this requires more invasive
- * changes to the way the kernel boots
- */
-
- raw_spin_lock(&set_atomicity_lock);
-
- /* Enter the no-fill (CD=1, NW=0) cache mode and flush caches. */
- cr0 = read_cr0() | X86_CR0_CD;
- write_cr0(cr0);
-
- /*
- * Cache flushing is the most time-consuming step when programming
- * the MTRRs. Fortunately, as per the Intel Software Development
- * Manual, we can skip it if the processor supports cache self-
- * snooping.
- */
- if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
- wbinvd();
-
- /* Save value of CR4 and clear Page Global Enable (bit 7) */
- if (boot_cpu_has(X86_FEATURE_PGE)) {
- cr4 = __read_cr4();
- __write_cr4(cr4 & ~X86_CR4_PGE);
- }
-
- /* Flush all TLBs via a mov %cr3, %reg; mov %reg, %cr3 */
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
- flush_tlb_local();
-
/* Save MTRR state */
rdmsr(MSR_MTRRdefType, deftype_lo, deftype_hi);
/* Disable MTRRs, and set the default type to uncached */
mtrr_wrmsr(MSR_MTRRdefType, deftype_lo & ~0xcff, deftype_hi);
-
- /* Again, only flush caches if we have to. */
- if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
- wbinvd();
}
-static void post_set(void) __releases(set_atomicity_lock)
+void mtrr_enable(void)
{
- /* Flush TLBs (no need to flush caches - they are disabled) */
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
- flush_tlb_local();
-
/* Intel (P6) standard MTRRs */
mtrr_wrmsr(MSR_MTRRdefType, deftype_lo, deftype_hi);
-
- /* Enable caches */
- write_cr0(read_cr0() & ~X86_CR0_CD);
-
- /* Restore value of CR4 */
- if (boot_cpu_has(X86_FEATURE_PGE))
- __write_cr4(cr4);
- raw_spin_unlock(&set_atomicity_lock);
}
static void generic_set_all(void)
@@ -797,7 +733,7 @@ static void generic_set_all(void)
unsigned long flags;
local_irq_save(flags);
- prepare_set();
+ cache_disable();
/* Actually set the state */
mask = set_mtrr_state();
@@ -805,7 +741,7 @@ static void generic_set_all(void)
/* also set PAT */
pat_init();
- post_set();
+ cache_enable();
local_irq_restore(flags);
/* Use the atomic bitops to update the global mask */
@@ -836,7 +772,7 @@ static void generic_set_mtrr(unsigned int reg, unsigned long base,
vr = &mtrr_state.var_ranges[reg];
local_irq_save(flags);
- prepare_set();
+ cache_disable();
if (size == 0) {
/*
@@ -855,7 +791,7 @@ static void generic_set_mtrr(unsigned int reg, unsigned long base,
mtrr_wrmsr(MTRRphysMask_MSR(reg), vr->mask_lo, vr->mask_hi);
}
- post_set();
+ cache_enable();
local_irq_restore(flags);
}
--
2.35.3
^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] x86: move some code out of arch/x86/kernel/cpu/mtrr
2022-07-15 14:25 ` [PATCH 1/3] x86: move some code out of arch/x86/kernel/cpu/mtrr Juergen Gross
@ 2022-07-18 12:20 ` Borislav Petkov
0 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2022-07-18 12:20 UTC (permalink / raw)
To: Juergen Gross
Cc: xen-devel, x86, linux-kernel, brchuckz, jbeulich, Thomas Gleixner,
Ingo Molnar, Dave Hansen, H. Peter Anvin, stable
On Fri, Jul 15, 2022 at 04:25:47PM +0200, Juergen Gross wrote:
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 736262a76a12..e43322f8a4ef 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
I guess the move's ok but not into cpu/common.c pls. That thing is
huuuge and is a dumping ground for everything.
arch/x86/kernel/cpu/cacheinfo.c looks like a more viable candidate for
all things cache.
Rest looks trivial.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/3] x86: add wrapper functions for mtrr functions handling also pat
2022-07-15 14:25 [PATCH 0/3] x86: make pat and mtrr independent from each other Juergen Gross
2022-07-15 14:25 ` [PATCH 1/3] x86: move some code out of arch/x86/kernel/cpu/mtrr Juergen Gross
@ 2022-07-15 14:25 ` Juergen Gross
2022-07-15 16:41 ` Rafael J. Wysocki
2022-07-15 14:25 ` [PATCH 3/3] x86: decouple pat and mtrr handling Juergen Gross
` (2 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Juergen Gross @ 2022-07-15 14:25 UTC (permalink / raw)
To: xen-devel, x86, linux-kernel, linux-pm
Cc: brchuckz, jbeulich, Juergen Gross, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Rafael J. Wysocki,
Pavel Machek, stable
There are several MTRR functions which also do PAT handling. In order
to support PAT handling without MTRR in the future, add some wrappers
for those functions.
Cc: <stable@vger.kernel.org> # 5.17
Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
arch/x86/include/asm/mtrr.h | 2 --
arch/x86/include/asm/processor.h | 7 +++++
arch/x86/kernel/cpu/common.c | 44 +++++++++++++++++++++++++++++++-
arch/x86/kernel/cpu/mtrr/mtrr.c | 25 +++---------------
arch/x86/kernel/setup.c | 5 +---
arch/x86/kernel/smpboot.c | 8 +++---
arch/x86/power/cpu.c | 2 +-
7 files changed, 59 insertions(+), 34 deletions(-)
diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 12a16caed395..900083ac9f60 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -43,7 +43,6 @@ extern int mtrr_del(int reg, unsigned long base, unsigned long size);
extern int mtrr_del_page(int reg, unsigned long base, unsigned long size);
extern void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi);
extern void mtrr_ap_init(void);
-extern void set_mtrr_aps_delayed_init(void);
extern void mtrr_aps_init(void);
extern void mtrr_bp_restore(void);
extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
@@ -86,7 +85,6 @@ static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi)
{
}
#define mtrr_ap_init() do {} while (0)
-#define set_mtrr_aps_delayed_init() do {} while (0)
#define mtrr_aps_init() do {} while (0)
#define mtrr_bp_restore() do {} while (0)
#define mtrr_disable() do {} while (0)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 5c934b922450..e2140204fb7e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -865,7 +865,14 @@ bool arch_is_platform_page(u64 paddr);
#define arch_is_platform_page arch_is_platform_page
#endif
+extern bool cache_aps_delayed_init;
+
void cache_disable(void);
void cache_enable(void);
+void cache_bp_init(void);
+void cache_ap_init(void);
+void cache_set_aps_delayed_init(void);
+void cache_aps_init(void);
+void cache_bp_restore(void);
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index e43322f8a4ef..0a1bd14f7966 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1929,7 +1929,7 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c)
#ifdef CONFIG_X86_32
enable_sep_cpu();
#endif
- mtrr_ap_init();
+ cache_ap_init();
validate_apic_and_package_id(c);
x86_spec_ctrl_setup_ap();
update_srbds_msr();
@@ -2403,3 +2403,45 @@ void cache_enable(void) __releases(cache_disable_lock)
raw_spin_unlock(&cache_disable_lock);
}
+
+void __init cache_bp_init(void)
+{
+ if (IS_ENABLED(CONFIG_MTRR))
+ mtrr_bp_init();
+ else
+ pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
+}
+
+void cache_ap_init(void)
+{
+ if (cache_aps_delayed_init)
+ return;
+
+ mtrr_ap_init();
+}
+
+bool cache_aps_delayed_init;
+
+void cache_set_aps_delayed_init(void)
+{
+ cache_aps_delayed_init = true;
+}
+
+void cache_aps_init(void)
+{
+ /*
+ * Check if someone has requested the delay of AP cache initialization,
+ * by doing cache_set_aps_delayed_init(), prior to this point. If not,
+ * then we are done.
+ */
+ if (!cache_aps_delayed_init)
+ return;
+
+ mtrr_aps_init();
+ cache_aps_delayed_init = false;
+}
+
+void cache_bp_restore(void)
+{
+ mtrr_bp_restore();
+}
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.c b/arch/x86/kernel/cpu/mtrr/mtrr.c
index 2746cac9d8a9..c1593cfae641 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.c
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.c
@@ -69,7 +69,6 @@ unsigned int mtrr_usage_table[MTRR_MAX_VAR_RANGES];
static DEFINE_MUTEX(mtrr_mutex);
u64 size_or_mask, size_and_mask;
-static bool mtrr_aps_delayed_init;
static const struct mtrr_ops *mtrr_ops[X86_VENDOR_NUM] __ro_after_init;
@@ -176,7 +175,8 @@ static int mtrr_rendezvous_handler(void *info)
if (data->smp_reg != ~0U) {
mtrr_if->set(data->smp_reg, data->smp_base,
data->smp_size, data->smp_type);
- } else if (mtrr_aps_delayed_init || !cpu_online(smp_processor_id())) {
+ } else if ((use_intel() && cache_aps_delayed_init) ||
+ !cpu_online(smp_processor_id())) {
mtrr_if->set_all();
}
return 0;
@@ -789,7 +789,7 @@ void mtrr_ap_init(void)
if (!mtrr_enabled())
return;
- if (!use_intel() || mtrr_aps_delayed_init)
+ if (!use_intel())
return;
/*
@@ -823,16 +823,6 @@ void mtrr_save_state(void)
smp_call_function_single(first_cpu, mtrr_save_fixed_ranges, NULL, 1);
}
-void set_mtrr_aps_delayed_init(void)
-{
- if (!mtrr_enabled())
- return;
- if (!use_intel())
- return;
-
- mtrr_aps_delayed_init = true;
-}
-
/*
* Delayed MTRR initialization for all AP's
*/
@@ -841,16 +831,7 @@ void mtrr_aps_init(void)
if (!use_intel() || !mtrr_enabled())
return;
- /*
- * Check if someone has requested the delay of AP MTRR initialization,
- * by doing set_mtrr_aps_delayed_init(), prior to this point. If not,
- * then we are done.
- */
- if (!mtrr_aps_delayed_init)
- return;
-
set_mtrr(~0U, 0, 0, 0);
- mtrr_aps_delayed_init = false;
}
void mtrr_bp_restore(void)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index bd6c6fd373ae..27d61f73c68a 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1001,10 +1001,7 @@ void __init setup_arch(char **cmdline_p)
max_pfn = e820__end_of_ram_pfn();
/* update e820 for memory not covered by WB MTRRs */
- if (IS_ENABLED(CONFIG_MTRR))
- mtrr_bp_init();
- else
- pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
+ cache_bp_init();
if (mtrr_trim_uncached_memory(max_pfn))
max_pfn = e820__end_of_ram_pfn();
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5e7f9532a10d..535d73a47062 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1432,7 +1432,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
uv_system_init();
- set_mtrr_aps_delayed_init();
+ cache_set_aps_delayed_init();
smp_quirk_init_udelay();
@@ -1443,12 +1443,12 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
void arch_thaw_secondary_cpus_begin(void)
{
- set_mtrr_aps_delayed_init();
+ cache_set_aps_delayed_init();
}
void arch_thaw_secondary_cpus_end(void)
{
- mtrr_aps_init();
+ cache_aps_init();
}
/*
@@ -1491,7 +1491,7 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
nmi_selftest();
impress_friends();
- mtrr_aps_init();
+ cache_aps_init();
}
static int __initdata setup_possible_cpus = -1;
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index bb176c72891c..21e014715322 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -261,7 +261,7 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
do_fpu_end();
tsc_verify_tsc_adjust(true);
x86_platform.restore_sched_clock_state();
- mtrr_bp_restore();
+ cache_bp_restore();
perf_restore_debug_store();
c = &cpu_data(smp_processor_id());
--
2.35.3
^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] x86: add wrapper functions for mtrr functions handling also pat
2022-07-15 14:25 ` [PATCH 2/3] x86: add wrapper functions for mtrr functions handling also pat Juergen Gross
@ 2022-07-15 16:41 ` Rafael J. Wysocki
0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2022-07-15 16:41 UTC (permalink / raw)
To: Juergen Gross
Cc: xen-devel, the arch/x86 maintainers, Linux Kernel Mailing List,
Linux PM, brchuckz, Jan Beulich, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Rafael J. Wysocki,
Pavel Machek, Stable
On Fri, Jul 15, 2022 at 4:25 PM Juergen Gross <jgross@suse.com> wrote:
>
> There are several MTRR functions which also do PAT handling. In order
> to support PAT handling without MTRR in the future, add some wrappers
> for those functions.
>
> Cc: <stable@vger.kernel.org> # 5.17
> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
> Signed-off-by: Juergen Gross <jgross@suse.com>
Do I understand correctly that this particular patch doesn't change
the behavior?
If so, it would be good to mention that in the changelog.
> ---
> arch/x86/include/asm/mtrr.h | 2 --
> arch/x86/include/asm/processor.h | 7 +++++
> arch/x86/kernel/cpu/common.c | 44 +++++++++++++++++++++++++++++++-
> arch/x86/kernel/cpu/mtrr/mtrr.c | 25 +++---------------
> arch/x86/kernel/setup.c | 5 +---
> arch/x86/kernel/smpboot.c | 8 +++---
> arch/x86/power/cpu.c | 2 +-
> 7 files changed, 59 insertions(+), 34 deletions(-)
>
> diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
> index 12a16caed395..900083ac9f60 100644
> --- a/arch/x86/include/asm/mtrr.h
> +++ b/arch/x86/include/asm/mtrr.h
> @@ -43,7 +43,6 @@ extern int mtrr_del(int reg, unsigned long base, unsigned long size);
> extern int mtrr_del_page(int reg, unsigned long base, unsigned long size);
> extern void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi);
> extern void mtrr_ap_init(void);
> -extern void set_mtrr_aps_delayed_init(void);
> extern void mtrr_aps_init(void);
> extern void mtrr_bp_restore(void);
> extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
> @@ -86,7 +85,6 @@ static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi)
> {
> }
> #define mtrr_ap_init() do {} while (0)
> -#define set_mtrr_aps_delayed_init() do {} while (0)
> #define mtrr_aps_init() do {} while (0)
> #define mtrr_bp_restore() do {} while (0)
> #define mtrr_disable() do {} while (0)
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 5c934b922450..e2140204fb7e 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -865,7 +865,14 @@ bool arch_is_platform_page(u64 paddr);
> #define arch_is_platform_page arch_is_platform_page
> #endif
>
> +extern bool cache_aps_delayed_init;
> +
> void cache_disable(void);
> void cache_enable(void);
> +void cache_bp_init(void);
> +void cache_ap_init(void);
> +void cache_set_aps_delayed_init(void);
> +void cache_aps_init(void);
> +void cache_bp_restore(void);
>
> #endif /* _ASM_X86_PROCESSOR_H */
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index e43322f8a4ef..0a1bd14f7966 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1929,7 +1929,7 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c)
> #ifdef CONFIG_X86_32
> enable_sep_cpu();
> #endif
> - mtrr_ap_init();
> + cache_ap_init();
> validate_apic_and_package_id(c);
> x86_spec_ctrl_setup_ap();
> update_srbds_msr();
> @@ -2403,3 +2403,45 @@ void cache_enable(void) __releases(cache_disable_lock)
>
> raw_spin_unlock(&cache_disable_lock);
> }
> +
> +void __init cache_bp_init(void)
> +{
> + if (IS_ENABLED(CONFIG_MTRR))
> + mtrr_bp_init();
> + else
> + pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
> +}
> +
> +void cache_ap_init(void)
> +{
> + if (cache_aps_delayed_init)
> + return;
> +
> + mtrr_ap_init();
> +}
> +
> +bool cache_aps_delayed_init;
> +
> +void cache_set_aps_delayed_init(void)
> +{
> + cache_aps_delayed_init = true;
> +}
> +
> +void cache_aps_init(void)
> +{
> + /*
> + * Check if someone has requested the delay of AP cache initialization,
> + * by doing cache_set_aps_delayed_init(), prior to this point. If not,
> + * then we are done.
> + */
> + if (!cache_aps_delayed_init)
> + return;
> +
> + mtrr_aps_init();
> + cache_aps_delayed_init = false;
> +}
> +
> +void cache_bp_restore(void)
> +{
> + mtrr_bp_restore();
> +}
> diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.c b/arch/x86/kernel/cpu/mtrr/mtrr.c
> index 2746cac9d8a9..c1593cfae641 100644
> --- a/arch/x86/kernel/cpu/mtrr/mtrr.c
> +++ b/arch/x86/kernel/cpu/mtrr/mtrr.c
> @@ -69,7 +69,6 @@ unsigned int mtrr_usage_table[MTRR_MAX_VAR_RANGES];
> static DEFINE_MUTEX(mtrr_mutex);
>
> u64 size_or_mask, size_and_mask;
> -static bool mtrr_aps_delayed_init;
>
> static const struct mtrr_ops *mtrr_ops[X86_VENDOR_NUM] __ro_after_init;
>
> @@ -176,7 +175,8 @@ static int mtrr_rendezvous_handler(void *info)
> if (data->smp_reg != ~0U) {
> mtrr_if->set(data->smp_reg, data->smp_base,
> data->smp_size, data->smp_type);
> - } else if (mtrr_aps_delayed_init || !cpu_online(smp_processor_id())) {
> + } else if ((use_intel() && cache_aps_delayed_init) ||
> + !cpu_online(smp_processor_id())) {
> mtrr_if->set_all();
> }
> return 0;
> @@ -789,7 +789,7 @@ void mtrr_ap_init(void)
> if (!mtrr_enabled())
> return;
>
> - if (!use_intel() || mtrr_aps_delayed_init)
> + if (!use_intel())
> return;
>
> /*
> @@ -823,16 +823,6 @@ void mtrr_save_state(void)
> smp_call_function_single(first_cpu, mtrr_save_fixed_ranges, NULL, 1);
> }
>
> -void set_mtrr_aps_delayed_init(void)
> -{
> - if (!mtrr_enabled())
> - return;
> - if (!use_intel())
> - return;
> -
> - mtrr_aps_delayed_init = true;
> -}
> -
> /*
> * Delayed MTRR initialization for all AP's
> */
> @@ -841,16 +831,7 @@ void mtrr_aps_init(void)
> if (!use_intel() || !mtrr_enabled())
> return;
>
> - /*
> - * Check if someone has requested the delay of AP MTRR initialization,
> - * by doing set_mtrr_aps_delayed_init(), prior to this point. If not,
> - * then we are done.
> - */
> - if (!mtrr_aps_delayed_init)
> - return;
> -
> set_mtrr(~0U, 0, 0, 0);
> - mtrr_aps_delayed_init = false;
> }
>
> void mtrr_bp_restore(void)
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index bd6c6fd373ae..27d61f73c68a 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1001,10 +1001,7 @@ void __init setup_arch(char **cmdline_p)
> max_pfn = e820__end_of_ram_pfn();
>
> /* update e820 for memory not covered by WB MTRRs */
> - if (IS_ENABLED(CONFIG_MTRR))
> - mtrr_bp_init();
> - else
> - pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
> + cache_bp_init();
>
> if (mtrr_trim_uncached_memory(max_pfn))
> max_pfn = e820__end_of_ram_pfn();
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 5e7f9532a10d..535d73a47062 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1432,7 +1432,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
>
> uv_system_init();
>
> - set_mtrr_aps_delayed_init();
> + cache_set_aps_delayed_init();
>
> smp_quirk_init_udelay();
>
> @@ -1443,12 +1443,12 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
>
> void arch_thaw_secondary_cpus_begin(void)
> {
> - set_mtrr_aps_delayed_init();
> + cache_set_aps_delayed_init();
> }
>
> void arch_thaw_secondary_cpus_end(void)
> {
> - mtrr_aps_init();
> + cache_aps_init();
> }
>
> /*
> @@ -1491,7 +1491,7 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
>
> nmi_selftest();
> impress_friends();
> - mtrr_aps_init();
> + cache_aps_init();
> }
>
> static int __initdata setup_possible_cpus = -1;
> diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
> index bb176c72891c..21e014715322 100644
> --- a/arch/x86/power/cpu.c
> +++ b/arch/x86/power/cpu.c
> @@ -261,7 +261,7 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
> do_fpu_end();
> tsc_verify_tsc_adjust(true);
> x86_platform.restore_sched_clock_state();
> - mtrr_bp_restore();
> + cache_bp_restore();
> perf_restore_debug_store();
>
> c = &cpu_data(smp_processor_id());
> --
> 2.35.3
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/3] x86: decouple pat and mtrr handling
2022-07-15 14:25 [PATCH 0/3] x86: make pat and mtrr independent from each other Juergen Gross
2022-07-15 14:25 ` [PATCH 1/3] x86: move some code out of arch/x86/kernel/cpu/mtrr Juergen Gross
2022-07-15 14:25 ` [PATCH 2/3] x86: add wrapper functions for mtrr functions handling also pat Juergen Gross
@ 2022-07-15 14:25 ` Juergen Gross
2022-07-19 15:15 ` Borislav Petkov
2022-07-16 11:32 ` [PATCH 0/3] x86: make pat and mtrr independent from each other Chuck Zmudzinski
2022-07-17 7:55 ` Thorsten Leemhuis
4 siblings, 1 reply; 15+ messages in thread
From: Juergen Gross @ 2022-07-15 14:25 UTC (permalink / raw)
To: xen-devel, x86, linux-kernel
Cc: brchuckz, jbeulich, Juergen Gross, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Andy Lutomirski,
Peter Zijlstra, Boris Ostrovsky, stable
Today PAT is usable only with MTRR being active, with some nasty tweaks
to make PAT usable when running as a Xen PV guest, which doesn't support
MTRR.

The reason for this coupling is that both PAT MSR changes and MTRR
changes require a similar sequence, so full PAT support was added
using the already available MTRR handling.
Xen PV PAT handling can work without MTRR, as it just needs to consume
the PAT MSR setting done by the hypervisor, without the ability or need
to change it. This coupling has in turn resulted in a convoluted
initialization sequence and in wrong decisions regarding cache mode
availability due to misleading PAT availability flags.
Fix all of that by allowing PAT to be used without MTRR and by adding an
environment-dependent PAT init function.
Cc: <stable@vger.kernel.org> # 5.17
Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
arch/x86/include/asm/memtype.h | 13 ++-
arch/x86/include/asm/mtrr.h | 21 +++--
arch/x86/kernel/cpu/common.c | 13 +--
arch/x86/kernel/cpu/mtrr/generic.c | 14 ----
arch/x86/kernel/cpu/mtrr/mtrr.c | 33 ++++----
arch/x86/kernel/cpu/mtrr/mtrr.h | 1 -
arch/x86/kernel/setup.c | 7 --
arch/x86/mm/pat/memtype.c | 127 +++++++++++++++++++++--------
arch/x86/xen/enlighten_pv.c | 4 +
9 files changed, 146 insertions(+), 87 deletions(-)
diff --git a/arch/x86/include/asm/memtype.h b/arch/x86/include/asm/memtype.h
index 9ca760e430b9..93ad980631dc 100644
--- a/arch/x86/include/asm/memtype.h
+++ b/arch/x86/include/asm/memtype.h
@@ -8,7 +8,18 @@
extern bool pat_enabled(void);
extern void pat_disable(const char *reason);
extern void pat_init(void);
-extern void init_cache_modes(void);
+#ifdef CONFIG_X86_PAT
+void pat_init_set(void (*func)(void));
+void pat_init_noset(void);
+void pat_cpu_init(void);
+void pat_ap_init_nomtrr(void);
+void pat_aps_init_nomtrr(void);
+#else
+#define pat_init_set(f) do { } while (0)
+#define pat_cpu_init(f) do { } while (0)
+#define pat_ap_init_nomtrr(f) do { } while (0)
+#define pat_aps_init_nomtrr(f) do { } while (0)
+#endif
extern int memtype_reserve(u64 start, u64 end,
enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 900083ac9f60..bb76e5c6e21d 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -42,9 +42,9 @@ extern int mtrr_add_page(unsigned long base, unsigned long size,
extern int mtrr_del(int reg, unsigned long base, unsigned long size);
extern int mtrr_del_page(int reg, unsigned long base, unsigned long size);
extern void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi);
-extern void mtrr_ap_init(void);
-extern void mtrr_aps_init(void);
-extern void mtrr_bp_restore(void);
+extern bool mtrr_ap_init(void);
+extern bool mtrr_aps_init(void);
+extern bool mtrr_bp_restore(void);
extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
extern int amd_special_default_mtrr(void);
void mtrr_disable(void);
@@ -84,9 +84,18 @@ static inline int mtrr_trim_uncached_memory(unsigned long end_pfn)
static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi)
{
}
-#define mtrr_ap_init() do {} while (0)
-#define mtrr_aps_init() do {} while (0)
-#define mtrr_bp_restore() do {} while (0)
+static inline bool mtrr_ap_init(void)
+{
+ return false;
+}
+static inline bool mtrr_aps_init(void)
+{
+ return false;
+}
+static inline bool mtrr_bp_restore(void)
+{
+ return false;
+}
#define mtrr_disable() do {} while (0)
#define mtrr_enable() do {} while (0)
# endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0a1bd14f7966..3edfb779dab5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2408,8 +2408,8 @@ void __init cache_bp_init(void)
{
if (IS_ENABLED(CONFIG_MTRR))
mtrr_bp_init();
- else
- pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
+
+ pat_cpu_init();
}
void cache_ap_init(void)
@@ -2417,7 +2417,8 @@ void cache_ap_init(void)
if (cache_aps_delayed_init)
return;
- mtrr_ap_init();
+ if (!mtrr_ap_init())
+ pat_ap_init_nomtrr();
}
bool cache_aps_delayed_init;
@@ -2437,11 +2438,13 @@ void cache_aps_init(void)
if (!cache_aps_delayed_init)
return;
- mtrr_aps_init();
+ if (!mtrr_aps_init())
+ pat_aps_init_nomtrr();
cache_aps_delayed_init = false;
}
void cache_bp_restore(void)
{
- mtrr_bp_restore();
+ if (!mtrr_bp_restore())
+ pat_cpu_init();
}
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 84732215b61d..bb6dd96923a4 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -441,20 +441,6 @@ static void __init print_mtrr_state(void)
pr_debug("TOM2: %016llx aka %lldM\n", mtrr_tom2, mtrr_tom2>>20);
}
-/* PAT setup for BP. We need to go through sync steps here */
-void __init mtrr_bp_pat_init(void)
-{
- unsigned long flags;
-
- local_irq_save(flags);
- cache_disable();
-
- pat_init();
-
- cache_enable();
- local_irq_restore(flags);
-}
-
/* Grab all of the MTRR state for this CPU into *state */
bool __init get_mtrr_state(void)
{
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.c b/arch/x86/kernel/cpu/mtrr/mtrr.c
index c1593cfae641..76b1553d90f9 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.c
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.c
@@ -762,9 +762,6 @@ void __init mtrr_bp_init(void)
/* BIOS may override */
__mtrr_enabled = get_mtrr_state();
- if (mtrr_enabled())
- mtrr_bp_pat_init();
-
if (mtrr_cleanup(phys_addr)) {
changed_by_mtrr_cleanup = 1;
mtrr_if->set_all();
@@ -772,25 +769,17 @@ void __init mtrr_bp_init(void)
}
}
- if (!mtrr_enabled()) {
+ if (!mtrr_enabled())
pr_info("Disabled\n");
-
- /*
- * PAT initialization relies on MTRR's rendezvous handler.
- * Skip PAT init until the handler can initialize both
- * features independently.
- */
- pat_disable("MTRRs disabled, skipping PAT initialization too.");
- }
}
-void mtrr_ap_init(void)
+bool mtrr_ap_init(void)
{
if (!mtrr_enabled())
- return;
+ return false;
if (!use_intel())
- return;
+ return false;
/*
* Ideally we should hold mtrr_mutex here to avoid mtrr entries
@@ -806,6 +795,8 @@ void mtrr_ap_init(void)
* lock to prevent mtrr entry changes
*/
set_mtrr_from_inactive_cpu(~0U, 0, 0, 0);
+
+ return true;
}
/**
@@ -826,20 +817,24 @@ void mtrr_save_state(void)
/*
* Delayed MTRR initialization for all AP's
*/
-void mtrr_aps_init(void)
+bool mtrr_aps_init(void)
{
if (!use_intel() || !mtrr_enabled())
- return;
+ return false;
set_mtrr(~0U, 0, 0, 0);
+
+ return true;
}
-void mtrr_bp_restore(void)
+bool mtrr_bp_restore(void)
{
if (!use_intel() || !mtrr_enabled())
- return;
+ return false;
mtrr_if->set_all();
+
+ return true;
}
static int __init mtrr_init_finialize(void)
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h
index 2ac99e561181..f6135a539073 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.h
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.h
@@ -53,7 +53,6 @@ void set_mtrr_prepare_save(struct set_mtrr_context *ctxt);
void fill_mtrr_var_range(unsigned int index,
u32 base_lo, u32 base_hi, u32 mask_lo, u32 mask_hi);
bool get_mtrr_state(void);
-void mtrr_bp_pat_init(void);
extern void __init set_mtrr_ops(const struct mtrr_ops *ops);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 27d61f73c68a..14bb40cd22c6 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1008,13 +1008,6 @@ void __init setup_arch(char **cmdline_p)
max_possible_pfn = max_pfn;
- /*
- * This call is required when the CPU does not support PAT. If
- * mtrr_bp_init() invoked it already via pat_init() the call has no
- * effect.
- */
- init_cache_modes();
-
/*
* Define random base addresses for memory sections after max_pfn is
* defined and before each memory section base is used.
diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index d5ef64ddd35e..3d4bc27ffebb 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -41,6 +41,7 @@
#include <linux/mm.h>
#include <linux/fs.h>
#include <linux/rbtree.h>
+#include <linux/stop_machine.h>
#include <asm/cacheflush.h>
#include <asm/processor.h>
@@ -65,30 +66,23 @@ static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT);
static bool __read_mostly pat_bp_enabled;
static bool __read_mostly pat_cm_initialized;
-/*
- * PAT support is enabled by default, but can be disabled for
- * various user-requested or hardware-forced reasons:
- */
-void pat_disable(const char *msg_reason)
-{
- if (pat_disabled)
- return;
+static void init_cache_modes(void);
- if (pat_bp_initialized) {
- WARN_ONCE(1, "x86/PAT: PAT cannot be disabled after initialization\n");
- return;
- }
-
- pat_disabled = true;
- pr_info("x86/PAT: %s\n", msg_reason);
+#ifdef CONFIG_X86_PAT
+static void pat_init_nopat(void)
+{
+ init_cache_modes();
}
static int __init nopat(char *str)
{
- pat_disable("PAT support disabled via boot option.");
+ pat_disabled = true;
+ pr_info("PAT support disabled via boot option.");
+ pat_init_set(pat_init_nopat);
return 0;
}
early_param("nopat", nopat);
+#endif
bool pat_enabled(void)
{
@@ -243,13 +237,17 @@ static void pat_bp_init(u64 pat)
u64 tmp_pat;
if (!boot_cpu_has(X86_FEATURE_PAT)) {
- pat_disable("PAT not supported by the CPU.");
+ pr_info("PAT not supported by the CPU.");
+ pat_disabled = true;
+ pat_init_set(pat_init_nopat);
return;
}
rdmsrl(MSR_IA32_CR_PAT, tmp_pat);
if (!tmp_pat) {
- pat_disable("PAT support disabled by the firmware.");
+ pr_info("PAT support disabled by the firmware.");
+ pat_disabled = true;
+ pat_init_set(pat_init_nopat);
return;
}
@@ -272,7 +270,7 @@ static void pat_ap_init(u64 pat)
wrmsrl(MSR_IA32_CR_PAT, pat);
}
-void init_cache_modes(void)
+static void init_cache_modes(void)
{
u64 pat = 0;
@@ -318,25 +316,12 @@ void init_cache_modes(void)
__init_cache_modes(pat);
}
-/**
- * pat_init - Initialize the PAT MSR and PAT table on the current CPU
- *
- * This function initializes PAT MSR and PAT table with an OS-defined value
- * to enable additional cache attributes, WC, WT and WP.
- *
- * This function must be called on all CPUs using the specific sequence of
- * operations defined in Intel SDM. mtrr_rendezvous_handler() provides this
- * procedure for PAT.
- */
-void pat_init(void)
+#ifdef CONFIG_X86_PAT
+static void pat_init_native(void)
{
u64 pat;
struct cpuinfo_x86 *c = &boot_cpu_data;
-#ifndef CONFIG_X86_PAT
- pr_info_once("x86/PAT: PAT support disabled because CONFIG_X86_PAT is disabled in the kernel.\n");
-#endif
-
if (pat_disabled)
return;
@@ -406,6 +391,80 @@ void pat_init(void)
#undef PAT
+void pat_init_noset(void)
+{
+ pat_bp_enabled = true;
+ init_cache_modes();
+}
+
+static void (*pat_init_func)(void) = pat_init_native;
+
+void __init pat_init_set(void (*func)(void))
+{
+ pat_init_func = func;
+}
+
+/**
+ * pat_init - Initialize the PAT MSR and PAT table on the current CPU
+ *
+ * This function initializes PAT MSR and PAT table with an OS-defined value
+ * to enable additional cache attributes, WC, WT and WP.
+ *
+ * This function must be called on all CPUs using the specific sequence of
+ * operations defined in Intel SDM. mtrr_rendezvous_handler() provides this
+ * procedure for PAT.
+ */
+void pat_init(void)
+{
+ pat_init_func();
+}
+
+static int __pat_cpu_init(void *data)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ cache_disable();
+
+ pat_init();
+
+ cache_enable();
+ local_irq_restore(flags);
+
+ return 0;
+}
+
+void pat_cpu_init(void)
+{
+ if (pat_init_func != pat_init_native)
+ pat_init();
+ else
+ __pat_cpu_init(NULL);
+}
+
+void pat_ap_init_nomtrr(void)
+{
+ if (pat_init_func != pat_init_native)
+ return;
+
+ stop_machine_from_inactive_cpu(__pat_cpu_init, NULL, cpu_callout_mask);
+}
+
+void pat_aps_init_nomtrr(void)
+{
+ if (pat_init_func != pat_init_native)
+ return;
+
+ stop_machine(__pat_cpu_init, NULL, cpu_online_mask);
+}
+#else
+void pat_init(void)
+{
+ pr_info_once("x86/PAT: PAT support disabled because CONFIG_X86_PAT is disabled in the kernel.\n");
+ init_cache_modes();
+}
+#endif /* CONFIG_X86_PAT */
+
static DEFINE_SPINLOCK(memtype_lock); /* protects memtype accesses */
/*
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 70fb2ea85e90..97831d822872 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -69,6 +69,7 @@
#include <asm/mwait.h>
#include <asm/pci_x86.h>
#include <asm/cpu.h>
+#include <asm/memtype.h>
#ifdef CONFIG_X86_IOPL_IOPERM
#include <asm/io_bitmap.h>
#endif
@@ -1317,6 +1318,9 @@ asmlinkage __visible void __init xen_start_kernel(struct start_info *si)
initrd_start = __pa(xen_start_info->mod_start);
}
+ /* Don't try to modify PAT MSR. */
+ pat_init_set(pat_init_noset);
+
/* Poke various useful things into boot_params */
boot_params.hdr.type_of_loader = (9 << 4) | 0;
boot_params.hdr.ramdisk_image = initrd_start;
--
2.35.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] x86: decouple pat and mtrr handling
2022-07-15 14:25 ` [PATCH 3/3] x86: decouple pat and mtrr handling Juergen Gross
@ 2022-07-19 15:15 ` Borislav Petkov
2022-08-17 9:17 ` Juergen Gross
0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2022-07-19 15:15 UTC (permalink / raw)
To: Juergen Gross
Cc: xen-devel, x86, linux-kernel, brchuckz, jbeulich, Thomas Gleixner,
Ingo Molnar, Dave Hansen, H. Peter Anvin, Andy Lutomirski,
Peter Zijlstra, Boris Ostrovsky, stable
On Fri, Jul 15, 2022 at 04:25:49PM +0200, Juergen Gross wrote:
> Today PAT is usable only with MTRR being active, with some nasty tweaks
> to make PAT usable when running as Xen PV guest, which doesn't support
> MTRR.
>
> The reason for this coupling is that both PAT MSR changes and MTRR
> changes require a similar sequence, so full PAT support was added
> using the already available MTRR handling.
>
> Xen PV PAT handling can work without MTRR, as it just needs to consume
> the PAT MSR setting done by the hypervisor without the ability and need
> to change it. This in turn has resulted in a convoluted initialization
> sequence and wrong decisions regarding cache mode availability due to
> misguiding PAT availability flags.
>
> Fix all of that by allowing PAT to be used without MTRR and by adding
> an environment-dependent PAT init function.
Aha, there's the explanation I was looking for.
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 0a1bd14f7966..3edfb779dab5 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2408,8 +2408,8 @@ void __init cache_bp_init(void)
> {
> if (IS_ENABLED(CONFIG_MTRR))
> mtrr_bp_init();
> - else
> - pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
> +
> + pat_cpu_init();
> }
>
> void cache_ap_init(void)
> @@ -2417,7 +2417,8 @@ void cache_ap_init(void)
> if (cache_aps_delayed_init)
> return;
>
> - mtrr_ap_init();
> + if (!mtrr_ap_init())
> + pat_ap_init_nomtrr();
> }
So I'm reading this as: if it couldn't init AP's MTRRs, init its PAT.
But currently, the code sets the MTRRs for the delayed case or when the
CPU is not online by doing ->set_all and in there it sets first MTRRs
and then PAT.
I think the code above should simply try the two things, one after the
other, independently from one another.
And I see you've added another stomp machine call for PAT only.
Now, what I think the design of all this should be, is:
you have a bunch of things you need to do at each point:
* cache_ap_init
* cache_aps_init
* ...
Now, in each those, you look at whether PAT or MTRR is supported and you
do only those which are supported.
Also, the rendezvous handler should do:
if MTRR:
do MTRR specific stuff
if PAT:
do PAT specific stuff
This way you have clean definitions of what needs to happen when and you
also do *only* the things that the platform supports, by keeping the
proper order of operations - I believe MTRRs first and then PAT.
This way we'll get rid of that crazy maze of who calls what and when.
But first we need to define those points where stuff needs to happen and
then for each point define what stuff needs to happen.
How does that sound?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] x86: decouple pat and mtrr handling
2022-07-19 15:15 ` Borislav Petkov
@ 2022-08-17 9:17 ` Juergen Gross
0 siblings, 0 replies; 15+ messages in thread
From: Juergen Gross @ 2022-08-17 9:17 UTC (permalink / raw)
To: Borislav Petkov
Cc: xen-devel, x86, linux-kernel, brchuckz, jbeulich, Thomas Gleixner,
Ingo Molnar, Dave Hansen, H. Peter Anvin, Andy Lutomirski,
Peter Zijlstra, Boris Ostrovsky, stable
On 19.07.22 17:15, Borislav Petkov wrote:
> On Fri, Jul 15, 2022 at 04:25:49PM +0200, Juergen Gross wrote:
>> Today PAT is usable only with MTRR being active, with some nasty tweaks
>> to make PAT usable when running as Xen PV guest, which doesn't support
>> MTRR.
>>
>> The reason for this coupling is that both PAT MSR changes and MTRR
>> changes require a similar sequence, so full PAT support was added
>> using the already available MTRR handling.
>>
>> Xen PV PAT handling can work without MTRR, as it just needs to consume
>> the PAT MSR setting done by the hypervisor without the ability and need
>> to change it. This in turn has resulted in a convoluted initialization
>> sequence and wrong decisions regarding cache mode availability due to
>> misguiding PAT availability flags.
>>
>> Fix all of that by allowing PAT to be used without MTRR and by adding
>> an environment-dependent PAT init function.
>
> Aha, there's the explanation I was looking for.
>
>> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
>> index 0a1bd14f7966..3edfb779dab5 100644
>> --- a/arch/x86/kernel/cpu/common.c
>> +++ b/arch/x86/kernel/cpu/common.c
>> @@ -2408,8 +2408,8 @@ void __init cache_bp_init(void)
>> {
>> if (IS_ENABLED(CONFIG_MTRR))
>> mtrr_bp_init();
>> - else
>> - pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
>> +
>> + pat_cpu_init();
>> }
>>
>> void cache_ap_init(void)
>> @@ -2417,7 +2417,8 @@ void cache_ap_init(void)
>> if (cache_aps_delayed_init)
>> return;
>>
>> - mtrr_ap_init();
>> + if (!mtrr_ap_init())
>> + pat_ap_init_nomtrr();
>> }
>
> So I'm reading this as: if it couldn't init AP's MTRRs, init its PAT.
>
> But currently, the code sets the MTRRs for the delayed case or when the
> CPU is not online by doing ->set_all and in there it sets first MTRRs
> and then PAT.
>
> I think the code above should simply try the two things, one after the
> other, independently from one another.
>
> And I see you've added another stomp machine call for PAT only.
>
> Now, what I think the design of all this should be, is:
>
> you have a bunch of things you need to do at each point:
>
> * cache_ap_init
>
> * cache_aps_init
>
> * ...
>
> Now, in each those, you look at whether PAT or MTRR is supported and you
> do only those which are supported.
>
> Also, the rendezvous handler should do:
>
> if MTRR:
> do MTRR specific stuff
>
> if PAT:
> do PAT specific stuff
>
> This way you have clean definitions of what needs to happen when and you
> also do *only* the things that the platform supports, by keeping the
> proper order of operations - I believe MTRRs first and then PAT.
>
> This way we'll get rid of that crazy maze of who calls what and when.
>
> But first we need to define those points where stuff needs to happen and
> then for each point define what stuff needs to happen.
>
> How does that sound?
This asks for some more cleanup in the MTRR code:
mtrr_if->set_all() is the relevant callback, and it will only ever be called
for the generic case (use_intel() == true), so I think we want to:
- remove the cyrix specific set_all() function
- split the set_all() callback case from mtrr_rendezvous_handler() into a
dedicated rendezvous handler
- remove the set_all() member from struct mtrr_ops and directly call
generic_set_all() from the new rendezvous handler
- optional: rename use_intel() to use_generic(), or even introduce just
a static bool for that purpose
Then the new rendezvous handler can be modified as you suggested.
Are you okay with that route?
Juergen
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] x86: make pat and mtrr independent from each other
2022-07-15 14:25 [PATCH 0/3] x86: make pat and mtrr independent from each other Juergen Gross
` (2 preceding siblings ...)
2022-07-15 14:25 ` [PATCH 3/3] x86: decouple pat and mtrr handling Juergen Gross
@ 2022-07-16 11:32 ` Chuck Zmudzinski
2022-07-16 11:42 ` Borislav Petkov
2022-07-16 12:01 ` Chuck Zmudzinski
2022-07-17 7:55 ` Thorsten Leemhuis
4 siblings, 2 replies; 15+ messages in thread
From: Chuck Zmudzinski @ 2022-07-16 11:32 UTC (permalink / raw)
To: Juergen Gross, xen-devel, x86, linux-kernel, linux-pm,
Thorsten Leemhuis
Cc: jbeulich, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, # 5 . 17, Rafael J. Wysocki,
Pavel Machek, Andy Lutomirski, Peter Zijlstra, Boris Ostrovsky
On 7/15/2022 10:25 AM, Juergen Gross wrote:
> Today PAT can't be used without MTRR being available, unless MTRR is at
> least configured via CONFIG_MTRR and the system is running as Xen PV
> guest. In this case PAT is automatically available via the hypervisor,
> but the PAT MSR can't be modified by the kernel and MTRR is disabled.
>
> As an additional complexity the availability of PAT can't be queried
> via pat_enabled() in the Xen PV case, as the lack of MTRR will set PAT
> to be disabled. This leads to some drivers believing that not all cache
> modes are available, resulting in failures or degraded functionality.
>
> The same applies to a kernel built with no MTRR support: it won't
> allow use of the PAT MSR, even though there is no technical reason
> for that beyond the fact that setting up PAT the same way on all
> CPUs (a requirement of the processor's cache management) relies on
> some MTRR-specific code.
>
> Fix all of that by:
>
> - moving the function needed by PAT from MTRR specific code one level
> up
> - adding a PAT indirection layer supporting the 3 cases "no or disabled
> PAT", "PAT under kernel control", and "PAT under Xen control"
> - removing the dependency of PAT on MTRR
>
> Juergen Gross (3):
> x86: move some code out of arch/x86/kernel/cpu/mtrr
> x86: add wrapper functions for mtrr functions handling also pat
> x86: decouple pat and mtrr handling
>
> arch/x86/include/asm/memtype.h | 13 ++-
> arch/x86/include/asm/mtrr.h | 27 ++++--
> arch/x86/include/asm/processor.h | 10 +++
> arch/x86/kernel/cpu/common.c | 123 +++++++++++++++++++++++++++-
> arch/x86/kernel/cpu/mtrr/generic.c | 90 ++------------------
> arch/x86/kernel/cpu/mtrr/mtrr.c | 58 ++++---------
> arch/x86/kernel/cpu/mtrr/mtrr.h | 1 -
> arch/x86/kernel/setup.c | 12 +--
> arch/x86/kernel/smpboot.c | 8 +-
> arch/x86/mm/pat/memtype.c | 127 +++++++++++++++++++++--------
> arch/x86/power/cpu.c | 2 +-
> arch/x86/xen/enlighten_pv.c | 4 +
> 12 files changed, 289 insertions(+), 186 deletions(-)
>
This patch series seems related to the regression reported
here on May 5, 2022:
https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
I am experiencing that regression and could test this patch
on my system.
Can you confirm that with this patch series you are trying
to fix that regression?
Chuck
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] x86: make pat and mtrr independent from each other
2022-07-16 11:32 ` [PATCH 0/3] x86: make pat and mtrr independent from each other Chuck Zmudzinski
@ 2022-07-16 11:42 ` Borislav Petkov
2022-07-17 4:06 ` Chuck Zmudzinski
2022-07-16 12:01 ` Chuck Zmudzinski
1 sibling, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2022-07-16 11:42 UTC (permalink / raw)
To: Chuck Zmudzinski
Cc: Juergen Gross, xen-devel, x86, linux-kernel, linux-pm,
Thorsten Leemhuis, jbeulich, Thomas Gleixner, Ingo Molnar,
Dave Hansen, H. Peter Anvin, # 5 . 17, Rafael J. Wysocki,
Pavel Machek, Andy Lutomirski, Peter Zijlstra, Boris Ostrovsky
On Sat, Jul 16, 2022 at 07:32:46AM -0400, Chuck Zmudzinski wrote:
> Can you confirm that with this patch series you are trying
> to fix that regression?
Yes, this patchset is aimed to fix the whole situation but please don't
do anything yet - I need to find time and look at the whole approach
before you can test it. Just be patient and we'll ping you when the time
comes.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH 0/3] x86: make pat and mtrr independent from each other
2022-07-16 11:42 ` Borislav Petkov
@ 2022-07-17 4:06 ` Chuck Zmudzinski
0 siblings, 0 replies; 15+ messages in thread
From: Chuck Zmudzinski @ 2022-07-17 4:06 UTC (permalink / raw)
To: Borislav Petkov
Cc: Juergen Gross, xen-devel, x86, linux-kernel, linux-pm,
Thorsten Leemhuis, jbeulich, Thomas Gleixner, Ingo Molnar,
Dave Hansen, H. Peter Anvin, # 5 . 17, Rafael J. Wysocki,
Pavel Machek, Andy Lutomirski, Peter Zijlstra, Boris Ostrovsky
On 7/16/2022 7:42 AM, Borislav Petkov wrote:
> On Sat, Jul 16, 2022 at 07:32:46AM -0400, Chuck Zmudzinski wrote:
> > Can you confirm that with this patch series you are trying
> > to fix that regression?
>
> Yes, this patchset is aimed to fix the whole situation but please don't
> do anything yet - I need to find time and look at the whole approach
> before you can test it. Just be patient and we'll ping you when the time
> comes.
>
> Thx.
>
I will wait until I get the ping before trying it.
Thanks,
Chuck
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] x86: make pat and mtrr independent from each other
2022-07-16 11:32 ` [PATCH 0/3] x86: make pat and mtrr independent from each other Chuck Zmudzinski
2022-07-16 11:42 ` Borislav Petkov
@ 2022-07-16 12:01 ` Chuck Zmudzinski
1 sibling, 0 replies; 15+ messages in thread
From: Chuck Zmudzinski @ 2022-07-16 12:01 UTC (permalink / raw)
To: Juergen Gross, xen-devel, x86, linux-kernel, linux-pm,
Thorsten Leemhuis
Cc: jbeulich, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, # 5 . 17, Rafael J. Wysocki,
Pavel Machek, Andy Lutomirski, Peter Zijlstra, Boris Ostrovsky
On 7/16/2022 7:32 AM, Chuck Zmudzinski wrote:
> On 7/15/2022 10:25 AM, Juergen Gross wrote:
> > Today PAT can't be used without MTRR being available, unless MTRR is at
> > least configured via CONFIG_MTRR and the system is running as Xen PV
> > guest. In this case PAT is automatically available via the hypervisor,
> > but the PAT MSR can't be modified by the kernel and MTRR is disabled.
> >
> > As an additional complexity the availability of PAT can't be queried
> > via pat_enabled() in the Xen PV case, as the lack of MTRR will set PAT
> > to be disabled. This leads to some drivers believing that not all cache
> > modes are available, resulting in failures or degraded functionality.
> >
> > The same applies to a kernel built with no MTRR support: it won't
> > allow use of the PAT MSR, even though there is no technical reason
> > for that beyond the fact that setting up PAT the same way on all
> > CPUs (a requirement of the processor's cache management) relies on
> > some MTRR-specific code.
> >
> > Fix all of that by:
> >
> > - moving the function needed by PAT from MTRR specific code one level
> > up
> > - adding a PAT indirection layer supporting the 3 cases "no or disabled
> > PAT", "PAT under kernel control", and "PAT under Xen control"
> > - removing the dependency of PAT on MTRR
> >
> > Juergen Gross (3):
> > x86: move some code out of arch/x86/kernel/cpu/mtrr
> > x86: add wrapper functions for mtrr functions handling also pat
> > x86: decouple pat and mtrr handling
> >
> > arch/x86/include/asm/memtype.h | 13 ++-
> > arch/x86/include/asm/mtrr.h | 27 ++++--
> > arch/x86/include/asm/processor.h | 10 +++
> > arch/x86/kernel/cpu/common.c | 123 +++++++++++++++++++++++++++-
> > arch/x86/kernel/cpu/mtrr/generic.c | 90 ++------------------
> > arch/x86/kernel/cpu/mtrr/mtrr.c | 58 ++++---------
> > arch/x86/kernel/cpu/mtrr/mtrr.h | 1 -
> > arch/x86/kernel/setup.c | 12 +--
> > arch/x86/kernel/smpboot.c | 8 +-
> > arch/x86/mm/pat/memtype.c | 127 +++++++++++++++++++++--------
> > arch/x86/power/cpu.c | 2 +-
> > arch/x86/xen/enlighten_pv.c | 4 +
> > 12 files changed, 289 insertions(+), 186 deletions(-)
> >
>
> This patch series seems related to the regression reported
> here on May 5, 2022:
I'm sorry, the date of that report was May 4, 2022, not
May 5, 2022 - just to avoid any doubt about which regression
I am referring to.
Chuck
>
> https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
>
> I am experiencing that regression
or a very similar regression that is caused by the same commit:
bdd8b6c98239cad
("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
> and could test this patch
> on my system.
>
> Can you confirm that with this patch series you are trying
> to fix that regression?
>
> Chuck
Chuck
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] x86: make pat and mtrr independent from each other
2022-07-15 14:25 [PATCH 0/3] x86: make pat and mtrr independent from each other Juergen Gross
` (3 preceding siblings ...)
2022-07-16 11:32 ` [PATCH 0/3] x86: make pat and mtrr independent from each other Chuck Zmudzinski
@ 2022-07-17 7:55 ` Thorsten Leemhuis
2022-07-18 11:32 ` Chuck Zmudzinski
4 siblings, 1 reply; 15+ messages in thread
From: Thorsten Leemhuis @ 2022-07-17 7:55 UTC (permalink / raw)
To: Juergen Gross, xen-devel, x86, linux-kernel, linux-pm
Cc: brchuckz, jbeulich, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, # 5 . 17, Rafael J. Wysocki,
Pavel Machek, Andy Lutomirski, Peter Zijlstra, Boris Ostrovsky
Hi Juergen!
On 15.07.22 16:25, Juergen Gross wrote:
> Today PAT can't be used without MTRR being available, unless MTRR is at
> least configured via CONFIG_MTRR and the system is running as Xen PV
> guest. In this case PAT is automatically available via the hypervisor,
> but the PAT MSR can't be modified by the kernel and MTRR is disabled.
>
> As an additional complexity the availability of PAT can't be queried
> via pat_enabled() in the Xen PV case, as the lack of MTRR will set PAT
> to be disabled. This leads to some drivers believing that not all cache
> modes are available, resulting in failures or degraded functionality.
>
> The same applies to a kernel built with no MTRR support: it won't
> allow use of the PAT MSR, even though there is no technical reason
> for that beyond the fact that setting up PAT the same way on all
> CPUs (a requirement of the processor's cache management) relies on
> some MTRR-specific code.
>
> Fix all of that by:
>
> - moving the function needed by PAT from MTRR specific code one level
> up
> - adding a PAT indirection layer supporting the 3 cases "no or disabled
> PAT", "PAT under kernel control", and "PAT under Xen control"
> - removing the dependency of PAT on MTRR
Thx for working on this. If you need to respin these patches for one
reason or another, could you do me a favor and add proper 'Link:' tags
pointing to all reports about this issue? e.g. like this:
Link: https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
These tags are considered important by Linus[1] and others, as they
allow anyone to look into the backstory weeks or years from now. That is
why they should be placed in cases like this, as
Documentation/process/submitting-patches.rst and
Documentation/process/5.Posting.rst explain in more detail. I care
personally, because these tags make my regression tracking efforts a
whole lot easier, as they allow my tracking bot 'regzbot' to
automatically connect reports with patches posted or committed to fix
tracked regressions.
[1] see for example:
https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@mail.gmail.com/
https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/
https://lore.kernel.org/all/CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com/
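For illustration, such tags go into the trailer block at the end of the commit message; the subject and sign-off below are placeholders, while the Fixes: and Link: lines are the real ones from this thread:

```
x86/pat: <illustrative subject line>

<commit message body>

Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
Link: https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
Cc: <stable@vger.kernel.org> # 5.17
Signed-off-by: <name and address>
```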
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.
BTW, let me tell regzbot to monitor this thread:
#regzbot ^backmonitor:
https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] x86: make pat and mtrr independent from each other
2022-07-17 7:55 ` Thorsten Leemhuis
@ 2022-07-18 11:32 ` Chuck Zmudzinski
2022-07-19 13:16 ` Chuck Zmudzinski
0 siblings, 1 reply; 15+ messages in thread
From: Chuck Zmudzinski @ 2022-07-18 11:32 UTC (permalink / raw)
To: Thorsten Leemhuis, Juergen Gross, xen-devel, x86, linux-kernel,
linux-pm, Borislav Petkov
Cc: jbeulich, Thomas Gleixner, Ingo Molnar, Dave Hansen,
H. Peter Anvin, # 5 . 17, Rafael J. Wysocki, Pavel Machek,
Andy Lutomirski, Peter Zijlstra, Boris Ostrovsky
On 7/17/2022 3:55 AM, Thorsten Leemhuis wrote:
> Hi Juergen!
>
> On 15.07.22 16:25, Juergen Gross wrote:
> > Today PAT can't be used without MTRR being available, unless MTRR is at
> > least configured via CONFIG_MTRR and the system is running as Xen PV
> > guest. In this case PAT is automatically available via the hypervisor,
> > but the PAT MSR can't be modified by the kernel and MTRR is disabled.
> >
> > As an additional complexity the availability of PAT can't be queried
> > via pat_enabled() in the Xen PV case, as the lack of MTRR will set PAT
> > to be disabled. This leads to some drivers believing that not all cache
> > modes are available, resulting in failures or degraded functionality.
> >
> > The same applies to a kernel built with no MTRR support: it won't
> > allow use of the PAT MSR, even though there is no technical reason
> > for that beyond the fact that setting up PAT the same way on all
> > CPUs (a requirement of the processor's cache management) relies on
> > some MTRR-specific code.
> >
> > Fix all of that by:
> >
> > - moving the function needed by PAT from MTRR specific code one level
> > up
> > - adding a PAT indirection layer supporting the 3 cases "no or disabled
> > PAT", "PAT under kernel control", and "PAT under Xen control"
> > - removing the dependency of PAT on MTRR
>
> Thx for working on this. If you need to respin these patches for one
> reason or another, could you do me a favor and add proper 'Link:' tags
> pointing to all reports about this issue? e.g. like this:
>
> Link: https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
>
> These tags are considered important by Linus[1] and others, as they
> allow anyone to look into the backstory weeks or years from now. That is
> why they should be placed in cases like this, as
> Documentation/process/submitting-patches.rst and
> Documentation/process/5.Posting.rst explain in more detail. I care
> personally, because these tags make my regression tracking efforts a
> whole lot easier, as they allow my tracking bot 'regzbot' to
> automatically connect reports with patches posted or committed to fix
> tracked regressions.
>
> [1] see for example:
> https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@mail.gmail.com/
> https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/
> https://lore.kernel.org/all/CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com/
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>
I echo Thorsten's thanks for starting on this now instead of waiting
until September, which I think is when Juergen said last week he could
start working on this. I agree with Thorsten that Link tags are needed.
Since multiple patches have been proposed to fix this regression,
perhaps add a Link to each proposed patch, plus a note that the
original report identified a specific commit which, when reverted,
also fixes it. IMO, this is all part of the backstory Thorsten refers to.
It looks like with this approach, a fix will not be coming real soon,
and Borislav Petkov also discouraged me from testing this
patch set until I receive a ping telling me it is ready for testing,
which seems to confirm that this regression will not be fixed
very soon. Please correct me if I am wrong about how long
it will take to fix it with this approach.
Also, is there any guarantee this approach is endorsed by
all the maintainers who will need to sign off, especially
Linus? I ask because some of the discussion on the
earlier proposed patches makes me doubt it. I am especially
referring to this discussion:
https://lore.kernel.org/lkml/4c8c9d4c-1c6b-8e9f-fa47-918a64898a28@leemhuis.info/
and also here:
https://lore.kernel.org/lkml/YsRjX%2FU1XN8rq+8u@zn.tnic/
where Borislav Petkov argues that Linux should not be
patched at all to fix this regression and that the fix
should instead come by patching the Xen hypervisor.
So I have several questions, presuming the fix is going
to be delayed for some time and that this approach does
not yet have the blessing of the maintainers who will
need to sign off:
1. Can you estimate when the patch series will be ready for
testing and suitable for a prepatch or RC release?
2. Can you estimate when the patch series will be ready to be
merged into the mainline release? Is there any hope it will be
fixed before the next longterm release hosted on kernel.org?
3. Since a fix is likely not coming soon, can you explain
why the commit mentioned in the original report cannot
be reverted as a temporary solution while we wait for
the full fix? I can say that reverting that commit (it
was a commit affecting drm/i915) does fix the issue on
my system with no negative side effects at all. In such
a case, not reverting the offending commit seems
contrary to Linus' regression rule, even if the revert
is not going to be the final solution. IOW, I am trying
to argue that an important corollary to the Linus
regression rule is that we revert commits that introduce
regressions, especially when reverting them has no
negative effects. Why are we not doing that in this case?
4. Can you explain why this patch series is superior
to the other proposed patches, which are much simpler
and have been reported to fix the regression?
5. This approach seems too aggressive for backporting
to the stable releases. Is that correct? Or will the
patches be backported to the stable releases? I was told
that backports to the stable releases are needed to keep
things consistent across all the supported versions when
I submitted a patch for this regression that identified
a specific five-year-old commit as the one my patch would fix.
Remember, this is a regression that is really bothering
people now. For example, I am now in a position where
I cannot install the Linux kernel updates that Debian
pushes out to me without instead running my own private
build patched with one of the known fixes that have
already been identified as workarounds for this
regression while we wait for the full solution that will
hopefully come later.
Chuck
> P.S.: As the Linux kernel's regression tracker I deal with a lot of
> reports and sometimes miss something important when writing mails like
> this. If that's the case here, don't hesitate to tell me in a public
> reply, it's in everyone's interest to set the public record straight.
>
> BTW, let me tell regzbot to monitor this thread:
>
> #regzbot ^backmonitor:
> https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] x86: make pat and mtrr independent from each other
2022-07-18 11:32 ` Chuck Zmudzinski
@ 2022-07-19 13:16 ` Chuck Zmudzinski
0 siblings, 0 replies; 15+ messages in thread
From: Chuck Zmudzinski @ 2022-07-19 13:16 UTC (permalink / raw)
To: Thorsten Leemhuis, Juergen Gross, xen-devel, x86, linux-kernel,
linux-pm, Borislav Petkov
Cc: jbeulich, Thomas Gleixner, Ingo Molnar, Dave Hansen,
H. Peter Anvin, # 5.17, Rafael J. Wysocki, Pavel Machek,
Andy Lutomirski, Peter Zijlstra, Boris Ostrovsky
On 7/18/2022 7:32 AM, Chuck Zmudzinski wrote:
> On 7/17/2022 3:55 AM, Thorsten Leemhuis wrote:
> > Hi Juergen!
> >
> > On 15.07.22 16:25, Juergen Gross wrote:
> > > Today PAT can't be used without MTRR being available, unless MTRR is at
> > > least configured via CONFIG_MTRR and the system is running as a Xen PV
> > > guest. In this case PAT is automatically available via the hypervisor,
> > > but the PAT MSR can't be modified by the kernel and MTRR is disabled.
> > >
> > > As an additional complexity the availability of PAT can't be queried
> > > via pat_enabled() in the Xen PV case, as the lack of MTRR will set PAT
> > > to be disabled. This leads to some drivers believing that not all cache
> > > modes are available, resulting in failures or degraded functionality.
> > >
> > > The same applies to a kernel built with no MTRR support: it won't
> > > allow use of the PAT MSR, even though there is no technical reason for
> > > that other than the fact that setting up PAT the same way on all CPUs
> > > (a requirement of the processor's cache management) relies on some
> > > MTRR specific code.
> > >
> > > Fix all of that by:
> > >
> > > - moving the function needed by PAT from MTRR specific code one level
> > > up
> > > - adding a PAT indirection layer supporting the 3 cases "no or disabled
> > > PAT", "PAT under kernel control", and "PAT under Xen control"
> > > - removing the dependency of PAT on MTRR
> >
> > Thx for working on this. If you need to respin these patches for one
> > reason or another, could you do me a favor and add proper 'Link:' tags
> > pointing to all reports about this issue? e.g. like this:
> >
> > Link: https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
> >
> > These tags are considered important by Linus[1] and others, as they
> > allow anyone to look into the backstory weeks or years from now. That is
> > why they should be placed in cases like this, as
> > Documentation/process/submitting-patches.rst and
> > Documentation/process/5.Posting.rst explain in more detail. I care
> > personally, because these tags make my regression tracking efforts a
> > whole lot easier, as they allow my tracking bot 'regzbot' to
> > automatically connect reports with patches posted or committed to fix
> > tracked regressions.
> >
> > [1] see for example:
> > https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@mail.gmail.com/
> > https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/
> > https://lore.kernel.org/all/CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com/
> >
> > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> >
>
> [...]
>
OK, the comments Boris made on the individual patches of
this patch set answers most of my questions. Thx, Boris.
Chuck