* x86/MCE/AMD: Fix the thresholding machinery initialization order
@ 2018-11-27 19:25 Borislav Petkov
0 siblings, 0 replies; 4+ messages in thread
From: Borislav Petkov @ 2018-11-27 19:25 UTC (permalink / raw)
To: Ghannam, Yazen, Rafał Miłecki, John Clemens
Cc: Tony Luck, linux-edac@vger.kernel.org, Aravind Gopalakrishnan,
Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86@kernel.org,
Rafał Miłecki
Currently, the code sets up the thresholding interrupt vector and only
then goes about initializing the thresholding banks. Which is wrong,
because an early thresholding interrupt would cause a NULL pointer
dereference when accessing those banks and prevent the machine from
booting.
Therefore, set the thresholding interrupt vector only *after* having
initialized the banks successfully.
Fixes: 18807ddb7f88 ("x86/mce/AMD: Reset Threshold Limit after logging error")
Reported-by: Rafał Miłecki <rafal@milecki.pl>
Reported-by: John Clemens <clemej@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: linux-edac@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: https://lkml.kernel.org/r/20181127101700.2964-1-zajec5@gmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201291
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 19 ++++++-------------
1 file changed, 6 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index dd33c357548f..e12454e21b8a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -56,7 +56,7 @@
/* Threshold LVT offset is at MSR0xC0000410[15:12] */
#define SMCA_THR_LVT_OFF 0xF000
-static bool thresholding_en;
+static bool thresholding_irq_en;
static const char * const th_names[] = {
"load_store",
@@ -534,9 +534,8 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
set_offset:
offset = setup_APIC_mce_threshold(offset, new);
-
- if ((offset == new) && (mce_threshold_vector != amd_threshold_interrupt))
- mce_threshold_vector = amd_threshold_interrupt;
+ if (offset == new)
+ thresholding_irq_en = true;
done:
mce_threshold_block_init(&b, offset);
@@ -1357,9 +1356,6 @@ int mce_threshold_remove_device(unsigned int cpu)
{
unsigned int bank;
- if (!thresholding_en)
- return 0;
-
for (bank = 0; bank < mca_cfg.banks; ++bank) {
if (!(per_cpu(bank_map, cpu) & (1 << bank)))
continue;
@@ -1377,9 +1373,6 @@ int mce_threshold_create_device(unsigned int cpu)
struct threshold_bank **bp;
int err = 0;
- if (!thresholding_en)
- return 0;
-
bp = per_cpu(threshold_banks, cpu);
if (bp)
return 0;
@@ -1408,9 +1401,6 @@ static __init int threshold_init_device(void)
{
unsigned lcpu = 0;
- if (mce_threshold_vector == amd_threshold_interrupt)
- thresholding_en = true;
-
/* to hit CPUs online before the notifier is up */
for_each_online_cpu(lcpu) {
int err = mce_threshold_create_device(lcpu);
@@ -1419,6 +1409,9 @@ static __init int threshold_init_device(void)
return err;
}
+ if (thresholding_irq_en)
+ mce_threshold_vector = amd_threshold_interrupt;
+
return 0;
}
/*
^ permalink raw reply related [flat|nested] 4+ messages in thread
* x86/MCE/AMD: Fix the thresholding machinery initialization order
@ 2018-11-27 20:38 Rafał Miłecki
0 siblings, 0 replies; 4+ messages in thread
From: Rafał Miłecki @ 2018-11-27 20:38 UTC (permalink / raw)
To: Borislav Petkov, Ghannam, Yazen, John Clemens
Cc: Tony Luck, linux-edac@vger.kernel.org, Aravind Gopalakrishnan,
Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86@kernel.org,
Rafał Miłecki
On 27.11.2018 20:25, Borislav Petkov wrote:
> Currently, the code sets up the thresholding interrupt vector and only
> then goes about initializing the thresholding banks. Which is wrong,
> because an early thresholding interrupt would cause a NULL pointer
> dereference when accessing those banks and prevent the machine from
> booting.
>
> Therefore, set the thresholding interrupt vector only *after* having
> initialized the banks successfully.
>
> Fixes: 18807ddb7f88 ("x86/mce/AMD: Reset Threshold Limit after logging error")
> Reported-by: Rafał Miłecki <rafal@milecki.pl>
> Reported-by: John Clemens <clemej@gmail.com>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
> Cc: linux-edac@vger.kernel.org
> Cc: stable@vger.kernel.org
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: x86@kernel.org
> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
> Link: https://lkml.kernel.org/r/20181127101700.2964-1-zajec5@gmail.com
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=201291
Tested on top of 4.14.83 and 4.20-rc4. It fixed both kernels for me!
Tested-by: Rafał Miłecki <rafal@milecki.pl>
John: thanks a lot for bisecting this issue down and finding a
workaround (mce=off). It allowed me to boot recent Linux & debug it
further.
Borislav: thank you for looking at my patch and coming with a nicer fix
SO quickly!
^ permalink raw reply [flat|nested] 4+ messages in thread
* x86/MCE/AMD: Fix the thresholding machinery initialization order
@ 2018-11-28 3:06 John Clemens
0 siblings, 0 replies; 4+ messages in thread
From: John Clemens @ 2018-11-28 3:06 UTC (permalink / raw)
To: Rafał Miłecki, Borislav Petkov, Ghannam, Yazen
Cc: Tony Luck, linux-edac@vger.kernel.org, Aravind Gopalakrishnan,
Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86@kernel.org,
Rafał Miłecki
On 11/27/18 3:38 PM, Rafał Miłecki wrote:
> On 27.11.2018 20:25, Borislav Petkov wrote:
>> Currently, the code sets up the thresholding interrupt vector and only
>> then goes about initializing the thresholding banks. Which is wrong,
>> because an early thresholding interrupt would cause a NULL pointer
>> dereference when accessing those banks and prevent the machine from
>> booting.
>>
>> Therefore, set the thresholding interrupt vector only *after* having
>> initialized the banks successfully.
>>
>> Fixes: 18807ddb7f88 ("x86/mce/AMD: Reset Threshold Limit after logging
>> error")
>> Reported-by: Rafał Miłecki <rafal@milecki.pl>
>> Reported-by: John Clemens <clemej@gmail.com>
>> Signed-off-by: Borislav Petkov <bp@suse.de>
>> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
>> Cc: linux-edac@vger.kernel.org
>> Cc: stable@vger.kernel.org
>> Cc: Tony Luck <tony.luck@intel.com>
>> Cc: x86@kernel.org
>> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
>> Link: https://lkml.kernel.org/r/20181127101700.2964-1-zajec5@gmail.com
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=201291
>
> Tested on top of 4.14.83 and 4.20-rc4. It fixed both kernels for me!
>
> Tested-by: Rafał Miłecki <rafal@milecki.pl>
Not that anyone needs further verification at this point, but tested
working on top of 4.19.5, on the HP EliteBook 745 G5, BIOS Q81 v01.03.01
(same as bugzilla report).
Tested-by: John Clemens <john@deater.net>
> John: thanks a lot for bisecting this issue down and finding a
> workaround (mce=off). It allowed me to boot recent Linux & debug it
> further.
Thank you for taking it the rest of the way, life got in the way after
the bisect. I'm glad you were able to pick it up.
> Borislav: thank you for looking at my patch and coming with a nicer fix
> SO quickly!
My thanks as well, to all.
john.c
^ permalink raw reply [flat|nested] 4+ messages in thread
* x86/MCE/AMD: Fix the thresholding machinery initialization order
@ 2018-11-28 9:09 Borislav Petkov
0 siblings, 0 replies; 4+ messages in thread
From: Borislav Petkov @ 2018-11-28 9:09 UTC (permalink / raw)
To: John Clemens
Cc: Rafał Miłecki, Ghannam, Yazen, Tony Luck,
linux-edac@vger.kernel.org, Aravind Gopalakrishnan,
Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86@kernel.org,
Rafał Miłecki
On Tue, Nov 27, 2018 at 10:06:55PM -0500, John Clemens wrote:
> Not that anyone needs further verification at this point, but tested working
> on top of 4.19.5, on the HP EliteBook 745 G5, BIOS Q81 v01.03.01 (same as
> bugzilla report).
>
> Tested-by: John Clemens <john@deater.net>
Thanks for testing!
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2018-11-28 9:09 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-11-28 9:09 x86/MCE/AMD: Fix the thresholding machinery initialization order Borislav Petkov
-- strict thread matches above, loose matches on Subject: below --
2018-11-28 3:06 John Clemens
2018-11-27 20:38 Rafał Miłecki
2018-11-27 19:25 Borislav Petkov
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.