* [PATCH V3 0/2] Rework mce_severity
@ 2015-03-23 15:42 Aravind Gopalakrishnan
From: Aravind Gopalakrishnan @ 2015-03-23 15:42 UTC (permalink / raw)
To: tglx, mingo, hpa, tony.luck, bp, slaoub, luto, x86, linux-kernel,
linux-edac
Cc: Aravind Gopalakrishnan
Patch1: Introduce AMD severities function
Patch2: Initialise mce_severity function pointer to mce_severity_intel
and override it to mce_severity_amd on AMD systems
Aravind Gopalakrishnan (2):
x86, mce, severities: Add AMD severities function
x86, mce, severities: Define mce_severity function pointer
arch/x86/include/asm/mce.h | 8 ++++
arch/x86/kernel/cpu/mcheck/mce-internal.h | 3 +-
arch/x86/kernel/cpu/mcheck/mce-severity.c | 64 ++++++++++++++++++++++++++++++-
arch/x86/kernel/cpu/mcheck/mce.c | 10 +++++
4 files changed, 83 insertions(+), 2 deletions(-)
--
1.9.1

* [PATCH V3 1/2] x86, mce, severities: Add AMD severities function
@ 2015-03-23 15:42 Aravind Gopalakrishnan
From: Aravind Gopalakrishnan @ 2015-03-23 15:42 UTC (permalink / raw)
To: tglx, mingo, hpa, tony.luck, bp, slaoub, luto, x86, linux-kernel,
	linux-edac
Cc: Aravind Gopalakrishnan, Aravind Gopalakrishnan

Add a severities function that caters to AMD processors.
This allows us to do some vendor specific work within the
function if necessary.

Also, introduce a vendor flag bitfield which contains vendor
specific flags. The severities code uses this to define error
scope based on the presence of the flags field.

This is based off of work by Boris Petkov.

Testing details:
Tested the patch for any regressions on
Fam10h, Model 9h (Greyhound)
Fam15h: Models 0h-0fh (Orochi), 30h-3fh (Kaveri) and 60h-6fh (Carrizo),
Fam16h Model 00h-0fh (Kabini)

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
---
Changes from V2:
 - Rebase on top of latest tip
 - Tested patch on more systems and updated commit message appropriately

Changes from V1:
 - Test mce_flags.overflow_recov once instead of multiple times

 arch/x86/include/asm/mce.h | 6 ++++
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 53 +++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/mcheck/mce.c | 9 ++++++
 3 files changed, 68 insertions(+)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index fd38a23..b574fbf 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -116,6 +116,12 @@ struct mca_config {
 	u32 rip_msr;
 };
 
+struct mce_vendor_flags {
+	__u64		overflow_recov	: 1, /* cpuid_ebx(80000007) */
+			__reserved_0	: 63;
+};
+extern struct mce_vendor_flags mce_flags;
+
 extern struct mca_config mca_cfg;
 extern void mce_register_decode_chain(struct notifier_block *nb);
 extern void mce_unregister_decode_chain(struct notifier_block *nb);
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 8bb4330..4f8f87d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -186,12 +186,65 @@ static int error_context(struct mce *m)
 	return ((m->cs & 3) == 3) ? IN_USER : IN_KERNEL;
 }
 
+/* keeping mce_severity_amd in sync with AMD error scope heirarchy table */
+static int mce_severity_amd(struct mce *m, enum context ctx)
+{
+	enum context ctx = error_context(m);
+	/* Processor Context Corrupt, no need to fumble too much, die! */
+	if (m->status & MCI_STATUS_PCC)
+		return MCE_PANIC_SEVERITY;
+
+	if (m->status & MCI_STATUS_UC) {
+		/*
+		 * On older systems, where overflow_recov flag is not
+		 * present, we should simply PANIC if Overflow occurs.
+		 * If overflow_recov flag set, then SW can try
+		 * to at least kill process to salvage systen operation.
+		 */
+
+		if (mce_flags.overflow_recov) {
+			/* software can try to contain */
+			if (!(m->mcgstatus & MCG_STATUS_RIPV))
+				if (ctx == IN_KERNEL)
+					return MCE_PANIC_SEVERITY;
+
+			/* kill current process */
+			return MCE_AR_SEVERITY;
+		} else {
+			/* at least one error was not logged */
+			if (m->status & MCI_STATUS_OVER)
+				return MCE_PANIC_SEVERITY;
+		}
+		/*
+		 * any other case, return MCE_UC_SEVERITY so that
+		 * we log the error and exit #MC handler.
+		 */
+		return MCE_UC_SEVERITY;
+	}
+
+	/*
+	 * deferred error: poll handler catches these and adds to mce_ring
+	 * so memory-failure can take recovery actions.
+	 */
+	if (m->status & MCI_STATUS_DEFERRED)
+		return MCE_DEFERRED_SEVERITY;
+
+	/*
+	 * corrected error: poll handler catches these and passes
+	 * responsibility of decoding the error to EDAC
+	 */
+	return MCE_KEEP_SEVERITY;
+}
+
 int mce_severity(struct mce *m, int tolerant, char **msg, bool is_excp)
 {
 	enum exception excp = (is_excp ? EXCP_CONTEXT : NO_EXCP);
 	enum context ctx = error_context(m);
 	struct severity *s;
 
+	if (m->cpuvendor == X86_VENDOR_AMD)
+		return mce_severity_amd(m, ctx);
+
 	for (s = severities;; s++) {
 		if ((m->status & s->mask) != s->result)
 			continue;
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3cc6793..03c7e0a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -65,6 +65,7 @@ static DEFINE_MUTEX(mce_chrdev_read_mutex);
 DEFINE_PER_CPU(unsigned, mce_exception_count);
 
 struct mce_bank *mce_banks __read_mostly;
+struct mce_vendor_flags mce_flags __read_mostly;
 
 struct mca_config mca_cfg __read_mostly = {
 	.bootlog = -1,
@@ -1533,6 +1534,13 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 			mce_banks[0].ctl = 0;
 
 		/*
+		 * overflow_recov is supported for F15h Models 00h-0fh
+		 * even though we don't have cpuid bit for this
+		 */
+		if (c->x86 == 0x15 && c->x86_model <= 0xf)
+			mce_flags.overflow_recov = 1;
+
+		/*
 		 * Turn off MC4_MISC thresholding banks on those models since
 		 * they're not supported there.
 		 */
@@ -1631,6 +1639,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 		break;
 	case X86_VENDOR_AMD:
 		mce_amd_feature_init(c);
+		mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
 		break;
 	default:
 		break;
-- 
1.9.1
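
The grading order that mce_severity_amd() walks through can be read as a small decision tree. The following is a minimal userspace sketch of that order, not the kernel code: every name prefixed sk_ or SK_ is hypothetical, the context is taken only as a parameter, and the bit positions are assumptions based on my understanding of the MCi_STATUS and MCG_STATUS layouts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SK_STATUS_OVER     (1ULL << 62)	/* an earlier error was overwritten */
#define SK_STATUS_UC       (1ULL << 61)	/* uncorrected error */
#define SK_STATUS_PCC      (1ULL << 57)	/* processor context corrupt */
#define SK_STATUS_DEFERRED (1ULL << 44)	/* deferred error */
#define SK_MCG_RIPV        (1ULL << 0)	/* restart IP valid */

enum sk_context { SK_IN_KERNEL, SK_IN_USER };
enum sk_severity { SK_KEEP, SK_DEFERRED, SK_UC, SK_AR, SK_PANIC };

/* Hypothetical cut-down stand-in for struct mce. */
struct sk_mce {
	uint64_t status;
	uint64_t mcgstatus;
};

static enum sk_severity sk_severity_amd(const struct sk_mce *m,
					enum sk_context ctx,
					bool overflow_recov)
{
	/* Corrupt processor context always wins. */
	if (m->status & SK_STATUS_PCC)
		return SK_PANIC;

	if (m->status & SK_STATUS_UC) {
		if (overflow_recov) {
			/* No valid restart IP while in the kernel: give up. */
			if (!(m->mcgstatus & SK_MCG_RIPV) && ctx == SK_IN_KERNEL)
				return SK_PANIC;
			/* Otherwise action is required on the current task. */
			return SK_AR;
		}
		/* Without overflow recovery, a lost error is fatal. */
		if (m->status & SK_STATUS_OVER)
			return SK_PANIC;
		/* Log the error and leave the handler. */
		return SK_UC;
	}

	if (m->status & SK_STATUS_DEFERRED)
		return SK_DEFERRED;

	/* Corrected error: keep it for later decoding. */
	return SK_KEEP;
}
```

With these definitions, an uncorrected error with RIPV clear grades as SK_PANIC in kernel context and as SK_AR in user context, provided overflow_recov is set.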

* Re: [PATCH V3 1/2] x86, mce, severities: Add AMD severities function
@ 2015-03-24 8:30 Borislav Petkov
From: Borislav Petkov @ 2015-03-24 8:30 UTC (permalink / raw)
To: Aravind Gopalakrishnan
Cc: tglx, mingo, hpa, tony.luck, slaoub, luto, x86, linux-kernel,
	linux-edac

On Mon, Mar 23, 2015 at 10:42:52AM -0500, Aravind Gopalakrishnan wrote:
> Add a severities function that caters to AMD processors.
> This allows us to do some vendor specific work within the
> function if necessary.
>
> Also, introduce a vendor flag bitfield which contains vendor
> specific flags. The severities code uses this to define error
> scope based on the presence of the flags field.
>
> This is based off of work by Boris Petkov.
>
> Testing details:
> Tested the patch for any regressions on
> Fam10h, Model 9h (Greyhound)
> Fam15h: Models 0h-0fh (Orochi), 30h-3fh (Kaveri) and 60h-6fh (Carrizo),
> Fam16h Model 00h-0fh (Kabini)
>
> Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
> ---
> Changes from V2:
>  - Rebase on top of latest tip
>  - Tested patch on more systems and updated commit message appropriately
>
> Changes from V1:
>  - Test mce_flags.overflow_recov once instead of multiple times
>
>  arch/x86/include/asm/mce.h | 6 ++++
>  arch/x86/kernel/cpu/mcheck/mce-severity.c | 53 +++++++++++++++++++++++++++++++
>  arch/x86/kernel/cpu/mcheck/mce.c | 9 ++++++
>  3 files changed, 68 insertions(+)
>
> diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
> index fd38a23..b574fbf 100644
> --- a/arch/x86/include/asm/mce.h
> +++ b/arch/x86/include/asm/mce.h
> @@ -116,6 +116,12 @@ struct mca_config {
>  	u32 rip_msr;
>  };
>
> +struct mce_vendor_flags {
> +	__u64		overflow_recov	: 1, /* cpuid_ebx(80000007) */
> +			__reserved_0	: 63;
> +};
> +extern struct mce_vendor_flags mce_flags;
> +
>  extern struct mca_config mca_cfg;
>  extern void mce_register_decode_chain(struct notifier_block *nb);
>  extern void mce_unregister_decode_chain(struct notifier_block *nb);
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> index 8bb4330..4f8f87d 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> @@ -186,12 +186,65 @@ static int error_context(struct mce *m)
>  	return ((m->cs & 3) == 3) ? IN_USER : IN_KERNEL;
>  }
>
> +/* keeping mce_severity_amd in sync with AMD error scope heirarchy table */

Which table do you mean?

I changed it to:

/*
 * See AMD Error Scope Hierarchy table in a newer BKDG. For example
 * 49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features"
 */

to explicitly name it.

> +static int mce_severity_amd(struct mce *m, enum context ctx)
> +{
> +	enum context ctx = error_context(m);

arch/x86/kernel/cpu/mcheck/mce-severity.c: In function ‘mce_severity_amd’:
arch/x86/kernel/cpu/mcheck/mce-severity.c:192:15: error: ‘ctx’ redeclared as different kind of symbol
  enum context ctx = error_context(m);
               ^
arch/x86/kernel/cpu/mcheck/mce-severity.c:190:57: note: previous definition of ‘ctx’ was here
 static int mce_severity_amd(struct mce *m, enum context ctx)
                                                         ^
make[4]: *** [arch/x86/kernel/cpu/mcheck/mce-severity.o] Error 1
make[3]: *** [arch/x86/kernel/cpu/mcheck] Error 2
make[2]: *** [arch/x86/kernel/cpu] Error 2
make[1]: *** [arch/x86/kernel] Error 2
make: *** [arch/x86] Error 2
make: *** Waiting for unfinished jobs....

I fixed it up.

I've committed this:

---
From: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Date: Mon, 23 Mar 2015 10:42:52 -0500
Subject: [PATCH] x86/mce: Add an AMD severities-grading function

Add a severities function that caters to AMD processors. This allows us
to do some vendor-specific work within the function if necessary.

Also, introduce a vendor flag bitfield for vendor-specific settings. The
severities code uses this to define error scope based on the presence
of the flags field.

This is based off of work by Boris Petkov.

Testing details:
Fam10h, Model 9h (Greyhound)
Fam15h: Models 0h-0fh (Orochi), 30h-3fh (Kaveri) and 60h-6fh (Carrizo),
Fam16h Model 00h-0fh (Kabini)

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/1427125373-2918-2-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Fixup build, clean up comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/mce.h | 6 ++++
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 56 +++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/mcheck/mce.c | 9 +++++
 3 files changed, 71 insertions(+)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index fd38a23e729f..b574fbf62d39 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -116,6 +116,12 @@ struct mca_config {
 	u32 rip_msr;
 };
 
+struct mce_vendor_flags {
+	__u64		overflow_recov	: 1, /* cpuid_ebx(80000007) */
+			__reserved_0	: 63;
+};
+extern struct mce_vendor_flags mce_flags;
+
 extern struct mca_config mca_cfg;
 extern void mce_register_decode_chain(struct notifier_block *nb);
 extern void mce_unregister_decode_chain(struct notifier_block *nb);
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 8bb433043a7f..e16f3f201e06 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -186,12 +186,68 @@ static int error_context(struct mce *m)
 	return ((m->cs & 3) == 3) ? IN_USER : IN_KERNEL;
 }
 
+/*
+ * See AMD Error Scope Hierarchy table in a newer BKDG. For example
+ * 49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features"
+ */
+static int mce_severity_amd(struct mce *m, enum context ctx)
+{
+	/* Processor Context Corrupt, no need to fumble too much, die! */
+	if (m->status & MCI_STATUS_PCC)
+		return MCE_PANIC_SEVERITY;
+
+	if (m->status & MCI_STATUS_UC) {
+
+		/*
+		 * On older systems where overflow_recov flag is not present, we
+		 * should simply panic if an error overflow occurs. If
+		 * overflow_recov flag is present and set, then software can try
+		 * to at least kill process to prolong system operation.
+		 */
+		if (mce_flags.overflow_recov) {
+			/* software can try to contain */
+			if (!(m->mcgstatus & MCG_STATUS_RIPV))
+				if (ctx == IN_KERNEL)
+					return MCE_PANIC_SEVERITY;
+
+			/* kill current process */
+			return MCE_AR_SEVERITY;
+		} else {
+			/* at least one error was not logged */
+			if (m->status & MCI_STATUS_OVER)
+				return MCE_PANIC_SEVERITY;
+		}
+
+		/*
+		 * For any other case, return MCE_UC_SEVERITY so that we log the
+		 * error and exit #MC handler.
+		 */
+		return MCE_UC_SEVERITY;
+	}
+
+	/*
+	 * deferred error: poll handler catches these and adds to mce_ring so
+	 * memory-failure can take recovery actions.
+	 */
+	if (m->status & MCI_STATUS_DEFERRED)
+		return MCE_DEFERRED_SEVERITY;
+
+	/*
+	 * corrected error: poll handler catches these and passes responsibility
+	 * of decoding the error to EDAC
+	 */
+	return MCE_KEEP_SEVERITY;
+}
+
 int mce_severity(struct mce *m, int tolerant, char **msg, bool is_excp)
 {
 	enum exception excp = (is_excp ? EXCP_CONTEXT : NO_EXCP);
 	enum context ctx = error_context(m);
 	struct severity *s;
 
+	if (m->cpuvendor == X86_VENDOR_AMD)
+		return mce_severity_amd(m, ctx);
+
 	for (s = severities;; s++) {
 		if ((m->status & s->mask) != s->result)
 			continue;
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 8548b714a16b..1189f1150a19 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -64,6 +64,7 @@ static DEFINE_MUTEX(mce_chrdev_read_mutex);
 DEFINE_PER_CPU(unsigned, mce_exception_count);
 
 struct mce_bank *mce_banks __read_mostly;
+struct mce_vendor_flags mce_flags __read_mostly;
 
 struct mca_config mca_cfg __read_mostly = {
 	.bootlog = -1,
@@ -1535,6 +1536,13 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 			mce_banks[0].ctl = 0;
 
 		/*
+		 * overflow_recov is supported for F15h Models 00h-0fh
+		 * even though we don't have a CPUID bit for it.
+		 */
+		if (c->x86 == 0x15 && c->x86_model <= 0xf)
+			mce_flags.overflow_recov = 1;
+
+		/*
 		 * Turn off MC4_MISC thresholding banks on those models since
 		 * they're not supported there.
 		 */
@@ -1633,6 +1641,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 		break;
 	case X86_VENDOR_AMD:
 		mce_amd_feature_init(c);
+		mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
 		break;
 	default:
 		break;
-- 
2.3.3

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
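
The patch names two sources for overflow_recov: bit 0 of EBX from CPUID leaf 0x80000007, and a quirk for Family 15h Models 00h-0fh, which lack that CPUID bit. Below is a minimal userspace sketch of that rule. It assumes that either source on its own should be enough to set the flag, which is my reading of the comments and not something the thread states. The names sk_vendor_flags and sk_detect_flags are hypothetical, and the EBX value is passed in as a plain integer so no CPUID instruction is executed.

```c
#include <stdint.h>

/* Hypothetical one-bit mirror of the vendor flag introduced by the patch. */
struct sk_vendor_flags {
	unsigned int overflow_recov : 1;	/* CPUID 0x80000007, EBX bit 0 */
};

/*
 * family/model are the CPU family and model numbers; ebx is the EBX
 * value a caller would have read from CPUID leaf 0x80000007.
 */
static struct sk_vendor_flags sk_detect_flags(unsigned int family,
					      unsigned int model,
					      uint32_t ebx)
{
	struct sk_vendor_flags f = { 0 };

	/* Capability reported through CPUID. */
	if (ebx & 0x1)
		f.overflow_recov = 1;

	/* Assumed quirk: F15h Models 00h-0fh support it without the bit. */
	if (family == 0x15 && model <= 0xf)
		f.overflow_recov = 1;

	return f;
}
```

Under this assumption a Family 15h Model 02h part gets the flag even when EBX bit 0 reads as zero, while a Family 10h part with the bit clear does not.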

* Re: [PATCH V3 1/2] x86, mce, severities: Add AMD severities function
@ 2015-03-24 15:02 Aravind Gopalakrishnan
From: Aravind Gopalakrishnan @ 2015-03-24 15:02 UTC (permalink / raw)
To: Borislav Petkov
Cc: tglx, mingo, hpa, tony.luck, slaoub, luto, x86, linux-kernel,
	linux-edac

On 3/24/2015 3:30 AM, Borislav Petkov wrote:
> On Mon, Mar 23, 2015 at 10:42:52AM -0500, Aravind Gopalakrishnan wrote:
>>
>> +/* keeping mce_severity_amd in sync with AMD error scope heirarchy table */
> Which table do you mean?
>
> I changed it to:
>
> /*
>  * See AMD Error Scope Hierarchy table in a newer BKDG. For example
>  * 49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features"
>  */

Yes, this is the one I meant. Thanks.

> to explicitly name it.
>
>> +static int mce_severity_amd(struct mce *m, enum context ctx)
>> +{
>> +	enum context ctx = error_context(m);
> arch/x86/kernel/cpu/mcheck/mce-severity.c: In function ‘mce_severity_amd’:
> arch/x86/kernel/cpu/mcheck/mce-severity.c:192:15: error: ‘ctx’ redeclared as different kind of symbol
>   enum context ctx = error_context(m);
>                ^
> arch/x86/kernel/cpu/mcheck/mce-severity.c:190:57: note: previous definition of ‘ctx’ was here
>  static int mce_severity_amd(struct mce *m, enum context ctx)
>                                                          ^
> make[4]: *** [arch/x86/kernel/cpu/mcheck/mce-severity.o] Error 1
> make[3]: *** [arch/x86/kernel/cpu/mcheck] Error 2
> make[2]: *** [arch/x86/kernel/cpu] Error 2
> make[1]: *** [arch/x86/kernel] Error 2
> make: *** [arch/x86] Error 2
> make: *** Waiting for unfinished jobs....
>
> I fixed it up.

Sorry about that. That line should be in [patch 2/2] and I mistakenly
committed it as part of this patch.
It didn't show up on my builds as I had build tested after applying both
patches.

But there is a different problem now. The second patch won't apply cleanly
on top of your fix-

error: while searching for:
}

/* keeping mce_severity_amd in sync with AMD error scope heirarchy table */
static int mce_severity_amd(struct mce *m, enum context ctx)
{
	enum context ctx = error_context(m);
	/* Processor Context Corrupt, no need to fumble too much, die! */

error: patch failed: arch/x86/kernel/cpu/mcheck/mce-severity.c:187

I think the clean way to fix this would be to move the line to [patch 2/2].
Besides, we would need a new patch 2 anyway as the patch application would
fail due to changes to the comment line.

Shall I do that and resend?

Thanks,
-Aravind.

* Re: [PATCH V3 1/2] x86, mce, severities: Add AMD severities function
@ 2015-03-24 15:18 Borislav Petkov
From: Borislav Petkov @ 2015-03-24 15:18 UTC (permalink / raw)
To: Aravind Gopalakrishnan
Cc: tglx, mingo, hpa, tony.luck, slaoub, luto, x86, linux-kernel,
	linux-edac

On Tue, Mar 24, 2015 at 10:02:33AM -0500, Aravind Gopalakrishnan wrote:
> I think the clean way to fix this would be to move the line to [patch 2/2].
> Besides, we would need a new patch 2 anyway as the patch application would
> fail due to changes to the comment line.

Already done and done.

Here's what I'm working with:

http://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/log/?h=tip-x86-ras

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [PATCH V3 1/2] x86, mce, severities: Add AMD severities function
@ 2015-03-24 15:23 Aravind Gopalakrishnan
From: Aravind Gopalakrishnan @ 2015-03-24 15:23 UTC (permalink / raw)
To: Borislav Petkov
Cc: tglx, mingo, hpa, tony.luck, slaoub, luto, x86, linux-kernel,
	linux-edac

On 3/24/2015 10:18 AM, Borislav Petkov wrote:
> On Tue, Mar 24, 2015 at 10:02:33AM -0500, Aravind Gopalakrishnan wrote:
>> I think the clean way to fix this would be to move the line to [patch 2/2].
>> Besides, we would need a new patch 2 anyway as the patch application would
>> fail due to changes to the comment line.
> Already done and done.
>
> Here's what I'm working with:
>
> http://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/log/?h=tip-x86-ras
>

Ah. Ok, Thanks!

-Aravind.

* Re: [PATCH V3 1/2] x86, mce, severities: Add AMD severities function
@ 2015-03-24 16:03 Borislav Petkov
From: Borislav Petkov @ 2015-03-24 16:03 UTC (permalink / raw)
To: Aravind Gopalakrishnan
Cc: tglx, mingo, hpa, tony.luck, slaoub, luto, x86, linux-kernel,
	linux-edac

On Tue, Mar 24, 2015 at 10:23:43AM -0500, Aravind Gopalakrishnan wrote:
> >http://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/log/?h=tip-x86-ras

Btw, testing looks good on the Intel SNB box and on the K8 box I have
here. If nothing breaks until next week, I'll send the stuff to Ingo.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* [PATCH V3 2/2] x86, mce, severities: Define mce_severity function pointer
@ 2015-03-23 15:42 Aravind Gopalakrishnan
From: Aravind Gopalakrishnan @ 2015-03-23 15:42 UTC (permalink / raw)
To: tglx, mingo, hpa, tony.luck, bp, slaoub, luto, x86, linux-kernel,
	linux-edac
Cc: Aravind Gopalakrishnan

Rename mce_severity() as mce_severity_intel and assign
mce_severity function pointer to mce_severity_amd during init
if we are on an AMD processor.

This way, we can avoid a test to call mce_severity_amd every
time we get into mce_severity(). And it's cleaner to do it this way.

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
---
Changes from V2: (per Boris suggestion)
 - use mce_severity_intel for default and only override it for AMD systems
 - remove switch and use simple if for checking if we are on AMD processor

 arch/x86/include/asm/mce.h | 2 ++
 arch/x86/kernel/cpu/mcheck/mce-internal.h | 3 ++-
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 19 ++++++++++++++-----
 arch/x86/kernel/cpu/mcheck/mce.c | 1 +
 4 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index b574fbf..1f5a86d 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -134,9 +134,11 @@ extern int mce_p5_enabled;
 #ifdef CONFIG_X86_MCE
 int mcheck_init(void);
 void mcheck_cpu_init(struct cpuinfo_x86 *c);
+void mcheck_vendor_init_severity(void);
 #else
 static inline int mcheck_init(void) { return 0; }
 static inline void mcheck_cpu_init(struct cpuinfo_x86 *c) {}
+static inline void mcheck_vendor_init_severity(void) {}
 #endif
 
 #ifdef CONFIG_X86_ANCIENT_MCE
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index e12f0bf..4758f5f 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -24,7 +24,8 @@ struct mce_bank {
 	char			attrname[ATTR_LEN];	/* attribute name */
 };
 
-int mce_severity(struct mce *a, int tolerant, char **msg, bool is_excp);
+extern int (*mce_severity)(struct mce *a, int tolerant,
+			   char **msg, bool is_excp);
 struct dentry *mce_get_debugfs_dir(void);
 
 extern struct mce_bank *mce_banks;
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 4f8f87d..2d444de 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -187,7 +187,8 @@ static int error_context(struct mce *m)
 }
 
 /* keeping mce_severity_amd in sync with AMD error scope heirarchy table */
-static int mce_severity_amd(struct mce *m, enum context ctx)
+static int mce_severity_amd(struct mce *m, int tolerant,
+			    char **msg, bool is_excp)
 {
 	enum context ctx = error_context(m);
 	/* Processor Context Corrupt, no need to fumble too much, die! */
@@ -236,15 +237,13 @@ static int mce_severity_amd(struct mce *m, enum context ctx)
 	return MCE_KEEP_SEVERITY;
 }
 
-int mce_severity(struct mce *m, int tolerant, char **msg, bool is_excp)
+static int mce_severity_intel(struct mce *m, int tolerant,
+			      char **msg, bool is_excp)
 {
 	enum exception excp = (is_excp ? EXCP_CONTEXT : NO_EXCP);
 	enum context ctx = error_context(m);
 	struct severity *s;
 
-	if (m->cpuvendor == X86_VENDOR_AMD)
-		return mce_severity_amd(m, ctx);
-
 	for (s = severities;; s++) {
 		if ((m->status & s->mask) != s->result)
 			continue;
@@ -269,6 +268,16 @@ int mce_severity(struct mce *m, int tolerant, char **msg, bool is_excp)
 	}
 }
 
+/* Default to mce_severity_intel */
+int (*mce_severity)(struct mce *m, int tolerant, char **msg, bool is_excp) =
+		    mce_severity_intel;
+
+void __init mcheck_vendor_init_severity(void)
+{
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+		mce_severity = mce_severity_amd;
+}
+
 #ifdef CONFIG_DEBUG_FS
 static void *s_start(struct seq_file *f, loff_t *pos)
 {
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 03c7e0a..0faf418 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2024,6 +2024,7 @@ __setup("mce", mcheck_enable);
 int __init mcheck_init(void)
 {
 	mcheck_intel_therm_init();
+	mcheck_vendor_init_severity();
 
 	return 0;
 }
-- 
1.9.1
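
The dispatch scheme in this patch, a function pointer that defaults to one grader and is overridden once at init, can be sketched outside the kernel as follows. This is only an illustration of the pattern: every demo_ and DEMO_ name is hypothetical, the graders are stubs that report which one ran, and the vendor is passed in as an argument where the patch reads boot_cpu_data.

```c
/* Hypothetical vendor IDs and return markers, for illustration only. */
enum demo_vendor { DEMO_VENDOR_INTEL, DEMO_VENDOR_AMD, DEMO_VENDOR_OTHER };
enum { DEMO_GRADED_BY_INTEL = 1, DEMO_GRADED_BY_AMD = 2 };

/* Stub graders: each only reports which implementation was called. */
static int demo_severity_intel(unsigned long long status)
{
	(void)status;
	return DEMO_GRADED_BY_INTEL;
}

static int demo_severity_amd(unsigned long long status)
{
	(void)status;
	return DEMO_GRADED_BY_AMD;
}

/* Default to the Intel grader, mirroring the patch. */
static int (*demo_severity)(unsigned long long status) = demo_severity_intel;

/* Called once during init; only AMD overrides the default. */
static void demo_vendor_init_severity(enum demo_vendor vendor)
{
	if (vendor == DEMO_VENDOR_AMD)
		demo_severity = demo_severity_amd;
}
```

Callers keep invoking demo_severity(status) unchanged. The vendor check runs once in demo_vendor_init_severity() and not on every call, which is the saving the commit message describes.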

* RE: [PATCH V3 0/2] Rework mce_severity
@ 2015-03-23 21:54 Luck, Tony
From: Luck, Tony @ 2015-03-23 21:54 UTC (permalink / raw)
To: Aravind Gopalakrishnan, tglx@linutronix.de, mingo@redhat.com,
	hpa@zytor.com, bp@alien8.de, slaoub@gmail.com, luto@amacapital.net,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org

> Patch1: Introduce AMD severities function
> Patch2: Initialise mce_severity function pointer to mce_severity_intel
>         and override it to mce_severity_amd on AMD systems

both parts:

Acked-by: Tony Luck <tony.luck@intel.com>

-Tony