From: Yazen Ghannam <yazen.ghannam@amd.com>
To: "Naik, Avadhut" <avadnaik@amd.com>, bp@alien.de
Cc: Borislav Petkov <bp@alien8.de>,
Avadhut Naik <avadhut.naik@amd.com>,
x86@kernel.org, linux-edac@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-kernel@vger.kernel.org,
tony.luck@intel.com, qiuxu.zhuo@intel.com, tglx@linutronix.de,
mingo@redhat.com, rostedt@goodmis.org, mchehab@kernel.org,
john.allen@amd.com
Subject: Re: [PATCH v7 5/5] EDAC/mce_amd: Add support for FRU Text in MCA
Date: Wed, 30 Oct 2024 16:46:01 -0400
Message-ID: <20241030204601.GA1505849@yaz-khff2.amd.com>
In-Reply-To: <5885d093-275d-4d29-ab13-2f118d61d62d@amd.com>
On Wed, Oct 30, 2024 at 02:57:33PM -0500, Naik, Avadhut wrote:
>
>
> On 10/30/2024 13:01, Borislav Petkov wrote:
> > On Wed, Oct 30, 2024 at 05:50:02PM +0100, Borislav Petkov wrote:
> >> Bah, crap. Lemme go back and take a second stab at this.
> >
> > Second try.
> >
> > The reason why I don't want to expose MCA_CONFIG to userspace is, well,
> > userspace doesn't need to know any "management" information the hw gives. It
> > either gets FRU text in that tracepoint or it doesn't. But it doesn't need to
> > know what MCA_CONFIG said or didn't say.
> >
> > Ok?
> >
> So, for now, in the kernel, we log SYND1/2 registers only when they contain
> FRUText.
> While in the userspace, since MCA_CONFIG is not in the picture, we always
> interpret SYND1/2 data as FRUText.
> Rasdaemon might need to be tweaked accordingly. Will take care of it.
> Overall, sounds good.
>
Sounds good to me too.
Thanks,
Yazen
> Do you want me to send out a revised version with these changes?
>
> > Author: Yazen Ghannam <yazen.ghannam@amd.com>
> > Date: Tue Oct 22 19:36:31 2024 +0000
> >
> > EDAC/mce_amd: Add support for FRU text in MCA
> >
> > A new "FRU Text in MCA" feature is defined where the Field Replaceable
> > Unit (FRU) Text for a device is represented by a string in the new
> > MCA_SYND1 and MCA_SYND2 registers. This feature is supported per MCA
> > bank, and it is advertised by the McaFruTextInMca bit (MCA_CONFIG[9]).
> >
> > The FRU Text is populated dynamically for each individual error state
> > (MCA_STATUS, MCA_ADDR, et al.). Handle the case where an MCA bank covers
> > multiple devices, for example, a Unified Memory Controller (UMC) bank
> > that manages two DIMMs.
> >
> > [ Yazen: Add Avadhut as co-developer for wrapper changes. ]
> > [ bp: Do not expose MCA_CONFIG to userspace yet. ]
> >
> > Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> > Co-developed-by: Avadhut Naik <avadhut.naik@amd.com>
> > Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
> > Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> > Link: https://lore.kernel.org/r/20241022194158.110073-6-avadhut.naik@amd.com
> >
> > diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
> > index 4d936ee20e24..4543cf2eb5e8 100644
> > --- a/arch/x86/include/asm/mce.h
> > +++ b/arch/x86/include/asm/mce.h
> > @@ -61,6 +61,7 @@
> > * - TCC bit is present in MCx_STATUS.
> > */
> > #define MCI_CONFIG_MCAX 0x1
> > +#define MCI_CONFIG_FRUTEXT BIT_ULL(9)
> > #define MCI_IPID_MCATYPE 0xFFFF0000
> > #define MCI_IPID_HWID 0xFFF
> >
> > diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
> > index 194d9fd47d20..50d74d3bf0f5 100644
> > --- a/drivers/edac/mce_amd.c
> > +++ b/drivers/edac/mce_amd.c
> > @@ -795,6 +795,7 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
> >  	struct mce *m = (struct mce *)data;
> >  	struct mce_hw_err *err = to_mce_hw_err(m);
> >  	unsigned int fam = x86_family(m->cpuid);
> > +	u32 mca_config_lo = 0, dummy;
> >  	int ecc;
> >
> >  	if (m->kflags & MCE_HANDLED_CEC)
> > @@ -814,11 +815,9 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
> >  		((m->status & MCI_STATUS_PCC) ? "PCC" : "-"));
> >
> >  	if (boot_cpu_has(X86_FEATURE_SMCA)) {
> > -		u32 low, high;
> > -		u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
> > +		rdmsr_safe(MSR_AMD64_SMCA_MCx_CONFIG(m->bank), &mca_config_lo, &dummy);
> >
> > -		if (!rdmsr_safe(addr, &low, &high) &&
> > -		    (low & MCI_CONFIG_MCAX))
> > +		if (mca_config_lo & MCI_CONFIG_MCAX)
> >  			pr_cont("|%s", ((m->status & MCI_STATUS_TCC) ? "TCC" : "-"));
> >
> >  		pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
> > @@ -853,8 +852,15 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
> >
> >  		if (m->status & MCI_STATUS_SYNDV) {
> >  			pr_cont(", Syndrome: 0x%016llx\n", m->synd);
> > -			pr_emerg(HW_ERR "Syndrome1: 0x%016llx, Syndrome2: 0x%016llx",
> > -				 err->vendor.amd.synd1, err->vendor.amd.synd2);
> > +			if (mca_config_lo & MCI_CONFIG_FRUTEXT) {
> > +				char frutext[17];
> > +
> > +				frutext[16] = '\0';
> > +				memcpy(&frutext[0], &err->vendor.amd.synd1, 8);
> > +				memcpy(&frutext[8], &err->vendor.amd.synd2, 8);
> > +
> > +				pr_emerg(HW_ERR "FRU Text: %s", frutext);
> > +			}
> >  		}
> >
> >  		pr_cont("\n");
> >
> >
>
> --
> Thanks,
> Avadhut Naik
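
For the userspace side discussed above, where SYND1/2 are always
interpreted as FRU text, the decoding could look roughly like the sketch
below. It mirrors the memcpy() sequence in amd_decode_mce() but is only an
illustration: frutext_from_synd() and FRUTEXT_LEN are hypothetical names,
not rasdaemon or kernel API, and the sketch assumes the raw register
values are consumed on a little-endian host so that the lowest byte of
each register is the first character.

```c
#include <stdint.h>
#include <string.h>

/* Two 8-byte registers (MCA_SYND1 and MCA_SYND2) hold the text. */
#define FRUTEXT_LEN 16

/*
 * Hypothetical helper, not part of the patch: build a NUL-terminated
 * string from raw MCA_SYND1/MCA_SYND2 values. The copy order follows
 * the kernel hunk: SYND1 fills bytes 0-7, SYND2 fills bytes 8-15.
 * Assumes a little-endian host.
 */
static void frutext_from_synd(uint64_t synd1, uint64_t synd2,
			      char out[FRUTEXT_LEN + 1])
{
	memcpy(&out[0], &synd1, 8);
	memcpy(&out[8], &synd2, 8);

	/* The registers carry no terminator when all 16 bytes are used. */
	out[FRUTEXT_LEN] = '\0';
}
```

A caller would pass the two syndrome values from the tracepoint record
plus a 17-byte buffer, then print the buffer with "%s". Text shorter than
16 characters ends at the first zero byte inside the registers.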