From: Borislav Petkov <bp@alien8.de>
To: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: dougthompson@xmission.com, mchehab@osg.samsung.com,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org
Subject: Re: [PATCH 3/3] edac, mce_amd_inj: Inject errors on NBC for bank 4 errors
Date: Wed, 10 Jun 2015 10:25:14 +0200
Message-ID: <20150610082514.GD24639@pd.tnic>
In-Reply-To: <1433868317-18417-4-git-send-email-Aravind.Gopalakrishnan@amd.com>
On Tue, Jun 09, 2015 at 11:45:17AM -0500, Aravind Gopalakrishnan wrote:
> For bank 4 errors, MCE is logged and reported only on
> node base cores. Refer D18F3x44[NbMcaToMstCpuEn] field in
> Fam10h and later BKDGs.
>
> This patch ensures that we inject the error on the node base core
> for bank 4 errors. Otherwise, triggering #MC or apic interrupts on
> a non node base core would not have any effect on the system.
> (i.e), we would not see any relevant output on kernel logs for
> the error we just injected.
>
> Update copyrights info while at it.
>
> Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
> ---
> drivers/edac/mce_amd_inj.c | 57 +++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 56 insertions(+), 1 deletion(-)
Applied, did some cleanups too, see below.
It builds, now to do some testing :-)
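For anyone following along, here is a minimal userspace sketch of the NBC arithmetic that get_nbc_for_node() in the patch below performs. The helper name nbc_for_node() and its explicit parameters are hypothetical; the kernel code reads the equivalent values from boot_cpu_data.x86_max_cores and amd_get_nodes_per_socket(). The sketch also assumes the cores divide evenly across the nodes of a socket.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for get_nbc_for_node() from the patch.
 *
 * max_cores:        cores per socket (assumed to mirror x86_max_cores)
 * nodes_per_socket: nodes per socket (assumed to mirror the value returned
 *                   by amd_get_nodes_per_socket())
 * node_id:          node whose base core we want
 *
 * Returns the number of the lowest numbered core on that node, on the
 * assumption that cores are numbered contiguously node by node.
 */
static uint32_t nbc_for_node(uint32_t max_cores, uint32_t nodes_per_socket,
			     uint32_t node_id)
{
	uint32_t cores_per_node = max_cores / nodes_per_socket;

	return cores_per_node * node_id;
}
```

With an assumed 16 cores per socket and 2 nodes per socket, nbc_for_node(16, 2, 1) yields 8, so a bank 4 error requested on any core of node 1 would be injected on core 8.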
---
From: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Date: Tue, 9 Jun 2015 11:45:17 -0500
Subject: [PATCH] EDAC, mce_amd_inj: Inject errors on NBC for bank 4 errors
Bank 4 MCEs are logged and reported only on the node base core (NBC) in
a socket. Refer to the D18F3x44[NbMcaToMstCpuEn] field in Fam10h and
later BKDGs. The NBC is the lowest numbered core in the node.
This patch ensures that we inject the error on the NBC for bank 4
errors. Otherwise, triggering #MC or APIC interrupts on a core which is
not the NBC would not have any effect on the system, i.e., we would not
see any relevant output in the kernel logs for the error we just injected.
Update copyright info while at it.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mchehab@osg.samsung.com
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1433868317-18417-4-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Massage commit message, save an indentation level in toggle_nb_mca_mst_cpu,
reflow comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
drivers/edac/mce_amd_inj.c | 59 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 57 insertions(+), 2 deletions(-)
diff --git a/drivers/edac/mce_amd_inj.c b/drivers/edac/mce_amd_inj.c
index 3e1b53fb8f25..012b61a33a60 100644
--- a/drivers/edac/mce_amd_inj.c
+++ b/drivers/edac/mce_amd_inj.c
@@ -6,7 +6,7 @@
* This file may be distributed under the terms of the GNU General Public
* License version 2.
*
- * Copyright (c) 2010-14: Borislav Petkov <bp@alien8.de>
+ * Copyright (c) 2010-15: Borislav Petkov <bp@alien8.de>
* Advanced Micro Devices Inc.
*/
@@ -17,10 +17,13 @@
#include <linux/cpu.h>
#include <linux/string.h>
#include <linux/uaccess.h>
+#include <linux/pci.h>
#include <asm/mce.h>
#include <asm/irq_vectors.h>
+#include <asm/amd_nb.h>
#include "mce_amd.h"
+#include "amd64_edac.h"
/*
* Collect all the MCi_XXX settings
@@ -200,6 +203,45 @@ static void trigger_thr_int(void *info)
asm volatile("int %0" :: "i" (THRESHOLD_APIC_VECTOR));
}
+static u32 get_nbc_for_node(int node_id)
+{
+ struct cpuinfo_x86 *c = &boot_cpu_data;
+ u32 cores_per_node;
+
+ cores_per_node = c->x86_max_cores / amd_get_nodes_per_socket();
+
+ return cores_per_node * node_id;
+}
+
+static void toggle_nb_mca_mst_cpu(u16 nid)
+{
+ struct pci_dev *F3 = node_to_amd_nb(nid)->misc;
+ u32 val;
+ int err;
+
+ if (!F3)
+ return;
+
+ err = pci_read_config_dword(F3, NBCFG, &val);
+ if (err) {
+ pr_err("%s: Error reading F%dx%03x.\n",
+ __func__, PCI_FUNC(F3->devfn), NBCFG);
+ return;
+ }
+
+ if (val & BIT(27))
+ return;
+
+ pr_err("%s: Set D18F3x44[NbMcaToMstCpuEn] which BIOS hasn't done.\n",
+ __func__);
+
+ val |= BIT(27);
+ err = pci_write_config_dword(F3, NBCFG, val);
+ if (err)
+ pr_err("%s: Error writing F%dx%03x.\n",
+ __func__, PCI_FUNC(F3->devfn), NBCFG);
+}
+
static void do_inject(void)
{
u64 mcg_status = 0;
@@ -220,7 +262,6 @@ static void do_inject(void)
* b. unset MCx_STATUS[UC]
* As deferred errors are _not_ UC
*/
-
i_mce.status |= MCI_STATUS_DEFERRED;
i_mce.status &= ~MCI_STATUS_UC;
}
@@ -235,6 +276,20 @@ static void do_inject(void)
if (!(i_mce.status & MCI_STATUS_PCC))
mcg_status |= MCG_STATUS_RIPV;
+ /*
+ * For multi node CPUs, logging and reporting of bank 4 errors happens
+ * only on the node base core. Refer to D18F3x44[NbMcaToMstCpuEn] for
+ * Fam10h and later BKDGs.
+ */
+ if (static_cpu_has(X86_FEATURE_AMD_DCM) && b == 4) {
+ /*
+ * BIOS sets D18F3x44[NbMcaToMstCpuEn] by default. But make sure
+ * of it here just in case.
+ */
+ toggle_nb_mca_mst_cpu(amd_get_nb_id(cpu));
+ cpu = get_nbc_for_node(amd_get_nb_id(cpu));
+ }
+
toggle_hw_mce_inject(cpu, true);
wrmsr_on_cpu(cpu, MSR_IA32_MCG_STATUS,
--
2.3.5
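As a side note on toggle_nb_mca_mst_cpu() above: despite the name, it only ever sets bit 27 of the register, and only when the bit is not already set. A minimal sketch of that read-check-set logic, separated from the PCI config accessors so it can be exercised in userspace, could look like the following. The macro and helper names are hypothetical; only the bit position (27) comes from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 27 is the position the patch tests and sets with BIT(27). */
#define NB_MCA_TO_MST_CPU_EN	(UINT32_C(1) << 27)

/*
 * Hypothetical helper: make sure the enable bit is set in *val.
 *
 * Returns true if *val was modified, i.e. the caller would have to write
 * the register back; returns false if the bit was already set and no
 * write is needed.
 */
static bool ensure_nb_mca_to_mst_cpu_en(uint32_t *val)
{
	if (*val & NB_MCA_TO_MST_CPU_EN)
		return false;

	*val |= NB_MCA_TO_MST_CPU_EN;
	return true;
}
```

In the patch, the value comes from pci_read_config_dword() and is written back with pci_write_config_dword() only in the case where this helper would return true.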
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.