* [PATCH 0/3] Updates to EDAC mce_amd_inj
@ 2015-06-09 16:45 Aravind Gopalakrishnan
2015-06-09 16:45 ` [PATCH 1/3] x86, amd: Store number of nodes in a static global variable Aravind Gopalakrishnan
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Aravind Gopalakrishnan @ 2015-06-09 16:45 UTC (permalink / raw)
To: dougthompson, bp, mchehab; +Cc: linux-edac, linux-kernel, x86
This is basically a V3 of the final 3 patches of an earlier patchset
to update injection interfaces in mce_amd_inj.
Here's a link to V2 of the original series:
http://marc.info/?l=linux-edac&m=143327677901235&w=2
Patches 1-6 of the above series are currently in the 'for-next' branch of bp.git.
I am spinning these as a separate series because they needed to be rebased on
top of the CONFIG_X86_HT removal patch, which made it into tip.
Patch 1: Store the number of nodes as a static global in amd.c.
         No functional change.
Patch 2: Provide an accessor function to obtain the number of nodes per processor.
Patch 3: Modify the injection mechanism for bank 4 errors. Since they are
         typically logged or reported only on the node base core (NBC), we
         make sure that we inject on the correct core here.
Since the earlier patches are split across tip and bp.git, this series is
based on top of tip with EDAC patches merged from 'for-next' of bp.git
Changes wrt V2 of the original series:
 - Rebase on top of the CONFIG_X86_HT removal patch.
 - Remove the unnecessary amd_set_num_nodes() function (from patch 7 of the
   original series) and simplify the code. The current changes are in Patch 1.
 - Reword the commit message as well.
 - checkpatch with --strict threw a couple of checks on Patch 3; fixed them.
 - Update copyright info in mce_amd_inj.c.
Aravind Gopalakrishnan (3):
x86, amd: Store number of nodes in a static global variable
x86, amd: Provide accessor for number of nodes
edac, mce_amd_inj: Inject errors on NBC for bank 4 errors
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/amd.c | 23 ++++++++++++----
drivers/edac/mce_amd_inj.c | 57 +++++++++++++++++++++++++++++++++++++++-
3 files changed, 75 insertions(+), 6 deletions(-)
--
2.4.0
* [PATCH 1/3] x86, amd: Store number of nodes in a static global variable
2015-06-09 16:45 [PATCH 0/3] Updates to EDAC mce_amd_inj Aravind Gopalakrishnan
@ 2015-06-09 16:45 ` Aravind Gopalakrishnan
2015-06-10 8:02 ` Borislav Petkov
2015-06-18 10:55 ` [tip:x86/cpu] x86/cpu/amd: Give access to the number of nodes in a physical package tip-bot for Aravind Gopalakrishnan
2015-06-09 16:45 ` [PATCH 2/3] x86, amd: Provide accessor for number of nodes Aravind Gopalakrishnan
2015-06-09 16:45 ` [PATCH 3/3] edac, mce_amd_inj: Inject errors on NBC for bank 4 errors Aravind Gopalakrishnan
2 siblings, 2 replies; 7+ messages in thread
From: Aravind Gopalakrishnan @ 2015-06-09 16:45 UTC (permalink / raw)
To: dougthompson, bp, mchehab
Cc: linux-edac, linux-kernel, x86, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, Borislav Petkov, Jacob Shin, Dave Hansen,
Andy Lutomirski, Paolo Bonzini
Moving this out of function local scope as we want to
allow EDAC to be able to view this topology attribute.
A follow-up patch introduces an accessor function for this variable
that will be used by EDAC's mce_amd_inj module.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
---
arch/x86/kernel/cpu/amd.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 5bd3a99..487083b 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -19,6 +19,13 @@
#include "cpu.h"
+/*
+ * nodes_per_processor: Specifies number of nodes per socket
+ * Refer Fam15h Models 00-0fh BKDG-
+ * CPUID Fn8000_001E_ECX Node Identifiers[10:8]
+ */
+static u32 nodes_per_processor = 1;
+
static inline int rdmsrl_amd_safe(unsigned msr, unsigned long long *p)
{
u32 gprs[8] = { 0 };
@@ -291,7 +298,7 @@ static int nearby_node(int apicid)
#ifdef CONFIG_SMP
static void amd_get_topology(struct cpuinfo_x86 *c)
{
- u32 nodes, cores_per_cu = 1;
+ u32 cores_per_cu = 1;
u8 node_id;
int cpu = smp_processor_id();
@@ -300,7 +307,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
u32 eax, ebx, ecx, edx;
cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
- nodes = ((ecx >> 8) & 7) + 1;
+ nodes_per_processor = ((ecx >> 8) & 7) + 1;
node_id = ecx & 7;
/* get compute unit information */
@@ -311,18 +318,18 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
u64 value;
rdmsrl(MSR_FAM10H_NODE_ID, value);
- nodes = ((value >> 3) & 7) + 1;
+ nodes_per_processor = ((value >> 3) & 7) + 1;
node_id = value & 7;
} else
return;
/* fixup multi-node processor information */
- if (nodes > 1) {
+ if (nodes_per_processor > 1) {
u32 cores_per_node;
u32 cus_per_node;
set_cpu_cap(c, X86_FEATURE_AMD_DCM);
- cores_per_node = c->x86_max_cores / nodes;
+ cores_per_node = c->x86_max_cores / nodes_per_processor;
cus_per_node = cores_per_node / cores_per_cu;
/* store NodeID, use llc_shared_map to store sibling info */
--
2.4.0
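The field extraction in the hunks above can be tried outside the kernel. The following is a minimal userspace sketch, assuming the bit layout the patch itself uses (node count in ECX[10:8] of CPUID Fn8000_001E and in bits [5:3] of MSR_FAM10H_NODE_ID, both stored as count minus one; node ID in the low three bits). The function names are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID Fn8000_001E: ECX[10:8] holds (nodes per processor - 1). */
static uint32_t nodes_from_cpuid_ecx(uint32_t ecx)
{
	uint32_t nodes = ((ecx >> 8) & 7) + 1;

	/* A 3-bit field plus one can only yield 1..8. */
	assert(nodes >= 1 && nodes <= 8);
	return nodes;
}

/* The patch reads the node ID from the low three bits of ECX. */
static uint32_t node_id_from_cpuid_ecx(uint32_t ecx)
{
	return ecx & 7;
}

/* Fam10h path: the MSR value holds (nodes - 1) in bits [5:3]. */
static uint32_t nodes_from_node_id_msr(uint64_t value)
{
	return (uint32_t)((value >> 3) & 7) + 1;
}

/* Fam10h path: node ID in bits [2:0] of the MSR value. */
static uint32_t node_id_from_node_id_msr(uint64_t value)
{
	return (uint32_t)(value & 7);
}
```

Because the stored value is the count minus one, a raw field of zero decodes to a single node, which matches the variable's initial value of 1 in the patch.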
* [PATCH 2/3] x86, amd: Provide accessor for number of nodes
2015-06-09 16:45 [PATCH 0/3] Updates to EDAC mce_amd_inj Aravind Gopalakrishnan
2015-06-09 16:45 ` [PATCH 1/3] x86, amd: Store number of nodes in a static global variable Aravind Gopalakrishnan
@ 2015-06-09 16:45 ` Aravind Gopalakrishnan
2015-06-09 16:45 ` [PATCH 3/3] edac, mce_amd_inj: Inject errors on NBC for bank 4 errors Aravind Gopalakrishnan
2 siblings, 0 replies; 7+ messages in thread
From: Aravind Gopalakrishnan @ 2015-06-09 16:45 UTC (permalink / raw)
To: dougthompson, bp, mchehab
Cc: linux-edac, linux-kernel, x86, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, Borislav Petkov, Jacob Shin, Dave Hansen,
Andy Lutomirski, Paolo Bonzini, Denys Vlasenko,
Hector Marco-Gisbert
Add an accessor function amd_get_nodes_cnt() which returns
the number of nodes per socket.
In a subsequent patch, we will use this info in EDAC
mce_amd_inj module.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
---
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/amd.c | 6 ++++++
2 files changed, 7 insertions(+)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 8e04f51..34faf24 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -820,6 +820,7 @@ static inline int mpx_disable_management(struct task_struct *tsk)
#endif /* CONFIG_X86_INTEL_MPX */
extern u16 amd_get_nb_id(int cpu);
+extern u32 amd_get_nodes_cnt(void);
static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
{
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 487083b..788655a 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -373,6 +373,12 @@ u16 amd_get_nb_id(int cpu)
}
EXPORT_SYMBOL_GPL(amd_get_nb_id);
+u32 amd_get_nodes_cnt(void)
+{
+ return nodes_per_processor;
+}
+EXPORT_SYMBOL_GPL(amd_get_nodes_cnt);
+
static void srat_detect_node(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_NUMA
--
2.4.0
* [PATCH 3/3] edac, mce_amd_inj: Inject errors on NBC for bank 4 errors
2015-06-09 16:45 [PATCH 0/3] Updates to EDAC mce_amd_inj Aravind Gopalakrishnan
2015-06-09 16:45 ` [PATCH 1/3] x86, amd: Store number of nodes in a static global variable Aravind Gopalakrishnan
2015-06-09 16:45 ` [PATCH 2/3] x86, amd: Provide accessor for number of nodes Aravind Gopalakrishnan
@ 2015-06-09 16:45 ` Aravind Gopalakrishnan
2015-06-10 8:25 ` Borislav Petkov
2 siblings, 1 reply; 7+ messages in thread
From: Aravind Gopalakrishnan @ 2015-06-09 16:45 UTC (permalink / raw)
To: dougthompson, bp, mchehab; +Cc: linux-edac, linux-kernel, x86
For bank 4 errors, MCE is logged and reported only on
node base cores. Refer D18F3x44[NbMcaToMstCpuEn] field in
Fam10h and later BKDGs.
This patch ensures that we inject the error on the node base core
for bank 4 errors. Otherwise, triggering #MC or apic interrupts on
a non node base core would not have any effect on the system.
(i.e), we would not see any relevant output on kernel logs for
the error we just injected.
Update copyrights info while at it.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
---
drivers/edac/mce_amd_inj.c | 57 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 56 insertions(+), 1 deletion(-)
diff --git a/drivers/edac/mce_amd_inj.c b/drivers/edac/mce_amd_inj.c
index 3e1b53f..af55d49 100644
--- a/drivers/edac/mce_amd_inj.c
+++ b/drivers/edac/mce_amd_inj.c
@@ -6,7 +6,7 @@
* This file may be distributed under the terms of the GNU General Public
* License version 2.
*
- * Copyright (c) 2010-14: Borislav Petkov <bp@alien8.de>
+ * Copyright (c) 2010-15: Borislav Petkov <bp@alien8.de>
* Advanced Micro Devices Inc.
*/
@@ -17,10 +17,13 @@
#include <linux/cpu.h>
#include <linux/string.h>
#include <linux/uaccess.h>
+#include <linux/pci.h>
#include <asm/mce.h>
#include <asm/irq_vectors.h>
+#include <asm/amd_nb.h>
#include "mce_amd.h"
+#include "amd64_edac.h"
/*
* Collect all the MCi_XXX settings
@@ -200,6 +203,44 @@ static void trigger_thr_int(void *info)
asm volatile("int %0" :: "i" (THRESHOLD_APIC_VECTOR));
}
+static u32 amd_get_nbc_for_node(int node_id)
+{
+ struct cpuinfo_x86 *c = &boot_cpu_data;
+ u32 cores_per_node;
+
+ cores_per_node = c->x86_max_cores / amd_get_nodes_cnt();
+
+ return cores_per_node * node_id;
+}
+
+static void toggle_nb_mca_mst_cpu(u16 nid)
+{
+ struct pci_dev *F3 = node_to_amd_nb(nid)->misc;
+ u32 val;
+ int err;
+
+ if (!F3)
+ return;
+
+ err = pci_read_config_dword(F3, NBCFG, &val);
+ if (err) {
+ pr_err("%s: Error reading F%dx%03x.\n", __func__,
+ PCI_FUNC(F3->devfn),
+ NBCFG);
+ return;
+ }
+
+ if (!(val & BIT(27))) {
+ pr_err("%s: BIOS not setting D18F3x44[NbMcaToMstCpuEn]. Doing that here\n", __func__);
+ val |= BIT(27);
+ err = pci_write_config_dword(F3, NBCFG, val);
+ if (err)
+ pr_err("%s: Error writing F%dx%03x.\n", __func__,
+ PCI_FUNC(F3->devfn),
+ NBCFG);
+ }
+}
+
static void do_inject(void)
{
u64 mcg_status = 0;
@@ -235,6 +276,20 @@ static void do_inject(void)
if (!(i_mce.status & MCI_STATUS_PCC))
mcg_status |= MCG_STATUS_RIPV;
+ /*
+ * For multi node cpus, logging and reporting of bank == 4 errors
+ * happen only on the node base core. Refer D18F3x44[NbMcaToMstCpuEn]
+ * for Fam10h and later BKDGs
+ */
+ if (static_cpu_has(X86_FEATURE_AMD_DCM) && b == 4) {
+ /*
+ * BIOS sets D18F3x44[NbMcaToMstCpuEn] by default.
+ * But make sure of it here just in case..
+ */
+ toggle_nb_mca_mst_cpu(amd_get_nb_id(cpu));
+ cpu = amd_get_nbc_for_node(amd_get_nb_id(cpu));
+ }
+
toggle_hw_mce_inject(cpu, true);
wrmsr_on_cpu(cpu, MSR_IA32_MCG_STATUS,
--
2.4.0
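The arithmetic in amd_get_nbc_for_node() can be checked in isolation. This is a sketch only, assuming (as the patch does) that cores are numbered contiguously, node by node, so the first core of a node is the node base core. The function name and parameters are hypothetical stand-ins for c->x86_max_cores, amd_get_nodes_cnt() and amd_get_nb_id(cpu).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the lowest-numbered core of node_id, assuming contiguous,
 * node-by-node core numbering (the same assumption the patch makes).
 */
static uint32_t nbc_for_node(uint32_t max_cores, uint32_t nodes_per_socket,
			     uint32_t node_id)
{
	uint32_t cores_per_node;

	/* The node count is at least 1, so the division below is safe. */
	assert(nodes_per_socket > 0);
	cores_per_node = max_cores / nodes_per_socket;

	return cores_per_node * node_id;
}
```

For a package with 16 cores split across 2 nodes, node 1 maps to core 8, so a bank 4 injection requested for any of cores 8 through 15 would be redirected to core 8.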
* Re: [PATCH 1/3] x86, amd: Store number of nodes in a static global variable
2015-06-09 16:45 ` [PATCH 1/3] x86, amd: Store number of nodes in a static global variable Aravind Gopalakrishnan
@ 2015-06-10 8:02 ` Borislav Petkov
2015-06-18 10:55 ` [tip:x86/cpu] x86/cpu/amd: Give access to the number of nodes in a physical package tip-bot for Aravind Gopalakrishnan
1 sibling, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2015-06-10 8:02 UTC (permalink / raw)
To: Aravind Gopalakrishnan
Cc: dougthompson, mchehab, linux-edac, linux-kernel, x86,
Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Borislav Petkov,
Jacob Shin, Dave Hansen, Andy Lutomirski, Paolo Bonzini
On Tue, Jun 09, 2015 at 11:45:15AM -0500, Aravind Gopalakrishnan wrote:
> Moving this out of function local scope as we want to
> allow EDAC to be able to view this topology attribute.
>
> A follow-up patch introduces an accessor function for this variable
> that will be used by EDAC's mce_amd_inj module.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Jacob Shin <jacob.w.shin@gmail.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
> ---
> arch/x86/kernel/cpu/amd.c | 17 ++++++++++++-----
> 1 file changed, 12 insertions(+), 5 deletions(-)
I did some cleanup and merged it with the next patch. It seems simple
enough, and splitting it was not such a good idea after all. I also renamed
the variable to nodes_per_socket to avoid confusion.
---
From: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Date: Tue, 9 Jun 2015 11:45:15 -0500
Subject: [PATCH] x86/cpu/amd: Give access to the number of nodes in a physical
package
Stash the number of nodes in a physical processor package locally and
add an accessor to be called by interested parties. The first user is
the MCE injection module which uses it to find the node base core in a
package for injecting a certain type of errors.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mchehab@osg.samsung.com
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1433868317-18417-2-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Rewrite commit message, merge with the accessor patch and unify naming. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/amd.c | 23 ++++++++++++++++++-----
2 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 23ba6765b718..9aa52fd13a78 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -946,6 +946,7 @@ static inline int mpx_disable_management(struct task_struct *tsk)
#endif /* CONFIG_X86_INTEL_MPX */
extern u16 amd_get_nb_id(int cpu);
+extern u32 amd_get_nodes_per_socket(void);
static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
{
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 94e7051fba1a..56cae1964a81 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -19,6 +19,13 @@
#include "cpu.h"
+/*
+ * nodes_per_socket: Stores the number of nodes per socket.
+ * Refer to Fam15h Models 00-0fh BKDG - CPUID Fn8000_001E_ECX
+ * Node Identifiers[10:8]
+ */
+static u32 nodes_per_socket = 1;
+
static inline int rdmsrl_amd_safe(unsigned msr, unsigned long long *p)
{
u32 gprs[8] = { 0 };
@@ -291,7 +298,7 @@ static int nearby_node(int apicid)
#ifdef CONFIG_X86_HT
static void amd_get_topology(struct cpuinfo_x86 *c)
{
- u32 nodes, cores_per_cu = 1;
+ u32 cores_per_cu = 1;
u8 node_id;
int cpu = smp_processor_id();
@@ -300,7 +307,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
u32 eax, ebx, ecx, edx;
cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
- nodes = ((ecx >> 8) & 7) + 1;
+ nodes_per_socket = ((ecx >> 8) & 7) + 1;
node_id = ecx & 7;
/* get compute unit information */
@@ -311,18 +318,18 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
u64 value;
rdmsrl(MSR_FAM10H_NODE_ID, value);
- nodes = ((value >> 3) & 7) + 1;
+ nodes_per_socket = ((value >> 3) & 7) + 1;
node_id = value & 7;
} else
return;
/* fixup multi-node processor information */
- if (nodes > 1) {
+ if (nodes_per_socket > 1) {
u32 cores_per_node;
u32 cus_per_node;
set_cpu_cap(c, X86_FEATURE_AMD_DCM);
- cores_per_node = c->x86_max_cores / nodes;
+ cores_per_node = c->x86_max_cores / nodes_per_socket;
cus_per_node = cores_per_node / cores_per_cu;
/* store NodeID, use llc_shared_map to store sibling info */
@@ -366,6 +373,12 @@ u16 amd_get_nb_id(int cpu)
}
EXPORT_SYMBOL_GPL(amd_get_nb_id);
+u32 amd_get_nodes_per_socket(void)
+{
+ return nodes_per_socket;
+}
+EXPORT_SYMBOL_GPL(amd_get_nodes_per_socket);
+
static void srat_detect_node(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_NUMA
--
2.3.5
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
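The shape of the merged patch (a file-scope static with a safe default, written only by topology detection and exposed through a read-only accessor) can be sketched in userspace as follows. The names mirror the patch but are illustrative only; detect_topology() is a hypothetical stand-in for amd_get_topology().

```c
#include <assert.h>
#include <stdint.h>

/* Defaults to 1 so callers see a sane value before detection runs. */
static uint32_t nodes_per_socket = 1;

/* Stand-in for topology detection: the only writer of the variable. */
static void detect_topology(uint32_t ecx)
{
	nodes_per_socket = ((ecx >> 8) & 7) + 1;
}

/* Read-only accessor; other code never touches the variable directly. */
static uint32_t get_nodes_per_socket(void)
{
	/* Never zero, so callers may divide by the result. */
	assert(nodes_per_socket >= 1);
	return nodes_per_socket;
}
```

Keeping the variable static and exporting only the getter means a consumer such as the injection module can read the count but cannot modify it.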
* Re: [PATCH 3/3] edac, mce_amd_inj: Inject errors on NBC for bank 4 errors
2015-06-09 16:45 ` [PATCH 3/3] edac, mce_amd_inj: Inject errors on NBC for bank 4 errors Aravind Gopalakrishnan
@ 2015-06-10 8:25 ` Borislav Petkov
0 siblings, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2015-06-10 8:25 UTC (permalink / raw)
To: Aravind Gopalakrishnan
Cc: dougthompson, mchehab, linux-edac, linux-kernel, x86
On Tue, Jun 09, 2015 at 11:45:17AM -0500, Aravind Gopalakrishnan wrote:
> For bank 4 errors, MCE is logged and reported only on
> node base cores. Refer D18F3x44[NbMcaToMstCpuEn] field in
> Fam10h and later BKDGs.
>
> This patch ensures that we inject the error on the node base core
> for bank 4 errors. Otherwise, triggering #MC or apic interrupts on
> a non node base core would not have any effect on the system.
> (i.e), we would not see any relevant output on kernel logs for
> the error we just injected.
>
> Update copyrights info while at it.
>
> Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
> ---
> drivers/edac/mce_amd_inj.c | 57 +++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 56 insertions(+), 1 deletion(-)
Applied, did some cleanups too, see below.
It builds, now to do some testing :-)
---
From: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Date: Tue, 9 Jun 2015 11:45:17 -0500
Subject: [PATCH] EDAC, mce_amd_inj: Inject errors on NBC for bank 4 errors
Bank 4 MCEs are logged and reported only on the node base core (NBC) in
a socket. Refer to the D18F3x44[NbMcaToMstCpuEn] field in Fam10h and
later BKDGs. The node base core (NBC) is the lowest numbered core in the
node.
This patch ensures that we inject the error on the NBC for bank 4
errors. Otherwise, triggering #MC or APIC interrupts on a core which is
not the NBC would not have any effect on the system, i.e., we would not
see any relevant output in the kernel logs for the error we just injected.
Update copyrights info while at it.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mchehab@osg.samsung.com
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1433868317-18417-4-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Massage commit message, save an indentation level in toggle_nb_mca_mst_cpu,
reflow comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
drivers/edac/mce_amd_inj.c | 59 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 57 insertions(+), 2 deletions(-)
diff --git a/drivers/edac/mce_amd_inj.c b/drivers/edac/mce_amd_inj.c
index 3e1b53fb8f25..012b61a33a60 100644
--- a/drivers/edac/mce_amd_inj.c
+++ b/drivers/edac/mce_amd_inj.c
@@ -6,7 +6,7 @@
* This file may be distributed under the terms of the GNU General Public
* License version 2.
*
- * Copyright (c) 2010-14: Borislav Petkov <bp@alien8.de>
+ * Copyright (c) 2010-15: Borislav Petkov <bp@alien8.de>
* Advanced Micro Devices Inc.
*/
@@ -17,10 +17,13 @@
#include <linux/cpu.h>
#include <linux/string.h>
#include <linux/uaccess.h>
+#include <linux/pci.h>
#include <asm/mce.h>
#include <asm/irq_vectors.h>
+#include <asm/amd_nb.h>
#include "mce_amd.h"
+#include "amd64_edac.h"
/*
* Collect all the MCi_XXX settings
@@ -200,6 +203,45 @@ static void trigger_thr_int(void *info)
asm volatile("int %0" :: "i" (THRESHOLD_APIC_VECTOR));
}
+static u32 get_nbc_for_node(int node_id)
+{
+ struct cpuinfo_x86 *c = &boot_cpu_data;
+ u32 cores_per_node;
+
+ cores_per_node = c->x86_max_cores / amd_get_nodes_per_socket();
+
+ return cores_per_node * node_id;
+}
+
+static void toggle_nb_mca_mst_cpu(u16 nid)
+{
+ struct pci_dev *F3 = node_to_amd_nb(nid)->misc;
+ u32 val;
+ int err;
+
+ if (!F3)
+ return;
+
+ err = pci_read_config_dword(F3, NBCFG, &val);
+ if (err) {
+ pr_err("%s: Error reading F%dx%03x.\n",
+ __func__, PCI_FUNC(F3->devfn), NBCFG);
+ return;
+ }
+
+ if (val & BIT(27))
+ return;
+
+ pr_err("%s: Set D18F3x44[NbMcaToMstCpuEn] which BIOS hasn't done.\n",
+ __func__);
+
+ val |= BIT(27);
+ err = pci_write_config_dword(F3, NBCFG, val);
+ if (err)
+ pr_err("%s: Error writing F%dx%03x.\n",
+ __func__, PCI_FUNC(F3->devfn), NBCFG);
+}
+
static void do_inject(void)
{
u64 mcg_status = 0;
@@ -220,7 +262,6 @@ static void do_inject(void)
* b. unset MCx_STATUS[UC]
* As deferred errors are _not_ UC
*/
-
i_mce.status |= MCI_STATUS_DEFERRED;
i_mce.status &= ~MCI_STATUS_UC;
}
@@ -235,6 +276,20 @@ static void do_inject(void)
if (!(i_mce.status & MCI_STATUS_PCC))
mcg_status |= MCG_STATUS_RIPV;
+ /*
+ * For multi node CPUs, logging and reporting of bank 4 errors happens
+ * only on the node base core. Refer to D18F3x44[NbMcaToMstCpuEn] for
+ * Fam10h and later BKDGs.
+ */
+ if (static_cpu_has(X86_FEATURE_AMD_DCM) && b == 4) {
+ /*
+ * BIOS sets D18F3x44[NbMcaToMstCpuEn] by default. But make sure
+ * of it here just in case.
+ */
+ toggle_nb_mca_mst_cpu(amd_get_nb_id(cpu));
+ cpu = get_nbc_for_node(amd_get_nb_id(cpu));
+ }
+
toggle_hw_mce_inject(cpu, true);
wrmsr_on_cpu(cpu, MSR_IA32_MCG_STATUS,
--
2.3.5
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
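The cleaned-up toggle_nb_mca_mst_cpu() is a read-modify-write with an early return when the bit is already set. A minimal sketch of that control flow, with a plain variable standing in for the PCI config register D18F3x44 (so no pci_read_config_dword() or pci_write_config_dword() is involved), might look like this. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 27 of D18F3x44 is NbMcaToMstCpuEn, per the patch. */
#define NB_MCA_TO_MST_CPU_EN (UINT32_C(1) << 27)

/*
 * Set the enable bit if it is clear. Returns true when a write was
 * needed, false when the register already had the bit set.
 */
static bool ensure_nb_mca_to_mst_cpu(uint32_t *reg)
{
	assert(reg);

	if (*reg & NB_MCA_TO_MST_CPU_EN)
		return false;	/* already enabled, nothing to write */

	*reg |= NB_MCA_TO_MST_CPU_EN;
	return true;
}
```

Returning early keeps the common case (BIOS already set the bit) free of any write and leaves all other bits in the register untouched.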
* [tip:x86/cpu] x86/cpu/amd: Give access to the number of nodes in a physical package
2015-06-09 16:45 ` [PATCH 1/3] x86, amd: Store number of nodes in a static global variable Aravind Gopalakrishnan
2015-06-10 8:02 ` Borislav Petkov
@ 2015-06-18 10:55 ` tip-bot for Aravind Gopalakrishnan
1 sibling, 0 replies; 7+ messages in thread
From: tip-bot for Aravind Gopalakrishnan @ 2015-06-18 10:55 UTC (permalink / raw)
To: linux-tip-commits
Cc: pbonzini, luto, peterz, mingo, akpm, tglx, linux-kernel, luto,
oleg, dvlasenk, brgerst, jacob.w.shin, Aravind.Gopalakrishnan, bp,
torvalds, bp, hpa, dave.hansen, linux-edac
Commit-ID: cc2749e4095cbbcb35518fb2db5e926b85c3f25f
Gitweb: http://git.kernel.org/tip/cc2749e4095cbbcb35518fb2db5e926b85c3f25f
Author: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
AuthorDate: Mon, 15 Jun 2015 10:28:15 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 18 Jun 2015 11:16:06 +0200
x86/cpu/amd: Give access to the number of nodes in a physical package
Stash the number of nodes in a physical processor package
locally and add an accessor to be called by interested parties.
The first user is the MCE injection module which uses it to find
the node base core in a package for injecting a certain type of
errors.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Rewrote the commit message, merged it with the accessor patch and unified naming. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mchehab@osg.samsung.com
Link: http://lkml.kernel.org/r/1433868317-18417-2-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/amd.c | 23 ++++++++++++++++++-----
2 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 23ba676..9aa52fd 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -946,6 +946,7 @@ static inline int mpx_disable_management(struct task_struct *tsk)
#endif /* CONFIG_X86_INTEL_MPX */
extern u16 amd_get_nb_id(int cpu);
+extern u32 amd_get_nodes_per_socket(void);
static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
{
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 94e7051..56cae19 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -19,6 +19,13 @@
#include "cpu.h"
+/*
+ * nodes_per_socket: Stores the number of nodes per socket.
+ * Refer to Fam15h Models 00-0fh BKDG - CPUID Fn8000_001E_ECX
+ * Node Identifiers[10:8]
+ */
+static u32 nodes_per_socket = 1;
+
static inline int rdmsrl_amd_safe(unsigned msr, unsigned long long *p)
{
u32 gprs[8] = { 0 };
@@ -291,7 +298,7 @@ static int nearby_node(int apicid)
#ifdef CONFIG_X86_HT
static void amd_get_topology(struct cpuinfo_x86 *c)
{
- u32 nodes, cores_per_cu = 1;
+ u32 cores_per_cu = 1;
u8 node_id;
int cpu = smp_processor_id();
@@ -300,7 +307,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
u32 eax, ebx, ecx, edx;
cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
- nodes = ((ecx >> 8) & 7) + 1;
+ nodes_per_socket = ((ecx >> 8) & 7) + 1;
node_id = ecx & 7;
/* get compute unit information */
@@ -311,18 +318,18 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
u64 value;
rdmsrl(MSR_FAM10H_NODE_ID, value);
- nodes = ((value >> 3) & 7) + 1;
+ nodes_per_socket = ((value >> 3) & 7) + 1;
node_id = value & 7;
} else
return;
/* fixup multi-node processor information */
- if (nodes > 1) {
+ if (nodes_per_socket > 1) {
u32 cores_per_node;
u32 cus_per_node;
set_cpu_cap(c, X86_FEATURE_AMD_DCM);
- cores_per_node = c->x86_max_cores / nodes;
+ cores_per_node = c->x86_max_cores / nodes_per_socket;
cus_per_node = cores_per_node / cores_per_cu;
/* store NodeID, use llc_shared_map to store sibling info */
@@ -366,6 +373,12 @@ u16 amd_get_nb_id(int cpu)
}
EXPORT_SYMBOL_GPL(amd_get_nb_id);
+u32 amd_get_nodes_per_socket(void)
+{
+ return nodes_per_socket;
+}
+EXPORT_SYMBOL_GPL(amd_get_nodes_per_socket);
+
static void srat_detect_node(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_NUMA