[PATCH 0/7] RAS pile

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/7] RAS pile
@ 2017-05-19  9:39 Borislav Petkov
  2017-05-19  9:39 ` [PATCH 1/7] x86/MCE: Export memory_error() Borislav Petkov
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Borislav Petkov @ 2017-05-19  9:39 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML

From: Borislav Petkov <bp@suse.de>

Hi guys,

here's a bunch of assorted RAS fixes. Please queue the first two for
urgent as otherwise nfit won't be able to report memory errors properly.

Thanks.

Borislav Petkov (1):
  x86/MCE: Export memory_error()

Elena Reshetova (1):
  x86/mce: Convert threshold_bank.cpus from atomic_t to refcount_t

Shiju Jose (1):
  ACPI/APEI: Handle GSIV and GPIO notification types

Vishal Verma (1):
  acpi, nfit: Fix the memory error check in nfit_handle_mce()

Wei Yongjun (1):
  RAS: Make local function parse_ras_param() static

Yazen Ghannam (2):
  x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers
  x86/mce/AMD: Carve out SMCA bank configuration

 arch/x86/include/asm/amd_nb.h        |   3 +-
 arch/x86/include/asm/mce.h           |   1 +
 arch/x86/kernel/cpu/mcheck/mce.c     |  13 +-
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 229 ++++++++++++++++++-----------------
 drivers/acpi/apei/ghes.c             |  39 +++---
 drivers/acpi/nfit/mce.c              |   2 +-
 drivers/ras/ras.c                    |   2 +-
 7 files changed, 151 insertions(+), 138 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/7] x86/MCE: Export memory_error()
  2017-05-19  9:39 [PATCH 0/7] RAS pile Borislav Petkov
@ 2017-05-19  9:39 ` Borislav Petkov
  2017-05-21 19:43   ` [tip:ras/urgent] " tip-bot for Borislav Petkov
  2017-05-19  9:39 ` [PATCH 2/7] acpi, nfit: Fix the memory error check in nfit_handle_mce() Borislav Petkov
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Borislav Petkov @ 2017-05-19  9:39 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML

From: Borislav Petkov <bp@suse.de>

Export the function which checks whether an MCE is a memory error to
other users so that we can reuse the logic. Drop the boot_cpu_data use,
while at it, as mce.cpuvendor already has the CPU vendor in there.

Integrate a piece from a patch from Vishal Verma
<vishal.l.verma@intel.com> to export it for modules (nfit).

The main reason we're exporting it is that the nfit handler
nfit_handle_mce() needs to detect a memory error properly before doing
its recovery actions.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
---
 arch/x86/include/asm/mce.h       |  1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 13 ++++++-------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 4fd5195deed0..3f9a3d2a5209 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -266,6 +266,7 @@ static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *s
 #endif
 
 int mce_available(struct cpuinfo_x86 *c);
+bool mce_is_memory_error(struct mce *m);
 
 DECLARE_PER_CPU(unsigned, mce_exception_count);
 DECLARE_PER_CPU(unsigned, mce_poll_count);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 5abd4bf73d6e..5cfbaeb6529a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -499,16 +499,14 @@ static int mce_usable_address(struct mce *m)
 	return 1;
 }
 
-static bool memory_error(struct mce *m)
+bool mce_is_memory_error(struct mce *m)
 {
-	struct cpuinfo_x86 *c = &boot_cpu_data;
-
-	if (c->x86_vendor == X86_VENDOR_AMD) {
+	if (m->cpuvendor == X86_VENDOR_AMD) {
 		/* ErrCodeExt[20:16] */
 		u8 xec = (m->status >> 16) & 0x1f;
 
 		return (xec == 0x0 || xec == 0x8);
-	} else if (c->x86_vendor == X86_VENDOR_INTEL) {
+	} else if (m->cpuvendor == X86_VENDOR_INTEL) {
 		/*
 		 * Intel SDM Volume 3B - 15.9.2 Compound Error Codes
 		 *
@@ -529,6 +527,7 @@ static bool memory_error(struct mce *m)
 
 	return false;
 }
+EXPORT_SYMBOL_GPL(mce_is_memory_error);
 
 static bool cec_add_mce(struct mce *m)
 {
@@ -536,7 +535,7 @@ static bool cec_add_mce(struct mce *m)
 		return false;
 
 	/* We eat only correctable DRAM errors with usable addresses. */
-	if (memory_error(m) &&
+	if (mce_is_memory_error(m) &&
 	    !(m->status & MCI_STATUS_UC) &&
 	    mce_usable_address(m))
 		if (!cec_add_elem(m->addr >> PAGE_SHIFT))
@@ -713,7 +712,7 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 
 		severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
 
-		if (severity == MCE_DEFERRED_SEVERITY && memory_error(&m))
+		if (severity == MCE_DEFERRED_SEVERITY && mce_is_memory_error(&m))
 			if (m.status & MCI_STATUS_ADDRV)
 				m.severity = severity;
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/7] acpi, nfit: Fix the memory error check in nfit_handle_mce()
  2017-05-19  9:39 [PATCH 0/7] RAS pile Borislav Petkov
  2017-05-19  9:39 ` [PATCH 1/7] x86/MCE: Export memory_error() Borislav Petkov
@ 2017-05-19  9:39 ` Borislav Petkov
  2017-05-21 19:43   ` [tip:ras/urgent] " tip-bot for Vishal Verma
  2017-05-19  9:39 ` [PATCH 3/7] ACPI/APEI: Handle GSIV and GPIO notification types Borislav Petkov
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Borislav Petkov @ 2017-05-19  9:39 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML

From: Vishal Verma <vishal.l.verma@intel.com>

The check for an MCE being a memory error in the NFIT mce handler was
bogus. Use the new mce_is_memory_error() helper to detect the error
properly.

Reported-by: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/acpi/nfit/mce.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index 3ba1c3472cf9..fd86bec98dea 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -26,7 +26,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
 	struct nfit_spa *nfit_spa;
 
 	/* We only care about memory errors */
-	if (!(mce->status & MCACOD))
+	if (!mce_is_memory_error(mce))
 		return NOTIFY_DONE;
 
 	/*
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/7] ACPI/APEI: Handle GSIV and GPIO notification types
  2017-05-19  9:39 [PATCH 0/7] RAS pile Borislav Petkov
  2017-05-19  9:39 ` [PATCH 1/7] x86/MCE: Export memory_error() Borislav Petkov
  2017-05-19  9:39 ` [PATCH 2/7] acpi, nfit: Fix the memory error check in nfit_handle_mce() Borislav Petkov
@ 2017-05-19  9:39 ` Borislav Petkov
  2017-05-19  9:39 ` [PATCH 4/7] RAS: Make local function parse_ras_param() static Borislav Petkov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Borislav Petkov @ 2017-05-19  9:39 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML

From: Shiju Jose <shiju.jose@huawei.com>

System Controller Interrupts are received by ACPI's error device, which
in turn notifies the GHES code. The same is true of APEI's GSIV and
GPIO notification types. Add support for GSIV and GPIO sharing the SCI
register/unregister/notifier code. Rename the list and notifier to show
this is no longer just SCI, but anything from the Hardware Error Device.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
[ Rewrite commit log. ]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Cc: "Guohanjun (Hanjun Guo)" <guohanjun@huawei.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: "Zhengqiang (turing)" <zhengqiang10@huawei.com>
Cc: "fu.wei@linaro.org" <fu.wei@linaro.org>
Cc: "xuwei (O)" <xuwei5@hisilicon.com>
Cc: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Cc: Geliang Tang <geliangtang@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: linux-acpi@vger.kernel.org
Link: http://lkml.kernel.org/r/86258A5CC0A3704780874CF6004BA8A62E695201@FRAEML521-MBX.china.huawei.com
[ Some small cleanups ontop. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/acpi/apei/ghes.c | 39 +++++++++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 14 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d0855c09f32f..d2c8a9286fa8 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -89,14 +89,14 @@ bool ghes_disable;
 module_param_named(disable, ghes_disable, bool, 0);
 
 /*
- * All error sources notified with SCI shares one notifier function,
- * so they need to be linked and checked one by one.  This is applied
- * to NMI too.
+ * All error sources notified with HED (Hardware Error Device) share a
+ * single notifier callback, so they need to be linked and checked one
+ * by one. This holds true for NMI too.
  *
  * RCU is used for these lists, so ghes_list_mutex is only used for
  * list changing, not for traversing.
  */
-static LIST_HEAD(ghes_sci);
+static LIST_HEAD(ghes_hed);
 static DEFINE_MUTEX(ghes_list_mutex);
 
 /*
@@ -702,14 +702,14 @@ static irqreturn_t ghes_irq_func(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
-static int ghes_notify_sci(struct notifier_block *this,
-				  unsigned long event, void *data)
+static int ghes_notify_hed(struct notifier_block *this, unsigned long event,
+			   void *data)
 {
 	struct ghes *ghes;
 	int ret = NOTIFY_DONE;
 
 	rcu_read_lock();
-	list_for_each_entry_rcu(ghes, &ghes_sci, list) {
+	list_for_each_entry_rcu(ghes, &ghes_hed, list) {
 		if (!ghes_proc(ghes))
 			ret = NOTIFY_OK;
 	}
@@ -718,8 +718,8 @@ static int ghes_notify_sci(struct notifier_block *this,
 	return ret;
 }
 
-static struct notifier_block ghes_notifier_sci = {
-	.notifier_call = ghes_notify_sci,
+static struct notifier_block ghes_notifier_hed = {
+	.notifier_call = ghes_notify_hed,
 };
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
@@ -966,7 +966,10 @@ static int ghes_probe(struct platform_device *ghes_dev)
 	case ACPI_HEST_NOTIFY_POLLED:
 	case ACPI_HEST_NOTIFY_EXTERNAL:
 	case ACPI_HEST_NOTIFY_SCI:
+	case ACPI_HEST_NOTIFY_GSIV:
+	case ACPI_HEST_NOTIFY_GPIO:
 		break;
+
 	case ACPI_HEST_NOTIFY_NMI:
 		if (!IS_ENABLED(CONFIG_HAVE_ACPI_APEI_NMI)) {
 			pr_warn(GHES_PFX "Generic hardware error source: %d notified via NMI interrupt is not supported!\n",
@@ -1024,13 +1027,17 @@ static int ghes_probe(struct platform_device *ghes_dev)
 			goto err_edac_unreg;
 		}
 		break;
+
 	case ACPI_HEST_NOTIFY_SCI:
+	case ACPI_HEST_NOTIFY_GSIV:
+	case ACPI_HEST_NOTIFY_GPIO:
 		mutex_lock(&ghes_list_mutex);
-		if (list_empty(&ghes_sci))
-			register_acpi_hed_notifier(&ghes_notifier_sci);
-		list_add_rcu(&ghes->list, &ghes_sci);
+		if (list_empty(&ghes_hed))
+			register_acpi_hed_notifier(&ghes_notifier_hed);
+		list_add_rcu(&ghes->list, &ghes_hed);
 		mutex_unlock(&ghes_list_mutex);
 		break;
+
 	case ACPI_HEST_NOTIFY_NMI:
 		ghes_nmi_add(ghes);
 		break;
@@ -1066,14 +1073,18 @@ static int ghes_remove(struct platform_device *ghes_dev)
 	case ACPI_HEST_NOTIFY_EXTERNAL:
 		free_irq(ghes->irq, ghes);
 		break;
+
 	case ACPI_HEST_NOTIFY_SCI:
+	case ACPI_HEST_NOTIFY_GSIV:
+	case ACPI_HEST_NOTIFY_GPIO:
 		mutex_lock(&ghes_list_mutex);
 		list_del_rcu(&ghes->list);
-		if (list_empty(&ghes_sci))
-			unregister_acpi_hed_notifier(&ghes_notifier_sci);
+		if (list_empty(&ghes_hed))
+			unregister_acpi_hed_notifier(&ghes_notifier_hed);
 		mutex_unlock(&ghes_list_mutex);
 		synchronize_rcu();
 		break;
+
 	case ACPI_HEST_NOTIFY_NMI:
 		ghes_nmi_remove(ghes);
 		break;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/7] RAS: Make local function parse_ras_param() static
  2017-05-19  9:39 [PATCH 0/7] RAS pile Borislav Petkov
                   ` (2 preceding siblings ...)
  2017-05-19  9:39 ` [PATCH 3/7] ACPI/APEI: Handle GSIV and GPIO notification types Borislav Petkov
@ 2017-05-19  9:39 ` Borislav Petkov
  2017-05-19  9:39 ` [PATCH 5/7] x86/mce: Convert threshold_bank.cpus from atomic_t to refcount_t Borislav Petkov
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Borislav Petkov @ 2017-05-19  9:39 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML

From: Wei Yongjun <weiyongjun1@huawei.com>

Make parse_ras_param() static as it is used locally only.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: http://lkml.kernel.org/r/20170516161034.2973-1-weiyj.lk@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/ras/ras.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index 94f8038864b4..ed4c343d08c4 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -29,7 +29,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(extlog_mem_event);
 EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
 
 
-int __init parse_ras_param(char *str)
+static int __init parse_ras_param(char *str)
 {
 #ifdef CONFIG_RAS_CEC
 	parse_cec_param(str);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/7] x86/mce: Convert threshold_bank.cpus from atomic_t to refcount_t
  2017-05-19  9:39 [PATCH 0/7] RAS pile Borislav Petkov
                   ` (3 preceding siblings ...)
  2017-05-19  9:39 ` [PATCH 4/7] RAS: Make local function parse_ras_param() static Borislav Petkov
@ 2017-05-19  9:39 ` Borislav Petkov
  2017-05-19  9:39 ` [PATCH 6/7] x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers Borislav Petkov
  2017-05-19  9:39 ` [PATCH 7/7] x86/mce/AMD: Carve out SMCA bank configuration Borislav Petkov
  6 siblings, 0 replies; 10+ messages in thread
From: Borislav Petkov @ 2017-05-19  9:39 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML

From: Elena Reshetova <elena.reshetova@intel.com>

The refcount_t type and corresponding API should be used instead
of atomic_t when the variable is used as a reference counter. This
allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.

Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1492695536-5947-1-git-send-email-elena.reshetova@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/amd_nb.h        | 3 ++-
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
index 00c88a01301d..da181ad1d5f8 100644
--- a/arch/x86/include/asm/amd_nb.h
+++ b/arch/x86/include/asm/amd_nb.h
@@ -3,6 +3,7 @@
 
 #include <linux/ioport.h>
 #include <linux/pci.h>
+#include <linux/refcount.h>
 
 struct amd_nb_bus_dev_range {
 	u8 bus;
@@ -55,7 +56,7 @@ struct threshold_bank {
 	struct threshold_block	*blocks;
 
 	/* initialized to the number of CPUs on the node sharing this bank */
-	atomic_t		cpus;
+	refcount_t		cpus;
 };
 
 struct amd_northbridge {
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 6e4a047e4b68..41439ab41102 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -1202,7 +1202,7 @@ static int threshold_create_bank(unsigned int cpu, unsigned int bank)
 				goto out;
 
 			per_cpu(threshold_banks, cpu)[bank] = b;
-			atomic_inc(&b->cpus);
+			refcount_inc(&b->cpus);
 
 			err = __threshold_add_blocks(b);
 
@@ -1225,7 +1225,7 @@ static int threshold_create_bank(unsigned int cpu, unsigned int bank)
 	per_cpu(threshold_banks, cpu)[bank] = b;
 
 	if (is_shared_bank(bank)) {
-		atomic_set(&b->cpus, 1);
+		refcount_set(&b->cpus, 1);
 
 		/* nb is already initialized, see above */
 		if (nb) {
@@ -1289,7 +1289,7 @@ static void threshold_remove_bank(unsigned int cpu, int bank)
 		goto free_out;
 
 	if (is_shared_bank(bank)) {
-		if (!atomic_dec_and_test(&b->cpus)) {
+		if (!refcount_dec_and_test(&b->cpus)) {
 			__threshold_remove_blocks(b);
 			per_cpu(threshold_banks, cpu)[bank] = NULL;
 			return;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/7] x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers
  2017-05-19  9:39 [PATCH 0/7] RAS pile Borislav Petkov
                   ` (4 preceding siblings ...)
  2017-05-19  9:39 ` [PATCH 5/7] x86/mce: Convert threshold_bank.cpus from atomic_t to refcount_t Borislav Petkov
@ 2017-05-19  9:39 ` Borislav Petkov
  2017-05-19  9:39 ` [PATCH 7/7] x86/mce/AMD: Carve out SMCA bank configuration Borislav Petkov
  6 siblings, 0 replies; 10+ messages in thread
From: Borislav Petkov @ 2017-05-19  9:39 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML

From: Yazen Ghannam <yazen.ghannam@amd.com>

We have support for the new SMCA MCA_DE{STAT,ADDR} registers in Linux.
So we've used these registers in place of MCA_{STATUS,ADDR} on SMCA
systems.

However, the guidance for current SMCA implementations of is to continue
using MCA_{STATUS,ADDR} and to use MCA_DE{STAT,ADDR} only if a Deferred
error was not found in the former registers. If we logged a Deferred
error in MCA_STATUS then we should also clear MCA_DESTAT. This also
means we shouldn't clear MCA_CONFIG[LogDeferredInMcaStat].

Rework __log_error() to only log an error and add helpers for the
different error types being logged from the corresponding interrupt
handlers.

Boris: carve out common functionality into a _log_error_bank(). Cleanup
comments, check MCi_STATUS bits before reading MSRs. Streamline flow.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1493147772-2721-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 147 ++++++++++++++++++-----------------
 1 file changed, 74 insertions(+), 73 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 41439ab41102..c511fa38ef4e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -472,20 +472,6 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
 		smca_high |= BIT(0);
 
 		/*
-		 * SMCA logs Deferred Error information in MCA_DE{STAT,ADDR}
-		 * registers with the option of additionally logging to
-		 * MCA_{STATUS,ADDR} if MCA_CONFIG[LogDeferredInMcaStat] is set.
-		 *
-		 * This bit is usually set by BIOS to retain the old behavior
-		 * for OSes that don't use the new registers. Linux supports the
-		 * new registers so let's disable that additional logging here.
-		 *
-		 * MCA_CONFIG[LogDeferredInMcaStat] is bit 34 (bit 2 in the high
-		 * portion of the MSR).
-		 */
-		smca_high &= ~BIT(2);
-
-		/*
 		 * SMCA sets the Deferred Error Interrupt type per bank.
 		 *
 		 * MCA_CONFIG[DeferredIntTypeSupported] is bit 5, and tells us
@@ -755,37 +741,19 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
 }
 EXPORT_SYMBOL_GPL(umc_normaddr_to_sysaddr);
 
-static void
-__log_error(unsigned int bank, bool deferred_err, bool threshold_err, u64 misc)
+static void __log_error(unsigned int bank, u64 status, u64 addr, u64 misc)
 {
-	u32 msr_status = msr_ops.status(bank);
-	u32 msr_addr = msr_ops.addr(bank);
 	struct mce m;
-	u64 status;
-
-	WARN_ON_ONCE(deferred_err && threshold_err);
-
-	if (deferred_err && mce_flags.smca) {
-		msr_status = MSR_AMD64_SMCA_MCx_DESTAT(bank);
-		msr_addr = MSR_AMD64_SMCA_MCx_DEADDR(bank);
-	}
-
-	rdmsrl(msr_status, status);
-
-	if (!(status & MCI_STATUS_VAL))
-		return;
 
 	mce_setup(&m);
 
 	m.status = status;
+	m.misc   = misc;
 	m.bank   = bank;
 	m.tsc	 = rdtsc();
 
-	if (threshold_err)
-		m.misc = misc;
-
 	if (m.status & MCI_STATUS_ADDRV) {
-		rdmsrl(msr_addr, m.addr);
+		m.addr = addr;
 
 		/*
 		 * Extract [55:<lsb>] where lsb is the least significant
@@ -806,8 +774,6 @@ __log_error(unsigned int bank, bool deferred_err, bool threshold_err, u64 misc)
 	}
 
 	mce_log(&m);
-
-	wrmsrl(msr_status, 0);
 }
 
 static inline void __smp_deferred_error_interrupt(void)
@@ -832,45 +798,85 @@ asmlinkage __visible void __irq_entry smp_trace_deferred_error_interrupt(void)
 	exiting_ack_irq();
 }
 
-/* APIC interrupt handler for deferred errors */
-static void amd_deferred_error_interrupt(void)
+/*
+ * Returns true if the logged error is deferred. False, otherwise.
+ */
+static inline bool
+_log_error_bank(unsigned int bank, u32 msr_stat, u32 msr_addr, u64 misc)
 {
-	unsigned int bank;
-	u32 msr_status;
-	u64 status;
+	u64 status, addr = 0;
 
-	for (bank = 0; bank < mca_cfg.banks; ++bank) {
-		msr_status = (mce_flags.smca) ? MSR_AMD64_SMCA_MCx_DESTAT(bank)
-					      : msr_ops.status(bank);
+	rdmsrl(msr_stat, status);
+	if (!(status & MCI_STATUS_VAL))
+		return false;
 
-		rdmsrl(msr_status, status);
+	if (status & MCI_STATUS_ADDRV)
+		rdmsrl(msr_addr, addr);
 
-		if (!(status & MCI_STATUS_VAL) ||
-		    !(status & MCI_STATUS_DEFERRED))
-			continue;
+	__log_error(bank, status, addr, misc);
 
-		__log_error(bank, true, false, 0);
-		break;
-	}
+	wrmsrl(status, 0);
+
+	return status & MCI_STATUS_DEFERRED;
 }
 
 /*
- * APIC Interrupt Handler
+ * We have three scenarios for checking for Deferred errors:
+ *
+ * 1) Non-SMCA systems check MCA_STATUS and log error if found.
+ * 2) SMCA systems check MCA_STATUS. If error is found then log it and also
+ *    clear MCA_DESTAT.
+ * 3) SMCA systems check MCA_DESTAT, if error was not found in MCA_STATUS, and
+ *    log it.
  */
+static void log_error_deferred(unsigned int bank)
+{
+	bool defrd;
+
+	defrd = _log_error_bank(bank, msr_ops.status(bank),
+					msr_ops.addr(bank), 0);
+
+	if (!mce_flags.smca)
+		return;
+
+	/* Clear MCA_DESTAT if we logged the deferred error from MCA_STATUS. */
+	if (defrd) {
+		wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), 0);
+		return;
+	}
+
+	/*
+	 * Only deferred errors are logged in MCA_DE{STAT,ADDR} so just check
+	 * for a valid error.
+	 */
+	_log_error_bank(bank, MSR_AMD64_SMCA_MCx_DESTAT(bank),
+			      MSR_AMD64_SMCA_MCx_DEADDR(bank), 0);
+}
+
+/* APIC interrupt handler for deferred errors */
+static void amd_deferred_error_interrupt(void)
+{
+	unsigned int bank;
+
+	for (bank = 0; bank < mca_cfg.banks; ++bank)
+		log_error_deferred(bank);
+}
+
+static void log_error_thresholding(unsigned int bank, u64 misc)
+{
+	_log_error_bank(bank, msr_ops.status(bank), msr_ops.addr(bank), misc);
+}
 
 /*
- * threshold interrupt handler will service THRESHOLD_APIC_VECTOR.
- * the interrupt goes off when error_count reaches threshold_limit.
- * the handler will simply log mcelog w/ software defined bank number.
+ * Threshold interrupt handler will service THRESHOLD_APIC_VECTOR. The interrupt
+ * goes off when error_count reaches threshold_limit.
  */
-
 static void amd_threshold_interrupt(void)
 {
 	u32 low = 0, high = 0, address = 0;
 	unsigned int bank, block, cpu = smp_processor_id();
 	struct thresh_restart tr;
 
-	/* assume first bank caused it */
 	for (bank = 0; bank < mca_cfg.banks; ++bank) {
 		if (!(per_cpu(bank_map, cpu) & (1 << bank)))
 			continue;
@@ -893,23 +899,18 @@ static void amd_threshold_interrupt(void)
 			     (high & MASK_LOCKED_HI))
 				continue;
 
-			/*
-			 * Log the machine check that caused the threshold
-			 * event.
-			 */
-			if (high & MASK_OVERFLOW_HI)
-				goto log;
-		}
-	}
-	return;
+			if (!(high & MASK_OVERFLOW_HI))
+				continue;
 
-log:
-	__log_error(bank, false, true, ((u64)high << 32) | low);
+			/* Log the MCE which caused the threshold event. */
+			log_error_thresholding(bank, ((u64)high << 32) | low);
 
-	/* Reset threshold block after logging error. */
-	memset(&tr, 0, sizeof(tr));
-	tr.b = &per_cpu(threshold_banks, cpu)[bank]->blocks[block];
-	threshold_restart_bank(&tr);
+			/* Reset threshold block after logging error. */
+			memset(&tr, 0, sizeof(tr));
+			tr.b = &per_cpu(threshold_banks, cpu)[bank]->blocks[block];
+			threshold_restart_bank(&tr);
+		}
+	}
 }
 
 /*
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 7/7] x86/mce/AMD: Carve out SMCA bank configuration
  2017-05-19  9:39 [PATCH 0/7] RAS pile Borislav Petkov
                   ` (5 preceding siblings ...)
  2017-05-19  9:39 ` [PATCH 6/7] x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers Borislav Petkov
@ 2017-05-19  9:39 ` Borislav Petkov
  6 siblings, 0 replies; 10+ messages in thread
From: Borislav Petkov @ 2017-05-19  9:39 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML

From: Yazen Ghannam <yazen.ghannam@amd.com>

Scalable MCA systems have a new MCA_CONFIG register that we use to
configure each bank. We currently use this when we set up thresholding.
However, this is logically separate.

Group all SMCA-related initialization into a single function.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1493147772-2721-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 76 ++++++++++++++++++------------------
 1 file changed, 38 insertions(+), 38 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index c511fa38ef4e..d00f299f2ada 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -164,17 +164,48 @@ static void default_deferred_error_interrupt(void)
 }
 void (*deferred_error_int_vector)(void) = default_deferred_error_interrupt;
 
-static void get_smca_bank_info(unsigned int bank)
+static void smca_configure(unsigned int bank, unsigned int cpu)
 {
-	unsigned int i, hwid_mcatype, cpu = smp_processor_id();
+	unsigned int i, hwid_mcatype;
 	struct smca_hwid *s_hwid;
-	u32 high, instance_id;
+	u32 high, low;
+	u32 smca_config = MSR_AMD64_SMCA_MCx_CONFIG(bank);
+
+	/* Set appropriate bits in MCA_CONFIG */
+	if (!rdmsr_safe(smca_config, &low, &high)) {
+		/*
+		 * OS is required to set the MCAX bit to acknowledge that it is
+		 * now using the new MSR ranges and new registers under each
+		 * bank. It also means that the OS will configure deferred
+		 * errors in the new MCx_CONFIG register. If the bit is not set,
+		 * uncorrectable errors will cause a system panic.
+		 *
+		 * MCA_CONFIG[MCAX] is bit 32 (0 in the high portion of the MSR.)
+		 */
+		high |= BIT(0);
+
+		/*
+		 * SMCA sets the Deferred Error Interrupt type per bank.
+		 *
+		 * MCA_CONFIG[DeferredIntTypeSupported] is bit 5, and tells us
+		 * if the DeferredIntType bit field is available.
+		 *
+		 * MCA_CONFIG[DeferredIntType] is bits [38:37] ([6:5] in the
+		 * high portion of the MSR). OS should set this to 0x1 to enable
+		 * APIC based interrupt. First, check that no interrupt has been
+		 * set.
+		 */
+		if ((low & BIT(5)) && !((high >> 5) & 0x3))
+			high |= BIT(5);
+
+		wrmsr(smca_config, low, high);
+	}
 
 	/* Collect bank_info using CPU 0 for now. */
 	if (cpu)
 		return;
 
-	if (rdmsr_safe_on_cpu(cpu, MSR_AMD64_SMCA_MCx_IPID(bank), &instance_id, &high)) {
+	if (rdmsr_safe_on_cpu(cpu, MSR_AMD64_SMCA_MCx_IPID(bank), &low, &high)) {
 		pr_warn("Failed to read MCA_IPID for bank %d\n", bank);
 		return;
 	}
@@ -191,7 +222,7 @@ static void get_smca_bank_info(unsigned int bank)
 			     smca_get_name(s_hwid->bank_type));
 
 			smca_banks[bank].hwid = s_hwid;
-			smca_banks[bank].id = instance_id;
+			smca_banks[bank].id = low;
 			smca_banks[bank].sysfs_id = s_hwid->count++;
 			break;
 		}
@@ -433,7 +464,7 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
 			int offset, u32 misc_high)
 {
 	unsigned int cpu = smp_processor_id();
-	u32 smca_low, smca_high, smca_addr;
+	u32 smca_low, smca_high;
 	struct threshold_block b;
 	int new;
 
@@ -457,37 +488,6 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
 		goto set_offset;
 	}
 
-	smca_addr = MSR_AMD64_SMCA_MCx_CONFIG(bank);
-
-	if (!rdmsr_safe(smca_addr, &smca_low, &smca_high)) {
-		/*
-		 * OS is required to set the MCAX bit to acknowledge that it is
-		 * now using the new MSR ranges and new registers under each
-		 * bank. It also means that the OS will configure deferred
-		 * errors in the new MCx_CONFIG register. If the bit is not set,
-		 * uncorrectable errors will cause a system panic.
-		 *
-		 * MCA_CONFIG[MCAX] is bit 32 (0 in the high portion of the MSR.)
-		 */
-		smca_high |= BIT(0);
-
-		/*
-		 * SMCA sets the Deferred Error Interrupt type per bank.
-		 *
-		 * MCA_CONFIG[DeferredIntTypeSupported] is bit 5, and tells us
-		 * if the DeferredIntType bit field is available.
-		 *
-		 * MCA_CONFIG[DeferredIntType] is bits [38:37] ([6:5] in the
-		 * high portion of the MSR). OS should set this to 0x1 to enable
-		 * APIC based interrupt. First, check that no interrupt has been
-		 * set.
-		 */
-		if ((smca_low & BIT(5)) && !((smca_high >> 5) & 0x3))
-			smca_high |= BIT(5);
-
-		wrmsr(smca_addr, smca_low, smca_high);
-	}
-
 	/* Gather LVT offset for thresholding: */
 	if (rdmsr_safe(MSR_CU_DEF_ERR, &smca_low, &smca_high))
 		goto out;
@@ -516,7 +516,7 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 
 	for (bank = 0; bank < mca_cfg.banks; ++bank) {
 		if (mce_flags.smca)
-			get_smca_bank_info(bank);
+			smca_configure(bank, cpu);
 
 		for (block = 0; block < NR_BLOCKS; ++block) {
 			address = get_block_address(cpu, address, low, high, bank, block);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [tip:ras/urgent] x86/MCE: Export memory_error()
  2017-05-19  9:39 ` [PATCH 1/7] x86/MCE: Export memory_error() Borislav Petkov
@ 2017-05-21 19:43   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Borislav Petkov @ 2017-05-21 19:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, stable, vishal.l.verma, linux-kernel, mingo, hpa, tony.luck,
	bp

Commit-ID:  2d1f406139ec20320bf38bcd2461aa8e358084b5
Gitweb:     http://git.kernel.org/tip/2d1f406139ec20320bf38bcd2461aa8e358084b5
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Fri, 19 May 2017 11:39:09 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sun, 21 May 2017 21:39:58 +0200

x86/MCE: Export memory_error()

Export the function which checks whether an MCE is a memory error to
other users so that we can reuse the logic. Drop the boot_cpu_data use,
while at it, as mce.cpuvendor already has the CPU vendor in there.

Integrate a piece from a patch from Vishal Verma
<vishal.l.verma@intel.com> to export it for modules (nfit).

The main reason we're exporting it is that the nfit handler
nfit_handle_mce() needs to detect a memory error properly before doing
its recovery actions.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170519093915.15413-2-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/mce.h       |  1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 13 ++++++-------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 4fd5195..3f9a3d2 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -266,6 +266,7 @@ static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *s
 #endif
 
 int mce_available(struct cpuinfo_x86 *c);
+bool mce_is_memory_error(struct mce *m);
 
 DECLARE_PER_CPU(unsigned, mce_exception_count);
 DECLARE_PER_CPU(unsigned, mce_poll_count);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 5abd4bf..5cfbaeb 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -499,16 +499,14 @@ static int mce_usable_address(struct mce *m)
 	return 1;
 }
 
-static bool memory_error(struct mce *m)
+bool mce_is_memory_error(struct mce *m)
 {
-	struct cpuinfo_x86 *c = &boot_cpu_data;
-
-	if (c->x86_vendor == X86_VENDOR_AMD) {
+	if (m->cpuvendor == X86_VENDOR_AMD) {
 		/* ErrCodeExt[20:16] */
 		u8 xec = (m->status >> 16) & 0x1f;
 
 		return (xec == 0x0 || xec == 0x8);
-	} else if (c->x86_vendor == X86_VENDOR_INTEL) {
+	} else if (m->cpuvendor == X86_VENDOR_INTEL) {
 		/*
 		 * Intel SDM Volume 3B - 15.9.2 Compound Error Codes
 		 *
@@ -529,6 +527,7 @@ static bool memory_error(struct mce *m)
 
 	return false;
 }
+EXPORT_SYMBOL_GPL(mce_is_memory_error);
 
 static bool cec_add_mce(struct mce *m)
 {
@@ -536,7 +535,7 @@ static bool cec_add_mce(struct mce *m)
 		return false;
 
 	/* We eat only correctable DRAM errors with usable addresses. */
-	if (memory_error(m) &&
+	if (mce_is_memory_error(m) &&
 	    !(m->status & MCI_STATUS_UC) &&
 	    mce_usable_address(m))
 		if (!cec_add_elem(m->addr >> PAGE_SHIFT))
@@ -713,7 +712,7 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 
 		severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
 
-		if (severity == MCE_DEFERRED_SEVERITY && memory_error(&m))
+		if (severity == MCE_DEFERRED_SEVERITY && mce_is_memory_error(&m))
 			if (m.status & MCI_STATUS_ADDRV)
 				m.severity = severity;
 

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [tip:ras/urgent] acpi, nfit: Fix the memory error check in nfit_handle_mce()
  2017-05-19  9:39 ` [PATCH 2/7] acpi, nfit: Fix the memory error check in nfit_handle_mce() Borislav Petkov
@ 2017-05-21 19:43   ` tip-bot for Vishal Verma
  0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Vishal Verma @ 2017-05-21 19:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, stable, mingo, linux-kernel, vishal.l.verma, tony.luck, tglx,
	bp

Commit-ID:  fc08a4703a418a398bbb575ac311d36d110ac786
Gitweb:     http://git.kernel.org/tip/fc08a4703a418a398bbb575ac311d36d110ac786
Author:     Vishal Verma <vishal.l.verma@intel.com>
AuthorDate: Fri, 19 May 2017 11:39:10 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sun, 21 May 2017 21:39:59 +0200

acpi, nfit: Fix the memory error check in nfit_handle_mce()

The check for an MCE being a memory error in the NFIT mce handler was
bogus. Use the new mce_is_memory_error() helper to detect the error
properly.

Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170519093915.15413-3-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 drivers/acpi/nfit/mce.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index 3ba1c34..fd86bec 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -26,7 +26,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
 	struct nfit_spa *nfit_spa;
 
 	/* We only care about memory errors */
-	if (!(mce->status & MCACOD))
+	if (!mce_is_memory_error(mce))
 		return NOTIFY_DONE;
 
 	/*

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-05-21 19:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-19  9:39 [PATCH 0/7] RAS pile Borislav Petkov
2017-05-19  9:39 ` [PATCH 1/7] x86/MCE: Export memory_error() Borislav Petkov
2017-05-21 19:43   ` [tip:ras/urgent] " tip-bot for Borislav Petkov
2017-05-19  9:39 ` [PATCH 2/7] acpi, nfit: Fix the memory error check in nfit_handle_mce() Borislav Petkov
2017-05-21 19:43   ` [tip:ras/urgent] " tip-bot for Vishal Verma
2017-05-19  9:39 ` [PATCH 3/7] ACPI/APEI: Handle GSIV and GPIO notification types Borislav Petkov
2017-05-19  9:39 ` [PATCH 4/7] RAS: Make local function parse_ras_param() static Borislav Petkov
2017-05-19  9:39 ` [PATCH 5/7] x86/mce: Convert threshold_bank.cpus from atomic_t to refcount_t Borislav Petkov
2017-05-19  9:39 ` [PATCH 6/7] x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers Borislav Petkov
2017-05-19  9:39 ` [PATCH 7/7] x86/mce/AMD: Carve out SMCA bank configuration Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox