* [4.14,19/31] x86/MCE/AMD: Always give panic severity for UC errors in kernel context
  2017-11-19 14:59 [PATCH 4.14 00/31] 4.14.1-stable review Greg Kroah-Hartman
@ 2017-11-19 14:59 ` Greg Kroah-Hartman
  2017-11-19 14:59 ` [PATCH 4.14 03/31] media: imon: Fix null-ptr-deref in imon_probe Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  30 siblings, 0 replies; 52+ messages in thread
From: Greg Kroah-Hartman @ 2017-11-19 14:59 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Yazen Ghannam, Borislav Petkov,
	Linus Torvalds, Peter Zijlstra, Thomas Gleixner, Tony Luck,
	linux-edac, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yazen Ghannam <yazen.ghannam@amd.com>

commit d65dfc81bb3894fdb68cbc74bbf5fb48d2354071 upstream.

The AMD severity grading function was introduced in kernel 4.1. The
current logic can possibly give MCE_AR_SEVERITY for uncorrectable
errors in kernel context. The system may then get stuck in a loop as
memory_failure() will try to handle the bad kernel memory and find it
busy.

Return MCE_PANIC_SEVERITY for all UC errors IN_KERNEL context on AMD
systems.

After:

  b2f9d678e28c ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")

was accepted in v4.6, this issue was masked because of the tail-end attempt
at kernel mode recovery in the #MC handler.

However, uncorrectable errors IN_KERNEL context should always be considered
unrecoverable and cause a panic.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: bf80bbd7dcf5 (x86/mce: Add an AMD severities-grading function)
Link: http://lkml.kernel.org/r/20171106174633.13576-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/mcheck/mce-severity.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -245,6 +245,9 @@ static int mce_severity_amd(struct mce *
 
 	if (m->status & MCI_STATUS_UC) {
 
+		if (ctx == IN_KERNEL)
+			return MCE_PANIC_SEVERITY;
+
 		/*
 		 * On older systems where overflow_recov flag is not present, we
 		 * should simply panic if an error overflow occurs. If
@@ -255,10 +258,6 @@ static int mce_severity_amd(struct mce *
 			if (mce_flags.smca)
 				return mce_severity_amd_smca(m, ctx);
 
-			/* software can try to contain */
-			if (!(m->mcgstatus & MCG_STATUS_RIPV) && (ctx == IN_KERNEL))
-				return MCE_PANIC_SEVERITY;
-
 			/* kill current process */
 			return MCE_AR_SEVERITY;
 		} else {
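
To make the behavioural change concrete, here is a minimal user-space sketch of the grading order before and after the patch. It models only the non-SMCA path with overflow recovery available. All names in it (mce_model, grade_before, grade_after, the SEV_* values and the bit macros) are illustrative stand-ins assumed for this example, not the kernel's own definitions.

```c
#include <stdint.h>

/* Illustrative stand-ins for this sketch; not the kernel's definitions. */
enum context  { IN_KERNEL = 1, IN_USER = 2 };
enum severity { SEV_OTHER, SEV_AR, SEV_PANIC };

#define STATUS_UC (1ULL << 61)  /* assumed position of the "uncorrected" bit */
#define MCG_RIPV  (1ULL << 0)   /* assumed position of "restart IP valid" */

struct mce_model {
	uint64_t status;     /* models the per-bank status word */
	uint64_t mcgstatus;  /* models the global status word */
};

/* Old order: kernel context panics only when RIPV is clear. */
static enum severity grade_before(const struct mce_model *m, enum context ctx)
{
	if (m->status & STATUS_UC) {
		if (!(m->mcgstatus & MCG_RIPV) && ctx == IN_KERNEL)
			return SEV_PANIC;
		return SEV_AR;  /* "action required": kill current process */
	}
	return SEV_OTHER;       /* non-UC grading is collapsed in this sketch */
}

/* New order: any UC error in kernel context panics, before other checks. */
static enum severity grade_after(const struct mce_model *m, enum context ctx)
{
	if (m->status & STATUS_UC) {
		if (ctx == IN_KERNEL)
			return SEV_PANIC;
		return SEV_AR;
	}
	return SEV_OTHER;
}
```

In this model the two functions differ for exactly one input: a UC error in kernel context with RIPV set. The old order grades it as action-required, which is the case the commit message describes as leading to a loop in memory_failure(); the new order grades it as a panic.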

* [4.14,01/31] EDAC, sb_edac: Don't create a second memory controller if HA1 is not present
  2017-11-19 14:59 [PATCH 4.14 00/31] 4.14.1-stable review Greg Kroah-Hartman
@ 2017-11-19 14:59 ` Greg Kroah-Hartman
  2017-11-19 14:59 ` [PATCH 4.14 03/31] media: imon: Fix null-ptr-deref in imon_probe Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  30 siblings, 0 replies; 52+ messages in thread
From: Greg Kroah-Hartman @ 2017-11-19 14:59 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qiuxu Zhuo, Tony Luck, linux-edac,
	Borislav Petkov

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Qiuxu Zhuo <qiuxu.zhuo@intel.com>

commit 15cc3ae001873845b5d842e212478a6570c7d938 upstream.

Yi Zhang reported the following failure on a 2-socket Haswell (E5-2603v3)
server (DELL PowerEdge 730xd):

  EDAC sbridge: Some needed devices are missing
  EDAC MC: Removed device 0 for sb_edac.c Haswell SrcID#0_Ha#0: DEV 0000:7f:12.0
  EDAC MC: Removed device 1 for sb_edac.c Haswell SrcID#1_Ha#0: DEV 0000:ff:12.0
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Failed to register device with error -19.

The refactored sb_edac driver creates the IMC1 (the 2nd memory
controller) if any IMC1 device is present. In this case only
HA1_TA of IMC1 was present, but the driver expected to find
HA1/HA1_TM/HA1_TAD[0-3] devices too, leading to the above failure.

The document [1] says the 'E5-2603 v3' CPU has 4 memory channels max. Yi
Zhang inserted one DIMM per channel for each CPU, and did random error
address injection test with this patch:

      4024  addresses fell in TOLM hole area
     12715  addresses fell in CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
     12774  addresses fell in CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
     12798  addresses fell in CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
     12913  addresses fell in CPU_SrcID#0_Ha#0_Chan#3_DIMM#0
     12674  addresses fell in CPU_SrcID#1_Ha#0_Chan#0_DIMM#0
     12686  addresses fell in CPU_SrcID#1_Ha#0_Chan#1_DIMM#0
     12882  addresses fell in CPU_SrcID#1_Ha#0_Chan#2_DIMM#0
     12934  addresses fell in CPU_SrcID#1_Ha#0_Chan#3_DIMM#0
    106400  addresses were injected totally.

The test result shows that all the 4 channels belong to IMC0 per CPU, so
the server really only has one IMC per CPU.

In the 1st page of chapter 2 in datasheet [2], it also says 'E5-2600 v3'
implements either one or two IMCs. For CPUs with one IMC, IMC1 is not
used and should be ignored.

Thus, do not create a second memory controller if the key HA1 is absent.

[1] http://ark.intel.com/products/83349/Intel-Xeon-Processor-E5-2603-v3-15M-Cache-1_60-GHz
[2] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Reported-and-tested-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170913104214.7325-1-qiuxu.zhuo@intel.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/edac/sb_edac.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)




--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -462,6 +462,7 @@ static const struct pci_id_table pci_dev
 static const struct pci_id_descr pci_dev_descr_ibridge[] = {
 		/* Processor Home Agent */
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0,        0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1,        1, IMC1) },
 
 		/* Memory controller */
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA,     0, IMC0) },
@@ -472,7 +473,6 @@ static const struct pci_id_descr pci_dev
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD3,   0, IMC0) },
 
 		/* Optional, mode 2HA */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1,        1, IMC1) },
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TA,     1, IMC1) },
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_RAS,    1, IMC1) },
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD0,   1, IMC1) },
@@ -2291,6 +2291,13 @@ static int sbridge_get_onedevice(struct
 next_imc:
 	sbridge_dev = get_sbridge_dev(bus, dev_descr->dom, multi_bus, sbridge_dev);
 	if (!sbridge_dev) {
+		/* If the HA1 wasn't found, don't create EDAC second memory controller */
+		if (dev_descr->dom == IMC1 && devno != 1) {
+			edac_dbg(0, "Skip IMC1: %04x:%04x (since HA1 was absent)\n",
+				 PCI_VENDOR_ID_INTEL, dev_descr->dev_id);
+			pci_dev_put(pdev);
+			return 0;
+		}
 
 		if (dev_descr->dom == SOCK)
 			goto out_imc;
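
The effect of the new check can be sketched in user space as follows. This is a simplified model under stated assumptions, not the driver's code: the four-entry table, the present[] array, HA1_INDEX and count_controllers() are hypothetical names made up for the example. The one property it borrows from the patch is that HA1 sits at table index 1, so a device that belongs to IMC1 but is not at index 1 cannot be the one that starts a second controller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the descriptor table; not the driver's types. */
enum domain { IMC0, IMC1 };

struct descr {
	const char *name;
	enum domain dom;
};

/* HA1 is placed at index 1, mirroring the reordered table in the patch. */
static const struct descr table[] = {
	{ "HA0",    IMC0 },
	{ "HA1",    IMC1 },
	{ "HA0_TA", IMC0 },
	{ "HA1_TA", IMC1 },
};

#define HA1_INDEX 1

/*
 * present[i] says whether the device for table[i] was found.
 * Returns how many memory controllers would be created.
 */
static int count_controllers(const bool *present)
{
	bool created[2] = { false, false };
	int n = 0;

	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		enum domain d = table[i].dom;

		if (!present[i] || created[d])
			continue;

		/* HA1 absent: ignore stray IMC1 devices, create nothing. */
		if (d == IMC1 && i != HA1_INDEX)
			continue;

		created[d] = true;
		n++;
	}
	return n;
}
```

With only HA1_TA present for IMC1, as in the reported failure, the model creates a single controller; with HA1 present it creates two.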

* [PATCH 4.14 00/31] 4.14.1-stable review
@ 2017-11-19 14:59 Greg Kroah-Hartman
  2017-11-19 14:59 ` [PATCH 4.14 02/31] dmaengine: dmatest: warn user when dma test times out Greg Kroah-Hartman
                   ` (30 more replies)
  0 siblings, 31 replies; 52+ messages in thread
From: Greg Kroah-Hartman @ 2017-11-19 14:59 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, stable

This is the start of the stable review cycle for the 4.14.1 release.
There are 31 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Tue Nov 21 14:59:32 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.1-rc1.gz
or in the git tree and branch at:
  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.14.1-rc1

Johan Hovold <johan@kernel.org>
    spi: fix use-after-free at controller deregistration

Hans de Goede <hdegoede@redhat.com>
    staging: rtl8188eu: Revert 4 commits breaking ARP

Hans de Goede <hdegoede@redhat.com>
    staging: vboxvideo: Fix reporting invalid suggested-offset-properties

Johan Hovold <johan@kernel.org>
    staging: greybus: spilib: fix use-after-free after deregistration

Gilad Ben-Yossef <gilad@benyossef.com>
    staging: ccree: fix 64 bit scatter/gather DMA ops

Huacai Chen <chenhc@lemote.com>
    staging: sm750fb: Fix parameter mistake in poke32

Aditya Shankar <aditya.shankar@microchip.com>
    staging: wilc1000: Fix bssid buffer offset in Txq

Bjorn Andersson <bjorn.andersson@linaro.org>
    rpmsg: glink: Add missing MODULE_LICENSE

Jason Gerecke <killertofu@gmail.com>
    HID: wacom: generic: Recognize WACOM_HID_WD_PEN as a type of pen collection

Sébastien Szymanski <sebastien.szymanski@armadeus.com>
    HID: cp2112: add HIDRAW dependency

Hans de Goede <hdegoede@redhat.com>
    platform/x86: peaq_wmi: Fix missing terminating entry for peaq_dmi_table

Hans de Goede <hdegoede@redhat.com>
    platform/x86: peaq-wmi: Add DMI check before binding to the WMI interface

Yazen Ghannam <yazen.ghannam@amd.com>
    x86/MCE/AMD: Always give panic severity for UC errors in kernel context

Andy Lutomirski <luto@kernel.org>
    selftests/x86/protection_keys: Fix syscall NR redefinition warnings

Johan Hovold <johan@kernel.org>
    USB: serial: garmin_gps: fix memory leak on probe errors

Johan Hovold <johan@kernel.org>
    USB: serial: garmin_gps: fix I/O after failed probe and remove

Douglas Fischer <douglas.fischer@outlook.com>
    USB: serial: qcserial: add pid/vid for Sierra Wireless EM7355 fw update

Lu Baolu <baolu.lu@linux.intel.com>
    USB: serial: Change DbC debug device binding ID

Johan Hovold <johan@kernel.org>
    USB: serial: metro-usb: stop I/O after failed open

Andrew Gabbasov <andrew_gabbasov@mentor.com>
    usb: gadget: f_fs: Fix use-after-free in ffs_free_inst

Bernhard Rosenkraenzer <bernhard.rosenkranzer@linaro.org>
    USB: Add delay-init quirk for Corsair K70 LUX keyboards

Alan Stern <stern@rowland.harvard.edu>
    USB: usbfs: compute urb->actual_length for isochronous

Lu Baolu <baolu.lu@linux.intel.com>
    USB: early: Use new USB product ID and strings for DbC device

raveendra padasalagi <raveendra.padasalagi@broadcom.com>
    crypto: brcm - Explicity ACK mailbox message

Eric Biggers <ebiggers@google.com>
    crypto: dh - Don't permit 'key' or 'g' size longer than 'p'

Eric Biggers <ebiggers@google.com>
    crypto: dh - Don't permit 'p' to be 0

Eric Biggers <ebiggers@google.com>
    crypto: dh - Fix double free of ctx->p

Andrey Konovalov <andreyknvl@google.com>
    media: dib0700: fix invalid dvb_detach argument

Arvind Yadav <arvind.yadav.cs@gmail.com>
    media: imon: Fix null-ptr-deref in imon_probe

Adam Wallis <awallis@codeaurora.org>
    dmaengine: dmatest: warn user when dma test times out

Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    EDAC, sb_edac: Don't create a second memory controller if HA1 is not present


-------------

Diffstat:

 Makefile                                      |   4 +-
 arch/x86/kernel/cpu/mcheck/mce-severity.c     |   7 +-
 crypto/dh.c                                   |  33 ++++-----
 crypto/dh_helper.c                            |  16 ++++
 drivers/crypto/bcm/cipher.c                   | 101 ++++++++++++--------------
 drivers/dma/dmatest.c                         |   1 +
 drivers/edac/sb_edac.c                        |   9 ++-
 drivers/hid/Kconfig                           |   2 +-
 drivers/hid/wacom_wac.h                       |   1 +
 drivers/media/rc/imon.c                       |   5 ++
 drivers/media/usb/dvb-usb/dib0700_devices.c   |  24 +++---
 drivers/platform/x86/peaq-wmi.c               |  19 +++++
 drivers/rpmsg/qcom_glink_native.c             |   3 +
 drivers/spi/spi.c                             |   5 +-
 drivers/staging/ccree/cc_lli_defs.h           |   2 +-
 drivers/staging/greybus/spilib.c              |   8 +-
 drivers/staging/rtl8188eu/core/rtw_recv.c     |  83 ++++++++++++---------
 drivers/staging/rtl8188eu/os_dep/mon.c        |  34 ++-------
 drivers/staging/sm750fb/ddk750_chip.h         |   2 +-
 drivers/staging/vboxvideo/vbox_drv.h          |   8 +-
 drivers/staging/vboxvideo/vbox_irq.c          |   4 +-
 drivers/staging/vboxvideo/vbox_mode.c         |  26 +++++--
 drivers/staging/wilc1000/wilc_wlan.c          |   2 +-
 drivers/usb/core/devio.c                      |  14 ++++
 drivers/usb/core/quirks.c                     |   3 +
 drivers/usb/early/xhci-dbc.h                  |   6 +-
 drivers/usb/gadget/function/f_fs.c            |   1 +
 drivers/usb/serial/garmin_gps.c               |  22 +++++-
 drivers/usb/serial/metro-usb.c                |  11 ++-
 drivers/usb/serial/qcserial.c                 |   1 +
 drivers/usb/serial/usb_debug.c                |   4 +-
 tools/testing/selftests/x86/protection_keys.c |  24 ++++--
 32 files changed, 289 insertions(+), 196 deletions(-)


end of thread, other threads:[~2017-11-22 18:52 UTC | newest]

Thread overview: 52+ messages
2017-11-19 14:59 [4.14,19/31] x86/MCE/AMD: Always give panic severity for UC errors in kernel context Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 19/31] " Greg Kroah-Hartman
  -- strict thread matches above, loose matches on Subject: below --
2017-11-19 14:59 [4.14,01/31] EDAC, sb_edac: Don't create a second memory controller if HA1 is not present Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 01/31] " Greg Kroah-Hartman
2017-11-19 14:59 [PATCH 4.14 00/31] 4.14.1-stable review Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 02/31] dmaengine: dmatest: warn user when dma test times out Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 03/31] media: imon: Fix null-ptr-deref in imon_probe Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 04/31] media: dib0700: fix invalid dvb_detach argument Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 05/31] crypto: dh - Fix double free of ctx->p Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 06/31] crypto: dh - Don't permit 'p' to be 0 Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 07/31] crypto: dh - Don't permit 'key' or 'g' size longer than 'p' Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 08/31] crypto: brcm - Explicity ACK mailbox message Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 09/31] USB: early: Use new USB product ID and strings for DbC device Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 10/31] USB: usbfs: compute urb->actual_length for isochronous Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 11/31] USB: Add delay-init quirk for Corsair K70 LUX keyboards Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 12/31] usb: gadget: f_fs: Fix use-after-free in ffs_free_inst Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 13/31] USB: serial: metro-usb: stop I/O after failed open Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 14/31] USB: serial: Change DbC debug device binding ID Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 15/31] USB: serial: qcserial: add pid/vid for Sierra Wireless EM7355 fw update Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 16/31] USB: serial: garmin_gps: fix I/O after failed probe and remove Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 17/31] USB: serial: garmin_gps: fix memory leak on probe errors Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 18/31] selftests/x86/protection_keys: Fix syscall NR redefinition warnings Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 20/31] platform/x86: peaq-wmi: Add DMI check before binding to the WMI interface Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 21/31] platform/x86: peaq_wmi: Fix missing terminating entry for peaq_dmi_table Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 23/31] HID: wacom: generic: Recognize WACOM_HID_WD_PEN as a type of pen collection Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 24/31] rpmsg: glink: Add missing MODULE_LICENSE Greg Kroah-Hartman
2017-11-19 14:59 ` [PATCH 4.14 25/31] staging: wilc1000: Fix bssid buffer offset in Txq Greg Kroah-Hartman
2017-11-19 15:00 ` [PATCH 4.14 26/31] staging: sm750fb: Fix parameter mistake in poke32 Greg Kroah-Hartman
2017-11-19 15:00 ` [PATCH 4.14 27/31] staging: ccree: fix 64 bit scatter/gather DMA ops Greg Kroah-Hartman
2017-11-19 15:00 ` [PATCH 4.14 28/31] staging: greybus: spilib: fix use-after-free after deregistration Greg Kroah-Hartman
2017-11-19 15:00 ` [PATCH 4.14 29/31] staging: vboxvideo: Fix reporting invalid suggested-offset-properties Greg Kroah-Hartman
2017-11-19 15:00 ` [PATCH 4.14 30/31] staging: rtl8188eu: Revert 4 commits breaking ARP Greg Kroah-Hartman
2017-11-19 15:00 ` [PATCH 4.14 31/31] spi: fix use-after-free at controller deregistration Greg Kroah-Hartman
2017-11-20 14:27 ` [PATCH 4.14 00/31] 4.14.1-stable review Guenter Roeck
2017-11-21 15:26   ` Ben Hutchings
2017-11-21 16:35     ` Greg Kroah-Hartman
2017-11-21 16:35       ` Greg Kroah-Hartman
2017-11-21 16:46       ` Ben Hutchings
2017-11-21 17:09         ` Greg Kroah-Hartman
2017-11-21 17:09           ` Greg Kroah-Hartman
2017-11-21 19:07           ` Ben Hutchings
2017-11-21 19:38             ` Guenter Roeck
2017-11-21 19:38               ` Guenter Roeck
2017-11-22 16:06               ` Ben Hutchings
2017-11-22 17:00                 ` Greg Kroah-Hartman
2017-11-22 17:00                   ` Greg Kroah-Hartman
2017-11-22 18:52                   ` Guenter Roeck
2017-11-22 18:52                     ` Guenter Roeck
2017-11-20 18:21 ` Guenter Roeck
2017-11-20 19:16   ` Greg Kroah-Hartman
2017-11-20 21:19 ` Shuah Khan
2017-11-21  7:22   ` Greg Kroah-Hartman
