* [4.13,43/44] EDAC, sb_edac: Don't create a second memory controller if HA1 is not present
  2017-11-16 17:42 [PATCH 4.13 00/44] 4.13.14-stable review Greg Kroah-Hartman
@ 2017-11-16 17:43 ` Greg Kroah-Hartman
  2017-11-16 17:42 ` [PATCH 4.13 02/44] gso: fix payload length when gso_size is zero Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  44 siblings, 0 replies; 51+ messages in thread
From: Greg Kroah-Hartman @ 2017-11-16 17:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qiuxu Zhuo, Tony Luck, linux-edac,
	Borislav Petkov

4.13-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Qiuxu Zhuo <qiuxu.zhuo@intel.com>

commit 15cc3ae001873845b5d842e212478a6570c7d938 upstream.

Yi Zhang reported the following failure on a 2-socket Haswell (E5-2603v3)
server (DELL PowerEdge 730xd):

  EDAC sbridge: Some needed devices are missing
  EDAC MC: Removed device 0 for sb_edac.c Haswell SrcID#0_Ha#0: DEV 0000:7f:12.0
  EDAC MC: Removed device 1 for sb_edac.c Haswell SrcID#1_Ha#0: DEV 0000:ff:12.0
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Failed to register device with error -19.

The refactored sb_edac driver creates the IMC1 (the 2nd memory
controller) if any IMC1 device is present. In this case only
HA1_TA of IMC1 was present, but the driver expected to find
HA1/HA1_TM/HA1_TAD[0-3] devices too, leading to the above failure.

The document [1] says the 'E5-2603 v3' CPU has 4 memory channels max. Yi
Zhang inserted one DIMM per channel for each CPU, and ran a random error
address injection test with this patch applied:

      4024  addresses fell in TOLM hole area
     12715  addresses fell in CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
     12774  addresses fell in CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
     12798  addresses fell in CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
     12913  addresses fell in CPU_SrcID#0_Ha#0_Chan#3_DIMM#0
     12674  addresses fell in CPU_SrcID#1_Ha#0_Chan#0_DIMM#0
     12686  addresses fell in CPU_SrcID#1_Ha#0_Chan#1_DIMM#0
     12882  addresses fell in CPU_SrcID#1_Ha#0_Chan#2_DIMM#0
     12934  addresses fell in CPU_SrcID#1_Ha#0_Chan#3_DIMM#0
    106400  addresses were injected totally.

The test result shows that all 4 channels of each CPU belong to IMC0, so
the server really has only one IMC per CPU.
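
The tally above can be cross-checked with a small userspace sketch. This is
not part of the patch or of the test tool Yi Zhang used; the struct and
function names below are hypothetical, and only the counts are taken from
the log.

```c
#include <assert.h>
#include <stddef.h>

/* One line of the injection log: where the addresses decoded to. */
struct bucket {
	int socket;          /* CPU_SrcID# */
	int ha;              /* Ha# (home agent / IMC index) */
	int channel;         /* Chan# */
	unsigned long hits;  /* number of injected addresses */
};

/* Counts copied from the log in the commit message. */
static const struct bucket buckets[] = {
	{ 0, 0, 0, 12715 }, { 0, 0, 1, 12774 },
	{ 0, 0, 2, 12798 }, { 0, 0, 3, 12913 },
	{ 1, 0, 0, 12674 }, { 1, 0, 1, 12686 },
	{ 1, 0, 2, 12882 }, { 1, 0, 3, 12934 },
};

/* Addresses that fell into the TOLM hole rather than onto a DIMM. */
static const unsigned long tolm_hole_hits = 4024;

/* Sum the hits that decoded to a given home agent, across both sockets. */
static unsigned long hits_on_ha(int ha)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < sizeof(buckets) / sizeof(buckets[0]); i++)
		if (buckets[i].ha == ha)
			sum += buckets[i].hits;
	return sum;
}

/* Total injected addresses: every DIMM bucket plus the TOLM hole. */
static unsigned long total_hits(void)
{
	unsigned long sum = tolm_hole_hits;

	for (size_t i = 0; i < sizeof(buckets) / sizeof(buckets[0]); i++)
		sum += buckets[i].hits;
	return sum;
}
```

With the numbers from the log, total_hits() is 106400, hits_on_ha(0) is
102376 and hits_on_ha(1) is 0, which matches the reported total and the
observation that no address decoded to a second home agent.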

The first page of chapter 2 of the datasheet [2] also says that the
'E5-2600 v3' implements either one or two IMCs. For CPUs with one IMC,
IMC1 is not used and should be ignored.

Thus, do not create a second memory controller if the key HA1 is absent.

[1] http://ark.intel.com/products/83349/Intel-Xeon-Processor-E5-2603-v3-15M-Cache-1_60-GHz
[2] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Reported-and-tested-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170913104214.7325-1-qiuxu.zhuo@intel.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/edac/sb_edac.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)




--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -455,6 +455,7 @@ static const struct pci_id_table pci_dev
 static const struct pci_id_descr pci_dev_descr_ibridge[] = {
 		/* Processor Home Agent */
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0,        0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1,        1, IMC1) },
 
 		/* Memory controller */
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA,     0, IMC0) },
@@ -465,7 +466,6 @@ static const struct pci_id_descr pci_dev
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD3,   0, IMC0) },
 
 		/* Optional, mode 2HA */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1,        1, IMC1) },
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TA,     1, IMC1) },
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_RAS,    1, IMC1) },
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD0,   1, IMC1) },
@@ -2260,6 +2260,13 @@ static int sbridge_get_onedevice(struct
 next_imc:
 	sbridge_dev = get_sbridge_dev(bus, dev_descr->dom, multi_bus, sbridge_dev);
 	if (!sbridge_dev) {
+		/* If the HA1 wasn't found, don't create EDAC second memory controller */
+		if (dev_descr->dom == IMC1 && devno != 1) {
+			edac_dbg(0, "Skip IMC1: %04x:%04x (since HA1 was absent)\n",
+				 PCI_VENDOR_ID_INTEL, dev_descr->dev_id);
+			pci_dev_put(pdev);
+			return 0;
+		}
 
 		if (dev_descr->dom == SOCK)
 			goto out_imc;
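
The rule described in the commit message (create IMC1 only when its key
HA1 device is present) can be illustrated with a simplified userspace
model. This is a sketch under stated assumptions, not the driver code: the
types, the is_ha flag and the function names are hypothetical, and the
real driver walks PCI devices with a different control flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's descriptor types. */
enum domain { IMC0, IMC1, SOCK };

struct dev_descr {
	unsigned int dev_id;  /* PCI device ID (illustrative values only) */
	enum domain dom;      /* memory controller the device belongs to */
	bool is_ha;           /* true for the Home Agent device of that IMC */
};

/* Return true if dev_id is among the device IDs found on the bus. */
static bool is_present(const unsigned int *found, size_t n_found,
		       unsigned int dev_id)
{
	for (size_t i = 0; i < n_found; i++)
		if (found[i] == dev_id)
			return true;
	return false;
}

/*
 * Count the memory controllers to create. A controller counts only if
 * its Home Agent device is present; other devices of the same IMC
 * (TA, RAS, TAD...) do not create a controller on their own.
 */
static int count_controllers(const struct dev_descr *table, size_t n_table,
			     const unsigned int *found, size_t n_found)
{
	bool have_ha[2] = { false, false };  /* indexed by IMC0 / IMC1 */

	for (size_t i = 0; i < n_table; i++) {
		const struct dev_descr *d = &table[i];

		if (d->dom == SOCK || !d->is_ha)
			continue;
		if (is_present(found, n_found, d->dev_id))
			have_ha[d->dom] = true;
	}
	return (int)have_ha[IMC0] + (int)have_ha[IMC1];
}
```

In this model, a machine that exposes HA0 and its companions plus only
HA1_TA yields one controller, which mirrors the reported E5-2603 v3 case;
a machine that also exposes HA1 yields two.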

* [PATCH 4.13 00/44] 4.13.14-stable review
@ 2017-11-16 17:42 Greg Kroah-Hartman
  2017-11-16 17:42 ` [PATCH 4.13 01/44] ppp: fix race in ppp device destruction Greg Kroah-Hartman
                   ` (44 more replies)
  0 siblings, 45 replies; 51+ messages in thread
From: Greg Kroah-Hartman @ 2017-11-16 17:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, stable

This is the start of the stable review cycle for the 4.13.14 release.
There are 44 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sat Nov 18 17:28:05 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.13.14-rc1.gz
or in the git tree and branch at:
  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.13.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.13.14-rc1

Adam Wallis <awallis@codeaurora.org>
    dmaengine: dmatest: warn user when dma test times out

Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    EDAC, sb_edac: Don't create a second memory controller if HA1 is not present

Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Input: ims-psu - check if CDC union descriptor is sane

Alan Stern <stern@rowland.harvard.edu>
    usb: usbtest: fix NULL pointer dereference

Johannes Berg <johannes.berg@intel.com>
    mac80211: don't compare TKIP TX MIC key in reinstall prevention

Jason A. Donenfeld <Jason@zx2c4.com>
    mac80211: use constant time comparison with keys

Johannes Berg <johannes.berg@intel.com>
    mac80211: accept key reinstall without changing anything

Eric Dumazet <edumazet@google.com>
    tcp: fix tcp_mtu_probe() vs highest_sack

Eric Dumazet <edumazet@google.com>
    ipv6: addrconf: increment ifp refcount before ipv6_del_addr()

Craig Gallek <kraig@google.com>
    tun/tap: sanitize TUNSETSNDBUF input

Guillaume Nault <g.nault@alphalink.fr>
    l2tp: hold tunnel in pppol2tp_connect()

Cong Wang <xiyou.wangcong@gmail.com>
    net_sched: avoid matching qdisc with zero handle

Xin Long <lucien.xin@gmail.com>
    sctp: reset owner sk for data chunks on out queues when migrating a sock

Julien Gomes <julien@arista.com>
    tun: allow positive return values on dev_get_valid_name() call

Girish Moodalbail <girish.moodalbail@oracle.com>
    tap: reference to KVA of an unloaded module causes kernel panic

Eric Dumazet <edumazet@google.com>
    tcp: refresh tp timestamp before tcp_mtu_probe()

Xin Long <lucien.xin@gmail.com>
    ip6_gre: update dst pmtu if dev mtu has been updated by toobig in __gre6_xmit

Xin Long <lucien.xin@gmail.com>
    ip6_gre: only increase err_count for some certain type icmpv6 in ip6gre_err

Xin Long <lucien.xin@gmail.com>
    ipip: only increase err_count for some certain type icmp in ipip_err

Or Gerlitz <ogerlitz@mellanox.com>
    net/mlx5e: Properly deal with encap flows add/del under neigh update

Moshe Shemesh <moshe@mellanox.com>
    net/mlx5: Fix health work queue spin lock to IRQ safe

Girish Moodalbail <girish.moodalbail@oracle.com>
    tap: double-free in error path in tap_open()

Andrei Vagin <avagin@openvz.org>
    net/unix: don't show information about sockets from other namespaces

Vivien Didelot <vivien.didelot@savoirfairelinux.com>
    net: dsa: check master device before put

Eric Dumazet <edumazet@google.com>
    tcp/dccp: fix other lockdep splats accessing ireq_opt

Eric Dumazet <edumazet@google.com>
    tcp/dccp: fix lockdep splat in inet_csk_route_req()

Laszlo Toth <laszlth@gmail.com>
    sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND

Eric Dumazet <edumazet@google.com>
    ipv6: flowlabel: do not leave opt->tot_len with garbage

Craig Gallek <kraig@google.com>
    soreuseport: fix initialization race

Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
    net: bridge: fix returning of vlan range op errors

Stefano Brivio <sbrivio@redhat.com>
    geneve: Fix function matching VNI and tunnel ID on big-endian

Eric Dumazet <edumazet@google.com>
    packet: avoid panic in packet_getsockopt()

Eric Dumazet <edumazet@google.com>
    tcp/dccp: fix ireq->opt races

Xin Long <lucien.xin@gmail.com>
    sctp: add the missing sock_owned_by_user check in sctp_icmp_redirect

Johannes Berg <johannes.berg@intel.com>
    netlink: fix netlink_ack() extack race

Cong Wang <xiyou.wangcong@gmail.com>
    tun: call dev_get_valid_name() before register_netdevice()

Guillaume Nault <g.nault@alphalink.fr>
    l2tp: check ps->sock before running pppol2tp_session_ioctl()

Sabrina Dubroca <sd@queasysnail.net>
    macsec: fix memory leaks when skb_to_sgvec fails

Eric Dumazet <edumazet@google.com>
    net: call cgroup_sk_alloc() earlier in sk_clone_lock()

Jason A. Donenfeld <Jason@zx2c4.com>
    netlink: do not set cb_running if dump's start() errs

Steffen Klassert <steffen.klassert@secunet.com>
    ipv6: Fix traffic triggered IPsec connections.

Steffen Klassert <steffen.klassert@secunet.com>
    ipv4: Fix traffic triggered IPsec connections.

Alexey Kodanev <alexey.kodanev@oracle.com>
    gso: fix payload length when gso_size is zero

Guillaume Nault <g.nault@alphalink.fr>
    ppp: fix race in ppp device destruction


-------------

Diffstat:

 Makefile                                         |  4 +-
 drivers/dma/dmatest.c                            |  1 +
 drivers/edac/sb_edac.c                           |  9 ++-
 drivers/input/misc/ims-pcu.c                     | 16 ++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c  | 89 ++++++++++++++----------
 drivers/net/ethernet/mellanox/mlx5/core/health.c |  5 +-
 drivers/net/geneve.c                             |  6 --
 drivers/net/ipvlan/ipvtap.c                      |  4 +-
 drivers/net/macsec.c                             |  2 +
 drivers/net/macvtap.c                            |  4 +-
 drivers/net/ppp/ppp_generic.c                    | 20 ++++++
 drivers/net/tap.c                                | 25 ++++---
 drivers/net/tun.c                                |  7 ++
 drivers/usb/misc/usbtest.c                       |  5 +-
 include/linux/if_tap.h                           |  4 +-
 include/linux/netdevice.h                        |  3 +
 include/net/inet_sock.h                          |  8 ++-
 include/net/tcp.h                                |  6 +-
 net/bridge/br_netlink.c                          |  2 +-
 net/core/dev.c                                   |  6 +-
 net/core/sock.c                                  |  3 +-
 net/core/sock_reuseport.c                        | 12 +++-
 net/dccp/ipv4.c                                  | 13 ++--
 net/dsa/dsa2.c                                   |  7 +-
 net/ipv4/cipso_ipv4.c                            | 24 ++-----
 net/ipv4/gre_offload.c                           |  2 +-
 net/ipv4/inet_connection_sock.c                  |  9 ++-
 net/ipv4/inet_hashtables.c                       |  5 +-
 net/ipv4/ipip.c                                  | 59 +++++++++++-----
 net/ipv4/route.c                                 |  2 +-
 net/ipv4/syncookies.c                            |  2 +-
 net/ipv4/tcp_input.c                             |  2 +-
 net/ipv4/tcp_ipv4.c                              | 21 +++---
 net/ipv4/tcp_output.c                            |  5 +-
 net/ipv4/udp.c                                   |  5 +-
 net/ipv4/udp_offload.c                           |  2 +-
 net/ipv6/addrconf.c                              |  1 +
 net/ipv6/ip6_flowlabel.c                         |  1 +
 net/ipv6/ip6_gre.c                               | 20 ++++--
 net/ipv6/ip6_offload.c                           |  2 +-
 net/ipv6/ip6_output.c                            |  4 +-
 net/ipv6/route.c                                 |  2 +-
 net/l2tp/l2tp_ppp.c                              | 10 ++-
 net/mac80211/key.c                               | 54 ++++++++++++--
 net/netlink/af_netlink.c                         | 21 +++---
 net/packet/af_packet.c                           | 24 ++++---
 net/sched/sch_api.c                              |  2 +
 net/sctp/input.c                                 |  2 +-
 net/sctp/ipv6.c                                  |  6 +-
 net/sctp/socket.c                                | 32 +++++++++
 net/unix/diag.c                                  |  2 +
 51 files changed, 395 insertions(+), 187 deletions(-)

