From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S965822AbdKPRxQ (ORCPT ); Thu, 16 Nov 2017 12:53:16 -0500
Received: from mail.linuxfoundation.org ([140.211.169.12]:38208 "EHLO
	mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S966121AbdKPRvl (ORCPT );
	Thu, 16 Nov 2017 12:51:41 -0500
From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Qiuxu Zhuo, Tony Luck,
	linux-edac, Borislav Petkov
Subject: [PATCH 4.13 43/44] EDAC, sb_edac: Dont create a second memory controller if HA1 is not present
Date: Thu, 16 Nov 2017 18:43:07 +0100
Message-Id: <20171116172825.483949271@linuxfoundation.org>
X-Mailer: git-send-email 2.15.0
In-Reply-To: <20171116172823.336649076@linuxfoundation.org>
References: <20171116172823.336649076@linuxfoundation.org>
User-Agent: quilt/0.65
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

4.13-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Qiuxu Zhuo

commit 15cc3ae001873845b5d842e212478a6570c7d938 upstream.

Yi Zhang reported the following failure on a 2-socket Haswell (E5-2603v3)
server (DELL PowerEdge 730xd):

  EDAC sbridge: Some needed devices are missing
  EDAC MC: Removed device 0 for sb_edac.c Haswell SrcID#0_Ha#0: DEV 0000:7f:12.0
  EDAC MC: Removed device 1 for sb_edac.c Haswell SrcID#1_Ha#0: DEV 0000:ff:12.0
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Failed to register device with error -19.

The refactored sb_edac driver creates the IMC1 (the 2nd memory
controller) if any IMC1 device is present. In this case only
HA1_TA of IMC1 was present, but the driver expected to find
HA1/HA1_TM/HA1_TAD[0-3] devices too, leading to the above failure.

The document [1] says the 'E5-2603 v3' CPU has 4 memory channels max. Yi
Zhang inserted one DIMM per channel for each CPU, and did random error
address injection test with this patch:

      4024  addresses fell in TOLM hole area
     12715  addresses fell in CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
     12774  addresses fell in CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
     12798  addresses fell in CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
     12913  addresses fell in CPU_SrcID#0_Ha#0_Chan#3_DIMM#0
     12674  addresses fell in CPU_SrcID#1_Ha#0_Chan#0_DIMM#0
     12686  addresses fell in CPU_SrcID#1_Ha#0_Chan#1_DIMM#0
     12882  addresses fell in CPU_SrcID#1_Ha#0_Chan#2_DIMM#0
     12934  addresses fell in CPU_SrcID#1_Ha#0_Chan#3_DIMM#0
    106400  addresses were injected totally.

The test result shows that all the 4 channels belong to IMC0 per CPU, so
the server really only has one IMC per CPU.

In the 1st page of chapter 2 in datasheet [2], it also says 'E5-2600 v3'
implements either one or two IMCs. For CPUs with one IMC, IMC1 is not
used and should be ignored.

Thus, do not create a second memory controller if the key HA1 is absent.

[1] http://ark.intel.com/products/83349/Intel-Xeon-Processor-E5-2603-v3-15M-Cache-1_60-GHz
[2] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Reported-and-tested-by: Yi Zhang
Signed-off-by: Qiuxu Zhuo
Cc: Tony Luck
Cc: linux-edac
Link: http://lkml.kernel.org/r/20170913104214.7325-1-qiuxu.zhuo@intel.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov
Signed-off-by: Greg Kroah-Hartman
---
 drivers/edac/sb_edac.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -455,6 +455,7 @@ static const struct pci_id_table pci_dev
 static const struct pci_id_descr pci_dev_descr_ibridge[] = {
 		/* Processor Home Agent */
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0,        0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1,        1, IMC1) },
 
 		/* Memory controller */
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA,     0, IMC0) },
@@ -465,7 +466,6 @@ static const struct pci_id_descr pci_dev
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD3,   0, IMC0) },
 
 		/* Optional, mode 2HA */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1,        1, IMC1) },
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TA,     1, IMC1) },
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_RAS,    1, IMC1) },
 	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD0,   1, IMC1) },
@@ -2260,6 +2260,13 @@ static int sbridge_get_onedevice(struct
 next_imc:
 	sbridge_dev = get_sbridge_dev(bus, dev_descr->dom, multi_bus, sbridge_dev);
 	if (!sbridge_dev) {
+		/* If the HA1 wasn't found, don't create EDAC second memory controller */
+		if (dev_descr->dom == IMC1 && devno != 1) {
+			edac_dbg(0, "Skip IMC1: %04x:%04x (since HA1 was absent)\n",
+				 PCI_VENDOR_ID_INTEL, dev_descr->dev_id);
+			pci_dev_put(pdev);
+			return 0;
+		}
 
 		if (dev_descr->dom == SOCK)
 			goto out_imc;
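
Illustrative note (not part of the patch): the rule the last hunk adds can
be modelled outside the kernel as a small user-space sketch. Everything
below is a simplified assumption made for illustration: the trimmed table,
the struct descr type, and the controllers_created() helper are
hypothetical and do not exist in sb_edac.c. The sketch only mirrors the
idea that HA1 sits at index 1 of the descriptor table and that a
controller for IMC1 may be created only when HA1 itself is found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the driver's domains; not the real definitions. */
enum domain { IMC0, IMC1, SOCK };

/* Hypothetical, trimmed-down descriptor: only the fields this sketch needs. */
struct descr {
	const char *name;
	enum domain dom;
};

/* HA1 sits at index 1, mirroring the table order the patch establishes. */
static const struct descr table[] = {
	{ "HA0",      IMC0 },
	{ "HA1",      IMC1 },
	{ "HA0_TA",   IMC0 },
	{ "HA1_TA",   IMC1 },
	{ "HA1_TAD0", IMC1 },
};

#define TABLE_LEN (sizeof(table) / sizeof(table[0]))

/*
 * Walk the table in order and return a bitmask of the controllers that
 * would be created, given which devices are present (present[i] matches
 * table[i]). A controller is created the first time one of its devices
 * is seen, except that IMC1 may only be created by HA1 (index 1).
 */
static unsigned int controllers_created(const bool present[], size_t n)
{
	unsigned int created = 0;

	assert(n <= TABLE_LEN);

	for (size_t devno = 0; devno < n; devno++) {
		enum domain dom = table[devno].dom;

		if (!present[devno])
			continue;

		if (!(created & (1u << dom))) {
			/* No controller yet: skip IMC1 unless this is HA1. */
			if (dom == IMC1 && devno != 1)
				continue;
			created |= 1u << dom;
		}
	}
	return created;
}
```

With a presence array resembling the reported machine (HA1 absent, HA1_TA
present), the helper returns only the IMC0 bit. With HA1 present, it
returns both the IMC0 and IMC1 bits.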