linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-03 21:57 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-03 21:57 UTC (permalink / raw)
  To: rjw, bp
  Cc: mchehab, tony.luck, lenb, linux-acpi, linux-edac, linux-kernel,
	Toshi Kani

ghes_edac_register() is called for each GHES platform device
instantiated per a GHES entry in ACPI HEST table.  dmi_walk()
counts the number of DIMMs on the system, and there is no need
to call it multiple times.

Change ghes_edac_register() to call dmi_walk() only when
'num_dimm' is uninitialized.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 drivers/edac/ghes_edac.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 4e61a62..2e9ce9c 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -407,15 +407,18 @@ EXPORT_SYMBOL_GPL(ghes_edac_report_mem_error);
 
 int ghes_edac_register(struct ghes *ghes, struct device *dev)
 {
-	bool fake = false;
-	int rc, num_dimm = 0;
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
 	struct ghes_edac_pvt *pvt;
 	struct ghes_edac_dimm_fill dimm_fill;
+	int rc;
+
+	static int num_dimm;
+	static bool fake;
 
 	/* Get the number of DIMMs */
-	dmi_walk(ghes_edac_count_dimms, &num_dimm);
+	if (num_dimm == 0)
+		dmi_walk(ghes_edac_count_dimms, &num_dimm);
 
 	/* Check if we've got a bogus BIOS */
 	if (num_dimm == 0) {

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-04  4:05 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-04  4:05 UTC (permalink / raw)
  To: Toshi Kani
  Cc: rjw, mchehab, tony.luck, lenb, linux-acpi, linux-edac,
	linux-kernel

On Thu, Aug 03, 2017 at 03:57:50PM -0600, Toshi Kani wrote:
> ghes_edac_register() is called for each GHES platform device
> instantiated per a GHES entry in ACPI HEST table.  dmi_walk()
> counts the number of DIMMs on the system, and there is no need
> to call it multiple times.
> 
> Change ghes_edac_register() to call dmi_walk() only when
> 'num_dimm' is uninitialized.
> 
> Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
> Suggested-by: Borislav Petkov <bp@alien8.de>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> ---
>  drivers/edac/ghes_edac.c |    9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 4e61a62..2e9ce9c 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -407,15 +407,18 @@ EXPORT_SYMBOL_GPL(ghes_edac_report_mem_error);
>  
>  int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  {
> -	bool fake = false;
> -	int rc, num_dimm = 0;
>  	struct mem_ctl_info *mci;
>  	struct edac_mc_layer layers[1];
>  	struct ghes_edac_pvt *pvt;
>  	struct ghes_edac_dimm_fill dimm_fill;
> +	int rc;
> +
> +	static int num_dimm;
> +	static bool fake;
>  
>  	/* Get the number of DIMMs */
> -	dmi_walk(ghes_edac_count_dimms, &num_dimm);
> +	if (num_dimm == 0)
> +		dmi_walk(ghes_edac_count_dimms, &num_dimm);

So the problem is that ghes_edac_register() gets called multiple times
depending on how many GHES platform devices are on the system. But yet
they all scan *all* DIMMs. So instead you should return if the DIMMs
have been counted already and not register a second time.

Which makes that whole mc counting kinda useless. So you could rip that
out too.

Unless I'm missing something...

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-04 21:02 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-04 21:02 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-edac@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	rjw@rjwysocki.net, linux-acpi@vger.kernel.org

On Fri, 2017-08-04 at 06:05 +0200, Borislav Petkov wrote:
> On Thu, Aug 03, 2017 at 03:57:50PM -0600, Toshi Kani wrote:
> > ghes_edac_register() is called for each GHES platform device
> > instantiated per a GHES entry in ACPI HEST table.  dmi_walk()
> > counts the number of DIMMs on the system, and there is no need
> > to call it multiple times.
> > 
> > Change ghes_edac_register() to call dmi_walk() only when
> > 'num_dimm' is uninitialized.
> > 
> > Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
> > Suggested-by: Borislav Petkov <bp@alien8.de>
> > Cc: Borislav Petkov <bp@alien8.de>
> > Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> > ---
> >  drivers/edac/ghes_edac.c |    9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> > index 4e61a62..2e9ce9c 100644
> > --- a/drivers/edac/ghes_edac.c
> > +++ b/drivers/edac/ghes_edac.c
> > @@ -407,15 +407,18 @@
> > EXPORT_SYMBOL_GPL(ghes_edac_report_mem_error);
> >  
> >  int ghes_edac_register(struct ghes *ghes, struct device *dev)
> >  {
> > -	bool fake = false;
> > -	int rc, num_dimm = 0;
> >  	struct mem_ctl_info *mci;
> >  	struct edac_mc_layer layers[1];
> >  	struct ghes_edac_pvt *pvt;
> >  	struct ghes_edac_dimm_fill dimm_fill;
> > +	int rc;
> > +
> > +	static int num_dimm;
> > +	static bool fake;
> >  
> >  	/* Get the number of DIMMs */
> > -	dmi_walk(ghes_edac_count_dimms, &num_dimm);
> > +	if (num_dimm == 0)
> > +		dmi_walk(ghes_edac_count_dimms, &num_dimm);
> 
> So the problem is that ghes_edac_register() gets called multiple
> times depending on how many GHES platform devices are on the system.
> But yet they all scan *all* DIMMs. So instead you should return if
> the DIMMs have been counted already and not register a second time.
> 
> Which makes that whole mc counting kinda useless. So you could rip
> that out too.
> 
> Unless I'm missing something...

GHES platform devices correspond to GHES entries, which define firmware
interfaces to report generic memory errors to the OS, such as NMI and
SCI.  These devices are associated with all DIMMs, not a particular
DIMM.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-05  5:16 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-05  5:16 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: linux-edac@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	rjw@rjwysocki.net, linux-acpi@vger.kernel.org

On Fri, Aug 04, 2017 at 09:02:17PM +0000, Kani, Toshimitsu wrote:
> GHES platform devices correspond to GHES entries, which define firmware
> interfaces to report generic memory errors to the OS, such as NMI and
> SCI.  These devices are associated with all DIMMs, not a particular
> DIMM.

And? Stating the obvious brings you what exactly?

IOW, you still haven't answered my question.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-07 17:59 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-07 17:59 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-acpi@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, rjw@rjwysocki.net

On Sat, 2017-08-05 at 07:16 +0200, Borislav Petkov wrote:
> On Fri, Aug 04, 2017 at 09:02:17PM +0000, Kani, Toshimitsu wrote:
> > GHES platform devices correspond to GHES entries, which define
> > firmware interfaces to report generic memory errors to the OS, such
> > as NMI and SCI.  These devices are associated with all DIMMs, not a
> > particular DIMM.
> 
> And? Stating the obvious brings you what exactly?
> 
> IOW, you still haven't answered my question.

Sorry about that.  Let me copy your original comments to make sure I
answer your questions this time.

> So the problem is that ghes_edac_register() gets called multiple
> times depending on how many GHES platform devices are on the system.
> But yet they all scan *all* DIMMs. 

Right, and this patch changes ghes_edac_register() to scan all DIMMs at
the first time and do not scan them next times.

> So instead you should return if
> the DIMMs have been counted already and not register a second time.
> Which makes that whole mc counting kinda useless. So you could rip
>
that out too.

Oh, I see.  So, ghes_edac_register() should return no-op a second time,
and does not call edac_mc_add_mc() to register with a separate mci.

I think we should keep the current scheme, which registers an mci for
each GHES entry.  ghes_edac_report_mem_error() expects that error-
reporting is serialized per a GHES entry.  Sharing a single mci among
all GHES entries / error interfaces might lead to a race condition.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-11  9:04 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-11  9:04 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: linux-acpi@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, rjw@rjwysocki.net

On Mon, Aug 07, 2017 at 05:59:15PM +0000, Kani, Toshimitsu wrote:
> I think we should keep the current scheme, which registers an mci for

No we shouldn't.

> each GHES entry.  ghes_edac_report_mem_error() expects that error-
> reporting is serialized per a GHES entry.  Sharing a single mci among
> all GHES entries / error interfaces might lead to a race condition.

See how I solved it in my patchset and feel free to reuse it.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 15:57 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-14 15:57 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: rjw@rjwysocki.net, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

T24gRnJpLCAyMDE3LTA4LTExIGF0IDExOjA0ICswMjAwLCBCb3Jpc2xhdiBQZXRrb3Ygd3JvdGU6
DQo+IE9uIE1vbiwgQXVnIDA3LCAyMDE3IGF0IDA1OjU5OjE1UE0gKzAwMDAsIEthbmksIFRvc2hp
bWl0c3Ugd3JvdGU6DQo+ID4gSSB0aGluayB3ZSBzaG91bGQga2VlcCB0aGUgY3VycmVudCBzY2hl
bWUsIHdoaWNoIHJlZ2lzdGVycyBhbiBtY2kNCj4gPiBmb3INCj4gDQo+IE5vIHdlIHNob3VsZG4n
dC4NCj4gDQo+ID4gZWFjaCBHSEVTIGVudHJ5LsKgwqBnaGVzX2VkYWNfcmVwb3J0X21lbV9lcnJv
cigpIGV4cGVjdHMgdGhhdCBlcnJvci0NCj4gPiByZXBvcnRpbmcgaXMgc2VyaWFsaXplZCBwZXIg
YSBHSEVTIGVudHJ5LsKgwqBTaGFyaW5nIGEgc2luZ2xlIG1jaQ0KPiA+IGFtb25nIGFsbCBHSEVT
IGVudHJpZXMgLyBlcnJvciBpbnRlcmZhY2VzIG1pZ2h0IGxlYWQgdG8gYSByYWNlDQo+ID4gY29u
ZGl0aW9uLg0KPiANCj4gU2VlIGhvdyBJIHNvbHZlZCBpdCBpbiBteSBwYXRjaHNldCBhbmQgZmVl
bCBmcmVlIHRvIHJldXNlIGl0Lg0KDQpIbW0uLi4gU29ycnksIEkgZmFpbGVkIHRvIHNlZSBob3cg
eW91ciBwYXRjaHNldCBzb2x2ZWQgaXQuICBXb3VsZCB5b3UNCm1pbmQgdG/CoGV4cGxhaW4gaG93
IGl0IGlzIGRvbmU/DQoNClRoYW5rcyENCi1Ub3NoaQ0K
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 16:24 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-14 16:24 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: rjw@rjwysocki.net, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Mon, Aug 14, 2017 at 03:57:35PM +0000, Kani, Toshimitsu wrote:
> Hmm... Sorry, I failed to see how your patchset solved it.  Would you
> mind to explain how it is done?

+static int __init ghes_edac_register(void)
 {
+       struct ghes_edac_pvt *pvt = ghes_pvt;

Only one local ghes_pvt structure.

And you handle multiple calls into ghes_edac_register() by exiting all
those which are not the first one as the first one already did all the
init.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 16:48 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-14 16:48 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-edac@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	rjw@rjwysocki.net, linux-acpi@vger.kernel.org

T24gTW9uLCAyMDE3LTA4LTE0IGF0IDE4OjI0ICswMjAwLCBCb3Jpc2xhdiBQZXRrb3Ygd3JvdGU6
DQo+IE9uIE1vbiwgQXVnIDE0LCAyMDE3IGF0IDAzOjU3OjM1UE0gKzAwMDAsIEthbmksIFRvc2hp
bWl0c3Ugd3JvdGU6DQo+ID4gSG1tLi4uIFNvcnJ5LCBJIGZhaWxlZCB0byBzZWUgaG93IHlvdXIg
cGF0Y2hzZXQgc29sdmVkIGl0LsKgwqBXb3VsZA0KPiA+IHlvdSBtaW5kIHRvwqBleHBsYWluIGhv
dyBpdCBpcyBkb25lPw0KPiANCj4gK3N0YXRpYyBpbnQgX19pbml0IGdoZXNfZWRhY19yZWdpc3Rl
cih2b2lkKQ0KPiDCoHsNCj4gK8KgwqDCoMKgwqDCoMKgc3RydWN0IGdoZXNfZWRhY19wdnQgKnB2
dCA9IGdoZXNfcHZ0Ow0KPiANCj4gT25seSBvbmUgbG9jYWwgZ2hlc19wdnQgc3RydWN0dXJlLg0K
PiANCj4gQW5kIHlvdSBoYW5kbGUgbXVsdGlwbGUgY2FsbHMgaW50byBnaGVzX2VkYWNfcmVnaXN0
ZXIoKSBieSBleGl0aW5nDQo+IGFsbCB0aG9zZSB3aGljaCBhcmUgbm90IHRoZSBmaXJzdCBvbmUg
YXMgdGhlIGZpcnN0IG9uZSBhbHJlYWR5IGRpZA0KPiBhbGwgdGhlIGluaXQuDQoNClJpZ2h0LCBi
dXQgdGhlIGlzc3VlIGlzIGhvdyBbZ2hlc19lZGFjX11yZXBvcnRfbWVtX2Vycm9yKCkgcHJvdGVj
dHMNCmZyb20gcG9zc2libGUgY29uY3VycmVudCBjYWxscyBmcm9tIG11bHRpcGxlIEdIRVMgc291
cmNlcyB3aGVuIHRoZXJlIGlzDQpvbmx5IGEgc2luZ2xlIG1jaS4NCg0KVGhhbmtzLA0KLVRvc2hp
DQog
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 17:05 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-14 17:05 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: linux-edac@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	rjw@rjwysocki.net, linux-acpi@vger.kernel.org

On Mon, Aug 14, 2017 at 04:48:57PM +0000, Kani, Toshimitsu wrote:
> Right, but the issue is how [ghes_edac_]report_mem_error() protects
> from possible concurrent calls from multiple GHES sources when there is
> only a single mci.

Do you know of an actual firmware reporting multiple errors concurrently?

GHES v2 even needs to ACK the current error first before it can read the
next one.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 17:52 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-14 17:52 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-acpi@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, rjw@rjwysocki.net

On Mon, 2017-08-14 at 19:05 +0200, Borislav Petkov wrote:
> On Mon, Aug 14, 2017 at 04:48:57PM +0000, Kani, Toshimitsu wrote:
> > Right, but the issue is how [ghes_edac_]report_mem_error() protects
> > from possible concurrent calls from multiple GHES sources when
> > there is only a single mci.
> 
> Do you know of an actual firmware reporting multiple errors
> concurrently?

I do not know.  We have multiple GHES entries, but they all use SCI. 
Since ACPICA uses a single threaded workqueue for notify handlers, they
are serialized among SCIs.

ACPI 6.2 defines multiple notification types in Table 18-383, and
ghes_proc() can be called from ghes_poll_func(), ghes_irq_func(), and
ghes_notify_sci().  So, I think it is safe to operate per an entry
basis.

> GHES v2 even needs to ACK the current error first before it can read
> the next one.

Yes, but this ACK is done per a GHES entry as well.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 18:05 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-14 18:05 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: linux-acpi@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, rjw@rjwysocki.net

On Mon, Aug 14, 2017 at 05:52:25PM +0000, Kani, Toshimitsu wrote:
> Yes, but this ACK is done per a GHES entry as well.

So is the ghes_edac_report_mem_error() call.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 18:17 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-14 18:17 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: rjw@rjwysocki.net, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

T24gTW9uLCAyMDE3LTA4LTE0IGF0IDIwOjA1ICswMjAwLCBCb3Jpc2xhdiBQZXRrb3Ygd3JvdGU6
DQo+IE9uIE1vbiwgQXVnIDE0LCAyMDE3IGF0IDA1OjUyOjI1UE0gKzAwMDAsIEthbmksIFRvc2hp
bWl0c3Ugd3JvdGU6DQo+ID4gWWVzLCBidXQgdGhpcyBBQ0sgaXMgZG9uZSBwZXIgYSBHSEVTIGVu
dHJ5IGFzIHdlbGwuDQo+IA0KPiBTbyBpcyB0aGUgZ2hlc19lZGFjX3JlcG9ydF9tZW1fZXJyb3Io
KSBjYWxsLg0KDQpSaWdodCwgZ2hlc19lZGFjX3JlcG9ydF9tZW1fZXJyb3IoKSBnZXRzIHNlcmlh
bGl6ZWQgcGVyIGEgR0hFUyBlbnRyeSwNCmJ1dCBub3QgZ2xvYmFsbHkuDQoNClRoYW5rcywNCi1U
b3NoaQ0K
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 18:35 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-14 18:35 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: rjw@rjwysocki.net, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Mon, Aug 14, 2017 at 06:17:47PM +0000, Kani, Toshimitsu wrote:
> Right, ghes_edac_report_mem_error() gets serialized per a GHES entry,
> but not globally.

Globally what?

What is the actual potential scenario for concurrency issues you see?
Example pls.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 19:02 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-14 19:02 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-edac@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	rjw@rjwysocki.net, linux-acpi@vger.kernel.org

On Mon, 2017-08-14 at 20:35 +0200, Borislav Petkov wrote:
> On Mon, Aug 14, 2017 at 06:17:47PM +0000, Kani, Toshimitsu wrote:
> > Right, ghes_edac_report_mem_error() gets serialized per a GHES
> > entry, but not globally.
> 
> Globally what?

GHES v2's ACK is not a global lock.  So, it does not guarantee that
ghes_edac_report_mem_error() never gets called concurrently.

> What is the actual potential scenario for concurrency issues you see?
> Example pls.

ghes_probe() supports multiple sources defined in
acpi_hest_notify_types.  Say, there are two entries for memory errors,
one with ACPI_HEST_NOTIFY_EXTERNAL and the other with
ACPI_HEST_NOTIFY_SCI.  They may report errors independently.  While
ghes_edac_report_mem_error() is being called from the SCI, it can be
called from the ext interrupt at a same time.

I do not know how likely we see such case, but the code should be
written according to the spec.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 19:34 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-14 19:34 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: linux-edac@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	rjw@rjwysocki.net, linux-acpi@vger.kernel.org

On Mon, Aug 14, 2017 at 07:02:15PM +0000, Kani, Toshimitsu wrote:
> I do not know how likely we see such case, but the code should be
> written according to the spec.

Well, then you'll have to make ghes_edac_report_mem_error() reentrant.
Which doesn't look that hard as the only thing it really needs from
struct ghes_edac_pvt are those string buffers. I guess you can try to do
the simplest thing first and allocate them on the stack.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 20:17 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-14 20:17 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-acpi@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, rjw@rjwysocki.net

On Mon, 2017-08-14 at 21:34 +0200, Borislav Petkov wrote:
> On Mon, Aug 14, 2017 at 07:02:15PM +0000, Kani, Toshimitsu wrote:
> > I do not know how likely we see such case, but the code should be
> > written according to the spec.
> 
> Well, then you'll have to make ghes_edac_report_mem_error()
> reentrant. Which doesn't look that hard as the only thing it really
> needs from struct ghes_edac_pvt are those string buffers. I guess you
> can try to do the simplest thing first and allocate them on the
> stack.

ghes_edac_report_mem_error() is reentrant as it is now.  I think the
current code design of allocating mci & ghes_edac_pvt for each GHES
source entry makes sense.  edac_raw_mc_handle_error() also has the same
  expectation that the call is serialized per mci.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-14 20:39 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-14 20:39 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: linux-acpi@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org, rjw@rjwysocki.net

On Mon, Aug 14, 2017 at 08:17:54PM +0000, Kani, Toshimitsu wrote:
> I think the current code design of allocating mci & ghes_edac_pvt for
> each GHES source entry makes sense.

And I don't.

> edac_raw_mc_handle_error() also has the same expectation that the call
> is serialized per mci.

There's no such thing as "per mci" if the driver scans *all DIMMs* per
register call. If it does it this way, then it is only one mci.

It is actually wrong right now because if you register more than one
mci and you do edac_inc_ce_error()/edac_inc_ue_error(), potentially
different counters get incremented for the same errors. Exactly because
each instance registered is *wrongly* responsible for all DIMMs on the
system.

So you either need to partition the DIMMs per mci (which I can't imagine
how it would work) or introduce locking when incrementing the mci->
counters.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-15 15:35 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-15 15:35 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: rjw@rjwysocki.net, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Mon, 2017-08-14 at 22:39 +0200, Borislav Petkov wrote:
> On Mon, Aug 14, 2017 at 08:17:54PM +0000, Kani, Toshimitsu wrote:
> > I think the current code design of allocating mci & ghes_edac_pvt
> > for each GHES source entry makes sense.
> 
> And I don't.
> 
> > edac_raw_mc_handle_error() also has the same expectation that the
> > call is serialized per mci.
> 
> There's no such thing as "per mci" if the driver scans *all DIMMs*
> per register call. If it does it this way, then it is only one mci.

ghes_edac instantiates an mci as a pseudo device representing a GHES
error source.  Each error source associates with all DIMMs, and may
report errors independently.  As ghes_edac is an GHES error-reporting
wrapper to edac, this abstraction makes sense.

> It is actually wrong right now because if you register more than one
> mci and you do edac_inc_ce_error()/edac_inc_ue_error(), potentially
> different counters get incremented for the same errors. Exactly
> because each instance registered is *wrongly* responsible for all
> DIMMs on the system.

I do not see a problem in having counters for each GHES error source. 
This is just statistics info, and ghes_edac does not expect any OS
action from the counters.

> So you either need to partition the DIMMs per mci (which I can't
> imagine how it would work) or introduce locking when incrementing the
> mci->counters.

I do not think changing the calling convention to edac library
interfaces is a good idea for a special case like ghes_edac.  Such
changes can be a burden for us going forward.  I think ghes_edac just
needs to work with the current prerequisite.

User apps like ras-mc-ctl works as expected for a given (not-so-great)
DIMM info from SMBIOS as well.  I do not see a probelm from user
perspective, either.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-15 15:48 Luck, Tony
  0 siblings, 0 replies; 36+ messages in thread
From: Luck, Tony @ 2017-08-15 15:48 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: bp@alien8.de, rjw@rjwysocki.net, lenb@kernel.org,
	mchehab@kernel.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Tue, Aug 15, 2017 at 08:35:51AM -0700, Kani, Toshimitsu wrote:
> User apps like ras-mc-ctl works as expected for a given (not-so-great)
> DIMM info from SMBIOS as well.  I do not see a probelm from user
> perspective, either.

Won't the user see all their DIMMs reported for each memory controller
under /sys/devices/system/edac/mc/mc*/dimm* ?

That sounds confusing.

-Tony
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-15 15:50 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-15 15:50 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: rjw@rjwysocki.net, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Tue, Aug 15, 2017 at 03:35:51PM +0000, Kani, Toshimitsu wrote:
> ghes_edac instantiates an mci as a pseudo device representing a GHES
> error source.  Each error source associates with all DIMMs, and may
> report errors independently.  As ghes_edac is an GHES error-reporting
> wrapper to edac, this abstraction makes sense.

Bullshit.

An MCI is a memory controller descriptor. That doesn't fit the GHES
platform devices that get probed. GHES platform device != MCI. How many
times do I need to say this for it to get through to you?

> I do not see a problem in having counters for each GHES error source.

And the error counters of that "simulated" mci get incremented depending
on which pointer gets passed in from GHES? More bullshit.

> This is just statistics info, and ghes_edac does not expect any OS
> action from the counters.

So let me know if you don't want to do it and rather would prefer to
pointlessly debate. I certainly don't want to waste my time debating.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-15 15:53 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-15 15:53 UTC (permalink / raw)
  To: tony.luck@intel.com
  Cc: linux-acpi@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
	bp@alien8.de, rjw@rjwysocki.net

On Tue, 2017-08-15 at 08:48 -0700, Luck, Tony wrote:
> On Tue, Aug 15, 2017 at 08:35:51AM -0700, Kani, Toshimitsu wrote:
> > User apps like ras-mc-ctl works as expected for a given (not-so-
> > great) DIMM info from SMBIOS as well.  I do not see a probelm from
> > user perspective, either.
> 
> Won't the user see all their DIMMs reported for each memory
> controller under /sys/devices/system/edac/mc/mc*/dimm* ?
> 
> That sounds confusing.

ghes_edac only fills dimm_info to the first mci.  So, users do not see
duplicated info.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-15 16:19 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-15 16:19 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-edac@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	tony.luck@intel.com, linux-kernel@vger.kernel.org,
	rjw@rjwysocki.net, linux-acpi@vger.kernel.org

On Tue, 2017-08-15 at 17:50 +0200, Borislav Petkov wrote:
> On Tue, Aug 15, 2017 at 03:35:51PM +0000, Kani, Toshimitsu wrote:
> > ghes_edac instantiates an mci as a pseudo device representing a
> > GHES error source.  Each error source associates with all DIMMs,
> > and may report errors independently.  As ghes_edac is an GHES
> > error-reporting wrapper to edac, this abstraction makes sense.
> 
> Bullshit.
> 
> An MCI is a memory controller descriptor. That doesn't fit the GHES
> platform devices that get probed. GHES platform device != MCI. How
> many times do I need to say this for it to get through to you?

Right, but it has to be a "pseudo" device for ghes_edac.  There is no
memory controller info available.  A single mci does not make it a real
memory controller, either.

> > I do not see a problem in having counters for each GHES error
> > source.
> 
> And the error counters of that "simulated" mci get incremented
> depending on which pointer gets passed in from GHES? More bullshit.
>
> > This is just statistics info, and ghes_edac does not expect any OS
> > action from the counters.
> 
> So let me know if you don't want to do it and rather would prefer to
> pointlessly debate. I certainly don't want to waste my time debating.

Yes, ghes_edac refactoring like this should be considered a separate
item.  My patchset is aimed to introduce a platform-check to attach
ghes_edac on supported platforms.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16  8:29 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-16  8:29 UTC (permalink / raw)
  To: Luck, Tony, Steven Rostedt
  Cc: Kani, Toshimitsu, rjw@rjwysocki.net, lenb@kernel.org,
	mchehab@kernel.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Tue, Aug 15, 2017 at 08:48:16AM -0700, Luck, Tony wrote:
> Won't the user see all their DIMMs reported for each memory controller
> under /sys/devices/system/edac/mc/mc*/dimm* ?
> 
> That sounds confusing.

Right, and adding the locking was really easy. If only people would
debate less and actually try to do what they're being advised to.
But not really: if you wanna have something done, you have to do it
yourself.

Anyway, I think I have a box to run it to, lemme go find it.

Steve, pls check my locking. It looks straightforward to me but I might
be missing some corner case.

Thx.

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 6f80eb65c26c..a22fabef4791 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -28,10 +28,14 @@ struct ghes_edac_pvt {
 	char msg[80];
 };
 
-static LIST_HEAD(ghes_reglist);
-static DEFINE_MUTEX(ghes_edac_lock);
-static int ghes_edac_mc_num;
+static struct ghes_edac_pvt *ghes_pvt;
 
+/*
+ * Sync with other, potentially concurrent callers of
+ * ghes_edac_report_mem_error(). We don't know what the
+ * "inventive" firmware would do.
+ */
+static DEFINE_SPINLOCK(ghes_lock);
 
 /* Memory Device - Type 17 of SMBIOS spec */
 struct memdev_dmi_entry {
@@ -169,14 +173,11 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
 	enum hw_event_mc_err_type type;
 	struct edac_raw_error_desc *e;
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt = NULL;
+	struct ghes_edac_pvt *pvt = ghes_pvt;
+	unsigned long flags;
 	char *p;
 	u8 grain_bits;
 
-	list_for_each_entry(pvt, &ghes_reglist, list) {
-		if (ghes == pvt->ghes)
-			break;
-	}
 	if (!pvt) {
 		pr_err("Internal error: Can't find EDAC structure\n");
 		return;
@@ -398,8 +399,16 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
 		       (e->page_frame_number << PAGE_SHIFT) | e->offset_in_page,
 		       grain_bits, e->syndrome, pvt->detail_location);
 
-	/* Report the error via EDAC API */
+	/*
+	 * We can do the locking below because GHES defers error processing
+	 * from NMI to IRQ context. Whenever that changes, we'd at least
+	 * know.
+	 */
+	WARN_ON_ONCE(in_nmi());
+
+	spin_lock_irqsave(&ghes_lock, flags);
 	edac_raw_mc_handle_error(type, mci, e);
+	spin_unlock_irqrestore(&ghes_lock, flags);
 }
 EXPORT_SYMBOL_GPL(ghes_edac_report_mem_error);
 
@@ -409,9 +418,14 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	int rc, num_dimm = 0;
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
-	struct ghes_edac_pvt *pvt;
 	struct ghes_edac_dimm_fill dimm_fill;
 
+	/*
+	 * We have only one logical memory controller to which all DIMMs belong.
+	 */
+	if (ghes_pvt)
+		return 0;
+
 	/* Get the number of DIMMs */
 	dmi_walk(ghes_edac_count_dimms, &num_dimm);
 
@@ -425,26 +439,17 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	layers[0].size = num_dimm;
 	layers[0].is_virt_csrow = true;
 
-	/*
-	 * We need to serialize edac_mc_alloc() and edac_mc_add_mc(),
-	 * to avoid duplicated memory controller numbers
-	 */
-	mutex_lock(&ghes_edac_lock);
-	mci = edac_mc_alloc(ghes_edac_mc_num, ARRAY_SIZE(layers), layers,
-			    sizeof(*pvt));
+	mci = edac_mc_alloc(1, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
 	if (!mci) {
 		pr_info("Can't allocate memory for EDAC data\n");
-		mutex_unlock(&ghes_edac_lock);
 		return -ENOMEM;
 	}
 
-	pvt = mci->pvt_info;
-	memset(pvt, 0, sizeof(*pvt));
-	list_add_tail(&pvt->list, &ghes_reglist);
-	pvt->ghes = ghes;
-	pvt->mci  = mci;
-	mci->pdev = dev;
+	ghes_pvt	= mci->pvt_info;
+	ghes_pvt->ghes	= ghes;
+	ghes_pvt->mci	= mci;
 
+	mci->pdev = dev;
 	mci->mtype_cap = MEM_FLAG_EMPTY;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE;
 	mci->edac_cap = EDAC_FLAG_NONE;
@@ -452,36 +457,23 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	mci->ctl_name = "ghes_edac";
 	mci->dev_name = "ghes";
 
-	if (!ghes_edac_mc_num) {
-		if (!fake) {
-			pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
-			pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
-			pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
-			pr_info("If you find incorrect reports, please contact your hardware vendor\n");
-			pr_info("to correct its BIOS.\n");
-			pr_info("This system has %d DIMM sockets.\n",
-				num_dimm);
-		} else {
-			pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
-			pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
-			pr_info("work on such system. Use this driver with caution\n");
-		}
+	if (!fake) {
+		pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
+		pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
+		pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
+		pr_info("If you find incorrect reports, please contact your hardware vendor\n");
+		pr_info("to correct its BIOS.\n");
+		pr_info("This system has %d DIMM sockets.\n", num_dimm);
+	} else {
+		pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
+		pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
+		pr_info("work on such system. Use this driver with caution\n");
 	}
 
 	if (!fake) {
-		/*
-		 * Fill DIMM info from DMI for the memory controller #0
-		 *
-		 * Keep it in blank for the other memory controllers, as
-		 * there's no reliable way to properly credit each DIMM to
-		 * the memory controller, as different BIOSes fill the
-		 * DMI bank location fields on different ways
-		 */
-		if (!ghes_edac_mc_num) {
-			dimm_fill.count = 0;
-			dimm_fill.mci = mci;
-			dmi_walk(ghes_edac_dmidecode, &dimm_fill);
-		}
+		dimm_fill.count = 0;
+		dimm_fill.mci = mci;
+		dmi_walk(ghes_edac_dmidecode, &dimm_fill);
 	} else {
 		struct dimm_info *dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
 						       mci->n_layers, 0, 0, 0);
@@ -497,12 +489,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	if (rc < 0) {
 		pr_info("Can't register at EDAC core\n");
 		edac_mc_free(mci);
-		mutex_unlock(&ghes_edac_lock);
 		return -ENODEV;
 	}
-
-	ghes_edac_mc_num++;
-	mutex_unlock(&ghes_edac_lock);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(ghes_edac_register);
@@ -510,15 +498,9 @@ EXPORT_SYMBOL_GPL(ghes_edac_register);
 void ghes_edac_unregister(struct ghes *ghes)
 {
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt, *tmp;
-
-	list_for_each_entry_safe(pvt, tmp, &ghes_reglist, list) {
-		if (ghes == pvt->ghes) {
-			mci = pvt->mci;
-			edac_mc_del_mc(mci->pdev);
-			edac_mc_free(mci);
-			list_del(&pvt->list);
-		}
-	}
+
+	mci = ghes_pvt->mci;
+	edac_mc_del_mc(mci->pdev);
+	edac_mc_free(mci);
 }
 EXPORT_SYMBOL_GPL(ghes_edac_unregister);

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 11:29 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-16 11:29 UTC (permalink / raw)
  To: Luck, Tony, Steven Rostedt
  Cc: Kani, Toshimitsu, rjw@rjwysocki.net, lenb@kernel.org,
	mchehab@kernel.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Wed, Aug 16, 2017 at 10:29:31AM +0200, Borislav Petkov wrote:
> Anyway, I think I have a box to run it to, lemme go find it.

Seems to boot.

It's a whole another story whether it actually works. :-)

Need some EINJ capabilities urgently but finding a box where it works
reliably is like finding gold.

[    7.784960] ERST: Failed to get Error Log Address Range.
[    7.795905] EDAC DEBUG: edac_mc_alloc: allocating 2444 bytes for mci data (16 dimms, 16 csrows/ch
annels)
[    7.815266] ghes_edac: This EDAC driver relies on BIOS to enumerate memory and get error reports.
[    7.833372] ghes_edac: Unfortunately, not all BIOSes reflect the memory layout correctly.
[    7.850093] ghes_edac: So, the end result of using this driver varies from vendor to vendor.
[    7.867322] ghes_edac: If you find incorrect reports, please contact your hardware vendor
[    7.884034] ghes_edac: to correct its BIOS.
[    7.892599] ghes_edac: This system has 16 DIMM sockets.
[    7.903274] EDAC DEBUG: ghes_edac_dmidecode: DIMM8: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    7.920337] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    7.936532] EDAC DEBUG: ghes_edac_dmidecode: DIMM9: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    7.953591] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    7.969778] EDAC DEBUG: ghes_edac_dmidecode: DIMM10: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    7.987010] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.003199] EDAC DEBUG: ghes_edac_dmidecode: DIMM11: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.020432] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.036618] EDAC DEBUG: ghes_edac_dmidecode: DIMM12: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.053848] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.070038] EDAC DEBUG: ghes_edac_dmidecode: DIMM13: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.087268] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.103456] EDAC DEBUG: ghes_edac_dmidecode: DIMM14: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.120687] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.128053] tsc: Refined TSC clocksource calibration: 2099.999 MHz
[    8.128173] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1e452fc488e, max_idle_ns: 44
0795307124 ns
[    8.169800] EDAC DEBUG: ghes_edac_dmidecode: DIMM15: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.187043] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.203230] EDAC DEBUG: edac_mc_add_mc_with_groups: 
[    8.213363] EDAC DEBUG: edac_create_sysfs_mci_device: creating bus mc1
[    8.226643] EDAC DEBUG: edac_create_sysfs_mci_device: creating device mc1
[    8.240450] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm8, located at memory 8 
[    8.257362] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm8
[    8.272506] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm9, located at memory 9 
[    8.289409] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm9
[    8.304556] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm10, located at memory 10 
[    8.321808] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm10
[    8.337126] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm11, located at memory 11 
[    8.354377] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm11
[    8.369697] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm12, located at memory 12 
[    8.386946] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm12
[    8.402266] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm13, located at memory 13 
[    8.419516] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm13
[    8.434835] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm14, located at memory 14 
[    8.452094] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm14
[    8.467423] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm15, located at memory 15 
[    8.484674] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm15
[    8.499994] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow8
[    8.516215] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow9
[    8.532433] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow10
[    8.548824] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow11
[    8.565203] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow12
[    8.581585] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow13
[    8.597964] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow14
[    8.614347] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow15
[    8.630744] EDAC MC1: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[    8.650073] [Firmware Warn]: GHES: Poll interval is 0 for generic hardware error source: 1, disabled.
[    8.669012] GHES: APEI firmware first mode is enabled by WHEA _OSC.
[    8.681940] Serial: 8250/16550 driver, 32 ports, IRQ sharing disabled

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 13:59 Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2017-08-16 13:59 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Kani, Toshimitsu, rjw@rjwysocki.net, lenb@kernel.org,
	mchehab@kernel.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Wed, 16 Aug 2017 10:29:31 +0200
Borislav Petkov <bp@alien8.de> wrote:


> ---
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 6f80eb65c26c..a22fabef4791 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -28,10 +28,14 @@ struct ghes_edac_pvt {
>  	char msg[80];
>  };
>  
> -static LIST_HEAD(ghes_reglist);
> -static DEFINE_MUTEX(ghes_edac_lock);
> -static int ghes_edac_mc_num;
> +static struct ghes_edac_pvt *ghes_pvt;
>  
> +/*
> + * Sync with other, potentially concurrent callers of
> + * ghes_edac_report_mem_error(). We don't know what the
> + * "inventive" firmware would do.
> + */
> +static DEFINE_SPINLOCK(ghes_lock);
>  
>  /* Memory Device - Type 17 of SMBIOS spec */
>  struct memdev_dmi_entry {
> @@ -169,14 +173,11 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
>  	enum hw_event_mc_err_type type;
>  	struct edac_raw_error_desc *e;
>  	struct mem_ctl_info *mci;
> -	struct ghes_edac_pvt *pvt = NULL;
> +	struct ghes_edac_pvt *pvt = ghes_pvt;
> +	unsigned long flags;
>  	char *p;
>  	u8 grain_bits;
>  
> -	list_for_each_entry(pvt, &ghes_reglist, list) {
> -		if (ghes == pvt->ghes)
> -			break;
> -	}
>  	if (!pvt) {
>  		pr_err("Internal error: Can't find EDAC structure\n");
>  		return;
> @@ -398,8 +399,16 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
>  		       (e->page_frame_number << PAGE_SHIFT) | e->offset_in_page,
>  		       grain_bits, e->syndrome, pvt->detail_location);
>  
> -	/* Report the error via EDAC API */
> +	/*
> +	 * We can do the locking below because GHES defers error processing
> +	 * from NMI to IRQ context. Whenever that changes, we'd at least
> +	 * know.
> +	 */
> +	WARN_ON_ONCE(in_nmi());

Should the above be:

	if (WARN_ON_ONCE(in_nmi()))
		return;

To prevent a deadlock? Or do we not care?

> +
> +	spin_lock_irqsave(&ghes_lock, flags);
>  	edac_raw_mc_handle_error(type, mci, e);
> +	spin_unlock_irqrestore(&ghes_lock, flags);

The above looks fine, as long as there's nothing before it that needs
synchronization.

>  }
>  EXPORT_SYMBOL_GPL(ghes_edac_report_mem_error);
>  
> @@ -409,9 +418,14 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	int rc, num_dimm = 0;
>  	struct mem_ctl_info *mci;
>  	struct edac_mc_layer layers[1];
> -	struct ghes_edac_pvt *pvt;
>  	struct ghes_edac_dimm_fill dimm_fill;
>  
> +	/*
> +	 * We have only one logical memory controller to which all DIMMs belong.
> +	 */
> +	if (ghes_pvt)
> +		return 0;

What's the likelihood of two calls to ghes_edac_register being done
simultaneously?  Because two calls at the same time will get past this.

-- Steve


> +
>  	/* Get the number of DIMMs */
>  	dmi_walk(ghes_edac_count_dimms, &num_dimm);
>  
> @@ -425,26 +439,17 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	layers[0].size = num_dimm;
>  	layers[0].is_virt_csrow = true;
>  
> -	/*
> -	 * We need to serialize edac_mc_alloc() and edac_mc_add_mc(),
> -	 * to avoid duplicated memory controller numbers
> -	 */
> -	mutex_lock(&ghes_edac_lock);
> -	mci = edac_mc_alloc(ghes_edac_mc_num, ARRAY_SIZE(layers), layers,
> -			    sizeof(*pvt));
> +	mci = edac_mc_alloc(1, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
>  	if (!mci) {
>  		pr_info("Can't allocate memory for EDAC data\n");
> -		mutex_unlock(&ghes_edac_lock);
>  		return -ENOMEM;
>  	}
>  
> -	pvt = mci->pvt_info;
> -	memset(pvt, 0, sizeof(*pvt));
> -	list_add_tail(&pvt->list, &ghes_reglist);
> -	pvt->ghes = ghes;
> -	pvt->mci  = mci;
> -	mci->pdev = dev;
> +	ghes_pvt	= mci->pvt_info;
> +	ghes_pvt->ghes	= ghes;
> +	ghes_pvt->mci	= mci;
>  
> +	mci->pdev = dev;
>  	mci->mtype_cap = MEM_FLAG_EMPTY;
>  	mci->edac_ctl_cap = EDAC_FLAG_NONE;
>  	mci->edac_cap = EDAC_FLAG_NONE;
> @@ -452,36 +457,23 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	mci->ctl_name = "ghes_edac";
>  	mci->dev_name = "ghes";
>  
> -	if (!ghes_edac_mc_num) {
> -		if (!fake) {
> -			pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
> -			pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
> -			pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
> -			pr_info("If you find incorrect reports, please contact your hardware vendor\n");
> -			pr_info("to correct its BIOS.\n");
> -			pr_info("This system has %d DIMM sockets.\n",
> -				num_dimm);
> -		} else {
> -			pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
> -			pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
> -			pr_info("work on such system. Use this driver with caution\n");
> -		}
> +	if (!fake) {
> +		pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
> +		pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
> +		pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
> +		pr_info("If you find incorrect reports, please contact your hardware vendor\n");
> +		pr_info("to correct its BIOS.\n");
> +		pr_info("This system has %d DIMM sockets.\n", num_dimm);
> +	} else {
> +		pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
> +		pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
> +		pr_info("work on such system. Use this driver with caution\n");
>  	}
>  
>  	if (!fake) {
> -		/*
> -		 * Fill DIMM info from DMI for the memory controller #0
> -		 *
> -		 * Keep it in blank for the other memory controllers, as
> -		 * there's no reliable way to properly credit each DIMM to
> -		 * the memory controller, as different BIOSes fill the
> -		 * DMI bank location fields on different ways
> -		 */
> -		if (!ghes_edac_mc_num) {
> -			dimm_fill.count = 0;
> -			dimm_fill.mci = mci;
> -			dmi_walk(ghes_edac_dmidecode, &dimm_fill);
> -		}
> +		dimm_fill.count = 0;
> +		dimm_fill.mci = mci;
> +		dmi_walk(ghes_edac_dmidecode, &dimm_fill);
>  	} else {
>  		struct dimm_info *dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
>  						       mci->n_layers, 0, 0, 0);
> @@ -497,12 +489,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>  	if (rc < 0) {
>  		pr_info("Can't register at EDAC core\n");
>  		edac_mc_free(mci);
> -		mutex_unlock(&ghes_edac_lock);
>  		return -ENODEV;
>  	}
> -
> -	ghes_edac_mc_num++;
> -	mutex_unlock(&ghes_edac_lock);
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(ghes_edac_register);
> @@ -510,15 +498,9 @@ EXPORT_SYMBOL_GPL(ghes_edac_register);
>  void ghes_edac_unregister(struct ghes *ghes)
>  {
>  	struct mem_ctl_info *mci;
> -	struct ghes_edac_pvt *pvt, *tmp;
> -
> -	list_for_each_entry_safe(pvt, tmp, &ghes_reglist, list) {
> -		if (ghes == pvt->ghes) {
> -			mci = pvt->mci;
> -			edac_mc_del_mc(mci->pdev);
> -			edac_mc_free(mci);
> -			list_del(&pvt->list);
> -		}
> -	}
> +
> +	mci = ghes_pvt->mci;
> +	edac_mc_del_mc(mci->pdev);
> +	edac_mc_free(mci);
>  }
>  EXPORT_SYMBOL_GPL(ghes_edac_unregister);
>
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 14:03 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-16 14:03 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Luck, Tony, Kani, Toshimitsu, rjw@rjwysocki.net, lenb@kernel.org,
	mchehab@kernel.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Wed, Aug 16, 2017 at 09:59:01AM -0400, Steven Rostedt wrote:
> Should the above be:
> 
> 	if (WARN_ON_ONCE(in_nmi()))
> 		return;
> 
> To prevent a deadlock? Or do we not care?

Yeah, better this way.

> What's the likelihood of two calls to ghes_edac_register being done
> simultaneously?  Because two calls at the same time will get past this.

Well, that thing gets called per GHES platform device and last time I
checked they do get probed back-to-back but I'll check that again.

Thanks.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 14:22 Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2017-08-16 14:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Kani, Toshimitsu, rjw@rjwysocki.net, lenb@kernel.org,
	mchehab@kernel.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Wed, 16 Aug 2017 16:03:50 +0200
Borislav Petkov <bp@alien8.de> wrote:

> > What's the likelihood of two calls to ghes_edac_register being done
> > simultaneously?  Because two calls at the same time will get past this.  
> 
> Well, that thing gets called per GHES platform device and last time I
> checked they do get probed back-to-back but I'll check that again.

Maybe keep that original mutex just in case.

-- Steve
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 15:26 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-16 15:26 UTC (permalink / raw)
  To: tony.luck@intel.com, rostedt@goodmis.org, bp@alien8.de
  Cc: linux-edac@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	linux-kernel@vger.kernel.org, rjw@rjwysocki.net,
	linux-acpi@vger.kernel.org

On Wed, 2017-08-16 at 10:29 +0200, Borislav Petkov wrote:
> On Tue, Aug 15, 2017 at 08:48:16AM -0700, Luck, Tony wrote:
> > Won't the user see all their DIMMs reported for each memory
> > controller
> > under /sys/devices/system/edac/mc/mc*/dimm* ?
> > 
> > That sounds confusing.
> 
> Right, and adding the locking was really easy. If only people would
> debate less and actually try to do what they're being advised to.
> But not really: if you wanna have something done, you have to do it
> yourself.

Sorry, but I did not agree on allowing concurrent accesses to mci...

 /* Memory Device - Type 17 of SMBIOS spec */
>  struct memdev_dmi_entry {
> @@ -169,14 +173,11 @@ void ghes_edac_report_mem_error(struct ghes
> *ghes, int sev,
>  	enum hw_event_mc_err_type type;
>  	struct edac_raw_error_desc *e;
>  	struct mem_ctl_info *mci;
> -	struct ghes_edac_pvt *pvt = NULL;
> +	struct ghes_edac_pvt *pvt = ghes_pvt;
> +	unsigned long flags;
>  	char *p;
>  	u8 grain_bits;

I believe you now need to protect from a race condition that a single
mci and pvt can be initialized / consumed from multiple threads.  This
protection is missing in your patch.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 16:42 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-16 16:42 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: tony.luck@intel.com, rostedt@goodmis.org,
	linux-edac@vger.kernel.org, lenb@kernel.org, mchehab@kernel.org,
	linux-kernel@vger.kernel.org, rjw@rjwysocki.net,
	linux-acpi@vger.kernel.org

On Wed, Aug 16, 2017 at 03:26:04PM +0000, Kani, Toshimitsu wrote:
> I believe you now need to protect from a race condition that a single
> mci and pvt can be initialized / consumed from multiple threads.  This
> protection is missing in your patch.

Easy. Done.

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 6f80eb65c26c..d4089c2980ef 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -28,10 +28,14 @@ struct ghes_edac_pvt {
 	char msg[80];
 };
 
-static LIST_HEAD(ghes_reglist);
-static DEFINE_MUTEX(ghes_edac_lock);
-static int ghes_edac_mc_num;
+static struct ghes_edac_pvt *ghes_pvt;
 
+/*
+ * Sync with other, potentially concurrent callers of
+ * ghes_edac_report_mem_error(). We don't know what the
+ * "inventive" firmware would do.
+ */
+static DEFINE_SPINLOCK(ghes_lock);
 
 /* Memory Device - Type 17 of SMBIOS spec */
 struct memdev_dmi_entry {
@@ -169,18 +173,26 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
 	enum hw_event_mc_err_type type;
 	struct edac_raw_error_desc *e;
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt = NULL;
+	struct ghes_edac_pvt *pvt = ghes_pvt;
+	unsigned long flags;
 	char *p;
 	u8 grain_bits;
 
-	list_for_each_entry(pvt, &ghes_reglist, list) {
-		if (ghes == pvt->ghes)
-			break;
-	}
 	if (!pvt) {
 		pr_err("Internal error: Can't find EDAC structure\n");
 		return;
 	}
+
+	/*
+	 * We can do the locking below because GHES defers error processing
+	 * from NMI to IRQ context. Whenever that changes, we'd at least
+	 * know.
+	 */
+	if (WARN_ON_ONCE(in_nmi()))
+		return;
+
+	spin_lock_irqsave(&ghes_lock, flags);
+
 	mci = pvt->mci;
 	e = &mci->error_desc;
 
@@ -398,8 +410,8 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
 		       (e->page_frame_number << PAGE_SHIFT) | e->offset_in_page,
 		       grain_bits, e->syndrome, pvt->detail_location);
 
-	/* Report the error via EDAC API */
 	edac_raw_mc_handle_error(type, mci, e);
+	spin_unlock_irqrestore(&ghes_lock, flags);
 }
 EXPORT_SYMBOL_GPL(ghes_edac_report_mem_error);
 
@@ -409,9 +421,14 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	int rc, num_dimm = 0;
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
-	struct ghes_edac_pvt *pvt;
 	struct ghes_edac_dimm_fill dimm_fill;
 
+	/*
+	 * We have only one logical memory controller to which all DIMMs belong.
+	 */
+	if (ghes_pvt)
+		return 0;
+
 	/* Get the number of DIMMs */
 	dmi_walk(ghes_edac_count_dimms, &num_dimm);
 
@@ -425,26 +442,17 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	layers[0].size = num_dimm;
 	layers[0].is_virt_csrow = true;
 
-	/*
-	 * We need to serialize edac_mc_alloc() and edac_mc_add_mc(),
-	 * to avoid duplicated memory controller numbers
-	 */
-	mutex_lock(&ghes_edac_lock);
-	mci = edac_mc_alloc(ghes_edac_mc_num, ARRAY_SIZE(layers), layers,
-			    sizeof(*pvt));
+	mci = edac_mc_alloc(1, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
 	if (!mci) {
 		pr_info("Can't allocate memory for EDAC data\n");
-		mutex_unlock(&ghes_edac_lock);
 		return -ENOMEM;
 	}
 
-	pvt = mci->pvt_info;
-	memset(pvt, 0, sizeof(*pvt));
-	list_add_tail(&pvt->list, &ghes_reglist);
-	pvt->ghes = ghes;
-	pvt->mci  = mci;
-	mci->pdev = dev;
+	ghes_pvt	= mci->pvt_info;
+	ghes_pvt->ghes	= ghes;
+	ghes_pvt->mci	= mci;
 
+	mci->pdev = dev;
 	mci->mtype_cap = MEM_FLAG_EMPTY;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE;
 	mci->edac_cap = EDAC_FLAG_NONE;
@@ -452,36 +460,23 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	mci->ctl_name = "ghes_edac";
 	mci->dev_name = "ghes";
 
-	if (!ghes_edac_mc_num) {
-		if (!fake) {
-			pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
-			pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
-			pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
-			pr_info("If you find incorrect reports, please contact your hardware vendor\n");
-			pr_info("to correct its BIOS.\n");
-			pr_info("This system has %d DIMM sockets.\n",
-				num_dimm);
-		} else {
-			pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
-			pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
-			pr_info("work on such system. Use this driver with caution\n");
-		}
+	if (!fake) {
+		pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
+		pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
+		pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
+		pr_info("If you find incorrect reports, please contact your hardware vendor\n");
+		pr_info("to correct its BIOS.\n");
+		pr_info("This system has %d DIMM sockets.\n", num_dimm);
+	} else {
+		pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
+		pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
+		pr_info("work on such system. Use this driver with caution\n");
 	}
 
 	if (!fake) {
-		/*
-		 * Fill DIMM info from DMI for the memory controller #0
-		 *
-		 * Keep it in blank for the other memory controllers, as
-		 * there's no reliable way to properly credit each DIMM to
-		 * the memory controller, as different BIOSes fill the
-		 * DMI bank location fields on different ways
-		 */
-		if (!ghes_edac_mc_num) {
-			dimm_fill.count = 0;
-			dimm_fill.mci = mci;
-			dmi_walk(ghes_edac_dmidecode, &dimm_fill);
-		}
+		dimm_fill.count = 0;
+		dimm_fill.mci = mci;
+		dmi_walk(ghes_edac_dmidecode, &dimm_fill);
 	} else {
 		struct dimm_info *dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
 						       mci->n_layers, 0, 0, 0);
@@ -497,12 +492,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	if (rc < 0) {
 		pr_info("Can't register at EDAC core\n");
 		edac_mc_free(mci);
-		mutex_unlock(&ghes_edac_lock);
 		return -ENODEV;
 	}
-
-	ghes_edac_mc_num++;
-	mutex_unlock(&ghes_edac_lock);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(ghes_edac_register);
@@ -510,15 +501,9 @@ EXPORT_SYMBOL_GPL(ghes_edac_register);
 void ghes_edac_unregister(struct ghes *ghes)
 {
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt, *tmp;
-
-	list_for_each_entry_safe(pvt, tmp, &ghes_reglist, list) {
-		if (ghes == pvt->ghes) {
-			mci = pvt->mci;
-			edac_mc_del_mc(mci->pdev);
-			edac_mc_free(mci);
-			list_del(&pvt->list);
-		}
-	}
+
+	mci = ghes_pvt->mci;
+	edac_mc_del_mc(mci->pdev);
+	edac_mc_free(mci);
 }
 EXPORT_SYMBOL_GPL(ghes_edac_unregister);

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 17:28 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-16 17:28 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-kernel@vger.kernel.org, rostedt@goodmis.org,
	mchehab@kernel.org, rjw@rjwysocki.net, tony.luck@intel.com,
	lenb@kernel.org, linux-acpi@vger.kernel.org,
	linux-edac@vger.kernel.org

On Wed, 2017-08-16 at 18:42 +0200, Borislav Petkov wrote:
> On Wed, Aug 16, 2017 at 03:26:04PM +0000, Kani, Toshimitsu wrote:
> > I believe you now need to protect from a race condition that a
> > single mci and pvt can be initialized / consumed from multiple
> > threads.  This protection is missing in your patch.
> 
> Easy. Done.

Assuming this big spinlock works, yes, this addresses my concern that
it does not allow concurrent accesses to mci. :-)

I will test the patch with an SCI when I got a chance.  I won't be able
to test other notification types or race conditions, though.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 17:31 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-16 17:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Luck, Tony, Kani, Toshimitsu, rjw@rjwysocki.net, lenb@kernel.org,
	mchehab@kernel.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Wed, Aug 16, 2017 at 10:22:49AM -0400, Steven Rostedt wrote:
> Maybe keep that original mutex just in case.

Let's do the elegant thing:

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index d4089c2980ef..386e04a7bda0 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -28,6 +28,7 @@ struct ghes_edac_pvt {
 	char msg[80];
 };
 
+static atomic_t ghes_init = ATOMIC_INIT(0);
 static struct ghes_edac_pvt *ghes_pvt;
 
 /*
@@ -426,7 +427,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	/*
 	 * We have only one logical memory controller to which all DIMMs belong.
 	 */
-	if (ghes_pvt)
+	if (atomic_inc_return(&ghes_init) > 1)
 		return 0;
 
 	/* Get the number of DIMMs */

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 17:40 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-16 17:40 UTC (permalink / raw)
  To: Kani, Toshimitsu
  Cc: linux-kernel@vger.kernel.org, rostedt@goodmis.org,
	mchehab@kernel.org, rjw@rjwysocki.net, tony.luck@intel.com,
	lenb@kernel.org, linux-acpi@vger.kernel.org,
	linux-edac@vger.kernel.org

On Wed, Aug 16, 2017 at 05:28:50PM +0000, Kani, Toshimitsu wrote:
> Assuming this big spinlock works, yes, this addresses my concern that

You mean, lengthy locked section. We can always switch to on-stack
buffers if there are issues or even to something more fancy like
genpool.

> I will test the patch with an SCI when I got a chance.  I won't be able
> to test other notification types or race conditions, though.

Thanks, here's the latest version with the atomic registration too.
---
 drivers/edac/ghes_edac.c | 116 +++++++++++++++++++++--------------------------
 1 file changed, 51 insertions(+), 65 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 6f80eb65c26c..386e04a7bda0 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -28,10 +28,15 @@ struct ghes_edac_pvt {
 	char msg[80];
 };
 
-static LIST_HEAD(ghes_reglist);
-static DEFINE_MUTEX(ghes_edac_lock);
-static int ghes_edac_mc_num;
+static atomic_t ghes_init = ATOMIC_INIT(0);
+static struct ghes_edac_pvt *ghes_pvt;
 
+/*
+ * Sync with other, potentially concurrent callers of
+ * ghes_edac_report_mem_error(). We don't know what the
+ * "inventive" firmware would do.
+ */
+static DEFINE_SPINLOCK(ghes_lock);
 
 /* Memory Device - Type 17 of SMBIOS spec */
 struct memdev_dmi_entry {
@@ -169,18 +174,26 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
 	enum hw_event_mc_err_type type;
 	struct edac_raw_error_desc *e;
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt = NULL;
+	struct ghes_edac_pvt *pvt = ghes_pvt;
+	unsigned long flags;
 	char *p;
 	u8 grain_bits;
 
-	list_for_each_entry(pvt, &ghes_reglist, list) {
-		if (ghes == pvt->ghes)
-			break;
-	}
 	if (!pvt) {
 		pr_err("Internal error: Can't find EDAC structure\n");
 		return;
 	}
+
+	/*
+	 * We can do the locking below because GHES defers error processing
+	 * from NMI to IRQ context. Whenever that changes, we'd at least
+	 * know.
+	 */
+	if (WARN_ON_ONCE(in_nmi()))
+		return;
+
+	spin_lock_irqsave(&ghes_lock, flags);
+
 	mci = pvt->mci;
 	e = &mci->error_desc;
 
@@ -398,8 +411,8 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
 		       (e->page_frame_number << PAGE_SHIFT) | e->offset_in_page,
 		       grain_bits, e->syndrome, pvt->detail_location);
 
-	/* Report the error via EDAC API */
 	edac_raw_mc_handle_error(type, mci, e);
+	spin_unlock_irqrestore(&ghes_lock, flags);
 }
 EXPORT_SYMBOL_GPL(ghes_edac_report_mem_error);
 
@@ -409,9 +422,14 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	int rc, num_dimm = 0;
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
-	struct ghes_edac_pvt *pvt;
 	struct ghes_edac_dimm_fill dimm_fill;
 
+	/*
+	 * We have only one logical memory controller to which all DIMMs belong.
+	 */
+	if (atomic_inc_return(&ghes_init) > 1)
+		return 0;
+
 	/* Get the number of DIMMs */
 	dmi_walk(ghes_edac_count_dimms, &num_dimm);
 
@@ -425,26 +443,17 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	layers[0].size = num_dimm;
 	layers[0].is_virt_csrow = true;
 
-	/*
-	 * We need to serialize edac_mc_alloc() and edac_mc_add_mc(),
-	 * to avoid duplicated memory controller numbers
-	 */
-	mutex_lock(&ghes_edac_lock);
-	mci = edac_mc_alloc(ghes_edac_mc_num, ARRAY_SIZE(layers), layers,
-			    sizeof(*pvt));
+	mci = edac_mc_alloc(1, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
 	if (!mci) {
 		pr_info("Can't allocate memory for EDAC data\n");
-		mutex_unlock(&ghes_edac_lock);
 		return -ENOMEM;
 	}
 
-	pvt = mci->pvt_info;
-	memset(pvt, 0, sizeof(*pvt));
-	list_add_tail(&pvt->list, &ghes_reglist);
-	pvt->ghes = ghes;
-	pvt->mci  = mci;
-	mci->pdev = dev;
+	ghes_pvt	= mci->pvt_info;
+	ghes_pvt->ghes	= ghes;
+	ghes_pvt->mci	= mci;
 
+	mci->pdev = dev;
 	mci->mtype_cap = MEM_FLAG_EMPTY;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE;
 	mci->edac_cap = EDAC_FLAG_NONE;
@@ -452,36 +461,23 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	mci->ctl_name = "ghes_edac";
 	mci->dev_name = "ghes";
 
-	if (!ghes_edac_mc_num) {
-		if (!fake) {
-			pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
-			pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
-			pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
-			pr_info("If you find incorrect reports, please contact your hardware vendor\n");
-			pr_info("to correct its BIOS.\n");
-			pr_info("This system has %d DIMM sockets.\n",
-				num_dimm);
-		} else {
-			pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
-			pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
-			pr_info("work on such system. Use this driver with caution\n");
-		}
+	if (!fake) {
+		pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
+		pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
+		pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
+		pr_info("If you find incorrect reports, please contact your hardware vendor\n");
+		pr_info("to correct its BIOS.\n");
+		pr_info("This system has %d DIMM sockets.\n", num_dimm);
+	} else {
+		pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
+		pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
+		pr_info("work on such system. Use this driver with caution\n");
 	}
 
 	if (!fake) {
-		/*
-		 * Fill DIMM info from DMI for the memory controller #0
-		 *
-		 * Keep it in blank for the other memory controllers, as
-		 * there's no reliable way to properly credit each DIMM to
-		 * the memory controller, as different BIOSes fill the
-		 * DMI bank location fields on different ways
-		 */
-		if (!ghes_edac_mc_num) {
-			dimm_fill.count = 0;
-			dimm_fill.mci = mci;
-			dmi_walk(ghes_edac_dmidecode, &dimm_fill);
-		}
+		dimm_fill.count = 0;
+		dimm_fill.mci = mci;
+		dmi_walk(ghes_edac_dmidecode, &dimm_fill);
 	} else {
 		struct dimm_info *dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
 						       mci->n_layers, 0, 0, 0);
@@ -497,12 +493,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	if (rc < 0) {
 		pr_info("Can't register at EDAC core\n");
 		edac_mc_free(mci);
-		mutex_unlock(&ghes_edac_lock);
 		return -ENODEV;
 	}
-
-	ghes_edac_mc_num++;
-	mutex_unlock(&ghes_edac_lock);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(ghes_edac_register);
@@ -510,15 +502,9 @@ EXPORT_SYMBOL_GPL(ghes_edac_register);
 void ghes_edac_unregister(struct ghes *ghes)
 {
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt, *tmp;
-
-	list_for_each_entry_safe(pvt, tmp, &ghes_reglist, list) {
-		if (ghes == pvt->ghes) {
-			mci = pvt->mci;
-			edac_mc_del_mc(mci->pdev);
-			edac_mc_free(mci);
-			list_del(&pvt->list);
-		}
-	}
+
+	mci = ghes_pvt->mci;
+	edac_mc_del_mc(mci->pdev);
+	edac_mc_free(mci);
 }
 EXPORT_SYMBOL_GPL(ghes_edac_unregister);

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-16 18:01 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-16 18:01 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-kernel@vger.kernel.org, rostedt@goodmis.org,
	mchehab@kernel.org, rjw@rjwysocki.net, tony.luck@intel.com,
	lenb@kernel.org, linux-acpi@vger.kernel.org,
	linux-edac@vger.kernel.org

On Wed, 2017-08-16 at 19:40 +0200, Borislav Petkov wrote:
> On Wed, Aug 16, 2017 at 05:28:50PM +0000, Kani, Toshimitsu wrote:
> > Assuming this big spinlock works, yes, this addresses my concern
> > that
> 
> You mean, lengthy locked section. We can always switch to on-stack
> buffers if there are issues or even to something more fancy like
> genpool.

Yes, I meant the lengthy locked section.  I still think that multiple
mcis is a better approach for concurrency as it naturally addresses it.
But as long as edac func and error handlers are tolerant with this
spinlock and serialization, I am OK with that.

> > I will test the patch with an SCI when I got a chance.  I won't be
> > able to test other notification types or race conditions, though.
> 
> Thanks, here's the latest version with the atomic registration too.

Sure, I will test with this version.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-17 21:08 Toshi Kani
  0 siblings, 0 replies; 36+ messages in thread
From: Toshi Kani @ 2017-08-17 21:08 UTC (permalink / raw)
  To: bp@alien8.de
  Cc: linux-kernel@vger.kernel.org, rostedt@goodmis.org,
	mchehab@kernel.org, rjw@rjwysocki.net, tony.luck@intel.com,
	lenb@kernel.org, linux-acpi@vger.kernel.org,
	linux-edac@vger.kernel.org

On Wed, 2017-08-16 at 11:51 -0600, Toshi Kani wrote:
> On Wed, 2017-08-16 at 19:40 +0200, Borislav Petkov wrote:
> > On Wed, Aug 16, 2017 at 05:28:50PM +0000, Kani, Toshimitsu wrote:
 :
> > > I will test the patch with an SCI when I got a chance.  I won't
> > > be able to test other notification types or race conditions,
> > > though.
> > 
> > Thanks, here's the latest version with the atomic registration too.
> 
> Sure, I will test with this version.

I briefly tested the patch with an SCI. The error reporting part seems
to work fine.

There are two issues I noticed:

1. It creates mc0 and mc1.  
I think this is because you called edac_mc_alloc() with mc_num 1.

2. 'ras-mc-ctl --layout' does not show all DIMMs.

# ras-mc-ctl --layout
         +------------------------+
         |    mc0     |    mc1     |
---------+------------------------+
memory4: |     0 MB  |  16384 MB  |
memory3: |     0 MB  |     0 MB  |
memory2: |     0 MB  |     0 MB  |
memory1: |     0 MB  |     0 MB  |
memory0: |     0 MB  |  8192 MB  |
---------+-----------------------+

# ls /sys/bus/mc1/devices
csrow0   csrow12  csrow23  dimm0   dimm12  dimm23  mc1
csrow11  csrow19  csrow4   dimm11  dimm19  dimm4

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk()
@ 2017-08-21  9:29 Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-08-21  9:29 UTC (permalink / raw)
  To: Kani, Toshimitsu, mchehab@kernel.org
  Cc: linux-kernel@vger.kernel.org, rostedt@goodmis.org,
	rjw@rjwysocki.net, tony.luck@intel.com, lenb@kernel.org,
	linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org

On Thu, Aug 17, 2017 at 09:08:40PM +0000, Kani, Toshimitsu wrote:
> 1. It creates mc0 and mc1.  
> I think this is because you called edac_mc_alloc() with mc_num 1.

Fixed, see below.

> 
> 2. 'ras-mc-ctl --layout' does not show all DIMMs.

Yap, that's strange.

$ grep . /sys/devices/system/edac/mc/mc0/dimm*/size
/sys/devices/system/edac/mc/mc0/dimm10/size:2048
/sys/devices/system/edac/mc/mc0/dimm11/size:2048
/sys/devices/system/edac/mc/mc0/dimm12/size:2048
/sys/devices/system/edac/mc/mc0/dimm13/size:2048
/sys/devices/system/edac/mc/mc0/dimm14/size:2048
/sys/devices/system/edac/mc/mc0/dimm15/size:2048
/sys/devices/system/edac/mc/mc0/dimm8/size:2048
/sys/devices/system/edac/mc/mc0/dimm9/size:2048
$ ras-mc-ctl --layout
         +-----------+
         |    mc0    |
---------+-----------+
memory9: |  2048 MB  |
memory8: |  2048 MB  |
---------+-----------+
memory7: |     0 MB  |
memory6: |     0 MB  |
---------+-----------+
memory5: |     0 MB  |
memory4: |     0 MB  |
---------+-----------+
memory3: |     0 MB  |
memory2: |     0 MB  |
---------+-----------+
memory1: |     0 MB  |
memory0: |     0 MB  |
---------+-----------+

the driver detects them correctly though:

[    7.900694] ghes_edac: This system has 16 DIMM sockets.
[    7.911366] EDAC DEBUG: ghes_edac_dmidecode: DIMM8: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    7.928437] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    7.944628] EDAC DEBUG: ghes_edac_dmidecode: DIMM9: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    7.961683] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    7.977871] EDAC DEBUG: ghes_edac_dmidecode: DIMM10: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    7.995105] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.011291] EDAC DEBUG: ghes_edac_dmidecode: DIMM11: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.028524] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.044712] EDAC DEBUG: ghes_edac_dmidecode: DIMM12: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.061942] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.078129] EDAC DEBUG: ghes_edac_dmidecode: DIMM13: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.095360] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.111547] EDAC DEBUG: ghes_edac_dmidecode: DIMM14: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.161703] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.177904] EDAC DEBUG: ghes_edac_dmidecode: DIMM15: Unbuffered DDR3 RAM size = 2048 MB(ECC)
[    8.195132] EDAC DEBUG: ghes_edac_dmidecode:         type 24, detail 0x80, width 72(total 64)
[    8.211321] EDAC DEBUG: edac_mc_add_mc_with_groups: 
[    8.221456] EDAC DEBUG: edac_create_sysfs_mci_device: creating bus mc0
[    8.234736] EDAC DEBUG: edac_create_sysfs_mci_device: creating device mc0
[    8.248545] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm8, located at memory 8 
[    8.265457] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm8
[    8.280601] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm9, located at memory 9 
[    8.297503] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm9
[    8.312650] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm10, located at memory 10 
[    8.329900] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm10
[    8.345220] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm11, located at memory 11 
[    8.362470] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm11
[    8.377789] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm12, located at memory 12 
[    8.395039] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm12
[    8.410358] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm13, located at memory 13 
[    8.427608] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm13
[    8.442928] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm14, located at memory 14 
[    8.460194] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm14
[    8.475517] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm15, located at memory 15 
[    8.492768] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device dimm15

Mauro?
---
From: Borislav Petkov <bp@suse.de>
Date: Wed, 16 Aug 2017 10:33:44 +0200
Subject: [PATCH] EDAC, ghes: Model a single, logical memory controller

We're enumerating the DIMMs through a DMI walk and since we can't get
any more detailed topological information about which DIMMs belong to
which memory controller, convert it to a single, logical controller
which contains all the DIMMs.

The error reporting path from GHES ghes_edac_report_mem_error() doesn't
get called in NMI context but add a warning about it to catch any
changes in the future as if so, our locking scheme will be insufficient
then.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/edac/ghes_edac.c | 116 +++++++++++++++++++++--------------------------
 1 file changed, 51 insertions(+), 65 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 6f80eb65c26c..e790d64b8edd 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -28,10 +28,15 @@ struct ghes_edac_pvt {
 	char msg[80];
 };
 
-static LIST_HEAD(ghes_reglist);
-static DEFINE_MUTEX(ghes_edac_lock);
-static int ghes_edac_mc_num;
+static atomic_t ghes_init = ATOMIC_INIT(0);
+static struct ghes_edac_pvt *ghes_pvt;
 
+/*
+ * Sync with other, potentially concurrent callers of
+ * ghes_edac_report_mem_error(). We don't know what the
+ * "inventive" firmware would do.
+ */
+static DEFINE_SPINLOCK(ghes_lock);
 
 /* Memory Device - Type 17 of SMBIOS spec */
 struct memdev_dmi_entry {
@@ -169,18 +174,26 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
 	enum hw_event_mc_err_type type;
 	struct edac_raw_error_desc *e;
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt = NULL;
+	struct ghes_edac_pvt *pvt = ghes_pvt;
+	unsigned long flags;
 	char *p;
 	u8 grain_bits;
 
-	list_for_each_entry(pvt, &ghes_reglist, list) {
-		if (ghes == pvt->ghes)
-			break;
-	}
 	if (!pvt) {
 		pr_err("Internal error: Can't find EDAC structure\n");
 		return;
 	}
+
+	/*
+	 * We can do the locking below because GHES defers error processing
+	 * from NMI to IRQ context. Whenever that changes, we'd at least
+	 * know.
+	 */
+	if (WARN_ON_ONCE(in_nmi()))
+		return;
+
+	spin_lock_irqsave(&ghes_lock, flags);
+
 	mci = pvt->mci;
 	e = &mci->error_desc;
 
@@ -398,8 +411,8 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
 		       (e->page_frame_number << PAGE_SHIFT) | e->offset_in_page,
 		       grain_bits, e->syndrome, pvt->detail_location);
 
-	/* Report the error via EDAC API */
 	edac_raw_mc_handle_error(type, mci, e);
+	spin_unlock_irqrestore(&ghes_lock, flags);
 }
 EXPORT_SYMBOL_GPL(ghes_edac_report_mem_error);
 
@@ -409,9 +422,14 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	int rc, num_dimm = 0;
 	struct mem_ctl_info *mci;
 	struct edac_mc_layer layers[1];
-	struct ghes_edac_pvt *pvt;
 	struct ghes_edac_dimm_fill dimm_fill;
 
+	/*
+	 * We have only one logical memory controller to which all DIMMs belong.
+	 */
+	if (atomic_inc_return(&ghes_init) > 1)
+		return 0;
+
 	/* Get the number of DIMMs */
 	dmi_walk(ghes_edac_count_dimms, &num_dimm);
 
@@ -425,26 +443,17 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	layers[0].size = num_dimm;
 	layers[0].is_virt_csrow = true;
 
-	/*
-	 * We need to serialize edac_mc_alloc() and edac_mc_add_mc(),
-	 * to avoid duplicated memory controller numbers
-	 */
-	mutex_lock(&ghes_edac_lock);
-	mci = edac_mc_alloc(ghes_edac_mc_num, ARRAY_SIZE(layers), layers,
-			    sizeof(*pvt));
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
 	if (!mci) {
 		pr_info("Can't allocate memory for EDAC data\n");
-		mutex_unlock(&ghes_edac_lock);
 		return -ENOMEM;
 	}
 
-	pvt = mci->pvt_info;
-	memset(pvt, 0, sizeof(*pvt));
-	list_add_tail(&pvt->list, &ghes_reglist);
-	pvt->ghes = ghes;
-	pvt->mci  = mci;
-	mci->pdev = dev;
+	ghes_pvt	= mci->pvt_info;
+	ghes_pvt->ghes	= ghes;
+	ghes_pvt->mci	= mci;
 
+	mci->pdev = dev;
 	mci->mtype_cap = MEM_FLAG_EMPTY;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE;
 	mci->edac_cap = EDAC_FLAG_NONE;
@@ -452,36 +461,23 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	mci->ctl_name = "ghes_edac";
 	mci->dev_name = "ghes";
 
-	if (!ghes_edac_mc_num) {
-		if (!fake) {
-			pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
-			pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
-			pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
-			pr_info("If you find incorrect reports, please contact your hardware vendor\n");
-			pr_info("to correct its BIOS.\n");
-			pr_info("This system has %d DIMM sockets.\n",
-				num_dimm);
-		} else {
-			pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
-			pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
-			pr_info("work on such system. Use this driver with caution\n");
-		}
+	if (!fake) {
+		pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
+		pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
+		pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
+		pr_info("If you find incorrect reports, please contact your hardware vendor\n");
+		pr_info("to correct its BIOS.\n");
+		pr_info("This system has %d DIMM sockets.\n", num_dimm);
+	} else {
+		pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
+		pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
+		pr_info("work on such system. Use this driver with caution\n");
 	}
 
 	if (!fake) {
-		/*
-		 * Fill DIMM info from DMI for the memory controller #0
-		 *
-		 * Keep it in blank for the other memory controllers, as
-		 * there's no reliable way to properly credit each DIMM to
-		 * the memory controller, as different BIOSes fill the
-		 * DMI bank location fields on different ways
-		 */
-		if (!ghes_edac_mc_num) {
-			dimm_fill.count = 0;
-			dimm_fill.mci = mci;
-			dmi_walk(ghes_edac_dmidecode, &dimm_fill);
-		}
+		dimm_fill.count = 0;
+		dimm_fill.mci = mci;
+		dmi_walk(ghes_edac_dmidecode, &dimm_fill);
 	} else {
 		struct dimm_info *dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
 						       mci->n_layers, 0, 0, 0);
@@ -497,12 +493,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	if (rc < 0) {
 		pr_info("Can't register at EDAC core\n");
 		edac_mc_free(mci);
-		mutex_unlock(&ghes_edac_lock);
 		return -ENODEV;
 	}
-
-	ghes_edac_mc_num++;
-	mutex_unlock(&ghes_edac_lock);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(ghes_edac_register);
@@ -510,15 +502,9 @@ EXPORT_SYMBOL_GPL(ghes_edac_register);
 void ghes_edac_unregister(struct ghes *ghes)
 {
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt, *tmp;
-
-	list_for_each_entry_safe(pvt, tmp, &ghes_reglist, list) {
-		if (ghes == pvt->ghes) {
-			mci = pvt->mci;
-			edac_mc_del_mc(mci->pdev);
-			edac_mc_free(mci);
-			list_del(&pvt->list);
-		}
-	}
+
+	mci = ghes_pvt->mci;
+	edac_mc_del_mc(mci->pdev);
+	edac_mc_free(mci);
 }
 EXPORT_SYMBOL_GPL(ghes_edac_unregister);

^ permalink raw reply related	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2017-08-21  9:29 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-15 16:19 [v2,4/7] ghes_edac: avoid multiple calls to dmi_walk() Toshi Kani
  -- strict thread matches above, loose matches on Subject: below --
2017-08-21  9:29 Borislav Petkov
2017-08-17 21:08 Toshi Kani
2017-08-16 18:01 Toshi Kani
2017-08-16 17:40 Borislav Petkov
2017-08-16 17:31 Borislav Petkov
2017-08-16 17:28 Toshi Kani
2017-08-16 16:42 Borislav Petkov
2017-08-16 15:26 Toshi Kani
2017-08-16 14:22 Steven Rostedt
2017-08-16 14:03 Borislav Petkov
2017-08-16 13:59 Steven Rostedt
2017-08-16 11:29 Borislav Petkov
2017-08-16  8:29 Borislav Petkov
2017-08-15 15:53 Toshi Kani
2017-08-15 15:50 Borislav Petkov
2017-08-15 15:48 Luck, Tony
2017-08-15 15:35 Toshi Kani
2017-08-14 20:39 Borislav Petkov
2017-08-14 20:17 Toshi Kani
2017-08-14 19:34 Borislav Petkov
2017-08-14 19:02 Toshi Kani
2017-08-14 18:35 Borislav Petkov
2017-08-14 18:17 Toshi Kani
2017-08-14 18:05 Borislav Petkov
2017-08-14 17:52 Toshi Kani
2017-08-14 17:05 Borislav Petkov
2017-08-14 16:48 Toshi Kani
2017-08-14 16:24 Borislav Petkov
2017-08-14 15:57 Toshi Kani
2017-08-11  9:04 Borislav Petkov
2017-08-07 17:59 Toshi Kani
2017-08-05  5:16 Borislav Petkov
2017-08-04 21:02 Toshi Kani
2017-08-04  4:05 Borislav Petkov
2017-08-03 21:57 Toshi Kani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).