public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] edac: Add a sysfs node to test the EDAC error report facility
@ 2012-03-07 12:08 Mauro Carvalho Chehab
  2012-03-13 23:31 ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-07 12:08 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

Even on memory controllers that have memory injection, such functionality
can be disabled by BIOS during bootstrap, and it may not be possible to
enable it via BIOS setup.

So, as not all hardware supports error injection, add a mechanism to
allow testing the edac driver and the core.

This feature is only enabled when EDAC_DEBUG is equal to Y, so there's no
extra code for the production Kernels.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

NOTE: This patch should be applied after the series that add proper support for
FB-DIMM, due to context changes.

 drivers/edac/edac_mc_sysfs.c |   62 ++++++++++++++++++++++++++++++++++++++++-
 include/linux/edac.h         |    4 +++
 2 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 870ccb0..f538f9e 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -744,6 +744,42 @@ static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data,
 	return sprintf(data, "%u\n", PAGES_TO_MiB(total_pages));
 }
 
+#ifdef CONFIG_EDAC_DEBUG
+static ssize_t edac_fake_inject_show(struct mem_ctl_info *mci,
+				     char *data, void *priv)
+{
+	return sprintf(data,
+		       "EDAC fake test engine. Writing to this node a value in the form of :\n"
+		       "\t0:1:0\n"
+		       "will call the EDAC core routine to produce a memory error for the given memory location (0, 1, 0).\n"
+		       "The driver's error parsing logic won't be tested. This tool is useful only\n"
+		       "if you're testing the EDAC core tracing facility, or if you're needing to test\n"
+		       "some userspace application.\n");
+}
+
+static ssize_t edac_fake_inject_store(struct mem_ctl_info *mci,
+				      const char *data, size_t count,
+				      void *priv)
+{
+	static enum hw_event_mc_err_type type = HW_EVENT_ERR_CORRECTED;
+	int err, layer0 = -1, layer1 = -1, layer2 = -1;
+	err = sscanf(data, "%i:%i:%i", &layer0, &layer1, &layer2);
+	if (err < 0)
+		return err;
+
+	printk(KERN_DEBUG
+	       "Generating a fake error to %d.%d.%d to test core handling. NOTE: this won't test the driver-specific decoding logic.\n",
+	       layer0, layer1, layer2);
+	edac_mc_handle_error(type, mci, 0, 0, 0,
+			     layer0, layer1, layer2,
+			     "FAKE ERROR", "for EDAC testing only", NULL);
+	if (++type == HW_EVENT_ERR_FATAL)
+		type = HW_EVENT_ERR_CORRECTED;
+
+	return count;
+}
+#endif
+
 #define to_mci(k) container_of(k, struct mem_ctl_info, edac_mci_kobj)
 #define to_mcidev_attr(a) container_of(a,struct mcidev_sysfs_attribute,attr)
 
@@ -877,7 +913,7 @@ static int edac_create_errcount_layer(struct mem_ctl_info *mci,
 		debugf4("%s() creating %s\n", __func__, (*erc)->attr.name);
 		if (!(*erc)->attr.name)
 			return -ENOMEM;
-		(*erc)->attr.mode = S_IRUGO;
+		(*erc)->attr.mode = S_IRUGO | S_IWUSR;
 		(*erc)->show = errcount_ce_show;
 		(*erc)->priv = *ercd;
 		(*ercd)->n_layers = layer + 1;
@@ -1326,7 +1362,24 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 			goto fail2;
 		}
 	}
-	edac_create_errcount_objects(mci);
+	err = edac_create_errcount_objects(mci);
+	if (err) {
+		debugf1("%s() failure: create error count objects\n",
+			__func__);
+		goto fail2;
+	}
+#ifdef CONFIG_EDAC_DEBUG
+	mci->errinject_attr.attr.name = "fake_inject";
+	mci->errinject_attr.attr.mode = S_IRUGO | S_IWUSR;
+	mci->errinject_attr.show = edac_fake_inject_show;
+	mci->errinject_attr.store = edac_fake_inject_store;
+	err = sysfs_create_file(&mci->edac_mci_kobj, &mci->errinject_attr.attr);
+	if (err < 0) {
+		printk(KERN_ERR
+		       "sysfs_create_file for fake inject failed: %d\n", err);
+		mci->errinject_attr.attr.name = NULL;
+	}
+#endif
 
 	return 0;
 
@@ -1371,6 +1424,11 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 
 	debugf0("%s()\n", __func__);
 
+#ifdef CONFIG_EDAC_DEBUG
+	if (mci->errinject_attr.attr.name)
+		sysfs_remove_file(&mci->edac_mci_kobj,
+				  &mci->errinject_attr.attr);
+#endif
 	edac_remove_errcount(mci);
 
 	/* remove all dimms kobjects */
diff --git a/include/linux/edac.h b/include/linux/edac.h
index beb6170..895c4a8 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -635,6 +635,10 @@ struct mem_ctl_info {
 	struct mcidev_sysfs_attribute *errcount_attr;
 	struct errcount_attribute_data *errcount_attr_data;
 
+#ifdef CONFIG_EDAC_DEBUG
+	struct mcidev_sysfs_attribute errinject_attr;
+#endif
+
 	struct completion complete;
 
 	/* edac sysfs device control */
-- 
1.7.8


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] edac: Add a sysfs node to test the EDAC error report facility
  2012-03-07 12:08 [PATCH] edac: Add a sysfs node to test the EDAC error report facility Mauro Carvalho Chehab
@ 2012-03-13 23:31 ` Greg KH
  2012-03-14 19:26   ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 4+ messages in thread
From: Greg KH @ 2012-03-13 23:31 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On Wed, Mar 07, 2012 at 09:08:21AM -0300, Mauro Carvalho Chehab wrote:
> Even on memory controllers that have memory injection, such functionality
> can be disabled by BIOS during bootstrap, and it may not be possible to
> enable it via BIOS setup.
> 
> So, as not all hardware supports error injection, add a mechanism to
> allow testing the edac driver and the core.
> 
> This feature is only enabled when EDAC_DEBUG is equal to Y, so there's no
> extra code for the production Kernels.

Then please move it to debugfs, where debugging stuff belongs.

At the least, you need to document all sysfs files in Documentation/ABI/

> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
> 
> NOTE: This patch should be applied after the series that add proper support for
> FB-DIMM, due to context changes.
> 
>  drivers/edac/edac_mc_sysfs.c |   62 ++++++++++++++++++++++++++++++++++++++++-
>  include/linux/edac.h         |    4 +++
>  2 files changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index 870ccb0..f538f9e 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -744,6 +744,42 @@ static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data,
>  	return sprintf(data, "%u\n", PAGES_TO_MiB(total_pages));
>  }
>  
> +#ifdef CONFIG_EDAC_DEBUG
> +static ssize_t edac_fake_inject_show(struct mem_ctl_info *mci,
> +				     char *data, void *priv)
> +{
> +	return sprintf(data,
> +		       "EDAC fake test engine. Writing to this node a value in the form of :\n"
> +		       "\t0:1:0\n"
> +		       "will call the EDAC core routine to produce a memory error for the given memory location (0, 1, 0).\n"
> +		       "The driver's error parsing logic won't be tested. This tool is useful only\n"
> +		       "if you're testing the EDAC core tracing facility, or if you're needing to test\n"
> +		       "some userspace application.\n");
> +}

Ick, no NEVER do this for a sysfs file, even for "debugging", that's
what debugfs is for, please move this file there, this isn't ok.

Again, sysfs is "one value per file".

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] edac: Add a sysfs node to test the EDAC error report facility
  2012-03-13 23:31 ` Greg KH
@ 2012-03-14 19:26   ` Mauro Carvalho Chehab
  2012-03-14 22:33     ` Borislav Petkov
  0 siblings, 1 reply; 4+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-14 19:26 UTC (permalink / raw)
  To: Greg KH; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

Em 13-03-2012 20:31, Greg KH escreveu:
> On Wed, Mar 07, 2012 at 09:08:21AM -0300, Mauro Carvalho Chehab wrote:
>> Even on memory controllers that have memory injection, such functionality
>> can be disabled by BIOS during bootstrap, and it may not be possible to
>> enable it via BIOS setup.
>>
>> So, as not all hardware supports error injection, add a mechanism to
>> allow testing the edac driver and the core.
>>
>> This feature is only enabled when EDAC_DEBUG is equal to Y, so there's no
>> extra code for the production Kernels.
> 
> Then please move it to debugfs, where debugging stuff belongs.

There are other edac error injection tools at sysfs. That's why I opted to keep
it there - plus, I was too lazy[1] to learn about the debugfs bits at the time I
wrote it, and adding a single node to sysfs were very trivial ;)

[1] In a matter of fact, testing this patchset is too time-consuming,
as it required me to get access to several systems, including some old ones.
This patch in particular were written in order to help me with the tests.
While I intend to improve it, this has lower priority than the tests themselves,
the discussions about the patches/ABI and the media maintainer stuff.

> 
> At the least, you need to document all sysfs files in Documentation/ABI/

Yes, this is on my TODO list. I'll document it at the Documentation on the
final patchset.

Btw, the current EDAC sysfs API is not documented there at ABI/, but,
instead, at Documentation/edac.txt. What is there is basically the sysfs node
description, but it is more verbose than what is there at Documentation/ABI.

I'm not sure what would be the better to do there: to move everything into
Documentation/ABI, to keep edac.txt as is (of course, adding there the new
nodes) or to keep a simplified edac.txt file and add a more concise
description at Documentation/ABI.

While there's current no plans to remove the existing EDAC sysfs API, when
writing the documentation patch, it is probably a good idea to put the
nodes that provide duplicated information at Documentation/ABI/obsolete.

> 
>>
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>> ---
>>
>> NOTE: This patch should be applied after the series that add proper support for
>> FB-DIMM, due to context changes.
>>
>>  drivers/edac/edac_mc_sysfs.c |   62 ++++++++++++++++++++++++++++++++++++++++-
>>  include/linux/edac.h         |    4 +++
>>  2 files changed, 64 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
>> index 870ccb0..f538f9e 100644
>> --- a/drivers/edac/edac_mc_sysfs.c
>> +++ b/drivers/edac/edac_mc_sysfs.c
>> @@ -744,6 +744,42 @@ static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data,
>>  	return sprintf(data, "%u\n", PAGES_TO_MiB(total_pages));
>>  }
>>  
>> +#ifdef CONFIG_EDAC_DEBUG
>> +static ssize_t edac_fake_inject_show(struct mem_ctl_info *mci,
>> +				     char *data, void *priv)
>> +{
>> +	return sprintf(data,
>> +		       "EDAC fake test engine. Writing to this node a value in the form of :\n"
>> +		       "\t0:1:0\n"
>> +		       "will call the EDAC core routine to produce a memory error for the given memory location (0, 1, 0).\n"
>> +		       "The driver's error parsing logic won't be tested. This tool is useful only\n"
>> +		       "if you're testing the EDAC core tracing facility, or if you're needing to test\n"
>> +		       "some userspace application.\n");
>> +}
> 
> Ick, no NEVER do this for a sysfs file, even for "debugging", that's
> what debugfs is for, please move this file there, this isn't ok.
> 
> Again, sysfs is "one value per file".

Ok. I won't add this patch on my linux-next tree after finishing the tests. I'll keep
it on my todo list and re-write it later, when I find some time to convert it into
debugfs.

> 
> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] edac: Add a sysfs node to test the EDAC error report facility
  2012-03-14 19:26   ` Mauro Carvalho Chehab
@ 2012-03-14 22:33     ` Borislav Petkov
  0 siblings, 0 replies; 4+ messages in thread
From: Borislav Petkov @ 2012-03-14 22:33 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Greg KH, Linux Edac Mailing List, Linux Kernel Mailing List

On Wed, Mar 14, 2012 at 04:26:15PM -0300, Mauro Carvalho Chehab wrote:
> > Then please move it to debugfs, where debugging stuff belongs.
> 
> There are other edac error injection tools at sysfs. That's why I opted to keep
> it there - plus, I was too lazy[1] to learn about the debugfs bits at the time I
> wrote it, and adding a single node to sysfs were very trivial ;)

I have to agree with Greg here, all debug-related stuff (error injection
being somewhat related) should go to debugfs.

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-03-14 22:33 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-07 12:08 [PATCH] edac: Add a sysfs node to test the EDAC error report facility Mauro Carvalho Chehab
2012-03-13 23:31 ` Greg KH
2012-03-14 19:26   ` Mauro Carvalho Chehab
2012-03-14 22:33     ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox