linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mauro Carvalho Chehab <mchehab@redhat.com>
To: Borislav Petkov <bp@amd64.org>
Cc: Tony Luck <tony.luck@intel.com>, Ingo Molnar <mingo@elte.hu>,
	EDAC devel <linux-edac@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Borislav Petkov <borislav.petkov@amd.com>
Subject: Re: [PATCH 4/4] EDAC: Convert AMD EDAC pieces to use RAS printk buffer
Date: Fri, 02 Mar 2012 11:52:59 -0300	[thread overview]
Message-ID: <4F50DECB.8030200@redhat.com> (raw)
In-Reply-To: <1330698314-9863-5-git-send-email-bp@amd64.org>

Em 02-03-2012 11:25, Borislav Petkov escreveu:
> From: Borislav Petkov <borislav.petkov@amd.com>
> 
> This is an initial version of the patch which converts MCE decoding
> facilities to use the RAS printk buffer. When there's no userspace agent
> running (i.e., /sys/devices/system/ras/agent == 0), we fall back to the
> default printk's into dmesg which is what we've been doing so far.
> 
> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
> ---
>  drivers/edac/amd64_edac.c |    8 ++-
>  drivers/edac/edac_mc.c    |   42 ++++++---
>  drivers/edac/mce_amd.c    |  217 ++++++++++++++++++++++++---------------------
>  3 files changed, 149 insertions(+), 118 deletions(-)
> 
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 08413377a43b..29e153c57e33 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -1,6 +1,7 @@
> -#include "amd64_edac.h"
>  #include <asm/amd_nb.h>
> +#include <asm/ras.h>
>  
> +#include "amd64_edac.h"
>  static struct edac_pci_ctl_info *amd64_ctl_pci;
>  
>  static int report_gart_errors;
> @@ -1901,7 +1902,10 @@ static void amd64_handle_ce(struct mem_ctl_info *mci, struct mce *m)
>  	sys_addr = get_error_address(m);
>  	syndrome = extract_syndrome(m->status);
>  
> -	amd64_mc_err(mci, "CE ERROR_ADDRESS= 0x%llx\n", sys_addr);
> +	if (ras_agent)
> +		ras_printk(PR_EMERG, "ERR_ADDR: 0x%llx", sys_addr);
> +	else
> +		amd64_mc_err(mci, "CE ERROR_ADDRESS= 0x%llx\n", sys_addr);
>  
>  	pvt->ops->map_sysaddr_to_csrow(mci, sys_addr, syndrome);
>  }
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index ca6c04d350ee..3b3db477b5d0 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -30,8 +30,10 @@
>  #include <asm/uaccess.h>
>  #include <asm/page.h>
>  #include <asm/edac.h>
> +#include <asm/ras.h>
>  #include "edac_core.h"
>  #include "edac_module.h"
> +#include "mce_amd.h"
>  
>  /* lock to memory controller's control array */
>  static DEFINE_MUTEX(mem_ctls_mutex);
> @@ -701,14 +703,22 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
>  		return;
>  	}
>  
> -	if (edac_mc_get_log_ce())
> -		/* FIXME - put in DIMM location */
> -		edac_mc_printk(mci, KERN_WARNING,
> -			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
> -			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
> -			page_frame_number, offset_in_page,
> -			mci->csrows[row].grain, syndrome, row, channel,
> -			mci->csrows[row].channels[channel].label, msg);
> +	if (edac_mc_get_log_ce()) {
> +		if (ras_agent)
> +			ras_printk(PR_CONT, ", row: %d, channel: %d\n",
> +				   row, channel);
> +		else
> +			/* FIXME - put in DIMM location */
> +			edac_mc_printk(mci, KERN_WARNING,
> +					"CE page 0x%lx, offset 0x%lx, grain %d,"
> +					" syndrome 0x%lx, row %d, channel %d,"
> +					" label \"%s\": %s\n",
> +					page_frame_number, offset_in_page,
> +					mci->csrows[row].grain, syndrome,
> +					row, channel,
> +					mci->csrows[row].channels[channel].label,
> +					msg);
> +	}


As I've commented already, This piece of the code is not ok, due
to several reasons:

	- the "ras_agent" helper functions is used only for amd64_edac. There's
no reason for use it elsewhere;

	- the code here is adding the location only for CE errors. The location
of the error is also pertinent for the other types of errors;

	- on my patches, this function disappears, being replaced by a single
function, that can be used to report all types of memory errors;

	- non MCA drivers should also generate tracepoints;

	- it is a way easier to add just one call to the tracepoint function here, than
to spread on all drivers.

As it is shown at:
	http://git.kernel.org/?p=linux/kernel/git/mchehab/linux-edac.git;a=commitdiff_plain;h=4eb2a29419c1fefd76c8dbcd308b84a4b52faf4d

Once we made an agreement with regards to the tracepoint function, 
a change similar to what it was proposed there, e. g.:

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 37d2c97..2dca0e3 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -899,7 +899,7 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			  const int layer2,
 			  const char *msg,
 			  const char *other_detail,
-			  const void *mcelog)
+			  const void *arch_log)
 {
 	unsigned long remapped_page;
 	/* FIXME: too much for stack: move it to some pre-alocated area */
@@ -1033,8 +1047,17 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			"page 0x%lx offset 0x%lx grain %d\n",
 			page_frame_number, offset_in_page, grain);
 
+#ifdef CONFIG_X86
+	if (arch_log)
+		trace_mc_error_mce(type, mci->mc_idx, msg, label, location,
+				   detail, other_detail, arch_log);
+	else
+		trace_mc_error(type, mci->mc_idx, msg, label, location,
+			       detail, other_detail);
+#else
 	trace_mc_error(type, mci->mc_idx, msg, label, location,
 		       detail, other_detail);
+#endif
 
 	if (type == HW_EVENT_ERR_CORRECTED) {
 		if (edac_mc_get_log_ce())

is enough to generate traces for all existing drivers.

Regards,
Mauro

  reply	other threads:[~2012-03-02 14:53 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-02 14:25 [RFC -v2 PATCH 0/3] RAS: Use MCE tracepoint for decoded MCEs Borislav Petkov
2012-03-02 14:25 ` [PATCH 1/4] mce: Slim up struct mce Borislav Petkov
2012-03-02 17:47   ` Luck, Tony
2012-03-03  7:37     ` Ingo Molnar
2012-03-05  9:17       ` Borislav Petkov
2012-03-02 14:25 ` [PATCH 2/4] mce: Add a msg string to the MCE tracepoint Borislav Petkov
2012-03-02 14:25 ` [PATCH 3/4] x86, RAS: Add a decoded msg buffer Borislav Petkov
2012-03-02 14:25 ` [PATCH 4/4] EDAC: Convert AMD EDAC pieces to use RAS printk buffer Borislav Petkov
2012-03-02 14:52   ` Mauro Carvalho Chehab [this message]
2012-03-05 11:04     ` Borislav Petkov
2012-03-05 11:43       ` Mauro Carvalho Chehab
2012-03-05 12:44         ` Borislav Petkov
2012-03-05 13:35           ` Mauro Carvalho Chehab
2012-03-05 14:13             ` Borislav Petkov
2012-03-05 14:58               ` Mauro Carvalho Chehab
2012-03-05 22:00                 ` [PATCHv5] EDAC core changes in order to properly report errors from all types of memory controllers Mauro Carvalho Chehab
2012-03-05 23:23                   ` Borislav Petkov
2012-03-06 11:31                     ` Mauro Carvalho Chehab
2012-03-06 12:16                       ` Borislav Petkov
2012-03-07  0:20                         ` [PATCHv7] " Mauro Carvalho Chehab
2012-03-07  8:42                           ` Borislav Petkov
2012-03-07 11:36                             ` Mauro Carvalho Chehab
2012-03-07 12:06                               ` Borislav Petkov
2012-03-07 12:13                                 ` Mauro Carvalho Chehab
2012-03-02 14:41 ` [RFC -v2 PATCH 0/3] RAS: Use MCE tracepoint for decoded MCEs Mauro Carvalho Chehab
2012-03-02 14:48   ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F50DECB.8030200@redhat.com \
    --to=mchehab@redhat.com \
    --cc=borislav.petkov@amd.com \
    --cc=bp@amd64.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).