From: Mauro Carvalho Chehab <mchehab@redhat.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Borislav Petkov <bp@amd64.org>,
Linux Edac Mailing List <linux-edac@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Doug Thompson <norsk5@yahoo.com>,
Steven Rostedt <rostedt@goodmis.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH v22] edac, ras/hw_event.h: use events to handle hw issues
Date: Thu, 10 May 2012 19:07:49 -0300
Message-ID: <4FAC3C35.70101@redhat.com>
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F192EC1A2@ORSMSX104.amr.corp.intel.com>
On 10-05-2012 18:10, Luck, Tony wrote:
>>> + TP_printk(HW_ERR "mce#%d: %s error %s on memory stick \"%s\" (%s %s %s)",
>>
>> This still says "mce" and it should say "MC" or "mem_ctl" or similar.
>
> I'm trying to look at how this will look to an end user who is not intimately
> acquainted with the internals of how memory subsystems work.
This is what patch v23 prints on sb_edac:
# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 30/30 #P:32
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
kworker/u:6-201 [007] .N.. 186.197280: mc_error: [Hardware Error]: mem_ctl#0: Corrected error memory read error on memory stick "DIMM_A1" (channel:0 slot:1 page:0x2f1eb3 offset:0x446 grain:32 syndrome:0x0 1 error(s): Unknown: Err=0001:0090 socket=0 channel=0/mask=1 rank=5)
kworker/u:6-201 [007] .N.. 186.239536: mc_error: [Hardware Error]: mem_ctl#1: Corrected error memory read error on memory stick "DIMM_E2" (channel:0 slot:0 page:0x93180b offset:0x927 grain:32 syndrome:0x0 1 error(s): Unknown: Err=0001:0090 socket=1 channel=2/mask=4 rank=0)
There is still some room to improve the fields provided by the drivers.
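To illustrate how a userspace tool might consume these lines, here is a minimal Python sketch that splits one mc_error line into its parts. The layout is inferred only from the two sample lines above, so the regular expression, the function name parse_mc_error and the returned dictionary keys are assumptions for illustration, not an interface defined by the patch.

```python
import re

# Layout inferred from the two sample lines above (patch v23 on sb_edac).
# It is an assumption that other drivers or later versions keep this layout.
MC_ERROR_RE = re.compile(
    r'mc_error: \[Hardware Error\]: mem_ctl#(?P<mc>\d+): '
    r'(?P<severity>\S+) error (?P<msg>.*?) '
    r'on memory stick "(?P<label>[^"]*)" \((?P<details>.*)\)\s*$'
)

# "name:value" tokens such as channel:0 or page:0x2f1eb3.
FIELD_RE = re.compile(r'([a-z]+):(0x[0-9a-fA-F]+|\d+)')

# First sample line from the trace output above.
SAMPLE = (
    'kworker/u:6-201 [007] .N.. 186.197280: mc_error: [Hardware Error]: '
    'mem_ctl#0: Corrected error memory read error on memory stick "DIMM_A1" '
    '(channel:0 slot:1 page:0x2f1eb3 offset:0x446 grain:32 syndrome:0x0 '
    '1 error(s): Unknown: Err=0001:0090 socket=0 channel=0/mask=1 rank=5)'
)


def parse_mc_error(line):
    """Return a dict describing one mc_error line, or None if it does not match."""
    m = MC_ERROR_RE.search(line)
    if m is None:
        return None
    fields = {}
    # Keep only the generic name:value tokens; the driver-specific tail
    # ("Err=0001:0090 socket=0 ...") is left in the raw details string.
    for token in m.group("details").split():
        fm = FIELD_RE.fullmatch(token)
        if fm:
            fields[fm.group(1)] = int(fm.group(2), 0)
    return {
        "mc": int(m.group("mc")),
        "severity": m.group("severity"),
        "msg": m.group("msg"),
        "label": m.group("label"),
        "fields": fields,
        "details": m.group("details"),
    }


if __name__ == "__main__":
    print(parse_mc_error(SAMPLE))
```

The driver-specific part of the details is deliberately not parsed, since its content differs between drivers.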
> Whether the string starts with "mce" or "MC" or whatever ... what will the
> user do with the mc_index that is printed with that first %d? I don't think
> it helps them find the DIMM when they open the box.
Well, it helps to match the memory information in the trace with the
memory-controller-based sysfs nodes and with the dmesg info.
Calling it "mc" is more consistent with the dmesg prints.
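As a rough sketch of that matching, the controller index from the trace can be turned into the corresponding EDAC sysfs directory. The helper names mc_sysfs_dir and read_mc_attr are hypothetical, and the code assumes the usual /sys/devices/system/edac/mc/mcN layout with a ce_count attribute; on a machine without an EDAC driver loaded the lookup simply returns None.

```python
from pathlib import Path

# Usual location of the per-controller EDAC nodes (assumed here).
EDAC_MC_ROOT = "/sys/devices/system/edac/mc"


def mc_sysfs_dir(mc_index, root=EDAC_MC_ROOT):
    """Directory of the memory controller that the trace reports as mem_ctl#<mc_index>."""
    return Path(root) / f"mc{mc_index}"


def read_mc_attr(mc_index, name, root=EDAC_MC_ROOT):
    """Read one attribute (e.g. ce_count) of a controller, or None if it is missing."""
    path = mc_sysfs_dir(mc_index, root) / name
    try:
        return path.read_text().strip()
    except OSError:
        # No EDAC driver loaded, no such controller, or no such attribute.
        return None


if __name__ == "__main__":
    # The two controllers that appear in the sample trace above.
    for idx in (0, 1):
        print(mc_sysfs_dir(idx), "ce_count =", read_mc_attr(idx, "ce_count"))
```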
> I suppose it is useful
> if there are multiple messages ... and they see that the same memory controller
> is mentioned in each. But I almost think it belongs inside the parentheses at
> the end as the "low level details" that most users won't need to care about.
I don't have a strong preference, although I think it is better to have it at
the beginning.
> Next %s is "Corrected" or "Fatal" or "Uncorrected" ... that's good.
>
> What are the options for the next "%s" (msg)?
The type of the error. In the output above, it is "memory read error".
> "memory stick"?? I suppose "DIMM" is a bit implementation dependent (SIMMs
> are long gone ... but perhaps there will be some new acronym for stacked
> memory ... STIMS :-) )
Memory stick is described in edac.h as:
* Memory Stick: A printed circuit board that aggregates multiple
* memory devices in parallel. In general, this is the
* Field Replaceable Unit (FRU) which gets replaced, in
* the case of excessive errors. Most often it is also
* called DIMM (Dual Inline Memory Module).
*
DIMM is implementation dependent. As EDAC is also used on non-x86 archs, calling
it DIMM on ARM is probably wrong.
Calling it "STIMS" (or any other unusual acronym) seems worse ;)
>
> Then label (from SMBIOS) ... then the internal details. Good.
Regards,
Mauro