From: Mauro Carvalho Chehab <m.chehab@samsung.com>
To: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@alien8.de>,
tony.luck@intel.com, bhelgaas@google.com, rostedt@goodmis.org,
rjw@sisk.pl, lance.ortiz@hp.com, linux-pci@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org,
Aristeu Rozanski Filho <arozansk@redhat.com>
Subject: Re: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event
Date: Wed, 14 Aug 2013 20:54:33 -0300
Message-ID: <20130814205433.452ef58d@concha.lan>
In-Reply-To: <520A651E.3050604@linux.vnet.ibm.com>
On Tue, 13 Aug 2013 22:25:58 +0530,
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
(Sorry for the late answer; I had to take a short trip yesterday.)
> On 08/13/2013 05:51 PM, Mauro Carvalho Chehab wrote:
> > On Tue, 13 Aug 2013 17:06:14 +0530,
> > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> >
> >> On 08/12/2013 11:26 PM, Borislav Petkov wrote:
> >>> On Mon, Aug 12, 2013 at 02:25:57PM -0300, Mauro Carvalho Chehab wrote:
> >>>> Userspace still needs the EDAC sysfs, in order to identify how the
> >>>> memory is organized, and do the proper memory labels association.
> >>>>
> >>>> What edac_ghes does is to fill those sysfs nodes, and to call the
> >>>> existing tracing to report errors.
> >>
> >> I suppose you're referring to the entries under /sys/devices/system/edac/mc?
> >
> > Yes.
> >
> >>
> >> I'm not sure I understand how this helps. ghes_edac seems to just be
> >> populating this based on dmi, which if I'm not mistaken, can be obtained
> >> in userspace (mcelog as an example).
> >>
> >> Also, on my system, all DIMMs are being reported under mc0. I doubt if
> >> the labels there are accurate.
> >
> > Yes, this is the current status of ghes_edac, where BIOS doesn't provide any
> > reliable way to associate a given APEI report to a physical DIMM slot label.
> >
> > The plan is to add more logic there as BIOSes start to provide some reliable
> > way to do such association. I discussed this subject with a few vendors
> > while I was working at Red Hat.
>
> Hmm... is there anything specific in the APEI report that could help?
I didn't see anything in the APEI spec that describes how the memory is
organized. So it is hard for the ghes_edac driver to discover how many
memory controllers, channels and slots are available. This data is
needed to allow userspace to pass the labels for each DIMM, or for the
kernel to auto-discover them.
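For illustration only, here is a minimal userspace sketch of what "passing the labels" builds on: it walks the EDAC sysfs tree and collects the label of each DIMM node. It assumes the per-DIMM layout (mcN/dimmM/dimm_label) exposed by the EDAC core; the function name list_dimm_labels and its edac_root parameter are hypothetical, not part of any existing tool.

```python
import os
import re


def list_dimm_labels(edac_root="/sys/devices/system/edac/mc"):
    """Return {(mc, dimm): label} for every mcN/dimmM/dimm_label file found.

    edac_root is a parameter (an assumption of this sketch) so the walk
    can also be pointed at a copy of the tree instead of live sysfs.
    """
    labels = {}
    if not os.path.isdir(edac_root):
        # EDAC not loaded, or no memory controller registered.
        return labels
    for mc in sorted(os.listdir(edac_root)):
        if not re.fullmatch(r"mc\d+", mc):
            continue
        mc_dir = os.path.join(edac_root, mc)
        for dimm in sorted(os.listdir(mc_dir)):
            if not re.fullmatch(r"dimm\d+", dimm):
                continue
            label_file = os.path.join(mc_dir, dimm, "dimm_label")
            try:
                with open(label_file) as f:
                    labels[(mc, dimm)] = f.read().strip()
            except OSError:
                # Node without a readable label file: skip it.
                continue
    return labels


if __name__ == "__main__":
    for (mc, dimm), label in list_dimm_labels().items():
        print(f"{mc}/{dimm}: {label or '(no label set)'}")
```

On a ghes_edac system like the one described above, every key would be expected to sit under mc0, which is exactly why the labels cannot yet be trusted to match physical slots.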
> More importantly, is there a need to do this in-kernel rather than in
> user-space?
Yes, for two reasons:
1) On a critical error, the machine will die. The EDAC core will print
   the error to dmesg, but no other record will be available to be
   parsed later;
2) With hot-pluggable memories, dynamic channel rerouting, memory
   poisoning and other funny things, it may not be possible to point to
   a DIMM if the parsing is done at a later time.
Regards,
Mauro