From: Mauro Carvalho Chehab <m.chehab@samsung.com>
To: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@alien8.de>,
	tony.luck@intel.com, bhelgaas@google.com, rostedt@goodmis.org,
	rjw@sisk.pl, lance.ortiz@hp.com, linux-pci@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org,
	Aristeu Rozanski Filho <arozansk@redhat.com>
Subject: Re: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event
Date: Tue, 13 Aug 2013 09:21:54 -0300
Message-ID: <20130813092154.1e17385f@concha.lan>
In-Reply-To: <520A1A2E.9080500@linux.vnet.ibm.com>

On Tue, 13 Aug 2013 17:06:14 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> On 08/12/2013 11:26 PM, Borislav Petkov wrote:
> > On Mon, Aug 12, 2013 at 02:25:57PM -0300, Mauro Carvalho Chehab wrote:
> >> Userspace still needs the EDAC sysfs, in order to identify how the
> >> memory is organized, and do the proper memory labels association.
> >>
> >> What edac_ghes does is to fill those sysfs nodes, and to call the
> >> existing tracing to report errors.
> 
> I suppose you're referring to the entries under /sys/devices/system/edac/mc?

Yes.
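
For illustration only, here is a minimal Python sketch of how a userspace
tool might read the DIMM labels from those nodes. It assumes the per-DIMM
layout mc*/dimm*/dimm_label under /sys/devices/system/edac/mc; the exact
layout depends on the kernel version and on the EDAC driver in use, and the
function name read_dimm_labels is made up for this example.

```python
import os
import re


def read_dimm_labels(edac_root="/sys/devices/system/edac/mc"):
    """Return {(mc, dimm): label} for every mc*/dimm*/dimm_label file found.

    The directory layout is an assumption; an unreadable or missing
    dimm_label file is skipped rather than treated as an error.
    """
    labels = {}
    if not os.path.isdir(edac_root):
        return labels
    for mc in sorted(os.listdir(edac_root)):
        if not re.fullmatch(r"mc\d+", mc):
            continue
        mc_dir = os.path.join(edac_root, mc)
        for dimm in sorted(os.listdir(mc_dir)):
            if not re.fullmatch(r"dimm\d+", dimm):
                continue
            label_path = os.path.join(mc_dir, dimm, "dimm_label")
            try:
                with open(label_path) as f:
                    labels[(mc, dimm)] = f.read().strip()
            except OSError:
                continue
    return labels


if __name__ == "__main__":
    for (mc, dimm), label in read_dimm_labels().items():
        print(f"{mc}/{dimm}: {label or '(no label)'}")
```

An empty label in the output means that nothing has filled that node in yet.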

> 
> I'm not sure I understand how this helps. ghes_edac seems to just be 
> populating this based on dmi, which if I'm not mistaken, can be obtained 
> in userspace (mcelog as an example).
> 
> Also, on my system, all DIMMs are being reported under mc0. I doubt if 
> the labels there are accurate.

Yes, that is the current state of ghes_edac: the BIOS doesn't provide any
reliable way to associate a given APEI report with a physical DIMM slot label.

The plan is to add more logic there as BIOSes start to provide a reliable
way to make that association. I discussed this subject with a few vendors
while I was working at Red Hat.

> >
> > This is the only reason which justifies EDAC's existence. Naveen, can
> > your BIOS directly report the silkscreen label of the DIMM in error?
> > Generally, can any BIOS do that?
> >
> > More specifically, what are those gdata_fru_id and gdata_fru_text
> > things?
> 
> My understanding was that this provides the DIMM serial number, but I'm 
> double checking just to be sure.

If it provides the DIMM serial number, then it is possible to improve the
ghes_edac driver to use it to associate an error with a DIMM. One option
could be to write an I2C driver and read that information directly from the
memory modules, although doing that could be risky, as the BIOS could also
try to access the same I2C buses.
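
Purely as a sketch of the association step, and assuming (this is not
confirmed in this thread) that the FRU text really carries the DIMM serial
number as a fixed-width, NUL-padded ASCII field, the lookup could be as
simple as the Python below. The serial_to_label table is hypothetical: it
stands for whatever source (DMI, SPD, ...) ends up providing the
serial-to-slot mapping, and both function names are made up for this example.

```python
def decode_fru_text(raw: bytes) -> str:
    """Decode a fixed-width, NUL-padded text field into a plain string."""
    # Keep only the bytes before the first NUL, then drop stray whitespace.
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace").strip()


def label_for_fru(raw_fru_text: bytes, serial_to_label: dict):
    """Return the slot label for the reported serial, or None if unknown.

    serial_to_label is a hypothetical {serial: slot label} table built
    from some other source of DIMM inventory data.
    """
    return serial_to_label.get(decode_fru_text(raw_fru_text))
```

Returning None for an unknown serial keeps the "no reliable association"
case explicit instead of guessing at a slot.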

> 
> Thanks,
> Naveen
> 
> >
> > Because if it can, then having the memory error tracepoint come direct
> > from APEI should be enough. The ghes_edac functionality could be then
> > fallback for BIOSes which cannot report the silkscreen label and in such
> > case I can imagine keeping both tracepoints, but disabling one of the
> > two...
> >
> 


-- 

Cheers,
Mauro

