From: Mauro Carvalho Chehab <mchehab@redhat.com>
To: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>,
Linux Edac Mailing List <linux-edac@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts
Date: Thu, 1 Nov 2012 15:47:30 -0200
Message-ID: <20121101154730.3580c356@redhat.com>
In-Reply-To: <CA+8MBbK5J1qWqTZjC6nHsVbqk05t0yF1F7d-_0PQpvBQBXgO1w@mail.gmail.com>
On Thu, 1 Nov 2012 10:25:23 -0700
Tony Luck <tony.luck@intel.com> wrote:
> On Thu, Nov 1, 2012 at 4:47 AM, Mauro Carvalho Chehab
> <mchehab@redhat.com> wrote:
> > Take a look at arch/x86/kernel/cpu/mcheck/mce-apei.c:
> >
> > void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
> > {
> > 	struct mce m;
> >
> > 	/* Only corrected MC is reported */
> > 	if (!corrected || !(mem_err->validation_bits &
> > 				CPER_MEM_VALID_PHYSICAL_ADDRESS))
> > 		return;
> >
> > 	mce_setup(&m);
> > 	m.bank = 1;
> > 	/* Fake a memory read corrected error with unknown channel */
> > 	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
> > 	m.addr = mem_err->physical_addr;
> > 	mce_log(&m);
> > 	mce_notify_irq();
> > }
> >
> > Bank information there is fake; status is fake. Only addr is really filled
> > there; it works only for corrected errors.
>
> This went in like this to help out the Westmere-EX processors that
> didn't fill out MCi_ADDR for corrected errors. APEI could get the
> address from some platform CSRs ... reporting via /dev/mcelog
> so that predictive analysis in mcelog(8) would work on these machines.
Ok, but it is broken on other platforms like Sandy Bridge.
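To make the "fake" part concrete, here is a rough userspace sketch that decodes the status value built in apei_mce_report_mem_error() quoted above. The bit positions follow the MCi_STATUS layout and the "1MMM CCCC" memory controller error code format as I understand them from the Intel SDM; struct mc_status_info and decode_mc_status() are names made up for this illustration, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* MCi_STATUS flag bits (assumed to match the kernel's asm/mce.h). */
#define MCI_STATUS_VAL   (1ULL << 63)	/* register contents valid */
#define MCI_STATUS_UC    (1ULL << 61)	/* uncorrected error */
#define MCI_STATUS_EN    (1ULL << 60)	/* error reporting enabled */
#define MCI_STATUS_ADDRV (1ULL << 58)	/* MCi_ADDR is valid */

/* Hypothetical container for the decoded fields. */
struct mc_status_info {
	bool valid;
	bool uncorrected;
	bool addr_valid;
	bool is_mem_ctrl_err;	/* MCA code matches 000F 0000 1MMM CCCC */
	unsigned int mem_op;	/* MMM: 1 means memory read */
	unsigned int channel;	/* CCCC: 0xf means channel not specified */
};

static struct mc_status_info decode_mc_status(uint64_t status)
{
	struct mc_status_info info;
	uint16_t code = (uint16_t)(status & 0xffff);

	info.valid = (status & MCI_STATUS_VAL) != 0;
	info.uncorrected = (status & MCI_STATUS_UC) != 0;
	info.addr_valid = (status & MCI_STATUS_ADDRV) != 0;
	/* Ignore the filter bit (bit 12) when matching the code format. */
	info.is_mem_ctrl_err = (code & 0xef80) == 0x0080;
	info.mem_op = (code >> 4) & 0x7;
	info.channel = code & 0xf;
	return info;
}
```

Feeding it MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f gives a valid, corrected memory read error with a valid address and an unspecified channel, which is exactly what the comment in the function says and nothing more.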
> I don't think we can rip it out yet ... not until those machines are
> shuffled off to recycle heaven.
Perhaps we could then add logic to mce-apei that forwards errors to MCE
only on the platforms where the MCE log is known to be right.
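Something along these lines, perhaps. This is only a userspace sketch of the idea, not a patch: struct cpu_id, the apei_mce_allowed[] table and apei_mce_forward_ok() are hypothetical names, and the real check would look at boot_cpu_data inside mce-apei.c. The single table entry assumes Westmere-EX is family 6, model 0x2f; which other models belong there would need to be confirmed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU identification, standing in for boot_cpu_data. */
struct cpu_id {
	unsigned int family;
	unsigned int model;
};

/*
 * Assumed allow-list of CPUs where the MCE record built from APEI data
 * is wanted. Only Westmere-EX is listed, since it is the case mentioned
 * in this thread.
 */
static const struct cpu_id apei_mce_allowed[] = {
	{ 6, 0x2f },	/* Westmere-EX (assumed model number) */
};

/* Return true if APEI memory errors should be forwarded to MCE. */
static bool apei_mce_forward_ok(const struct cpu_id *cpu)
{
	size_t n = sizeof(apei_mce_allowed) / sizeof(apei_mce_allowed[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (apei_mce_allowed[i].family == cpu->family &&
		    apei_mce_allowed[i].model == cpu->model)
			return true;
	}
	return false;
}
```

apei_mce_report_mem_error() would then return early when this check fails, so that a Sandy Bridge machine never gets the faked record in its MCE log.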
> But perhaps we should get smarter about which machines we enable
> APEI on?
That makes sense. IMO, APEI should be on by default only if no other driver
exists, as in the case of Nehalem-EX. For platforms supported by i7core_edac,
sb_edac and amd64_edac, we could add a parameter to explicitly force it on;
otherwise, APEI would be disabled there.
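The policy would boil down to something like the sketch below. Both arguments are hypothetical: native_edac_available stands for "one of i7core_edac, sb_edac or amd64_edac supports this machine", and force stands for the proposed module parameter, which does not exist today.

```c
#include <stdbool.h>

/*
 * Decide whether APEI error reporting should be enabled.
 *
 * native_edac_available: a hardware-specific EDAC driver supports
 *                        this machine (hypothetical input).
 * force:                 the user explicitly asked for APEI via the
 *                        proposed parameter (hypothetical input).
 */
static bool apei_report_enabled(bool native_edac_available, bool force)
{
	/* An explicit request from the user always wins. */
	if (force)
		return true;

	/* Otherwise APEI is only the fallback when nothing else exists. */
	return !native_edac_available;
}
```

So a Nehalem-EX box, with no native driver, gets APEI by default, while a machine handled by sb_edac only gets it when the parameter is set.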
> If we get everything we need from the machine check banks,
> then the detour via the BIOS to report the same thing again isn't helpful.
Agreed.
Regards,
Mauro
Thread overview: 12+ messages
2012-10-31 13:58 [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts Mauro Carvalho Chehab
2012-11-01 11:05 ` Borislav Petkov
2012-11-01 11:47 ` Mauro Carvalho Chehab
2012-11-01 17:25 ` Tony Luck
2012-11-01 17:47 ` Mauro Carvalho Chehab [this message]
2012-11-01 19:55 ` Borislav Petkov
2012-11-01 21:09 ` Luck, Tony
2012-11-01 22:02 ` Borislav Petkov
2012-11-01 23:47 ` Luck, Tony
2012-11-01 23:54 ` Borislav Petkov
2012-11-02 2:25 ` Mauro Carvalho Chehab
2012-11-02 2:12 ` Mauro Carvalho Chehab