From: Borislav Petkov <bp@alien8.de>
To: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>,
Mauro Carvalho Chehab <mchehab@kernel.org>,
Aristeu Rozanski <aris@redhat.com>,
linux-edac@vger.kernel.org
Subject: [3/3] EDAC, sb_edac: Fix reporting wrong DIMM when patrol scrubber finds error
Date: Thu, 6 Sep 2018 12:53:36 +0200
Message-ID: <20180906105336.GD10768@zn.tnic>
On Tue, Sep 04, 2018 at 02:07:59PM -0700, Tony Luck wrote:
> From: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>
> EDAC driver sometimes reports the wrong DIMM for a memory
"EDAC driver"? Which one?
Please be more precise when writing your commit messages. They're not
write-only.
> error found by the patrol scrubber. It's rooted in h/w that
> only provides a 4KB page aligned address for the error case.
> This means that the EDAC driver will point at the DIMM matching
> offset 0x0 in the 4KB page, but because of interleaving across
> channels and ranks the actual DIMM involved may be different
> if the error is on some other cache line within the page.
>
> For this error case, we could pass the socket/iMC/channel
> information from the "mce" structure passed the EDAC driver
> and "dimm=-1" to the EDAC core. So it will report all the
> DIMMs on that channel may be affected.
>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> drivers/edac/sb_edac.c | 95 +++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 89 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
> index f3678cdada83..f6009c7d452b 100644
> --- a/drivers/edac/sb_edac.c
> +++ b/drivers/edac/sb_edac.c
> @@ -326,6 +326,7 @@ struct sbridge_info {
>  	const struct interleave_pkg *interleave_pkg;
>  	u8 max_sad;
>  	u8 (*get_node_id)(struct sbridge_pvt *pvt);
> +	u8 (*get_ha)(u8 bank);
I'm staring at all this code and wondering what this "ha" is. I could
use a comment somewhere...
>  	enum mem_type (*get_memory_type)(struct sbridge_pvt *pvt);
>  	enum dev_type (*get_width)(struct sbridge_pvt *pvt, u32 mtr);
>  	struct pci_dev *pci_vtd;
> @@ -1002,6 +1003,22 @@ static u8 knl_get_node_id(struct sbridge_pvt *pvt)
>  	return GET_BITFIELD(reg, 0, 2);
>  }
>
> +static u8 sbridge_get_ha(u8 bank)
> +{
> +	return 0;
> +}
> +
> +static u8 ibridge_get_ha(u8 bank)
> +{
> +	switch (bank) {
> +	case 7 ... 8:
> +		return bank - 7;
> +	case 9 ... 16:
> +		return (bank - 9) / 4;
> +	default:
> +		return -EINVAL;
> +	}
> +}
>
>  static u64 haswell_get_tolm(struct sbridge_pvt *pvt)
>  {
> @@ -2207,6 +2224,56 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
>  	return 0;
>  }
>
> +static int get_memory_error_data_from_mce(struct mem_ctl_info *mci,
> +					  const struct mce *m, u8 *socket,
> +					  u8 *ha, long *channel_mask,
> +					  char *msg)
> +{
> +	u32 reg, channel = GET_BITFIELD(m->status, 0, 3);
> +	struct mem_ctl_info *new_mci;
> +	struct sbridge_pvt *pvt;
> +	struct pci_dev *pci_ha;
> +	bool tad0;
> +
> +	if (channel >= NUM_CHANNELS) {
> +		sprintf(msg, "Invalid channel 0x%x", channel);
> +		return -EINVAL;
> +	}
> +
> +	pvt = mci->pvt_info;
> +	*ha = pvt->info.get_ha(m->bank);
You need to check the get_ha pointer before calling it because
KNIGHTS_LANDING assigns NULL to it.
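IOW, something along these lines. This is a minimal, untested sketch that builds outside the driver: struct info_sketch and lookup_ha() are made-up names standing in for struct sbridge_info and the call site in get_memory_error_data_from_mce(), and returning -EINVAL for the missing callback is just an assumption. The only point is the NULL test before the indirect call.

```c
#include <errno.h>
#include <stdint.h>

typedef uint8_t u8;

/* Hypothetical, trimmed-down stand-in for struct sbridge_info. */
struct info_sketch {
	u8 (*get_ha)(u8 bank);	/* may be NULL, e.g. on KNIGHTS_LANDING */
};

/* Same body as sbridge_get_ha() in the patch. */
static u8 sbridge_get_ha(u8 bank)
{
	(void)bank;
	return 0;
}

/*
 * Hypothetical helper: returns 0 and fills *ha on success, or -EINVAL
 * (an assumed error code) when the platform provides no callback.
 * *ha is left untouched in the error case.
 */
static int lookup_ha(const struct info_sketch *info, u8 bank, u8 *ha)
{
	if (!info->get_ha)
		return -EINVAL;

	*ha = info->get_ha(bank);
	return 0;
}
```

The caller would then bail out on a negative return instead of jumping through a NULL pointer; what it reports in that case is up to the driver.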