From: Mauro Carvalho Chehab <mchehab@redhat.com>
To: Borislav Petkov <bp@amd64.org>
Cc: "Shaohui Xie" <Shaohui.Xie@freescale.com>,
"Jason Uhlenkott" <juhlenko@akamai.com>,
"Aristeu Rozanski" <arozansk@redhat.com>,
"Hitoshi Mitake" <h.mitake@gmail.com>,
"Mark Gross" <mark.gross@intel.com>,
"Dmitry Eremin-Solenikov" <dbaryshkov@gmail.com>,
"Ranganathan Desikan" <ravi@jetztechnologies.com>,
"Egor Martovetsky" <egor@pasemi.com>,
"Niklas Söderlund" <niklas.soderlund@ericsson.com>,
"Tim Small" <tim@buttersideup.com>,
"Arvind R." <arvino55@gmail.com>,
"Chris Metcalf" <cmetcalf@tilera.com>,
"Olof Johansson" <olof@lixom.net>,
"Doug Thompson" <norsk5@yahoo.com>,
"Linux Edac Mailing List" <linux-edac@vger.kernel.org>,
"Michal Marek" <mmarek@suse.cz>, "Jiri Kosina" <jkosina@suse.cz>,
"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
"Joe Perches" <joe@perches.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
Date: Fri, 27 Apr 2012 12:36:12 -0300 [thread overview]
Message-ID: <4F9ABCEC.9090807@redhat.com> (raw)
In-Reply-To: <20120427133304.GE9626@aftab.osrc.amd.com>
Em 27-04-2012 10:33, Borislav Petkov escreveu:
> Btw,
>
> this patch gives
>
> [ 8.278399] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
> [ 8.287594] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
> [ 8.296784] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: dimm2 (1:0:0): row 1, chan 0
> [ 8.305968] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: dimm3 (1:1:0): row 1, chan 1
> [ 8.315144] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: dimm4 (2:0:0): row 2, chan 0
> [ 8.324326] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: dimm5 (2:1:0): row 2, chan 1
> [ 8.333502] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: dimm6 (3:0:0): row 3, chan 0
> [ 8.342684] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: dimm7 (3:1:0): row 3, chan 1
> [ 8.351860] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: dimm8 (4:0:0): row 4, chan 0
> [ 8.361049] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: dimm9 (4:1:0): row 4, chan 1
> [ 8.370227] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: dimm10 (5:0:0): row 5, chan 0
> [ 8.379582] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: dimm11 (5:1:0): row 5, chan 1
> [ 8.388941] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: dimm12 (6:0:0): row 6, chan 0
> [ 8.398315] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: dimm13 (6:1:0): row 6, chan 1
> [ 8.407680] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: dimm14 (7:0:0): row 7, chan 0
> [ 8.417047] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: dimm15 (7:1:0): row 7, chan 1
>
> and the memory controller has the following chip selects
>
> [ 8.137662] EDAC MC: DCT0 chip selects:
> [ 8.150291] EDAC amd64: MC: 0: 2048MB 1: 2048MB
> [ 8.155349] EDAC amd64: MC: 2: 2048MB 3: 2048MB
> [ 8.160408] EDAC amd64: MC: 4: 0MB 5: 0MB
> [ 8.165475] EDAC amd64: MC: 6: 0MB 7: 0MB
> [ 8.180499] EDAC MC: DCT1 chip selects:
> [ 8.184693] EDAC amd64: MC: 0: 2048MB 1: 2048MB
> [ 8.189753] EDAC amd64: MC: 2: 2048MB 3: 2048MB
> [ 8.194812] EDAC amd64: MC: 4: 0MB 5: 0MB
> [ 8.199875] EDAC amd64: MC: 6: 0MB 7: 0MB
>
> Those are 4 dual-ranked DIMMs on this node, DCT0 is one channel and DCT1
> is another and I have 4 ranks per channel. Having dimm0-dimm15 is very
> misleading and has nothing to do with the reality. So, if this is to use
> your nomenclature with layers, I'll have dimm0-dimm7 where each dimm is
> a rank.
>
> Or, the most correct thing to do would be to have dimm0-dimm3, each
> dual-ranked.
>
> So either tot_dimms is computed wrongly or there's a more serious error
> somewhere.
>
> I've reviewed almost the half patch, will review the rest when/if we
> sort out the above issue first.
>
> Thanks.
The fix for it were in another patch[1], as calling them as "rank" is needed
also at the sysfs API.
[1] http://lists-archives.com/linux-kernel/27623222-edac-add-a-new-per-dimm-api-and-make-the-old-per-virtual-rank-api-obsolete.html
I can just merge the fix on this patch, with the enclosed diff.
Regards,
Mauro
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 4d4d8b7..e0d9481 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -86,7 +86,7 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
debugf4("\tmci->edac_check = %p\n", mci->edac_check);
debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
mci->nr_csrows, mci->csrows);
- debugf3("\tmci->nr_dimms = %d, dimns = %p\n",
+ debugf3("\tmci->nr_dimms = %d, dimms = %p\n",
mci->tot_dimms, mci->dimms);
debugf3("\tdev = %p\n", mci->dev);
debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
@@ -183,10 +183,6 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
* @size_pvt: size of private storage needed
*
*
- * FIXME: drivers handle multi-rank memories on different ways: on some
- * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
- * a single multi-rank DIMM would be mapped into several "dimms".
- *
* Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
* such DIMMS properly, but the CSROWS-based ones will likely do the wrong
* thing, as two chip select values are used for dual-rank memories (and 4, for
@@ -201,6 +197,12 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
*
* Use edac_mc_free() to free mc structures allocated by this function.
*
+ * NOTE: drivers handle multi-rank memories on different ways: on some
+ * drivers, one multi-rank memory is mapped as one entry, while, on others,
+ * a single multi-rank DIMM would be mapped into several entries. Currently,
+ * this function will allocate multiple struct dimm_info on such scenarios,
+ * as grouping the multiple ranks require drivers change.
+ *
* Returns:
* NULL allocation failed
* struct mem_ctl_info pointer
@@ -220,10 +222,11 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
void *pvt;
unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
- unsigned tot_csrows, tot_cschannels;
+ unsigned tot_csrows, tot_cschannels, tot_errcount = 0;
int i, j;
int err;
int row, chn;
+ bool per_rank = false;
BUG_ON(n_layers > EDAC_MAX_LAYERS);
/*
@@ -239,6 +242,9 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
tot_csrows *= layers[i].size;
else
tot_cschannels *= layers[i].size;
+
+ if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
+ per_rank = true;
}
/* Figure out the offsets of the various items from the start of an mc
@@ -254,14 +260,21 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
count = 1;
for (i = 0; i < n_layers; i++) {
count *= layers[i].size;
+ debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+ tot_errcount += 2 * count;
}
+
+ debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
pvt = edac_align_ptr(&ptr, sz_pvt, 1);
size = ((unsigned long)pvt) + sz_pvt;
- debugf1("%s(): allocating %u bytes for mci data (%d dimms, %d csrows/channels)\n",
- __func__, size, tot_dimms, tot_csrows * tot_cschannels);
+ debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+ __func__, size,
+ tot_dimms,
+ per_rank ? "ranks" : "dimms",
+ tot_csrows * tot_cschannels);
mci = kzalloc(size, GFP_KERNEL);
if (mci == NULL)
return NULL;
@@ -290,6 +303,7 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
memcpy(mci->layers, layers, sizeof(*lay) * n_layers);
mci->nr_csrows = tot_csrows;
mci->num_cschannel = tot_cschannels;
+ mci->mem_is_per_rank = per_rank;
/*
* Fills the csrow struct
@@ -315,15 +329,16 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
memset(&pos, 0, sizeof(pos));
row = 0;
chn = 0;
- debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
+ debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+ per_rank ? "ranks" : "dimms");
for (i = 0; i < tot_dimms; i++) {
chan = &csi[row].channels[chn];
dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
pos[0], pos[1], pos[2]);
dimm->mci = mci;
- debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
- i, (dimm - mci->dimms),
+ debugf2("%s: %d: %s%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+ i, per_rank ? "rank" : "dimm", (dimm - mci->dimms),
pos[0], pos[1], pos[2], row, chn);
/* Copy DIMM location */
@@ -1040,8 +1055,10 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
* get csrow/channel of the dimm, in order to allow
* incrementing the compat API counters
*/
- debugf4("%s: dimm csrows (%d,%d)\n",
- __func__, dimm->csrow, dimm->cschannel);
+ debugf4("%s: %s csrows map: (%d,%d)\n",
+ __func__,
+ mci->mem_is_per_rank ? "rank" : "dimm",
+ dimm->csrow, dimm->cschannel);
if (row == -1)
row = dimm->csrow;
else if (row >= 0 && row != dimm->csrow)
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 412d5cd..2b66109 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -555,6 +555,8 @@ struct mem_ctl_info {
/* Memory Controller hierarchy */
unsigned n_layers;
struct edac_mc_layer *layers;
+ bool mem_is_per_rank;
+
/*
* DIMM info. Will eventually remove the entire csrows_info some day
*/
next prev parent reply other threads:[~2012-04-27 15:37 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1335289087-11337-1-git-send-email-mchehab@redhat.com>
2012-04-24 18:15 ` [PATCH EDACv16 1/2] edac: Change internal representation to work with layers Mauro Carvalho Chehab
2012-04-27 13:33 ` Borislav Petkov
2012-04-27 14:11 ` Joe Perches
2012-04-27 15:12 ` Borislav Petkov
2012-04-27 16:07 ` Mauro Carvalho Chehab
2012-04-28 8:52 ` Borislav Petkov
2012-04-28 20:38 ` Joe Perches
2012-04-29 14:25 ` Mauro Carvalho Chehab
2012-04-29 15:11 ` Mauro Carvalho Chehab
2012-04-29 16:03 ` Joe Perches
2012-04-29 17:18 ` Mauro Carvalho Chehab
2012-04-27 15:36 ` Mauro Carvalho Chehab [this message]
2012-04-28 9:05 ` Borislav Petkov
2012-04-29 13:49 ` Mauro Carvalho Chehab
2012-04-30 8:15 ` Borislav Petkov
2012-04-30 10:58 ` Mauro Carvalho Chehab
2012-04-30 11:11 ` Borislav Petkov
2012-04-30 11:45 ` Mauro Carvalho Chehab
2012-04-30 12:38 ` Borislav Petkov
2012-04-30 13:00 ` Mauro Carvalho Chehab
2012-04-30 13:53 ` Mauro Carvalho Chehab
2012-04-30 11:37 ` Mauro Carvalho Chehab
2012-04-27 17:52 ` Mauro Carvalho Chehab
2012-04-27 18:11 ` Luck, Tony
2012-04-27 19:24 ` Mauro Carvalho Chehab
2012-04-28 8:58 ` Borislav Petkov
2012-04-28 9:16 ` Borislav Petkov
2012-04-28 17:07 ` Joe Perches
2012-04-29 14:02 ` Mauro Carvalho Chehab
2012-04-29 14:16 ` Mauro Carvalho Chehab
2012-04-30 7:59 ` Borislav Petkov
2012-04-30 11:23 ` Mauro Carvalho Chehab
2012-04-30 12:51 ` Borislav Petkov
2012-05-02 13:30 Borislav Petkov
2012-05-03 14:16 ` Mauro Carvalho Chehab
2012-05-04 9:52 ` Borislav Petkov
2012-05-04 10:15 ` Mauro Carvalho Chehab
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F9ABCEC.9090807@redhat.com \
--to=mchehab@redhat.com \
--cc=Shaohui.Xie@freescale.com \
--cc=akpm@linux-foundation.org \
--cc=arozansk@redhat.com \
--cc=arvino55@gmail.com \
--cc=bp@amd64.org \
--cc=cmetcalf@tilera.com \
--cc=dbaryshkov@gmail.com \
--cc=egor@pasemi.com \
--cc=h.mitake@gmail.com \
--cc=jkosina@suse.cz \
--cc=joe@perches.com \
--cc=juhlenko@akamai.com \
--cc=linux-edac@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mark.gross@intel.com \
--cc=mmarek@suse.cz \
--cc=niklas.soderlund@ericsson.com \
--cc=norsk5@yahoo.com \
--cc=olof@lixom.net \
--cc=ravi@jetztechnologies.com \
--cc=tim@buttersideup.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).