public inbox for linux-kernel@vger.kernel.org
From: Borislav Petkov <borislav.petkov@amd.com>
To: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Doug Thompson <dougthompson@xmission.com>
Subject: Re: 2.6.32-rc8: amd64_edac slub error
Date: Tue, 1 Dec 2009 16:16:39 +0100
Message-ID: <20091201151639.GA670@aftab>
In-Reply-To: <20091130141629.44401d86.randy.dunlap@oracle.com>

> EDAC DEBUG: in drivers/edac/amd64_edac.c, line at 2367:     DRAM MEM-CTL PCI Bus ID:	0000:00:18.2
> EDAC DEBUG: in drivers/edac/amd64_edac.c, line at 2369:     Misc device PCI Bus ID:	0000:00:18.3
> calling  alsa_pcm_init+0x0/0x71 [snd_pcm] @ 1402
> initcall alsa_pcm_init+0x0/0x71 [snd_pcm] returned 0 after 17 usecs
> EDAC amd64: ECC is enabled by BIOS.
> get_cpus_on_this_dct_cpumask: nid: 0, cpu: 0
> get_cpus_on_this_dct_cpumask: nid: 0, cpu: 2
> amd64_nb_mce_bank_enabled_on_node: weight: 2
> EDAC DEBUG: in drivers/edac/amd64_edac.c, line at 2776: core: 0, MCG_CTL: 0x1f, NB MSR is enabled
> EDAC DEBUG: in drivers/edac/amd64_edac.c, line at 2776: core: 2, MCG_CTL: 0x0, NB MSR is disabled
> =============================================================================
> BUG kmalloc-16: Redzone overwritten
> -----------------------------------------------------------------------------

Hmm, I think I know what happens. This machine has non-contiguous
core enumeration on a node (e.g. 0,2 on node 0 instead of 0,1) but
rdmsr_on_cpus assumes the latter, i.e. contiguous numbering. Therefore
we write outside of the allocated msrs array, hence the redzone
overwrite. Here's a simple fix that should take care of it. Please apply
it on top of the debugging patch and capture the output again so that we
can verify it.
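
To make the arithmetic concrete, here is a rough userspace sketch, not
kernel code. mask_t and the mask_*() helpers are made-up stand-ins for
cpumask_t, cpumask_weight() and cpumask_first(), and slot_for_cpu()
encodes an assumption: that each cpu's value is stored at index
(cpu - first cpu in the mask).

```c
#include <assert.h>

/*
 * Userspace illustration only. mask_t and the mask_*() helpers are
 * hypothetical stand-ins for cpumask_t, cpumask_weight() and
 * cpumask_first(); bit n set means "cpu n is on this node".
 */
typedef unsigned long mask_t;

#define MASK_BITS ((int)(8 * sizeof(mask_t)))

/* Number of cpus in the mask: the old allocation size. */
static int mask_weight(mask_t m)
{
	int n = 0;

	for (; m; m &= m - 1)	/* clear the lowest set bit each pass */
		n++;
	return n;
}

/* Lowest cpu number in the mask, or -1 if the mask is empty. */
static int mask_first(mask_t m)
{
	int i;

	for (i = 0; i < MASK_BITS; i++)
		if (m & (1UL << i))
			return i;
	return -1;
}

/* Highest cpu number in the mask, or -1 if the mask is empty. */
static int mask_last(mask_t m)
{
	int i, last = -1;

	for (i = 0; i < MASK_BITS; i++)
		if (m & (1UL << i))
			last = i;
	return last;
}

/* Slots needed when indexing by (cpu - first cpu): the new size. */
static int mask_span(mask_t m)
{
	return m ? mask_last(m) - mask_first(m) + 1 : 0;
}

/* ASSUMPTION: the array slot that a given cpu's value is written to. */
static int slot_for_cpu(mask_t m, int cpu)
{
	assert(cpu >= 0 && cpu < MASK_BITS && (m & (1UL << cpu)));
	return cpu - mask_first(m);
}
```

For the mask from the log above (cpus 0 and 2), mask_weight() gives 2
while slot_for_cpu() puts cpu 2 in slot 2, so an array of mask_span() = 3
entries is needed to stay in bounds.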

I'll fix this properly when I get back and then maybe even backport it
depending on the intrusiveness of the changes.

Thanks.

---
 drivers/edac/amd64_edac.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 139bc14..c013261 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2750,7 +2750,8 @@ static bool amd64_nb_mce_bank_enabled_on_node(int nid)
 {
 	cpumask_t mask;
 	struct msr *msrs;
-	int cpu, nbe, idx = 0;
+	int cpu, nbe, i, idx = 0;
+	int first_cpu, last_cpu = 0;
 	bool ret = false;
 
 	cpumask_clear(&mask);
@@ -2759,7 +2760,17 @@ static bool amd64_nb_mce_bank_enabled_on_node(int nid)
 
 	pr_err("%s: weight: %d\n", __func__, cpumask_weight(&mask));
 
-	msrs = kzalloc(sizeof(struct msr) * cpumask_weight(&mask), GFP_KERNEL);
+	/*
+	 * calc. cores interval to handle non-contiguous core enumeration
+	 */
+	first_cpu = cpumask_first(&mask);
+
+	for (i = first_cpu; i < nr_cpu_ids; i++)
+		if (cpumask_test_cpu(i, &mask))
+			last_cpu = i;
+
+	msrs = kzalloc(sizeof(struct msr) * (last_cpu - first_cpu + 1),
+		       GFP_KERNEL);
 	if (!msrs) {
 		amd64_printk(KERN_WARNING, "%s: error allocating msrs\n",
 			      __func__);
-- 
1.6.4.3

-- 
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
  System  | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
 Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
  Center  | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
  (OSRC)  | Registergericht München, HRB Nr. 43632

