From: Andi Kleen <andi@firstfloor.org>
To: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>, Andi Kleen <andi@firstfloor.org>,
Pekka Enberg <penberg@cs.helsinki.fi>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: MCE boot crash in qemu
Date: Mon, 15 Jun 2009 14:52:01 +0200
Message-ID: <20090615125200.GD31969@one.firstfloor.org>
In-Reply-To: <19f34abd0906150459v2eb6fd1ak86586bc697c1e69f@mail.gmail.com>
On Mon, Jun 15, 2009 at 01:59:04PM +0200, Vegard Nossum wrote:
> Hi,
>
> I get an MCE-related crash like this in latest linus tree:
>
> [ 0.115341] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
> [ 0.116396] CPU: L2 Cache: 512K (64 bytes/line)
> [ 0.120570] mce: CPU supports 0 MCE banks
> [ 0.124870] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> [ 0.128001] IP: [<ffffffff813b98ad>] mcheck_init+0x278/0x320
> [ 0.128001] PGD 0
> [ 0.128001] Thread overran stack, or stack corrupted
> [ 0.128001] Oops: 0002 [#1] PREEMPT SMP
> [ 0.128001] last sysfs file:
> [ 0.128001] CPU 0
> [ 0.128001] Modules linked in:
> [ 0.128001] Pid: 0, comm: swapper Not tainted 2.6.30 #426
> [ 0.128001] RIP: 0010:[<ffffffff813b98ad>] [<ffffffff813b98ad>] mcheck_init+0x278/0x320
> [ 0.128001] RSP: 0018:ffffffff81595e38 EFLAGS: 00000246
> [ 0.128001] RAX: 0000000000000010 RBX: ffffffff8158f900 RCX: 0000000000000000
> [ 0.128001] RDX: 0000000000000000 RSI: 00000000000000ff RDI: 0000000000000010
> [ 0.128001] RBP: ffffffff81595e68 R08: 0000000000000001 R09: 0000000000000000
> [ 0.128001] R10: 0000000000000010 R11: 0000000000000000 R12: 0000000000000000
> [ 0.128001] R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
> [ 0.128001] FS: 0000000000000000(0000) GS:ffff880002288000(0000) knlGS:0000000000000000
> [ 0.128001] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [ 0.128001] CR2: 0000000000000010 CR3: 0000000001001000 CR4: 00000000000006b0
> [ 0.128001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.128001] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
> [ 0.128001] Process swapper (pid: 0, threadinfo ffffffff81594000, task ffffffff8152a4a0)
> [ 0.128001] Stack:
> [ 0.128001] 0000000081595e68 5aa50ed3b4ddbe6e ffffffff8158f900 ffffffff8158f914
> [ 0.128001] ffffffff8158f948 0000000000000000 ffffffff81595eb8 ffffffff813b869c
> [ 0.128001] 5aa50ed3b4ddbe6e 00000001078bfbfd 0000062300000800 5aa50ed3b4ddbe6e
> [ 0.128001] Call Trace:
> [ 0.128001] [<ffffffff813b869c>] identify_cpu+0x331/0x392
> [ 0.128001] [<ffffffff815a1445>] identify_boot_cpu+0x23/0x6e
> [ 0.128001] [<ffffffff815a14ac>] check_bugs+0x1c/0x60
> [ 0.128001] [<ffffffff8159c075>] start_kernel+0x403/0x46e
> [ 0.128001] [<ffffffff8159b2ac>] x86_64_start_reservations+0xac/0xd5
> [ 0.128001] [<ffffffff8159b3ea>] x86_64_start_kernel+0x115/0x14b
> [ 0.128001] [<ffffffff8159b140>] ? early_idt_handler+0x0/0x71
> [ 0.128001] Code: c7 48 89 05 9e 71 40 00 74 2a 48 63 15 91 71 40 00 be ff 00 00 00 48 c1 e2 03 e8 bf a1 e2 ff e9 3f fe ff ff 48 8b 05 7b 71 40 00 <48> c7 00 00 00 00 00 eb 84 c7 05 40 71 40 00 01 00 00 00 e9 2b
> [ 0.128001] RIP [<ffffffff813b98ad>] mcheck_init+0x278/0x320
> [ 0.128001] RSP <ffffffff81595e38>
> [ 0.128001] CR2: 0000000000000010
> [ 0.129306] ---[ end trace a7919e7f17c0a725 ]---
>
> It's this:
>
> 		/*
> 		 * Various K7s with broken bank 0 around. Always disable
> 		 * by default.
> 		 */
> 		if (c->x86 == 6)
> 			bank[0] = 0;
>
> in mce_cpu_quirks() in arch/x86/kernel/cpu/mcheck/mce.c around line
> 1217. Strange that it thinks this is AMD cpu, though?

Probably qemu fakes that. You can check in /proc/cpuinfo after
it has booted.

It should really clear the MCA CPUID flag if it doesn't have any MCA banks,
but ok.

Here's an untested patch (sorry, I'm not able to test any patches currently).
Does it fix the problem?

A workaround, if you don't want to apply the patch, is to boot with mce=off.

-Andi

---
x86: mce: Handle banks == 0 case in K7 quirk

This happens on QEMU, which reports MCA capability but no banks.
Without this patch there is a buffer overrun and a boot oops, because the
code would try to initialize element 0 of a zero-length kmalloc()
buffer.

Signed-off-by: Andi Kleen <ak@linux.intel.com>

--- linux-2.6.30-git8/arch/x86/kernel/cpu/mcheck/mce.c-o	2009-06-15 14:45:52.000000000 +0200
+++ linux-2.6.30-git8/arch/x86/kernel/cpu/mcheck/mce.c	2009-06-15 14:46:40.000000000 +0200
@@ -1245,7 +1245,7 @@
 		 * Various K7s with broken bank 0 around. Always disable
 		 * by default.
 		 */
-		if (c->x86 == 6)
+		if (c->x86 == 6 && banks > 0)
 			bank[0] = 0;
 	}

Thread overview: 7+ messages
2009-06-15 11:59 MCE boot crash in qemu Vegard Nossum
2009-06-15 12:01 ` Pekka Enberg
2009-06-15 12:52 ` Andi Kleen [this message]
2009-06-15 13:22 ` Pekka Enberg
2009-06-17 5:50 ` Pekka Enberg
2009-06-17 6:57 ` Ingo Molnar
2009-06-17 10:32 ` [tip:x86/urgent] x86: mce: Handle banks == 0 case in K7 quirk tip-bot for Andi Kleen