From: Avi Kivity <avi@argo.co.il>
To: Dave Jones <davej@redhat.com>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: discriminate single bit error hardware failure from slab corruption.
Date: Fri, 03 Feb 2006 02:44:52 +0200
Message-ID: <43E2A784.2070809@argo.co.il>
In-Reply-To: <20060202192414.GA22074@redhat.com>
Dave Jones wrote:
>In the case where we detect a single bit has been flipped, we spew
>the usual slab corruption message, which users instantly think
>is a kernel bug. In a lot of cases, single bit errors are
>down to bad memory, or other hardware failure.
>
>This patch adds an extra line to the slab debug messages in those
>cases, in the hope that users will try memtest before they report a bug.
>
>000: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
>Single bit error detected. Possibly bad RAM. Please run memtest86.
>
>Signed-off-by: Dave Jones <davej@redhat.com>
>
>--- linux-2.6.15/mm/slab.c~ 2006-01-09 13:25:17.000000000 -0500
>+++ linux-2.6.15/mm/slab.c 2006-01-09 13:26:01.000000000 -0500
>@@ -1313,8 +1313,11 @@ static void poison_obj(kmem_cache_t *cac
> static void dump_line(char *data, int offset, int limit)
> {
> 	int i;
>+	unsigned char total=0;
> 	printk(KERN_ERR "%03x:", offset);
> 	for (i = 0; i < limit; i++) {
>+		if (data[offset+i] != POISON_FREE)
>+			total += data[offset+i];
>
>
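How about

	total += hweight8(data[offset+i] ^ POISON_FREE);

here instead? That counts the bits that differ from the poison byte,
rather than summing the bytes themselves.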
> 		printk(" %02x", (unsigned char)data[offset + i]);
> 	}
> 	printk("\n");
>@@ -1019,6 +1023,18 @@ static void dump_line(char *data, int of
> }
> }
> printk("\n");
>+	switch (total) {
>+	case 0x36:
>+	case 0x6a:
>+	case 0x6f:
>+	case 0x81:
>+	case 0xac:
>+	case 0xd3:
>+	case 0xd5:
>+	case 0xea:
>+		printk (KERN_ERR "Single bit error detected. Possibly bad RAM. Please run memtest86.\n");
>+		return;
>+	}
>
>
and then a

	if (total == 1)
		printk(...);

here? It seems more readable, and more correct as well.
> }
> #endif
>
>
>
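To make the two suggestions above concrete, here is a minimal userspace
sketch of the idea, not the actual patch. hweight8() is the kernel's
bit-counting helper; hweight8_sketch(), flipped_bits() and
line_has_single_bit_error() are hypothetical names used only for this
illustration, and POISON_FREE is taken to be 0x6b as in the dump quoted
at the top.

```c
#include <stdio.h>

/* Poison byte written into freed slab objects (0x6b in the quoted dump). */
#define POISON_FREE 0x6b

/*
 * Userspace stand-in for the kernel's hweight8(): returns the number of
 * set bits in one byte. (Assumption: the kernel version is not available
 * outside the kernel tree, so it is reimplemented here.)
 */
static unsigned int hweight8_sketch(unsigned char w)
{
	unsigned int count = 0;

	while (w) {
		count += w & 1u;
		w >>= 1;
	}
	return count;
}

/*
 * Hypothetical helper: count how many bits in data[0..limit) differ
 * from the poison pattern. XOR leaves a 1 only where a bit was flipped.
 */
static unsigned int flipped_bits(const unsigned char *data, int limit)
{
	unsigned int total = 0;
	int i;

	for (i = 0; i < limit; i++)
		total += hweight8_sketch(data[i] ^ POISON_FREE);
	return total;
}

/* Hypothetical helper: true if exactly one bit in the line was flipped. */
static int line_has_single_bit_error(const unsigned char *data, int limit)
{
	return flipped_bits(data, limit) == 1;
}
```

With a 16-byte line of 0x6b in which one byte reads 0x6a, as in the
quoted dump, flipped_bits() returns 1 and the check fires. The same
holds for any other single flipped bit (0x69, 0x63, 0x7b and so on),
with no table of magic sums to maintain, and two flipped bits in
different bytes are not mistaken for one.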
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.