From: Dave Jones <davej@redhat.com>
To: Avi Kivity <avi@argo.co.il>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: discriminate single bit error hardware failure from slab corruption.
Date: Thu, 2 Feb 2006 23:20:35 -0500
Message-ID: <20060203042035.GF10209@redhat.com>
In-Reply-To: <43E2BA63.5050505@argo.co.il>
On Fri, Feb 03, 2006 at 04:05:23AM +0200, Avi Kivity wrote:
> unsigned char modified_bits = data[offset+i] ^ POISON_FREE;
> int modified_bits_count = hweight8(modified_bits);
> total += modified_bits_count;
>
> >wrt correctness, what do you see wrong with my approach?
> Your code will generate a false positive 8 times in 256 runs, or 1 in
> 32. A 3% false positive rate seems excessive. It's also sensitive to
> changes to POISON_FREE.
Hmm, I made a mistake in my maths somewhere, and some of those values
were incorrect. Having the compiler do the work would have stopped
me screwing up, but once the correct values are used, I doubt there will
ever be a really compelling reason to change the slab poison pattern.
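For reference, the XOR-and-popcount check quoted above can be sketched
in user space roughly as follows. This is only an illustration, not
code from either proposal: hweight8_sketch() is a stand-in for the
kernel's hweight8(), and count_flipped_bits() is a hypothetical helper
name. A line would be treated as a single bit error only when the
returned total is exactly 1.

```c
#define POISON_FREE 0x6b	/* poison byte for freed objects, as used in the patch */

/* User-space stand-in for the kernel's hweight8(): set bits in one byte. */
static unsigned int hweight8_sketch(unsigned char w)
{
	unsigned int count = 0;

	while (w) {
		count += w & 1;
		w >>= 1;
	}
	return count;
}

/*
 * Hypothetical helper: count how many bits in
 * data[offset] .. data[offset + limit - 1] differ from POISON_FREE.
 */
static unsigned int count_flipped_bits(const char *data, int offset, int limit)
{
	unsigned int total = 0;
	int i;

	for (i = 0; i < limit; i++) {
		unsigned char modified_bits =
			(unsigned char)data[offset + i] ^ POISON_FREE;

		total += hweight8_sketch(modified_bits);
	}
	return total;
}
```

Because the count is derived from POISON_FREE itself, this variant does
not need a hand-written table of flipped values.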
Dave
In the case where we detect a single bit has been flipped, we spew
the usual slab corruption message, which users instantly think
is a kernel bug. In a lot of cases, single bit errors are
down to bad memory, or other hardware failure.
This patch adds an extra line to the slab debug messages
in those cases, in the hope that users will try memtest before
they report a bug.
000: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Single bit error detected. Possibly bad RAM. Run memtest86.
Signed-off-by: Dave Jones <davej@redhat.com>
--- linux-2.6.15/mm/slab.c~ 2006-01-09 13:25:17.000000000 -0500
+++ linux-2.6.15/mm/slab.c 2006-01-09 13:26:01.000000000 -0500
@@ -1313,8 +1313,11 @@ static void poison_obj(kmem_cache_t *cac
static void dump_line(char *data, int offset, int limit)
{
int i;
+ unsigned char total=0;
printk(KERN_ERR "%03x:", offset);
for (i = 0; i < limit; i++) {
+ if (data[offset+i] != POISON_FREE)
+ total += data[offset+i];
printk(" %02x", (unsigned char)data[offset + i]);
}
printk("\n");
@@ -1019,6 +1023,22 @@ static void dump_line(char *data, int of
}
}
printk("\n");
+ switch (total) {
+ /* 01101011 (0x6b - POISON_FREE) */
+ case 0x6a: /* 01101010 bit 0 flipped */
+ case 0x69: /* 01101001 bit 1 flipped */
+ case 0x6f: /* 01101111 bit 2 flipped */
+ case 0x63: /* 01100011 bit 3 flipped */
+ case 0x7b: /* 01111011 bit 4 flipped */
+ case 0x4b: /* 01001011 bit 5 flipped */
+ case 0x2b: /* 00101011 bit 6 flipped */
+ case 0xeb: /* 11101011 bit 7 flipped */
+ printk (KERN_ERR "Single bit error detected. Possibly bad RAM\n");
+#ifdef CONFIG_X86
+ printk (KERN_ERR "Run memtest86 or other memory test tool.\n");
+#endif
+ return;
+ }
}
#endif
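To make the false-positive concern from the quoted reply concrete, here
is a rough user-space sketch of the sum-and-switch check in the patch
above. The function name sum_check_flags_single_bit() is made up for
this illustration, and the logic is a simplified copy rather than the
kernel code itself.

```c
#define POISON_FREE 0x6b	/* poison byte for freed objects, as used in the patch */

/*
 * Hypothetical user-space copy of the patch's check: sum every byte that
 * is not POISON_FREE into an 8-bit total, then compare the total against
 * the eight values that differ from 0x6b by exactly one bit.
 * Returns 1 if the line would be reported as a single bit error.
 */
static int sum_check_flags_single_bit(const char *data, int offset, int limit)
{
	unsigned char total = 0;
	int i;

	for (i = 0; i < limit; i++) {
		if ((unsigned char)data[offset + i] != POISON_FREE)
			total += (unsigned char)data[offset + i];
	}

	switch (total) {
	case 0x6a: case 0x69: case 0x6f: case 0x63:
	case 0x7b: case 0x4b: case 0x2b: case 0xeb:
		return 1;
	}
	return 0;
}
```

With one byte changed to 0x6a this returns 1, as intended. But two
corrupted bytes of 0x35 also sum to 0x6a, so the check fires even though
ten bits differ from the poison pattern.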