public inbox for linux-kernel@vger.kernel.org
From: Bernd Schubert <bs@q-leap.de>
To: linux-kernel@vger.kernel.org
Subject: mkfs.ext2 triggerd RAM corruption
Date: Fri, 4 May 2007 16:59:51 +0200
Message-ID: <200705041659.51675.bs@q-leap.de>

Hi,

I'm rather puzzled at the moment. If this really is a kernel bug, it's a big
one.

Summary: The system ramdisk (initrd) gets corrupted while mkfs.ext2 is running
on a local SATA disk partition.

Reproduced on kernel versions: vanilla 2.6.16 - 2.6.20 (kernels older than
2.6.16 don't run on any of the systems I can test with).
Please note: I could reproduce this on several systems. All of them use ECC
memory, and on most of them the memory is monitored using EDAC.

Details:

1.) Our systems boot from an initrd; all system services run from the
initrd/ramdisk.

2.) While setting up a Lustre metadata storage server, Lustre runs
mkfs.ext2 -j -b 4096 -F -i 4096 -J size=400 -I 512 /dev/sda4
(Please note: I first observed this with a Lustre-patched kernel, but I
could also reproduce it with vanilla kernels.)
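
For a sense of scale, the -i and -I options above determine how much
inode-table data this mkfs lays out: -i 4096 requests one inode per 4096
bytes of partition space and -I 512 makes each inode 512 bytes, so roughly
one eighth of the partition ends up as inode tables. A minimal sketch of that
arithmetic, assuming a hypothetical 100 GiB partition (the real size of
/dev/sda4 is not stated here) and ignoring per-group rounding; the function
name is made up for illustration:

```shell
#!/bin/sh
# Rough estimate of the inode-table size implied by "-i 4096 -I 512".
# The partition size used below is HYPOTHETICAL, not the real size of /dev/sda4.

inode_table_bytes() {
    part_bytes=$1       # partition size in bytes
    bytes_per_inode=$2  # value given to mkfs.ext2 -i
    inode_size=$3       # value given to mkfs.ext2 -I
    inodes=$((part_bytes / bytes_per_inode))
    echo $((inodes * inode_size))
}

# Example: hypothetical 100 GiB partition.
part=$((100 * 1024 * 1024 * 1024))
table=$(inode_table_bytes "$part" 4096 512)
echo "inode tables: $((table / 1024 / 1024)) MiB of $((part / 1024 / 1024)) MiB"
# prints "inode tables: 12800 MiB of 102400 MiB"
```

The real mkfs rounds the inode count per block group, so treat the result as
an order-of-magnitude figure only.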


While this mkfs.ext2 command was running, commands such as ps, top and ls
suddenly started to die with segmentation faults.

To see what's going on, I copied the entire / (i.e. the initrd) into a tmpfs
root, chrooted into it, bind-mounted the original / into this chroot (as
/oldroot), and then repeatedly compared the chroot's /bin with the
bind-mounted /oldroot/bin while the mkfs.ext2 command was running:

beo-05:/# diff -r /bin /oldroot/bin/
beo-05:/# diff -r /bin /oldroot/bin/
beo-05:/# diff -r /bin /oldroot/bin/
Binary files /bin/sleep and /oldroot/bin/sleep differ
beo-05:/# diff -r /bin /oldroot/bin/
Binary files /bin/bsd-csh and /oldroot/bin/bsd-csh differ
Binary files /bin/cat and /oldroot/bin/cat differ
...
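
The repeated comparison above can also be scripted so that the onset of the
corruption gets a timestamp. A minimal sketch, assuming the same layout as in
the diff commands (/bin being the tmpfs copy, /oldroot/bin the original
ramdisk); the function name is illustrative, and it only checks regular files
at the top level of the directory rather than recursing like diff -r:

```shell
#!/bin/sh
# Sketch: print the names of files whose contents differ between a
# reference copy and the tree under suspicion (top level only).

changed_files() {
    ref_dir=$1   # reference copy, e.g. the tmpfs copy of /bin
    live_dir=$2  # tree under suspicion, e.g. /oldroot/bin
    for ref in "$ref_dir"/*; do
        [ -f "$ref" ] || continue
        name=${ref##*/}
        # cmp -s prints nothing; non-zero exit means "differs" or "missing".
        cmp -s "$ref" "$live_dir/$name" 2>/dev/null || echo "$name"
    done
}

# Possible use while mkfs.ext2 is running (not executed here):
#   while sleep 10; do date; changed_files /bin /oldroot/bin; done
```

Since ls and friends may already segfault at that point, running the loop
from inside the tmpfs chroot, as with the diff commands above, seems the safer
choice.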

I also tested different I/O schedulers; it happens at least with deadline and
anticipatory.

The corruption does NOT happen when running the mkfs command on /dev/sda1, but
it does happen with sda2, sda3 and sda4. It also doesn't happen with extended
partitions of sda1.

Any idea what's going on?


Thanks,
Bernd


-- 
Bernd Schubert
Q-Leap Networks GmbH


Thread overview: 10+ messages
2007-05-04 14:59 Bernd Schubert [this message]
2007-05-04 18:49 ` mkfs.ext2 triggerd RAM corruption Theodore Tso
2007-05-05  1:36   ` Bernd Schubert
2007-05-05 18:57     ` Theodore Tso
2007-05-05 19:12       ` Jan Engelhardt
2007-05-05 22:06         ` Bernd Schubert
2007-05-05 23:09       ` Bernd Schubert
2007-05-04 20:39 ` Jan-Benedict Glaw
2007-05-05  1:38   ` Bernd Schubert
2007-05-07 18:42   ` Bill Davidsen
