linux-raid.vger.kernel.org archive mirror
From: "Don Dupuis" <dondster@gmail.com>
To: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: Md corruption using RAID10 on linux-2.6.21
Date: Mon, 21 May 2007 14:32:26 -0500
Message-ID: <632b79000705211232gfb57e0focae8cdd0bf25bef@mail.gmail.com>
In-Reply-To: <632b79000705162050w70a2feb1w14919fde6e98818c@mail.gmail.com>

On 5/16/07, Don Dupuis <dondster@gmail.com> wrote:
> On 5/16/07, Don Dupuis <dondster@gmail.com> wrote:
> > On 5/16/07, Don Dupuis <dondster@gmail.com> wrote:
> > > On 5/16/07, Neil Brown <neilb@suse.de> wrote:
> > > > On Wednesday May 16, dondster@gmail.com wrote:
> > > > ...
> > > > >
> > > > > The problem arises when I remove a drive such as sda and then
> > > > > remove power from the system. Most of the time I end up with a
> > > > > corrupted partition on the md device. Other times the corruption is
> > > > > in my root partition, which is an ext3 filesystem. I seem to have a
> > > > > better chance of booting at least once with no errors with the bitmap
> > > > > turned on, but if I repeat the process, I get corruption as well.
> > > > > Also, with the bitmap turned on, adding the new drive into the md
> > > > > device takes far too long: I only get about 3MB per second on the
> > > > > resync. With the bitmap turned off, I get a resync rate of 10MB to
> > > > > 15MB per second. Has anyone else seen this behavior, or is this
> > > > > situation not tested very often? I would think that I shouldn't get
> > > > > corruption with this raid setup and journaling on my filesystems.
> > > > > Any help would be appreciated.
> > > >
> > > >
> > > > The resync rate should be the same whether you have a bitmap or not,
> > > > so that observation is very strange.  Can you double check, and report
> > > > the contents of "/proc/mdstat" in the two situations.
> > > >
> > > > You say you have corruption on your root filesystem.  Presumably that
> > > > is not on the raid?  Maybe the drive doesn't get a chance to flush
> > > > its cache when you power off.  Do you get the same corruption if you
> > > > simulate a crash without turning off the power, e.g.
> > > >    echo b > /proc/sysrq-trigger
> > > >
> > > > Do you get the same corruption in the raid10 if you turn it off
> > > > *without* removing a drive first?
> > > >
> > > > NeilBrown
> > > >
> > > Powering off with all drives present does not cause corruption. When I
> > > have a drive missing and the md device does a full resync, I get the
> > > corruption. Usually the md partition table is corrupt or gone, and
> > > with the first drive gone it happens more frequently. If the partition
> > > table is not corrupt, then the root filesystem or one of the other
> > > filesystems on the md device is corrupted. Yes, my root filesystem
> > > is on the raid device. I will follow up with the bitmap resync rate
> > > details later.
> > >
> > > Don
> > >
> > Forgot to tell you that I have the drive write cache disabled on all my drives.
> >
> > Don
> >
> Here is the /proc/mdstat output during a recovery after adding a drive
> to the md device:
> unused devices: <none>
> -bash-3.1$ cat /proc/mdstat
> Personalities : [raid10]
> md_d0 : active raid10 sda2[4] sdd2[3] sdc2[2] sdb2[1]
>       3646464 blocks 256K chunks 3 near-copies [4/3] [_UUU]
>       [>....................]  recovery =  2.6% (73216/2734848) finish=4.8min speed=9152K/sec
>
> unused devices: <none>
> -bash-3.1$ cat /proc/mdstat
> Personalities : [raid10]
> md_d0 : active raid10 sda2[4] sdd2[3] sdc2[2] sdb2[1]
>       3646464 blocks 256K chunks 3 near-copies [4/3] [_UUU]
>       [>....................]  recovery =  3.4% (93696/2734848) finish=4.6min speed=9369K/sec
>
> I am still trying to reproduce the low recovery rate I saw with the
> bitmap turned on. I will get back to you.
> Don
>
Any updates, Neil?
Is there anything new I can try to get you additional info?
Thanks

Don
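
For anyone comparing resync rates with and without a bitmap, the progress
line in the quoted /proc/mdstat output can be cross-checked with a little
arithmetic. The following is only a sketch, not part of the original
message. It assumes the block counts are 1K blocks, that the recovery total
(2734848) is the per-device size, and that the mail-wrapped "finish=..."
continuation has been joined back onto the progress line. The function
names are made up for illustration.

```python
import re

# Progress fields of a /proc/mdstat recovery/resync line, after the
# mail-wrapped "finish=... speed=..." continuation has been re-joined.
PROGRESS = re.compile(
    r"(?P<kind>recovery|resync)\s*=\s*(?P<pct>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>[\d.]+)min\s*speed=(?P<speed>\d+)K/sec"
)


def parse_progress(text):
    """Return the progress fields as numbers, or None if nothing matches."""
    m = PROGRESS.search(text)
    if m is None:
        return None
    return {
        "kind": m.group("kind"),
        "percent": float(m.group("pct")),
        "done_kib": int(m.group("done")),
        "total_kib": int(m.group("total")),
        "finish_min": float(m.group("finish")),
        "speed_kib_s": int(m.group("speed")),
    }


def eta_minutes(progress):
    """Remaining time implied by the reported speed (assumes 1K blocks)."""
    remaining = progress["total_kib"] - progress["done_kib"]
    return remaining / progress["speed_kib_s"] / 60.0


def near_copy_capacity(devices, copies, per_device_blocks):
    """Usable blocks when every block is stored `copies` times."""
    return devices * per_device_blocks // copies


if __name__ == "__main__":
    line = ("[>....................]  recovery =  2.6% (73216/2734848) "
            "finish=4.8min speed=9152K/sec")
    p = parse_progress(line)
    print(p["kind"], p["speed_kib_s"], "K/sec, ETA",
          round(eta_minutes(p), 2), "min")
    # 4 devices, 3 near-copies, per-device size taken from the recovery total
    print(near_copy_capacity(4, 3, p["total_kib"]), "blocks")
```

With the first sample above, the estimate comes out within a tenth of a
minute of the reported finish=4.8min, and the capacity check reproduces the
3646464 blocks shown for md_d0. Running the same parse on output captured
with the bitmap on and off would give directly comparable speed figures.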


Thread overview: 12+ messages
2007-05-16 18:38 Md corruption using RAID10 on linux-2.6.21 Don Dupuis
2007-05-17  1:54 ` Neil Brown
2007-05-17  2:57   ` Don Dupuis
2007-05-17  2:58     ` Don Dupuis
2007-05-17  3:50       ` Don Dupuis
2007-05-21 19:32         ` Don Dupuis [this message]
2007-05-22  0:50           ` Neil Brown
2007-05-22  2:47             ` Don Dupuis
2007-05-22  3:59               ` Neil Brown
2007-05-31  1:52                 ` Don Dupuis
2007-05-31  5:16                   ` Neil Brown
2007-06-01 15:58                     ` Don Dupuis
