From: whollygoat@letterboxes.org
To: linux-raid@vger.kernel.org
Cc: David Greaves <david@dgreaves.com>
Subject: Re: some ?? re failed disk and resyncing of array
Date: Sat, 31 Jan 2009 04:03:08 -0800
Message-ID: <1233403388.29916.1297756217@webmail.messagingengine.com>
In-Reply-To: <49842A1E.1090105@dgreaves.com>


On Sat, 31 Jan 2009 10:38:22 +0000, "David Greaves" <david@dgreaves.com>
said:
> whollygoat@letterboxes.org wrote:
> > On a boot a couple of days ago, mdadm failed a disk and
> > started resyncing to spare (raid5, 6 drives, 5 active, 1
> > spare).  smartctl -H <disk> returned info (can't remember
> > the exact text) that made me suspect the drive was
> > fine, but the data connection was bad.  Sure enough the
> > data cable was damaged.  Replaced the cable and smartctl
> > sees the disk just fine and reports no errors.
> > 
> > - I'd like to readd the drive as a spare.  Is it enough
> > to "mdadm --add /dev/hdk" or do I need to prep the drive to
> > remove any data that said where it previously belonged
> > in the array?
> That should work.
> Any issues and you can zero the superblock (man mdadm)
> No need to zero the disk.

Would --re-add be better?

I've noticed something else since I made the initial post:

--------- begin output -------------
fly:~# mdadm -D /dev/md0
/dev/md0:
        Version : 01.00.03
  Creation Time : Sun Jan 11 21:49:36 2009
     Raid Level : raid5
     Array Size : 312602368 (298.12 GiB 320.10 GB)
    Device Size : 156301184 (74.53 GiB 80.03 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Fri Jan 30 15:52:01 2009
          State : active
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : fly:FlyFileServ_md  (local to host fly)
           UUID : 0e2b9157:a58edc1d:213a220f:68a555c9
         Events : 16

    Number   Major   Minor   RaidDevice State
       0      33        1        0      active sync   /dev/hde1
       1      34        1        1      active sync   /dev/hdg1
       2      56        1        2      active sync   /dev/hdi1
       5      89        1        3      active sync   /dev/hdo1
       6      88        1        4      active sync   /dev/hdm1


fly:~# mdadm -E /dev/hdo1
/dev/hdo1:
          Magic : a92b4efc
        Version : 01
    Feature Map : 0x1
     Array UUID : 0e2b9157:a58edc1d:213a220f:68a555c9
           Name : fly:FlyFileServ_md  (local to host fly)
  Creation Time : Sun Jan 11 21:49:36 2009
     Raid Level : raid5
   Raid Devices : 5

    Device Size : 234436336 (111.79 GiB 120.03 GB)
     Array Size : 625204736 (298.12 GiB 320.10 GB)
      Used Size : 156301184 (74.53 GiB 80.03 GB)
   Super Offset : 234436464 sectors
          State : clean
    Device UUID : e072bd09:2df53d6d:d23321cc:cf2c37de

Internal Bitmap : 2 sectors from superblock
    Update Time : Fri Jan 30 15:52:01 2009
       Checksum : 4689ff5 - correct
         Events : 16

         Layout : left-symmetric
     Chunk Size : 64K

    Array Slot : 5 (0, 1, 2, failed, failed, 3, 4)
   Array State : uuuUu 2 failed
--------- end output -------------

Why does the "Array Slot" field show 7 slots?  And why
does the "Array State" field show 2 failed?  There were
only ever 6 disks in the array, and only one of those
is currently missing.  mdadm -D above doesn't list any
failed devices in the "Failed Devices" field.
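
For anyone who wants to count these fields mechanically, here is a
minimal Python sketch.  It is not part of mdadm, and the function names
parse_array_slot and summarize are made up for illustration.  It only
splits the "Array Slot" line quoted above into its entries; it restates
what the output says and does not explain why mdadm reports it that way.

```python
import re

# Line copied from the `mdadm -E /dev/hdo1` output above.
ARRAY_SLOT_LINE = "    Array Slot : 5 (0, 1, 2, failed, failed, 3, 4)"


def parse_array_slot(line):
    """Split an 'Array Slot' line into (this_slot, list_of_entries).

    Hypothetical helper: assumes the 'N (a, b, c, ...)' layout shown
    in the output above.
    """
    match = re.search(r"Array Slot\s*:\s*(\d+)\s*\(([^)]*)\)", line)
    if match is None:
        raise ValueError("not an Array Slot line: %r" % line)
    this_slot = int(match.group(1))
    entries = [e.strip() for e in match.group(2).split(",")]
    return this_slot, entries


def summarize(entries):
    """Count the total, 'failed', and numbered entries in the slot list."""
    failed = sum(1 for e in entries if e == "failed")
    roles = [int(e) for e in entries if e.isdigit()]
    return {"slots": len(entries), "failed": failed, "roles": roles}


this_slot, entries = parse_array_slot(ARRAY_SLOT_LINE)
summary = summarize(entries)
print(this_slot, summary)
```

Run against the line above, it reports 7 entries, 2 of them "failed".
The entry at position 5 (this device's slot) is 3, which matches the
RaidDevice column for /dev/hdo1 in the -D output.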

Thanks for your answers below as well.  It's kind of
what I was expecting.  There was a h/w problem that
took ages to track down and I think it was responsible
for all the e2fs errors.

WG

> 
> > - When I tried to list some files on one of the filesystems
> > on the array (the fact that it took so long to react to
> > the ls is how I discovered the box was in the middle of
> > rebuilding to spare)
> This is OK - resync involves a lot of IO and can slow things down. This
> is tuneable.
> 
> > it couldn't find the file (or many 
> > others).  I thought that resyncing was supposed to be
> > transparent, yet parts of the fs seemed to be missing.
> > Everything was there afterwards.  Is that normal?
> No. This is nothing to do with normal md resyncing and certainly not
> expected.
> 
> > - On a subsequent boot I had to run e2fsck on the three
> > filesystems housed on the array.  Many stray blocks, 
> > illegal inodes, etc were found.  An artifact of the rebuild
> > or unrelated?
> Well, you had a fault in your IO system; there's a good chance your O
> broke.
> 
> Verify against a backup.
> 
> David
> 
> 
> -- 
> "Don't worry, you'll be fine; I saw it work in a cartoon once..."
-- 
  
  whollygoat@letterboxes.org


