All of lore.kernel.org
 help / color / mirror / Atom feed
From: whollygoat@letterboxes.org
To: linux-raid@vger.kernel.org
Cc: David Greaves <david@dgreaves.com>
Subject: Re: some ?? re failed disk and resyncing of array
Date: Sat, 31 Jan 2009 04:03:08 -0800	[thread overview]
Message-ID: <1233403388.29916.1297756217@webmail.messagingengine.com> (raw)
In-Reply-To: <49842A1E.1090105@dgreaves.com>


On Sat, 31 Jan 2009 10:38:22 +0000, "David Greaves" <david@dgreaves.com>
said:
> whollygoat@letterboxes.org wrote:
> > On a boot a couple of days ago, mdadm failed a disk and
> > started resyncing to spare (raid5, 6 drives, 5 active, 1
> > spare).  smartctl -H <disk> returned info (can't remember
> > the exact text) that made me suspect the drive was
> > fine, but the data connection was bad.  Sure enough the
> > data cable was damaged.  Replaced the cable and smartctl
> > sees the disk just fine and reports no errors.
> > 
> > - I'd like to readd the drive as a spare.  Is it enough
> > to "mdadm --add /dev/hdk" or do I need to prep the drive to
> > remove any data that said where it previously belonged
> > in the array?
> That should work.
> Any issues and you can zero the superblock (man mdadm)
> No need to zero the disk.

Would --re-add be better?

I've noticed something else since I made the initial post

--------- begin output -------------
fly:~# mdadm -D /dev/md0
/dev/md0:
        Version : 01.00.03
  Creation Time : Sun Jan 11 21:49:36 2009
     Raid Level : raid5
     Array Size : 312602368 (298.12 GiB 320.10 GB)
    Device Size : 156301184 (74.53 GiB 80.03 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Fri Jan 30 15:52:01 2009
          State : active
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : fly:FlyFileServ_md  (local to host fly)
           UUID : 0e2b9157:a58edc1d:213a220f:68a555c9
         Events : 16

    Number   Major   Minor   RaidDevice State
       0      33        1        0      active sync   /dev/hde1
       1      34        1        1      active sync   /dev/hdg1
       2      56        1        2      active sync   /dev/hdi1
       5      89        1        3      active sync   /dev/hdo1
       6      88        1        4      active sync   /dev/hdm1


fly:~# mdadm -E /dev/hdo1
/dev/hdo1:
          Magic : a92b4efc
        Version : 01
    Feature Map : 0x1
     Array UUID : 0e2b9157:a58edc1d:213a220f:68a555c9
           Name : fly:FlyFileServ_md  (local to host fly)
  Creation Time : Sun Jan 11 21:49:36 2009
     Raid Level : raid5
   Raid Devices : 5

    Device Size : 234436336 (111.79 GiB 120.03 GB)
     Array Size : 625204736 (298.12 GiB 320.10 GB)
      Used Size : 156301184 (74.53 GiB 80.03 GB)
   Super Offset : 234436464 sectors
          State : clean
    Device UUID : e072bd09:2df53d6d:d23321cc:cf2c37de

Internal Bitmap : 2 sectors from superblock
    Update Time : Fri Jan 30 15:52:01 2009
       Checksum : 4689ff5 - correct
         Events : 16

         Layout : left-symmetric
     Chunk Size : 64K

    Array Slot : 5 (0, 1, 2, failed, failed, 3, 4)
   Array State : uuuUu 2 failed
--------- end output -------------

Why does the "Array Slot" field show 7 slots?  And why
does the field "Array State" show 2 failed?  There 
ever only were 6 disks in the array.  Only one of those
is currently missing.  mdadm -D above doesn't list any
failed devices in the "Failed Devices" field.

Thanks for your answers below as well.  It's kind of 
what I was expecting.  There was a h/w problem that
took ages to track down and I think it was reponsible
for all the e2fs errors.

WG

> 
> > - When I tried to list some files on one of the filesystems
> > on the array (the fact that it took so long to react to
> > the ls is how I discovered the box was in the middle of
> > rebuiling to spare)
> This is OK - resync involves a lot of IO and can slow things down. This
> is tuneable.
> 
> > it couldn't find the file (or many 
> > others).  I thought that resyncing was supposed to be
> > transparent, yet parts of the fs seemed to be missing.
> > Everything was there afterwards.  Is that normal?
> No. This is nothing to do with normal md resyncing and certainly not
> expected.
> 
> > - On a subsequent boot I had to run e2fsck on the three
> > filesystems housed on the array.  Many stray blocks, 
> > illegal inodes, etc were found.  An artifact of the rebuild
> > or unrelated?
> Well, you had a fault in your IO system there's a good chance your O
> broke.
> 
> Verify against a backup.
> 
> David
> 
> 
> -- 
> "Don't worry, you'll be fine; I saw it work in a cartoon once..."
-- 
  
  whollygoat@letterboxes.org

-- 
http://www.fastmail.fm - IMAP accessible web-mail


  reply	other threads:[~2009-01-31 12:03 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-01-31  8:16 some ?? re failed disk and resyncing of array whollygoat
2009-01-31 10:38 ` David Greaves
2009-01-31 12:03   ` whollygoat [this message]
2009-02-01 19:41     ` Bill Davidsen
2009-02-02  1:47       ` whollygoat
2009-02-03  0:52       ` zero-superblock, " whollygoat
2009-02-03  8:48         ` David Greaves
2009-02-04  4:48           ` whollygoat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1233403388.29916.1297756217@webmail.messagingengine.com \
    --to=whollygoat@letterboxes.org \
    --cc=david@dgreaves.com \
    --cc=linux-raid@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.