RAID reconstruction problems - Michael Welsh Duggan

From: Michael Welsh Duggan <md5i@cs.cmu.edu>
To: linux-raid@vger.kernel.org
Subject: RAID reconstruction problems
Date: Mon, 01 Sep 2003 23:04:20 -0400
Message-ID: <87wucrnb63.fsf@cs.cmu.edu>

[-- Attachment #1: Type: text/plain, Size: 1278 bytes --]

I currently have two small software RAID arrays: a RAID 1 for my root
partition and a RAID 5 for my usr partition.  One of the disks in the
arrays died, and I threw in a new disk with the intention of
rebuilding the arrays.

The rebuilds failed, but in an extremely strange fashion.  Judging by
/proc/mdstat, the rebuilds seem to go just fine.  When they finish,
however, /proc/mdstat includes the new disk but also declares it
invalid.  The system continues running in degraded mode.
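
The degraded state described above can be read mechanically from the
[configured/active] and [U_] markers in /proc/mdstat.  What follows is
a minimal Python sketch, assuming the 2.4-style layout shown in the
attached output; parse_mdstat, degraded, and SAMPLE are hypothetical
names, and the regular expressions cover only the lines seen in this
message.

```python
import re

# An array header line looks like "md1 : active raid1 <members>".
ARRAY_RE = re.compile(r"^(md\d+) :")
# The status line carries "[configured/active] [U_...]".
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\] \[([U_]+)\]")

# Sample copied from the attached /proc/mdstat output.
SAMPLE = """\
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 ide/host0/bus0/target1/lun0/part1[0]
      6144704 blocks [2/1] [U_]

md0 : active raid5 ide/host2/bus1/target0/lun0/part3[2] ide/host0/bus0/target1/lun0/part3[0]
      107233664 blocks level 5, 32k chunk, algorithm 2 [3/2] [U_U]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array: {"configured", "active", "slots"}} from mdstat text."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            arrays[current] = {
                "configured": int(status.group(1)),
                "active": int(status.group(2)),
                "slots": status.group(3),
            }
            current = None
    return arrays


def degraded(arrays):
    """Names of arrays with fewer active members than configured."""
    return sorted(n for n, a in arrays.items() if a["active"] < a["configured"])


print(degraded(parse_mdstat(SAMPLE)))  # prints ['md0', 'md1']
```

On a live system the same functions could be fed the contents of
/proc/mdstat instead of SAMPLE, for example before and after a rebuild
attempt.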

When I start a rebuild from the root console, I get some messages from
the raid subsystem, including full debugging output.  I have not yet
figured out how to capture this output so that I can include it in
this message, but I did write down a part of one attempt (this was by
hand, so there may be small inconsistencies):

RAID5 conf printout
 --- rd:3 wd:2 fd:1
 disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
 disk 1, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
md: bug in file raid5.c, line 1901
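
A printout like the one above can be turned into structured records so
that two attempts are easier to compare.  This is only a sketch in
Python: the field names (spare, operational, number, raid_disk,
used_slot) are assumed expansions of the s/o/n/rd/us abbreviations and
are not stated anywhere in this message, and parse_conf,
non_operational, and HANDWRITTEN are hypothetical names.

```python
import re

# One "disk" line of a RAID5 conf printout.
DISK_RE = re.compile(
    r"disk (\d+), s:(\d+), o:(\d+), n:(\d+) rd:(\d+) us:(\d+) dev:(.+)"
)

# Field names are ASSUMED meanings of the abbreviations in the printout.
FIELDS = ("slot", "spare", "operational", "number", "raid_disk", "used_slot")

# The hand-copied printout from this message.
HANDWRITTEN = """\
RAID5 conf printout
 --- rd:3 wd:2 fd:1
 disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
 disk 1, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
"""


def parse_conf(text):
    """Return one dict per 'disk' line found in a conf printout."""
    disks = []
    for line in text.splitlines():
        m = DISK_RE.search(line)
        if m:
            record = dict(zip(FIELDS, (int(v) for v in m.groups()[:6])))
            record["dev"] = m.group(7).strip()
            disks.append(record)
    return disks


def non_operational(disks):
    """Slots whose o: flag is 0."""
    return [d["slot"] for d in disks if d["operational"] == 0]


print(non_operational(parse_conf(HANDWRITTEN)))  # prints [1]
```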

Here is some output from my system.  If any more information would be
useful, or anyone thinks I should try something else, please let me
know.  I would like to get out of my currently degraded state!
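
One possible way to capture the raid messages rather than copying them
by hand is to filter the kernel log.  The Python sketch below assumes
that dmesg is available and that the messages are still in the kernel
ring buffer when it runs; the prefix list is an assumption based on
the lines quoted in this message, and raid_lines and save_raid_log are
hypothetical names.

```python
import re
import subprocess

# ASSUMED prefixes, taken from the md/raid lines quoted in this message.
RAID_LINE = re.compile(
    r"^(md\d*:|raid\d+:|RAID\d+ conf printout| ---| disk \d+,)"
)


def raid_lines(log_text):
    """Keep only the lines that look like md/raid subsystem output."""
    return [line for line in log_text.splitlines() if RAID_LINE.match(line)]


def save_raid_log(path):
    """Run dmesg and write the filtered lines to path."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    with open(path, "w") as out:
        out.write("\n".join(raid_lines(result.stdout)) + "\n")
```

Calling save_raid_log("dmesg-raid") shortly after a rebuild attempt
would leave the filtered messages in a file that can be attached.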


[-- Attachment #2: Output from my system --]
[-- Type: text/plain, Size: 7655 bytes --]

maru:/# uname -a
Linux maru 2.4.21 #3 Fri Aug 29 13:14:01 EDT 2003 i686 GNU/Linux
maru:/# cat ~md5i/dmesg-raid
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
   8regs     :  1841.200 MB/sec
   32regs    :   935.600 MB/sec
   pIII_sse  :  2052.000 MB/sec
   pII_mmx   :  2247.600 MB/sec
   p5_mmx    :  2383.200 MB/sec
raid5: using function: pIII_sse (2052.000 MB/sec)
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
 [events: 00000198]
 [events: 00000008]
 [events: 00000196]
 [events: 000000f3]
 [events: 00000086]
 [events: 00000008]
md: autorun ...
md: considering ide/host2/bus1/target0/lun0/part3 ...
md:  adding ide/host2/bus1/target0/lun0/part3 ...
md:  adding ide/host0/bus0/target1/lun0/part3 ...
md: created md0
md: bind<ide/host0/bus0/target1/lun0/part3,1>
md: bind<ide/host2/bus1/target0/lun0/part3,2>
md: running: <ide/host2/bus1/target0/lun0/part3><ide/host0/bus0/target1/lun0/part3>
md: ide/host2/bus1/target0/lun0/part3's event counter: 00000008
md: ide/host0/bus0/target1/lun0/part3's event counter: 00000008
md0: max total readahead window set to 496k
md0: 2 data-disks, max readahead per data-disk: 248k
raid5: device ide/host2/bus1/target0/lun0/part3 operational as raid disk 2
raid5: device ide/host0/bus0/target1/lun0/part3 operational as raid disk 0
raid5: md0, not all disks are operational -- trying to recover array
raid5: allocated 3284kB for md0
raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
RAID5 conf printout:
 --- rd:3 wd:2 fd:1
 disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
 disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
 disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
RAID5 conf printout:
 --- rd:3 wd:2 fd:1
 disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
 disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
 disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
md: updating md0 RAID superblock on device
md: ide/host2/bus1/target0/lun0/part3 [events: 00000009]<6>(write) ide/host2/bus1/target0/lun0/part3's sb offset: 53640960
md: recovery thread got woken up ...
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: ide/host0/bus0/target1/lun0/part3 [events: 00000009]<6>(write) ide/host0/bus0/target1/lun0/part3's sb offset: 53616832
md: considering ide/host2/bus1/target0/lun0/part1 ...
md:  adding ide/host2/bus1/target0/lun0/part1 ...
md:  adding ide/host0/bus1/target0/lun0/part1 ...
md:  adding ide/host0/bus0/target1/lun0/part1 ...
md: created md1
md: bind<ide/host0/bus0/target1/lun0/part1,1>
md: bind<ide/host0/bus1/target0/lun0/part1,2>
md: bind<ide/host2/bus1/target0/lun0/part1,3>
md: running: <ide/host2/bus1/target0/lun0/part1><ide/host0/bus1/target0/lun0/part1><ide/host0/bus0/target1/lun0/part1>
md: ide/host2/bus1/target0/lun0/part1's event counter: 00000086
md: ide/host0/bus1/target0/lun0/part1's event counter: 00000196
md: ide/host0/bus0/target1/lun0/part1's event counter: 00000198
md: superblock update time inconsistency -- using the most recent one
md: freshest: ide/host0/bus0/target1/lun0/part1
md: kicking non-fresh ide/host2/bus1/target0/lun0/part1 from array!
md: unbind<ide/host2/bus1/target0/lun0/part1,2>
md: export_rdev(ide/host2/bus1/target0/lun0/part1)
md: kicking non-fresh ide/host0/bus1/target0/lun0/part1 from array!
md: unbind<ide/host0/bus1/target0/lun0/part1,1>
md: export_rdev(ide/host0/bus1/target0/lun0/part1)
md1: removing former faulty ide/host0/bus1/target0/lun0/part1!
md: RAID level 1 does not need chunksize! Continuing anyway.
md1: max total readahead window set to 124k
md1: 1 data-disks, max readahead per data-disk: 124k
raid1: device ide/host0/bus0/target1/lun0/part1 operational as mirror 0
raid1: md1, not all disks are operational -- trying to recover array
raid1: raid set md1 active with 1 out of 2 mirrors
md: updating md1 RAID superblock on device
md: ide/host0/bus0/target1/lun0/part1 [events: 00000199]<6>(write) ide/host0/bus0/target1/lun0/part1's sb offset: 6144704
md: recovery thread got woken up ...
md1: no spare disk to reconstruct array! -- continuing in degraded mode
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: considering ide/host0/bus1/target0/lun0/part3 ...
md:  adding ide/host0/bus1/target0/lun0/part3 ...
md: md0 already running, cannot run ide/host0/bus1/target0/lun0/part3
md: export_rdev(ide/host0/bus1/target0/lun0/part3)
md: (ide/host0/bus1/target0/lun0/part3 was pending)
md: ... autorun DONE.
maru:/# cat /proc/mdstat 
Personalities : [raid1] [raid5] 
read_ahead 1024 sectors
md1 : active raid1 ide/host0/bus0/target1/lun0/part1[0]
      6144704 blocks [2/1] [U_]
      
md0 : active raid5 ide/host2/bus1/target0/lun0/part3[2] ide/host0/bus0/target1/lun0/part3[0]
      107233664 blocks level 5, 32k chunk, algorithm 2 [3/2] [U_U]
      
unused devices: <none>
maru:/# lsraid -A -a /dev/md0
[dev   9,   0] /dev/md0         94BF0D82.2B9C1BFB.89401B38.92B8F93B online
[dev   3,  67] /dev/ide/host0/bus0/target1/lun0/part3 94BF0D82.2B9C1BFB.89401B38.92B8F93B good
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
[dev  34,   3] /dev/ide/host2/bus1/target0/lun0/part3 94BF0D82.2B9C1BFB.89401B38.92B8F93B good

maru:/# lsraid -A -a /dev/md1
[dev   9,   1] /dev/md1         0E953226.03C91D46.CD00D52F.83A1334E online
[dev   3,  65] /dev/ide/host0/bus0/target1/lun0/part1 0E953226.03C91D46.CD00D52F.83A1334E good
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing

maru:/# cat /etc/raidtab
raiddev /dev/md0
        raid-level      5
        nr-raid-disks   3
        nr-spare-disks  0
        persistent-superblock 1
        parity-algorithm        left-symmetric
        chunk-size      32
        device          /dev/hdb3
        raid-disk       0
        device          /dev/hdc3
        raid-disk       1
        device		/dev/hdg3
        raid-disk       2

raiddev /dev/md1
        raid-level      1
        nr-raid-disks   2
        nr-spare-disks  1
        persistent-superblock 1
        chunk-size      4
        device          /dev/hdb1
        raid-disk       0
        device          /dev/hdc1
        raid-disk       1
	device		/dev/hdg1
	spare-disk	0
maru:/# ls -l /dev/hdb1
lr-xr-xr-x    1 root     root           33 Sep  1 18:29 /dev/hdb1 -> ide/host0/bus0/target1/lun0/part1
maru:/# ls -l /dev/hdc1
lr-xr-xr-x    1 root     root           33 Sep  1 18:29 /dev/hdc1 -> ide/host0/bus1/target0/lun0/part1
maru:/# ls -l /dev/hdg1
lr-xr-xr-x    1 root     root           33 Sep  1 18:29 /dev/hdg1 -> ide/host2/bus1/target0/lun0/part1
maru:/# ls -l /dev/hdb3
lr-xr-xr-x    1 root     root           33 Sep  1 18:29 /dev/hdb3 -> ide/host0/bus0/target1/lun0/part3
maru:/# ls -l /dev/hdc3
lr-xr-xr-x    1 root     root           33 Sep  1 18:29 /dev/hdc3 -> ide/host0/bus1/target0/lun0/part3
maru:/# ls -l /dev/hdg3
lr-xr-xr-x    1 root     root           33 Sep  1 18:29 /dev/hdg3 -> ide/host2/bus1/target0/lun0/part3
maru:/# raidhotadd /dev/md1 /dev/hdc1
maru:/# echo Waited for some time...
Waited for some time...
maru:/# cat /proc/mdstat
Personalities : [raid1] [raid5] 
read_ahead 1024 sectors
md1 : active raid1 ide/host0/bus1/target0/lun0/part1[2] ide/host0/bus0/target1/lun0/part1[0]
      6144704 blocks [2/1] [U_]
      
md0 : active raid5 ide/host2/bus1/target0/lun0/part3[2] ide/host0/bus0/target1/lun0/part3[0]
      107233664 blocks level 5, 32k chunk, algorithm 2 [3/2] [U_U]
      
unused devices: <none>
maru:/# 
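
The "kicking non-fresh" lines in the dmesg output above follow from the
per-device event counters.  The Python sketch below reproduces that
outcome under one stated assumption: a member is treated as stale when
its counter trails the highest one by more than one.  That rule matches
this log, but the kernel's md.c is the authority on it.  classify, MD0,
and MD1 are hypothetical names; the counter values and device names are
copied from the output above.

```python
# Event counters (hex strings) as printed in the dmesg output.
MD1 = {"hdb1": "00000198", "hdc1": "00000196", "hdg1": "00000086"}
MD0 = {"hdg3": "00000008", "hdb3": "00000008"}


def classify(members):
    """Return (freshest device, sorted list of stale devices)."""
    counters = {dev: int(value, 16) for dev, value in members.items()}
    freshest = max(counters, key=counters.get)
    top = counters[freshest]
    # ASSUMPTION: stale means trailing the freshest by more than one event.
    stale = sorted(dev for dev, ev in counters.items() if ev + 1 < top)
    return freshest, stale


print(classify(MD1))     # prints ('hdb1', ['hdc1', 'hdg1'])
print(classify(MD0)[1])  # prints []
```

For md1 this agrees with the log: hdb1 is the freshest member and both
hdc1 and hdg1 are dropped, while neither md0 member is stale.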

[-- Attachment #3: Type: text/plain, Size: 44 bytes --]


-- 
Michael Welsh Duggan
(md5i@cs.cmu.edu)
