From: Michael Welsh Duggan <md5i@cs.cmu.edu>
To: linux-raid@vger.kernel.org
Subject: RAID reconstruction problems
Date: Mon, 01 Sep 2003 23:04:20 -0400 [thread overview]
Message-ID: <87wucrnb63.fsf@cs.cmu.edu> (raw)
[-- Attachment #1: Type: text/plain, Size: 1278 bytes --]
I currently have two small Software RAIDs, a RAID 1 for my root
partition, and a RAID 5 for my usr partition. One of the disks in the
arrays died, and I threw in a new disk in with the intention of
rebuilding the arrays.
The rebuilds failed, but in an extremely strange fashion. Monitoring
/proc/mdstat, it seems that the rebuilds are going just fine. When
they finish however, /proc/mdstat includes the new disk, but also
declares it invalid. The system continues running in degraded mode.
When I run this from the root console, I get some messages from the
raid subsystem, including full debugging output. I have not yet
figured out how to capture this output in order to include in this
message, but I did write down a part of one attempt (this was by hand,
so there may be small inconsistancies):
RAID5 conf printout
--- rd:3 wd:2 fd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
disk 1, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
md: bug in file raid5.c, line 1901
Here is some output from my system. If any more information would be
useful, or anyone thinks I should try something else, please let me
know. I would like to get out of my currently degraded state!
[-- Attachment #2: Output from my system --]
[-- Type: text/plain, Size: 7655 bytes --]
maru:/# uname -a
Linux maru 2.4.21 #3 Fri Aug 29 13:14:01 EDT 2003 i686 GNU/Linux
maru:/# cat ~md5i/dmesg-raid
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
8regs : 1841.200 MB/sec
32regs : 935.600 MB/sec
pIII_sse : 2052.000 MB/sec
pII_mmx : 2247.600 MB/sec
p5_mmx : 2383.200 MB/sec
raid5: using function: pIII_sse (2052.000 MB/sec)
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
[events: 00000198]
[events: 00000008]
[events: 00000196]
[events: 000000f3]
[events: 00000086]
[events: 00000008]
md: autorun ...
md: considering ide/host2/bus1/target0/lun0/part3 ...
md: adding ide/host2/bus1/target0/lun0/part3 ...
md: adding ide/host0/bus0/target1/lun0/part3 ...
md: created md0
md: bind<ide/host0/bus0/target1/lun0/part3,1>
md: bind<ide/host2/bus1/target0/lun0/part3,2>
md: running: <ide/host2/bus1/target0/lun0/part3><ide/host0/bus0/target1/lun0/part3>
md: ide/host2/bus1/target0/lun0/part3's event counter: 00000008
md: ide/host0/bus0/target1/lun0/part3's event counter: 00000008
md0: max total readahead window set to 496k
md0: 2 data-disks, max readahead per data-disk: 248k
raid5: device ide/host2/bus1/target0/lun0/part3 operational as raid disk 2
raid5: device ide/host0/bus0/target1/lun0/part3 operational as raid disk 0
raid5: md0, not all disks are operational -- trying to recover array
raid5: allocated 3284kB for md0
raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
RAID5 conf printout:
--- rd:3 wd:2 fd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
RAID5 conf printout:
--- rd:3 wd:2 fd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
md: updating md0 RAID superblock on device
md: ide/host2/bus1/target0/lun0/part3 [events: 00000009]<6>(write) ide/host2/bus1/target0/lun0/part3's sb offset: 53640960
md: recovery thread got woken up ...
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: ide/host0/bus0/target1/lun0/part3 [events: 00000009]<6>(write) ide/host0/bus0/target1/lun0/part3's sb offset: 53616832
md: considering ide/host2/bus1/target0/lun0/part1 ...
md: adding ide/host2/bus1/target0/lun0/part1 ...
md: adding ide/host0/bus1/target0/lun0/part1 ...
md: adding ide/host0/bus0/target1/lun0/part1 ...
md: created md1
md: bind<ide/host0/bus0/target1/lun0/part1,1>
md: bind<ide/host0/bus1/target0/lun0/part1,2>
md: bind<ide/host2/bus1/target0/lun0/part1,3>
md: running: <ide/host2/bus1/target0/lun0/part1><ide/host0/bus1/target0/lun0/part1><ide/host0/bus0/target1/lun0/part1>
md: ide/host2/bus1/target0/lun0/part1's event counter: 00000086
md: ide/host0/bus1/target0/lun0/part1's event counter: 00000196
md: ide/host0/bus0/target1/lun0/part1's event counter: 00000198
md: superblock update time inconsistency -- using the most recent one
md: freshest: ide/host0/bus0/target1/lun0/part1
md: kicking non-fresh ide/host2/bus1/target0/lun0/part1 from array!
md: unbind<ide/host2/bus1/target0/lun0/part1,2>
md: export_rdev(ide/host2/bus1/target0/lun0/part1)
md: kicking non-fresh ide/host0/bus1/target0/lun0/part1 from array!
md: unbind<ide/host0/bus1/target0/lun0/part1,1>
md: export_rdev(ide/host0/bus1/target0/lun0/part1)
md1: removing former faulty ide/host0/bus1/target0/lun0/part1!
md: RAID level 1 does not need chunksize! Continuing anyway.
md1: max total readahead window set to 124k
md1: 1 data-disks, max readahead per data-disk: 124k
raid1: device ide/host0/bus0/target1/lun0/part1 operational as mirror 0
raid1: md1, not all disks are operational -- trying to recover array
raid1: raid set md1 active with 1 out of 2 mirrors
md: updating md1 RAID superblock on device
md: ide/host0/bus0/target1/lun0/part1 [events: 00000199]<6>(write) ide/host0/bus0/target1/lun0/part1's sb offset: 6144704
md: recovery thread got woken up ...
md1: no spare disk to reconstruct array! -- continuing in degraded mode
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: considering ide/host0/bus1/target0/lun0/part3 ...
md: adding ide/host0/bus1/target0/lun0/part3 ...
md: md0 already running, cannot run ide/host0/bus1/target0/lun0/part3
md: export_rdev(ide/host0/bus1/target0/lun0/part3)
md: (ide/host0/bus1/target0/lun0/part3 was pending)
md: ... autorun DONE.
maru:/# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 ide/host0/bus0/target1/lun0/part1[0]
6144704 blocks [2/1] [U_]
md0 : active raid5 ide/host2/bus1/target0/lun0/part3[2] ide/host0/bus0/target1/lun0/part3[0]
107233664 blocks level 5, 32k chunk, algorithm 2 [3/2] [U_U]
unused devices: <none>
maru:/# lsraid -A -a /dev/md0
[dev 9, 0] /dev/md0 94BF0D82.2B9C1BFB.89401B38.92B8F93B online
[dev 3, 67] /dev/ide/host0/bus0/target1/lun0/part3 94BF0D82.2B9C1BFB.89401B38.92B8F93B good
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 3] /dev/ide/host2/bus1/target0/lun0/part3 94BF0D82.2B9C1BFB.89401B38.92B8F93B good
maru:/# lsraid -A -a /dev/md1
[dev 9, 1] /dev/md1 0E953226.03C91D46.CD00D52F.83A1334E online
[dev 3, 65] /dev/ide/host0/bus0/target1/lun0/part1 0E953226.03C91D46.CD00D52F.83A1334E good
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
maru:/# cat /etc/raidtab
raiddev /dev/md0
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
persistent-superblock 1
parity-algorithm left-symmetric
chunk-size 32
device /dev/hdb3
raid-disk 0
device /dev/hdc3
raid-disk 1
device /dev/hdg3
raid-disk 2
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 1
persistent-superblock 1
chunk-size 4
device /dev/hdb1
raid-disk 0
device /dev/hdc1
raid-disk 1
device /dev/hdg1
spare-disk 0
maru:/# ls -l /dev/hdb1
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdb1 -> ide/host0/bus0/target1/lun0/part1
maru:/# ls -l /dev/hdc1
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdc1 -> ide/host0/bus1/target0/lun0/part1
maru:/# ls -l /dev/hdg1
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdg1 -> ide/host2/bus1/target0/lun0/part1
maru:/# ls -l /dev/hdb3
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdb3 -> ide/host0/bus0/target1/lun0/part3
maru:/# ls -l /dev/hdc3
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdc3 -> ide/host0/bus1/target0/lun0/part3
maru:/# ls -l /dev/hdg3
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdg3 -> ide/host2/bus1/target0/lun0/part3
maru:/# raidhotadd /dev/md1 /dev/hdc1
maru:/# echo Waited for some time...
Waited for some time...
maru:/# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 ide/host0/bus1/target0/lun0/part1[2] ide/host0/bus0/target1/lun0/part1[0]
6144704 blocks [2/1] [U_]
md0 : active raid5 ide/host2/bus1/target0/lun0/part3[2] ide/host0/bus0/target1/lun0/part3[0]
107233664 blocks level 5, 32k chunk, algorithm 2 [3/2] [U_U]
unused devices: <none>
maru:/#
[-- Attachment #3: Type: text/plain, Size: 44 bytes --]
--
Michael Welsh Duggan
(md5i@cs.cmu.edu)
next reply other threads:[~2003-09-02 3:04 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2003-09-02 3:04 Michael Welsh Duggan [this message]
2003-09-02 16:04 ` RAID reconstruction problems Bernd Schubert
2003-09-02 18:29 ` Donghui Wen
2003-09-03 1:51 ` Michael Welsh Duggan
2003-09-04 17:07 ` Bernd Schubert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87wucrnb63.fsf@cs.cmu.edu \
--to=md5i@cs.cmu.edu \
--cc=linux-raid@vger.kernel.org \
--cc=md5i@sei.cmu.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.