From: Michael Welsh Duggan <md5i@cs.cmu.edu>
To: linux-raid@vger.kernel.org
Subject: RAID reconstruction problems
Date: Mon, 01 Sep 2003 23:04:20 -0400
Message-ID: <87wucrnb63.fsf@cs.cmu.edu>
[-- Attachment #1: Type: text/plain, Size: 1278 bytes --]

I currently have two small software RAID arrays: a RAID 1 for my root
partition and a RAID 5 for my /usr partition.  One of the disks in the
arrays died, and I put a new disk in with the intention of rebuilding
the arrays.
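
For reference, the rebuild attempts were nothing fancy, just
raidhotadd of the missing member partitions, along these lines (the
md1 command also appears in the transcript attached below; the md0
line is the analogous step for the other array, per the raidtab):

  raidhotadd /dev/md1 /dev/hdc1   # re-add the RAID 1 root mirror
  raidhotadd /dev/md0 /dev/hdc3   # re-add the missing RAID 5 member
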
The rebuilds failed, but in an extremely strange fashion.  Monitoring
/proc/mdstat, the rebuilds seem to be going just fine.  When they
finish, however, /proc/mdstat includes the new disk but also declares
it invalid.  The system continues running in degraded mode.

When I run the rebuild from the root console, I get some messages from
the RAID subsystem, including full debugging output.  I have not yet
figured out how to capture this output in order to include it in this
message, but I did write down part of one attempt (this was by hand,
so there may be small inconsistencies):
RAID5 conf printout
--- rd:3 wd:2 fd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
disk 1, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
md: bug in file raid5.c, line 1901
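
Next time I will try to capture the full output with something like
the following (the exact kernel log file depends on the local syslog
configuration, so treat the path as a guess):

  # dump the kernel ring buffer, where the md/raid5 messages end up
  dmesg > /tmp/md-rebuild-log.txt

  # or watch the kernel log live while the rebuild runs, assuming
  # klogd/syslogd are forwarding kernel messages there (see
  # /etc/syslog.conf)
  tail -f /var/log/kern.log
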
Here is some output from my system. If any more information would be
useful, or anyone thinks I should try something else, please let me
know. I would like to get out of my currently degraded state!
[-- Attachment #2: Output from my system --]
[-- Type: text/plain, Size: 7655 bytes --]
maru:/# uname -a
Linux maru 2.4.21 #3 Fri Aug 29 13:14:01 EDT 2003 i686 GNU/Linux
maru:/# cat ~md5i/dmesg-raid
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
8regs : 1841.200 MB/sec
32regs : 935.600 MB/sec
pIII_sse : 2052.000 MB/sec
pII_mmx : 2247.600 MB/sec
p5_mmx : 2383.200 MB/sec
raid5: using function: pIII_sse (2052.000 MB/sec)
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
[events: 00000198]
[events: 00000008]
[events: 00000196]
[events: 000000f3]
[events: 00000086]
[events: 00000008]
md: autorun ...
md: considering ide/host2/bus1/target0/lun0/part3 ...
md: adding ide/host2/bus1/target0/lun0/part3 ...
md: adding ide/host0/bus0/target1/lun0/part3 ...
md: created md0
md: bind<ide/host0/bus0/target1/lun0/part3,1>
md: bind<ide/host2/bus1/target0/lun0/part3,2>
md: running: <ide/host2/bus1/target0/lun0/part3><ide/host0/bus0/target1/lun0/part3>
md: ide/host2/bus1/target0/lun0/part3's event counter: 00000008
md: ide/host0/bus0/target1/lun0/part3's event counter: 00000008
md0: max total readahead window set to 496k
md0: 2 data-disks, max readahead per data-disk: 248k
raid5: device ide/host2/bus1/target0/lun0/part3 operational as raid disk 2
raid5: device ide/host0/bus0/target1/lun0/part3 operational as raid disk 0
raid5: md0, not all disks are operational -- trying to recover array
raid5: allocated 3284kB for md0
raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
RAID5 conf printout:
--- rd:3 wd:2 fd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
RAID5 conf printout:
--- rd:3 wd:2 fd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
md: updating md0 RAID superblock on device
md: ide/host2/bus1/target0/lun0/part3 [events: 00000009]<6>(write) ide/host2/bus1/target0/lun0/part3's sb offset: 53640960
md: recovery thread got woken up ...
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: ide/host0/bus0/target1/lun0/part3 [events: 00000009]<6>(write) ide/host0/bus0/target1/lun0/part3's sb offset: 53616832
md: considering ide/host2/bus1/target0/lun0/part1 ...
md: adding ide/host2/bus1/target0/lun0/part1 ...
md: adding ide/host0/bus1/target0/lun0/part1 ...
md: adding ide/host0/bus0/target1/lun0/part1 ...
md: created md1
md: bind<ide/host0/bus0/target1/lun0/part1,1>
md: bind<ide/host0/bus1/target0/lun0/part1,2>
md: bind<ide/host2/bus1/target0/lun0/part1,3>
md: running: <ide/host2/bus1/target0/lun0/part1><ide/host0/bus1/target0/lun0/part1><ide/host0/bus0/target1/lun0/part1>
md: ide/host2/bus1/target0/lun0/part1's event counter: 00000086
md: ide/host0/bus1/target0/lun0/part1's event counter: 00000196
md: ide/host0/bus0/target1/lun0/part1's event counter: 00000198
md: superblock update time inconsistency -- using the most recent one
md: freshest: ide/host0/bus0/target1/lun0/part1
md: kicking non-fresh ide/host2/bus1/target0/lun0/part1 from array!
md: unbind<ide/host2/bus1/target0/lun0/part1,2>
md: export_rdev(ide/host2/bus1/target0/lun0/part1)
md: kicking non-fresh ide/host0/bus1/target0/lun0/part1 from array!
md: unbind<ide/host0/bus1/target0/lun0/part1,1>
md: export_rdev(ide/host0/bus1/target0/lun0/part1)
md1: removing former faulty ide/host0/bus1/target0/lun0/part1!
md: RAID level 1 does not need chunksize! Continuing anyway.
md1: max total readahead window set to 124k
md1: 1 data-disks, max readahead per data-disk: 124k
raid1: device ide/host0/bus0/target1/lun0/part1 operational as mirror 0
raid1: md1, not all disks are operational -- trying to recover array
raid1: raid set md1 active with 1 out of 2 mirrors
md: updating md1 RAID superblock on device
md: ide/host0/bus0/target1/lun0/part1 [events: 00000199]<6>(write) ide/host0/bus0/target1/lun0/part1's sb offset: 6144704
md: recovery thread got woken up ...
md1: no spare disk to reconstruct array! -- continuing in degraded mode
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: considering ide/host0/bus1/target0/lun0/part3 ...
md: adding ide/host0/bus1/target0/lun0/part3 ...
md: md0 already running, cannot run ide/host0/bus1/target0/lun0/part3
md: export_rdev(ide/host0/bus1/target0/lun0/part3)
md: (ide/host0/bus1/target0/lun0/part3 was pending)
md: ... autorun DONE.
maru:/# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 ide/host0/bus0/target1/lun0/part1[0]
6144704 blocks [2/1] [U_]
md0 : active raid5 ide/host2/bus1/target0/lun0/part3[2] ide/host0/bus0/target1/lun0/part3[0]
107233664 blocks level 5, 32k chunk, algorithm 2 [3/2] [U_U]
unused devices: <none>
maru:/# lsraid -A -a /dev/md0
[dev 9, 0] /dev/md0 94BF0D82.2B9C1BFB.89401B38.92B8F93B online
[dev 3, 67] /dev/ide/host0/bus0/target1/lun0/part3 94BF0D82.2B9C1BFB.89401B38.92B8F93B good
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 3] /dev/ide/host2/bus1/target0/lun0/part3 94BF0D82.2B9C1BFB.89401B38.92B8F93B good
maru:/# lsraid -A -a /dev/md1
[dev 9, 1] /dev/md1 0E953226.03C91D46.CD00D52F.83A1334E online
[dev 3, 65] /dev/ide/host0/bus0/target1/lun0/part1 0E953226.03C91D46.CD00D52F.83A1334E good
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
maru:/# cat /etc/raidtab
raiddev /dev/md0
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
persistent-superblock 1
parity-algorithm left-symmetric
chunk-size 32
device /dev/hdb3
raid-disk 0
device /dev/hdc3
raid-disk 1
device /dev/hdg3
raid-disk 2
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 1
persistent-superblock 1
chunk-size 4
device /dev/hdb1
raid-disk 0
device /dev/hdc1
raid-disk 1
device /dev/hdg1
spare-disk 0
maru:/# ls -l /dev/hdb1
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdb1 -> ide/host0/bus0/target1/lun0/part1
maru:/# ls -l /dev/hdc1
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdc1 -> ide/host0/bus1/target0/lun0/part1
maru:/# ls -l /dev/hdg1
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdg1 -> ide/host2/bus1/target0/lun0/part1
maru:/# ls -l /dev/hdb3
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdb3 -> ide/host0/bus0/target1/lun0/part3
maru:/# ls -l /dev/hdc3
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdc3 -> ide/host0/bus1/target0/lun0/part3
maru:/# ls -l /dev/hdg3
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdg3 -> ide/host2/bus1/target0/lun0/part3
maru:/# raidhotadd /dev/md1 /dev/hdc1
maru:/# echo Waited for some time...
Waited for some time...
maru:/# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 ide/host0/bus1/target0/lun0/part1[2] ide/host0/bus0/target1/lun0/part1[0]
6144704 blocks [2/1] [U_]
md0 : active raid5 ide/host2/bus1/target0/lun0/part3[2] ide/host0/bus0/target1/lun0/part3[0]
107233664 blocks level 5, 32k chunk, algorithm 2 [3/2] [U_U]
unused devices: <none>
maru:/#
[-- Attachment #3: Type: text/plain, Size: 44 bytes --]
--
Michael Welsh Duggan
(md5i@cs.cmu.edu)