* RAID reconstruction problems
@ 2003-09-02 3:04 Michael Welsh Duggan
2003-09-02 16:04 ` Bernd Schubert
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Michael Welsh Duggan @ 2003-09-02 3:04 UTC (permalink / raw)
To: linux-raid
[-- Attachment #1: Type: text/plain, Size: 1278 bytes --]
I currently have two small Software RAIDs, a RAID 1 for my root
partition, and a RAID 5 for my usr partition. One of the disks in the
arrays died, and I threw in a new disk with the intention of
rebuilding the arrays.
The rebuilds failed, but in an extremely strange fashion. Monitoring
/proc/mdstat, it seems that the rebuilds are going just fine. When
they finish however, /proc/mdstat includes the new disk, but also
declares it invalid. The system continues running in degraded mode.
When I run the rebuild from the root console, I get some messages from the
RAID subsystem, including full debugging output. I have not yet
figured out how to capture this output in order to include it in this
message, but I did write down part of one attempt (this was by hand,
so there may be small inconsistencies):
RAID5 conf printout
--- rd:3 wd:2 fd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
disk 1, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
md: bug in file raid5.c, line 1901
Here is some output from my system. If any more information would be
useful, or anyone thinks I should try something else, please let me
know. I would like to get out of my currently degraded state!
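(A hedged aside, not part of the original mail: one minimal way to capture
those kernel messages, assuming a stock klogd/syslogd setup, is sketched
below; the output path is a made-up example.)
    # Dump the kernel ring buffer right after a failed rebuild attempt;
    # the RAID5 conf printout and the "md: bug in file raid5.c" line
    # land here as well as on the console.
    dmesg > /tmp/raid-debug.txt
    # Or pull the md/raid5 lines from the kernel log, if syslogd routes
    # kern.* there (the path is distribution-dependent).
    grep -E 'raid5|md:' /var/log/kern.log > /tmp/raid-debug.txt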
[-- Attachment #2: Output from my system --]
[-- Type: text/plain, Size: 7655 bytes --]
maru:/# uname -a
Linux maru 2.4.21 #3 Fri Aug 29 13:14:01 EDT 2003 i686 GNU/Linux
maru:/# cat ~md5i/dmesg-raid
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
8regs : 1841.200 MB/sec
32regs : 935.600 MB/sec
pIII_sse : 2052.000 MB/sec
pII_mmx : 2247.600 MB/sec
p5_mmx : 2383.200 MB/sec
raid5: using function: pIII_sse (2052.000 MB/sec)
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
[events: 00000198]
[events: 00000008]
[events: 00000196]
[events: 000000f3]
[events: 00000086]
[events: 00000008]
md: autorun ...
md: considering ide/host2/bus1/target0/lun0/part3 ...
md: adding ide/host2/bus1/target0/lun0/part3 ...
md: adding ide/host0/bus0/target1/lun0/part3 ...
md: created md0
md: bind<ide/host0/bus0/target1/lun0/part3,1>
md: bind<ide/host2/bus1/target0/lun0/part3,2>
md: running: <ide/host2/bus1/target0/lun0/part3><ide/host0/bus0/target1/lun0/part3>
md: ide/host2/bus1/target0/lun0/part3's event counter: 00000008
md: ide/host0/bus0/target1/lun0/part3's event counter: 00000008
md0: max total readahead window set to 496k
md0: 2 data-disks, max readahead per data-disk: 248k
raid5: device ide/host2/bus1/target0/lun0/part3 operational as raid disk 2
raid5: device ide/host0/bus0/target1/lun0/part3 operational as raid disk 0
raid5: md0, not all disks are operational -- trying to recover array
raid5: allocated 3284kB for md0
raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
RAID5 conf printout:
--- rd:3 wd:2 fd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
RAID5 conf printout:
--- rd:3 wd:2 fd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
md: updating md0 RAID superblock on device
md: ide/host2/bus1/target0/lun0/part3 [events: 00000009]<6>(write) ide/host2/bus1/target0/lun0/part3's sb offset: 53640960
md: recovery thread got woken up ...
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: ide/host0/bus0/target1/lun0/part3 [events: 00000009]<6>(write) ide/host0/bus0/target1/lun0/part3's sb offset: 53616832
md: considering ide/host2/bus1/target0/lun0/part1 ...
md: adding ide/host2/bus1/target0/lun0/part1 ...
md: adding ide/host0/bus1/target0/lun0/part1 ...
md: adding ide/host0/bus0/target1/lun0/part1 ...
md: created md1
md: bind<ide/host0/bus0/target1/lun0/part1,1>
md: bind<ide/host0/bus1/target0/lun0/part1,2>
md: bind<ide/host2/bus1/target0/lun0/part1,3>
md: running: <ide/host2/bus1/target0/lun0/part1><ide/host0/bus1/target0/lun0/part1><ide/host0/bus0/target1/lun0/part1>
md: ide/host2/bus1/target0/lun0/part1's event counter: 00000086
md: ide/host0/bus1/target0/lun0/part1's event counter: 00000196
md: ide/host0/bus0/target1/lun0/part1's event counter: 00000198
md: superblock update time inconsistency -- using the most recent one
md: freshest: ide/host0/bus0/target1/lun0/part1
md: kicking non-fresh ide/host2/bus1/target0/lun0/part1 from array!
md: unbind<ide/host2/bus1/target0/lun0/part1,2>
md: export_rdev(ide/host2/bus1/target0/lun0/part1)
md: kicking non-fresh ide/host0/bus1/target0/lun0/part1 from array!
md: unbind<ide/host0/bus1/target0/lun0/part1,1>
md: export_rdev(ide/host0/bus1/target0/lun0/part1)
md1: removing former faulty ide/host0/bus1/target0/lun0/part1!
md: RAID level 1 does not need chunksize! Continuing anyway.
md1: max total readahead window set to 124k
md1: 1 data-disks, max readahead per data-disk: 124k
raid1: device ide/host0/bus0/target1/lun0/part1 operational as mirror 0
raid1: md1, not all disks are operational -- trying to recover array
raid1: raid set md1 active with 1 out of 2 mirrors
md: updating md1 RAID superblock on device
md: ide/host0/bus0/target1/lun0/part1 [events: 00000199]<6>(write) ide/host0/bus0/target1/lun0/part1's sb offset: 6144704
md: recovery thread got woken up ...
md1: no spare disk to reconstruct array! -- continuing in degraded mode
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: considering ide/host0/bus1/target0/lun0/part3 ...
md: adding ide/host0/bus1/target0/lun0/part3 ...
md: md0 already running, cannot run ide/host0/bus1/target0/lun0/part3
md: export_rdev(ide/host0/bus1/target0/lun0/part3)
md: (ide/host0/bus1/target0/lun0/part3 was pending)
md: ... autorun DONE.
maru:/# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 ide/host0/bus0/target1/lun0/part1[0]
6144704 blocks [2/1] [U_]
md0 : active raid5 ide/host2/bus1/target0/lun0/part3[2] ide/host0/bus0/target1/lun0/part3[0]
107233664 blocks level 5, 32k chunk, algorithm 2 [3/2] [U_U]
unused devices: <none>
maru:/# lsraid -A -a /dev/md0
[dev 9, 0] /dev/md0 94BF0D82.2B9C1BFB.89401B38.92B8F93B online
[dev 3, 67] /dev/ide/host0/bus0/target1/lun0/part3 94BF0D82.2B9C1BFB.89401B38.92B8F93B good
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 3] /dev/ide/host2/bus1/target0/lun0/part3 94BF0D82.2B9C1BFB.89401B38.92B8F93B good
maru:/# lsraid -A -a /dev/md1
[dev 9, 1] /dev/md1 0E953226.03C91D46.CD00D52F.83A1334E online
[dev 3, 65] /dev/ide/host0/bus0/target1/lun0/part1 0E953226.03C91D46.CD00D52F.83A1334E good
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
maru:/# cat /etc/raidtab
raiddev /dev/md0
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
persistent-superblock 1
parity-algorithm left-symmetric
chunk-size 32
device /dev/hdb3
raid-disk 0
device /dev/hdc3
raid-disk 1
device /dev/hdg3
raid-disk 2
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 1
persistent-superblock 1
chunk-size 4
device /dev/hdb1
raid-disk 0
device /dev/hdc1
raid-disk 1
device /dev/hdg1
spare-disk 0
maru:/# ls -l /dev/hdb1
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdb1 -> ide/host0/bus0/target1/lun0/part1
maru:/# ls -l /dev/hdc1
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdc1 -> ide/host0/bus1/target0/lun0/part1
maru:/# ls -l /dev/hdg1
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdg1 -> ide/host2/bus1/target0/lun0/part1
maru:/# ls -l /dev/hdb3
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdb3 -> ide/host0/bus0/target1/lun0/part3
maru:/# ls -l /dev/hdc3
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdc3 -> ide/host0/bus1/target0/lun0/part3
maru:/# ls -l /dev/hdg3
lr-xr-xr-x 1 root root 33 Sep 1 18:29 /dev/hdg3 -> ide/host2/bus1/target0/lun0/part3
maru:/# raidhotadd /dev/md1 /dev/hdc1
maru:/# echo Waited for some time...
Waited for some time...
maru:/# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 ide/host0/bus1/target0/lun0/part1[2] ide/host0/bus0/target1/lun0/part1[0]
6144704 blocks [2/1] [U_]
md0 : active raid5 ide/host2/bus1/target0/lun0/part3[2] ide/host0/bus0/target1/lun0/part3[0]
107233664 blocks level 5, 32k chunk, algorithm 2 [3/2] [U_U]
unused devices: <none>
maru:/#
[-- Attachment #3: Type: text/plain, Size: 44 bytes --]
--
Michael Welsh Duggan
(md5i@cs.cmu.edu)
* Re: RAID reconstruction problems
2003-09-02 3:04 RAID reconstruction problems Michael Welsh Duggan
@ 2003-09-02 16:04 ` Bernd Schubert
2003-09-02 18:29 ` Donghui Wen
2003-09-03 1:51 ` Michael Welsh Duggan
2 siblings, 0 replies; 5+ messages in thread
From: Bernd Schubert @ 2003-09-02 16:04 UTC (permalink / raw)
To: Michael Duggan; +Cc: linux-raid
On Tuesday 02 September 2003 05:04, Michael Welsh Duggan wrote:
> I currently have two small Software RAIDs, a RAID 1 for my root
> partition, and a RAID 5 for my usr partition. One of the disks in the
> arrays died, and I threw in a new disk with the intention of
> rebuilding the arrays.
>
> The rebuilds failed, but in an extremely strange fashion. Monitoring
> /proc/mdstat, it seems that the rebuilds are going just fine. When
> they finish however, /proc/mdstat includes the new disk, but also
> declares it invalid. The system continues running in degraded mode.
>
> When I run the rebuild from the root console, I get some messages from the
> RAID subsystem, including full debugging output. I have not yet
> figured out how to capture this output in order to include it in this
> message, but I did write down part of one attempt (this was by hand,
> so there may be small inconsistencies):
>
> RAID5 conf printout
> --- rd:3 wd:2 fd:1
> disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
> disk 1, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
> disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
> md: bug in file raid5.c, line 1901
>
> Here is some output from my system. If any more information would be
> useful, or anyone thinks I should try something else, please let me
> know. I would like to get out of my currently degraded state!
Hi,
Neil has posted several mdadm commands for problems similar to this one, so
I think you should install mdadm, search the list archive for similar
problems, and try to get a working array using mdadm.
Regards,
Bernd
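(A hedged sketch, not from Bernd's mail: the kind of mdadm inspection one
might start with, using device names assumed from the raidtab quoted in the
first message.)
    # Examine the on-disk md superblocks of the member partitions
    # (device names are assumptions taken from the raidtab above).
    mdadm --examine /dev/hdb1 /dev/hdc1 /dev/hdg1
    mdadm --examine /dev/hdb3 /dev/hdc3 /dev/hdg3
    # Show what the running arrays think their member states are.
    mdadm --detail /dev/md0
    mdadm --detail /dev/md1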
* Re: RAID reconstruction problems
2003-09-02 3:04 RAID reconstruction problems Michael Welsh Duggan
2003-09-02 16:04 ` Bernd Schubert
@ 2003-09-02 18:29 ` Donghui Wen
2003-09-03 1:51 ` Michael Welsh Duggan
2 siblings, 0 replies; 5+ messages in thread
From: Donghui Wen @ 2003-09-02 18:29 UTC (permalink / raw)
To: Michael Duggan, linux-raid
Did you partition the new disk before rebuilding?
Donghui
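(An aside, not from Donghui's mail: one common way to give a replacement
IDE disk the same layout as a surviving member is sketched below; the
device names are guesses, not taken from the thread.)
    # Copy the partition table from a surviving disk to the replacement.
    # WARNING: this overwrites the partition table on /dev/hdc; /dev/hdb
    # and /dev/hdc are assumed names here.
    sfdisk -d /dev/hdb | sfdisk /dev/hdc
    # Verify that the RAID partitions carry type fd (Linux raid autodetect).
    fdisk -l /dev/hdc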
----- Original Message -----
From: "Michael Welsh Duggan" <md5i@cs.cmu.edu>
To: <linux-raid@vger.kernel.org>
Sent: Monday, September 01, 2003 8:04 PM
Subject: RAID reconstruction problems
> I currently have two small Software RAIDs, a RAID 1 for my root
> partition, and a RAID 5 for my usr partition. One of the disks in the
> arrays died, and I threw in a new disk with the intention of
> rebuilding the arrays.
>
> The rebuilds failed, but in an extremely strange fashion. Monitoring
> /proc/mdstat, it seems that the rebuilds are going just fine. When
> they finish however, /proc/mdstat includes the new disk, but also
> declares it invalid. The system continues running in degraded mode.
>
> When I run the rebuild from the root console, I get some messages from the
> RAID subsystem, including full debugging output. I have not yet
> figured out how to capture this output in order to include it in this
> message, but I did write down part of one attempt (this was by hand,
> so there may be small inconsistencies):
>
> RAID5 conf printout
> --- rd:3 wd:2 fd:1
> disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target1/lun0/part3
> disk 1, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
> disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
> md: bug in file raid5.c, line 1901
>
> Here is some output from my system. If any more information would be
> useful, or anyone thinks I should try something else, please let me
> know. I would like to get out of my currently degraded state!
>
>
>
> --
> Michael Welsh Duggan
> (md5i@cs.cmu.edu)
>
* Re: RAID reconstruction problems
2003-09-02 3:04 RAID reconstruction problems Michael Welsh Duggan
2003-09-02 16:04 ` Bernd Schubert
2003-09-02 18:29 ` Donghui Wen
@ 2003-09-03 1:51 ` Michael Welsh Duggan
2003-09-04 17:07 ` Bernd Schubert
2 siblings, 1 reply; 5+ messages in thread
From: Michael Welsh Duggan @ 2003-09-03 1:51 UTC (permalink / raw)
To: linux-raid
Sorry about putting the wrong email address on my initial email.
In reply to Donghui Wen: yes, it was partitioned correctly.
In reply to Bernd Schubert: I have installed mdadm and am looking
around for some pointers. I am still unsure what is going on. mdadm
reports the following for /dev/md1:
/dev/md1:
Version : 00.90.00
Creation Time : Sun Mar 16 21:42:44 2003
Raid Level : raid1
Array Size : 6144704 (5.86 GiB 6.29 GB)
Device Size : 6144704 (5.86 GiB 6.29 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Mon Sep 1 19:48:11 2003
State : dirty, no-errors
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
   Number   Major   Minor   RaidDevice   State
      0       3      65         0        active sync   /dev/ide/host0/bus0/target1/lun0/part1
      1       0       0         0        sync
      2      22       1         2        active         /dev/ide/host0/bus1/target0/lun0/part1
UUID : 0e953226:03c91d46:cd00d52f:83a1334e
Events : 0.413
/proc/mdstat reports the following:
md1 : active raid1 ide/host0/bus1/target0/lun0/part1[2]
ide/host0/bus0/target1/lun0/part1[0]
6144704 blocks [2/2] [U_]
I'll play around with it some more, but if anyone recognizes these
symptoms, please reply.
--
Michael Welsh Duggan
(mwd@cert.org)
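(A hedged sketch, not part of the original mail: the usual mdadm
manage-mode cycle for a member that never becomes fully active, assuming
/dev/hdc1 is the partition in question, would look roughly like this.)
    # Drop the half-added member, re-add it, and watch the resync.
    # /dev/hdc1 is an assumption based on the earlier raidhotadd call.
    mdadm /dev/md1 --fail /dev/hdc1 --remove /dev/hdc1
    mdadm /dev/md1 --add /dev/hdc1
    cat /proc/mdstat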
* Re: RAID reconstruction problems
2003-09-03 1:51 ` Michael Welsh Duggan
@ 2003-09-04 17:07 ` Bernd Schubert
0 siblings, 0 replies; 5+ messages in thread
From: Bernd Schubert @ 2003-09-04 17:07 UTC (permalink / raw)
To: linux-raid
On Wednesday 03 September 2003 03:51, Michael Welsh Duggan wrote:
> Sorry about putting the wrong email address on my initial email.
>
> In reply to Donghui Wen: yes, it was partitioned correctly.
>
> In reply to Bernd Schubert: I have installed mdadm and am looking
> around for some pointers. I am still unsure what is going on. mdadm
> reports the following for /dev/md1:
>
> /dev/md1:
> Version : 00.90.00
> Creation Time : Sun Mar 16 21:42:44 2003
> Raid Level : raid1
> Array Size : 6144704 (5.86 GiB 6.29 GB)
> Device Size : 6144704 (5.86 GiB 6.29 GB)
> Raid Devices : 2
> Total Devices : 2
> Preferred Minor : 1
> Persistence : Superblock is persistent
>
> Update Time : Mon Sep 1 19:48:11 2003
> State : dirty, no-errors
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
>
>    Number   Major   Minor   RaidDevice   State
>       0       3      65         0        active sync   /dev/ide/host0/bus0/target1/lun0/part1
>       1       0       0         0        sync
>       2      22       1         2        active         /dev/ide/host0/bus1/target0/lun0/part1
> UUID : 0e953226:03c91d46:cd00d52f:83a1334e
> Events : 0.413
>
>
> /proc/mdstat reports the following:
>
> md1 : active raid1 ide/host0/bus1/target0/lun0/part1[2]
> ide/host0/bus0/target1/lun0/part1[0]
> 6144704 blocks [2/2] [U_]
>
> I'll play around with it some more, but if anyone recognizes these
> symptoms, please reply.
Hello Michael,
perhaps this is the same issue that David Chow had a few weeks ago? See the
attached mail; you will also find it in the archives.
Bernd
PS: I removed the attachment for the mailing list, as it is in the archives
anyway.