* Is my RAID broken?
From: Robert Hulme @ 2006-11-05 7:29 UTC
To: linux-raid
Hi everyone,
I recently got an email warning from Zabbix about my raid array, so I
went to have a look at it:
----------
Personalities : [linear] [raid0] [raid1] [raid10] [raid5] [raid4]
md2 : active raid5 hdk1[3] hdi1[2] hdg1[1] hde1[0]
1172126208 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
[===============>.....] resync = 77.6% (303281336/390708736)
finish=102.7min speed=14181K/sec
md1 : active raid5 sdd3[2] sdc3[3] sdb3[1] sda3[0]
184490112 blocks level 5, 32k chunk, algorithm 0 [4/4] [UUUU]
md0 : active raid1 sdb1[1] sda1[0]
9767424 blocks [2/2] [UU]
unused devices: <none>
----------
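The figures in the md2 progress line are consistent with one another,
which is easy to confirm with a little shell arithmetic. This is only a
sketch: the numbers are copied from the output above, and the variable
names are made up for illustration. It recomputes the percentage done, a
rough time remaining, and the RAID5 capacity of (4 - 1) member devices.

```shell
#!/bin/sh
# Figures copied from the /proc/mdstat output above (md2).
done_blocks=303281336      # 1K blocks already resynced
total_blocks=390708736     # 1K blocks per member device
speed_k=14181              # reported resync speed in K/sec
raid_devices=4             # RAID5 members; one device's worth holds parity

# Percentage complete, one decimal place (awk does the floating point).
percent=$(awk -v d="$done_blocks" -v t="$total_blocks" \
    'BEGIN { printf "%.1f", 100 * d / t }')

# Remaining time in whole minutes: remaining blocks / speed / 60.
remaining_min=$(( (total_blocks - done_blocks) / speed_k / 60 ))

# Usable RAID5 capacity: (members - 1) * per-device size.
array_blocks=$(( (raid_devices - 1) * total_blocks ))

echo "resync ${percent}% done, roughly ${remaining_min} minutes left"
echo "array size: ${array_blocks} blocks"
```

This gives 77.6% and about 102 minutes, in line with what the kernel
reported, and 1172126208 blocks, matching the array size shown for md2.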
Then I got detail on md2:
----------
weebl:/var/log# mdadm --detail /dev/md2
/dev/md2:
Version : 00.90.03
Creation Time : Sun May 14 09:43:37 2006
Raid Level : raid5
Array Size : 1172126208 (1117.83 GiB 1200.26 GB)
Device Size : 390708736 (372.61 GiB 400.09 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Sun Nov 5 07:13:01 2006
State : clean, resyncing
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 32K
Rebuild Status : 77% complete
UUID : 41af16c5:36143507:8dcabe0b:f372280d
Events : 0.12485388
Number Major Minor RaidDevice State
0 33 1 0 active sync /dev/hde1
1 34 1 1 active sync /dev/hdg1
2 56 1 2 active sync /dev/hdi1
3 57 1 3 active sync /dev/hdk1
----------
I thought it would be good to check /var/log/messages:
----------
Nov 5 01:06:01 xen-dom0 kernel: md: syncing RAID array md0
Nov 5 01:06:01 xen-dom0 kernel: md: minimum _guaranteed_
reconstruction speed: 1000 KB/sec/disc.
Nov 5 01:06:01 xen-dom0 kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Nov 5 01:06:01 xen-dom0 kernel: md: using 128k window, over a total
of 9767424 blocks.
Nov 5 01:06:01 xen-dom0 kernel: md: delaying resync of md1 until md0
has finished resync (they share one or more physical units)
Nov 5 01:06:01 xen-dom0 kernel: md: syncing RAID array md2
Nov 5 01:06:01 xen-dom0 kernel: md: minimum _guaranteed_
reconstruction speed: 1000 KB/sec/disc.
Nov 5 01:06:01 xen-dom0 kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Nov 5 01:06:01 xen-dom0 kernel: md: using 128k window, over a total
of 390708736 blocks.
Nov 5 01:12:37 xen-dom0 kernel: md: md0: sync done.
Nov 5 01:12:37 xen-dom0 kernel: md: syncing RAID array md1
Nov 5 01:12:37 xen-dom0 kernel: RAID1 conf printout:
Nov 5 01:12:37 xen-dom0 kernel: --- wd:2 rd:2
Nov 5 01:12:37 xen-dom0 kernel: disk 0, wo:0, o:1, dev:sda1
Nov 5 01:12:37 xen-dom0 kernel: disk 1, wo:0, o:1, dev:sdb1
Nov 5 01:12:37 xen-dom0 kernel: md: minimum _guaranteed_
reconstruction speed: 1000 KB/sec/disc.
Nov 5 01:12:37 xen-dom0 kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Nov 5 01:12:37 xen-dom0 kernel: md: using 128k window, over a total
of 61496704 blocks.
Nov 5 02:21:14 xen-dom0 kernel: RAID5 conf printout:
Nov 5 02:21:14 xen-dom0 kernel: --- rd:4 wd:4 fd:0
Nov 5 02:21:14 xen-dom0 kernel: disk 0, o:1, dev:sda3
Nov 5 02:21:14 xen-dom0 kernel: disk 1, o:1, dev:sdb3
Nov 5 02:21:14 xen-dom0 kernel: disk 2, o:1, dev:sdd3
Nov 5 02:21:14 xen-dom0 kernel: disk 3, o:1, dev:sdc3
----------
This system has been running for a year now without problems so it is
really disappointing to think there is a problem :'(
Why might this happen? Is it serious? What should I do to resolve the situation?
If it's resyncing, that means it detected an error, right?
Thank you for your help
-Rob
--
------------------------------------------------------
"By all means let's be open-minded, but not so open-minded that our
brains drop out." - Richard Dawkins
"It is far better to grasp the universe as it really is than to
persist in delusion, however satisfying and reassuring." - Carl Sagan
http://www.robhulme.com/
http://robhu.livejournal.com/
* Re: Is my RAID broken?
From: Neil Brown @ 2006-11-05 22:22 UTC
To: Robert Hulme; +Cc: linux-raid
On Sunday November 5, rob@robhulme.com wrote:
>
> If it's resyncing, that means it detected an error, right?
Not a disk error. 'resyncing' means that at startup it looked like
the array hadn't been shut down properly, so it is making sure that all
the redundancy in the array is consistent.
So it looks like your machine recently crashed (power failure?) and
it is restarting.
NeilBrown
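
One way to see what md is doing right now, rather than inferring it from
/proc/mdstat, is the sync_action file in sysfs. This is only a sketch: it
assumes a kernel recent enough to expose /sys/block/mdX/md/sync_action
(the thread does not say which kernel is in use), and the function name
is made up. The file holds a single word such as idle, resync, recover,
check or repair.

```shell
#!/bin/sh
# Print what md is currently doing for one array.
# $1: the array's md sysfs directory, e.g. /sys/block/md2/md
sync_state() {
    if [ -r "$1/sync_action" ]; then
        cat "$1/sync_action"
    else
        echo "unknown"   # file not present, e.g. on an older kernel
    fi
}

# Example: query md2 from this thread.
sync_state /sys/block/md2/md
```

A value of resync would fit the unclean-shutdown explanation, while
check would point to a scheduled scrub instead.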
* Re: Is my RAID broken?
From: Mikael Abrahamsson @ 2006-11-06 7:59 UTC
To: linux-raid
On Mon, 6 Nov 2006, Neil Brown wrote:
> So it looks like your machine recently crashed (power failure?) and it is
> restarting.
Or you upgraded some part of the OS and now it does a resync every week or
so (I think this is the Debian default nowadays; I don't know the interval,
though).
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: Is my RAID broken?
From: dean gaudet @ 2006-11-06 8:17 UTC
To: Mikael Abrahamsson; +Cc: linux-raid
On Mon, 6 Nov 2006, Mikael Abrahamsson wrote:
> On Mon, 6 Nov 2006, Neil Brown wrote:
>
> > So it looks like your machine recently crashed (power failure?) and it is
> > restarting.
>
> Or you upgraded some part of the OS and now it does a resync every week or
> so (I think this is the Debian default nowadays; I don't know the interval,
> though).
it should be only once a month... and it's just a "check" -- it reads
everything and corrects errors.
i think it's a great thing actually... way more useful than smart long
self-tests because md can reconstruct read errors immediately -- before
you lose redundancy in that stripe.
-dean
% cat /etc/cron.d/mdadm
#
# cron.d/mdadm -- schedules periodic redundancy checks of MD devices
#
# Copyright © martin f. krafft <madduck@madduck.net>
# distributed under the terms of the Artistic Licence 2.0
#
# $Id: mdadm.cron.d 147 2006-08-30 09:26:11Z madduck $
#
# By default, run at 01:06 on every Sunday, but do nothing unless the day of
# the month is less than or equal to 7. Thus, only run on the first Sunday of
# each month. crontab(5) sucks, unfortunately, in this regard; therefore this
# hack (see #380425).
6 1 * * 0 root [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet
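
The date test in that cron entry can be tried out on its own. The sketch
below is not part of the Debian package: the function name is made up,
and it assumes GNU date (for the -d option). It applies the same rule as
the crontab line, Sunday plus day of month <= 7, to an arbitrary date.

```shell
#!/bin/sh
# Succeed if the given date (YYYY-MM-DD) is the first Sunday of its month.
# Assumes GNU date, which accepts -d to format a date other than "now".
is_first_sunday() {
    dow=$(date -d "$1" +%w)   # day of week, 0 = Sunday
    dom=$(date -d "$1" +%d)   # day of month, zero-padded
    dom=${dom#0}              # strip the padding so 08/09 compare as decimal
    [ "$dow" -eq 0 ] && [ "$dom" -le 7 ]
}

# The date from the log in the original report.
if is_first_sunday 2006-11-05; then
    echo "2006-11-05: first Sunday of the month, checkarray would run"
fi
```

5 November 2006 passes the test, and the kernel log in the original
report shows the sync starting at 01:06:01 that morning, which matches
the 01:06 schedule in the crontab.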
* Re: Is my RAID broken?
From: Mikael Abrahamsson @ 2006-11-06 9:45 UTC
To: linux-raid
On Mon, 6 Nov 2006, dean gaudet wrote:
> it should be only once a month... and it's just a "check" -- it reads
> everything and corrects errors.
If you have a recent enough kernel, yes. I used 2.6.15 and it would
actually rebuild; I think it was 2.6.16 that does a proper "read and
compare".
> i think it's a great thing actually... way more useful than smart long
> self-tests because md can reconstruct read errors immediately -- before
> you lose redundancy in that stripe.
Quite. I highly recommend doing this on all RAID sets, including hardware
ones (3ware, for instance), once a week.
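
For md arrays, a scrub like this can also be started by hand through
sysfs. The following is a sketch under assumptions: it needs a kernel
that provides the sync_action and mismatch_cnt files under
/sys/block/mdX/md, root privileges for the real path, and the function
names are made up. Writing "check" asks md to read everything and count
inconsistencies; "repair" would also rewrite mismatched parity.

```shell
#!/bin/sh
# Request a consistency check for one array.
# $1: the array's md sysfs directory, e.g. /sys/block/md2/md
start_check() {
    echo check > "$1/sync_action"
}

# Print the mismatch count md recorded for the array (read it once the
# check has finished).
mismatches() {
    cat "$1/mismatch_cnt"
}

# Usage, as root:
#   start_check /sys/block/md2/md
#   ...wait until sync_action reads "idle" again...
#   mismatches /sys/block/md2/md
```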
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: Is my RAID broken?
From: Bill Davidsen @ 2006-11-08 17:07 UTC
To: Neil Brown; +Cc: Robert Hulme, linux-raid
Neil Brown wrote:
> On Sunday November 5, rob@robhulme.com wrote:
>
> > If it's resyncing, that means it detected an error, right?
>
> Not a disk error. 'resyncing' means that at startup it looked like
> the array hadn't been shut down properly, so it is making sure that all
> the redundancy in the array is consistent.
>
> So it looks like your machine recently crashed (power failure?) and
> it is restarting.
There is also always the possibility that the shutdown scripts don't do the
right thing. I believe one of the major distros showed this problem
within the last few months, depending on the RAID options used.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979