* Care and feeding of RAID?
@ 2006-09-05 12:48 Paul Waldo
2006-09-05 13:14 ` Benjamin Schieder
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Paul Waldo @ 2006-09-05 12:48 UTC (permalink / raw)
To: linux-raid
Hi all,
I have a RAID6 array and I am wondering about care and feeding instructions :-)
Here is what I currently do:
- daily incremental and weekly full backups to a separate machine
- run smartd tests (short once a day, long once a week)
- check the raid for bad blocks every week
What else can I do to make sure the array keeps humming? Thanks in advance!
* Re: Care and feeding of RAID?
2006-09-05 12:48 Care and feeding of RAID? Paul Waldo
@ 2006-09-05 13:14 ` Benjamin Schieder
2006-09-05 16:56 ` Patrik Jonsson
2006-09-05 13:19 ` Gordon Henderson
2006-09-05 17:39 ` Paul Waldo
2 siblings, 1 reply; 27+ messages in thread
From: Benjamin Schieder @ 2006-09-05 13:14 UTC (permalink / raw)
To: Paul Waldo; +Cc: linux-raid
On 05.09.2006 08:48:44, Paul Waldo wrote:
> Hi all,
>
> I have a RAID6 array and I wondering about care and feeding instructions :-)
>
> Here is what I currently do:
> - daily incremental and weekly full backups to a separate machine
> - run smartd tests (short once a day, long once a week)
> - check the raid for bad blocks every week
>
> What else can I do make sure the array keeps humming? Thanks in advance!
The mdadm man-page has information about running mdadm from cron to check
for 'unusual' activity.
You may want to consider that. I run it as daemon, personally.
Greetings,
Benjamin
--
#!/bin/sh #!/bin/bash #!/bin/tcsh #!/bin/csh #!/bin/kiss #!/bin/ksh
#!/bin/pdksh #!/usr/bin/perl #!/usr/bin/python #!/bin/zsh #!/bin/ash
Feel at home? Got some of them? Want to show some magic?
http://shellscripts.org
* Re: Care and feeding of RAID?
2006-09-05 12:48 Care and feeding of RAID? Paul Waldo
2006-09-05 13:14 ` Benjamin Schieder
@ 2006-09-05 13:19 ` Gordon Henderson
2006-09-05 15:03 ` Steve Cousins
2006-09-05 17:39 ` Paul Waldo
2 siblings, 1 reply; 27+ messages in thread
From: Gordon Henderson @ 2006-09-05 13:19 UTC (permalink / raw)
To: Paul Waldo; +Cc: linux-raid
On Tue, 5 Sep 2006, Paul Waldo wrote:
> Hi all,
>
> I have a RAID6 array and I wondering about care and feeding instructions :-)
>
> Here is what I currently do:
> - daily incremental and weekly full backups to a separate machine
> - run smartd tests (short once a day, long once a week)
> - check the raid for bad blocks every week
>
> What else can I do make sure the array keeps humming? Thanks in advance!
Stop fiddling with it :)
I run similar stuff, but don't forget running mdadm in daemon mode to send
you an email should a drive fail. I also check each device individually,
rather than the array, although I don't know the value of doing this over
the SMART tests on modern drives...
Gordon
* Re: Care and feeding of RAID?
2006-09-05 13:19 ` Gordon Henderson
@ 2006-09-05 15:03 ` Steve Cousins
2006-09-05 15:41 ` Benjamin Schieder
` (4 more replies)
0 siblings, 5 replies; 27+ messages in thread
From: Steve Cousins @ 2006-09-05 15:03 UTC (permalink / raw)
To: Gordon Henderson; +Cc: Paul Waldo, linux-raid
Gordon Henderson wrote:
>On Tue, 5 Sep 2006, Paul Waldo wrote:
>
>
>
>>Hi all,
>>
>>I have a RAID6 array and I wondering about care and feeding instructions :-)
>>
>>Here is what I currently do:
>> - daily incremental and weekly full backups to a separate machine
>> - run smartd tests (short once a day, long once a week)
>> - check the raid for bad blocks every week
>>
>>What else can I do make sure the array keeps humming? Thanks in advance!
>>
>>
>
>Stop fiddling with it :)
>
>I run similar stuff, but don't forget running mdadm in daemon mode to send
>you an email should a drive fail. I also check each device individually,
>rather than the array although I don't know the value of doing this over
>the SMART tests on modern drives though...
>
>
Would people be willing to list their setup? Including such things as
mdadm.conf file, crontab -l, plus scripts that they use to check the
smart data and the array, mdadm daemon parameters and anything else that
is relevant to checking and maintaining an array?
I'm running the mdmonitor script at startup and a sample mdadm.conf
(one of 3 machines) looks like:
MAILADDR cousins@limpet-gb.umeoce.maine.edu
ARRAY /dev/md0 level=raid5 num-devices=3
   UUID=39d07542:f3c97e69:fbb63d9d:64a052d3
   devices=/dev/sdb1,/dev/sdc1,/dev/sdd1
These are SATA drives and except for the one machine that has a 3Ware
8506 card in it I haven't been able to get SMART programs to do anything
with these drives. How do others deal with this?
Thanks,
Steve
* Re: Care and feeding of RAID?
2006-09-05 15:03 ` Steve Cousins
@ 2006-09-05 15:41 ` Benjamin Schieder
2006-09-05 18:29 ` Steve Cousins
2006-09-06 7:33 ` Mario 'BitKoenig' Holbe
2006-09-05 16:23 ` Mike Hardy
` (3 subsequent siblings)
4 siblings, 2 replies; 27+ messages in thread
From: Benjamin Schieder @ 2006-09-05 15:41 UTC (permalink / raw)
To: Steve Cousins; +Cc: linux-raid
On 05.09.2006 11:03:45, Steve Cousins wrote:
> Would people be willing to list their setup? Including such things as
> mdadm.conf file, crontab -l, plus scripts that they use to check the
> smart data and the array, mdadm daemon parameters and anything else that
> is relevant to checking and maintaining an array?
Personally, I use this script from cron:
http://shellscripts.org/project/hdtest
0 3 * * * root /root/sbin/hdtest.sh -l /var/log/smart_ata-ST3250624A_4ND33CLT.log /dev/disk/by-id/ata-ST3250624A_4ND33CLT short
1 3 * * * root /root/sbin/hdtest.sh -l /var/log/smart_ata-ST3250624A_4ND33EJE.log /dev/disk/by-id/ata-ST3250624A_4ND33EJE short
2 3 * * * root /root/sbin/hdtest.sh -l /var/log/smart_ata-ST3250624A_4ND33ELA.log /dev/disk/by-id/ata-ST3250624A_4ND33ELA short
In my experience, long tests slow down the RAID to the point
where the system becomes unusable.
My mdadm.conf is like this:
---
DEVICE partitions
ARRAY /dev/md/0 level=raid1 num-devices=3 UUID=3559ffcf:14eb9889:3826d6c2:c13731d7
ARRAY /dev/md/1 level=raid5 num-devices=3 UUID=649fc7cc:d4b52c31:240fce2c:c64686e7
ARRAY /dev/md/2 level=raid5 num-devices=3 UUID=9a3bf634:58f39e44:27ba8087:d5189766
   spares=1
ARRAY /dev/md/4 level=raid5 num-devices=3 UUID=d4799be3:5b157884:e38718c2:c05ab840
   spares=1
ARRAY /dev/md/5 level=raid5 num-devices=3 UUID=ca4a6110:4533d8d5:0e2ed4e1:2f5805b2
   spares=1
MAIL root@localhost
---
Also, I run
mdadm --monitor /dev/md/* --daemonise
from an init script.
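One way to confirm that a monitor setup like this can actually deliver mail is to ask mdadm for a test alert. This is only a minimal sketch, assuming the monitor-mode options --scan, --oneshot, --test and --mail described in the mdadm man page; the function name and the default recipient "root" are illustrative, not part of anyone's setup in this thread.

```shell
# Ask mdadm to send one TestMessage alert per array it finds, then exit.
# send_md_test_alert and the default recipient are illustrative names.
send_md_test_alert() {
    recipient="${1:-root}"
    mdadm --monitor --scan --oneshot --test --mail "$recipient"
}
```

Calling `send_md_test_alert root@localhost` as root should result in one mail per array if local mail delivery works; if nothing arrives, the daemon's real alerts would probably be lost too.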
Greetings,
Benjamin
--
_ _ _ _ _
| \| |___| |_| |_ __ _ __| |__
| .` / -_) _| ' \/ _` / _| / /
|_|\_\___|\__|_||_\__,_\__|_\_\
| | (_)_ _ _ ___ __
| |__| | ' \ || \ \ /
|____|_|_||_\_,_/_\_\
Play Nethack anywhere with an x86 computer:
http://www.crash-override.net/nethacklinux.html
* Re: Care and feeding of RAID?
2006-09-05 15:03 ` Steve Cousins
2006-09-05 15:41 ` Benjamin Schieder
@ 2006-09-05 16:23 ` Mike Hardy
2006-09-05 17:02 ` Gordon Henderson
` (2 subsequent siblings)
4 siblings, 0 replies; 27+ messages in thread
From: Mike Hardy @ 2006-09-05 16:23 UTC (permalink / raw)
To: Steve Cousins; +Cc: Gordon Henderson, Paul Waldo, linux-raid
Steve Cousins wrote:
> MAILADDR cousins@limpet-gb.umeoce.maine.edu
> ARRAY /dev/md0 level=raid5 num-devices=3
> UUID=39d07542:f3c97e69:fbb63d9d:64a052d3
> devices=/dev/sdb1,/dev/sdc1,/dev/sdd1
If you list the devices explicitly, you're opening the possibility for
errors when the devices are re-ordered following insertion (or removal)
of any other SATA or SCSI (or USB storage) device.
I think what you want is a "DEVICE partitions" line accompanied by ARRAY
lines that have the UUID attribute you've already got in there.
-Mike
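A config along those lines can be generated rather than typed by hand. The sketch below assumes that `mdadm --examine --scan` prints one ARRAY line per array with its UUID, as the mdadm man page describes; the function name is hypothetical and the MAILADDR value is a placeholder. It writes to stdout so the result can be reviewed before it replaces any existing file.

```shell
# Print a candidate mdadm.conf: scan all partitions, identify arrays by UUID.
# gen_mdadm_conf is an illustrative name; the MAILADDR default is a placeholder.
gen_mdadm_conf() {
    mailaddr="${1:-root}"
    echo "DEVICE partitions"
    echo "MAILADDR $mailaddr"
    # Emits ARRAY lines carrying UUIDs instead of explicit device lists.
    mdadm --examine --scan
}
```

Something like `gen_mdadm_conf root > /tmp/mdadm.conf.new` followed by a diff against the current file keeps the change reviewable; where the config finally lives (/etc/mdadm.conf or /etc/mdadm/mdadm.conf) depends on the distribution.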
* Re: Care and feeding of RAID?
2006-09-05 13:14 ` Benjamin Schieder
@ 2006-09-05 16:56 ` Patrik Jonsson
2006-09-05 17:04 ` Gordon Henderson
0 siblings, 1 reply; 27+ messages in thread
From: Patrik Jonsson @ 2006-09-05 16:56 UTC (permalink / raw)
To: Benjamin Schieder; +Cc: Paul Waldo, linux-raid
On 05.09.2006 08:48:44, Paul Waldo wrote:
>> Hi all,
>>
>> I have a RAID6 array and I wondering about care and feeding instructions :-)
>>
>> Here is what I currently do:
>> - daily incremental and weekly full backups to a separate machine
>> - run smartd tests (short once a day, long once a week)
>> - check the raid for bad blocks every week
>>
>> What else can I do make sure the array keeps humming? Thanks in advance!
>>
Make sure the drives are adequately cooled. I use this nifty utility to
look at my drive temps:
http://martybugs.net/linux/hddtemp.cgi
MTBF seems to have an exponential dependence on temperature, so it pays
off to keep temp down. Exactly what temp you consider safe is
individual, but my drives only occasionally go above 40C.
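Where hddtemp is not installed, the temperature can usually be read from the SMART attribute table as well. A rough sketch, assuming the drive reports temperature as attribute 194 and that `smartctl -A` prints the raw value as the tenth whitespace-separated field; the function name is made up for illustration.

```shell
# Read `smartctl -A` output on stdin and print the raw value of
# attribute 194 (RAW_VALUE is the tenth whitespace-separated field).
# temp_from_smart_attrs is an illustrative name.
temp_from_smart_attrs() {
    awk '$1 == 194 { print $10; exit }'
}
```

Usage would be along the lines of `smartctl -A /dev/hda | temp_from_smart_attrs`, adding `-d ata` for SATA disks behind libata. Drives that do not expose attribute 194 simply produce no output.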
cheers,
/Patrik
* Re: Care and feeding of RAID?
2006-09-05 15:03 ` Steve Cousins
2006-09-05 15:41 ` Benjamin Schieder
2006-09-05 16:23 ` Mike Hardy
@ 2006-09-05 17:02 ` Gordon Henderson
2006-09-05 17:46 ` Paul Waldo
2006-09-06 7:41 ` Mario 'BitKoenig' Holbe
2006-09-05 17:09 ` Paul Waldo
2006-09-05 17:57 ` Rev. Jeffrey Paul
4 siblings, 2 replies; 27+ messages in thread
From: Gordon Henderson @ 2006-09-05 17:02 UTC (permalink / raw)
To: Steve Cousins; +Cc: linux-raid
On Tue, 5 Sep 2006, Steve Cousins wrote:
> Would people be willing to list their setup? Including such things as
> mdadm.conf file, crontab -l, plus scripts that they use to check the
> smart data and the array, mdadm daemon parameters and anything else that
> is relevant to checking and maintaining an array?
I don't have any mdadm.conf files ... What am I missing? (I've always been
under the impression that after needing the /etc/raidtab file with the old
raidtools, you didn't need a config file as such under mdadm... However,
I'm willing to be enlightened!)
For checking the smart stuff, I use the standard Debian packages and a
smartd.conf file typically looks like:
#DEVICESCAN
/dev/hda -d ata -o on -S on -a -m smart-mon@domain.com -s (S/../.././04|L/../../1/20) -M daily -M test
/dev/hdc -d ata -o on -S on -a -m smart-mon@domain.com -s (S/../.././04|L/../../1/20) -M daily
/dev/hde -d ata -o on -S on -a -m smart-mon@domain.com -s (S/../.././04|L/../../1/20) -M daily
/dev/hdi -d ata -o on -S on -a -m smart-mon@domain.com -s (S/../.././04|L/../../1/20) -M daily
The running mdadm in monitor mode looks like:
/sbin/mdadm -F -i /var/run/mdadm.pid -m root -f -s
and my weekly badblocks script looks like:
#!/bin/csh
echo "`uname -n`: Badblocks test starting at [`date`]"
foreach disk ( a c )
  foreach partition ( 1 2 3 5 6 )
    echo -n "hd$disk${partition}: "
    badblocks -c 128 /dev/hd$disk$partition
  end
  echo ""
end
echo "`uname -n`: Badblocks test ending at [`date`]"
I do loads of stuff with disk temperatures (when I can), etc. but that's
just for making pretty graphs I can point at my customers... (eg
http://lion.drogon.net/mrtg/diskTemp.html and tell me when that data
centre upgraded their AC ;-)
Gordon
* Re: Care and feeding of RAID?
2006-09-05 16:56 ` Patrik Jonsson
@ 2006-09-05 17:04 ` Gordon Henderson
0 siblings, 0 replies; 27+ messages in thread
From: Gordon Henderson @ 2006-09-05 17:04 UTC (permalink / raw)
To: Patrik Jonsson; +Cc: linux-raid
On Tue, 5 Sep 2006, Patrik Jonsson wrote:
> mtbf seems to have an exponential dependence on temperature, so it pays
> off to keep temp down. Exactly what temp you consider safe is
> individual, but my drives only occasionally go above 40C.
I had a pair (2 x Hitachi IDE 80GB) that ran in a sealed case at the top
of a lift-shaft for 2 years. They averaged 55C... I never got to see the
box after it was decommissioned...
Gordon
* Re: Care and feeding of RAID?
2006-09-05 15:03 ` Steve Cousins
` (2 preceding siblings ...)
2006-09-05 17:02 ` Gordon Henderson
@ 2006-09-05 17:09 ` Paul Waldo
2006-09-05 20:17 ` Richard Scobie
2006-09-05 17:57 ` Rev. Jeffrey Paul
4 siblings, 1 reply; 27+ messages in thread
From: Paul Waldo @ 2006-09-05 17:09 UTC (permalink / raw)
To: linux-raid; +Cc: Steve Cousins
Steve Cousins wrote:
> Gordon Henderson wrote:
>
>> On Tue, 5 Sep 2006, Paul Waldo wrote:
>>
>>
>>
>>> Hi all,
>>>
>>> I have a RAID6 array and I wondering about care and feeding
>>> instructions :-)
>>>
>>> Here is what I currently do:
>>> - daily incremental and weekly full backups to a separate machine
>>> - run smartd tests (short once a day, long once a week)
>>> - check the raid for bad blocks every week
>>>
>>> What else can I do make sure the array keeps humming? Thanks in
>>> advance!
>>>
>>
>> Stop fiddling with it :)
>>
>> I run similar stuff, but don't forget running mdadm in daemon mode to
>> send
>> you an email should a drive fail. I also check each device individually,
>> rather than the array although I don't know the value of doing this over
>> the SMART tests on modern drives though...
>>
>>
>
> Would people be willing to list their setup? Including such things as
> mdadm.conf file, crontab -l, plus scripts that they use to check the
> smart data and the array, mdadm daemon parameters and anything else that
> is relevant to checking and maintaining an array?
> I'm running the mdmonitor script at startup and a sample mdadm.conf
> (one of 3 machines) looks like:
>
> MAILADDR cousins@limpet-gb.umeoce.maine.edu
> ARRAY /dev/md0 level=raid5 num-devices=3
> UUID=39d07542:f3c97e69:fbb63d9d:64a052d3
> devices=/dev/sdb1,/dev/sdc1,/dev/sdd1
>
> These are SATA drives and except for the one machine that has a 3Ware
> 8506 card in it I haven't been able to get SMART programs to do anything
> with these drives. How do others deal with this?
> Thanks,
>
> Steve
>
Excellent idea, Steve.
In my crontab, I have this:
# Check RAID arrays for bad blocks once a week
30 2 * * Tue echo check >> /sys/block/md0/md/sync_action ; echo "Checking md0 bad blocks"
30 2 * * Wed echo check >> /sys/block/md1/md/sync_action ; echo "Checking md1 bad blocks"
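The cron entries above could also call a small wrapper that refuses to start a check while the array is resyncing or recovering. This is a sketch only: it assumes the md sysfs files sync_action and mismatch_cnt are present on the running kernel, and the function names and the optional sysfs-root argument (there so the functions can be tried against a scratch directory) are illustrative.

```shell
# Start a "check" pass on one md array unless it is already busy.
# The optional second argument overrides the sysfs root (default /sys).
start_md_check() {
    md="$1"
    sysfs="${2:-/sys}"
    action="$sysfs/block/$md/md/sync_action"

    state="$(cat "$action")" || return 1
    if [ "$state" != "idle" ]; then
        echo "$md: sync_action is '$state', not starting a check" >&2
        return 1
    fi
    echo check > "$action"
    echo "$md: check started"
}

# Print the mismatch count recorded by the last check or repair pass.
md_mismatch_count() {
    cat "${2:-/sys}/block/$1/md/mismatch_cnt"
}
```

From cron this would be `start_md_check md0`, with `md_mismatch_count md0` run some hours later to see whether the pass found inconsistencies.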
I have this in my smartd.conf:
/dev/hda -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/hdc -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/hde -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/hdg -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/sda -d ata -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/sdb -d ata -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
/dev/sdc -d ata -H -m root -S on -o on -I 194 -s (S/../.././02|L/../../6/03)
My Fedora Core box has this in /etc/init.d/mdmonitor:
daemon --check --user=root mdadm ${OPTIONS}
where OPTIONS="--monitor --scan -f --pid-file=/var/run/mdadm/mdadm.pid"
I have no mdadm.conf. My entire filesystem consists of md0 (/boot) and md1(/).
I figure if I have problems and need the file, it won't be available anyway. If I am mistaken,
please do let me know!
Any other suggestions would be welcomed!
Paul
* Re: Care and feeding of RAID?
2006-09-05 12:48 Care and feeding of RAID? Paul Waldo
2006-09-05 13:14 ` Benjamin Schieder
2006-09-05 13:19 ` Gordon Henderson
@ 2006-09-05 17:39 ` Paul Waldo
2006-09-09 15:58 ` Nix
2006-09-10 5:23 ` dean gaudet
2 siblings, 2 replies; 27+ messages in thread
From: Paul Waldo @ 2006-09-05 17:39 UTC (permalink / raw)
To: linux-raid
Paul Waldo wrote:
> Hi all,
>
> I have a RAID6 array and I wondering about care and feeding instructions
> :-)
>
> Here is what I currently do:
> - daily incremental and weekly full backups to a separate machine
> - run smartd tests (short once a day, long once a week)
> - check the raid for bad blocks every week
>
> What else can I do make sure the array keeps humming? Thanks in advance!
What about bitmaps? Nobody has mentioned them. It is my understanding that
you just turn them on with "mdadm /dev/mdX -b internal". Any caveats for this?
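For what it is worth, the mdadm man page describes bitmap changes on an existing array as a grow-mode operation, so the command above most likely needs --grow. A minimal sketch under that assumption; the function names are illustrative and /dev/mdX stays a placeholder.

```shell
# Add an internal write-intent bitmap to an existing array.
enable_md_bitmap() {
    mdadm --grow "$1" --bitmap=internal
}

# Remove the bitmap again.
disable_md_bitmap() {
    mdadm --grow "$1" --bitmap=none
}
```

After `enable_md_bitmap /dev/mdX`, /proc/mdstat should show a "bitmap:" line for the array. The usual trade-off is somewhat slower writes in exchange for much shorter resyncs after an unclean shutdown or a briefly removed disk.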
Paul
* Re: Care and feeding of RAID?
2006-09-05 17:02 ` Gordon Henderson
@ 2006-09-05 17:46 ` Paul Waldo
2006-09-06 0:04 ` Gordon Henderson
2006-09-06 7:41 ` Mario 'BitKoenig' Holbe
1 sibling, 1 reply; 27+ messages in thread
From: Paul Waldo @ 2006-09-05 17:46 UTC (permalink / raw)
To: Gordon Henderson; +Cc: linux-raid
Gordon Henderson wrote:
> On Tue, 5 Sep 2006, Steve Cousins wrote:
[snip]
> and my weekly badblocks script looks like:
>
> #!/bin/csh
>
> echo "`uname -n`: Badblocks test starting at [`date`]"
>
> foreach disk ( a c )
> foreach partition ( 1 2 3 5 6 )
> echo -n "hd$disk${partition}: "
> badblocks -c 128 /dev/hd$disk$partition
> end
> echo ""
> end
>
> echo "`uname -n`: Badblocks test ending at [`date`]"
[snip]
Maybe I'm missing something, but are these partitions mounted? Here's what I
get when I do this on a mounted partition:
[root@paul ~]# badblocks -nsv /dev/md0
/dev/md0 is mounted; it's not safe to run badblocks!
If you are running RAID, is it safe to run badblocks on the underlying partition?
* Re: Care and feeding of RAID?
2006-09-05 15:03 ` Steve Cousins
` (3 preceding siblings ...)
2006-09-05 17:09 ` Paul Waldo
@ 2006-09-05 17:57 ` Rev. Jeffrey Paul
2006-09-05 18:35 ` Steve Cousins
4 siblings, 1 reply; 27+ messages in thread
From: Rev. Jeffrey Paul @ 2006-09-05 17:57 UTC (permalink / raw)
To: Steve Cousins; +Cc: Gordon Henderson, Paul Waldo, linux-raid
On Tue, Sep 05, 2006 at 11:03:45AM -0400, Steve Cousins wrote:
>
> These are SATA drives and except for the one machine that has a 3Ware
> 8506 card in it I haven't been able to get SMART programs to do anything
> with these drives. How do others deal with this?
>
I use the tw_cli program to check up on my 3ware stuff.
It took me quite a bit of time to figure that one out. I don't
have any automated monitoring set up, but it'd be simple enough to
script. I check on the array every so often and run a verify every few
months to see if it kicks a disk out (it hasn't yet).
0 root@datavibe:~# tw_cli
//datavibe> info
Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
------------------------------------------------------------------------
c0    8006-2LP     2       2        1       0        2       -       -
//datavibe> info c0
Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-1    OK             -      -       232.885   ON     -        -
Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     232.88 GB   488397168     WD-WMAL718611
p1     OK               u0     232.88 GB   488397168     WD-WMAL718619
//datavibe>
-j
--
--------------------------------------------------------
Rev. Jeffrey Paul -datavibe- sneak@datavibe.net
aim:x736e65616b pgp:0xD9B3C17D phone:877-748-3467
9440 0C7F C598 01CA 2F17 D098 0A3A 4B8F D9B3 C17D
--------------------------------------------------------
* Re: Care and feeding of RAID?
2006-09-05 15:41 ` Benjamin Schieder
@ 2006-09-05 18:29 ` Steve Cousins
2006-09-05 20:57 ` Luca Berra
2006-09-06 7:33 ` Mario 'BitKoenig' Holbe
1 sibling, 1 reply; 27+ messages in thread
From: Steve Cousins @ 2006-09-05 18:29 UTC (permalink / raw)
To: Benjamin Schieder; +Cc: linux-raid
Benjamin Schieder wrote:
> On 05.09.2006 11:03:45, Steve Cousins wrote:
>
>>Would people be willing to list their setup? Including such things as
>>mdadm.conf file, crontab -l, plus scripts that they use to check the
>>smart data and the array, mdadm daemon parameters and anything else that
>>is relevant to checking and maintaining an array?
>
>
> Personally, I use this script from cron:
> http://shellscripts.org/project/hdtest
Hi Benjamin,
I am checking this out and I see that you are the writer of this script.
I'm getting errors when it comes to lines 76 and 86-90 about the
arithmetic symbols. This is on a Fedora Core 5 system with bash version
3.1.7(1). I weeded out the smartctl command and tried it manually with
no luck on my SATA /dev/sd? drives.
What do you (or others) recommend for SATA drives?
Thanks,
Steve
* Re: Care and feeding of RAID?
2006-09-05 17:57 ` Rev. Jeffrey Paul
@ 2006-09-05 18:35 ` Steve Cousins
0 siblings, 0 replies; 27+ messages in thread
From: Steve Cousins @ 2006-09-05 18:35 UTC (permalink / raw)
To: Rev. Jeffrey Paul; +Cc: linux-raid
Rev. Jeffrey Paul wrote:
> On Tue, Sep 05, 2006 at 11:03:45AM -0400, Steve Cousins wrote:
>
>>These are SATA drives and except for the one machine that has a 3Ware
>>8506 card in it I haven't been able to get SMART programs to do anything
>>with these drives. How do others deal with this?
>>
>
>
> I use the tw_cli program to check up on my 3ware stuff.
Hi Jeffrey,
Thanks. I use tw_cli too and I have scripted a check to see if it
degrades but this doesn't help with checking for disk problems before
they happen which SMART should help with. As it happens, smartctl works
with 3Ware SATA drives. It is my other SATA drives that I'm unable to
monitor.
Steve
> It took me quite a bit of time to figure that one out. I don't
> have any automated monitoring set up, but it'd be simple enough to
> script. I check on the array every so often and run a verify every few
> months to see if it kicks a disk out (it hasn't yet).
>
> 0 root@datavibe:~# tw_cli
> //datavibe> info
>
> Ctl Model Ports Drives Units NotOpt RRate VRate BBU
> ------------------------------------------------------------------------
> c0 8006-2LP 2 2 1 0 2 - -
>
> //datavibe> info c0
>
> Unit UnitType Status %Cmpl Stripe Size(GB) Cache AVerify IgnECC
> ------------------------------------------------------------------------------
> u0 RAID-1 OK - - 232.885 ON - -
>
> Port Status Unit Size Blocks Serial
> ---------------------------------------------------------------
> p0 OK u0 232.88 GB 488397168 WD-WMAL718611
> p1 OK u0 232.88 GB 488397168 WD-WMAL718619
> //datavibe>
>
> -j
>
--
______________________________________________________________________
Steve Cousins, Ocean Modeling Group Email: cousins@umit.maine.edu
Marine Sciences, 452 Aubert Hall http://rocky.umeoce.maine.edu
Univ. of Maine, Orono, ME 04469 Phone: (207) 581-4302
* Re: Care and feeding of RAID?
2006-09-05 17:09 ` Paul Waldo
@ 2006-09-05 20:17 ` Richard Scobie
0 siblings, 0 replies; 27+ messages in thread
From: Richard Scobie @ 2006-09-05 20:17 UTC (permalink / raw)
To: linux-raid
This is a timely thread for me, as I am about to set up a software RAID
10 (a striped pair of mirrors) on 4 x 500GB SATA.
Anything to watch for by not partitioning the drives at all? Or is it
safer to make one partition, slightly smaller than the full drive
(suggestions of how much welcome), to allow for possible size
discrepancies with replacements?
Also I am wondering as this is RAID0 on top of RAID1, if there are any
special steps that need to be taken when maintaining the array (adding,
removing, rebuilding etc), compared with a "single layer" RAID?
Regards,
Richard
* Re: Care and feeding of RAID?
2006-09-05 18:29 ` Steve Cousins
@ 2006-09-05 20:57 ` Luca Berra
2006-09-06 7:12 ` Benjamin Schieder
[not found] ` <44FDF08D.8000504@maine.edu>
0 siblings, 2 replies; 27+ messages in thread
From: Luca Berra @ 2006-09-05 20:57 UTC (permalink / raw)
To: Steve Cousins
On Tue, Sep 05, 2006 at 02:29:48PM -0400, Steve Cousins wrote:
>
>
>Benjamin Schieder wrote:
>>On 05.09.2006 11:03:45, Steve Cousins wrote:
>>
>>>Would people be willing to list their setup? Including such things as
>>>mdadm.conf file, crontab -l, plus scripts that they use to check the
>>>smart data and the array, mdadm daemon parameters and anything else that
>>>is relevant to checking and maintaining an array?
>>
>>
>>Personally, I use this script from cron:
>>http://shellscripts.org/project/hdtest
nice race :)
>I am checking this out and I see that you are the writer of this script.
>I'm getting errors when it comes to lines 76 and 86-90 about the
>arithmetic symbols. This is on a Fedora Core 5 system with bash version
that is because smartctl output has changed and the grep above returns
no number.
>3.1.7(1). I weeded out the smartctl command and tried it manually with
>no luck on my SATA /dev/sd? drives.
which command?
>What do you (or others) recommend for SATA drives?
smartmontools and a recent kernel just work.
also you can schedule smart tests with smartmontools, so you don't need
cron scripts.
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* Re: Care and feeding of RAID?
2006-09-05 17:46 ` Paul Waldo
@ 2006-09-06 0:04 ` Gordon Henderson
0 siblings, 0 replies; 27+ messages in thread
From: Gordon Henderson @ 2006-09-06 0:04 UTC (permalink / raw)
To: Paul Waldo; +Cc: linux-raid
On Tue, 5 Sep 2006, Paul Waldo wrote:
> Gordon Henderson wrote:
> > On Tue, 5 Sep 2006, Steve Cousins wrote:
> [snip]
> > and my weekly badblocks script looks like:
> >
> > #!/bin/csh
> >
> > echo "`uname -n`: Badblocks test starting at [`date`]"
> >
> > foreach disk ( a c )
> > foreach partition ( 1 2 3 5 6 )
> > echo -n "hd$disk${partition}: "
> > badblocks -c 128 /dev/hd$disk$partition
> > end
> > echo ""
> > end
> >
> > echo "`uname -n`: Badblocks test ending at [`date`]"
> [snip]
>
> Maybe I'm missing something, but are these partitions mounted? Here's what I
> get when I do this on a mounted partition:
>
> [root@paul ~]# badblocks -nsv /dev/md0
> /dev/md0 is mounted; it's not safe to run badblocks!
Do not use the -n option... (and -s won't be much use in a cron job, nor
-v, probably!) -n will write to the device which might well have issues
with the filesystem cache...
By reading the underlying drives you won't trigger a raid array failure
should you see a bad sector, which might give you time to do something
about it. There were some emails on this list some time back (a year or 2,3?)
about badblocking the md? device - I imagine it might not read every block
of every device unless it was a raid-0 array...
Gordon
* Re: Care and feeding of RAID?
2006-09-05 20:57 ` Luca Berra
@ 2006-09-06 7:12 ` Benjamin Schieder
2006-09-06 18:49 ` Luca Berra
[not found] ` <44FDF08D.8000504@maine.edu>
1 sibling, 1 reply; 27+ messages in thread
From: Benjamin Schieder @ 2006-09-06 7:12 UTC (permalink / raw)
To: linux-raid
On 05.09.2006 22:57:27, Luca Berra wrote:
> On Tue, Sep 05, 2006 at 02:29:48PM -0400, Steve Cousins wrote:
> >
> >
> >Benjamin Schieder wrote:
> >>On 05.09.2006 11:03:45, Steve Cousins wrote:
> >>
> >>>Would people be willing to list their setup? Including such things as
> >>>mdadm.conf file, crontab -l, plus scripts that they use to check the
> >>>smart data and the array, mdadm daemon parameters and anything else that
> >>>is relevant to checking and maintaining an array?
> >>
> >>
> >>Personally, I use this script from cron:
> >>http://shellscripts.org/project/hdtest
>
> nice race :)
As in race condition? Where?
> >I am checking this out and I see that you are the writer of this script.
> >I'm getting errors when it comes to lines 76 and 86-90 about the
> >arithmetic symbols. This is on a Fedora Core 5 system with bash version
> that is because smartctl output has changed and the grep above returns
> no number.
I'm running smartmontools 5.33 here. When did the output change? It still
works fine here.
> >What do you (or others) recommend for SATA drives?
>
> smartmontools and a recent kernel just work.
> also you can schedule smart tests with smartmontools. so you don't need
> to cron scripts.
Interesting. I'll look into that.
Greetings,
Benjamin
--
The Nethack IdleRPG! Idle to your favorite Nethack messages!
http://pallas.crash-override.net/nethackidle/
* Re: Care and feeding of RAID?
[not found] ` <44FDF08D.8000504@maine.edu>
@ 2006-09-06 7:16 ` Luca Berra
0 siblings, 0 replies; 27+ messages in thread
From: Luca Berra @ 2006-09-06 7:16 UTC (permalink / raw)
To: Steve Cousins; +Cc: linux-raid
On Tue, Sep 05, 2006 at 05:47:57PM -0400, Steve Cousins wrote:
>
>
>Luca Berra wrote:
>
>>On Tue, Sep 05, 2006 at 02:29:48PM -0400, Steve Cousins wrote:
>>
>>>
>>>
>>>Benjamin Schieder wrote:
>>>
>>>>On 05.09.2006 11:03:45, Steve Cousins wrote:
>>>>
>>>>>Would people be willing to list their setup? Including such things
>>>>>as mdadm.conf file, crontab -l, plus scripts that they use to check
>>>>>the smart data and the array, mdadm daemon parameters and anything
>>>>>else that is relevant to checking and maintaining an array?
>>>>
>>>>
>>>>
>>>>Personally, I use this script from cron:
>>>>http://shellscripts.org/project/hdtest
>>
>>
>>nice race :)
>
>I'm not sure what you mean?
tmp="`mktemp`"
rm -f ${tmp}
touch ${tmp}
the last two lines are unneeded and can be tricked to overwrite
arbitrary filenames
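A safer pattern is to keep the file that mktemp creates and never remove and recreate it, since it is the gap between rm and touch that another user could exploit, for example by planting a symlink under that name. A minimal sketch, assuming a mktemp that creates the file itself with restrictive permissions, as the common implementations do:

```shell
# Create a private scratch file and clean it up when the script exits.
# mktemp has already created the file, so no rm/touch step is needed.
tmp="$(mktemp)" || exit 1
trap 'rm -f "$tmp"' EXIT
```

The rest of the script can then write to "$tmp" directly; quoting the variable also avoids surprises if TMPDIR contains spaces.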
>I tried smartctl -t short -d scsi /dev/sdb where /dev/sdb is a 250GB
>SATA drive.
it is '-d ata'
>What command do you use for SATA drives? The sourceforge page implies
>that -d sata doesn't exist yet. I'm using FC 5 with 2.6.17 kernel and
>smartmontools version 5.33. Do you have a sample configuration script
>that you could show me?
# monitor two sata disks, show temperature in degrees,
# do a long test every sunday and a short every other day
# at 1am on sda and at 2am on sdb, YMMV
/dev/sda -d ata -a -R 194 -s (L/../../7|S/../../[123456])/01
/dev/sdb -d ata -a -R 194 -s (L/../../7|S/../../[123456])/02
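To see which hours such a -s expression selects, it can be tried against strings of the form T/MM/DD/d/HH, which is what smartd.conf(5) says the regular expression is matched against (d is the weekday, 1 = Monday to 7 = Sunday). The sketch below only approximates smartd's matching with `grep -E -x`, and the helper names are made up for illustration.

```shell
# Succeed if a smartd "-s" regular expression matches one
# T/MM/DD/d/HH string in full (approximation using grep -E -x).
sched_matches() {
    printf '%s\n' "$2" | grep -E -x -q -- "$1"
}

# Build the T/MM/DD/d/HH string for the current hour and a given
# test type (L = long, S = short); %u is the weekday, Monday = 1.
sched_now() {
    printf '%s/%s\n' "$1" "$(date +%m/%d/%u/%H)"
}
```

For example, `sched_matches '(L/../../7|S/../../[123456])/01' "$(sched_now S)" && echo "short test due this hour"` prints the message only between 01:00 and 01:59 on Monday to Saturday.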
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* Re: Care and feeding of RAID?
2006-09-05 15:41 ` Benjamin Schieder
2006-09-05 18:29 ` Steve Cousins
@ 2006-09-06 7:33 ` Mario 'BitKoenig' Holbe
1 sibling, 0 replies; 27+ messages in thread
From: Mario 'BitKoenig' Holbe @ 2006-09-06 7:33 UTC (permalink / raw)
To: linux-raid
Benjamin Schieder <blindcoder@scavenger.homeip.net> wrote:
> I have made the experience that long tests slow down the raid to a point
> where the system becomes unusable.
Even though we're quite off-topic here with that since it's more
SMART-related... this is at least unusual.
I'm also running regular SMART selftests (short daily, long weekly) and
usually they don't affect drive performance very much. However,
whenever I experienced massive slow-downs while selftests were running,
this always pointed to disk problems... too many (and strategically
disadvantageous) reallocated sectors (keep an ear on the disks, hear
them seek()ing :)), non-reallocatable sectors or just temperature (for
example, WD drives tend to show bad performance when getting too hot).
regards
Mario
--
But after a while I learned the trick of speaking fast. You don't have
to think any faster; just use twice as many words to say everything.
-- Paul Graham
* Re: Care and feeding of RAID?
2006-09-05 17:02 ` Gordon Henderson
2006-09-05 17:46 ` Paul Waldo
@ 2006-09-06 7:41 ` Mario 'BitKoenig' Holbe
2006-09-09 15:56 ` Nix
1 sibling, 1 reply; 27+ messages in thread
From: Mario 'BitKoenig' Holbe @ 2006-09-06 7:41 UTC (permalink / raw)
To: linux-raid
Gordon Henderson <gordon@drogon.net> wrote:
> I don't have any mdadm.conf files ... What am I missing? (I've always been
> under the impression that after needing the /etc/raidtab file with the old
> raidtools, you didn't need a config file as such under mdadm... However,
You don't necessarily need one. However, since Neil considers in-kernel
RAID-autodetection a bad thing and since mdadm typically relies on
mdadm.conf for RAID-assembly and since especially with newer kernels you
probably need to auto-create device nodes (2.6 and udev), it's more
convenient to have one. Though you could live without one even then.
I also ran without one for a long time, and I still don't like having
one; however, at some point in the past convenience won :)
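For anyone who wants to try it: a minimal sketch of generating a
config from the arrays that are currently running. mdadm --detail
--scan prints one ARRAY line per active array. The config path varies
by distribution, and the MDADM override variable and the function name
are only there for illustration:

```shell
#!/bin/sh
# Print ARRAY lines for all currently running arrays.
# Review the output before putting it into the config file.
scan_arrays() {
    ${MDADM:-mdadm} --detail --scan
}

# Example (as root; the path is an assumption, some distributions use
# /etc/mdadm/mdadm.conf instead):
#   echo 'DEVICE partitions' >  /etc/mdadm.conf
#   scan_arrays              >> /etc/mdadm.conf
```

Since the ARRAY lines identify arrays by UUID, the file does not need
to change when device names get shuffled.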
regards
Mario
--
There is nothing more deceptive than an obvious fact.
-- Sherlock Holmes by Arthur Conan Doyle
* Re: Care and feeding of RAID?
2006-09-06 7:12 ` Benjamin Schieder
@ 2006-09-06 18:49 ` Luca Berra
2006-09-10 7:36 ` Benjamin Schieder
0 siblings, 1 reply; 27+ messages in thread
From: Luca Berra @ 2006-09-06 18:49 UTC (permalink / raw)
To: linux-raid
On Wed, Sep 06, 2006 at 09:12:24AM +0200, Benjamin Schieder wrote:
>> >>Personally, I use this script from cron:
>> >>http://shellscripts.org/project/hdtest
>>
>> nice race :)
>
>As in race condition? Where?
mktemp
rm
touch
why do you do that?
>I'm running smartmontools 5.33 here. When did the output change? It still
>works fine here.
i retested now with 5.36 and it seems the output did _not_ change, i
don't know what i saw this morning.
but then it errors on the line
IFS=" " read type status online < <( smartctl -d ata -a ${disk} | grep \#\ 1 | sed 's, \+, ,g' | cut -f 2,3,5 )
L.
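One possible way to make that parsing less fragile, sketched under the
assumption that the self-test log separates its columns with runs of
two or more spaces (which is how smartctl has printed it on the drives
I have seen; this is not the hdtest script's own code, and
last_selftest is a made-up name). It avoids the process substitution,
so it also runs under plain sh:

```shell
#!/bin/sh
# last_selftest: read smartctl output on stdin and print
# "type|status|lifetime-hours" for self-test log entry "# 1".
# Fields are split on runs of two or more spaces, so descriptions
# such as "Short offline" stay in one piece.
last_selftest() {
    awk -F '  +' '/^# 1 / { print $2 "|" $3 "|" $5; exit }'
}

# Usage on a live system (the device name is an example):
#   smartctl -d ata -l selftest /dev/sda | last_selftest
```

If no self-test has been logged yet the function prints nothing, which
the caller should treat as "no result" rather than as an error.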
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* Re: Care and feeding of RAID?
2006-09-06 7:41 ` Mario 'BitKoenig' Holbe
@ 2006-09-09 15:56 ` Nix
0 siblings, 0 replies; 27+ messages in thread
From: Nix @ 2006-09-09 15:56 UTC (permalink / raw)
To: Mario 'BitKoenig' Holbe; +Cc: linux-raid
On 6 Sep 2006, Mario Holbe spake:
> You don't necessarily need one. However, since Neil considers in-kernel
> RAID-autodetection a bad thing and since mdadm typically relies on
> mdadm.conf for RAID-assembly
You can specify the UUID on the command-line too (although I don't).
The advantage of the config file from my POV is that it lets me activate
*all* my RAID arrays with one command, and the command doesn't change, no
matter how complex the array configuration. (I'll admit that the sheer
number of options to mdadm has always overwhelmed me to some degree,
despite the excellent documentation, so I prefer approaches that keep
a working command-line unchanged, especially for something as critical
as boot-time assembly.)
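To make the two approaches concrete, a rough sketch. The MDADM
override variable and the function names are made up, the device and
UUID in the usage comments are placeholders, and the exact behaviour
of --scan without a config file depends on the mdadm version, so check
mdadm(8) before relying on the second form:

```shell
#!/bin/sh
MDADM="${MDADM:-mdadm}"

# With a config file: one command, whatever the array layout is.
assemble_all() {
    $MDADM --assemble --scan
}

# Without relying on ARRAY lines: name the array and give its UUID.
# Usage: assemble_by_uuid /dev/md0 <uuid>
assemble_by_uuid() {
    $MDADM --assemble --scan --uuid="$2" "$1"
}

# Example (as root):
#   assemble_all
#   assemble_by_uuid /dev/md0 <uuid-from-mdadm--detail>
```

The first form is the one that stays unchanged as arrays are added,
which is the point made above.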
--
`In typical emacs fashion, it is both absurdly ornate and
still not really what one wanted.' --- jdev
* Re: Care and feeding of RAID?
2006-09-05 17:39 ` Paul Waldo
@ 2006-09-09 15:58 ` Nix
2006-09-10 5:23 ` dean gaudet
1 sibling, 0 replies; 27+ messages in thread
From: Nix @ 2006-09-09 15:58 UTC (permalink / raw)
To: Paul Waldo; +Cc: linux-raid
On 5 Sep 2006, Paul Waldo uttered the following:
> What about bitmaps? Nobody has mentioned them. It is my
> understanding that you just turn them on with "mdadm /dev/mdX -b
> internal". Any caveats for this?
Notably, how many additional writes does it incur? I have some RAID
arrays using drives which are quiet *until* you access them, and which
then make a bloody racket. The superblock updates are bad enough, but
bitmap updates, well, I don't really like seeing one write turned into
twelve-odd disk hits that much (just a back-of-the-envelope guess for a
three-disk RAID-5 array).
--
`In typical emacs fashion, it is both absurdly ornate and
still not really what one wanted.' --- jdev
* Re: Care and feeding of RAID?
2006-09-05 17:39 ` Paul Waldo
2006-09-09 15:58 ` Nix
@ 2006-09-10 5:23 ` dean gaudet
1 sibling, 0 replies; 27+ messages in thread
From: dean gaudet @ 2006-09-10 5:23 UTC (permalink / raw)
To: Paul Waldo; +Cc: linux-raid
On Tue, 5 Sep 2006, Paul Waldo wrote:
> What about bitmaps? Nobody has mentioned them. It is my understanding that
> you just turn them on with "mdadm /dev/mdX -b internal". Any caveats for
> this?
bitmaps have been working great for me on a raid5 and raid1. it makes it
that much more tolerable when i accidentally crash the box and don't have
to wait forever for a resync.
i don't notice the extra write traffic all that much... under heavy
traffic i see about 3 writes/s to the spare disk in the raid5 -- i assume
those are all due to the bitmap in the superblock on the spare.
i've considered using an external bitmap, i forget why i didn't do that
initially. the filesystem on the raid5 already has an external journal on
raid1.
-dean
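For reference, a sketch of turning a bitmap on and off for an existing
array. As far as I know this goes through grow mode, so the command
quoted above would need --grow (or -G) added. The MDADM override
variable and the function names are illustrative and /dev/md0 is a
placeholder:

```shell
#!/bin/sh
MDADM="${MDADM:-mdadm}"

# Add an internal write-intent bitmap to an existing array.
bitmap_on() {
    $MDADM --grow "$1" --bitmap=internal
}

# Remove it again, e.g. if the extra writes turn out to be a problem.
bitmap_off() {
    $MDADM --grow "$1" --bitmap=none
}

# Example (as root):
#   bitmap_on /dev/md0
#   cat /proc/mdstat     # a "bitmap:" line should now show up for md0
```

Being able to drop the bitmap again makes it cheap to measure the
extra write traffic on a given workload before committing to it.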
* Re: Care and feeding of RAID?
2006-09-06 18:49 ` Luca Berra
@ 2006-09-10 7:36 ` Benjamin Schieder
0 siblings, 0 replies; 27+ messages in thread
From: Benjamin Schieder @ 2006-09-10 7:36 UTC (permalink / raw)
To: linux-raid
On 06.09.2006 20:49:45, Luca Berra wrote:
> On Wed, Sep 06, 2006 at 09:12:24AM +0200, Benjamin Schieder wrote:
> >>>>Personally, I use this script from cron:
> >>>>http://shellscripts.org/project/hdtest
> >>
> >>nice race :)
> >
> >As in race condition? Where?
>
> mktemp
> rm
> touch
> why do you do that?
Probably because I misunderstood how mktemp works. I've fixed this now.
> i retested now with 5.36 and it seems the output did _not_ change, i
> don't know what i saw this morning.
>
> but then it errors on the line
> IFS=" " read type status online < <( smartctl -d ata -a ${disk} | grep \#\ 1 | sed 's, \+, ,g' | cut -f 2,3,5 )
I think I know what you mean. I've seen these errors sometimes, too. But
I've been too lazy to investigate so far, since they only pop up once a
month or so.
Greetings,
Benjamin
--
_ _ _ _ _
| \| |___| |_| |_ __ _ __| |__
| .` / -_) _| ' \/ _` / _| / /
|_|\_\___|\__|_||_\__,_\__|_\_\
| | (_)_ _ _ ___ __
| |__| | ' \ || \ \ /
|____|_|_||_\_,_/_\_\
Play Nethack anywhere with an x86 computer:
http://www.crash-override.net/nethacklinux.html
Thread overview: 27+ messages
2006-09-05 12:48 Care and feeding of RAID? Paul Waldo
2006-09-05 13:14 ` Benjamin Schieder
2006-09-05 16:56 ` Patrik Jonsson
2006-09-05 17:04 ` Gordon Henderson
2006-09-05 13:19 ` Gordon Henderson
2006-09-05 15:03 ` Steve Cousins
2006-09-05 15:41 ` Benjamin Schieder
2006-09-05 18:29 ` Steve Cousins
2006-09-05 20:57 ` Luca Berra
2006-09-06 7:12 ` Benjamin Schieder
2006-09-06 18:49 ` Luca Berra
2006-09-10 7:36 ` Benjamin Schieder
[not found] ` <44FDF08D.8000504@maine.edu>
2006-09-06 7:16 ` Luca Berra
2006-09-06 7:33 ` Mario 'BitKoenig' Holbe
2006-09-05 16:23 ` Mike Hardy
2006-09-05 17:02 ` Gordon Henderson
2006-09-05 17:46 ` Paul Waldo
2006-09-06 0:04 ` Gordon Henderson
2006-09-06 7:41 ` Mario 'BitKoenig' Holbe
2006-09-09 15:56 ` Nix
2006-09-05 17:09 ` Paul Waldo
2006-09-05 20:17 ` Richard Scobie
2006-09-05 17:57 ` Rev. Jeffrey Paul
2006-09-05 18:35 ` Steve Cousins
2006-09-05 17:39 ` Paul Waldo
2006-09-09 15:58 ` Nix
2006-09-10 5:23 ` dean gaudet