linux-raid.vger.kernel.org archive mirror
From: Joachim Otahal <Jou@gmx.net>
To: Keld Simonsen <keld@keldix.com>
Cc: Bill Davidsen <davidsen@tmr.com>, linux-raid@vger.kernel.org
Subject: Re: md devices: Suggestion for in place time and checksum within the RAID
Date: Sun, 14 Mar 2010 15:00:01 +0100
Message-ID: <4B9CEBE1.7040700@gmx.net>
In-Reply-To: <20100314130348.GA14141@light.rap.dk>

Keld Simonsen wrote:
> On Sun, Mar 14, 2010 at 12:58:50PM +0100, Joachim Otahal wrote:
>    
>> Debian schedules a monthly check (first Sunday, 00:57), IMHO the best
>> possible time and frequency: less is dangerous, more is useless. I added
>> a cronjob that checks every 15 minutes for changes in /proc/mdstat and
>> in the SMART info (reallocated sector count and drive-internal
>> error list only) and emails me if something changed since the previous check.
>> I use the script because /etc/mdadm/mdadm.conf only takes ONE email
>> address and requires a local MTA installed; I always uninstall the
>> local MTA if the machine is not going to be a mail server.
>>      
> Interesting! I would like to see your scripts....
>    
sendEmail.pl is from
http://caspian.dotconf.net/menu/Software/SendEmail/; in his latest
update the author managed to get rid of the TLS and base64-encoding problems.
Here is the unpolished script, in "it does what it should do" state. The
HEALTHFILE variable is reassigned partway through the script. The locations
are chosen deliberately: the RAID state file lives in /tmp (typically
cleared at boot), so I get RAID info at every boot and upon change; the
SMART state file lives in /var/log, so I get SMART info only when something
changes. The script is run every 15 minutes from cron. One of my HDDs had
its reallocated sector count grow every two weeks, but it seems to have
stabilized now; I can nicely follow that in my inbox.

#!/bin/sh
HEALTHFILE="/tmp/healthcheck.mdstat"
HARDDRIVES="/dev/sda /dev/sdb /dev/sdc /dev/sdd"
SENDEMAILCOMMAND="/usr/local/sbin/sendEmail.pl -f <sender> -t <recipient> -cc <recipient> -cc <recipient> -s <smtp-server> -o tls=auto -xu <smtp-user> -xp <smtp-password>"
if [ -f ${HEALTHFILE}.1 ] ; then /bin/rm -f ${HEALTHFILE}.1 ; fi
if [ -f ${HEALTHFILE}.0 ] ; then /bin/mv ${HEALTHFILE}.0 ${HEALTHFILE}.1 ; else /usr/bin/touch ${HEALTHFILE}.1 ; fi
/bin/cat /proc/mdstat > ${HEALTHFILE}.0
/usr/bin/diff ${HEALTHFILE}.0 ${HEALTHFILE}.1 > /dev/null
case "$?" in
   0)
     #
   ;;
   1)
     ${SENDEMAILCOMMAND} -u "RAID status" < ${HEALTHFILE}.0
   ;;
esac

HEALTHFILE="/var/log/healthcheck.smartdtl.realloc-sector-count"
if [ -f ${HEALTHFILE}.1 ] ; then /bin/rm -f ${HEALTHFILE}.1 ; fi
if [ -f ${HEALTHFILE}.0 ] ; then /bin/mv ${HEALTHFILE}.0 ${HEALTHFILE}.1 ; else /usr/bin/touch ${HEALTHFILE}.1 ; fi
echo "SMART shot info:"> ${HEALTHFILE}.0
for X in ${HARDDRIVES} ; do
   /bin/echo "${X}">> ${HEALTHFILE}.0
   /usr/local/sbin/smartctl --all ${X} | /bin/grep -i Reallocated_Sector_Ct >> ${HEALTHFILE}.0
done
/bin/echo "------------------------------------------------------------------------">> ${HEALTHFILE}.0
/bin/echo "Error Log from drives">> ${HEALTHFILE}.0
for X in ${HARDDRIVES} ; do
   /bin/echo "${X}">> ${HEALTHFILE}.0
   /usr/local/sbin/smartctl --all ${X} | /bin/grep -i -A 999 "SMART Error Log" | grep -v "without error" >> ${HEALTHFILE}.0
   /bin/echo "------------------------------------------------------------------------">> ${HEALTHFILE}.0
done
/usr/bin/diff ${HEALTHFILE}.0 ${HEALTHFILE}.1 > /dev/null
case "$?" in
   0)
     #
   ;;
   1)
     ${SENDEMAILCOMMAND} -u "SMART Status, Reallocated Sector Count" < ${HEALTHFILE}.0
   ;;
esac
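
The script repeats the same rotate, compare and notify steps for both state files. As a minimal sketch only, not part of the original script, that pattern could be factored into one helper. The function name report_if_changed is hypothetical, and cmp -s is swapped in for the diff/case construct, so unlike the original it also notifies when the comparison itself fails.

```sh
#!/bin/sh
# report_if_changed STATEFILE CMD [ARGS...]   (hypothetical helper name)
# Reads the new state from stdin, stores it in STATEFILE.0 and keeps the
# previous state in STATEFILE.1. Runs CMD with the new state on stdin
# only when the two files differ.
report_if_changed() {
    statefile=$1
    shift
    if [ -f "${statefile}.0" ]; then
        # Rotate the previous snapshot.
        mv -f "${statefile}.0" "${statefile}.1"
    else
        # First run: compare against an empty file, so a report is sent.
        : > "${statefile}.1"
    fi
    # Store the new snapshot from stdin.
    cat > "${statefile}.0"
    # cmp -s is silent and returns non-zero when the files differ.
    if ! cmp -s "${statefile}.0" "${statefile}.1"; then
        "$@" < "${statefile}.0"
    fi
}

# Possible use with the variables from the script above:
#   report_if_changed /tmp/healthcheck.mdstat ${SENDEMAILCOMMAND} -u "RAID status" < /proc/mdstat
```

With such a helper the RAID part shrinks to the single commented call, and the SMART part would pipe its collected output into a second call with the /var/log state file.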
>> But why not check parity during normal read operations? Was that a
>> performance decision?
>>      
> I don't know, but I do think it would hurt performance considerably.
>    
If http://www.accs.com/p_and_p/RAID/LinuxRAID.html is still current
info: it will hurt performance because of the default "left-symmetric"
layout, but I expect the real-world difference to be small.

>> It is not _that_ bad not doing it during normal
>> operation since the good dists schedule a regular check, but can it be
>> controlled by something like echo "1">
>> /proc/sys/dev/raid/always_read_parity ?
>>      
> Well, I think making an optional check would be fine.
> I don't know if it could be done in a non-performance-hurting way, such
> as being delayed or running at a lower IO priority.
>    
I doubt delaying would help performance: in asymmetric layouts it is
the fifth HD doing the read, and in symmetric layouts the
next chunk to read is directly after the parity chunk.
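
To make the layout argument concrete, here is a minimal sketch that prints which chunk each disk holds in one stripe of a RAID5 array, for the left-symmetric and left-asymmetric layouts. The function name raid5_row and the "ls"/"la" abbreviations are hypothetical, and the placement rules are an assumption based on how the md driver is commonly described: parity rotates from the last disk towards the first; in the symmetric variant the data chunks start on the disk right after parity and wrap around, while in the asymmetric variant they fill the disks in order and skip the parity disk.

```sh
#!/bin/sh
# raid5_row LAYOUT STRIPE NDISKS   (hypothetical helper name)
# LAYOUT is "ls" (left-symmetric) or "la" (left-asymmetric).
# Prints one cell per disk: P for parity, Dn for the n-th data chunk
# of that stripe.
raid5_row() {
    layout=$1
    stripe=$2
    ndisks=$3
    # Parity starts on the last disk and moves one disk left per stripe.
    pd=$(( ndisks - 1 - stripe % ndisks ))
    row=""
    disk=0
    while [ "$disk" -lt "$ndisks" ]; do
        if [ "$disk" -eq "$pd" ]; then
            cell="P"
        elif [ "$layout" = "ls" ]; then
            # Symmetric: data chunk i sits on disk (pd + 1 + i) mod ndisks.
            cell="D$(( (disk - pd - 1 + ndisks) % ndisks ))"
        elif [ "$disk" -lt "$pd" ]; then
            # Asymmetric: data fills the disks in order ...
            cell="D$disk"
        else
            # ... skipping the parity disk.
            cell="D$(( disk - 1 ))"
        fi
        row="$row $cell"
        disk=$(( disk + 1 ))
    done
    echo "${row# }"
}

# Show the first four stripes of a 4-disk array in both layouts.
for layout in ls la; do
    echo "layout: $layout"
    s=0
    while [ "$s" -lt 4 ]; do
        raid5_row "$layout" "$s" 4
        s=$(( s + 1 ))
    done
done
```

For stripe 1 on four disks this prints "D1 D2 P D0" for the symmetric layout and "D0 D1 P D2" for the asymmetric one: in the symmetric case the first data chunk of the stripe sits on the disk directly after the parity chunk.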

kind regards,

Joachim Otahal


