From: Joachim Otahal <Jou@gmx.net>
To: Keld Simonsen <keld@keldix.com>
Cc: Bill Davidsen <davidsen@tmr.com>, linux-raid@vger.kernel.org
Subject: Re: md devices: Suggestion for in place time and checksum within the RAID
Date: Sun, 14 Mar 2010 15:00:01 +0100
Message-ID: <4B9CEBE1.7040700@gmx.net>
In-Reply-To: <20100314130348.GA14141@light.rap.dk>
Keld Simonsen wrote:
> On Sun, Mar 14, 2010 at 12:58:50PM +0100, Joachim Otahal wrote:
>
>> Debian schedules a monthly check (first Sunday 00:57), IMHO the best
>> possible time and frequency: less is dangerous, more is useless. I added
>> a cronjob that checks every 15 minutes for changes in /proc/mdstat and
>> in the SMART info (reallocated sector count and drive-internal
>> error list only) and emails me if something changed since the previous check.
>> I use the script because /etc/mdadm/mdadm.conf only takes ONE email
>> address and requires a local MTA to be installed; I always uninstall the
>> local MTA if the machine is not going to be a mail server.
>>
> Interesting! I would like to see your scripts....
>
sendEmail.pl is from
http://caspian.dotconf.net/menu/Software/SendEmail/; in the latest
update the author got rid of the TLS and base64-encoding problems.
Here is the unpolished script, in "it does what it should do" state. The
HEALTHFILE variable is reassigned partway through the script. The two
locations are chosen so that the RAID info (/tmp) is mailed at every boot
and on every change, and the SMART info (/var/log) only when something
changes. It is run every 15 minutes from cron. One of my HDDs had a
reallocated sector count that grew every two weeks, but it seems to have
stabilized now; I can follow that nicely in my inbox.
#!/bin/sh
# Mail the RAID and SMART state whenever it differs from the previous run.
HEALTHFILE="/tmp/healthcheck.mdstat"
HARDDRIVES="/dev/sda /dev/sdb /dev/sdc /dev/sdd"
SENDEMAILCOMMAND="/usr/local/sbin/sendEmail.pl -f <sender> -t <recipient> -cc <recipient> -cc <recipient> -s <smtp-server> -o tls=auto -xu <smtp-user> -xp <smtp-password>"

# RAID state: rotate the previous snapshot to .1, write the new one to .0.
if [ -f "${HEALTHFILE}.1" ] ; then /bin/rm -f "${HEALTHFILE}.1" ; fi
if [ -f "${HEALTHFILE}.0" ] ; then /bin/mv "${HEALTHFILE}.0" "${HEALTHFILE}.1" ; else /usr/bin/touch "${HEALTHFILE}.1" ; fi
/bin/cat /proc/mdstat > "${HEALTHFILE}.0"
/usr/bin/diff "${HEALTHFILE}.0" "${HEALTHFILE}.1" > /dev/null
case "$?" in
0)
    # unchanged, nothing to send
    ;;
1)
    ${SENDEMAILCOMMAND} -u "RAID status" < "${HEALTHFILE}.0"
    ;;
esac

# SMART state: same rotation, but kept in /var/log so it survives a reboot.
HEALTHFILE="/var/log/healthcheck.smartdtl.realloc-sector-count"
if [ -f "${HEALTHFILE}.1" ] ; then /bin/rm -f "${HEALTHFILE}.1" ; fi
if [ -f "${HEALTHFILE}.0" ] ; then /bin/mv "${HEALTHFILE}.0" "${HEALTHFILE}.1" ; else /usr/bin/touch "${HEALTHFILE}.1" ; fi
echo "SMART shot info:" > "${HEALTHFILE}.0"
# Reallocated sector count per drive.
for X in ${HARDDRIVES} ; do
    /bin/echo "${X}" >> "${HEALTHFILE}.0"
    /usr/local/sbin/smartctl --all "${X}" | /bin/grep -i Reallocated_Sector_Ct >> "${HEALTHFILE}.0"
done
/bin/echo "------------------------------------------------------------------------" >> "${HEALTHFILE}.0"
/bin/echo "Error Log from drives" >> "${HEALTHFILE}.0"
# Drive-internal error log per drive, minus the self-test lines that passed.
for X in ${HARDDRIVES} ; do
    /bin/echo "${X}" >> "${HEALTHFILE}.0"
    /usr/local/sbin/smartctl --all "${X}" | /bin/grep -i -A 999 "SMART Error Log" | grep -v "without error" >> "${HEALTHFILE}.0"
    /bin/echo "------------------------------------------------------------------------" >> "${HEALTHFILE}.0"
done
/usr/bin/diff "${HEALTHFILE}.0" "${HEALTHFILE}.1" > /dev/null
case "$?" in
0)
    # unchanged, nothing to send
    ;;
1)
    ${SENDEMAILCOMMAND} -u "SMART Status, Reallocated Sector Count" < "${HEALTHFILE}.0"
    ;;
esac
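The script does the same rotate-and-compare step twice. As a rough sketch
only, that step could be pulled into one function. The name
changed_since_last_run is made up for this example and is not part of the
script above; it keeps the newest snapshot in <prefix>.0 and the previous
one in <prefix>.1, the same way the HEALTHFILE handling does.

```shell
#!/bin/sh
# Sketch only: changed_since_last_run is a hypothetical helper name.
#
# changed_since_last_run <state-file-prefix>
# Reads the new snapshot from stdin and stores it in <prefix>.0, after
# moving the previous snapshot to <prefix>.1. On the first run the
# previous snapshot is treated as an empty file.
# Returns 0 if the new snapshot differs from the previous one, 1 if not.
changed_since_last_run() {
    prefix=$1
    if [ -f "${prefix}.0" ]; then
        mv -f "${prefix}.0" "${prefix}.1"
    else
        : > "${prefix}.1"
    fi
    cat > "${prefix}.0"
    if cmp -s "${prefix}.0" "${prefix}.1"; then
        return 1
    fi
    return 0
}
```

With that, the RAID part would shrink to something like
"if changed_since_last_run /tmp/healthcheck.mdstat < /proc/mdstat ; then
${SENDEMAILCOMMAND} -u "RAID status" < /tmp/healthcheck.mdstat.0 ; fi".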
>> But why not check parity during normal read operation? Was that a
>> performance decision?
>>
> I don't know, but I do think it would hurt performance considerably.
>
If http://www.accs.com/p_and_p/RAID/LinuxRAID.html is still current
info, it will hurt performance because of the left-symmetric default
layout, but I expect the real-world difference to be small.
>> It is not _that_ bad not doing it during normal
>> operation since the good dists schedule a regular check, but can it be
>> controlled by something like echo "1">
>> /proc/sys/dev/raid/always_read_parity ?
>>
> Well, I think making an optional check would be fine.
> I don't know if it could be done in a non-performance-hurting way, such
> as being delayed or running at a lower IO priority.
>
I doubt delaying would help performance: in asymmetric layouts it
is the fifth HD doing a read, in symmetric layouts the
next chunk to read is directly after the parity chunk.
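
To make the chunk placement easier to picture, here is a small sketch that
prints where the data chunks and the parity chunk of one stripe end up for
the two "left" RAID5 layouts that mdadm calls left-symmetric and
left-asymmetric. The placement rule is my understanding of how md does it,
not something copied from the kernel source, and layout_row is a made-up
helper name.

```shell
#!/bin/sh
# Sketch only: layout_row is a hypothetical helper, not part of any md tool.
#
# layout_row <left-symmetric|left-asymmetric> <raid_disks> <stripe>
# Prints one line with one column per disk: P for the parity chunk,
# D<n> for data chunk number n of the array.
layout_row() {
    layout=$1 disks=$2 stripe=$3
    data=$((disks - 1))
    # In both "left" layouts parity starts on the last disk and moves
    # one disk towards the first with every stripe.
    pd=$((data - stripe % disks))
    row=""
    d=0
    while [ "$d" -lt "$disks" ]; do
        if [ "$d" -eq "$pd" ]; then
            cell="P"
        elif [ "$layout" = "left-symmetric" ]; then
            # Data starts on the disk right after parity and wraps around.
            cell="D$(( stripe * data + (d - pd - 1 + disks) % disks ))"
        else
            # left-asymmetric: data runs first to last disk, skipping parity.
            if [ "$d" -lt "$pd" ]; then idx=$d; else idx=$((d - 1)); fi
            cell="D$(( stripe * data + idx ))"
        fi
        row="$row${row:+ }$cell"
        d=$((d + 1))
    done
    echo "$row"
}
```

For five disks, "layout_row left-symmetric 5 1" prints "D5 D6 D7 P D4" and
"layout_row left-asymmetric 5 1" prints "D4 D5 D6 P D7".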
kind regards,
Joachim Otahal