linux-raid.vger.kernel.org archive mirror
* checking state of RAID (for automated notifications)
@ 2006-09-06  8:44 Tomasz Chmielewski
  2006-09-06 16:50 ` Mike Hardy
From: Tomasz Chmielewski @ 2006-09-06  8:44 UTC
  To: linux-raid

I would like to have RAID status monitored by nagios.

This sounds like a simple script, but I'm not sure what approach is correct.


Given that a healthy /proc/mdstat looks like this:

# cat /proc/mdstat
Personalities : [raid1] [raid10]
md2 : active raid10 sda2[4] sdd2[3] sdc2[2] sdb2[1]
       779264640 blocks super 1.0 64K chunks 2 near-copies [4/4] [UUUU]

md1 : active raid1 sdd1[1] sdc1[0]
       1076224 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
       1076224 blocks [2/2] [UU]

unused devices: <none>


What should my script be checking?

Does the number of "U" letters (8 for this host) indicate that the RAID is
healthy?
Or should I count "in_sync" in "cat /sys/block/md*/md/rd*/state"?
Perhaps the two approaches are the same, though.
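
For comparison, the sysfs approach mentioned above could be sketched roughly as follows. This is only an illustration, not a tested monitoring tool: the function name md_sysfs_status and its optional directory argument are made up here, and it assumes each array exposes md/raid_disks and md/rd*/state under /sys/block. Counting "in_sync" alone is not enough, because a missing member has no rd* entry at all, so the count is printed next to the expected number of members.

```shell
#!/bin/sh
# Hypothetical helper: print "<array> <in_sync members>/<expected members>"
# for every md array found under the given sysfs block directory.
md_sysfs_status() {
    # $1: sysfs block directory; defaults to /sys/block (parameter is an
    # assumption added so the function can be tried on a fake tree)
    root=${1:-/sys/block}
    for md in "$root"/md*/md; do
        # Skip unmatched globs and anything that is not an md array
        [ -r "$md/raid_disks" ] || continue
        expected=$(cat "$md/raid_disks")
        in_sync=0
        for state in "$md"/rd*/state; do
            [ -r "$state" ] || continue
            # The state file holds flags such as "in_sync" or "faulty"
            case $(cat "$state") in
                *in_sync*) in_sync=$((in_sync + 1)) ;;
            esac
        done
        name=${md%/md}
        echo "${name##*/} $in_sync/$expected"
    done
}
```

Calling md_sysfs_status with no argument reads the live /sys/block tree; an array is healthy by this measure when both numbers on its line are equal.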


What's the best way to determine that the RAID is running fine?


-- 
Tomasz Chmielewski
http://wpkg.org

* Re: checking state of RAID (for automated notifications)
  2006-09-06  8:44 checking state of RAID (for automated notifications) Tomasz Chmielewski
@ 2006-09-06 16:50 ` Mike Hardy
  2006-09-07  8:30   ` Tomasz Chmielewski
From: Mike Hardy @ 2006-09-06 16:50 UTC
  To: Tomasz Chmielewski; +Cc: linux-raid


<berlin> % rpm -qf /usr/lib/nagios/plugins/contrib/check_linux_raid.pl
nagios-plugins-1.4.1-1.2.fc4.rf

It is included in my nagios-plugins package, at least, and works great.

-Mike

* Re: checking state of RAID (for automated notifications)
  2006-09-06 16:50 ` Mike Hardy
@ 2006-09-07  8:30   ` Tomasz Chmielewski
From: Tomasz Chmielewski @ 2006-09-07  8:30 UTC
  To: Mike Hardy; +Cc: linux-raid

Mike Hardy wrote:
> <berlin> % rpm -qf /usr/lib/nagios/plugins/contrib/check_linux_raid.pl
> nagios-plugins-1.4.1-1.2.fc4.rf
> 
> It is built in to my nagios plugins package at least, and works great.

All right, I didn't see it.

I was thinking of monitoring remote servers, so I wrote something very simple.
It counts the "U" letters in /proc/mdstat and compares the count to the
expected number of devices, $DEVICES.

First, run this script on the remote machine via cron:


#!/bin/bash

# Writes the software RAID status of this machine to a file
# that the Nagios server can poll.

# Expected number of active RAID member devices across all arrays,
# i.e. the number of "U" letters in a healthy /proc/mdstat.
DEVICES=8


# no need to change anything below...

# Count the "U" (up) letters in /proc/mdstat.
RUNNING=$(tr -cd "U" < /proc/mdstat | wc -c)

if [ "$RUNNING" -eq "$DEVICES" ] ; then
    echo "RAID status OK" > /tmp/raid-status.txt
else
    echo "RAID broken" > /tmp/raid-status.txt
fi
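
The hard-coded DEVICES count has to be kept in step with the arrays on each host. A possible variant, sketched here under assumptions, compares the two numbers in each array's [n/m] field instead, so nothing needs to be configured. The function name mdstat_degraded and its optional file argument are invented for illustration; it only recognises arrays named md<number> and assumes the mdstat layout shown earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every array whose [total/active]
# field in an mdstat-format file shows fewer active members than expected.
mdstat_degraded() {
    # $1: file to parse; defaults to /proc/mdstat (parameter is an
    # assumption added so the function can be tried on saved output)
    awk '
        # Remember which array the following status line belongs to
        /^md[0-9]+ :/ { array = $1 }
        # First "[n/m]" field after the array line holds the member counts
        array != "" && match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[1] != n[2]) print array
            array = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

In the cron script, an empty result from mdstat_degraded would then stand for "RAID status OK", and any output would name the arrays to report as broken.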


Then poll the result from the Nagios server (let's call this plugin
"check_raid"):

#!/bin/bash

# Checks the state of software RAID on the remote host given as $1.

STATUS=$(ssh -l checkuser -i ~nagios/.ssh/checkuser.rsa "$1" "cat /tmp/raid-status.txt")

# Nagios exit codes: 0 = OK, 2 = CRITICAL.
if [ "$STATUS" == "RAID status OK" ] ; then
    echo "$STATUS"
    exit 0
else
    echo "$STATUS"
    exit 2
fi
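
One case the plugin above does not separate is a failed ssh connection or a missing status file, which is reported the same way as a broken array. A minimal sketch of a stricter mapping follows, using the standard Nagios plugin exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN). The function name raid_status_to_exit_code and the wording of the UNKNOWN message are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: turn the polled status text and the exit status of
# the ssh command into a Nagios message and return code.
raid_status_to_exit_code() {
    # $1: text read from the remote status file
    # $2: exit status of the ssh command that fetched it
    if [ "$2" -ne 0 ] || [ -z "$1" ]; then
        # Could not reach the host or the file was empty or missing
        echo "RAID UNKNOWN: could not read status"
        return 3
    fi
    case $1 in
        "RAID status OK") echo "$1"; return 0 ;;   # OK
        *)                echo "$1"; return 2 ;;   # CRITICAL
    esac
}
```

In the plugin, the exit status of the ssh command can be captured with $? right after the STATUS assignment, passed in as the second argument, and the function's return code used as the plugin's exit code.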


-- 
Tomasz Chmielewski
http://wpkg.org
