* mdadm never notify, grub cause fault
@ 2003-05-16 9:21 Farkas Levente
2003-05-16 17:45 ` Juri Haberland
0 siblings, 1 reply; 9+ messages in thread
From: Farkas Levente @ 2003-05-16 9:21 UTC (permalink / raw)
To: linux-raid; +Cc: Neil Brown
hi,
we've got a raid1 array with two 120GB Maxtor disks (hda, hdc) running RH9.
hdc fails very often (although there seems to be no physical error). in
/etc/mdadm.conf:
--------------------------
DEVICE /dev/hd[ac]1
ARRAY /dev/md0 UUID=a64f771d:9934a60a:39c1483d:2f4a9138
MAILADDR root@bnap.hu
--------------------------
we assume if we run:
/sbin/mdadm --monitor --scan --daemonise > /var/run/mdadm
then we'll get a notification in this case. unfortunately we didn't get
any notice! even when I stopped the monitor and started it again we still
didn't get any email. does mdadm send the notification periodically? or does it
send only once, and if it fails for some reason we never get notified?
I'd like to get a notification about it, even every minute! or is
there any other way to check the state every hour?
another important question: why do we lose one of our disks? I assume grub
causes it, since yesterday I upgraded the kernel and after that I had to
manually install grub (the root device is on md0). so I ran
--------------------------
grub
> root (hd0,0)
> setup (hd0)
> root (hd1,0)
> setup (hd1)
--------------------------
during the next boot:
--------------------------
hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=23072927, sector=23072864
end_request: I/O error, dev 16:01 (hdc), sector 23072864
raid1: Disk failure on hdc1, disabling device.
Operation continuing on 1 devices
raid1: hdc1: rescheduling block 23072864
md: updating md0 RAID superblock on device
md: hda1 [events: 00000013]<6>(write) hda1's sb offset: 117949120
md: recovery thread got woken up ...
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: (skipping faulty hdc1 )
raid1: hda1: redirecting sector 23072864 to another mirror
--------------------------
currently
--------------------------
cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hda1[0] hdc1[1](F)
117949120 blocks [2/1] [U_]
unused devices: <none>
--------------------------
and
--------------------------
mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.00
Creation Time : Sun May 4 12:12:40 2003
Raid Level : raid1
Array Size : 117949120 (112.49 GiB 120.78 GB)
Device Size : 117949120 (112.49 GiB 120.78 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Thu May 15 23:57:06 2003
State : dirty, no-errors
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Number Major Minor RaidDevice State
0 3 1 0 active sync /dev/hda1
1 22 1 1 faulty /dev/hdc1
UUID : a64f771d:9934a60a:39c1483d:2f4a9138
Events : 0.19
--------------------------
what is the preferred way to reconstruct the array in this case?
mdadm /dev/md0 -f /dev/hdc1 -r /dev/hdc1 -a /dev/hdc1
or?
thanks for any help in advance.
--
Levente "Si vis pacem para bellum!"
* Re: mdadm never notify, grub cause fault
2003-05-16 9:21 mdadm never notify, grub cause fault Farkas Levente
@ 2003-05-16 17:45 ` Juri Haberland
2003-05-16 17:57 ` Paul Clements
0 siblings, 1 reply; 9+ messages in thread
From: Juri Haberland @ 2003-05-16 17:45 UTC (permalink / raw)
To: linux-raid
Farkas Levente <lfarkas@bnap.hu> wrote:
> /etc/mdadm.conf:
> --------------------------
> DEVICE /dev/hd[ac]1
> ARRAY /dev/md0 UUID=a64f771d:9934a60a:39c1483d:2f4a9138
> MAILADDR root@bnap.hu
> --------------------------
> we assume if we run:
> /sbin/mdadm --monitor --scan --daemonise > /var/run/mdadm
> then we'll get a notification in this case. unfortunately we didn't get
> any notice! even when I stopped the monitor and started it again we still
> didn't get any email. does mdadm send the notification periodically? or does it
I also ran into this problem. I found the reason when I started mdadm
without '--daemonize': It tries to use '/usr/lib/sendmail' whereas most
recent distributions have sendmail (or its replacement) in /usr/sbin.
So just create a link at /usr/lib/sendmail pointing to /usr/sbin/sendmail and it
should work.
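A minimal sketch of that workaround, assuming the real binary is at
/usr/sbin/sendmail and that nothing exists at /usr/lib/sendmail yet. The
function name link_sendmail and its two optional path arguments are made up
for illustration; they only make it possible to try the logic in a scratch
directory before running it as root.

```shell
#!/bin/sh
# Create the compatibility link only when the expected path is missing.
# Both paths are parameters (hypothetical) so the function can be tried
# outside /usr first; the defaults are the paths named in this thread.
link_sendmail() {
    real=${1:-/usr/sbin/sendmail}
    compat=${2:-/usr/lib/sendmail}
    # Do nothing if something is already there or the real binary is absent.
    if [ ! -e "$compat" ] && [ -x "$real" ]; then
        ln -s "$real" "$compat"
    fi
}
```

Called as root with no arguments, link_sendmail does the same as
"ln -s /usr/sbin/sendmail /usr/lib/sendmail", except that it leaves an
existing /usr/lib/sendmail alone and does nothing if /usr/sbin/sendmail is
missing.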
> send only once, and if it fails for some reason we never get notified?
> I'd like to get a notification about it, even every minute! or is
> there any other way to check the state every hour?
Mdadm will send only one mail per event. If you want to have a periodic
notification I think you will have to write your own script that checks
/proc/mdstat.
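Such a script could look roughly like the sketch below. It assumes the
/proc/mdstat layout shown earlier in this thread, where the status line of
each array ends in a map such as [U_] and an underscore stands for a failed
or missing member. The function names degraded_arrays and check_mdstat are
hypothetical.

```shell
#!/bin/sh
# Print the name of every md array whose member map (e.g. [U_]) contains "_".
# The mdstat file is an optional argument so a saved copy can be checked too.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                    # remember the current array
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }   # "_" marks a missing member
    ' "${1:-/proc/mdstat}"
}

# Print one summary line and return non-zero if any array is degraded;
# stay silent and return zero otherwise.
check_mdstat() {
    bad=$(degraded_arrays "$@")
    if [ -n "$bad" ]; then
        # $bad is left unquoted on purpose so several names end up on one line
        echo "Degraded md arrays:" $bad
        return 1
    fi
}
```

Called from cron, for example once an hour, check_mdstat prints nothing
while all arrays are healthy. Since cron normally mails any output of a job
to its owner, a degraded array would then produce one mail per run.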
Concerning your problem with hdc, it might be bad cabling. Have you
checked this?
Regards,
Juri
--
Juri Haberland <juri@koschikode.com>
* Re: mdadm never notify, grub cause fault
2003-05-16 17:45 ` Juri Haberland
@ 2003-05-16 17:57 ` Paul Clements
2003-05-19 2:47 ` Neil Brown
0 siblings, 1 reply; 9+ messages in thread
From: Paul Clements @ 2003-05-16 17:57 UTC (permalink / raw)
To: Juri Haberland; +Cc: linux-raid
Juri Haberland wrote:
> I also ran into this problem. I found the reason when I started mdadm
> without '--daemonize': It tries to use '/usr/lib/sendmail' whereas most
> recent distributions have sendmail (or its replacement) in /usr/sbin.
> So just create a link at /usr/lib/sendmail pointing to /usr/sbin/sendmail and it
> should work.
I was wondering myself how mdadm sent e-mail...sounds like there might
be a need for a new "MAILPROG" entry in mdadm.conf...
--
Paul
* Re: mdadm never notify, grub cause fault
2003-05-16 17:57 ` Paul Clements
@ 2003-05-19 2:47 ` Neil Brown
2003-05-19 6:06 ` Farkas Levente
[not found] ` <3EC8B398.8090702@koschikode.com>
0 siblings, 2 replies; 9+ messages in thread
From: Neil Brown @ 2003-05-19 2:47 UTC (permalink / raw)
To: Paul Clements; +Cc: Juri Haberland, linux-raid
On Friday May 16, Paul.Clements@SteelEye.com wrote:
> Juri Haberland wrote:
>
> > I also ran into this problem. I found the reason when I started mdadm
> > without '--daemonize': It tries to use '/usr/lib/sendmail' whereas most
> > recent distributions have sendmail (or its replacement) in /usr/sbin.
> > So just create a link at /usr/lib/sendmail pointing to /usr/sbin/sendmail and it
> > should work.
>
> I was wondering myself how mdadm sent e-mail...sounds like there might
> be a need for a new "MAILPROG" entry in mdadm.conf...
>
There is a compile-time option which I have just made more explicit in
the Makefile.
If you add:
-DSendmail=\""/usr/sbin/sendmail -t"\"
to the CFLAGS line in the makefile you will change how mail is sent by
default.
If you want runtime configuration, I would rather just leave the
PROGRAM entry, and you can write a script to do whatever you like.
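As a rough sketch of the PROGRAM approach: mdadm --monitor runs the
configured program with the event name, the md device and, for some events,
the component device as arguments. The script below turns those into a mail
and pipes it to a sendmail command chosen at run time. The function names,
the SENDMAIL and ALERT_TO variables and their defaults are assumptions for
illustration, not part of mdadm.

```shell
#!/bin/sh
# Hypothetical handler for the PROGRAM line in mdadm.conf.
# Arguments as passed by mdadm --monitor: event, md device, [component device].
SENDMAIL=${SENDMAIL:-"/usr/sbin/sendmail -t"}   # assumed default mailer
ALERT_TO=${ALERT_TO:-root}                      # assumed default recipient

# Write a complete mail (headers, blank line, body) to standard output.
format_alert() {
    event=$1
    array=$2
    member=${3:-}
    printf 'To: %s\nSubject: %s event on %s\n\n' "$ALERT_TO" "$event" "$array"
    printf 'mdadm reported a %s event on %s' "$event" "$array"
    if [ -n "$member" ]; then
        printf ' (component %s)' "$member"
    fi
    printf '\n'
}

# Hand the mail to the configured mailer; "-t" makes sendmail read the
# recipient from the To: header. $SENDMAIL is unquoted so options split.
send_alert() {
    format_alert "$@" | $SENDMAIL
}
```

Saved as, say, /usr/local/sbin/md-alert (a made-up path) with a final line
send_alert "$@", it would be enabled with "PROGRAM /usr/local/sbin/md-alert"
in mdadm.conf.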
I'm thinking of causing the "NewArray" alert to
- be generated for all arrays at start time
- contain an indication of whether the array is degraded
- cause email to be sent if the array is degraded.
Thus you will get an alert of a failed drive when the mdadm --monitor
is started.
NeilBrown
* Re: mdadm never notify, grub cause fault
2003-05-19 2:47 ` Neil Brown
@ 2003-05-19 6:06 ` Farkas Levente
2003-05-21 1:38 ` Neil Brown
[not found] ` <3EC8B398.8090702@koschikode.com>
1 sibling, 1 reply; 9+ messages in thread
From: Farkas Levente @ 2003-05-19 6:06 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
Neil Brown wrote:
> On Friday May 16, Paul.Clements@SteelEye.com wrote:
>
>>Juri Haberland wrote:
>>
>>
>>>I also ran into this problem. I found the reason when I started mdadm
>>>without '--daemonize': It tries to use '/usr/lib/sendmail' whereas most
>>>recent distributions have sendmail (or its replacement) in /usr/sbin.
>>>So just create a link at /usr/lib/sendmail pointing to /usr/sbin/sendmail and it
>>>should work.
>>
>>I was wondering myself how mdadm sent e-mail...sounds like there might
>>be a need for a new "MAILPROG" entry in mdadm.conf...
>>
>
>
> There is a compile-time option which I have just made more explicit in
> the Makefile.
> If you add:
> -DSendmail=\""/usr/sbin/sendmail -t"\"
> to the CFLAGS line in the makefile you will change how mail is sent by
> default.
>
> If you want runtime configuration, I would rather just leave the
> PROGRAM entry, and you can write a script to do whatever you like.
>
> I'm thinking of causing the "NewArray" alert to
> - be generated for all arrays at start time
> - contain an indication of whether the array is degraded
> - cause email to be sent if the array is degraded.
>
> Thus you will get an alert of a failed drive when the mdadm --monitor
> is started.
will the above be generated every time --monitor is started? that
would be nice!
--
Levente "Si vis pacem para bellum!"
* Re: mdadm never notify, grub cause fault
[not found] ` <3EC8B398.8090702@koschikode.com>
@ 2003-05-19 11:25 ` Neil Brown
0 siblings, 0 replies; 9+ messages in thread
From: Neil Brown @ 2003-05-19 11:25 UTC (permalink / raw)
To: Juri Haberland; +Cc: linux-raid
On Monday May 19, juri@koschikode.com wrote:
> Neil Brown wrote:
>
> > There is a compile-time option which I have just made more explicit in
> > the Makefile.
> > If you add:
> > -DSendmail=\""/usr/sbin/sendmail -t"\"
> > to the CFLAGS line in the makefile you will change how mail is sent by
> > default.
>
> The question is: why does it default to /usr/lib? Yes, I know that this
> is the historical place, but on virtually all Linux distributions
> sendmail is in /usr/sbin and if you're lucky you have a compatibility
> link to /usr/lib. As mdadm is intended to be used on Linux and not on any
> other Unix system I don't see the point in defaulting to /usr/lib.
Must be my grey hairs showing. This 'sbin' thing still feels like a
weird new invention that is some sort of cross between /etc and
/usr/lib.
When I first used a machine post locally-hacked-Edition-7 Unix, the
gateway to the mail system was "/usr/lib/sendmail", and
"/usr/lib/sendmail" has been there ever since, so I haven't seen a
need to change.
Old habits die hard.
NeilBrown
* Re: mdadm never notify, grub cause fault
2003-05-19 6:06 ` Farkas Levente
@ 2003-05-21 1:38 ` Neil Brown
2003-05-21 9:05 ` Farkas Levente
0 siblings, 1 reply; 9+ messages in thread
From: Neil Brown @ 2003-05-21 1:38 UTC (permalink / raw)
To: Farkas Levente; +Cc: linux-raid
On Monday May 19, lfarkas@bnap.hu wrote:
> >
> > I'm thinking of causing the "NewArray" alert to
> > - be generated for all arrays at start time
> > - contain an indication of whether the array is degraded
> > - cause email to be sent if the array is degraded.
> >
> > Thus you will get an alert of a failed drive when the mdadm --monitor
> > is started.
>
> will the above be generated every time --monitor is started? that
> would be nice!
Yep, and it's done.
There is a patch under
http://www.cse.unsw.edu.au/~neilb/source/mdadm/
(see 'patch' and the 'applied')
that
1/ When mdadm --monitor first notices an array, it will check if it
is degraded and will issue a "DegradedArray" event if it is.
2/ mdadm --monitor has a --oneshot option which causes it to check
once and exit. This can be used in a cron script to generate
DegradedArray events on a regular basis for any degraded arrays.
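For example, the cron job could be a small wrapper like the one below. This
is a sketch that assumes an mdadm built with the patch above, so that
--oneshot is available, and an mdadm.conf with a MAILADDR line. The file
name and the MDADM override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical hourly cron job, e.g. saved as /etc/cron.hourly/mdadm-check.
# MDADM can be overridden only so that the command line can be inspected
# without touching real arrays; the default is the path used in this thread.
MDADM=${MDADM:-/sbin/mdadm}

check_arrays() {
    # --scan:    take the arrays and MAILADDR from mdadm.conf
    # --oneshot: check every array once, report degraded ones, then exit
    "$MDADM" --monitor --scan --oneshot
}
```

With a final line calling check_arrays, every run would raise a
DegradedArray event, and hence a mail, for each array that is still
degraded. A plain crontab entry that runs the same mdadm command at the
desired interval would do the same job.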
NeilBrown
* Re: mdadm never notify, grub cause fault
2003-05-21 1:38 ` Neil Brown
@ 2003-05-21 9:05 ` Farkas Levente
2003-05-22 0:36 ` Neil Brown
0 siblings, 1 reply; 9+ messages in thread
From: Farkas Levente @ 2003-05-21 9:05 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
Neil Brown wrote:
> On Monday May 19, lfarkas@bnap.hu wrote:
>
>>>I'm thinking of causing the "NewArray" alert to
>>> - be generated for all arrays at start time
>>> - contain an indication of whether the array is degraded
>>> - cause email to be sent if the array is degraded.
>>>
>>>Thus you will get an alert of a failed drive when the mdadm --monitor
>>>is started.
>>
>>will the above be generated every time --monitor is started? that
>>would be nice!
>
>
> Yep, and it's done.
>
> There is a patch under
>
> http://www.cse.unsw.edu.au/~neilb/source/mdadm/
>
> (see 'patch' and the 'applied')
>
> that
> 1/ When mdadm --monitor first notices an array, it will check if it
> is degraded and will issue a "DegradedArray" event if it is.
> 2/ mdadm --monitor has a --oneshot option which causes it to check
> once and exit. This can be used in a cron script to generate
> DegradedArray events on a regular basis for any degraded arrays.
thanks. what does 'applied' mean? I just checked both the tgz and the
rpms, but they seem to be older. I always prefer a release
like 1.2.1, even more so if an rpm is supplied :-))
thanks.
--
Levente "Si vis pacem para bellum!"
* Re: mdadm never notify, grub cause fault
2003-05-21 9:05 ` Farkas Levente
@ 2003-05-22 0:36 ` Neil Brown
0 siblings, 0 replies; 9+ messages in thread
From: Neil Brown @ 2003-05-22 0:36 UTC (permalink / raw)
To: Farkas Levente; +Cc: linux-raid
On Wednesday May 21, lfarkas@bnap.hu wrote:
>
> thanks. what does 'applied' mean? I just checked both the tgz and the
> rpms, but they seem to be older. I always prefer a release
> like 1.2.1, even more so if an rpm is supplied :-))
> thanks.
>
The "applied" means that I have applied it to my current source tree.
There could also be "removed" if I have temporarily retracted it for
some reason, or "included" if it has been included in a distribution.
The 'patch' directory is simply the repository maintained by my patch
management script (called 'p' and found in
http://www.cse.unsw.edu.au/~neilb/source/wiggle/
). Putting it in my web page is the easiest way to make sure current
work is published with minimal effort.
If you want it in a real release, I'm afraid you will have to wait
until I feel that a new release is appropriate.
NeilBrown