linux-raid.vger.kernel.org archive mirror
From: MRK <mrk@shiftmail.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: mdadm: failed devices become spares!
Date: Wed, 19 May 2010 00:25:16 +0200
Message-ID: <4BF313CC.9030401@shiftmail.org>
In-Reply-To: <20100518120637.24d875c9@notabene.brown>

On 05/18/2010 04:06 AM, Neil Brown wrote:
> However if --monitor gets to check the array between the above two events, it
> will first see that the working drive is now faulty, so it reports a failure,
> and then see that the faulty device isn't faulty any more and in fact isn't
> even there.  The "isn't even there" bit doesn't register and it treats it as
> 'SpareActive'.
>
> I should fix that.
>    

However, in one case the two events are not detected in the same round:

Apr 12 20:10:02 phobos mdadm[3157]: Fail event detected on md device /dev/md2,
component device /dev/sdf1
Apr 12 20:11:02 phobos mdadm[3157]: SpareActive event detected on md device
/dev/md2, component device /dev/sdf1


One minute passes between the two entries. I suppose that's the mdadm 
daemon's polling interval (the default --delay for --monitor is 60 
seconds).

In the other case, all the entries have the same timestamp:

Apr 13 08:00:02 phobos mdadm[3157]: Fail event detected on md device /dev/md2,
component device /dev/sdd1
Apr 13 08:00:02 phobos mdadm[3157]: SpareActive event detected on md device
/dev/md2, component device /dev/sdd1
Apr 13 08:00:02 phobos last message repeated 7 times
[...many times that messages..]


...plus, in this second case the SpareActive event triggers many times 
within that same second. (Pierre, you cut it short: are the "many times 
that messages" all at the exact same time, or do they span a few seconds?)
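
To make the comparison between the two cases easier to repeat on the full 
log, here is a minimal Python sketch that measures the gap between each 
Fail event and the next SpareActive event for the same component device. 
It is only an illustration: the function names (parse_events, 
fail_to_spare_gaps) are made up for this example, the year is an assumption 
because syslog timestamps do not carry one, and it expects each log entry 
on a single line (the entries quoted above were wrapped by the mail client).

```python
import re
from datetime import datetime

# Matches mdadm --monitor syslog entries of the form quoted in this thread,
# once each entry has been re-joined onto a single line.
EVENT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ mdadm\[\d+\]: "
    r"(?P<event>\w+) event detected on md device (?P<md>\S+), "
    r"component device (?P<dev>\S+)$"
)


def parse_events(lines, year=2010):
    """Return (timestamp, event, md device, component device) tuples.

    The year is an assumption: syslog lines do not include it.
    Lines that are not mdadm events (e.g. "last message repeated") are skipped.
    """
    events = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if m is None:
            continue
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((when, m["event"], m["md"], m["dev"]))
    return events


def fail_to_spare_gaps(events):
    """Pair each Fail with the next SpareActive on the same device.

    Returns a list of (component device, gap in seconds).
    """
    last_fail = {}
    gaps = []
    for when, event, md, dev in events:
        key = (md, dev)
        if event == "Fail":
            last_fail[key] = when
        elif event == "SpareActive" and key in last_fail:
            gap = (when - last_fail.pop(key)).total_seconds()
            gaps.append((dev, gap))
    return gaps


SAMPLE = [
    "Apr 12 20:10:02 phobos mdadm[3157]: Fail event detected on md device /dev/md2, component device /dev/sdf1",
    "Apr 12 20:11:02 phobos mdadm[3157]: SpareActive event detected on md device /dev/md2, component device /dev/sdf1",
    "Apr 13 08:00:02 phobos mdadm[3157]: Fail event detected on md device /dev/md2, component device /dev/sdd1",
    "Apr 13 08:00:02 phobos mdadm[3157]: SpareActive event detected on md device /dev/md2, component device /dev/sdd1",
    "Apr 13 08:00:02 phobos last message repeated 7 times",
]

if __name__ == "__main__":
    for dev, gap in fail_to_spare_gaps(parse_events(SAMPLE)):
        print(f"{dev}: {gap:.0f} s between Fail and SpareActive")
```

A gap equal to the polling interval suggests the two state changes were 
seen in consecutive polls, while a gap of zero means both were reported in 
the same round. Note that the sketch ignores syslog's "last message 
repeated" lines, so it cannot say how many times SpareActive fired.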

It looks to me like some kind of USB failure, where the USB connection or 
USB bridge momentarily fails, then immediately gets re-detected and 
re-added to the system. But since there are no usb entries in dmesg, 
that would also be an issue with the USB driver. Could the problem also 
be combined with some unwise udev triggers in Debian, maybe somehow 
causing the drive to be automatically re-added to the RAID?

Pierre:
- Can you post your mdadm.conf?
- USB is not good for RAID, in my opinion. Many times I have seen 
problems with USB/SATA bridges where the drive would get disconnected 
under high I/O activity and then reconnected after a few seconds. Anyway, 
re-adding it to the RAID shouldn't have happened. Also, in my case there 
were "usb" entries in dmesg.

Thread overview: 12+ messages
2010-05-16 15:40 mdadm: failed devices become spares! Pierre Vignéras
2010-05-16 19:56 ` Leslie Rhorer
2010-05-17 18:10   ` Pierre Vignéras
2010-05-17 21:09     ` Tim Small
2010-05-18  1:30     ` Neil Brown
2010-05-18  2:06       ` Neil Brown
2010-05-18 22:25         ` MRK [this message]
2010-05-19 19:56           ` Simon Matthews
2010-05-21 21:00           ` Pierre Vignéras
2010-05-21 21:27         ` mdadm: failed devices become spares! -> Solved ! Pierre Vignéras
2010-05-18 23:07       ` mdadm: failed devices become spares! Pierre Vignéras
2010-05-19  1:45         ` Neil Brown
