Re: interesting failure scenario

From: Molle Bestefich <molle.bestefich@gmail.com>
To: linux-raid@vger.kernel.org
Subject: Re: interesting failure scenario
Date: Mon, 4 Apr 2005 09:22:55 +0200
Message-ID: <62b0912f05040400227dab7428@mail.gmail.com>
In-Reply-To: <62b0912f050404001813448d3d@mail.gmail.com>

Michael Tokarev wrote:
> I just came across an interesting situation, here's the
> scenario.

 [snip] 
 
> Now we have an interesting situation.  Both superblocks in d1
> and d2 are identical, event counts are the same, both are clean.
> Things which are different:
>    utime - on d1 it is "more recent" (provided we haven't touched
>      the system clock of course)
>    on d1, d2 is marked as faulty
>    on d2, d1 is marked as faulty.
> 
> Neither of these conditions is checked by mdadm.
> 
> So, mdadm just starts a clean RAID1 array composed of two drives
> with different data on them.  And no one notices this fact (fsck,
> which is reading from one disk, goes ok), until some time later when
> some app reports data corruption (reading from the other disk); you
> go check what's going on, notice there's no data corruption (reading
> from the 1st disk), suspect memory and... it's quite a long list of
> possible bad stuff which can go on here... ;)
> 
> The above scenario is just a theory, but a theory with quite a
> non-zero probability.  Instead of hotplugging the disks, one can do
> a reboot with flaky ide/scsi cables or whatnot, so that disks will
> be detected on/off randomly...
> 
> Probably it is a good idea to test utime too, in addition to event
> counters, in mdadm's Assemble.c (as the comments say, though the code
> disagrees).

Hmm, please don't.
 
I regularly rely on MD assembling arrays whose event counters match but
whose utimes don't.  It happens quite often that a controller fails or
something like that, and you accidentally lose 2 disks in a raid5.
 
I still want to be able to force the array to be assembled in these cases.
I'm still on 2.4, btw, and don't know if 2.6 has a better way to do it
than manipulating the event counters.
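
To make the distinction concrete, here is a minimal sketch of a check
that catches the quoted scenario (equal event counts, each member marking
the other as faulty) without looking at utime at all, so arrays whose
utimes merely differ would still assemble.  The Superblock class, its
fields and mutually_faulty_pairs() are hypothetical names made up for
illustration; this is not mdadm's code or its on-disk superblock layout.

```python
from dataclasses import dataclass, field


@dataclass
class Superblock:
    """Hypothetical, simplified view of one member's superblock."""
    name: str
    events: int
    utime: int
    # Names of the members that this superblock records as faulty.
    faulty: set = field(default_factory=set)


def mutually_faulty_pairs(members):
    """Return pairs with equal event counts that each mark the other faulty.

    utime is deliberately ignored, so members that differ only in utime
    are not reported.
    """
    pairs = []
    for i, a in enumerate(members):
        for b in members[i + 1:]:
            same_events = a.events == b.events
            cross_marked = b.name in a.faulty and a.name in b.faulty
            if same_events and cross_marked:
                pairs.append((a.name, b.name))
    return pairs


# The quoted scenario: same event count, each disk blames the other.
d1 = Superblock("d1", events=42, utime=1000, faulty={"d2"})
d2 = Superblock("d2", events=42, utime=900, faulty={"d1"})
conflicts = mutually_faulty_pairs([d1, d2])

# The case I care about: utimes differ, but nobody is cross-marked.
r1 = Superblock("r1", events=7, utime=500)
r2 = Superblock("r2", events=7, utime=450)
no_conflicts = mutually_faulty_pairs([r1, r2])
```

Under these assumptions, conflicts holds the d1/d2 pair and no_conflicts
is empty, so a check of this kind would flag the split RAID1 while
leaving assembly on matching event counters alone.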
 
(Thinking about it, it would be perfect if the array would instantly
go into read-only mode whenever it is degraded to a non-redundant
state.  That way there's a higher chance of assembling a working array
afterwards?)
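
A rough sketch of what I mean by that policy, as a plain decision
function.  The function names are made up for illustration and only
raid1 and raid4/5 are covered; nothing here reflects what the md driver
actually does.

```python
def tolerated_failures(level, raid_disks):
    """How many member failures the array survives (assumed levels only)."""
    if level == 1:
        return raid_disks - 1   # any single surviving mirror is enough
    if level in (4, 5):
        return 1                # single parity
    raise ValueError("level not covered by this sketch: %r" % (level,))


def should_go_read_only(level, raid_disks, working_disks):
    """True once the array has no redundancy left.

    That is, the number of failed members has reached the number the
    level can tolerate, so one more failure would lose data.
    """
    failed = raid_disks - working_disks
    return failed >= tolerated_failures(level, raid_disks)


raid5_one_lost = should_go_read_only(5, raid_disks=4, working_disks=3)
raid5_healthy = should_go_read_only(5, raid_disks=4, working_disks=4)
raid1_three_way_one_lost = should_go_read_only(1, raid_disks=3, working_disks=2)
raid1_two_way_one_lost = should_go_read_only(1, raid_disks=2, working_disks=1)
```

With this rule a 4-disk raid5 that loses one disk would switch to
read-only, while a 3-way raid1 that loses one mirror would stay
writable because it still has a redundant copy.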
