From: Molle Bestefich <molle.bestefich@gmail.com>
To: linux-raid@vger.kernel.org
Subject: Re: interesting failure scenario
Date: Mon, 4 Apr 2005 09:22:55 +0200
Message-ID: <62b0912f05040400227dab7428@mail.gmail.com>
In-Reply-To: <62b0912f050404001813448d3d@mail.gmail.com>
Michael Tokarev wrote:
> I just came across an interesting situation; here's the
> scenario.
[snip]
> Now we have an interesting situation. Both superblocks in d1
> and d2 are identical, event counts are the same, both are clean.
> Things which are different:
>   utime - on d1 it is "more recent" (provided we haven't touched
>           the system clock, of course)
>   on d1, d2 is marked as faulty
>   on d2, d1 is marked as faulty.
>
> Neither of these conditions is checked by mdadm.
>
> So mdadm just starts a clean RAID1 array composed of two drives
> with different data on them. And no one notices this (fsck,
> which is reading from one disk, passes) until some time later, when
> some app reports data corruption (reading from the other disk); you
> go check what's going on, find no data corruption (reading
> from the 1st disk), suspect memory and... it's quite a long list of
> possible bad stuff which can go on here... ;)
>
> The above scenario is just a theory, but the theory with some quite
> non-null probability. Instead of hotplugging the disks, one can do
> a reboot having flaky ide/scsi cables or whatnot, so that disks will
> be detected on/off randomly...
>
> It is probably a good idea to test utime too, in addition to the event
> counters, in mdadm's Assemble.c (as the comments say, but the code disagrees).
Hmm, please don't.

I rely all the time on MD assembling arrays whose event counters match
but whose utimes don't. It happens quite often that a controller
fails or something like that and you accidentally lose 2 disks in a
raid5.
I still want to be able to force the array to be assembled in these cases.
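
To make the trade-off concrete, here is a minimal sketch of a check that reports the two conditions Michael describes (differing utime, members marking each other faulty) while still letting a force flag override the refusal. Everything in it is hypothetical: the `Superblock` class, its fields, and `check_members` are illustrative names, not mdadm's structures or the logic in Assemble.c, and event counts are assumed to have been compared already.

```python
from dataclasses import dataclass, field


@dataclass
class Superblock:
    """Hypothetical summary of one member's superblock (not mdadm's struct)."""
    name: str
    events: int                 # event counter, assumed already compared
    utime: int                  # last-update time, seconds since the epoch
    faulty: set = field(default_factory=set)  # members this copy marks faulty


def check_members(members, force=False):
    """Return (ok_to_assemble, warnings) for a list of Superblock summaries."""
    warnings = []

    # Condition 1: the superblocks were last updated at different times.
    if len({m.utime for m in members}) > 1:
        warnings.append("utime differs")

    # Condition 2: a member we are about to use is marked faulty
    # in another member's copy of the superblock.
    names = {m.name for m in members}
    for m in members:
        for other in sorted(m.faulty & names):
            warnings.append(f"{m.name} marks {other} as faulty")

    # Refuse by default when anything looks odd, but let force override it.
    return (not warnings or force), warnings
```

With d1 and d2 from the scenario above, `check_members([d1, d2])` would refuse and list all three warnings, while `check_members([d1, d2], force=True)` would still allow assembly and leave the warnings for the admin to read.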
I'm still on 2.4 btw, don't know if there's a better way to do it in
2.6 than manipulating the event counters.
(Thinking about it, it would be perfect if the array would instantly
go into read-only mode whenever it is degraded to a non-redundant
state. That way there's a higher chance of assembling a working array
afterwards?)
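
As a rough sketch of what "degraded to a non-redundant state" could mean, assuming the usual failure tolerances per RAID level (the function names and the level strings are made up for illustration and are not taken from the md driver):

```python
def tolerated_failures(level, raid_disks):
    """How many member failures the array can absorb without losing data."""
    if level == "raid1":
        return raid_disks - 1   # any single surviving mirror is enough
    if level == "raid5":
        return 1
    if level == "raid6":
        return 2
    return 0                    # raid0/linear/unknown: no redundancy at all


def should_go_read_only(level, raid_disks, failed):
    """True once the array is degraded and one more failure would lose data."""
    return failed > 0 and failed >= tolerated_failures(level, raid_disks)
```

Under these assumptions a raid5 that loses one disk, or a two-disk raid1 that loses one mirror, would switch to read-only, while a raid6 with a single failed disk would keep accepting writes.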
Thread overview: 3+ messages
2005-04-03 21:59 interesting failure scenario Michael Tokarev
[not found] ` <62b0912f050404001813448d3d@mail.gmail.com>
2005-04-04 7:22 ` Molle Bestefich [this message]
2005-04-04 22:15 ` Luca Berra