From: NeilBrown <neilb@suse.de>
To: Martin Wilck <mwilck@arcor.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: RFC: handling of missing disks in DDF
Date: Thu, 8 Aug 2013 09:04:23 +1000
Message-ID: <20130808090423.4b806828@notabene.brown>
In-Reply-To: <5202B633.3080807@arcor.de>
On Wed, 07 Aug 2013 23:03:47 +0200 Martin Wilck <mwilck@arcor.de> wrote:
> Hi Neil, everyone,
>
> I'd like to get comments on the following concept for handling missing
> disks. Currently we handle "Missing" and "Failed" almost equally, which is
> not optimal.
>
> 1. Try to detect missing disks and mark them so in the meta data.
> ddf_open_new() could do this. At the latest, it must be done before
> writing the meta data (write_super or sync_metadata).
>
> 2. Detection of a missing disk should not force an immediate meta data
> write, because the disk may be added soon. It's sufficient to mark a
> disk as missing when we write the meta data for other reasons.
>
> 3. Try to be smart when loading headers. A disk that was missing in a
> previous run must have a lower seq number and time stamp in the meta
> data than the other disks, and must be marked "Missing" there (but not
> "Failed"). The meta data on the previously missing disk should mark it
> as "active/Online". In this case, use the newer meta data, and try to
> re-integrate the previously missing disk (i.e. treat it as preferred
> global spare).
>
> 4. It's possible that an array wasn't written to while a disk had been
> missing. In that case the disk could be re-added to the array without
> recovery. The question is if we can detect this situation. I thought
> first the "Not Consistent" bit might be usable for that, but I doubt it
> now. The spec isn't clear about whether "consistent" state means
> consistency over all configured disks or only over present ones.
>
Hi Martin.
Thanks for looking into this.
Can you say exactly why treating 'Missing' like 'Failed' is not
optimal? I find them to be very similar concepts.
Your idea for marking a missing device that reappears as a preferred spare is
probably a good one - is that incompatible with treating it as failed while
it isn't present?
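For reference, the check described in point 3 could be sketched roughly as
below. This is only an illustration under stated assumptions: struct
meta_view, its fields and was_merely_missing() are hypothetical names, not
mdadm code and not the DDF on-disk layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical summary of one metadata copy, reduced to what point 3
 * needs. Field names are illustrative, not taken from the DDF spec.
 */
struct meta_view {
	uint32_t seq;        /* sequence number of this metadata copy */
	uint32_t timestamp;  /* time stamp of this metadata copy */
	bool says_missing;   /* this copy marks the candidate disk "Missing" */
	bool says_failed;    /* this copy marks the candidate disk "Failed" */
	bool says_online;    /* this copy marks the candidate disk active/Online */
};

/*
 * 'own' is the metadata found on the candidate disk itself, 'newest' is
 * the most recent metadata found on the other disks. Returns true when
 * the candidate looks like a disk that was only missing, never failed,
 * and so could be treated as a preferred spare.
 */
static bool was_merely_missing(const struct meta_view *own,
			       const struct meta_view *newest)
{
	assert(own && newest);
	return own->seq < newest->seq &&
	       own->timestamp < newest->timestamp &&
	       newest->says_missing && !newest->says_failed &&
	       own->says_online;
}
```

Note that nothing in this test requires the disk to be kept as 'Missing'
rather than 'Failed' while it is absent, other than the !says_failed term.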
As soon as we mark an array as active (aka Not Consistent) we need to mark
any missing devices as 'failed' in some way to ensure that the data is never
seen as valid in the array. Before that we can certainly be lazy about
translating 'missing' to 'failed'...
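A minimal sketch of that rule, assuming a made-up in-memory model (enum
disk_state, struct array_model and array_set_active() are hypothetical and
do not exist in mdadm): 'missing' is left alone until the array is marked
active, and only at that point is it turned into 'failed'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DISKS 8

/* Hypothetical per-disk states; not the DDF state bit layout. */
enum disk_state { DISK_ONLINE, DISK_MISSING, DISK_FAILED };

struct array_model {
	bool active;                     /* true once marked active (Not Consistent) */
	size_t ndisks;
	enum disk_state disk[MAX_DISKS];
};

/*
 * Mark the array active. Any disk still 'missing' is converted to
 * 'failed' first, so its stale data can never be seen as valid.
 * Returns the number of disks converted.
 */
static size_t array_set_active(struct array_model *a)
{
	size_t converted = 0;

	assert(a->ndisks <= MAX_DISKS);
	for (size_t i = 0; i < a->ndisks; i++) {
		if (a->disk[i] == DISK_MISSING) {
			a->disk[i] = DISK_FAILED;
			converted++;
		}
	}
	a->active = true;
	return converted;
}
```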
Could you give some more detail on the problem scenario?
Thanks,
NeilBrown