From: Doug Ledford <dledford@redhat.com>
To: Phillip Susi <psusi@cfl.rr.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: safe segmenting of conflicting changes
Date: Mon, 26 Apr 2010 14:05:43 -0400
Message-ID: <4BD5D5F7.208@redhat.com>
In-Reply-To: <4BD5D20B.9020900@cfl.rr.com>
On 04/26/2010 01:48 PM, Phillip Susi wrote:
> On 4/26/2010 12:59 PM, Doug Ledford wrote:
>>> This goes ahead and adds the disk back to the array, despite the fact
>>> that it has been explicitly removed.
>>
>> Of course it does. You've just explicitly readded it, which is no
>> different than your explicit removal. Mdadm honors both.
>
> No, --incremental is automatically invoked by udev to scan disks as they
> are detected and try to assemble them. It isn't an explicit --add
> operation.
So, the point of RAID is to be as reliable as possible: if the disk that
was once gone is now back, we want to use it if possible.
>>> Whether or not the superblock on sdb is updated when it is removed,
>>> --incremental should NOT use it as long as mdadm -D /dev/md0 says that
>>> disk is removed, at least not use it in /dev/md0.
>>
>> Why not? It's not like it uses it without correcting the missing bits
>> first. My guess is that you've either A) got a write intent bitmap or
>
> Actually under the right circumstances it DOES use the second disk's
> incorrect data without correcting it first,
A problem for which I suggested a specific fix in another email.
> and if it does overwrite it,
> that causes data loss so should not be done without an explicit --add
> --force. The fact that this happens is the entire reason for this thread.
The problem is the cause of this thread, and it's a bug that should be
fixed. It should not cause us to require an explicit --add --force to
use a previously failed drive. This is a case of reacting to a bug by
disabling a useful aspect of the stack instead of simply fixing the
bug, IMO. When the raid stack thinks that things are
out of sync, it doesn't automatically do anything bad. The real bug
here is that there is a way to get things out of sync behind the raid
stack's back.
> Whether or not it can be added safely, the disk has been explicitly
> removed so automatically adding it back is not acceptable.
The md raid stack makes no distinction between an explicit removal and a
device that disappeared because of a glitch in a USB cable or some such.
In both cases the drive is failed and removed. So the fact that you
draw a distinction is irrelevant until such time as the raid superblocks
are changed to be able to encode the cause of a device being removed in
addition to the fact that it simply was removed.
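
To make that last point concrete, here is a minimal sketch in Python. It
is not md or mdadm code. The RemovalCause field, the MemberRecord type
and the may_auto_readd() helper are all hypothetical names invented for
illustration; they model what a superblock would have to record before
automatic assembly could treat an explicit removal differently from a
glitch.

```python
from dataclasses import dataclass
from enum import Enum


class RemovalCause(Enum):
    """Hypothetical field; md superblocks do not record this today."""
    UNKNOWN = "unknown"    # removed, cause not recorded (the current situation)
    EXPLICIT = "explicit"  # an administrator asked for the removal


@dataclass
class MemberRecord:
    """Hypothetical per-device record, standing in for superblock state."""
    device: str
    removed: bool
    cause: RemovalCause = RemovalCause.UNKNOWN


def may_auto_readd(record: MemberRecord) -> bool:
    """Decide whether automatic (udev-driven) assembly may use the device."""
    if not record.removed:
        return True
    # Only a recorded explicit removal blocks the automatic re-add.
    # With no recorded cause, a pulled cable and a deliberate removal
    # look identical, so the device is treated as usable.
    return record.cause is not RemovalCause.EXPLICIT


glitch = MemberRecord("sdb", removed=True)
explicit = MemberRecord("sdb", removed=True, cause=RemovalCause.EXPLICIT)
print(may_auto_readd(glitch), may_auto_readd(explicit))
```

Without the extra field, every removed device falls into the UNKNOWN
case, which is why the distinction cannot be acted on at present.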
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband