From: Alexander Lyakas <alex.bolshoy@gmail.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Split-Brain Protection for MD arrays
Date: Thu, 15 Dec 2011 16:29:12 +0200
Message-ID: <CAGRgLy7oCm87HRTK_FS6g80g5MQzSvto10T0PH0vE022XMsn-w@mail.gmail.com>
In-Reply-To: <20111215140252.2f9bb986@notabene.brown>
Neil,
thanks for the review and for the detailed answers to my questions.
> When we mark a device 'failed' it should stay marked as 'failed'. When the
> array is optimal again it is safe to convert all 'failed' slots to
> 'spare/missing' but not before.
I did not fully follow that reasoning. By "slot", do you mean the
index in the dev_roles[] array? If so, I don't see why the index
matters, as opposed to the value of the entry itself (the "role" in
your terminology).
Currently, 0xFFFE means both "failed" and "missing", and that makes
perfect sense to me. Basically this means that this entry of
dev_roles[] is unused. When a device fails, it is kicked out of the
array, so its entry in dev_roles[] becomes available.
(You once mentioned that for older arrays, the dev_roles[] index was
also the role; perhaps you are concerned about those too.)
In any case, I will be watching for changes in this area, if you
decide to make them (although I think this might break backwards
compatibility, unless a new superblock version is used).
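To make the index-versus-value point concrete, here is a minimal sketch of my reading of dev_roles[]. It is a toy model, not the kernel structure: the helper names fail_device and free_entries are hypothetical, and only the 0xFFFE value is taken from the discussion above.

```python
# Toy model of dev_roles[]: the list index is the "slot", the value is the "role".
# ROLE_FAULTY is the 0xFFFE value discussed above; the helpers are hypothetical.
ROLE_FAULTY = 0xFFFE  # means both "failed" and "missing": the entry is unused


def fail_device(dev_roles, index):
    """Return a copy of dev_roles with the entry at `index` marked failed."""
    roles = list(dev_roles)
    roles[index] = ROLE_FAULTY
    return roles


def free_entries(dev_roles):
    """Indexes whose entry is unused and could be handed to another device."""
    return [i for i, role in enumerate(dev_roles) if role == ROLE_FAULTY]


# Three devices; here the index happens to equal the role, as in older arrays.
dev_roles = [0, 1, 2]
after_failure = fail_device(dev_roles, 1)
```

In this model, once the entry holds 0xFFFE nothing distinguishes a device that failed from one that was never there, which is exactly the ambiguity under discussion.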
> If you have a working array and you initiate a write of a data block and the
> parity block, and if one of those writes fails, then you no longer have a
> working array. Some data blocks in that stripe cannot be recovered.
> So we need to make sure that admin knows the array is dead and doesn't just
> re-assemble and think everything is OK.
I see your point. I don't know which is better: to know the "last
known good" configuration, or to know that the array has failed. I
guess I am just used to the former.
> I think to resolve this issue we need 2 thing.
>
> 1/ when assembling an array if any device thinks that the 'chosen' device has
> failed, then don't trust that devices.
I think that if any device thinks that "chosen" has failed, then one
of two things holds. Either that device has a more recent superblock,
in which case it should be "chosen" instead of the other. Or the
"chosen" device's superblock is the one that counts, in which case it
doesn't matter what the other device thinks, because the array will be
assembled according to the "chosen" superblock.
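A rough sketch of this argument, under stated assumptions: "more recent" is modelled with an event counter, dev_roles is keyed by device name rather than by index, and the names Superblock, choose and dissenters are all hypothetical. None of this is actual mdadm code.

```python
from dataclasses import dataclass

FAULTY = 0xFFFE  # role value meaning failed/missing


@dataclass
class Superblock:
    device: str
    events: int      # assumption: a higher count means a more recent superblock
    dev_roles: dict  # simplification: device name -> role


def choose(superblocks):
    """Pick the most recent superblock as "chosen"."""
    return max(superblocks, key=lambda sb: sb.events)


def dissenters(superblocks):
    """Devices whose own superblock marks the chosen device as failed."""
    chosen = choose(superblocks)
    return [sb.device for sb in superblocks
            if sb.dev_roles.get(chosen.device) == FAULTY]


# A dropped out first, so only B recorded the failure and kept counting events.
sb_a = Superblock("A", events=10, dev_roles={"A": 0, "B": 1})
sb_b = Superblock("B", events=12, dev_roles={"A": FAULTY, "B": 1})
chosen = choose([sb_a, sb_b])
```

In this example the device that recorded the failure is also the most recent one, so it becomes "chosen" and no other device dissents.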
> 2/ Don't erase 'failed' status from dev_roles[] until the array is
> optimal.
Neil, I think these two points don't resolve the following simple
scenario: RAID1 with drives A and B. Drive A fails, and the array
continues to operate on drive B. After reboot, only drive A is
accessible. If we go ahead and assemble, we will see stale data. If,
however, after reboot we see only drive B, then (since A is "faulty"
in B's superblock) we can go ahead and assemble. The change I
suggested will abort in the first case, but will assemble in the
second case.
But obviously, you know better what MD users expect and want.
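The two-drive scenario can be sketched as follows. This is only one way to express the check, inferred from the two cases above: assemble only if every device that the chosen superblock still regards as active is actually present. The function name safe_to_assemble and the name-keyed role maps are hypothetical.

```python
FAULTY = 0xFFFE  # role value meaning failed/missing


def safe_to_assemble(chosen_roles, present):
    """True if every device the chosen superblock lists as active is present.

    chosen_roles: device name -> role, as recorded in the chosen superblock.
    present: names of the devices found at assembly time.
    """
    expected = {dev for dev, role in chosen_roles.items() if role != FAULTY}
    return expected <= set(present)


# A's superblock was last written before A failed, so it still lists B as active.
roles_seen_by_a = {"A": 0, "B": 1}
# B kept running and recorded A's failure.
roles_seen_by_b = {"A": FAULTY, "B": 1}

only_a_visible = safe_to_assemble(roles_seen_by_a, {"A"})  # stale side
only_b_visible = safe_to_assemble(roles_seen_by_b, {"B"})  # up-to-date side
```

With only A visible the check refuses, because B is expected but absent and may hold newer data. With only B visible it passes, because B's superblock no longer expects A.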
Thanks again for taking the time to review the proposal! And yes, next
time I will put everything in the email.
Alex.