From: Vincent Pelletier <plr.vincent@gmail.com>
To: Alexander Lyakas <alex.bolshoy@gmail.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Split-Brain Protection for MD arrays
Date: Mon, 12 Dec 2011 21:18:28 +0100
Message-ID: <201112122118.28351.plr.vincent@gmail.com>
In-Reply-To: <CAGRgLy6=-naSGJw_tgiD5=ab7gWxyeQ2ysu-yCKa064Jih+cfA@mail.gmail.com>
On Monday 12 December 2011 19:51:23, you wrote:
> split-brain
I'm participating in the NEO[1] project (object database server with
redundancy - that last bit is the one relevant to this discussion), which
faces the same kind of problem (storage nodes dying whether or not the cluster
is functional, dead nodes coming back to life later, etc). So we had to design
some countermeasures to handle split-brain.
I'm happy to recognise some equivalents of the decisions we took on NEO, and
I'll be following this thread with attention (we haven't sought much review
of our design so far).
I would suggest one thing:
Use a fixed increment for the "metadata version" number. Time representation is
not reliable IMHO, especially at times when you need to set up an array:
faulty BIOS battery, old RTC drifting either way, no NTP to correct this
(either none available or no client to access one).
If the timestamp is affected by timezone (and especially DST), that makes
matters worse.
Admittedly, a fixed increment exposes the user to problems if he decides to
independently run two halves of a split brain, lets their data diverge,
reaches a point (controllable) where the version number is at some
convenient value, and then lets the array assemble itself and burst into flames.
Still, the user has to jump through hoops to get there. With timestamps, a
non-monotonic RTC is enough to cause the same failure.
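To make the idea concrete, here is a minimal sketch in Python. It is only an
illustration under my own assumptions: Member, record_change and freshest are
made-up names, and this is neither MD's nor NEO's actual code. Each member
carries a counter bumped by exactly one on every metadata change, and assembly
trusts the members holding the highest value.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical array member carrying a fixed-increment metadata version."""
    name: str
    version: int = 0

    def record_change(self) -> None:
        # Fixed increment: no clock, timezone or DST involved.
        self.version += 1


def freshest(members):
    """Return the members holding the highest metadata version."""
    top = max(m.version for m in members)
    return [m for m in members if m.version == top]


a, b = Member("a"), Member("b")
a.record_change()
b.record_change()      # both members are at version 1
a.record_change()      # "b" has dropped out, only "a" advances to 2
print([m.name for m in freshest([a, b])])   # ['a']

# The hazard: "b" is then run on its own until its counter also reaches 2.
b.record_change()
print([m.name for m in freshest([a, b])])   # ['a', 'b']
```

The second result is the bad case: both halves look equally fresh although
their contents have diverged, so equal counters alone do not prove that two
members hold the same data.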
Side note: if anyone knows a time source available to userland which is not
affected by date/ntpd/ntpdate nor timezones nor DST (but can drift when
computer is powered down - but if possible not when suspended), please tell
me.
[1] http://pypi.python.org/pypi/neoppod
Regards,
--
Vincent Pelletier