From: Lars Marowsky-Bree <lmb@suse.de>
To: Philipp Reisner <philipp.reisner@linbit.com>, drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] Re: drbd Frage zu secondary vs primary; drbddisk status problem
Date: Wed, 25 Aug 2004 12:28:52 +0200
Message-ID: <20040825102852.GS3125@marowsky-bree.de>
In-Reply-To: <200408251142.18807.philipp.reisner@linbit.com>
On 2004-08-25T11:42:18,
Philipp Reisner <philipp.reisner@linbit.com> said:
> So, the current policy is:
> * The primary node refuses to connect to a peer with higher generation
> counts. This keeps the data intact. This is closely related to the other
> after-split-brain-policy I want to make explicit.
Makes sense.
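A minimal sketch of that rule, for illustration only. It assumes the
generation counts can be compared as a tuple, most significant counter
first; that layout, and the names GenCounts and primary_may_connect, are
hypothetical and not drbd's actual format or code.

```python
from typing import Tuple

# Hypothetical representation: counters ordered most-significant first,
# so Python's tuple comparison gives "higher generation" directly.
GenCounts = Tuple[int, ...]


def primary_may_connect(local_is_primary: bool,
                        local_gen: GenCounts,
                        peer_gen: GenCounts) -> bool:
    """Return False when a primary would connect to a peer whose
    generation counts are higher than its own (the peer might then be
    chosen as sync source, overwriting data under the primary)."""
    if local_is_primary and peer_gen > local_gen:
        return False
    return True
```

A secondary is never refused by this check; only a node that is currently
primary protects its data this way.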
> * Remember the options so far: (for primary-after-split-brain)
>
> - The node that was primary before split brain (current behaviour)
> - The node that became primary during split brain
> - The node that modified more of its data during the split-brain
> situation [ Do not think about implementation yet, just about
> the policy ]
> - None, wait for operator's decision. [suggested by LMB]
> - Node that is currently primary [see example above by LGE]
Minor clarification: I think the question is not about "Who becomes
primary", as the Sync* is decoupled from that status, but about which
side drbd deems to have the good data and thus makes the SyncMaster.
Looking at it from this angle, we have two dimensions:
- Node state after the split-brain heals. Each side can either be
primary or secondary.
- The data state on each side.
Now, obviously, if the node state of both sides is "primary", drbd can't
do anything automatically. It can't resolve this internally, because
changing the data on either side would destroy the layers above.
-> _MUST_ wait for operator intervention.
(Embedded environments with a dumb cluster manager... Hmm... Ok, maybe
crashing one side (which inherently stops the higher layers and triggers
recovery) and thus reducing the problem to one of the somewhat simpler
ones below might work...)
If only one side is primary, and the algorithms determine that this one
has the good data, and the other side has not touched the data in
between, this is also a simple case.
If both sides are secondary, but only one side has modified the data or
been primary since, again it's simple.
If one side is primary, but the other side has been primary in between
(but not at the time of the connect), drbd can either wait for a
higher-level intervention, or sync the now-secondary. Only two options,
nothing else makes sense. (Changing the data underneath the primary
strikes me as an exceptionally bad idea.)
If both sides are secondary, but both sides have modified the data
since, then we have several choices, like picking the most recent
(timestamp?), the most data modified, flipping a coin, or again waiting
for admin intervention.
(Personally, I'd say operator intervention, after very careful
consideration of the problem, is in fact the only choice; this scenario
is only reached by a combination of several _severe_ faults.)
A special case obviously exists if one secondary side has inconsistent
data and the other has a consistent snapshot, in which case it is a
somewhat safer assumption to sync automatically from the consistent to
the inconsistent side. This should be the default, but may be
configurable...
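
To make the case analysis above concrete, here is a rough sketch of it as
a decision function. This is only an illustration under stated
assumptions, not drbd code: the names Side, Action and resolve, the
auto_sync_diverged_secondary switch, and the NO_RESYNC result for the
"nobody touched anything" case (which the cases above do not cover) are
all made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    WAIT_FOR_OPERATOR = "wait for operator"
    SYNC_FROM_A = "A is SyncMaster"
    SYNC_FROM_B = "B is SyncMaster"
    NO_RESYNC = "nothing to do"  # assumption: case not discussed above


@dataclass
class Side:
    primary_now: bool        # node state when the split-brain heals
    modified: bool           # modified data or was primary in between
    consistent: bool = True  # False if the local data is inconsistent


def resolve(a: Side, b: Side,
            auto_sync_diverged_secondary: bool = False) -> Action:
    # Both primary: never resolve automatically.
    if a.primary_now and b.primary_now:
        return Action.WAIT_FOR_OPERATOR

    # Exactly one primary: the primary is never a sync target.
    if a.primary_now or b.primary_now:
        secondary = b if a.primary_now else a
        from_primary = Action.SYNC_FROM_A if a.primary_now else Action.SYNC_FROM_B
        if not secondary.modified:
            return from_primary  # simple case
        # Secondary was primary in between: wait, or sync it (policy).
        return from_primary if auto_sync_diverged_secondary else Action.WAIT_FOR_OPERATOR

    # Both secondary. Consistent side wins over an inconsistent one.
    if a.consistent != b.consistent:
        return Action.SYNC_FROM_A if a.consistent else Action.SYNC_FROM_B
    # Only one side changed: that side has the good data.
    if a.modified != b.modified:
        return Action.SYNC_FROM_A if a.modified else Action.SYNC_FROM_B
    # Both changed: operator intervention is the conservative choice.
    if a.modified and b.modified:
        return Action.WAIT_FOR_OPERATOR
    return Action.NO_RESYNC
```

For example, resolve(Side(True, True), Side(False, True)) waits for the
operator by default, and returns SYNC_FROM_A only if
auto_sync_diverged_secondary is set; it never picks the secondary as
SyncMaster over a running primary.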
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering \\\ ///
SUSE Labs, Research and Development \honk/
SUSE LINUX AG - A Novell company \\//