From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] Another drbd race
Date: Tue, 7 Sep 2004 14:05:02 +0200
Message-ID: <20040907120502.GA12518@nudl>
In-Reply-To: <200409071332.02477.philipp.reisner@linbit.com>
On Tue, Sep 07, 2004 at 01:32:02PM +0200, Philipp Reisner wrote:
> > I would like to introduce an additional Node state for the o_state:
> > Dead. it is never "recognized" internally, but can be set by the
> > operator or cluster manager. basically, if we go to WhatEver/Unknown,
> > we don't accept anything (since we don't want to risk split brain).
> > some higher authority can and needs to resolve this, telling us the peer
> > is dead (after a successful stonith, when we are Secondary and shall be
> > promoted).
> >
> >
> > now we have this:
> > P/S --- S/P
> > P/? -:- S/?
> >
> > A)
> > if this is in fact (from the pov of heartbeat)
> > P/? -.. XXX
> > we stonith it (just to be sure) and tell it "peer dead"
> > P/D -..
> > (and there it resumes).
> >
> > B)
> > if this is in fact (from the pov of heartbeat)
> > P/? XXX S/?
> > - we do nothing
> > (blocks until network is fixed again)
> > - we tell S that it is outdated,
> > then tell P to resume
> > - or we make it (by STONITH) into either A or C
> >
> > C)
> > if this is in fact (from the pov of heartbeat)
> > XXX ..- S/?
> > we stonith it (just to be sure) and tell it "peer dead"
> > XXX ..- S/D
> > (and there it accepts to be promoted again).
> >
> >
> > similar after bootup:
> > we refuse to be promoted to Primary from Secondary/Unknown,
> > unless we got an explicit "peer dead" confirmation by someone.
> >
> > does that make any sense?
> >
>
> I like it a lot!
>
> Thus we will not call it "drbdadm resume-io r0" but
> "drbdadm peer-dead r0"
>
> I think the assertion that the peer is dead
> (short "peer-dead") is a lot easier to understand than
> a "resume-io" command.
>
>
> Also the question at the startup-user-dialog:
>
> Is the peer dead ?
>
> Is easier to get right....
maybe we still need to make this a two-stage process:
after a reboot, if we remain in Secondary/Unknown,
we need to be told "peer dead", but we also need to get the confirmation
"up-to-date" (just to cover our ass).
when it was just a connection loss, we *are* up-to-date, and just need the
confirmation "peer dead"; or we get the confirmation "link dead, peer
alive", which basically means "you are outdated!".
that way we cannot be blamed for "automatically losing transactions",
even in a multiple failure scenario.
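
to make the two cases concrete, here is a rough sketch of the promotion
check in python. this is not drbd code; all names (Origin, Confirmations,
may_promote) are made up for illustration, and it assumes the confirmations
arrive as plain flags set by the operator or cluster manager.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    """How we ended up without a peer (hypothetical names)."""
    REBOOT = "reboot"                    # came up in Secondary/Unknown
    CONNECTION_LOSS = "connection-loss"  # were connected, then lost the peer


@dataclass
class Confirmations:
    """What some higher authority has told us so far (hypothetical)."""
    peer_dead: bool = False    # "peer dead"
    up_to_date: bool = False   # "up-to-date"
    peer_alive: bool = False   # "link dead, peer alive" == "you are outdated!"


def may_promote(origin: Origin, conf: Confirmations) -> bool:
    """Return True if promotion to Primary would be accepted."""
    if conf.peer_alive:
        # the peer may have moved on without us: we are outdated
        return False
    if not conf.peer_dead:
        # nobody confirmed anything: refuse rather than risk split brain
        return False
    if origin is Origin.REBOOT:
        # second stage: after a reboot we cannot know we have the latest data
        return conf.up_to_date
    # plain connection loss: we *are* up-to-date, "peer dead" is enough
    return True
```

after a reboot, may_promote(Origin.REBOOT, Confirmations(peer_dead=True))
is still False until up_to_date is set as well; after a plain connection
loss the same confirmation alone is sufficient.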
lge