From: David Teigland <teigland@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [DLM PATCH] dlm_controld: handle the case of network transient disconnection
Date: Thu, 12 May 2016 11:51:14 -0500
Message-ID: <20160512165114.GB13651@redhat.com>
In-Reply-To: <1463044568-19583-1-git-send-email-zren@suse.com>
On Thu, May 12, 2016 at 05:16:08PM +0800, Eric Ren wrote:
> DLM would be stuck in "need fencing" state, although cluster can
> regain quorum very quickly after a network transient disconnection.
>
> It's possible that this process happens within one monoclock. It
> means "cluster_quorate_monotime" can equal "node->daemon_rem_time".
> We now skip this chance of telling corosync to kill cluster for
> stateful merge. As a result, any fencing cannot proceed further.
Hi Eric, thanks for looking at this, it's a notoriously difficult
situation to sort out. I'm not sure we have the same understanding of how
the behavior will change with your patch, so let's look at an example, and
please let me know if you think these examples don't match what you see
(it's been quite a while since I actually tested this).
T = time in seconds, A,B,C = cluster nodes.

At T=1 A,B,C become members and have quorum.
At T=10 a partition creates A,B | C.
At T=11 it merges and creates A,B,C.

At T=12, A,B will have:
  cluster_quorate=1
  cluster_quorate_monotime=1
  C->daemon_rem_time=10

At T=12, C will have:
  cluster_quorate=1
  cluster_quorate_monotime=11
  A->daemon_rem_time=10
  B->daemon_rem_time=10

Result:

A,B will kick C from the cluster because
cluster_quorate_monotime (1) < C->daemon_rem_time (10),
which is what we want.

C will not kick A,B from the cluster because
cluster_quorate_monotime (11) > A->daemon_rem_time (10),
which is what we want.
It's the simpler case, but does that sound right so far?
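To make the comparison in this example concrete, here is a minimal sketch of the pre-patch condition from the quoted receive_protocol() hunk, pulled out into a standalone function. The helper name should_kick() and the use of plain integer seconds are assumptions for illustration; this is not how dlm_controld is actually structured.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the pre-patch condition:
 *   cluster_quorate && node->daemon_rem_time &&
 *   cluster_quorate_monotime < node->daemon_rem_time
 * Times are monotonic seconds. A zero daemon_rem_time skips the check,
 * as in the quoted condition.
 */
static int should_kick(int cluster_quorate, uint64_t quorate_monotime,
                       uint64_t daemon_rem_time)
{
    return cluster_quorate && daemon_rem_time &&
           quorate_monotime < daemon_rem_time;
}
```

With the values above, should_kick(1, 1, 10) is true for A,B looking at C, and should_kick(1, 11, 10) is false for C looking at A or B.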
...
If the partition and merge occur within the same second, then:

At T=1 A,B,C become members and get quorum.
At T=10 a partition creates A,B | C.
At T=10 it merges and creates A,B,C.

At T=12, A,B will have:
  cluster_quorate=1
  cluster_quorate_monotime=1
  C->daemon_rem_time=10

At T=12, C will have:
  cluster_quorate=1
  cluster_quorate_monotime=10
  A->daemon_rem_time=10
  B->daemon_rem_time=10

Result:

A,B will kick C from the cluster because
cluster_quorate_monotime (1) < C->daemon_rem_time (10),
which is what we want.

C will not kick A,B from the cluster because
cluster_quorate_monotime (10) = A->daemon_rem_time (10),
which is what we want.

If that's correct, there doesn't seem to be a problem so far.
If we apply your patch, won't it allow C to kick A,B from the
cluster since cluster_quorate_monotime = A->daemon_rem_time?
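The question above can be checked with a small sketch that evaluates the same-second example under both comparisons. The enum cmp_mode and the helper should_kick_mode() are hypothetical names introduced only for this illustration; the two branches correspond to the "<" (existing) and "<=" (patched) lines in the quoted diff.

```c
#include <stdint.h>

/* CMP_STRICT models the existing "<"; CMP_INCLUSIVE models the patched "<=". */
enum cmp_mode { CMP_STRICT, CMP_INCLUSIVE };

/* Hypothetical helper: would this node kick the peer under the given mode? */
static int should_kick_mode(enum cmp_mode mode, int cluster_quorate,
                            uint64_t quorate_monotime,
                            uint64_t daemon_rem_time)
{
    /* Both preconditions from the quoted condition must hold first. */
    if (!cluster_quorate || !daemon_rem_time)
        return 0;
    if (mode == CMP_INCLUSIVE)
        return quorate_monotime <= daemon_rem_time;
    return quorate_monotime < daemon_rem_time;
}
```

For C's view (cluster_quorate_monotime=10, A->daemon_rem_time=10), the strict mode returns 0 and the inclusive mode returns 1. For A,B's view (1 versus 10), both modes return 1.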
...
If you're looking at a cluster with an equal partition, e.g. A,B | C,D,
then it becomes messy because cluster_quorate_monotime = daemon_rem_time
everywhere after the merge. In this case, no nodes will kick others from
the cluster, but with your patch, each side will kick the other side from
the cluster. Neither option is good. In the past we decided to let the
cluster sit in this state so an admin could choose which nodes to remove.
Do you prefer the alternative of kicking nodes in this case (with somewhat
unpredictable results)? If so, we could make that an optional setting,
but we'd want to keep the existing behavior for non-even partitions in the
example above.
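The symmetry of the equal-partition case can be sketched as follows. The struct side_view, the function count_kicking_sides(), and the timestamp 10 are all assumptions chosen for illustration; the only property taken from the discussion is that cluster_quorate_monotime equals daemon_rem_time on every side after the merge.

```c
#include <stddef.h>
#include <stdint.h>

/* One side's view of the other side after the merge (hypothetical type). */
struct side_view {
    const char *side;            /* label only, e.g. "A,B" */
    int cluster_quorate;
    uint64_t quorate_monotime;
    uint64_t peer_rem_time;      /* daemon_rem_time recorded for the peers */
};

/* Count how many sides would kick the other; inclusive selects "<=". */
static int count_kicking_sides(const struct side_view *v, size_t n,
                               int inclusive)
{
    int kicks = 0;

    for (size_t i = 0; i < n; i++) {
        if (!v[i].cluster_quorate || !v[i].peer_rem_time)
            continue;
        if (inclusive ? v[i].quorate_monotime <= v[i].peer_rem_time
                      : v[i].quorate_monotime < v[i].peer_rem_time)
            kicks++;
    }
    return kicks;
}

/* A,B | C,D with equal timestamps on both sides (illustrative values). */
static const struct side_view even_split[] = {
    { "A,B", 1, 10, 10 },
    { "C,D", 1, 10, 10 },
};
```

With "<", count_kicking_sides(even_split, 2, 0) is 0, so the cluster sits and waits for an admin. With "<=", count_kicking_sides(even_split, 2, 1) is 2, so each side kicks the other.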
> diff --git a/dlm_controld/daemon_cpg.c b/dlm_controld/daemon_cpg.c
> index 356e80d..cd8a4e2 100644
> --- a/dlm_controld/daemon_cpg.c
> +++ b/dlm_controld/daemon_cpg.c
> @@ -1695,7 +1695,7 @@ static void receive_protocol(struct dlm_header *hd, int len)
> node->stateful_merge = 1;
>
> if (cluster_quorate && node->daemon_rem_time &&
> - cluster_quorate_monotime < node->daemon_rem_time) {
> + cluster_quorate_monotime <= node->daemon_rem_time) {
> if (!node->killed) {
> if (cluster_two_node) {
> /*
> --
> 2.6.6