From: Eric Ren <zren@suse.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful
Date: Fri, 20 May 2016 17:03:00 +0800
Message-ID: <573ED2C4.8010805@suse.com>
In-Reply-To: <20160518185029.GA5193@redhat.com>
Hi David,
On 05/19/2016 02:50 AM, David Teigland wrote:
> On Wed, May 18, 2016 at 02:53:00PM +0800, Eric Ren wrote:
>> Q1: what's stateful merged node?
>
>> Q2: what if we add the stateful merged nodes to dlm_controld daemon
>> cpg instead of fencing them?
>
> The details here are fundamental to the way dlm works because the dlm
> depends on the properties of Virtual Synchrony. Partitions obviously
> violate VS. ("Extended" forms of virtual synchrony deal with partitions,
> but they are not very practical. Unfortunately, corosync implements one
> of these extended forms of VS, which means any application that requires
> strict VS has to implement an equivalent of this "stateful merging"
> detection that's in the dlm.)
>
> With VS, message/membership events change the state being kept consistent
> among nodes. When a partition occurs, nodes have divergent events and
> inconsistent state. The partition is simple to understand, because
> partitioned nodes are indistinguishable from failed nodes and are treated
> as such. But, if partitioned nodes merge, the inconsistent state has to
> be made consistent. This must be done in the same way a new node is added
> to an existing node, which means doing "state transfer" from the existing
> node to the new node to make the state consistent between them.
>
> If the "new" node previously had state because of partition/merge, it must
> drop that old state and replace it with the state being transferred to it.
> After this, they will be consistent and can continue. With a simple
> process, you might just kill it, restart it and add the transferred state.
> But the dlm isn't a process that can simply be restarted; the state is
> spread through applications using it, and through the kernel. The only
> mechanism for resetting the dlm state is resetting the kernel, which is
> resetting/rebooting the machine.
>
>> if so, CPG $uuid now, e.g. from the perspective of A, may have only one
>> member - A itself; it can perform lockspace now because the cluster is
>> quorate now (and if we skip fencing); B and C do likewise; then to each
>> node, it looks like that node owns this volume; so corruption may happen?
>
> When the nodes are partitioned, the situation is fairly straightforward
> -- each node thinks the others are failed, and normal operation is blocked
> until recovery happens for the failed nodes.
>
> The harder problem is what to do when they merge. The dlm effectively
> ignores the invalid addition of the merged nodes and calls it a "stateful
> merge". The merged nodes continue to be considered failed (from the
> partition) and require a full restart before being added.
>
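As a rough illustration of the merge handling described above, here is a
minimal Python sketch. It is not the dlm_controld code: `Membership`,
`handle_join`, `handle_leave` and the `carries_old_state` flag are
hypothetical names invented for this example, and "fencing" is reduced to
leaving the node marked as failed until it rejoins without state.

```python
from dataclasses import dataclass


@dataclass
class Node:
    nodeid: int
    failed: bool = False  # this node's view: peer has failed or is partitioned away


class Membership:
    """Toy model of one node's view of its peers (hypothetical, for illustration)."""

    def __init__(self):
        self.nodes = {}

    def handle_leave(self, nodeid):
        # A partitioned peer is indistinguishable from a failed one,
        # so it is simply treated as failed.
        self.nodes[nodeid].failed = True

    def handle_join(self, nodeid, carries_old_state):
        if carries_old_state:
            # Stateful merge: the peer comes back still holding state from
            # before the partition. The addition is ignored and the peer
            # stays failed until it has been fully restarted.
            node = self.nodes.setdefault(nodeid, Node(nodeid))
            node.failed = True
            return "stateful-merge"
        # Clean join: the peer has no state of its own, so it can receive
        # the existing state via state transfer.
        self.nodes[nodeid] = Node(nodeid, failed=False)
        return "state-transfer"
```

In this model a peer that rejoins after a partition without having been
rebooted yields "stateful-merge" and remains failed; only after it comes
back with no old state does the join lead to "state-transfer".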
Thanks a lot for explaining this in such detail! I've also shared it
with the pacemaker guys. They'll make the corresponding changes on the
pcmk side once the dlm_controld patch is merged. I've sent the patch to
you. Please take a look ;-)
With best regards,
Eric