From: Eric Ren <zren@suse.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful
Date: Fri, 20 May 2016 17:03:00 +0800
Message-ID: <573ED2C4.8010805@suse.com>
In-Reply-To: <20160518185029.GA5193@redhat.com>
Hi David,
On 05/19/2016 02:50 AM, David Teigland wrote:
> On Wed, May 18, 2016 at 02:53:00PM +0800, Eric Ren wrote:
>> Q1: what's stateful merged node?
>
>> Q2: what if we add the stateful merged nodes to dlm_controld daemon
>> cpg instead of fencing them?
>
> The details here are fundamental to the way dlm works because the dlm
> depends on the properties of Virtual Synchrony. Partitions obviously
> violate VS. ("Extended" forms of virtual synchrony deal with partitions,
> but they are not very practical. Unfortunately, corosync implements one
> of these extended forms of VS, which means any application that requires
> strict VS has to implement an equivalent of this "stateful merging"
> detection that's in the dlm.)
>
> With VS, message/membership events change the state being kept consistent
> among nodes. When a partition occurs, nodes have divergent events and
> inconsistent state. The partition is simple to understand, because
> partitioned nodes are indistinguishable from failed nodes and are treated
> as such. But, if partitioned nodes merge, the inconsistent state has to
> be made consistent. This must be done in the same way a new node is added
> to an existing node, which means doing "state transfer" from the existing
> node to the new node to make the state consistent between them.
>
> If the "new" node previously had state because of partition/merge, it must
> drop that old state and replace it with the state being transferred to it.
> After this, they will be consistent and can continue. With a simple
> process, you might just kill it, restart it and add the transferred state.
> But the dlm isn't a process that can simply be restarted; the state is
> spread through applications using it, and through the kernel. The only
> mechanism for resetting the dlm state is resetting the kernel, which is
> resetting/rebooting the machine.
>
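The join rule described above can be sketched roughly as follows. This is only an illustrative model, not dlm code: the `Node` class, the `join` function and the `needs_reboot` flag are hypothetical names, and a plain dict stands in for the lock state that in reality is spread across applications and the kernel.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node and its replicated state."""
    name: str
    state: dict = field(default_factory=dict)
    needs_reboot: bool = False


def join(existing: Node, joiner: Node) -> bool:
    """Admit `joiner` by state transfer; return True if it was admitted.

    A joiner that still carries state (e.g. left over from a partition)
    is refused and marked as needing a full reset, because in this model
    old state cannot be replaced in place.
    """
    if joiner.state:
        joiner.needs_reboot = True
        return False
    # State transfer: the new node receives a copy of the existing state.
    joiner.state = dict(existing.state)
    return True
```

A node that starts empty is admitted and ends up with the same state as the existing node; a node that arrives with its own state is refused until it has been reset.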
>> if so, CPG $uuid now, e.g. from the perspective of A, may have only one
>> member - A itself; it can perform lockspace operations now because the
>> cluster is quorate now (and if we skip fencing); B and C do likewise; then,
>> to each node, it looks like it owns this volume; so corruption may happen?
>
> When the nodes are partitioned, the situation is fairly straightforward
> -- each node thinks the others are failed, and normal operation is blocked
> until recovery happens for the failed nodes.
>
> The harder problem is what to do when they merge. The dlm effectively
> ignores the invalid addition of the merged nodes and calls it a "stateful
> merge". The merged nodes continue to be considered failed (from the
> partition) and require a full restart before being added.
>
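A minimal sketch of that merge handling, under loudly stated assumptions: the `Membership` class and the `incarnation` counter (a number that changes only when a node goes through a full restart) are hypothetical and are not how dlm_controld actually detects the condition. The sketch only models the rule that a node which dropped out and reappears without a restart stays failed.

```python
class Membership:
    """Hypothetical membership tracker illustrating the 'stateful merge' rule."""

    def __init__(self, members):
        self.members = set(members)
        # node -> incarnation it had when it dropped out (assumed counter)
        self.failed = {}

    def node_left(self, node, incarnation):
        """A partitioned node is treated exactly like a failed node."""
        self.members.discard(node)
        self.failed[node] = incarnation

    def node_added(self, node, incarnation):
        """Return 'added', or 'stateful merge' if the add is ignored."""
        if self.failed.get(node) == incarnation:
            # Same incarnation: the node never restarted, so it still holds
            # old state. Ignore the addition and keep it marked as failed.
            return "stateful merge"
        # New or fully restarted node: safe to add.
        self.failed.pop(node, None)
        self.members.add(node)
        return "added"
```

With members A, B and C, if B drops out at incarnation 1 and reappears at incarnation 1, the add is ignored; once B comes back at incarnation 2 (after a reboot), it is added normally.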
Thanks a lot for explaining this in such detail! I've also
shared it with the Pacemaker guys. They'll make the corresponding changes
on the pcmk side once the dlm_controld patch is merged. I've sent the
patch to you. Please take a look ;-)
With best regards,
Eric