[Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful

From: Eric Ren <zren@suse.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful
Date: Fri, 20 May 2016 17:03:00 +0800
Message-ID: <573ED2C4.8010805@suse.com>
In-Reply-To: <20160518185029.GA5193@redhat.com>

Hi David,

On 05/19/2016 02:50 AM, David Teigland wrote:
> On Wed, May 18, 2016 at 02:53:00PM +0800, Eric Ren wrote:
>> Q1: what's a stateful merged node?
>
>> Q2: what if we add the stateful merged nodes to dlm_controld daemon
>> cpg instead of fencing them?
>
> The details here are fundamental to the way dlm works because the dlm
> depends on the properties of Virtual Synchrony.  Partitions obviously
> violate VS.  ("Extended" forms of virtual synchrony deal with partitions,
> but they are not very practical.  Unfortunately, corosync implements one
> of these extended forms of VS, which means any application that requires
> strict VS has to implement an equivalent of this "stateful merging"
> detection that's in the dlm.)
>
> With VS, message/membership events change the state being kept consistent
> among nodes.  When a partition occurs, nodes have divergent events and
> inconsistent state.  The partition is simple to understand, because
> partitioned nodes are indistinguishable from failed nodes and are treated
> as such.  But, if partitioned nodes merge, the inconsistent state has to
> be made consistent.  This must be done in the same way a new node is added
> to an existing node, which means doing "state transfer" from the existing
> node to the new node to make the state consistent between them.
>
> If the "new" node previously had state because of partition/merge, it must
> drop that old state and replace it with the state being transferred to it.
> After this, they will be consistent and can continue.  With a simple
> process, you might just kill it, restart it and add the transferred state.
> But the dlm isn't a process that can simply be restarted; the state is
> spread through applications using it, and through the kernel.  The only
> mechanism for resetting the dlm state is resetting the kernel, which is
> resetting/rebooting the machine.
>
>> if so, CPG $uuid now, e.g. from the perspective of A, may have only one
>> member - A itself; it can perform lockspace now because the cluster is
>> quorate now (and if we skip fencing); B and C do likewise; then to each
>> node, it looks like every node owns this volume; so corruption may happen?
>
> When the nodes are partitioned, the situation is fairly straightforward
> -- each node thinks the others are failed, and normal operation is blocked
> until recovery happens for the failed nodes.
>
> The harder problem is what to do when they merge.  The dlm effectively
> ignores the invalid addition of the merged nodes and calls it a "stateful
> merge".  The merged nodes continue to be considered failed (from the
> partition) and require a full restart before being added.
>
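
The behaviour described above can be modelled roughly as follows. This is
only a toy sketch in Python, not dlm_controld code: the names Node, Group,
join, partition and restart are all hypothetical, and "state" stands in
for whatever lock state a node has accumulated. It shows a partitioned
node being treated as failed, a merged node that still holds state being
refused ("stateful merge"), and a restarted node being added through
state transfer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical cluster node; not a dlm_controld structure."""
    nodeid: int
    # None means "freshly started, no state yet". Anything else means the
    # node carries state from an earlier membership.
    state: Optional[dict] = None


@dataclass
class Group:
    """Toy model of the side of a partition that keeps running."""
    state: dict = field(default_factory=dict)    # the group's consistent state
    members: dict = field(default_factory=dict)  # nodeid -> Node
    failed: set = field(default_factory=set)     # nodeids that need a restart

    def partition(self, nodeid: int) -> None:
        # A partitioned node is indistinguishable from a failed node.
        self.members.pop(nodeid, None)
        self.failed.add(nodeid)

    def join(self, node: Node) -> str:
        if node.state is not None:
            # The node still holds old (possibly divergent) state, so its
            # addition is invalid: ignore it and keep treating it as failed.
            self.failed.add(node.nodeid)
            return "stateful_merge"
        # A clean node receives a copy of the group's state (state transfer).
        node.state = dict(self.state)
        self.members[node.nodeid] = node
        self.failed.discard(node.nodeid)
        return "added"


def restart(node: Node) -> Node:
    # Stand-in for rebooting the machine: the old state is gone.
    return Node(nodeid=node.nodeid)
```

In this model a node that rejoins after a partition gets "stateful_merge"
from join() until it has gone through restart(), which mirrors the
requirement of a full restart before the node is added again.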

Thanks a lot for explaining this to me in such detail! I've also
shared it with the pacemaker guys. They'll make the corresponding
changes on the pcmk side once the dlm_controld patch is merged. I've
sent the patch to you. Please take a look ;-)

With best regards,
Eric

Thread overview: 5+ messages
2016-05-17 12:10 [Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful Eric Ren
2016-05-17 12:10 ` [Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful merging Eric Ren
2016-05-18  6:53 ` [Cluster-devel] [DLM PATCH] dlm_controld: outputs explicit info about stateful Eric Ren
2016-05-18 18:50   ` David Teigland
2016-05-20  9:03     ` Eric Ren [this message]
