cluster-devel.redhat.com archive mirror
* [Cluster-devel] [DLM PATCH] dlm_controld: add option of enable_force_kick
@ 2016-05-16  8:07 Eric Ren
  2016-05-16 17:12 ` David Teigland
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Ren @ 2016-05-16  8:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

When three or more partitions merge, none of them may see enough clean
nodes. DLM then stays stuck until an administrator manually
resets/restarts enough nodes to produce sufficient clean nodes.
However, some users want DLM to recover from this "useless" state
automatically by forcibly kicking the stateful merged nodes.

The "enable_force_kick" option defaults to "0" (disabled), which keeps
the old behavior. Enable this option at your own risk: it is hard to
predict which node (if any) will survive when both sides of the merged
partitions kick each other out of the cluster at the same time.

Signed-off-by: Eric Ren <zren@suse.com>
---
 dlm_controld/daemon_cpg.c   | 6 +++++-
 dlm_controld/dlm.conf.5     | 2 ++
 dlm_controld/dlm_controld.8 | 5 +++++
 dlm_controld/dlm_daemon.h   | 1 +
 dlm_controld/main.c         | 6 ++++++
 5 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/dlm_controld/daemon_cpg.c b/dlm_controld/daemon_cpg.c
index 356e80d..a09971d 100644
--- a/dlm_controld/daemon_cpg.c
+++ b/dlm_controld/daemon_cpg.c
@@ -845,7 +845,11 @@ static void daemon_fence_work(void)
 		log_retry(retry_fencing, "fence work wait to clear merge %d clean %d part %d gone %d",
 			  merge_count, clean_count, part_count, gone_count);
 
-		if ((clean_count >= merge_count) && !part_count && (low == our_nodeid))
+		if (opt(enable_force_kick_ind))
+			log_retry(retry_fencing, "fence work force to kick stateful merged members");
+
+		if ((clean_count >= merge_count || opt(enable_force_kick_ind))
+		    && !part_count && (low == our_nodeid))
 			kick_stateful_merge_members();
 
 		retry = 1;
diff --git a/dlm_controld/dlm.conf.5 b/dlm_controld/dlm.conf.5
index 007e4de..4dc1ba4 100644
--- a/dlm_controld/dlm.conf.5
+++ b/dlm_controld/dlm.conf.5
@@ -68,6 +68,8 @@ enable_quorum_fencing
 .br
 enable_quorum_lockspace
 .br
+enable_force_kick
+.br
 
 .SH Fencing
 
diff --git a/dlm_controld/dlm_controld.8 b/dlm_controld/dlm_controld.8
index c9011fd..c424f41 100644
--- a/dlm_controld/dlm_controld.8
+++ b/dlm_controld/dlm_controld.8
@@ -87,6 +87,11 @@ For default settings, see dlm_controld -h.
 0|1
         enable/disable quorum requirement for lockspace operations
 
+.B --enable_force_kick | -k
+0|1
+        enable/disable forcing kick when cluster is stuck waiting
+        for administrator to manually produce enough clean nodes
+
 .B --fence_all
 .I str
         fence all nodes with this agent
diff --git a/dlm_controld/dlm_daemon.h b/dlm_controld/dlm_daemon.h
index 62508ea..bdaf6bc 100644
--- a/dlm_controld/dlm_daemon.h
+++ b/dlm_controld/dlm_daemon.h
@@ -108,6 +108,7 @@ enum {
         enable_startup_fencing_ind,
         enable_quorum_fencing_ind,
         enable_quorum_lockspace_ind,
+	enable_force_kick_ind,
         help_ind,
         version_ind,
         dlm_options_max,
diff --git a/dlm_controld/main.c b/dlm_controld/main.c
index 4f1399f..354db44 100644
--- a/dlm_controld/main.c
+++ b/dlm_controld/main.c
@@ -1355,6 +1355,12 @@ static void set_opt_defaults(void)
 			1, NULL,
 			"enable/disable quorum requirement for lockspace operations");
 
+	set_opt_default(enable_force_kick_ind,
+			"enable_force_kick", 'k', req_arg_bool,
+			0, NULL,
+			"enable/disable forcing kick when cluster is stuck waiting "
+			"for administrator to manually produce enough clean nodes");
+
 	set_opt_default(help_ind,
 			"help", 'h', no_arg,
 			-1, NULL,
-- 
2.6.6


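The condition changed in the daemon_fence_work() hunk above can be read
as a small predicate. The following is only a minimal sketch under stated
assumptions: struct merge_state and should_kick_merged() are hypothetical
names that do not exist in dlm_controld, and the fields simply mirror the
local variables visible in the hunk, with their meaning assumed from
their names.

```c
#include <stdbool.h>

/* Hypothetical container for the values used in the patched condition.
 * Field names mirror the variables in daemon_fence_work(); this struct
 * is an illustration only and is not part of dlm_controld. */
struct merge_state {
	int merge_count;   /* assumed: nodes seen as stateful merged */
	int clean_count;   /* assumed: nodes seen as clean */
	int part_count;    /* assumed: nodes still partitioned */
	int low_nodeid;    /* "low" in the hunk */
	int our_nodeid;
};

/* Returns true when the patched code would call
 * kick_stateful_merge_members(). With force_kick set, the
 * "enough clean nodes" requirement is bypassed, but the other two
 * checks (no partitioned nodes, we are the low nodeid) still apply. */
bool should_kick_merged(const struct merge_state *s, bool force_kick)
{
	bool enough_clean = s->clean_count >= s->merge_count;

	return (enough_clean || force_kick) &&
	       s->part_count == 0 &&
	       s->low_nodeid == s->our_nodeid;
}
```

Written this way, it is easier to see that the option only relaxes the
clean-node check: a node that is not the low nodeid, or that still sees
partitioned nodes, does not kick anyone even with the option enabled.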


* [Cluster-devel] [DLM PATCH] dlm_controld: add option of enable_force_kick
  2016-05-16  8:07 [Cluster-devel] [DLM PATCH] dlm_controld: add option of enable_force_kick Eric Ren
@ 2016-05-16 17:12 ` David Teigland
  2016-05-17 12:16   ` Eric Ren
  0 siblings, 1 reply; 3+ messages in thread
From: David Teigland @ 2016-05-16 17:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Mon, May 16, 2016 at 04:07:18PM +0800, Eric Ren wrote:
> When three or more partitions merge, none of them may see enough clean
> nodes. DLM then stays stuck until an administrator manually
> resets/restarts enough nodes to produce sufficient clean nodes.
> However, some users want DLM to recover from this "useless" state
> automatically by forcibly kicking the stateful merged nodes.
> 
> The "enable_force_kick" option defaults to "0" (disabled), which keeps
> the old behavior. Enable this option at your own risk: it is hard to
> predict which node (if any) will survive when both sides of the merged
> partitions kick each other out of the cluster at the same time.

This looks good.  Would you still use this patch if we add the new
dlm_tool output from the other email?
Dave




* [Cluster-devel] [DLM PATCH] dlm_controld: add option of enable_force_kick
  2016-05-16 17:12 ` David Teigland
@ 2016-05-17 12:16   ` Eric Ren
  0 siblings, 0 replies; 3+ messages in thread
From: Eric Ren @ 2016-05-17 12:16 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hello David:

On 05/17/2016 01:12 AM, David Teigland wrote:
> This looks good.  Would you still use this patch if we add the new
> dlm_tool output from the other email?

Please hold off on this for now ;-) I'd prefer to drop this method if the
latter one works better. I'm also trying to work this out with the
pacemaker guys ;-)

Thanks a lot,
Eric


