* [Cluster-devel] [PATCH dlm/next 0/3] fs: dlm: recovery ops and wait changes
@ 2021-09-15 20:39 Alexander Aring
2021-09-15 20:39 ` [Cluster-devel] [PATCH dlm/next 1/3] fs: dlm: add notes for recovery and membership handling Alexander Aring
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Alexander Aring @ 2021-09-15 20:39 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
this patch series changes the recovery behaviour so that the
dlm_lsop_recover_prep() callback is called only once between dlm_ls_start()
and dlm_ls_stop(). Currently, if recovery gets interrupted by another
dlm_ls_stop(), this callback can be called multiple times while
dlm_lsop_recover_done() has not been called yet. Users might depend on
dlm_lsop_recover_prep() being called exactly once, followed by a final
dlm_lsop_recover_done() call.
Another change is that dlm_new_lockspace() will now wait until recovery is
done. The current behaviour is that dlm_new_lockspace() waits only until the
lockspace member configuration appears to be valid. After the member
configuration, recovery can still be interrupted by another dlm_ls_stop()
call, and dlm_new_lockspace() returns at that point even though recovery is
not done yet. Most kernel users already have their own wait for
dlm_lsop_recover_done() after calling dlm_new_lockspace(), which is no
longer necessary. However, the old behaviour still works.
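For illustration, here is a minimal sketch of the caller-side wait this
makes redundant, assuming the in-tree dlm_new_lockspace() and
dlm_lockspace_ops interface; the names my_done, my_ops and my_join are
made up for the example:

#include <linux/completion.h>
#include <linux/dlm.h>

/* hypothetical caller-side wait as described above */
static DECLARE_COMPLETION(my_done);

static void my_recover_done(void *arg, struct dlm_slot *slots,
			    int num_slots, int our_slot,
			    uint32_t generation)
{
	complete(&my_done);
}

static const struct dlm_lockspace_ops my_ops = {
	.recover_done	= my_recover_done,
};

static int my_join(dlm_lockspace_t **lockspace)
{
	int ops_result, error;

	error = dlm_new_lockspace("example", "mycluster", 0, 64,
				  &my_ops, NULL, &ops_result, lockspace);
	if (error)
		return error;

	/* extra wait for recover_done; with this series
	 * dlm_new_lockspace() only returns after recovery, so this
	 * wait is redundant but still harmless
	 */
	wait_for_completion(&my_done);
	return 0;
}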
I tested this patch series with experimental Python bindings for libdlm.
I will soon send patches that add those bindings and tests to the dlm
user space software repository.
- Alex
Alexander Aring (3):
fs: dlm: add notes for recovery and membership handling
fs: dlm: call dlm_lsop_recover_prep once
fs: dlm: let new_lockspace() wait until recovery
fs/dlm/dlm_internal.h | 4 ++--
fs/dlm/lockspace.c | 9 +++++----
fs/dlm/member.c | 30 +++++++++++++++---------------
fs/dlm/recoverd.c | 17 +++++++++++++++++
4 files changed, 39 insertions(+), 21 deletions(-)
--
2.27.0
* [Cluster-devel] [PATCH dlm/next 1/3] fs: dlm: add notes for recovery and membership handling
2021-09-15 20:39 [Cluster-devel] [PATCH dlm/next 0/3] fs: dlm: recovery ops and wait changes Alexander Aring
@ 2021-09-15 20:39 ` Alexander Aring
2021-09-15 20:39 ` [Cluster-devel] [PATCH dlm/next 2/3] fs: dlm: call dlm_lsop_recover_prep once Alexander Aring
2021-09-15 20:39 ` [Cluster-devel] [PATCH dlm/next 3/3] fs: dlm: let new_lockspace() wait until recovery Alexander Aring
2 siblings, 0 replies; 4+ messages in thread
From: Alexander Aring @ 2021-09-15 20:39 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch adds comments to make clear that the ls_recover() function must
never fail before membership handling. Membership handling means adding or
removing nodes to/from the lockspace's ls_nodes attribute in
dlm_recover_members().
This is because functionality like dlm_midcomms_add_member(),
dlm_midcomms_remove_member() or dlm_lsop_recover_slot() must always be made
aware of any join or leave of lockspace members. If we added e.g. a
dlm_locking_stopped() check before dlm_recover_members() to detect an
interrupted recovery and abort, we might skip calling
dlm_midcomms_add_member(), dlm_midcomms_remove_member() or
dlm_lsop_recover_slot().
One reason recovery gets interrupted is that the cluster manager notified
about a new configuration, e.g. members joined or left. It is fine to
interrupt or fail recovery after the mentioned handling in
dlm_recover_members(), but never before.
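As an editorial illustration of this constraint, a simplified sketch (not
the actual ls_recover() body; error handling and the other recovery steps
are omitted):

static int ls_recover_sketch(struct dlm_ls *ls, struct dlm_recover *rv)
{
	int error, neg = 0;

	/* an early abort here, e.g.
	 *
	 *	if (dlm_locking_stopped(ls))
	 *		return -EINTR;
	 *
	 * would be wrong, because dlm_recover_members() has not yet
	 * reported member joins/leaves via dlm_midcomms_add_member(),
	 * dlm_midcomms_remove_member() and dlm_lsop_recover_slot()
	 */

	error = dlm_recover_members(ls, rv, &neg);
	if (error)
		return error;

	/* from here on it is fine to interrupt or fail recovery */
	if (dlm_locking_stopped(ls))
		return -EINTR;

	return 0;
}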
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/member.c | 6 +++++-
fs/dlm/recoverd.c | 4 ++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/dlm/member.c b/fs/dlm/member.c
index 731d489aa323..5f5b07bdbcc3 100644
--- a/fs/dlm/member.c
+++ b/fs/dlm/member.c
@@ -540,7 +540,11 @@ int dlm_recover_members(struct dlm_ls *ls, struct dlm_recover *rv, int *neg_out)
int i, error, neg = 0, low = -1;
/* previously removed members that we've not finished removing need to
- count as a negative change so the "neg" recovery steps will happen */
+ * count as a negative change so the "neg" recovery steps will happen
+ *
+ * This function must report all member changes to the lsops and
+ * midcomms layers and must never return before doing so.
+ */
list_for_each_entry(memb, &ls->ls_nodes_gone, list) {
log_rinfo(ls, "prev removed member %d", memb->nodeid);
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index 97d052cea5a9..208b69f46baf 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -70,6 +70,10 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
/*
* Add or remove nodes from the lockspace's ls_nodes list.
+ *
+ * Because we must report all membership changes to the lsops and
+ * midcomms layers, it is not permitted to abort ls_recover() until
+ * this is done.
*/
error = dlm_recover_members(ls, rv, &neg);
--
2.27.0
* [Cluster-devel] [PATCH dlm/next 2/3] fs: dlm: call dlm_lsop_recover_prep once
2021-09-15 20:39 [Cluster-devel] [PATCH dlm/next 0/3] fs: dlm: recovery ops and wait changes Alexander Aring
2021-09-15 20:39 ` [Cluster-devel] [PATCH dlm/next 1/3] fs: dlm: add notes for recovery and membership handling Alexander Aring
@ 2021-09-15 20:39 ` Alexander Aring
2021-09-15 20:39 ` [Cluster-devel] [PATCH dlm/next 3/3] fs: dlm: let new_lockspace() wait until recovery Alexander Aring
2 siblings, 0 replies; 4+ messages in thread
From: Alexander Aring @ 2021-09-15 20:39 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch changes the behaviour of the "dlm_lsop_recover_prep" callback.
It is now called once when locking is stopped after having been running,
and not again when dlm_ls_stop() is called multiple times while locking is
already stopped. In other words, it is only called if a dlm_ls_start() came
before.
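A minimal sketch of the idea, assuming the LSFL_RUNNING bit in ls->ls_flags
as used in the hunk below (simplified, without the locking and other state
handling of the real dlm_ls_stop()):

static void stop_once_sketch(struct dlm_ls *ls)
{
	int new;

	/* test_and_clear_bit() returns the previous bit value, so only
	 * the stop call that actually transitions the lockspace from
	 * running to stopped sees new != 0
	 */
	new = test_and_clear_bit(LSFL_RUNNING, &ls->ls_flags);

	/* repeated dlm_ls_stop() calls while already stopped find the
	 * bit already clear and do not invoke the callback again until
	 * a dlm_ls_start() sets LSFL_RUNNING again
	 */
	if (new)
		dlm_lsop_recover_prep(ls);
}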
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/member.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/dlm/member.c b/fs/dlm/member.c
index 5f5b07bdbcc3..446e1635229d 100644
--- a/fs/dlm/member.c
+++ b/fs/dlm/member.c
@@ -685,7 +685,16 @@ int dlm_ls_stop(struct dlm_ls *ls)
if (!ls->ls_recover_begin)
ls->ls_recover_begin = jiffies;
- dlm_lsop_recover_prep(ls);
+ /* call recover_prep ops only once and not multiple times
+ * for each possible dlm_ls_stop() call while recovery is already
+ * stopped.
+ *
+ * If we were able to clear the LSFL_RUNNING bit, i.e. it was set
+ * before, we know this is the first dlm_ls_stop() call.
+ */
+ if (new)
+ dlm_lsop_recover_prep(ls);
+
return 0;
}
--
2.27.0
* [Cluster-devel] [PATCH dlm/next 3/3] fs: dlm: let new_lockspace() wait until recovery
2021-09-15 20:39 [Cluster-devel] [PATCH dlm/next 0/3] fs: dlm: recovery ops and wait changes Alexander Aring
2021-09-15 20:39 ` [Cluster-devel] [PATCH dlm/next 1/3] fs: dlm: add notes for recovery and membership handling Alexander Aring
2021-09-15 20:39 ` [Cluster-devel] [PATCH dlm/next 2/3] fs: dlm: call dlm_lsop_recover_prep once Alexander Aring
@ 2021-09-15 20:39 ` Alexander Aring
2 siblings, 0 replies; 4+ messages in thread
From: Alexander Aring @ 2021-09-15 20:39 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch changes the behaviour of dlm_new_lockspace() to wait until
recovery has either succeeded or failed. Before, a possible waiter on
ls_members_done waited until dlm_recover_members() was done, whether it
succeeded (including being interrupted) or failed. The result was returned
to the waiter of dlm_new_lockspace(), and on success the caller was able to
use the lockspace at that point.
This behaviour is now changed to wait for a complete run of the recovery
functionality done by ls_recover(). The result, success or failure, is
delivered back to a possible waiter on ls_recovery_done. The waiter is then
able to use the lockspace, or to run error handling if recovery failed. If
recovery gets interrupted, e.g. because one of the several
dlm_locking_stopped() checks returns true, a waiter on ls_recovery_done
keeps waiting until ls_recover() finally succeeds or fails.
One reason why the recovery task gets interrupted is that another
dlm_ls_stop() was called while ls_recover() was running. Such a
dlm_ls_stop() call means that the recovery task will call ls_recover()
again, possibly with a new configuration delivered by the cluster manager.
Most dlm kernel users, e.g. gfs2 or cluster-md, have their own wait
handling to wait for recovery to be done after calling dlm_new_lockspace().
This is unnecessary now but still works. Users can update their code
because dlm takes care of it now.
A simple way to interrupt recovery is to call dlm_new_lockspace() and
dlm_release_lockspace() in a loop on several cluster nodes. This has the
effect that the cluster manager interrupts the recovery with new membership
information over and over again.
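To illustrate the new handshake, a sketch only; wait_recovery_sketch() is a
hypothetical helper, the real waiter sits inline in new_lockspace() as shown
in the diff below:

static int wait_recovery_sketch(struct dlm_ls *ls)
{
	/* ls_recover() completes ls_recovery_done once recovery finally
	 * succeeds or hits a non -EINTR error; interrupted (-EINTR) runs
	 * do not complete it, so the waiter keeps sleeping until a later
	 * ls_recover() iteration produces a final result
	 */
	wait_for_completion(&ls->ls_recovery_done);
	return ls->ls_recovery_result;
}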
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/dlm_internal.h | 4 ++--
fs/dlm/lockspace.c | 9 +++++----
fs/dlm/member.c | 13 -------------
fs/dlm/recoverd.c | 13 +++++++++++++
4 files changed, 20 insertions(+), 19 deletions(-)
diff --git a/fs/dlm/dlm_internal.h b/fs/dlm/dlm_internal.h
index 5f57538b5d45..de6c9fb5dd30 100644
--- a/fs/dlm/dlm_internal.h
+++ b/fs/dlm/dlm_internal.h
@@ -610,8 +610,8 @@ struct dlm_ls {
wait_queue_head_t ls_uevent_wait; /* user part of join/leave */
int ls_uevent_result;
- struct completion ls_members_done;
- int ls_members_result;
+ struct completion ls_recovery_done;
+ int ls_recovery_result;
struct miscdevice ls_device;
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 10eddfa6c3d7..0feffdeeb329 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -547,8 +547,8 @@ static int new_lockspace(const char *name, const char *cluster,
init_waitqueue_head(&ls->ls_uevent_wait);
ls->ls_uevent_result = 0;
- init_completion(&ls->ls_members_done);
- ls->ls_members_result = -1;
+ init_completion(&ls->ls_recovery_done);
+ ls->ls_recovery_result = -1;
mutex_init(&ls->ls_cb_mutex);
INIT_LIST_HEAD(&ls->ls_cb_delay);
@@ -642,8 +642,9 @@ static int new_lockspace(const char *name, const char *cluster,
if (error)
goto out_recoverd;
- wait_for_completion(&ls->ls_members_done);
- error = ls->ls_members_result;
+ /* wait until recovery is successful or failed */
+ wait_for_completion(&ls->ls_recovery_done);
+ error = ls->ls_recovery_result;
if (error)
goto out_members;
diff --git a/fs/dlm/member.c b/fs/dlm/member.c
index 446e1635229d..3122c5a718c4 100644
--- a/fs/dlm/member.c
+++ b/fs/dlm/member.c
@@ -593,19 +593,6 @@ int dlm_recover_members(struct dlm_ls *ls, struct dlm_recover *rv, int *neg_out)
*neg_out = neg;
error = ping_members(ls);
- /* error -EINTR means that a new recovery action is triggered.
- * We ignore this recovery action and let run the new one which might
- * have new member configuration.
- */
- if (error == -EINTR)
- error = 0;
-
- /* new_lockspace() may be waiting to know if the config
- * is good or bad
- */
- ls->ls_members_result = error;
- complete(&ls->ls_members_done);
-
log_rinfo(ls, "dlm_recover_members %d nodes", ls->ls_num_nodes);
return error;
}
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index 208b69f46baf..eaf310fdcb7d 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -244,6 +244,9 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
jiffies_to_msecs(jiffies - start));
mutex_unlock(&ls->ls_recoverd_active);
+ ls->ls_recovery_result = 0;
+ complete(&ls->ls_recovery_done);
+
dlm_lsop_recover_done(ls);
return 0;
@@ -252,6 +255,16 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
log_rinfo(ls, "dlm_recover %llu error %d",
(unsigned long long)rv->seq, error);
mutex_unlock(&ls->ls_recoverd_active);
+
+ /* let new_lockspace() know about a critical error; if recovery
+ * was interrupted (-EINTR) we wait for the next ls_recover()
+ * iteration until it succeeds.
+ */
+ if (error != -EINTR) {
+ ls->ls_recovery_result = error;
+ complete(&ls->ls_recovery_done);
+ }
+
return error;
}
--
2.27.0