* [Cluster-devel] [PATCH v5.19-rc1 0/7] fs: dlm: recovery error handling
@ 2022-06-10 17:06 Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 1/7] fs: dlm: add notes for recovery and membership handling Alexander Aring
` (6 more replies)
0 siblings, 7 replies; 10+ messages in thread
From: Alexander Aring @ 2022-06-10 17:06 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
I have had these patches lying around for a long time... and it's maybe
time to bring them up. The series makes three changes to dlm recovery
handling:
1.
The dlm_lsop_recover_prep() callback should be called once when the
lockspace is stopped, and not again if it is already stopped while
recovery is running.
This changes the possible sequence:
dlm_lsop_recover_prep()
...
dlm_lsop_recover_prep()
dlm_lsop_recover_done()
to only have one possible prep call:
dlm_lsop_recover_prep()
dlm_lsop_recover_done()
2.
When a new lockspace is created, new_lockspace() currently returns to
the caller once the members have been successfully pinged, even though
recovery might still be running. Almost all dlm users work around this
by waiting for the dlm_lsop_recover_done() callback before using the
lockspace (a small sketch of this wait handling follows after point 3).
The change is backwards compatible with existing dlm users, but they
can drop their own wait handling if they want.
3.
There are two ways recovery can be triggered. Either somebody calls
new_lockspace(), in which case a waiter waits until recovery is done,
or it is a completely asynchronous process, e.g. nodes joining/leaving
the lockspace. In the asynchronous case there is no caller waiting for
recovery to finish, and therefore no error handling that reacts to
possible recovery errors. This patch series introduces a "best effort"
approach that simply retries the recovery (with a schedule() in
between) on error and hopes the error resolves itself; a rough model of
this retry loop also follows below. If the error is not resolved within
5 retries, panic() will fence the node.
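For illustration of point 2, here is a minimal standalone model
(hypothetical names, not actual gfs2 or cluster-md code) of the wait
handling that dlm users implement today and that this series makes
unnecessary:

#include <stdio.h>
#include <stdbool.h>

/* simplified model of a dlm user's private per-mount state */
struct my_mount {
	bool recover_done;	/* stands in for a struct completion */
	int  recovery_error;
};

/* models the dlm_lsop_recover_done() lockspace op the user registers */
static void my_recover_done(struct my_mount *m, int error)
{
	m->recovery_error = error;
	m->recover_done = true;		/* complete() in real code */
}

/* models mounting: because new_lockspace() may return while recovery
 * is still running, the user waits for its own recover_done signal */
static int my_mount_lockspace(struct my_mount *m)
{
	/* dlm_new_lockspace(...) would be called here */
	while (!m->recover_done)
		;			/* wait_for_completion() in real code */
	return m->recovery_error;
}

int main(void)
{
	struct my_mount m = { 0 };

	my_recover_done(&m, 0);		/* pretend recovery finished */
	printf("lockspace usable: %s\n",
	       my_mount_lockspace(&m) == 0 ? "yes" : "no");
	return 0;
}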
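And here is a rough standalone model of the "best effort" retry policy
from point 3 (simplified; the real handling lands in do_ls_recovery()
in patch 5/7):

#include <stdio.h>
#include <stdlib.h>

#define RECOVERY_MAX_RETRIES 5	/* DLM_RECOVERY_MAX_RETRIES in patch 5/7 */
#define EAGAIN_ERR	(-11)	/* -EAGAIN */

/* stand-in for ls_recover(); pretend it needs two retries to succeed */
static int fake_ls_recover(int attempt)
{
	return attempt < 2 ? EAGAIN_ERR : 0;
}

int main(void)
{
	int error, count = 0;

	do {
		if (count >= RECOVERY_MAX_RETRIES) {
			/* panic() in the kernel: fence this node */
			fprintf(stderr, "too many recovery retries\n");
			abort();
		}
		error = fake_ls_recover(count++);
		if (error == EAGAIN_ERR)
			/* schedule() in the kernel: give the problem a
			 * chance to resolve before the next attempt */
			printf("recovery wants to run again\n");
	} while (error == EAGAIN_ERR);

	printf("recovery finished with %d after %d attempt(s)\n",
	       error, count);
	return 0;
}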
- Alex
Alexander Aring (7):
fs: dlm: add notes for recovery and membership handling
fs: dlm: call dlm_lsop_recover_prep once
fs: dlm: let new_lockspace() wait until recovery
fs: dlm: handle recovery result outside of ls_recover
fs: dlm: handle recovery -EAGAIN case as retry
fs: dlm: change -EINVAL recovery error to -EAGAIN
fs: dlm: add WARN_ON for non waiter case
fs/dlm/dlm_internal.h | 4 +--
fs/dlm/lock.c | 5 +++-
fs/dlm/lockspace.c | 9 ++++---
fs/dlm/member.c | 30 +++++++++++-----------
fs/dlm/recoverd.c | 60 ++++++++++++++++++++++++++++++++++++++++---
5 files changed, 82 insertions(+), 26 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Cluster-devel] [PATCH v5.19-rc1 1/7] fs: dlm: add notes for recovery and membership handling
2022-06-10 17:06 [Cluster-devel] [PATCH v5.19-rc1 0/7] fs: dlm: recovery error handling Alexander Aring
@ 2022-06-10 17:06 ` Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 2/7] fs: dlm: call dlm_lsop_recover_prep once Alexander Aring
` (5 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2022-06-10 17:06 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch adds some comments to make clear that the ls_recover()
function must never fail before membership handling. Membership
handling means adding/removing nodes from the lockspace's ls_nodes
list in dlm_recover_members().
This is because functionality like dlm_midcomms_add_member(),
dlm_midcomms_remove_member() or dlm_lsop_recover_slot() must always be
made aware of any join or leave of lockspace members. If we added e.g.
a dlm_locking_stopped() check before dlm_recover_members() to detect
an interrupted recovery and abort it, we might skip calling
dlm_midcomms_add_member(), dlm_midcomms_remove_member() or
dlm_lsop_recover_slot().
One reason recovery may be interrupted is that the cluster manager
notified us about a new configuration, e.g. members joined or left. It
is fine to interrupt or fail the recovery handling after the mentioned
handling in dlm_recover_members(), but never before.
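As a rough illustration of this constraint (a simplified standalone
model, not the kernel code), the abort check must come after the
membership reporting:

#include <stdio.h>
#include <stdbool.h>

#define EINTR_ERR (-4)		/* -EINTR */

/* stands in for dlm_recover_members(): every join/leave must reach
 * dlm_midcomms_add_member()/dlm_midcomms_remove_member() and
 * dlm_lsop_recover_slot() */
static void report_member_changes(void)
{
	printf("member changes reported to midcomms/lsops\n");
}

static int ls_recover_model(bool locking_stopped)
{
	/* WRONG place for an abort check: it could skip the member
	 * reporting when a new configuration interrupts recovery:
	 *
	 *	if (locking_stopped)
	 *		return EINTR_ERR;
	 */
	report_member_changes();

	/* from here on it is fine to interrupt or fail recovery */
	if (locking_stopped)
		return EINTR_ERR;

	return 0;
}

int main(void)
{
	printf("interrupted recovery -> %d\n", ls_recover_model(true));
	return 0;
}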
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/member.c | 6 +++++-
fs/dlm/recoverd.c | 4 ++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/dlm/member.c b/fs/dlm/member.c
index 98084e0cfccf..7e5f5aefefb5 100644
--- a/fs/dlm/member.c
+++ b/fs/dlm/member.c
@@ -534,7 +534,11 @@ int dlm_recover_members(struct dlm_ls *ls, struct dlm_recover *rv, int *neg_out)
int i, error, neg = 0, low = -1;
/* previously removed members that we've not finished removing need to
- count as a negative change so the "neg" recovery steps will happen */
+ * count as a negative change so the "neg" recovery steps will happen
+ *
+ * This function must report all member changes to lsops or the
+ * midcomms layer and must never return before doing so.
+ */
list_for_each_entry(memb, &ls->ls_nodes_gone, list) {
log_rinfo(ls, "prev removed member %d", memb->nodeid);
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index a55dfce705dd..b5b519cde20b 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -70,6 +70,10 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
/*
* Add or remove nodes from the lockspace's ls_nodes list.
+ *
+ * Due to the fact that we must report all membership changes to
+ * lsops or the midcomms layer, it is not permitted to abort
+ * ls_recover() until this is done.
*/
error = dlm_recover_members(ls, rv, &neg);
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Cluster-devel] [PATCH v5.19-rc1 2/7] fs: dlm: call dlm_lsop_recover_prep once
2022-06-10 17:06 [Cluster-devel] [PATCH v5.19-rc1 0/7] fs: dlm: recovery error handling Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 1/7] fs: dlm: add notes for recovery and membership handling Alexander Aring
@ 2022-06-10 17:06 ` Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 3/7] fs: dlm: let new_lockspace() wait until recovery Alexander Aring
` (4 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2022-06-10 17:06 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch changes the behaviour of the "dlm_lsop_recover_prep"
callback. It is now called once when locking is stopped while it was
previously running, and not again if dlm_ls_stop() is called multiple
times while locking is already stopped. In other words, it is only
called if a dlm_ls_start() happened before.
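The "new" flag in the hunk below is derived earlier in dlm_ls_stop()
from clearing the LSFL_RUNNING bit; as a simplified standalone model of
the idea (not the kernel code):

#include <stdio.h>
#include <stdbool.h>

/* 'running' plays the role of the LSFL_RUNNING lockspace flag */
static bool running = true;
static int prep_calls;

static void recover_prep(void)		/* dlm_lsop_recover_prep() */
{
	prep_calls++;
}

static void ls_stop_model(void)
{
	/* like clearing LSFL_RUNNING: remember whether we were the
	 * ones who actually stopped a running lockspace */
	bool new = running;

	running = false;
	if (new)			/* first stop after a start */
		recover_prep();
}

int main(void)
{
	ls_stop_model();		/* locking was running: prep called */
	ls_stop_model();		/* already stopped: no second call */
	printf("recover_prep called %d time(s)\n", prep_calls);
	return 0;
}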
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/member.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/dlm/member.c b/fs/dlm/member.c
index 7e5f5aefefb5..67b056634f03 100644
--- a/fs/dlm/member.c
+++ b/fs/dlm/member.c
@@ -679,7 +679,16 @@ int dlm_ls_stop(struct dlm_ls *ls)
if (!ls->ls_recover_begin)
ls->ls_recover_begin = jiffies;
- dlm_lsop_recover_prep(ls);
+ /* call recover_prep ops only once and not multiple times
+ * for each possible dlm_ls_stop() when recovery is already
+ * stopped.
+ *
+ * If we successfully cleared the LSFL_RUNNING bit, i.e. it was
+ * set before, we know this is the first dlm_ls_stop() call.
+ */
+ if (new)
+ dlm_lsop_recover_prep(ls);
+
return 0;
}
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Cluster-devel] [PATCH v5.19-rc1 3/7] fs: dlm: let new_lockspace() wait until recovery
2022-06-10 17:06 [Cluster-devel] [PATCH v5.19-rc1 0/7] fs: dlm: recovery error handling Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 1/7] fs: dlm: add notes for recovery and membership handling Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 2/7] fs: dlm: call dlm_lsop_recover_prep once Alexander Aring
@ 2022-06-10 17:06 ` Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 4/7] fs: dlm: handle recovery result outside of ls_recover Alexander Aring
` (3 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2022-06-10 17:06 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch changes the behaviour of dlm_new_lockspace() to wait until
recovery has either succeeded or failed. Previously, a possible waiter
on ls_members_done waited until dlm_recover_members() was done, whether
it succeeded (including being interrupted) or failed. The result was
returned to the waiter of dlm_new_lockspace(); on success the caller
was able to use the lockspace at that point.
This behaviour is now changed to wait for a complete run of the
recovery functionality, which is done by ls_recover(). The result,
success or failure, is delivered back to a possible waiter on
ls_recovery_done. A possible waiter is then able to use the lockspace,
or to run error handling if recovery failed. If recovery gets
interrupted, e.g. because dlm_locking_stopped() is true at one of the
several places where it is checked, a possible waiter on
ls_recovery_done keeps waiting until ls_recover() succeeds or fails.
One reason the recovery task gets interrupted is that another
dlm_ls_stop() was called while ls_recover() was running. Another
dlm_ls_stop() call means that the recovery task will call ls_recover()
again with a possibly new configuration delivered by the cluster
manager.
Most dlm kernel users, e.g. gfs2 or cluster-md, have their own wait
handling to wait for recovery after calling dlm_new_lockspace(). This
becomes unnecessary now but still works. Users can update their code
because dlm takes care of it now.
A simple way to interrupt recovery is to call dlm_new_lockspace() and
dlm_release_lockspace() in a loop on several cluster nodes. This has
the effect that the cluster manager interrupts the recovery with new
membership information over and over again.
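A simplified standalone model of the hand-off introduced here
(illustrative only; the kernel uses struct completion, and -EINTR is
-4):

#include <stdio.h>
#include <stdbool.h>

#define EINTR_ERR (-4)		/* -EINTR */

struct ls_model {
	bool recovery_done;	/* models ls_recovery_done */
	int  recovery_result;	/* models ls_recovery_result */
};

/* recovery side: an interrupted run (-EINTR) does not wake the waiter,
 * it keeps waiting for the next ls_recover() iteration */
static void recovery_finished(struct ls_model *ls, int error)
{
	if (error == EINTR_ERR)
		return;
	ls->recovery_result = error;
	ls->recovery_done = true;	/* complete() in the kernel */
}

/* dlm_new_lockspace() side */
static int new_lockspace_wait(struct ls_model *ls)
{
	while (!ls->recovery_done)	/* wait_for_completion() */
		;
	return ls->recovery_result;
}

int main(void)
{
	struct ls_model ls = { .recovery_result = -1 };

	recovery_finished(&ls, EINTR_ERR);	/* interrupted: keep waiting */
	recovery_finished(&ls, 0);		/* next iteration succeeded */
	printf("dlm_new_lockspace() sees %d\n", new_lockspace_wait(&ls));
	return 0;
}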
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/dlm_internal.h | 4 ++--
fs/dlm/lockspace.c | 9 +++++----
fs/dlm/member.c | 13 -------------
fs/dlm/recoverd.c | 13 +++++++++++++
4 files changed, 20 insertions(+), 19 deletions(-)
diff --git a/fs/dlm/dlm_internal.h b/fs/dlm/dlm_internal.h
index 776c3ed519f0..c03388a3875c 100644
--- a/fs/dlm/dlm_internal.h
+++ b/fs/dlm/dlm_internal.h
@@ -606,8 +606,8 @@ struct dlm_ls {
wait_queue_head_t ls_uevent_wait; /* user part of join/leave */
int ls_uevent_result;
- struct completion ls_members_done;
- int ls_members_result;
+ struct completion ls_recovery_done;
+ int ls_recovery_result;
struct miscdevice ls_device;
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 19ed41a5da93..0c3613d09c5e 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -548,8 +548,8 @@ static int new_lockspace(const char *name, const char *cluster,
init_waitqueue_head(&ls->ls_uevent_wait);
ls->ls_uevent_result = 0;
- init_completion(&ls->ls_members_done);
- ls->ls_members_result = -1;
+ init_completion(&ls->ls_recovery_done);
+ ls->ls_recovery_result = -1;
mutex_init(&ls->ls_cb_mutex);
INIT_LIST_HEAD(&ls->ls_cb_delay);
@@ -645,8 +645,9 @@ static int new_lockspace(const char *name, const char *cluster,
if (error)
goto out_recoverd;
- wait_for_completion(&ls->ls_members_done);
- error = ls->ls_members_result;
+ /* wait until recovery is successful or failed */
+ wait_for_completion(&ls->ls_recovery_done);
+ error = ls->ls_recovery_result;
if (error)
goto out_members;
diff --git a/fs/dlm/member.c b/fs/dlm/member.c
index 67b056634f03..2af2ccfe43a9 100644
--- a/fs/dlm/member.c
+++ b/fs/dlm/member.c
@@ -587,19 +587,6 @@ int dlm_recover_members(struct dlm_ls *ls, struct dlm_recover *rv, int *neg_out)
*neg_out = neg;
error = ping_members(ls);
- /* error -EINTR means that a new recovery action is triggered.
- * We ignore this recovery action and let run the new one which might
- * have new member configuration.
- */
- if (error == -EINTR)
- error = 0;
-
- /* new_lockspace() may be waiting to know if the config
- * is good or bad
- */
- ls->ls_members_result = error;
- complete(&ls->ls_members_done);
-
log_rinfo(ls, "dlm_recover_members %d nodes", ls->ls_num_nodes);
return error;
}
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index b5b519cde20b..98c17f74927f 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -243,6 +243,9 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
jiffies_to_msecs(jiffies - start));
mutex_unlock(&ls->ls_recoverd_active);
+ ls->ls_recovery_result = 0;
+ complete(&ls->ls_recovery_done);
+
dlm_lsop_recover_done(ls);
return 0;
@@ -251,6 +254,16 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
log_rinfo(ls, "dlm_recover %llu error %d",
(unsigned long long)rv->seq, error);
mutex_unlock(&ls->ls_recoverd_active);
+
+ /* let new_lockspace() get aware of critical error if recovery
+ * was interrupted -EINTR we wait for the next ls_recover()
+ * iteration until it succeeds.
+ */
+ if (error != -EINTR) {
+ ls->ls_recovery_result = error;
+ complete(&ls->ls_recovery_done);
+ }
+
return error;
}
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Cluster-devel] [PATCH v5.19-rc1 4/7] fs: dlm: handle recovery result outside of ls_recover
2022-06-10 17:06 [Cluster-devel] [PATCH v5.19-rc1 0/7] fs: dlm: recovery error handling Alexander Aring
` (2 preceding siblings ...)
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 3/7] fs: dlm: let new_lockspace() wait until recovery Alexander Aring
@ 2022-06-10 17:06 ` Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 5/7] fs: dlm: handle recovery -EAGAIN case as retry Alexander Aring
` (2 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2022-06-10 17:06 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch cleans up the recovery result handling by moving it out of
ls_recover() into the caller do_ls_recovery(). This makes it clear how
we react when recovery succeeds or delivers different error codes.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/recoverd.c | 42 ++++++++++++++++++++++++++----------------
1 file changed, 26 insertions(+), 16 deletions(-)
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index 98c17f74927f..90e8b7f440da 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -243,27 +243,12 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
jiffies_to_msecs(jiffies - start));
mutex_unlock(&ls->ls_recoverd_active);
- ls->ls_recovery_result = 0;
- complete(&ls->ls_recovery_done);
-
- dlm_lsop_recover_done(ls);
return 0;
fail:
dlm_release_root_list(ls);
- log_rinfo(ls, "dlm_recover %llu error %d",
- (unsigned long long)rv->seq, error);
mutex_unlock(&ls->ls_recoverd_active);
- /* let new_lockspace() get aware of critical error if recovery
- * was interrupted -EINTR we wait for the next ls_recover()
- * iteration until it succeeds.
- */
- if (error != -EINTR) {
- ls->ls_recovery_result = error;
- complete(&ls->ls_recovery_done);
- }
-
return error;
}
@@ -274,6 +259,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
static void do_ls_recovery(struct dlm_ls *ls)
{
struct dlm_recover *rv = NULL;
+ int error;
spin_lock(&ls->ls_recover_lock);
rv = ls->ls_recover_args;
@@ -283,7 +269,31 @@ static void do_ls_recovery(struct dlm_ls *ls)
spin_unlock(&ls->ls_recover_lock);
if (rv) {
- ls_recover(ls, rv);
+ error = ls_recover(ls, rv);
+ switch (error) {
+ case 0:
+ ls->ls_recovery_result = 0;
+ complete(&ls->ls_recovery_done);
+
+ dlm_lsop_recover_done(ls);
+ break;
+ case -EINTR:
+ /* if recovery was interrupted -EINTR we wait for the next
+ * ls_recover() iteration until it hopefully succeeds.
+ */
+ log_rinfo(ls, "%s %llu interrupted and should be queued to run again",
+ __func__, (unsigned long long)rv->seq);
+ break;
+ default:
+ log_rinfo(ls, "%s %llu error %d", __func__,
+ (unsigned long long)rv->seq, error);
+
+ /* let new_lockspace() get aware of critical error */
+ ls->ls_recovery_result = error;
+ complete(&ls->ls_recovery_done);
+ break;
+ }
+
kfree(rv->nodes);
kfree(rv);
}
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Cluster-devel] [PATCH v5.19-rc1 5/7] fs: dlm: handle recovery -EAGAIN case as retry
2022-06-10 17:06 [Cluster-devel] [PATCH v5.19-rc1 0/7] fs: dlm: recovery error handling Alexander Aring
` (3 preceding siblings ...)
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 4/7] fs: dlm: handle recovery result outside of ls_recover Alexander Aring
@ 2022-06-10 17:06 ` Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 6/7] fs: dlm: change -EINVAL recovery error to -EAGAIN Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 7/7] fs: dlm: add WARN_ON for non waiter case Alexander Aring
6 siblings, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2022-06-10 17:06 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch adds handling for the case where recovery returns the
-EAGAIN error code, so that this failure is not delivered to a
potential caller of dlm_new_lockspace(). If -EAGAIN is returned we run
recovery again and hope that, with an additional schedule(), it no
longer returns -EAGAIN. If the maximum number of retries is hit, we
fence ourselves by calling panic().
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/recoverd.c | 70 +++++++++++++++++++++++++++++++----------------
1 file changed, 47 insertions(+), 23 deletions(-)
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index 90e8b7f440da..eeb221c175a2 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -20,6 +20,7 @@
#include "requestqueue.h"
#include "recoverd.h"
+#define DLM_RECOVERY_MAX_RETRIES 5
/* If the start for which we're re-enabling locking (seq) has been superseded
by a newer stop (ls_recover_seq), we need to leave locking disabled.
@@ -259,7 +260,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
static void do_ls_recovery(struct dlm_ls *ls)
{
struct dlm_recover *rv = NULL;
- int error;
+ int error, count = 0;
spin_lock(&ls->ls_recover_lock);
rv = ls->ls_recover_args;
@@ -269,30 +270,53 @@ static void do_ls_recovery(struct dlm_ls *ls)
spin_unlock(&ls->ls_recover_lock);
if (rv) {
- error = ls_recover(ls, rv);
- switch (error) {
- case 0:
- ls->ls_recovery_result = 0;
- complete(&ls->ls_recovery_done);
-
- dlm_lsop_recover_done(ls);
- break;
- case -EINTR:
- /* if recovery was interrupted -EINTR we wait for the next
- * ls_recover() iteration until it hopefully succeeds.
+ do {
+ /* we try up to DLM_RECOVERY_MAX_RETRIES times to run
+ * recovery again; if -EAGAIN is still not resolved by
+ * then we will let DLM_ASSERT() fence ourselves.
*/
- log_rinfo(ls, "%s %llu interrupted and should be queued to run again",
- __func__, (unsigned long long)rv->seq);
- break;
- default:
- log_rinfo(ls, "%s %llu error %d", __func__,
- (unsigned long long)rv->seq, error);
+ DLM_ASSERT(count < DLM_RECOVERY_MAX_RETRIES,
+ pr_err("%s %llu too many recovery retries %d\n",
+ __func__, (unsigned long long)rv->seq, count););
- /* let new_lockspace() get aware of critical error */
- ls->ls_recovery_result = error;
- complete(&ls->ls_recovery_done);
- break;
- }
+ error = ls_recover(ls, rv);
+ switch (error) {
+ case 0:
+ ls->ls_recovery_result = 0;
+ complete(&ls->ls_recovery_done);
+
+ dlm_lsop_recover_done(ls);
+ break;
+ case -EINTR:
+ /* if recovery was interrupted -EINTR we wait for the next
+ * ls_recover() iteration until it hopefully succeeds.
+ */
+ log_rinfo(ls,
+ "%s %llu interrupted and should be queued to run again",
+ __func__, (unsigned long long)rv->seq);
+ break;
+ case -EAGAIN:
+ /* either API is returning -EAGAIN or some critical errors
+ * returning -EAGAIN which let the recovery run again. There
+ * is a schedule() between it in the hope that the error resolves
+ * itself. If not the above DLM_ASSERT() will hit.
+ */
+ log_rinfo(ls, "%s %llu recovery wants to run again",
+ __func__, (unsigned long long)rv->seq);
+ schedule();
+ break;
+ default:
+ log_rinfo(ls, "%s %llu error %d", __func__,
+ (unsigned long long)rv->seq, error);
+
+ /* let new_lockspace() get aware of critical error */
+ ls->ls_recovery_result = error;
+ complete(&ls->ls_recovery_done);
+ break;
+ }
+
+ count++;
+ } while (error == -EAGAIN);
kfree(rv->nodes);
kfree(rv);
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Cluster-devel] [PATCH v5.19-rc1 6/7] fs: dlm: change -EINVAL recovery error to -EAGAIN
2022-06-10 17:06 [Cluster-devel] [PATCH v5.19-rc1 0/7] fs: dlm: recovery error handling Alexander Aring
` (4 preceding siblings ...)
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 5/7] fs: dlm: handle recovery -EAGAIN case as retry Alexander Aring
@ 2022-06-10 17:06 ` Alexander Aring
2022-06-14 14:54 ` Alexander Aring
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 7/7] fs: dlm: add WARN_ON for non waiter case Alexander Aring
6 siblings, 1 reply; 10+ messages in thread
From: Alexander Aring @ 2022-06-10 17:06 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch changes an -EINVAL error in dlm_master_lookup() to -EAGAIN.
It is a critical error which should never happen; if it does, there is
an issue. We still log those issues, but when they happen we now try to
run recovery again in the hope that they get resolved. If not, recovery
has logic to fail this node after several retries.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/lock.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 226822f49d30..ad32a883c1fd 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -1018,7 +1018,10 @@ int dlm_master_lookup(struct dlm_ls *ls, int from_nodeid, char *name, int len,
from_nodeid, dir_nodeid, our_nodeid, hash,
ls->ls_num_nodes);
*r_nodeid = -1;
- return -EINVAL;
+ /* this case should never occur, we retry in the
+ * hope that it gets resolved
+ */
+ return -EAGAIN;
}
retry:
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Cluster-devel] [PATCH v5.19-rc1 7/7] fs: dlm: add WARN_ON for non waiter case
2022-06-10 17:06 [Cluster-devel] [PATCH v5.19-rc1 0/7] fs: dlm: recovery error handling Alexander Aring
` (5 preceding siblings ...)
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 6/7] fs: dlm: change -EINVAL recovery error to -EAGAIN Alexander Aring
@ 2022-06-10 17:06 ` Alexander Aring
2022-06-14 17:59 ` Alexander Aring
6 siblings, 1 reply; 10+ messages in thread
From: Alexander Aring @ 2022-06-10 17:06 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch adds a WARN_ON for the case where recovery hits a critical
error but no caller is waiting in dlm_new_lockspace(); this can occur
e.g. if a node got fenced. The WARN_ON signals us to investigate that
case, since it should not occur.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/recoverd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index eeb221c175a2..240267568aab 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -311,6 +311,7 @@ static void do_ls_recovery(struct dlm_ls *ls)
/* let new_lockspace() get aware of critical error */
ls->ls_recovery_result = error;
+ WARN_ON(completion_done(&ls->ls_recovery_done));
complete(&ls->ls_recovery_done);
break;
}
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [Cluster-devel] [PATCH v5.19-rc1 6/7] fs: dlm: change -EINVAL recovery error to -EAGAIN
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 6/7] fs: dlm: change -EINVAL recovery error to -EAGAIN Alexander Aring
@ 2022-06-14 14:54 ` Alexander Aring
0 siblings, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2022-06-14 14:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On Fri, Jun 10, 2022 at 1:06 PM Alexander Aring <aahringo@redhat.com> wrote:
>
> This patch changes an -EINVAL error in dlm_master_lookup() to -EAGAIN.
> It is a critical error which should never happen; if it does, there is
> an issue. We still log those issues, but when they happen we now try to
> run recovery again in the hope that they get resolved. If not, recovery
> has logic to fail this node after several retries.
>
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
> fs/dlm/lock.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
> index 226822f49d30..ad32a883c1fd 100644
> --- a/fs/dlm/lock.c
> +++ b/fs/dlm/lock.c
> @@ -1018,7 +1018,10 @@ int dlm_master_lookup(struct dlm_ls *ls, int from_nodeid, char *name, int len,
> from_nodeid, dir_nodeid, our_nodeid, hash,
> ls->ls_num_nodes);
> *r_nodeid = -1;
> - return -EINVAL;
> + /* this case should never occur, we retry in the
> + * hope that it gets resolved
> + */
> + return -EAGAIN;
I moved this -EAGAIN return so that it happens when dlm_master_lookup()
called from dlm_recover_directory() returns -EINVAL, because
dlm_master_lookup() is also used in non-recovery handling whereas
dlm_recover_directory() is used in recovery handling only. There was
once an issue where dlm_recover_directory() returned -EINVAL during
recovery, and this patch tries to work around it by assuming it is a
temporary problem with exchanging messages or scheduling other
tasks... unfortunately there is no more information on how that issue
was triggered.
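In other words, the next version would keep -EINVAL inside
dlm_master_lookup() and only map it on the recovery-only path; a
standalone model of that idea (placeholder names, not the actual
follow-up patch):

#include <stdio.h>

#define EINVAL_ERR (-22)	/* -EINVAL */
#define EAGAIN_ERR (-11)	/* -EAGAIN */

/* placeholder for dlm_master_lookup(), which is also used outside of
 * recovery and therefore keeps returning -EINVAL */
static int master_lookup_model(void)
{
	return EINVAL_ERR;	/* the "should never occur" case */
}

/* placeholder for dlm_recover_directory(), which runs only during
 * recovery and can let the retry logic deal with the problem */
static int recover_directory_model(void)
{
	int error = master_lookup_model();

	if (error == EINVAL_ERR)
		error = EAGAIN_ERR;	/* ask do_ls_recovery() to retry */

	return error;
}

int main(void)
{
	printf("recovery path returns %d\n", recover_directory_model());
	return 0;
}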
- Alex
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Cluster-devel] [PATCH v5.19-rc1 7/7] fs: dlm: add WARN_ON for non waiter case
2022-06-10 17:06 ` [Cluster-devel] [PATCH v5.19-rc1 7/7] fs: dlm: add WARN_ON for non waiter case Alexander Aring
@ 2022-06-14 17:59 ` Alexander Aring
0 siblings, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2022-06-14 17:59 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On Fri, Jun 10, 2022 at 1:06 PM Alexander Aring <aahringo@redhat.com> wrote:
>
> This patch adds a WARN_ON for the case where recovery hits a critical
> error but no caller is waiting in dlm_new_lockspace(); this can occur
> e.g. if a node got fenced. The WARN_ON signals us to investigate that
> case, since it should not occur.
>
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
> fs/dlm/recoverd.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
> index eeb221c175a2..240267568aab 100644
> --- a/fs/dlm/recoverd.c
> +++ b/fs/dlm/recoverd.c
> @@ -311,6 +311,7 @@ static void do_ls_recovery(struct dlm_ls *ls)
>
> /* let new_lockspace() get aware of critical error */
> ls->ls_recovery_result = error;
> + WARN_ON(completion_done(&ls->ls_recovery_done));
I will drop this patch; I think it can race because
dlm_new_lockspace() triggers recovery and then waits... the race is
unlikely, but I think the better approach here is to look at the debug
messages to see why recovery failed. The debug messages may need to be
improved depending on the case, and I will just send patches if any
information is missing.
- Alex
^ permalink raw reply [flat|nested] 10+ messages in thread