cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Alexander Aring <aahringo@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH RESEND v5.18-rc1 18/19] fs: dlm: cleanup lock handling in dlm_master_lookup
Date: Mon,  4 Apr 2022 16:06:45 -0400	[thread overview]
Message-ID: <20220404200646.3170301-19-aahringo@redhat.com> (raw)
In-Reply-To: <20220404200646.3170301-1-aahringo@redhat.com>

This patch will remove the following warning by sparse:

fs/dlm/lock.c:1049:9: warning: context imbalance in 'dlm_master_lookup' - different lock contexts for basic block

I tried to find any issues with the current handling and I did not find
any. However it is hard to follow the lock handling in this area of
dlm_master_lookup() and I suppose that sparse cannot realize that there
are no issues. The variable "toss_list" makes it really hard to follow
the lock handling because if it's set the rsb lock/refcount isn't held
but the ls->ls_rsbtbl[b].lock is held and this is one reason why the rsb
lock/refcount does not need to be held. If it's not set the
ls->ls_rsbtbl[b].lock is not held but the rsb lock/refcount is held. The
indicator of toss_list will be used to store the actual lock state.
Another possibility is that a retry can happen and then it's hard to
follow the specific code part. I did not find any issues but sparse
cannot realize that there are no issues.

To make it more easier to understand for developers and sparse as well,
we remove the toss_list variable which indicates a specific lock state
and move handling in between of this lock state in a separate function.
This function can be called now in case when the initial lock states are
taken which was previously signalled if toss_list was set or not. The
advantage here is that we can release all locks/refcounts in mostly the
same code block as it was taken.

Afterwards sparse had no issues to figure out that there are no problems
with the current lock behaviour.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lock.c | 189 +++++++++++++++++++++++++++-----------------------
 1 file changed, 102 insertions(+), 87 deletions(-)

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 25468b5e65ad..29e80039e7ca 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -880,6 +880,88 @@ static int validate_master_nodeid(struct dlm_ls *ls, struct dlm_rsb *r,
 	}
 }
 
+static void __dlm_master_lookup(struct dlm_ls *ls, struct dlm_rsb *r, int our_nodeid,
+				int from_nodeid, bool toss_list, unsigned int flags,
+				int *r_nodeid, int *result)
+{
+	int fix_master = (flags & DLM_LU_RECOVER_MASTER);
+	int from_master = (flags & DLM_LU_RECOVER_DIR);
+
+	if (r->res_dir_nodeid != our_nodeid) {
+		/* should not happen, but may as well fix it and carry on */
+		log_error(ls, "%s res_dir %d our %d %s", __func__,
+			  r->res_dir_nodeid, our_nodeid, r->res_name);
+		r->res_dir_nodeid = our_nodeid;
+	}
+
+	if (fix_master && dlm_is_removed(ls, r->res_master_nodeid)) {
+		/* Recovery uses this function to set a new master when
+		 * the previous master failed.  Setting NEW_MASTER will
+		 * force dlm_recover_masters to call recover_master on this
+		 * rsb even though the res_nodeid is no longer removed.
+		 */
+
+		r->res_master_nodeid = from_nodeid;
+		r->res_nodeid = from_nodeid;
+		rsb_set_flag(r, RSB_NEW_MASTER);
+
+		if (toss_list) {
+			/* I don't think we should ever find it on toss list. */
+			log_error(ls, "%s fix_master on toss", __func__);
+			dlm_dump_rsb(r);
+		}
+	}
+
+	if (from_master && (r->res_master_nodeid != from_nodeid)) {
+		/* this will happen if from_nodeid became master during
+		 * a previous recovery cycle, and we aborted the previous
+		 * cycle before recovering this master value
+		 */
+
+		log_limit(ls, "%s from_master %d master_nodeid %d res_nodeid %d first %x %s",
+			  __func__, from_nodeid, r->res_master_nodeid,
+			  r->res_nodeid, r->res_first_lkid, r->res_name);
+
+		if (r->res_master_nodeid == our_nodeid) {
+			log_error(ls, "from_master %d our_master", from_nodeid);
+			dlm_dump_rsb(r);
+			goto ret_assign;
+		}
+
+		r->res_master_nodeid = from_nodeid;
+		r->res_nodeid = from_nodeid;
+		rsb_set_flag(r, RSB_NEW_MASTER);
+	}
+
+	if (!r->res_master_nodeid) {
+		/* this will happen if recovery happens while we're looking
+		 * up the master for this rsb
+		 */
+
+		log_debug(ls, "%s master 0 to %d first %x %s", __func__,
+			  from_nodeid, r->res_first_lkid, r->res_name);
+		r->res_master_nodeid = from_nodeid;
+		r->res_nodeid = from_nodeid;
+	}
+
+	if (!from_master && !fix_master &&
+	    (r->res_master_nodeid == from_nodeid)) {
+		/* this can happen when the master sends remove, the dir node
+		 * finds the rsb on the keep list and ignores the remove,
+		 * and the former master sends a lookup
+		 */
+
+		log_limit(ls, "%s from master %d flags %x first %x %s",
+			  __func__, from_nodeid, flags, r->res_first_lkid,
+			  r->res_name);
+	}
+
+ ret_assign:
+	*r_nodeid = r->res_master_nodeid;
+	if (result)
+		*result = DLM_LU_MATCH;
+}
+
 /*
  * We're the dir node for this res and another node wants to know the
  * master nodeid.  During normal operation (non recovery) this is only
@@ -914,10 +996,8 @@ int dlm_master_lookup(struct dlm_ls *ls, int from_nodeid, char *name, int len,
 {
 	struct dlm_rsb *r = NULL;
 	uint32_t hash, b;
-	int from_master = (flags & DLM_LU_RECOVER_DIR);
-	int fix_master = (flags & DLM_LU_RECOVER_MASTER);
 	int our_nodeid = dlm_our_nodeid();
-	int dir_nodeid, error, toss_list = 0;
+	int dir_nodeid, error;
 
 	if (len > DLM_RESNAME_MAXLEN)
 		return -EINVAL;
@@ -949,103 +1029,38 @@ int dlm_master_lookup(struct dlm_ls *ls, int from_nodeid, char *name, int len,
 	error = dlm_search_rsb_tree(&ls->ls_rsbtbl[b].keep, name, len, &r);
 	if (!error) {
 		/* because the rsb is active, we need to lock_rsb before
-		   checking/changing re_master_nodeid */
+		 * checking/changing re_master_nodeid
+		 */
 
 		hold_rsb(r);
 		spin_unlock(&ls->ls_rsbtbl[b].lock);
 		lock_rsb(r);
-	} else {
-		error = dlm_search_rsb_tree(&ls->ls_rsbtbl[b].toss, name, len, &r);
-		if (error)
-			goto not_found;
-
-		/* because the rsb is inactive (on toss list), it's not refcounted
-		 * and lock_rsb is not used, but is protected by the rsbtbl lock
-		 */
-
-		toss_list = 1;
-	}
-
-	if (r->res_dir_nodeid != our_nodeid) {
-		/* should not happen, but may as well fix it and carry on */
-		log_error(ls, "dlm_master_lookup res_dir %d our %d %s",
-			  r->res_dir_nodeid, our_nodeid, r->res_name);
-		r->res_dir_nodeid = our_nodeid;
-	}
-
-	if (fix_master && dlm_is_removed(ls, r->res_master_nodeid)) {
-		/* Recovery uses this function to set a new master when
-		   the previous master failed.  Setting NEW_MASTER will
-		   force dlm_recover_masters to call recover_master on this
-		   rsb even though the res_nodeid is no longer removed. */
-
-		r->res_master_nodeid = from_nodeid;
-		r->res_nodeid = from_nodeid;
-		rsb_set_flag(r, RSB_NEW_MASTER);
-
-		if (toss_list) {
-			/* I don't think we should ever find it on toss list. */
-			log_error(ls, "dlm_master_lookup fix_master on toss");
-			dlm_dump_rsb(r);
-		}
-	}
 
-	if (from_master && (r->res_master_nodeid != from_nodeid)) {
-		/* this will happen if from_nodeid became master during
-		   a previous recovery cycle, and we aborted the previous
-		   cycle before recovering this master value */
+		__dlm_master_lookup(ls, r, our_nodeid, from_nodeid, false,
+				    flags, r_nodeid, result);
 
-		log_limit(ls, "dlm_master_lookup from_master %d "
-			  "master_nodeid %d res_nodeid %d first %x %s",
-			  from_nodeid, r->res_master_nodeid, r->res_nodeid,
-			  r->res_first_lkid, r->res_name);
-
-		if (r->res_master_nodeid == our_nodeid) {
-			log_error(ls, "from_master %d our_master", from_nodeid);
-			dlm_dump_rsb(r);
-			goto out_found;
-		}
+		/* the rsb was active */
+		unlock_rsb(r);
+		put_rsb(r);
 
-		r->res_master_nodeid = from_nodeid;
-		r->res_nodeid = from_nodeid;
-		rsb_set_flag(r, RSB_NEW_MASTER);
+		return 0;
 	}
 
-	if (!r->res_master_nodeid) {
-		/* this will happen if recovery happens while we're looking
-		   up the master for this rsb */
-
-		log_debug(ls, "dlm_master_lookup master 0 to %d first %x %s",
-			  from_nodeid, r->res_first_lkid, r->res_name);
-		r->res_master_nodeid = from_nodeid;
-		r->res_nodeid = from_nodeid;
-	}
+	error = dlm_search_rsb_tree(&ls->ls_rsbtbl[b].toss, name, len, &r);
+	if (error)
+		goto not_found;
 
-	if (!from_master && !fix_master &&
-	    (r->res_master_nodeid == from_nodeid)) {
-		/* this can happen when the master sends remove, the dir node
-		   finds the rsb on the keep list and ignores the remove,
-		   and the former master sends a lookup */
+	/* because the rsb is inactive (on toss list), it's not refcounted
+	 * and lock_rsb is not used, but is protected by the rsbtbl lock
+	 */
 
-		log_limit(ls, "dlm_master_lookup from master %d flags %x "
-			  "first %x %s", from_nodeid, flags,
-			  r->res_first_lkid, r->res_name);
-	}
+	__dlm_master_lookup(ls, r, our_nodeid, from_nodeid, true, flags,
+			    r_nodeid, result);
 
- out_found:
-	*r_nodeid = r->res_master_nodeid;
-	if (result)
-		*result = DLM_LU_MATCH;
+	r->res_toss_time = jiffies;
+	/* the rsb was inactive (on toss list) */
+	spin_unlock(&ls->ls_rsbtbl[b].lock);
 
-	if (toss_list) {
-		r->res_toss_time = jiffies;
-		/* the rsb was inactive (on toss list) */
-		spin_unlock(&ls->ls_rsbtbl[b].lock);
-	} else {
-		/* the rsb was active */
-		unlock_rsb(r);
-		put_rsb(r);
-	}
 	return 0;
 
  not_found:
-- 
2.31.1


  parent reply	other threads:[~2022-04-04 20:06 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-04 20:06 [Cluster-devel] [PATCH RESEND v5.18-rc1 00/19] fs: dlm: resend patches for v5.18-rc1 Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 01/19] fs: dlm: uninitialized variable on error in dlm_listen_for_all() Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 02/19] fs: dlm: fix missing check in validate_lock_args Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 03/19] fs: dlm: fix plock invalid read Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 04/19] fs: dlm: replace sanity checks with WARN_ON Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 05/19] fs: dlm: cleanup plock_op vs plock_xop Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 06/19] fs: dlm: rearrange async condition return Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 07/19] fs: dlm: improve plock logging if interrupted Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 08/19] fs: dlm: remove unnecessary INIT_LIST_HEAD() Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 09/19] fs: dlm: move global to static inits Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 10/19] fs: dlm: add __CHECKER__ for false positives Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 11/19] fs: dlm: use __le types for options header Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 12/19] fs: dlm: use __le types for dlm header Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 13/19] fs: dlm: use __le types for rcom messages Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 14/19] fs: dlm: use __le types for dlm messages Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 15/19] fs: dlm: move conversion to compile time Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 16/19] fs: dlm: remove __user conversion warnings Alexander Aring
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 17/19] fs: dlm: remove found label in dlm_master_lookup Alexander Aring
2022-04-04 20:06 ` Alexander Aring [this message]
2022-04-04 20:06 ` [Cluster-devel] [PATCH RESEND v5.18-rc1 19/19] fs: dlm: check required context while close Alexander Aring

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220404200646.3170301-19-aahringo@redhat.com \
    --to=aahringo@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).