From: Alexander Aring <aahringo@redhat.com>
To: teigland@redhat.com
Cc: gfs2@lists.linux.dev, aahringo@redhat.com
Subject: [PATCH v6.13-rc1 2/5] dlm: fix missing rsb put on scan timer
Date: Mon, 2 Dec 2024 10:26:38 -0500 [thread overview]
Message-ID: <20241202152641.3395369-2-aahringo@redhat.com> (raw)
In-Reply-To: <20241202152641.3395369-1-aahringo@redhat.com>
I figured out we don't cleanup all rsb on toss list right now if all
locks are unlocked in the cluster for a resource. After investigation
I figured out the condition to add_scan() is wrong and I added a comment
for the second important condition for all possible permuations of the
boolean expression.
The current expression:
"(r->res_master_nodeid != our_nodeid &&
dlm_dir_nodeid(r) != our_nodeid))"
it only covers the case when we are not the master and not the dir node
which is the second case of the added comment in 2. Case 1 and 3 are
forgotten.
Remove also the del_scan() as it always should remove the rsb out of the
scan timer when it's on, if it's not on del_scan() will do nothing.
For find_rsb_nodir() the del_scan() need to be moved before clearing the
RSB_INACTIVE flag as del_scan() is has a WARN_ON() on it.
Fixes: c217adfc8caa ("dlm: fix add_scan and del_scan usage")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/lock.c | 52 +++++++++++++++++++++++++++++++++++++--------------
1 file changed, 38 insertions(+), 14 deletions(-)
diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index fc1d710166e9..a8885cf82913 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -824,8 +824,11 @@ static int find_rsb_dir(struct dlm_ls *ls, const void *name, int len,
r->res_first_lkid = 0;
}
- /* A dir record will not be on the scan list. */
- if (r->res_dir_nodeid != our_nodeid)
+ /* A dir record will not be on the scan list.
+ * Except if we are the master of the rsb.
+ */
+ if (r->res_master_nodeid == our_nodeid ||
+ r->res_dir_nodeid != our_nodeid)
del_scan(ls, r);
list_move(&r->res_slow_list, &ls->ls_slow_active);
rsb_clear_flag(r, RSB_INACTIVE);
@@ -989,10 +992,10 @@ static int find_rsb_nodir(struct dlm_ls *ls, const void *name, int len,
r->res_nodeid = 0;
}
+ del_scan(ls, r);
list_move(&r->res_slow_list, &ls->ls_slow_active);
rsb_clear_flag(r, RSB_INACTIVE);
kref_init(&r->res_ref);
- del_scan(ls, r);
write_unlock_bh(&ls->ls_rsbtbl_lock);
goto out;
@@ -1337,9 +1340,13 @@ static int _dlm_master_lookup(struct dlm_ls *ls, int from_nodeid, const char *na
__dlm_master_lookup(ls, r, our_nodeid, from_nodeid, true, flags,
r_nodeid, result);
- /* A dir record rsb should never be on scan list. */
- /* Try to fix this with del_scan? */
- WARN_ON(!list_empty(&r->res_scan_list));
+ /* A dir record rsb should never be on scan list.
+ * Except when we are the dir and master node.
+ * This function should only be called by the dir
+ * node.
+ */
+ WARN_ON(!list_empty(&r->res_scan_list) &&
+ r->res_master_nodeid != our_nodeid);
write_unlock_bh(&ls->ls_rsbtbl_lock);
@@ -1431,15 +1438,32 @@ static void deactivate_rsb(struct kref *kref)
/*
* When the rsb becomes unused:
- * - If it's not a dir record for a remote master rsb,
- * then it is put on the scan list to be freed.
- * - If it's a dir record for a remote master rsb,
- * then it is kept in the inactive state until
- * receive_remove() from the master node.
+ * - It's on scan list when there is no directory
+ * node functionality
+ * or
+ * - First case is if we are the rsb master we need to
+ * call send_remove() to the dir node and delete
+ * ourself.
+ *
+ * The second case is if we are not the dir node we
+ * need to delete ourself as well but don't call
+ * send_remove() as we are not the master.
+ *
+ * The thrid case when the node is master and
+ * dir node at the same time. Then the rsb need to be
+ * added to the scan timer as well because the scan
+ * acts like a local send_remove() that is done
+ * immediately.
+ *
+ * The fourth case is the one we need to avoid here,
+ * when we are the directory node and not the master.
+ * This is the inverse of the condition below.
+ * Then we never should be on the scan timer and
+ * waiting of receive_remove() of the first case.
*/
- if (!dlm_no_directory(ls) &&
- (r->res_master_nodeid != our_nodeid) &&
- (dlm_dir_nodeid(r) != our_nodeid))
+ if (dlm_no_directory(ls) ||
+ (r->res_master_nodeid == our_nodeid ||
+ dlm_dir_nodeid(r) != our_nodeid))
add_scan(ls, r);
if (r->res_lvbptr) {
--
2.43.0
next prev parent reply other threads:[~2024-12-02 15:26 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-02 15:26 [PATCH v6.13-rc1 1/5] dlm: fix srcu_read_lock() return type to int Alexander Aring
2024-12-02 15:26 ` Alexander Aring [this message]
2024-12-02 15:26 ` [PATCH v6.13-rc1 3/5] dlm: add more sanity checks related scan timer Alexander Aring
2024-12-02 15:26 ` [PATCH v6.13-rc1 4/5] dlm: make sense out of force values Alexander Aring
2024-12-02 15:26 ` [PATCH v6.13-rc1 5/5] dlm: return -ENOENT if no comm was found Alexander Aring
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241202152641.3395369-2-aahringo@redhat.com \
--to=aahringo@redhat.com \
--cc=gfs2@lists.linux.dev \
--cc=teigland@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox