cluster-devel.redhat.com archive mirror
From: Alexander Aring <aahringo@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH dlm/next 3/8] fs: dlm: wait until all midcomms nodes detects version
Date: Thu, 12 Jan 2023 17:18:44 -0500
Message-ID: <20230112221849.1883104-4-aahringo@redhat.com>
In-Reply-To: <20230112221849.1883104-1-aahringo@redhat.com>

The current dlm version detection is very complex due to backwards
compatibility with earlier dlm protocol versions. It takes some time to
detect whether a peer node has a specific DLM version; if the version is
not known, we just cut the socket connection. There can be cases where
the node has not yet detected the version field but the peer node has
detected it, and we are trying to shut down the dlm connection with a
specific mechanism to avoid synchronization issues when pending cluster
lockspace memberships are still on the wire.

To make it more robust we introduce a "best effort" wait for the
version detection before shutting down the dlm connection. This needs to
be done before the recoverd kthread for recovery handling is stopped,
because recovery handling will trigger enough messages to get version
detection going.

It is a corner case which was detected by running modprobe
dlm_locktorture and rmmod dlm_locktorture directly afterwards (in a
loop). In practice probably nobody would leave a lockspace immediately
after joining it.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lockspace.c |  3 +++
 fs/dlm/midcomms.c  | 23 +++++++++++++++++++++++
 fs/dlm/midcomms.h  |  1 +
 3 files changed, 27 insertions(+)

diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 99bc96f90779..d9dc0b734002 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -820,6 +820,9 @@ static int release_lockspace(struct dlm_ls *ls, int force)
 		return rv;
 	}
 
+	if (ls_count == 1)
+		dlm_midcomms_version_wait();
+
 	dlm_device_deregister(ls);
 
 	if (force < 3 && dlm_user_daemon_available())
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index dbc998b2748b..cf91a5a11b4f 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -657,6 +657,7 @@ static int dlm_midcomms_version_check_3_2(struct midcomms_node *node)
 	switch (node->version) {
 	case DLM_VERSION_NOT_SET:
 		node->version = DLM_VERSION_3_2;
+		wake_up(&node->shutdown_wait);
 		log_print("version 0x%08x for node %d detected", DLM_VERSION_3_2,
 			  node->nodeid);
 		break;
@@ -826,6 +827,7 @@ static int dlm_midcomms_version_check_3_1(struct midcomms_node *node)
 	switch (node->version) {
 	case DLM_VERSION_NOT_SET:
 		node->version = DLM_VERSION_3_1;
+		wake_up(&node->shutdown_wait);
 		log_print("version 0x%08x for node %d detected", DLM_VERSION_3_1,
 			  node->nodeid);
 		break;
@@ -1386,6 +1388,27 @@ static void midcomms_node_release(struct rcu_head *rcu)
 	kfree(node);
 }
 
+void dlm_midcomms_version_wait(void)
+{
+	struct midcomms_node *node;
+	int i, idx, ret;
+
+	idx = srcu_read_lock(&nodes_srcu);
+	for (i = 0; i < CONN_HASH_SIZE; i++) {
+		hlist_for_each_entry_rcu(node, &node_hash[i], hlist) {
+			ret = wait_event_timeout(node->shutdown_wait,
+						 node->version != DLM_VERSION_NOT_SET ||
+						 node->state == DLM_CLOSED ||
+						 test_bit(DLM_NODE_FLAG_CLOSE, &node->flags),
+						 DLM_SHUTDOWN_TIMEOUT);
+			if (!ret || test_bit(DLM_NODE_FLAG_CLOSE, &node->flags))
+				pr_debug("version wait timed out for node %d with state %s\n",
+					 node->nodeid, dlm_state_str(node->state));
+		}
+	}
+	srcu_read_unlock(&nodes_srcu, idx);
+}
+
 static void midcomms_shutdown(struct midcomms_node *node)
 {
 	int ret;
diff --git a/fs/dlm/midcomms.h b/fs/dlm/midcomms.h
index bea1cee4279c..9f8c9605013d 100644
--- a/fs/dlm/midcomms.h
+++ b/fs/dlm/midcomms.h
@@ -20,6 +20,7 @@ struct dlm_mhandle *dlm_midcomms_get_mhandle(int nodeid, int len,
 					     gfp_t allocation, char **ppc);
 void dlm_midcomms_commit_mhandle(struct dlm_mhandle *mh, const void *name,
 				 int namelen);
+void dlm_midcomms_version_wait(void);
 int dlm_midcomms_close(int nodeid);
 int dlm_midcomms_start(void);
 void dlm_midcomms_stop(void);
-- 
2.31.1


Thread overview: 9+ messages
2023-01-12 22:18 [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 1/8] fs: dlm: bring back previously shutdown handling Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 2/8] fs: dlm: change to ignore unexpected non dlm opts msgs Alexander Aring
2023-01-12 22:18 ` Alexander Aring [this message]
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 4/8] fs: dlm: make dlm sequence id handling more robust Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 5/8] fs: dlm: reduce the timeout time to 5 secs Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 6/8] fs: dlm: remove newline in log_print Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 7/8] fs: dlm: move state change into else branch Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 8/8] fs: dlm: remove unnecessary waker_up() calls Alexander Aring