* [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase
@ 2023-01-12 22:18 Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 1/8] fs: dlm: bring back previously shutdown handling Alexander Aring
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Alexander Aring @ 2023-01-12 22:18 UTC
To: cluster-devel.redhat.com
Hi,

I previously sent some patches which should be queued up for stable.
This series tries to improve the behaviour of the midcomms layer in a
very specific corner case: invoking stop/start of midcomms/lowcomms in a
loop.
I am using the dlm_locktorture module, which isn't upstream but can be
found at [0]. With it built as a module, running:

while true;do modprobe dlm_locktorture cluster=gohan;rmmod dlm_locktorture;done

on all cluster nodes invokes the testcase, which exercises the specific
corner case of starting and stopping midcomms/lowcomms again and again
on all cluster nodes. In between, some locking requests may be
happening.
These patches make this specific corner case easier to handle. It's
still not perfect, and in the future we should look again into this
behaviour of adding/removing lockspaces in a loop. However, it is a
narrow corner case which probably does not occur often in practice. We
might also need to simplify the whole process and improve on how dlm
version detection currently behaves.
- Alex
[0] https://gitlab.com/netcoder/linux-public/-/commit/cb67a3c99f0eb1e5860a9e27015e6dbd00bbcc34
Alexander Aring (8):
fs: dlm: bring back previously shutdown handling
fs: dlm: change to ignore unexpected non dlm opts msgs
fs: dlm: wait until all midcomms nodes detect version
fs: dlm: make dlm sequence id handling more robust
fs: dlm: reduce the timeout time to 5 secs
fs: dlm: remove newline in log_print
fs: dlm: move state change into else branch
fs: dlm: remove unnecessary wake_up() calls
fs/dlm/lockspace.c | 5 ++-
fs/dlm/lowcomms.c | 77 +++++++++++++++++++++++++++++++++-------------
fs/dlm/midcomms.c | 74 ++++++++++++++++++++++++--------------------
fs/dlm/midcomms.h | 1 +
4 files changed, 102 insertions(+), 55 deletions(-)
--
2.31.1
* [Cluster-devel] [PATCH dlm/next 1/8] fs: dlm: bring back previously shutdown handling
2023-01-12 22:18 [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase Alexander Aring
@ 2023-01-12 22:18 ` Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 2/8] fs: dlm: change to ignore unexpected non dlm opts msgs Alexander Aring
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Alexander Aring @ 2023-01-12 22:18 UTC
To: cluster-devel.redhat.com
This patch mostly reverts commit 4f567acb0b86 ("fs: dlm: remove socket
shutdown handling"). There can be situations where the dlm midcomms node
hash and the lowcomms connection hash are not equal, but we need to
guarantee that all lowcomms connections are closed on the last release
of a dlm lockspace, i.e. when a shutdown is invoked. This patch
guarantees that we always close all sockets managed by the lowcomms
connection hash, and calls shutdown after the last dlm message has been
sent, so that we don't cut the socket while data is in flight and the
peer potentially gets a connection reset.

In the future we should try to merge the midcomms/lowcomms hashes into
one hash and not handle both separately.
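In essence the new shutdown_connection() below boils down to the usual
TCP half-close sequence; a sketch with locking and error logging
trimmed:

	/* half-close: stop sending, but keep the socket open so queued
	 * data still drains to the peer instead of triggering a reset
	 */
	ret = kernel_sock_shutdown(con->sock, SHUT_WR);
	if (ret == 0) {
		/* wait for the receive worker to see EOF and close the
		 * socket; fall back to a hard close on timeout
		 */
		if (!wait_event_timeout(con->shutdown_wait, !con->sock,
					DLM_SHUTDOWN_WAIT_TIMEOUT))
			close_connection(con, false);
	}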
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/lowcomms.c | 77 ++++++++++++++++++++++++++++++++++-------------
fs/dlm/midcomms.c | 20 +++++-------
2 files changed, 63 insertions(+), 34 deletions(-)
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 4450721ec83c..61cd6c2628fa 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -61,6 +61,7 @@
#include "memory.h"
#include "config.h"
+#define DLM_SHUTDOWN_WAIT_TIMEOUT msecs_to_jiffies(5000)
#define NEEDED_RMEM (4*1024*1024)
struct connection {
@@ -99,6 +100,7 @@ struct connection {
struct connection *othercon;
struct work_struct rwork; /* receive worker */
struct work_struct swork; /* send worker */
+ wait_queue_head_t shutdown_wait;
unsigned char rx_leftover_buf[DLM_MAX_SOCKET_BUFSIZE];
int rx_leftover;
int mark;
@@ -282,6 +284,7 @@ static void dlm_con_init(struct connection *con, int nodeid)
INIT_WORK(&con->swork, process_send_sockets);
INIT_WORK(&con->rwork, process_recv_sockets);
spin_lock_init(&con->addrs_lock);
+ init_waitqueue_head(&con->shutdown_wait);
}
/*
@@ -790,6 +793,43 @@ static void close_connection(struct connection *con, bool and_other)
up_write(&con->sock_lock);
}
+static void shutdown_connection(struct connection *con, bool and_other)
+{
+ int ret;
+
+ if (con->othercon && and_other)
+ shutdown_connection(con->othercon, false);
+
+ flush_workqueue(io_workqueue);
+ down_read(&con->sock_lock);
+ /* nothing to shutdown */
+ if (!con->sock) {
+ up_read(&con->sock_lock);
+ return;
+ }
+
+ ret = kernel_sock_shutdown(con->sock, SHUT_WR);
+ up_read(&con->sock_lock);
+ if (ret) {
+ log_print("Connection %p failed to shutdown: %d will force close",
+ con, ret);
+ goto force_close;
+ } else {
+ ret = wait_event_timeout(con->shutdown_wait, !con->sock,
+ DLM_SHUTDOWN_WAIT_TIMEOUT);
+ if (ret == 0) {
+ log_print("Connection %p shutdown timed out, will force close",
+ con);
+ goto force_close;
+ }
+ }
+
+ return;
+
+force_close:
+ close_connection(con, false);
+}
+
static struct processqueue_entry *new_processqueue_entry(int nodeid,
int buflen)
{
@@ -1488,6 +1528,7 @@ static void process_recv_sockets(struct work_struct *work)
break;
case DLM_IO_EOF:
close_connection(con, false);
+ wake_up(&con->shutdown_wait);
/* CF_RECV_PENDING cleared */
break;
case DLM_IO_RESCHED:
@@ -1695,6 +1736,9 @@ static int work_start(void)
void dlm_lowcomms_shutdown(void)
{
+ struct connection *con;
+ int i, idx;
+
/* stop lowcomms_listen_data_ready calls */
lock_sock(listen_con.sock->sk);
listen_con.sock->sk->sk_data_ready = listen_sock.sk_data_ready;
@@ -1703,29 +1747,20 @@ void dlm_lowcomms_shutdown(void)
cancel_work_sync(&listen_con.rwork);
dlm_close_sock(&listen_con.sock);
- flush_workqueue(process_workqueue);
-}
-
-void dlm_lowcomms_shutdown_node(int nodeid, bool force)
-{
- struct connection *con;
- int idx;
-
idx = srcu_read_lock(&connections_srcu);
- con = nodeid2con(nodeid, 0);
- if (WARN_ON_ONCE(!con)) {
- srcu_read_unlock(&connections_srcu, idx);
- return;
- }
+ for (i = 0; i < CONN_HASH_SIZE; i++) {
+ hlist_for_each_entry_rcu(con, &connection_hash[i], list) {
+ shutdown_connection(con, true);
+ stop_connection_io(con);
+ flush_workqueue(process_workqueue);
+ close_connection(con, true);
- flush_work(&con->swork);
- stop_connection_io(con);
- WARN_ON_ONCE(!force && !list_empty(&con->writequeue));
- close_connection(con, true);
- clean_one_writequeue(con);
- if (con->othercon)
- clean_one_writequeue(con->othercon);
- allow_connection_io(con);
+ clean_one_writequeue(con);
+ if (con->othercon)
+ clean_one_writequeue(con->othercon);
+ allow_connection_io(con);
+ }
+ }
srcu_read_unlock(&connections_srcu, idx);
}
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index ecfb3beb0bb8..ecd81018d1cf 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -1418,8 +1418,7 @@ static void midcomms_shutdown(struct midcomms_node *node)
break;
case DLM_CLOSED:
/* we have what we want */
- spin_unlock(&node->state_lock);
- return;
+ break;
default:
/* busy to enter DLM_FIN_WAIT1, wait until passive
* done in shutdown_wait to enter DLM_CLOSED.
@@ -1436,17 +1435,12 @@ static void midcomms_shutdown(struct midcomms_node *node)
node->state == DLM_CLOSED ||
test_bit(DLM_NODE_FLAG_CLOSE, &node->flags),
DLM_SHUTDOWN_TIMEOUT);
- if (!ret || test_bit(DLM_NODE_FLAG_CLOSE, &node->flags)) {
+ if (!ret || test_bit(DLM_NODE_FLAG_CLOSE, &node->flags))
pr_debug("active shutdown timed out for node %d with state %s\n",
node->nodeid, dlm_state_str(node->state));
- midcomms_node_reset(node);
- dlm_lowcomms_shutdown_node(node->nodeid, true);
- return;
- }
-
- pr_debug("active shutdown done for node %d with state %s\n",
- node->nodeid, dlm_state_str(node->state));
- dlm_lowcomms_shutdown_node(node->nodeid, false);
+ else
+ pr_debug("active shutdown done for node %d with state %s\n",
+ node->nodeid, dlm_state_str(node->state));
}
void dlm_midcomms_shutdown(void)
@@ -1454,8 +1448,6 @@ void dlm_midcomms_shutdown(void)
struct midcomms_node *node;
int i, idx;
- dlm_lowcomms_shutdown();
-
mutex_lock(&close_lock);
idx = srcu_read_lock(&nodes_srcu);
for (i = 0; i < CONN_HASH_SIZE; i++) {
@@ -1473,6 +1465,8 @@ void dlm_midcomms_shutdown(void)
}
srcu_read_unlock(&nodes_srcu, idx);
mutex_unlock(&close_lock);
+
+ dlm_lowcomms_shutdown();
}
int dlm_midcomms_close(int nodeid)
--
2.31.1
* [Cluster-devel] [PATCH dlm/next 2/8] fs: dlm: change to ignore unexpected non dlm opts msgs
2023-01-12 22:18 [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 1/8] fs: dlm: bring back previously shutdown handling Alexander Aring
@ 2023-01-12 22:18 ` Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 3/8] fs: dlm: wait until all midcomms nodes detect version Alexander Aring
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Alexander Aring @ 2023-01-12 22:18 UTC
To: cluster-devel.redhat.com
This patch changes how we handle unexpected RCOM_NAMES/RCOM_STATUS
messages. Those messages exist for version detection, because they are
the first messages exchanged and are used to detect the dlm version. Due
to some DLM behaviour, and to stay backwards compatible, those messages
are not part of the new reliable DLM OPTS encapsulation header. On the
other hand, those messages have their own retransmit handling with a
sequence number match, meaning that if the sequence number doesn't
match, the DLM RCOM application layer will filter them out. So when we
get an unexpected non dlm opts msg we should allow it, and DLM RCOM will
filter it out if the sequence number does not match.
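For reference, the filter this relies on sits in the RCOM layer;
receive_sync_reply() in fs/dlm/rcom.c does roughly the following (names
approximate, see the actual source):

	/* a reply whose id doesn't match the current rcom sequence is
	 * simply rejected by the application layer
	 */
	spin_lock(&ls->ls_rcom_spin);
	if (!test_bit(LSFL_RCOM_WAIT, &ls->ls_flags) ||
	    rc_in->rc_id != ls->ls_rcom_seq)
		goto out;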
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/midcomms.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index ecd81018d1cf..dbc998b2748b 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -606,16 +606,8 @@ dlm_midcomms_recv_node_lookup(int nodeid, const union dlm_packet *p,
case DLM_ESTABLISHED:
break;
default:
- /* some invalid state passive shutdown
- * was failed, we try to reset and
- * hope it will go on.
- */
- log_print("reset node %d because shutdown stuck",
- node->nodeid);
-
- midcomms_node_reset(node);
- node->state = DLM_ESTABLISHED;
- break;
+ spin_unlock(&node->state_lock);
+ return NULL;
}
spin_unlock(&node->state_lock);
}
--
2.31.1
* [Cluster-devel] [PATCH dlm/next 3/8] fs: dlm: wait until all midcomms nodes detect version
2023-01-12 22:18 [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 1/8] fs: dlm: bring back previously shutdown handling Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 2/8] fs: dlm: change to ignore unexpected non dlm opts msgs Alexander Aring
@ 2023-01-12 22:18 ` Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 4/8] fs: dlm: make dlm sequence id handling more robust Alexander Aring
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Alexander Aring @ 2023-01-12 22:18 UTC
To: cluster-devel.redhat.com
The current dlm version detection is very complex due to backwards
compatibility with earlier dlm protocol versions. It takes some time to
detect whether a peer node has a specific DLM version; if it's not
known, we just cut the socket connection. Now there can be cases where
the local node hasn't detected the version field yet but the peer node
has, while we are trying to shut down the dlm connection with a specific
mechanism to avoid synchronization issues when pending cluster lockspace
memberships are still on the wire.

To make this more robust we introduce a "best effort" wait for version
detection before shutting down the dlm connection. This needs to be done
before the recovery kthread recoverd is stopped, because recovery
handling triggers enough messages to get version detection done.

It is a corner case which was detected by running modprobe
dlm_locktorture and rmmod dlm_locktorture directly afterwards, in a
loop. In practice probably nobody would leave a lockspace immediately
after joining it.
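Condensed, the new wait pairs with the existing version detection points
like this (sketch; flag handling and SRCU iteration as in the diff
below):

	/* detection side, in the midcomms receive path */
	node->version = DLM_VERSION_3_2;
	wake_up(&node->shutdown_wait);

	/* shutdown side: best-effort, bounded wait per node */
	wait_event_timeout(node->shutdown_wait,
			   node->version != DLM_VERSION_NOT_SET ||
			   node->state == DLM_CLOSED,
			   DLM_SHUTDOWN_TIMEOUT);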
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/lockspace.c | 3 +++
fs/dlm/midcomms.c | 23 +++++++++++++++++++++++
fs/dlm/midcomms.h | 1 +
3 files changed, 27 insertions(+)
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 99bc96f90779..d9dc0b734002 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -820,6 +820,9 @@ static int release_lockspace(struct dlm_ls *ls, int force)
return rv;
}
+ if (ls_count == 1)
+ dlm_midcomms_version_wait();
+
dlm_device_deregister(ls);
if (force < 3 && dlm_user_daemon_available())
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index dbc998b2748b..cf91a5a11b4f 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -657,6 +657,7 @@ static int dlm_midcomms_version_check_3_2(struct midcomms_node *node)
switch (node->version) {
case DLM_VERSION_NOT_SET:
node->version = DLM_VERSION_3_2;
+ wake_up(&node->shutdown_wait);
log_print("version 0x%08x for node %d detected", DLM_VERSION_3_2,
node->nodeid);
break;
@@ -826,6 +827,7 @@ static int dlm_midcomms_version_check_3_1(struct midcomms_node *node)
switch (node->version) {
case DLM_VERSION_NOT_SET:
node->version = DLM_VERSION_3_1;
+ wake_up(&node->shutdown_wait);
log_print("version 0x%08x for node %d detected", DLM_VERSION_3_1,
node->nodeid);
break;
@@ -1386,6 +1388,27 @@ static void midcomms_node_release(struct rcu_head *rcu)
kfree(node);
}
+void dlm_midcomms_version_wait(void)
+{
+ struct midcomms_node *node;
+ int i, idx, ret;
+
+ idx = srcu_read_lock(&nodes_srcu);
+ for (i = 0; i < CONN_HASH_SIZE; i++) {
+ hlist_for_each_entry_rcu(node, &node_hash[i], hlist) {
+ ret = wait_event_timeout(node->shutdown_wait,
+ node->version != DLM_VERSION_NOT_SET ||
+ node->state == DLM_CLOSED ||
+ test_bit(DLM_NODE_FLAG_CLOSE, &node->flags),
+ DLM_SHUTDOWN_TIMEOUT);
+ if (!ret || test_bit(DLM_NODE_FLAG_CLOSE, &node->flags))
+ pr_debug("version wait timed out for node %d with state %s\n",
+ node->nodeid, dlm_state_str(node->state));
+ }
+ }
+ srcu_read_unlock(&nodes_srcu, idx);
+}
+
static void midcomms_shutdown(struct midcomms_node *node)
{
int ret;
diff --git a/fs/dlm/midcomms.h b/fs/dlm/midcomms.h
index bea1cee4279c..9f8c9605013d 100644
--- a/fs/dlm/midcomms.h
+++ b/fs/dlm/midcomms.h
@@ -20,6 +20,7 @@ struct dlm_mhandle *dlm_midcomms_get_mhandle(int nodeid, int len,
gfp_t allocation, char **ppc);
void dlm_midcomms_commit_mhandle(struct dlm_mhandle *mh, const void *name,
int namelen);
+void dlm_midcomms_version_wait(void);
int dlm_midcomms_close(int nodeid);
int dlm_midcomms_start(void);
void dlm_midcomms_stop(void);
--
2.31.1
* [Cluster-devel] [PATCH dlm/next 4/8] fs: dlm: make dlm sequence id handling more robust
2023-01-12 22:18 [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase Alexander Aring
` (2 preceding siblings ...)
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 3/8] fs: dlm: wait until all midcomms nodes detect version Alexander Aring
@ 2023-01-12 22:18 ` Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 5/8] fs: dlm: reduce the timeout time to 5 secs Alexander Aring
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Alexander Aring @ 2023-01-12 22:18 UTC
To: cluster-devel.redhat.com
While doing a lot of testing with joining a lockspace and leaving it
again, I realized that the sequence number put into some dlm messages
starts at zero whenever a lockspace has just been created. This sequence
number is checked to match message replies. To avoid issues, and to
easily spot sequence mismatches across different lockspace creations, we
change the sequence number to start at a random value.
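The effect, side by side (sketch):

	/* old: every lockspace incarnation starts counting at 0, so a
	 * stale reply from a previous create/release cycle can still
	 * match
	 */
	ls->ls_recover_seq = 0;

	/* new: incarnations start at independent random points, so a
	 * stale reply almost surely fails the sequence match
	 */
	ls->ls_recover_seq = get_random_u64();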
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/lockspace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index d9dc0b734002..9f344d76afa3 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -572,7 +572,7 @@ static int new_lockspace(const char *name, const char *cluster,
spin_lock_init(&ls->ls_rcom_spin);
get_random_bytes(&ls->ls_rcom_seq, sizeof(uint64_t));
ls->ls_recover_status = 0;
- ls->ls_recover_seq = 0;
+ ls->ls_recover_seq = get_random_u64();
ls->ls_recover_args = NULL;
init_rwsem(&ls->ls_in_recovery);
init_rwsem(&ls->ls_recv_active);
--
2.31.1
* [Cluster-devel] [PATCH dlm/next 5/8] fs: dlm: reduce the timeout time to 5 secs
2023-01-12 22:18 [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase Alexander Aring
` (3 preceding siblings ...)
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 4/8] fs: dlm: make dlm sequence id handling more robust Alexander Aring
@ 2023-01-12 22:18 ` Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 6/8] fs: dlm: remove newline in log_print Alexander Aring
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Alexander Aring @ 2023-01-12 22:18 UTC
To: cluster-devel.redhat.com
When a shutdown gets stuck for any reason, we now switch to a 5 second
timeout instead of being stuck for a long time, which can even trigger
HUNG_TASK reports about tasks stuck in the wait_event_timeout()
handling. In 5 seconds the shutdown should already be done in most
cases; if not, we time out and try to force the shutdown.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/midcomms.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index cf91a5a11b4f..3932c7cf0e7a 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -147,7 +147,7 @@
/* init value for sequence numbers for testing purpose only e.g. overflows */
#define DLM_SEQ_INIT 0
/* 3 minutes wait to sync ending of dlm */
-#define DLM_SHUTDOWN_TIMEOUT msecs_to_jiffies(3 * 60 * 1000)
+#define DLM_SHUTDOWN_TIMEOUT msecs_to_jiffies(5000)
#define DLM_VERSION_NOT_SET 0
struct midcomms_node {
--
2.31.1
* [Cluster-devel] [PATCH dlm/next 6/8] fs: dlm: remove newline in log_print
2023-01-12 22:18 [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase Alexander Aring
` (4 preceding siblings ...)
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 5/8] fs: dlm: reduce the timeout time to 5 secs Alexander Aring
@ 2023-01-12 22:18 ` Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 7/8] fs: dlm: move state change into else branch Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 8/8] fs: dlm: remove unnecessary wake_up() calls Alexander Aring
7 siblings, 0 replies; 9+ messages in thread
From: Alexander Aring @ 2023-01-12 22:18 UTC
To: cluster-devel.redhat.com
There is an API difference between log_print() and other printk()s with
respect to appending a newline. These newlines were added by mistake,
because log_print() already appends one.
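For context, log_print() in fs/dlm/dlm_internal.h is roughly:

	#define log_print(fmt, args...) \
		printk(KERN_ERR "dlm: "fmt"\n", ##args)

so a trailing "\n" in the format string produces an extra blank line in
the log.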
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/midcomms.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index 3932c7cf0e7a..b5720042ef9d 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -467,7 +467,7 @@ static void dlm_pas_fin_ack_rcv(struct midcomms_node *node)
break;
default:
spin_unlock(&node->state_lock);
- log_print("%s: unexpected state: %d\n",
+ log_print("%s: unexpected state: %d",
__func__, node->state);
WARN_ON_ONCE(1);
return;
@@ -540,7 +540,7 @@ static void dlm_midcomms_receive_buffer(union dlm_packet *p,
break;
default:
spin_unlock(&node->state_lock);
- log_print("%s: unexpected state: %d\n",
+ log_print("%s: unexpected state: %d",
__func__, node->state);
WARN_ON_ONCE(1);
return;
@@ -1269,7 +1269,7 @@ static void dlm_act_fin_ack_rcv(struct midcomms_node *node)
break;
default:
spin_unlock(&node->state_lock);
- log_print("%s: unexpected state: %d\n",
+ log_print("%s: unexpected state: %d",
__func__, node->state);
WARN_ON_ONCE(1);
return;
@@ -1369,7 +1369,7 @@ void dlm_midcomms_remove_member(int nodeid)
/* already gone, do nothing */
break;
default:
- log_print("%s: unexpected state: %d\n",
+ log_print("%s: unexpected state: %d",
__func__, node->state);
break;
}
--
2.31.1
* [Cluster-devel] [PATCH dlm/next 7/8] fs: dlm: move state change into else branch
2023-01-12 22:18 [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase Alexander Aring
` (5 preceding siblings ...)
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 6/8] fs: dlm: remove newline in log_print Alexander Aring
@ 2023-01-12 22:18 ` Alexander Aring
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 8/8] fs: dlm: remove unnecessary wake_up() calls Alexander Aring
7 siblings, 0 replies; 9+ messages in thread
From: Alexander Aring @ 2023-01-12 22:18 UTC
To: cluster-devel.redhat.com
Currently we first switch into DLM_CLOSE_WAIT state and then do another
state change if a condition is true. Instead of doing two state changes,
we move the DLM_CLOSE_WAIT state change into an else branch of this
condition, so only one transition happens.
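In short (sketch; no_more_users stands in for the actual check in the
diff):

	/* before: always enter DLM_CLOSE_WAIT, then possibly change again */
	node->state = DLM_CLOSE_WAIT;
	if (no_more_users)
		node->state = DLM_LAST_ACK;

	/* after: exactly one transition per received FIN */
	if (no_more_users)
		node->state = DLM_LAST_ACK;
	else
		node->state = DLM_CLOSE_WAIT;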
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/midcomms.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index b5720042ef9d..2a9177f3d468 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -506,9 +506,6 @@ static void dlm_midcomms_receive_buffer(union dlm_packet *p,
case DLM_ESTABLISHED:
dlm_send_ack(node->nodeid, node->seq_next);
- node->state = DLM_CLOSE_WAIT;
- pr_debug("switch node %d to state %s\n",
- node->nodeid, dlm_state_str(node->state));
/* passive shutdown DLM_LAST_ACK case 1
* additional we check if the node is used by
* cluster manager events at all.
@@ -519,6 +516,10 @@ static void dlm_midcomms_receive_buffer(union dlm_packet *p,
node->nodeid, dlm_state_str(node->state));
set_bit(DLM_NODE_FLAG_STOP_RX, &node->flags);
dlm_send_fin(node, dlm_pas_fin_ack_rcv);
+ } else {
+ node->state = DLM_CLOSE_WAIT;
+ pr_debug("switch node %d to state %s\n",
+ node->nodeid, dlm_state_str(node->state));
}
break;
case DLM_FIN_WAIT1:
--
2.31.1
* [Cluster-devel] [PATCH dlm/next 8/8] fs: dlm: remove unnecessary wake_up() calls
2023-01-12 22:18 [Cluster-devel] [PATCH dlm/next 0/8] fs: dlm: better handling for new stop/start testcase Alexander Aring
` (6 preceding siblings ...)
2023-01-12 22:18 ` [Cluster-devel] [PATCH dlm/next 7/8] fs: dlm: move state change into else branch Alexander Aring
@ 2023-01-12 22:18 ` Alexander Aring
7 siblings, 0 replies; 9+ messages in thread
From: Alexander Aring @ 2023-01-12 22:18 UTC
To: cluster-devel.redhat.com
The wake_up() is already handled inside midcomms_node_reset() when
switching the state to DLM_CLOSED. So there is no need to call it again
after midcomms_node_reset().
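For reference, the tail of midcomms_node_reset() already does the
wake-up, roughly:

	node->state = DLM_CLOSED;
	/* ... reset sequence state ... */
	wake_up(&node->shutdown_wait);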
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
fs/dlm/midcomms.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index 2a9177f3d468..42ab77f0b28b 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -534,7 +534,6 @@ static void dlm_midcomms_receive_buffer(union dlm_packet *p,
midcomms_node_reset(node);
pr_debug("switch node %d to state %s\n",
node->nodeid, dlm_state_str(node->state));
- wake_up(&node->shutdown_wait);
break;
case DLM_LAST_ACK:
/* probably remove_member caught it, do nothing */
@@ -1262,7 +1261,6 @@ static void dlm_act_fin_ack_rcv(struct midcomms_node *node)
midcomms_node_reset(node);
pr_debug("switch node %d to state %s\n",
node->nodeid, dlm_state_str(node->state));
- wake_up(&node->shutdown_wait);
break;
case DLM_CLOSED:
/* not valid but somehow we got what we want */
--
2.31.1