cluster-devel.redhat.com archive mirror
From: Alexander Aring <aahringo@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [RFC dlm/next 13/15] fs: dlm: move writequeue init to sendcon only
Date: Wed, 23 Jun 2021 11:14:52 -0400	[thread overview]
Message-ID: <20210623151454.176649-14-aahringo@redhat.com> (raw)
In-Reply-To: <20210623151454.176649-1-aahringo@redhat.com>

This patch initializes the writequeue (send) functionality only for
the sendcon and no longer for the othercon. If we make a mistake by
accident, we can see it when the kernel crashes because the othercon
was queued for transmitting, which should never be the case. Also add
a comment about the othercon handling, why it exists, and how it could
possibly be removed at the cost of breaking backwards compatibility.
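
The tie-break idea described in the TODO comment added by this patch
can be sketched as a standalone userspace snippet. Note this is only
an illustration of the idea, not kernel code; dlm_should_accept() is a
hypothetical helper and is NOT implemented by this patch, precisely
because the rule would not be backwards compatible:

```c
#include <stdbool.h>

/* Hypothetical tie-break for the simultaneous connect()/accept() race:
 * the node with the lower nodeid wins the accept(); the losing side
 * closes its incoming connection and reconnects on its own sendcon,
 * so an "othercon" never needs to be created. */
static bool dlm_should_accept(int own_nodeid, int peer_nodeid)
{
	return own_nodeid < peer_nodeid;
}
```

With this rule each pair of peers ends up with exactly one connection,
at the cost of rejecting connects from older nodes that still expect
the two-connection ("othercon") behaviour.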

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lowcomms.c | 61 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index ddf3c0c98386..e858453b4eb7 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -40,6 +40,29 @@
  * cluster-wide mechanism as it must be the same on all nodes of the cluster
  * for the DLM to function.
  *
+ * TODO:
+ *
+ * A special note about the "othercon" field in the connection structure:
+ * it is only set if we hit a race where two peers call connect() and
+ * accept() at the same time. If we did not accept the connection and
+ * closed it instead, the other peer would disconnect its non-"othercon"
+ * connection, which is known as "sendcon". It is named "sendcon" because
+ * sending must only ever be done over this connection. If we hit the
+ * race, the "othercon" is used for receiving only. Running "ss -t" shows
+ * two connections per peer in this case.
+ *
+ * Overall this makes the code quite confusing, and the code tries to use
+ * only the "sendcon" for resources such as mutexes. Because the race only
+ * sometimes occurs, moving "othercon" into its own struct, e.g.
+ * "struct connection_other", makes it difficult to handle the common case
+ * where we don't hit the race and receiving is done by "sendcon".
+ *
+ * Steve Whitehouse suggested getting rid of this race by introducing a
+ * prioritized accept() rule, e.g. $OWN_NODEID < $PEER_NODEID. If the
+ * condition is true we accept the connection, otherwise we trigger a
+ * reconnect to this peer (because the peer wants to connect again).
+ * However this is not backwards compatible and would break connection
+ * handling with nodes that still rely on the "othercon" handling.
  */
 
 #include <asm/ioctls.h>
@@ -283,16 +306,15 @@ static struct connection *__find_con(int nodeid, int r)
 
 static bool tcp_eof_condition(struct connection *con)
 {
+	if (test_bit(CF_IS_OTHERCON, &con->flags))
+		return false;
+
 	return atomic_read(&con->writequeue_cnt);
 }
 
 static void dlm_con_init(struct connection *con, int nodeid)
 {
 	con->nodeid = nodeid;
-	INIT_LIST_HEAD(&con->writequeue);
-	spin_lock_init(&con->writequeue_lock);
-	atomic_set(&con->writequeue_cnt, 0);
-	INIT_DELAYED_WORK(&con->swork, process_send_sockets);
 	INIT_WORK(&con->rwork, process_recv_sockets);
 	INIT_WORK(&con->cwork, process_close_sockets);
 	init_waitqueue_head(&con->shutdown_wait);
@@ -320,7 +342,12 @@ static struct connection *nodeid2con(int nodeid, gfp_t alloc)
 
 	mutex_init(&con->rwork_lock);
 	mutex_init(&con->swork_lock);
+
 	mutex_init(&con->wq_alloc);
+	INIT_LIST_HEAD(&con->writequeue);
+	spin_lock_init(&con->writequeue_lock);
+	atomic_set(&con->writequeue_cnt, 0);
+	INIT_DELAYED_WORK(&con->swork, process_send_sockets);
 
 	mutex_init(&con->process_lock);
 	INIT_LIST_HEAD(&con->processqueue);
@@ -813,20 +840,23 @@ static void close_connection(struct connection *con, bool and_other)
 	 * our policy is to start on a clean state when disconnects, we don't
 	 * know what's send/received on transport layer in this case.
 	 */
-	spin_lock(&con->writequeue_lock);
-	if (!list_empty(&con->writequeue)) {
-		e = list_first_entry(&con->writequeue, struct writequeue_entry,
-				     list);
-		if (e->dirty)
-			free_entry(e);
+	if (!test_bit(CF_IS_OTHERCON, &con->flags)) {
+		spin_lock(&con->writequeue_lock);
+		if (!list_empty(&con->writequeue)) {
+			e = list_first_entry(&con->writequeue, struct writequeue_entry,
+					     list);
+			if (e->dirty)
+				free_entry(e);
+		}
+		spin_unlock(&con->writequeue_lock);
+
+		con->retries = 0;
+		clear_bit(CF_APP_LIMITED, &con->flags);
+		clear_bit(CF_EOF, &con->flags);
 	}
-	spin_unlock(&con->writequeue_lock);
 
 	con->rx_leftover = 0;
-	con->retries = 0;
-	clear_bit(CF_APP_LIMITED, &con->flags);
 	clear_bit(CF_CONNECTED, &con->flags);
-	clear_bit(CF_EOF, &con->flags);
 
 	/* handling for tcp shutdown */
 	clear_bit(CF_SHUTDOWN, &con->flags);
@@ -1544,8 +1574,6 @@ int dlm_lowcomms_close(int nodeid)
 		set_bit(CF_CLOSE, &con->flags);
 		cancel_io_work(con, true);
 		clean_one_writequeue(con);
-		if (con->othercon)
-			clean_one_writequeue(con->othercon);
 	}
 	srcu_read_unlock(&connections_srcu, idx);
 
@@ -1781,7 +1809,6 @@ static void free_conn(struct connection *con)
 	hlist_del_rcu(&con->list);
 	spin_unlock(&connections_lock);
 	if (con->othercon) {
-		clean_one_writequeue(con->othercon);
 		call_srcu(&connections_srcu, &con->othercon->rcu,
 			  connection_release);
 	}
-- 
2.26.3
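
The invariant this patch establishes (only the sendcon owns a
writequeue; an othercon is receive-only) can be illustrated with a
small userspace model. The names mirror fs/dlm/lowcomms.c, but this
is a standalone sketch for illustration, not kernel code:

```c
#include <stdbool.h>

/* Standalone model: after this patch only a "sendcon" has writequeue
 * state, so an othercon must never report pending writes. */
struct model_con {
	bool is_othercon;	/* models CF_IS_OTHERCON */
	int writequeue_cnt;	/* models con->writequeue_cnt */
};

/* Mirrors the tcp_eof_condition() change: an othercon has no
 * writequeue, so it can never satisfy the EOF condition. */
static bool model_tcp_eof_condition(const struct model_con *con)
{
	if (con->is_othercon)
		return false;

	return con->writequeue_cnt != 0;
}
```

Under this model, queueing a send on an othercon is simply impossible,
which matches the commit's intent that such a mistake should surface
immediately rather than be silently tolerated.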



Thread overview: 17+ messages
2021-06-23 15:14 [Cluster-devel] [RFC dlm/next 00/15] fs: dlm: performance Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 01/15] fs: dlm: clear CF_APP_LIMITED on close Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 02/15] fs: dlm: introduce con_next_wq helper Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 03/15] fs: dlm: move to static proto ops Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 04/15] fs: dlm: introduce generic listen Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 05/15] fs: dlm: auto load sctp module Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 06/15] fs: dlm: generic connect func Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 07/15] fs: dlm: fix multiple empty writequeue alloc Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 08/15] fs: dlm: move receive loop into receive handler Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 09/15] fs: dlm: introduce io_workqueue Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 10/15] fs: dlm: introduce reconnect work Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 11/15] fs: dlm: introduce process workqueue Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 12/15] fs: dlm: remove send starve Alexander Aring
2021-06-23 15:14 ` Alexander Aring [this message]
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 14/15] fs: dlm: flush listen con Alexander Aring
2021-06-23 15:14 ` [Cluster-devel] [RFC dlm/next 15/15] fs: dlm: move srcu into loop call Alexander Aring
2021-06-23 21:31 ` [Cluster-devel] [RFC dlm/next 00/15] fs: dlm: performance Alexander Aring
