cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 2/6] dlm: fix race while closing connections
Date: Tue, 11 Aug 2015 19:22:21 -0300	[thread overview]
Message-ID: <743f99754d27595fa5322d72d524c507314676e2.1439330654.git.marcelo.leitner@gmail.com> (raw)
In-Reply-To: <cover.1439330654.git.marcelo.leitner@gmail.com>

When a connection have issues DLM may need to close it.  Therefore we
should also cancel pending workqueues for such connection at that time,
and not just when dlm is not willing to use this connection anymore.

Also, if we don't clear CF_CONNECT_PENDING flag, the error handling
routines won't be able to re-connect as lowcomms_connect_sock() will
check for it.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 fs/dlm/lowcomms.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index bc04f5e3af7ac5fe107a7a26555777364de8bc15..749deb3b69b2932fb18e7ae70c06e4ced15bd9b6 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -514,17 +514,24 @@ static void make_sockaddr(struct sockaddr_storage *saddr, uint16_t port,
 }
 
 /* Close a remote connection and tidy up */
-static void close_connection(struct connection *con, bool and_other)
+static void close_connection(struct connection *con, bool and_other,
+			     bool tx, bool rx)
 {
-	mutex_lock(&con->sock_mutex);
+	clear_bit(CF_CONNECT_PENDING, &con->flags);
+	clear_bit(CF_WRITE_PENDING, &con->flags);
+	if (tx && cancel_work_sync(&con->swork))
+		log_print("canceled swork for node %d", con->nodeid);
+	if (rx && cancel_work_sync(&con->rwork))
+		log_print("canceled rwork for node %d", con->nodeid);
 
+	mutex_lock(&con->sock_mutex);
 	if (con->sock) {
 		sock_release(con->sock);
 		con->sock = NULL;
 	}
 	if (con->othercon && and_other) {
 		/* Will only re-enter once. */
-		close_connection(con->othercon, false);
+		close_connection(con->othercon, false, true, true);
 	}
 	if (con->rx_page) {
 		__free_page(con->rx_page);
@@ -902,7 +909,7 @@ out_resched:
 out_close:
 	mutex_unlock(&con->sock_mutex);
 	if (ret != -EAGAIN) {
-		close_connection(con, false);
+		close_connection(con, false, true, false);
 		/* Reconnect when there is something to send */
 	}
 	/* Don't return success if we really got EOF */
@@ -1622,7 +1629,7 @@ out:
 
 send_error:
 	mutex_unlock(&con->sock_mutex);
-	close_connection(con, false);
+	close_connection(con, false, false, true);
 	lowcomms_connect_sock(con);
 	return;
 
@@ -1654,15 +1661,9 @@ int dlm_lowcomms_close(int nodeid)
 	log_print("closing connection to node %d", nodeid);
 	con = nodeid2con(nodeid, 0);
 	if (con) {
-		clear_bit(CF_CONNECT_PENDING, &con->flags);
-		clear_bit(CF_WRITE_PENDING, &con->flags);
 		set_bit(CF_CLOSE, &con->flags);
-		if (cancel_work_sync(&con->swork))
-			log_print("canceled swork for node %d", nodeid);
-		if (cancel_work_sync(&con->rwork))
-			log_print("canceled rwork for node %d", nodeid);
+		close_connection(con, true, true, true);
 		clean_one_writequeue(con);
-		close_connection(con, true);
 	}
 
 	spin_lock(&dlm_node_addrs_spin);
@@ -1745,7 +1746,7 @@ static void stop_conn(struct connection *con)
 
 static void free_conn(struct connection *con)
 {
-	close_connection(con, true);
+	close_connection(con, true, true, true);
 	if (con->othercon)
 		kmem_cache_free(con_cache, con->othercon);
 	hlist_del(&con->list);
@@ -1816,7 +1817,7 @@ fail_unlisten:
 	dlm_allow_conn = 0;
 	con = nodeid2con(0,0);
 	if (con) {
-		close_connection(con, false);
+		close_connection(con, false, true, true);
 		kmem_cache_free(con_cache, con);
 	}
 fail_destroy:
-- 
2.4.3



  parent reply	other threads:[~2015-08-11 22:22 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1439330654.git.marcelo.leitner@gmail.com>
2015-08-11 22:22 ` [Cluster-devel] [PATCH 1/6] dlm: fix connection stealing if using SCTP Marcelo Ricardo Leitner
2015-08-11 22:22 ` Marcelo Ricardo Leitner [this message]
2015-08-11 22:22 ` [Cluster-devel] [PATCH 3/6] dlm: fix not reconnecting on connecting error handling Marcelo Ricardo Leitner
2015-08-11 22:22 ` [Cluster-devel] [PATCH 4/6] dlm: use sctp 1-to-1 API Marcelo Ricardo Leitner
2015-08-12 10:23   ` David Laight
2015-08-12 13:16     ` Marcelo Ricardo Leitner
2015-08-12 15:33       ` David Laight
2015-08-12 16:42         ` Marcelo Ricardo Leitner
2015-08-13  9:37           ` Steven Whitehouse
2015-08-13 12:13             ` Marcelo Ricardo Leitner
2015-08-11 22:22 ` [Cluster-devel] [PATCH 5/6] dlm: replace BUG_ON with a less severe handling Marcelo Ricardo Leitner
2015-08-11 22:22 ` [Cluster-devel] [PATCH 6/6] dlm: fix reconnecting but not sending data Marcelo Ricardo Leitner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=743f99754d27595fa5322d72d524c507314676e2.1439330654.git.marcelo.leitner@gmail.com \
    --to=marcelo.leitner@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).