cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Alexander Aring <aahringo@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCHv3 dlm/next 18/20] fs: dlm: don't allow half transmitted messages
Date: Mon,  4 Jan 2021 16:00:22 -0500	[thread overview]
Message-ID: <20210104210024.233765-19-aahringo@redhat.com> (raw)
In-Reply-To: <20210104210024.233765-1-aahringo@redhat.com>

This patch will clean a dirty page buffer if a reconnect occurs. If a page
buffer was half transmitted we cannot start inside the middle of a dlm
message if a node connects again. I observed invalid length receptions
errors and was guessing that this behaviour occurs, after this patch I
never saw an invalid message length again. This patch might drops more
messages for dlm version 3.1 but 3.1 can't deal with half messages as
well, for 3.2 it might trigger more re-transmissions but will not leave dlm
in a broken state.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lowcomms.c | 84 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 54 insertions(+), 30 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 0e4cbabc680f..44a34becb780 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -112,6 +112,7 @@ struct writequeue_entry {
 	int len;
 	int end;
 	int users;
+	bool dirty;
 	struct connection *con;
 	struct list_head msgs;
 	struct kref ref;
@@ -647,6 +648,36 @@ static void make_sockaddr(struct sockaddr_storage *saddr, uint16_t port,
 	memset((char *)saddr + *addr_len, 0, sizeof(struct sockaddr_storage) - *addr_len);
 }
 
+static void dlm_page_release(struct kref *kref)
+{
+	struct writequeue_entry *e = container_of(kref, struct writequeue_entry,
+						  ref);
+
+	__free_page(e->page);
+	kfree(e);
+}
+
+static void dlm_msg_release(struct kref *kref)
+{
+	struct dlm_msg *msg = container_of(kref, struct dlm_msg, ref);
+
+	kref_put(&msg->entry->ref, dlm_page_release);
+	kfree(msg);
+}
+
+static void free_entry(struct writequeue_entry *e)
+{
+	struct dlm_msg *msg, *tmp;
+
+	list_for_each_entry_safe(msg, tmp, &e->msgs, list) {
+		list_del(&msg->list);
+		kref_put(&msg->ref, dlm_msg_release);
+	}
+
+	list_del(&e->list);
+	kref_put(&e->ref, dlm_page_release);
+}
+
 static void dlm_close_sock(struct socket **sock)
 {
 	if (*sock) {
@@ -661,6 +692,7 @@ static void close_connection(struct connection *con, bool and_other,
 			     bool tx, bool rx)
 {
 	bool closing = test_and_set_bit(CF_CLOSING, &con->flags);
+	struct writequeue_entry *e;
 
 	if (tx && !closing && cancel_work_sync(&con->swork)) {
 		log_print("canceled swork for node %d", con->nodeid);
@@ -679,6 +711,26 @@ static void close_connection(struct connection *con, bool and_other,
 		close_connection(con->othercon, false, true, true);
 	}
 
+	/* if we send a writequeue entry only a half way, we drop the
+	 * whole entry because reconnection and that we not start of the
+	 * middle of a msg which will confuse the other end.
+	 *
+	 * we can always drop messages because retransmits, but what we
+	 * cannot allow is to transmit half messages which may be processed
+	 * at the other side.
+	 *
+	 * our policy is to start on a clean state when disconnects, we don't
+	 * know what's send/received on transport layer in this case.
+	 */
+	spin_lock_bh(&con->writequeue_lock);
+	if (!list_empty(&con->writequeue)) {
+		e = list_first_entry(&con->writequeue, struct writequeue_entry,
+				     list);
+		if (e->dirty)
+			free_entry(e);
+	}
+	spin_unlock_bh(&con->writequeue_lock);
+
 	con->rx_leftover = 0;
 	con->retries = 0;
 	clear_bit(CF_CONNECTED, &con->flags);
@@ -958,36 +1010,6 @@ static int accept_from_sock(struct listen_connection *con)
 	return result;
 }
 
-static void dlm_page_release(struct kref *kref)
-{
-	struct writequeue_entry *e = container_of(kref, struct writequeue_entry,
-						  ref);
-
-	__free_page(e->page);
-	kfree(e);
-}
-
-static void dlm_msg_release(struct kref *kref)
-{
-	struct dlm_msg *msg = container_of(kref, struct dlm_msg, ref);
-
-	kref_put(&msg->entry->ref, dlm_page_release);
-	kfree(msg);
-}
-
-static void free_entry(struct writequeue_entry *e)
-{
-	struct dlm_msg *msg, *tmp;
-
-	list_for_each_entry_safe(msg, tmp, &e->msgs, list) {
-		list_del(&msg->list);
-		kref_put(&msg->ref, dlm_msg_release);
-	}
-
-	list_del(&e->list);
-	kref_put(&e->ref, dlm_page_release);
-}
-
 /*
  * writequeue_entry_complete - try to delete and free write queue entry
  * @e: write queue entry to try to delete
@@ -999,6 +1021,8 @@ static void writequeue_entry_complete(struct writequeue_entry *e, int completed)
 {
 	e->offset += completed;
 	e->len -= completed;
+	/* signal that page was half way transmitted */
+	e->dirty = true;
 
 	if (e->len == 0 && e->users == 0)
 		free_entry(e);
-- 
2.26.2



  parent reply	other threads:[~2021-01-04 21:00 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-04 21:00 [Cluster-devel] [PATCHv3 dlm/next 00/20] fs: dlm: introduce dlm re-transmission layer Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 01/20] fs: dlm: set connected bit after accept Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 02/20] fs: dlm: set subclass for othercon sock_mutex Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 03/20] fs: dlm: add errno handling to check callback Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 04/20] fs: dlm: add check if dlm is currently running Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 05/20] fs: dlm: change allocation limits Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 06/20] fs: dlm: public header in out utility Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 07/20] fs: dlm: use GFP_ZERO for page buffer Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 08/20] fs: dlm: simplify writequeue handling Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 09/20] fs: dlm: add more midcomms hooks Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 10/20] fs: dlm: make buffer handling per msg Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 11/20] fs: dlm: make new buffer handling softirq ready Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 12/20] fs: dlm: add functionality to re-transmit a message Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 13/20] fs: dlm: move out some hash functionality Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 14/20] fs: dlm: remove unaligned memory access handling Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 15/20] fs: dlm: add union in dlm header for lockspace id Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 16/20] fs: dlm: add per node receive flush Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 17/20] fs: dlm: add reliable connection if reconnect Alexander Aring
2021-01-04 21:00 ` Alexander Aring [this message]
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 19/20] fs: dlm: remove obsolete code and comment Alexander Aring
2021-01-04 21:00 ` [Cluster-devel] [PATCHv3 dlm/next 20/20] fs: dlm: check for invalid namelen Alexander Aring

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210104210024.233765-19-aahringo@redhat.com \
    --to=aahringo@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).