From: Junxiao Bi <junxiao.bi@oracle.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH 2/3] ocfs2: o2net: set tcp user timeout to max value
Date: Thu, 15 May 2014 12:26:22 +0800
Message-ID: <1400127983-9774-3-git-send-email-junxiao.bi@oracle.com>
In-Reply-To: <1400127983-9774-1-git-send-email-junxiao.bi@oracle.com>
When the TCP retransmit timeout expires (about 15 minutes), the
connection is closed. Pending messages may be lost during this time,
so set the TCP user timeout to its maximum value to override the
retransmit timeout.

This is OK for ocfs2 because we have the disk heartbeat: if a peer
crashes, its disk heartbeat times out and the node is evicted. If the
disk heartbeat has not timed out but the connection has been idle for
a long time, the cluster has entered a split-brain state. Since
fencing can't happen in that case, it is better to keep the
connection and wait for the network to recover.
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
---
fs/ocfs2/cluster/tcp.c | 20 ++++++++++++++++++++
fs/ocfs2/cluster/tcp.h | 1 +
2 files changed, 21 insertions(+)
diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c
index 76ef3d8..eae58d8 100644
--- a/fs/ocfs2/cluster/tcp.c
+++ b/fs/ocfs2/cluster/tcp.c
@@ -1480,6 +1480,14 @@ static int o2net_set_nodelay(struct socket *sock)
return ret;
}
+static int o2net_set_usertimeout(struct socket *sock)
+{
+ int user_timeout = O2NET_TCP_USER_TIMEOUT;
+
+ return kernel_setsockopt(sock, SOL_TCP, TCP_USER_TIMEOUT,
+ (char *)&user_timeout, sizeof(user_timeout));
+}
+
static void o2net_initialize_handshake(void)
{
o2net_hand->o2hb_heartbeat_timeout_ms = cpu_to_be32(
@@ -1663,6 +1671,12 @@ static void o2net_start_connect(struct work_struct *work)
goto out;
}
+ ret = o2net_set_usertimeout(sock);
+ if (ret) {
+ mlog(ML_ERROR, "set TCP_USER_TIMEOUT failed with %d\n", ret);
+ goto out;
+ }
+
o2net_register_callbacks(sc->sc_sock->sk, sc);
spin_lock(&nn->nn_lock);
@@ -1842,6 +1856,12 @@ static int o2net_accept_one(struct socket *sock)
goto out;
}
+ ret = o2net_set_usertimeout(new_sock);
+ if (ret) {
+ mlog(ML_ERROR, "set TCP_USER_TIMEOUT failed with %d\n", ret);
+ goto out;
+ }
+
slen = sizeof(sin);
ret = new_sock->ops->getname(new_sock, (struct sockaddr *) &sin,
&slen, 1);
diff --git a/fs/ocfs2/cluster/tcp.h b/fs/ocfs2/cluster/tcp.h
index 5bada2a..c571e84 100644
--- a/fs/ocfs2/cluster/tcp.h
+++ b/fs/ocfs2/cluster/tcp.h
@@ -63,6 +63,7 @@ typedef void (o2net_post_msg_handler_func)(int status, void *data,
#define O2NET_KEEPALIVE_DELAY_MS_DEFAULT 2000
#define O2NET_IDLE_TIMEOUT_MS_DEFAULT 30000
+#define O2NET_TCP_USER_TIMEOUT 0x7fffffff
/* TODO: figure this out.... */
static inline int o2net_link_down(int err, struct socket *sock)
--
1.7.9.5
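
For readers who want to try the same socket option outside the kernel, here is a minimal userspace sketch. It is not part of the patch. It assumes Linux with a libc whose <netinet/tcp.h> defines TCP_USER_TIMEOUT, and it uses setsockopt(2) with IPPROTO_TCP in place of the in-kernel kernel_setsockopt() with SOL_TCP. The names example_set_user_timeout(), example_get_user_timeout() and EXAMPLE_TCP_USER_TIMEOUT_MS are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Same value the patch uses for O2NET_TCP_USER_TIMEOUT, in milliseconds. */
#define EXAMPLE_TCP_USER_TIMEOUT_MS 0x7fffffff

/*
 * Hypothetical helper: set TCP_USER_TIMEOUT on a TCP socket.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int example_set_user_timeout(int fd, unsigned int timeout_ms)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
			  &timeout_ms, sizeof(timeout_ms));
}

/*
 * Hypothetical helper: read the current TCP_USER_TIMEOUT back.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int example_get_user_timeout(int fd, unsigned int *timeout_ms)
{
	socklen_t len = sizeof(*timeout_ms);
	int ret = getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
			     timeout_ms, &len);

	/* On success the kernel reports an int-sized option value. */
	if (ret == 0)
		assert(len == sizeof(*timeout_ms));
	return ret;
}
```

A caller would create a socket with socket(AF_INET, SOCK_STREAM, 0), call example_set_user_timeout(fd, EXAMPLE_TCP_USER_TIMEOUT_MS) before connect() or after accept(), and handle a non-zero return as an error, much as the patch does with mlog(). The option value is in milliseconds, so 0x7fffffff is roughly 24.8 days.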