From: Pavel Emelyanov <xemul@parallels.com>
To: Linux Netdev List <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>
Subject: [PATCH 6/6] tcp: Repair connection-time negotiated parameters
Date: Thu, 19 Apr 2012 17:41:57 +0400
Message-ID: <4F901625.1090008@parallels.com>
In-Reply-To: <4F901572.4040009@parallels.com>
Some options are set up on a socket during the TCP handshake. They
need to be restored on a socket that is being repaired.
A new sockoption accepts a buffer and parses it. The buffer should
be a sequence of CODE:VALUE bytes, where CODE is the standard TCP
option code and VALUE (if the option carries one) is its value.
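To make the buffer layout concrete, here is a minimal userspace sketch that
serializes the four options in the CODE:VALUE form parsed by this patch. The
helper name build_repair_opts() and the OPT_* enum are hypothetical; only the
numeric option codes and the value sizes (u16 for MSS, u8 for window scale, no
value for SACK_PERM and TIMESTAMP) are taken from the parser in this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard TCP option kinds; the kernel calls them TCPOPT_MSS,
 * TCPOPT_WINDOW, TCPOPT_SACK_PERM and TCPOPT_TIMESTAMP. */
enum {
	OPT_MSS       = 2,
	OPT_WINDOW    = 3,
	OPT_SACK_PERM = 4,
	OPT_TIMESTAMP = 8,
};

/*
 * Hypothetical helper: write the options into buf in CODE:VALUE form.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t build_repair_opts(uint8_t *buf, size_t cap, uint16_t mss,
				uint8_t snd_wscale, int sack_ok, int tstamp_ok)
{
	size_t need = 1 + sizeof(mss) + 1 + sizeof(snd_wscale) +
		      (sack_ok ? 1 : 0) + (tstamp_ok ? 1 : 0);
	size_t n = 0;

	if (cap < need)
		return 0;

	buf[n++] = OPT_MSS;
	/* Host byte order: the parser reads the value straight into a u16. */
	memcpy(buf + n, &mss, sizeof(mss));
	n += sizeof(mss);

	buf[n++] = OPT_WINDOW;
	buf[n++] = snd_wscale;		/* the parser rejects values above 14 */

	if (sack_ok)
		buf[n++] = OPT_SACK_PERM;	/* flag only, no VALUE byte */
	if (tstamp_ok)
		buf[n++] = OPT_TIMESTAMP;	/* flag only, no VALUE byte */

	return n;
}
```

The resulting buffer would then be passed with something like
setsockopt(fd, IPPROTO_TCP, TCP_REPAIR_OPTIONS, buf, n) on an ESTABLISHED
socket that is in repair mode; that call is not exercised here.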
Only four options need to be handled on a repaired socket.
Three of them can be read with the TCP_INFO sockoption. The
ability to read the fourth one (the mss_clamp) was added by
the previous patch.
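For illustration, the read side could look roughly like the sketch below. It
assumes a Linux userspace with glibc's <netinet/tcp.h>; struct saved_opts and
both function names are hypothetical. The mss_clamp is not covered here, since
per the previous patch it is read with TCP_MAXSEG while in repair mode.

```c
#define _DEFAULT_SOURCE		/* expose struct tcp_info under strict -std= */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical container for the values a checkpoint tool would save. */
struct saved_opts {
	int tstamp_ok;
	int sack_ok;
	int wscale_ok;
	uint8_t snd_wscale;
};

/* Extract the negotiated options from an already filled struct tcp_info. */
static void decode_tcp_info(const struct tcp_info *ti, struct saved_opts *out)
{
	out->tstamp_ok = !!(ti->tcpi_options & TCPI_OPT_TIMESTAMPS);
	out->sack_ok = !!(ti->tcpi_options & TCPI_OPT_SACK);
	out->wscale_ok = !!(ti->tcpi_options & TCPI_OPT_WSCALE);
	/* The scale factor is only meaningful when window scaling is on. */
	out->snd_wscale = out->wscale_ok ? ti->tcpi_snd_wscale : 0;
}

/* Query a TCP socket. Returns 0 on success, -1 on error (errno is set). */
static int read_negotiated_opts(int fd, struct saved_opts *out)
{
	struct tcp_info ti;
	socklen_t len = sizeof(ti);

	memset(&ti, 0, sizeof(ti));
	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) < 0)
		return -1;

	decode_tcp_info(&ti, out);
	return 0;
}
```

Keeping the decoding separate from the getsockopt() call makes the bit
handling easy to check without a live connection.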
Now the restore. Three of these options -- timestamp_ok, mss_clamp
and snd_wscale -- are simply set back on the socket.
The sack_ok flags raise two issues. First, whether or not to do SACKs
at all. This flag is simply read and set back. No other SACK info is
saved or restored: according to the standard and the code, dropping
all SACKed segments is OK because the sender will retransmit them,
so after the repair we will probably experience a pause in the
connection. Second, the fack bit. It is set back on the socket if
the respective sysctl is set. No collected stats about the packet flow
are preserved. As far as I can see (please correct me if I'm wrong),
the fack-based congestion algorithm survives dropping all of the stats
and repairs itself eventually, probably losing some performance for
that period.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
include/linux/tcp.h | 1 +
net/ipv4/tcp.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 72 insertions(+), 0 deletions(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 4e90e6a..9865936 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -109,6 +109,7 @@ enum {
#define TCP_REPAIR 19 /* TCP sock is under repair right now */
#define TCP_REPAIR_QUEUE 20
#define TCP_QUEUE_SEQ 21
+#define TCP_REPAIR_OPTIONS 22
enum {
TCP_NO_QUEUE,
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b4e690d..3ce3bd0 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2218,6 +2218,68 @@ static inline int tcp_can_repair_sock(struct sock *sk)
((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_ESTABLISHED));
}
+static int tcp_repair_options_est(struct tcp_sock *tp, char __user *optbuf, unsigned int len)
+{
+	/*
+	 * Options are stored in CODE:VALUE form where CODE is 8bit and VALUE
+	 * fits the respective TCPOLEN_ size
+	 */
+
+	while (len > 0) {
+		u8 opcode;
+
+		if (get_user(opcode, optbuf))
+			return -EFAULT;
+
+		optbuf++;
+		len--;
+
+		switch (opcode) {
+		case TCPOPT_MSS: {
+			u16 in_mss;
+
+			if (len < sizeof(in_mss))
+				return -ENODATA;
+			if (get_user(in_mss, optbuf))
+				return -EFAULT;
+
+			tp->rx_opt.mss_clamp = in_mss;
+
+			optbuf += sizeof(in_mss);
+			len -= sizeof(in_mss);
+			break;
+		}
+		case TCPOPT_WINDOW: {
+			u8 wscale;
+
+			if (len < sizeof(wscale))
+				return -ENODATA;
+			if (get_user(wscale, optbuf))
+				return -EFAULT;
+
+			if (wscale > 14)
+				return -EFBIG;
+
+			tp->rx_opt.snd_wscale = wscale;
+
+			optbuf += sizeof(wscale);
+			len -= sizeof(wscale);
+			break;
+		}
+		case TCPOPT_SACK_PERM:
+			tp->rx_opt.sack_ok |= TCP_SACK_SEEN;
+			if (sysctl_tcp_fack)
+				tcp_enable_fack(tp);
+			break;
+		case TCPOPT_TIMESTAMP:
+			tp->rx_opt.tstamp_ok = 1;
+			break;
+		}
+	}
+
+	return 0;
+}
+
+
/*
* Socket option code for TCP.
*/
@@ -2426,6 +2488,15 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
err = -EINVAL;
break;
+	case TCP_REPAIR_OPTIONS:
+		if (!tp->repair)
+			err = -EINVAL;
+		else if (sk->sk_state == TCP_ESTABLISHED)
+			err = tcp_repair_options_est(tp, optval, optlen);
+		else
+			err = -EPERM;
+		break;
+
case TCP_CORK:
/* When set indicates to always queue non-full frames.
* Later the user clears this option and we transmit
--
1.5.5.6
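As an aid for reviewers, the parsing loop above can be mirrored in plain
userspace C to see which buffers are accepted and which error each malformed
one yields. This is only a sketch: struct parsed_opts and parse_repair_opts()
are hypothetical names, the fack handling is left out, and memcpy() stands in
for get_user(), so -EFAULT cannot occur here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the tp->rx_opt fields touched by the patch. */
struct parsed_opts {
	uint16_t mss_clamp;
	uint8_t snd_wscale;
	int sack_ok;
	int tstamp_ok;
};

/* Userspace mirror of the CODE:VALUE loop; returns 0 or a negative errno. */
static int parse_repair_opts(const uint8_t *buf, size_t len,
			     struct parsed_opts *out)
{
	while (len > 0) {
		uint8_t opcode = *buf++;

		len--;

		switch (opcode) {
		case 2: {			/* TCPOPT_MSS, 16-bit value */
			uint16_t mss;

			if (len < sizeof(mss))
				return -ENODATA;
			memcpy(&mss, buf, sizeof(mss));
			out->mss_clamp = mss;
			buf += sizeof(mss);
			len -= sizeof(mss);
			break;
		}
		case 3:				/* TCPOPT_WINDOW, 8-bit value */
			if (len < 1)
				return -ENODATA;
			if (*buf > 14)
				return -EFBIG;
			out->snd_wscale = *buf++;
			len--;
			break;
		case 4:				/* TCPOPT_SACK_PERM, no value */
			out->sack_ok = 1;
			break;
		case 8:				/* TCPOPT_TIMESTAMP, no value */
			out->tstamp_ok = 1;
			break;
		default:
			/* As in the patch, unknown codes are skipped one byte at a time. */
			break;
		}
	}

	return 0;
}
```

A truncated MSS or window-scale value gives -ENODATA, and a window scale above
14 gives -EFBIG, matching the return codes in the kernel function.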