From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
Pavel Emelyanov <xemul@parallels.com>,
"David S. Miller" <davem@davemloft.net>,
Andrey Vagin <avagin@openvz.org>
Subject: [PATCH 3.14 06/37] tcp: dont use timestamp from repaired skb-s to calculate RTT (v2)
Date: Mon, 13 Oct 2014 04:24:03 +0200
Message-ID: <20141013022400.564389067@linuxfoundation.org>
In-Reply-To: <20141013022400.286360067@linuxfoundation.org>
3.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Vagin <avagin@openvz.org>
[ Upstream commit 9d186cac7ffb1831e9f34cb4a3a8b22abb9dd9d4 ]
We don't know the right timestamp for repaired skb-s. Wrong RTT estimations
aren't good, because some congestion modules heavily depend on them.
This patch adds the TCPCB_REPAIRED flag, which is included in
TCPCB_RETRANS.
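As a rough illustration of why folding TCPCB_REPAIRED into TCPCB_RETRANS
helps, the userspace sketch below reuses the flag values from the
include/net/tcp.h hunk of this patch. The helper may_sample_rtt() is
hypothetical and is not a kernel function; it only assumes that RTT
sampling is skipped for any skb whose sacked byte matches the
TCPCB_RETRANS mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the include/net/tcp.h hunk in this patch. */
#define TCPCB_SACKED_RETRANS	0x02	/* SKB retransmitted */
#define TCPCB_REPAIRED		0x10	/* SKB repaired (no skb_mstamp) */
#define TCPCB_EVER_RETRANS	0x80	/* Ever retransmitted frame */
#define TCPCB_RETRANS		(TCPCB_SACKED_RETRANS | TCPCB_EVER_RETRANS | \
				 TCPCB_REPAIRED)

/* Compile-time check that the combined mask really covers the new bit. */
static_assert((TCPCB_RETRANS & TCPCB_REPAIRED) != 0,
	      "TCPCB_RETRANS must include TCPCB_REPAIRED");

/*
 * Hypothetical helper (an assumption for illustration, not kernel code):
 * an skb may contribute an RTT sample only if none of the
 * "timestamp cannot be trusted" bits are set in its sacked byte.
 */
static bool may_sample_rtt(uint8_t sacked)
{
	return (sacked & TCPCB_RETRANS) == 0;
}
```

Under that assumption a repaired skb is treated like a retransmitted one
for RTT purposes: may_sample_rtt(TCPCB_REPAIRED) is false, while
may_sample_rtt(0) is true.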
Thanks to Eric for the advice on how to fix this issue.
This patch fixes the warning:
[ 879.562947] WARNING: CPU: 0 PID: 2825 at net/ipv4/tcp_input.c:3078 tcp_ack+0x11f5/0x1380()
[ 879.567253] CPU: 0 PID: 2825 Comm: socket-tcpbuf-l Not tainted 3.16.0-next-20140811 #1
[ 879.567829] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 879.568177] 0000000000000000 00000000c532680c ffff880039643d00 ffffffff817aa2d2
[ 879.568776] 0000000000000000 ffff880039643d38 ffffffff8109afbd ffff880039d6ba80
[ 879.569386] ffff88003a449800 000000002983d6bd 0000000000000000 000000002983d6bc
[ 879.569982] Call Trace:
[ 879.570264] [<ffffffff817aa2d2>] dump_stack+0x4d/0x66
[ 879.570599] [<ffffffff8109afbd>] warn_slowpath_common+0x7d/0xa0
[ 879.570935] [<ffffffff8109b0ea>] warn_slowpath_null+0x1a/0x20
[ 879.571292] [<ffffffff816d0a05>] tcp_ack+0x11f5/0x1380
[ 879.571614] [<ffffffff816d10bd>] tcp_rcv_established+0x1ed/0x710
[ 879.571958] [<ffffffff816dc9da>] tcp_v4_do_rcv+0x10a/0x370
[ 879.572315] [<ffffffff81657459>] release_sock+0x89/0x1d0
[ 879.572642] [<ffffffff816c81a0>] do_tcp_setsockopt.isra.36+0x120/0x860
[ 879.573000] [<ffffffff8110a52e>] ? rcu_read_lock_held+0x6e/0x80
[ 879.573352] [<ffffffff816c8912>] tcp_setsockopt+0x32/0x40
[ 879.573678] [<ffffffff81654ac4>] sock_common_setsockopt+0x14/0x20
[ 879.574031] [<ffffffff816537b0>] SyS_setsockopt+0x80/0xf0
[ 879.574393] [<ffffffff817b40a9>] system_call_fastpath+0x16/0x1b
[ 879.574730] ---[ end trace a17cbc38eb8c5c00 ]---
v2: move the setting of skb->when for repaired skb-s into tcp_write_xmit,
where it's set for other skb-s.
Fixes: 431a91242d8d ("tcp: timestamp SYN+DATA messages")
Fixes: 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/net/tcp.h | 4 +++-
net/ipv4/tcp.c | 14 +++++++-------
net/ipv4/tcp_output.c | 5 ++++-
3 files changed, 14 insertions(+), 9 deletions(-)
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -720,8 +720,10 @@ struct tcp_skb_cb {
 #define TCPCB_SACKED_RETRANS	0x02	/* SKB retransmitted */
 #define TCPCB_LOST		0x04	/* SKB is lost */
 #define TCPCB_TAGBITS		0x07	/* All tag bits */
+#define TCPCB_REPAIRED		0x10	/* SKB repaired (no skb_mstamp) */
 #define TCPCB_EVER_RETRANS	0x80	/* Ever retransmitted frame */
-#define TCPCB_RETRANS		(TCPCB_SACKED_RETRANS|TCPCB_EVER_RETRANS)
+#define TCPCB_RETRANS		(TCPCB_SACKED_RETRANS|TCPCB_EVER_RETRANS| \
+				TCPCB_REPAIRED)
 
 	__u8		ip_dsfield;	/* IPv4 tos or IPv6 dsfield */
 	/* 1 byte hole */
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1175,13 +1175,6 @@ new_segment:
 					goto wait_for_memory;
 
 				/*
-				 * All packets are restored as if they have
-				 * already been sent.
-				 */
-				if (tp->repair)
-					TCP_SKB_CB(skb)->when = tcp_time_stamp;
-
-				/*
 				 * Check whether we can use HW checksum.
 				 */
 				if (sk->sk_route_caps & NETIF_F_ALL_CSUM)
@@ -1190,6 +1183,13 @@ new_segment:
 				skb_entail(sk, skb);
 				copy = size_goal;
 				max = size_goal;
+
+				/* All packets are restored as if they have
+				 * already been sent. skb_mstamp isn't set to
+				 * avoid wrong rtt estimation.
+				 */
+				if (tp->repair)
+					TCP_SKB_CB(skb)->sacked |= TCPCB_REPAIRED;
 			}
 
 			/* Try to append data to the end of skb. */
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1876,8 +1876,11 @@ static bool tcp_write_xmit(struct sock *
 		tso_segs = tcp_init_tso_segs(sk, skb, mss_now);
 		BUG_ON(!tso_segs);
 
-		if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE)
+		if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
+			/* "when" is used as a start point for the retransmit timer */
+			TCP_SKB_CB(skb)->when = tcp_time_stamp;
 			goto repair; /* Skip network transmission */
+		}
 
 		cwnd_quota = tcp_cwnd_test(tp, skb);
 		if (!cwnd_quota) {