From: <gregkh@linuxfoundation.org>
To: edumazet@googl.com, davem@davemloft.net,
gregkh@linuxfoundation.org, liujian56@huawei.com,
ncardwell@google.com, soheil@google.com, ycheng@google.com
Cc: <stable@vger.kernel.org>, <stable-commits@vger.kernel.org>
Subject: Patch "tcp: update skb->skb_mstamp more carefully" has been added to the 4.9-stable tree
Date: Tue, 10 Oct 2017 16:10:52 +0200 [thread overview]
Message-ID: <15076446521968@kroah.com> (raw)
This is a note to let you know that I've just added the patch titled
tcp: update skb->skb_mstamp more carefully
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
tcp-update-skb-skb_mstamp-more-carefully.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From foo@baz Tue Oct 10 16:09:22 CEST 2017
From: Eric Dumazet <edumazet@googl.com>
Date: Wed, 13 Sep 2017 20:30:39 -0700
Subject: tcp: update skb->skb_mstamp more carefully
From: Eric Dumazet <edumazet@googl.com>
[ Upstream commit 8c72c65b426b47b3c166a8fef0d8927fe5e8a28d ]
liujian reported a problem in TCP_USER_TIMEOUT processing with a patch
in tcp_probe_timer() :
https://www.spinics.net/lists/netdev/msg454496.html
After investigations, the root cause of the problem is that we update
skb->skb_mstamp of skbs in write queue, even if the attempt to send a
clone or copy of it failed. One reason being a routing problem.
This patch prevents this, solving liujian issue.
It also removes a potential RTT miscalculation, since
__tcp_retransmit_skb() is not OR-ing TCP_SKB_CB(skb)->sacked with
TCPCB_EVER_RETRANS if a failure happens, but skb->skb_mstamp has
been changed.
A future ACK would then lead to a very small RTT sample and min_rtt
would then be lowered to this too small value.
Tested:
# cat user_timeout.pkt
--local_ip=192.168.102.64
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+0 `ifconfig tun0 192.168.102.64/16; ip ro add 192.0.2.1 dev tun0`
+0 < S 0:0(0) win 0 <mss 1460>
+0 > S. 0:0(0) ack 1 <mss 1460>
+.1 < . 1:1(0) ack 1 win 65530
+0 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_USER_TIMEOUT, [3000], 4) = 0
+0 write(4, ..., 24) = 24
+0 > P. 1:25(24) ack 1 win 29200
+.1 < . 1:1(0) ack 25 win 65530
//change the ipaddress
+1 `ifconfig tun0 192.168.0.10/16`
+1 write(4, ..., 24) = 24
+1 write(4, ..., 24) = 24
+1 write(4, ..., 24) = 24
+1 write(4, ..., 24) = 24
+0 `ifconfig tun0 192.168.102.64/16`
+0 < . 1:2(1) ack 25 win 65530
+0 `ifconfig tun0 192.168.0.10/16`
+3 write(4, ..., 24) = -1
# ./packetdrill user_timeout.pkt
Signed-off-by: Eric Dumazet <edumazet@googl.com>
Reported-by: liujian <liujian56@huawei.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/ipv4/tcp_output.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -914,6 +914,7 @@ static int tcp_transmit_skb(struct sock
struct tcp_skb_cb *tcb;
struct tcp_out_options opts;
unsigned int tcp_options_size, tcp_header_size;
+ struct sk_buff *oskb = NULL;
struct tcp_md5sig_key *md5;
struct tcphdr *th;
int err;
@@ -922,11 +923,11 @@ static int tcp_transmit_skb(struct sock
tp = tcp_sk(sk);
if (clone_it) {
- skb_mstamp_get(&skb->skb_mstamp);
TCP_SKB_CB(skb)->tx.in_flight = TCP_SKB_CB(skb)->end_seq
- tp->snd_una;
tcp_rate_skb_sent(sk, skb);
+ oskb = skb;
if (unlikely(skb_cloned(skb)))
skb = pskb_copy(skb, gfp_mask);
else
@@ -934,6 +935,7 @@ static int tcp_transmit_skb(struct sock
if (unlikely(!skb))
return -ENOBUFS;
}
+ skb_mstamp_get(&skb->skb_mstamp);
inet = inet_sk(sk);
tcb = TCP_SKB_CB(skb);
@@ -1035,12 +1037,14 @@ static int tcp_transmit_skb(struct sock
err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
- if (likely(err <= 0))
- return err;
-
- tcp_enter_cwr(sk);
+ if (unlikely(err > 0)) {
+ tcp_enter_cwr(sk);
+ err = net_xmit_eval(err);
+ }
+ if (!err && oskb)
+ skb_mstamp_get(&oskb->skb_mstamp);
- return net_xmit_eval(err);
+ return err;
}
/* This routine just queues the buffer for sending.
@@ -2709,10 +2713,11 @@ int __tcp_retransmit_skb(struct sock *sk
skb_headroom(skb) >= 0xFFFF)) {
struct sk_buff *nskb;
- skb_mstamp_get(&skb->skb_mstamp);
nskb = __pskb_copy(skb, MAX_TCP_HEADER, GFP_ATOMIC);
err = nskb ? tcp_transmit_skb(sk, nskb, 0, GFP_ATOMIC) :
-ENOBUFS;
+ if (!err)
+ skb_mstamp_get(&skb->skb_mstamp);
} else {
err = tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
}
Patches currently in stable-queue which might be from edumazet@googl.com are
queue-4.9/tcp-update-skb-skb_mstamp-more-carefully.patch
reply other threads:[~2017-10-10 14:11 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=15076446521968@kroah.com \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=edumazet@googl.com \
--cc=liujian56@huawei.com \
--cc=ncardwell@google.com \
--cc=soheil@google.com \
--cc=stable-commits@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=ycheng@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.