From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: [PATCH net-next] tcp: fix TCP_REPAIR xmit queue setup
Date: Thu, 18 Oct 2018 09:12:19 -0700
Message-ID: <20181018161219.127534-1-edumazet@google.com>
Mime-Version: 1.0
Content-Transfer-Encoding: 8bit
Cc: netdev , Eric Dumazet , Eric Dumazet
To: "David S . Miller" , Neal Cardwell , Soheil Hassas Yeganeh , Andrey Vagin
Return-path:
Received: from mail-pg1-f195.google.com ([209.85.215.195]:46253 "EHLO
	mail-pg1-f195.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1728015AbeJSAOE (ORCPT );
	Thu, 18 Oct 2018 20:14:04 -0400
Received: by mail-pg1-f195.google.com with SMTP id r190-v6so2783153pgr.13
	for ; Thu, 18 Oct 2018 09:12:23 -0700 (PDT)
Sender: netdev-owner@vger.kernel.org
List-ID:

Andrey reported the following warning triggered while running CRIU tests:

tcp_clean_rtx_queue()
...
	last_ackt = tcp_skb_timestamp_us(skb);
	WARN_ON_ONCE(last_ackt == 0);

This is caused by 5f6188a8003d ("tcp: do not change tcp_wstamp_ns
in tcp_mstamp_refresh"), as we end up having skbs in retransmit queue
with a zero skb->skb_mstamp_ns field.

We could fix this bug in different ways, like making sure
tp->tcp_wstamp_ns is not zero at socket creation, but as Neal pointed
out, we also do not want that pacing status of a repaired socket
could push tp->tcp_wstamp_ns far ahead in the future.

So we prefer changing tcp_write_xmit() to not call
tcp_update_skb_after_send() and instead do what is requested
by TCP_REPAIR logic.

Fixes: 5f6188a8003d ("tcp: do not change tcp_wstamp_ns in tcp_mstamp_refresh")
Signed-off-by: Eric Dumazet
Reported-by: Andrey Vagin
Acked-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
---
 net/ipv4/tcp_output.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d212e4cbc68902e873afb4a12b43b467ccd6069b..c07990a35ff3bd9438d32c82863ef207c93bdb9e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2321,18 +2321,19 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 	while ((skb = tcp_send_head(sk))) {
 		unsigned int limit;
 
-		if (tcp_pacing_check(sk))
-			break;
-
-		tso_segs = tcp_init_tso_segs(skb, mss_now);
-		BUG_ON(!tso_segs);
-
 		if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
-			/* "skb_mstamp" is used as a start point for the retransmit timer */
-			tcp_update_skb_after_send(sk, skb, tp->tcp_wstamp_ns);
+			/* "skb_mstamp_ns" is used as a start point for the retransmit timer */
+			skb->skb_mstamp_ns = tp->tcp_wstamp_ns = tp->tcp_clock_cache;
+			list_move_tail(&skb->tcp_tsorted_anchor, &tp->tsorted_sent_queue);
 			goto repair; /* Skip network transmission */
 		}
 
+		if (tcp_pacing_check(sk))
+			break;
+
+		tso_segs = tcp_init_tso_segs(skb, mss_now);
+		BUG_ON(!tso_segs);
+
 		cwnd_quota = tcp_cwnd_test(tp, skb);
 		if (!cwnd_quota) {
 			if (push_one == 2)
-- 
2.19.1.331.ge82ca0e54c-goog
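
The timestamp problem described in the changelog can be modelled in user space. This is only a sketch under stated assumptions, not kernel code: the model_* types and the functions repair_stamp_before(), repair_stamp_after(), would_warn() and model_self_check() are hypothetical stand-ins. repair_stamp_before() assumes that the old path ends up copying tp->tcp_wstamp_ns into the skb, and it ignores pacing and the tsorted list entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the one sk_buff field discussed in the patch. */
struct model_skb {
	uint64_t skb_mstamp_ns;		/* send timestamp, in nanoseconds */
};

/* Hypothetical stand-in for the two tcp_sock fields discussed in the patch. */
struct model_tcp_sock {
	uint64_t tcp_wstamp_ns;		/* may still be zero on a repaired socket */
	uint64_t tcp_clock_cache;	/* cached current time, in nanoseconds */
};

/* Assumed pre-patch behaviour: the skb inherits tp->tcp_wstamp_ns as is. */
static void repair_stamp_before(struct model_tcp_sock *tp, struct model_skb *skb)
{
	skb->skb_mstamp_ns = tp->tcp_wstamp_ns;
}

/* Post-patch behaviour, mirroring the added assignment in the diff. */
static void repair_stamp_after(struct model_tcp_sock *tp, struct model_skb *skb)
{
	skb->skb_mstamp_ns = tp->tcp_wstamp_ns = tp->tcp_clock_cache;
}

/* Nanoseconds to microseconds, as tcp_skb_timestamp_us() is named to do. */
static uint64_t model_skb_timestamp_us(const struct model_skb *skb)
{
	return skb->skb_mstamp_ns / 1000;
}

/* Condition checked by WARN_ON_ONCE(last_ackt == 0) in the report. */
static bool would_warn(const struct model_skb *skb)
{
	return model_skb_timestamp_us(skb) == 0;
}

/* Walk through both paths on a socket whose tcp_wstamp_ns was never set. */
static void model_self_check(void)
{
	struct model_tcp_sock tp = { .tcp_wstamp_ns = 0, .tcp_clock_cache = 5000000 };
	struct model_skb skb = { .skb_mstamp_ns = 0 };

	repair_stamp_before(&tp, &skb);
	assert(would_warn(&skb));	/* zero timestamp reaches the rtx queue */

	repair_stamp_after(&tp, &skb);
	assert(!would_warn(&skb));	/* skb now carries the cached clock */
	assert(tp.tcp_wstamp_ns == tp.tcp_clock_cache);
}
```

Calling model_self_check() from a test program exercises both paths. Stamping from the cached clock rather than from tp->tcp_wstamp_ns also reflects the concern raised in the changelog: the repaired skb does not depend on whatever value pacing left in tp->tcp_wstamp_ns.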