From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3C3DB1DDF5; Tue, 16 Jul 2024 15:41:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721144482; cv=none; b=Gpz71XaNhtMgji5nR9mjc+kEQl7Pu/ZyRHVf0kItQfa22TKUXx3a0mfQJHzdWEvySxJ5TTbr9y1d/NgRnB28uGUOuC8g0VIshq/UnHNqZ+f+7s4BAiGgcqTL7iTcu8MslKlS1Cb57VuXTTojHzXlMrBYkPOiWSMxFyxQl8Wllsc= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721144482; c=relaxed/simple; bh=84ZPYt59GO6iKQxoeEa04oQekm8LIN4VxZIDVYytl/I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kX9rj88OY1A1m0We0fnTOAzVyQLJ//kQslZ7/4Mc2LVv77zS2d2GS6Fzpc5Zm6ilwVHcGFcHkz/KvamdHhcK8cPPKAtR7HmYI0tLyeh7kr2YTIkaZuPhfpM1ZIXbODEuut4ROexE0lgL/jzYOOkWRhnsML5AKT6F0TnrqGoM0p0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=cK2ib4zO; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="cK2ib4zO" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B673AC116B1; Tue, 16 Jul 2024 15:41:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1721144482; bh=84ZPYt59GO6iKQxoeEa04oQekm8LIN4VxZIDVYytl/I=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=cK2ib4zOV2BNf7AMpLPRSeHIYEDZTJlHIKF6tzRj38UvYisJv02TCewLllpkEXJNg R6AKKME0GOlSg8NjL7qNMTCgvQsZfCHEXrX7VgkZhq2d2aSocuAJL0KU1JCrnIoA7A 2RN32Ao5LZNUOthjOFMbB3lMglyi0Es1s/oabhDc= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Neal Cardwell , Eric Dumazet , Yuchung Cheng , Kevin Yang , Jakub Kicinski , Sasha Levin Subject: [PATCH 5.4 50/78] tcp: fix incorrect undo caused by DSACK of TLP retransmit Date: Tue, 16 Jul 2024 17:31:22 +0200 Message-ID: <20240716152742.576387126@linuxfoundation.org> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240716152740.626160410@linuxfoundation.org> References: <20240716152740.626160410@linuxfoundation.org> User-Agent: quilt/0.67 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 5.4-stable review patch. If anyone has any objections, please let me know. ------------------ From: Neal Cardwell [ Upstream commit 0ec986ed7bab6801faed1440e8839dcc710331ff ] Loss recovery undo_retrans bookkeeping had a long-standing bug where a DSACK from a spurious TLP retransmit packet could cause an erroneous undo of a fast recovery or RTO recovery that repaired a single really-lost packet (in a sequence range outside that of the TLP retransmit). Basically, because the loss recovery state machine didn't account for the fact that it sent a TLP retransmit, the DSACK for the TLP retransmit could erroneously be implicitly be interpreted as corresponding to the normal fast recovery or RTO recovery retransmit that plugged a real hole, thus resulting in an improper undo. For example, consider the following buggy scenario where there is a real packet loss but the congestion control response is improperly undone because of this bug: + send packets P1, P2, P3, P4 + P1 is really lost + send TLP retransmit of P4 + receive SACK for original P2, P3, P4 + enter fast recovery, fast-retransmit P1, increment undo_retrans to 1 + receive DSACK for TLP P4, decrement undo_retrans to 0, undo (bug!) + receive cumulative ACK for P1-P4 (fast retransmit plugged real hole) The fix: when we initialize undo machinery in tcp_init_undo(), if there is a TLP retransmit in flight, then increment tp->undo_retrans so that we make sure that we receive a DSACK corresponding to the TLP retransmit, as well as DSACKs for all later normal retransmits, before triggering a loss recovery undo. Note that we also have to move the line that clears tp->tlp_high_seq for RTO recovery, so that upon RTO we remember the tp->tlp_high_seq value until tcp_init_undo() and clear it only afterward. Also note that the bug dates back to the original 2013 TLP implementation, commit 6ba8a3b19e76 ("tcp: Tail loss probe (TLP)"). However, this patch will only compile and work correctly with kernels that have tp->tlp_retrans, which was added only in v5.8 in 2020 in commit 76be93fc0702 ("tcp: allow at most one TLP probe per flight"). So we associate this fix with that later commit. Fixes: 76be93fc0702 ("tcp: allow at most one TLP probe per flight") Signed-off-by: Neal Cardwell Reviewed-by: Eric Dumazet Cc: Yuchung Cheng Cc: Kevin Yang Link: https://patch.msgid.link/20240703171246.1739561-1-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- net/ipv4/tcp_input.c | 11 ++++++++++- net/ipv4/tcp_timer.c | 2 -- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 57907fe94b238..cf6221e9fda50 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -1956,8 +1956,16 @@ void tcp_clear_retrans(struct tcp_sock *tp) static inline void tcp_init_undo(struct tcp_sock *tp) { tp->undo_marker = tp->snd_una; + /* Retransmission still in flight may cause DSACKs later. */ - tp->undo_retrans = tp->retrans_out ? : -1; + /* First, account for regular retransmits in flight: */ + tp->undo_retrans = tp->retrans_out; + /* Next, account for TLP retransmits in flight: */ + if (tp->tlp_high_seq && tp->tlp_retrans) + tp->undo_retrans++; + /* Finally, avoid 0, because undo_retrans==0 means "can undo now": */ + if (!tp->undo_retrans) + tp->undo_retrans = -1; } static bool tcp_is_rack(const struct sock *sk) @@ -2036,6 +2044,7 @@ void tcp_enter_loss(struct sock *sk) tcp_set_ca_state(sk, TCP_CA_Loss); tp->high_seq = tp->snd_nxt; + tp->tlp_high_seq = 0; tcp_ecn_queue_cwr(tp); /* F-RTO RFC5682 sec 3.1 step 1: retransmit SND.UNA if no previous diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 551c4a78f68d4..e20fced3c9cf6 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -467,8 +467,6 @@ void tcp_retransmit_timer(struct sock *sk) if (!tp->packets_out || WARN_ON_ONCE(tcp_rtx_queue_empty(sk))) return; - tp->tlp_high_seq = 0; - if (!tp->snd_wnd && !sock_flag(sk, SOCK_DEAD) && !((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))) { /* Receiver dastardly shrinks window. Our retransmits -- 2.43.0