* [PATCH net-next v2 0/2] tcp: new TCP_INFO stats for RTO events
From: Aananth V @ 2023-09-14 14:36 UTC (permalink / raw)
To: Eric Dumazet
Cc: netdev, Paolo Abeni, David Miller, Jakub Kicinski, Neal Cardwell,
Yuchung Cheng, Aananth V
The 2023 SIGCOMM paper "Improving Network Availability with Protective
ReRoute" indicates that Linux TCP's RTO-triggered txhash rehashing can
effectively reduce application disruption during outages. To better
measure the efficacy of this feature, this patch set adds three more
detailed stats during RTO recovery and exports them via TCP_INFO.
Applications and monitoring systems can leverage this data to measure
the network path diversity and end-to-end repair latency during network
outages to improve their network infrastructure.
Patch 1 fixes a bug in TFO SYNACK that we encountered while testing
these new metrics.
Patch 2 adds the new metrics to tcp_sock and tcp_info.
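As a rough illustration of how a monitoring system might consume these
counters, here is a minimal userspace sketch that computes the change
between two samples. The struct and function names are hypothetical; the
caller is assumed to have copied the three values out of struct tcp_info
after a getsockopt(TCP_INFO) call. Because two of the counters are 16 bits
wide, the subtraction is done in unsigned arithmetic so that a wrapped
counter still yields the right delta.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the three new counters, copied by the caller
 * from tcpi_total_rto, tcpi_total_rto_recoveries and tcpi_total_rto_time.
 */
struct rto_sample {
	uint16_t total_rto;
	uint16_t total_rto_recoveries;
	uint32_t total_rto_time_ms;
};

/* Change in each counter between two samples (illustrative type). */
struct rto_delta {
	uint16_t rto;
	uint16_t recoveries;
	uint32_t time_ms;
};

/* Wrap-safe difference: the casts reduce each result modulo the counter
 * width, so a counter that overflowed once between samples is handled.
 */
static struct rto_delta rto_sample_delta(const struct rto_sample *prev,
					 const struct rto_sample *cur)
{
	struct rto_delta d;

	assert(prev && cur);
	d.rto = (uint16_t)(cur->total_rto - prev->total_rto);
	d.recoveries = (uint16_t)(cur->total_rto_recoveries -
				  prev->total_rto_recoveries);
	d.time_ms = cur->total_rto_time_ms - prev->total_rto_time_ms;
	return d;
}

/* Approximate mean time per recovery in the interval; 0 if there were none.
 * Both inputs may include a recovery that is still in progress.
 */
static uint32_t rto_mean_recovery_ms(const struct rto_delta *d)
{
	return d->recoveries ? d->time_ms / d->recoveries : 0;
}
```

The mean is only an estimate, since a recovery that spans the sampling
boundary contributes time to both intervals but is counted once.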
v2: Addressed feedback from a check bot in patch 2 by removing the
inline keyword from the tcp_update_rto_time and tcp_update_rto_stats
functions. Changed a comment in include/net/tcp.h to fit under 80 columns.
Aananth V (2):
tcp: call tcp_try_undo_recovery when an RTOd TFO SYNACK is ACKed
tcp: new TCP_INFO stats for RTO events
include/linux/tcp.h | 8 ++++++++
include/uapi/linux/tcp.h | 12 ++++++++++++
net/ipv4/tcp.c | 9 +++++++++
net/ipv4/tcp_input.c | 24 ++++++++++++++++++++----
net/ipv4/tcp_minisocks.c | 4 ++++
net/ipv4/tcp_timer.c | 17 +++++++++++++++--
6 files changed, 68 insertions(+), 6 deletions(-)
--
2.42.0.283.g2d96d420d3-goog
* [PATCH net-next v2 1/2] tcp: call tcp_try_undo_recovery when an RTOd TFO SYNACK is ACKed
From: Aananth V @ 2023-09-14 14:36 UTC (permalink / raw)
To: Eric Dumazet
Cc: netdev, Paolo Abeni, David Miller, Jakub Kicinski, Neal Cardwell,
Yuchung Cheng, Aananth V
For passive TCP Fast Open sockets whose SYN/ACK timed out and that did not
send more data in SYN_RECV, the congestion state may awkwardly stay in
CA_Loss upon receiving the final ACK of the 3WHS, unless the CA state
was undone due to TCP timestamp checks. However, if
tcp_rcv_synrecv_state_fastopen() decides not to undo, we should still
enter CA_Open, because at that point we have received an ACK covering
the retransmitted SYNACKs. Currently, icsk_ca_state is only set to
CA_Open after we receive an ACK for a data packet. This is because
tcp_ack() does not call tcp_fastretrans_alert() (and thus
tcp_process_loss()) if !prior_packets.
Note that tcp_process_loss() calls tcp_try_undo_recovery(), so having
tcp_rcv_synrecv_state_fastopen() decide that if we're in CA_Loss we
should call tcp_try_undo_recovery() is consistent with that, and
low risk.
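The behavioural change can be pictured with a deliberately simplified
userspace model. This is not kernel code: the toy_sock struct, the state
enum and both function names are hypothetical, the ts_undo_ok flag stands
in for the timestamp-based undo check, and cwnd restoration and any special
cases inside tcp_try_undo_recovery() are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for the kernel's CA states. */
enum toy_ca_state { TOY_CA_OPEN, TOY_CA_LOSS };

struct toy_sock {
	enum toy_ca_state ca_state;
	unsigned int packets_out;	/* segments still in flight */
};

/* Model of the old behaviour: the final ACK of the 3WHS leaves CA_Loss
 * only when the timestamp check shows the SYNACK RTO was spurious.
 */
static void toy_final_ack_old(struct toy_sock *sk, bool ts_undo_ok)
{
	assert(sk);
	if (sk->ca_state == TOY_CA_LOSS && ts_undo_ok)
		sk->ca_state = TOY_CA_OPEN;
}

/* Model of the new behaviour: with nothing left in flight, the ACK covers
 * the retransmitted SYNACKs, so the socket returns to CA_Open whether or
 * not the undo check succeeded (ts_undo_ok would only decide whether the
 * congestion window is restored, which this model omits).
 */
static void toy_final_ack_new(struct toy_sock *sk, bool ts_undo_ok)
{
	assert(sk);
	(void)ts_undo_ok;
	if (sk->ca_state == TOY_CA_LOSS && !sk->packets_out)
		sk->ca_state = TOY_CA_OPEN;
}
```

In this model a socket with packets_out == 0 and a failed undo check stays
in CA_Loss under the old logic and moves to CA_Open under the new one.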
Fixes: dad8cea7add9 ("tcp: fix TFO SYNACK undo to avoid double-timestamp-undo")
Signed-off-by: Aananth V <aananthv@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp_input.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 06fe1cf645d5..fe2ab0db2eb7 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6436,22 +6436,23 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
static void tcp_rcv_synrecv_state_fastopen(struct sock *sk)
{
+ struct tcp_sock *tp = tcp_sk(sk);
struct request_sock *req;
/* If we are still handling the SYNACK RTO, see if timestamp ECR allows
* undo. If peer SACKs triggered fast recovery, we can't undo here.
*/
- if (inet_csk(sk)->icsk_ca_state == TCP_CA_Loss)
- tcp_try_undo_loss(sk, false);
+ if (inet_csk(sk)->icsk_ca_state == TCP_CA_Loss && !tp->packets_out)
+ tcp_try_undo_recovery(sk);
/* Reset rtx states to prevent spurious retransmits_timed_out() */
- tcp_sk(sk)->retrans_stamp = 0;
+ tp->retrans_stamp = 0;
inet_csk(sk)->icsk_retransmits = 0;
/* Once we leave TCP_SYN_RECV or TCP_FIN_WAIT_1,
* we no longer need req so release it.
*/
- req = rcu_dereference_protected(tcp_sk(sk)->fastopen_rsk,
+ req = rcu_dereference_protected(tp->fastopen_rsk,
lockdep_sock_is_held(sk));
reqsk_fastopen_remove(sk, req, false);
--
2.42.0.283.g2d96d420d3-goog
* [PATCH net-next v2 2/2] tcp: new TCP_INFO stats for RTO events
From: Aananth V @ 2023-09-14 14:36 UTC (permalink / raw)
To: Eric Dumazet
Cc: netdev, Paolo Abeni, David Miller, Jakub Kicinski, Neal Cardwell,
Yuchung Cheng, Aananth V
The 2023 SIGCOMM paper "Improving Network Availability with Protective
ReRoute" indicates that Linux TCP's RTO-triggered txhash rehashing can
effectively reduce application disruption during outages. To better
measure the efficacy of this feature, this patch adds three more
detailed stats during RTO recovery and exports them via TCP_INFO.
Applications and monitoring systems can leverage this data to measure
the network path diversity and end-to-end repair latency during network
outages to improve their network infrastructure.
The following counters are added to tcp_sock in order to track RTO
events over the lifetime of a TCP socket.
1. u16 total_rto - Counts the total number of RTO timeouts.
2. u16 total_rto_recoveries - Counts the total number of RTO recoveries.
3. u32 total_rto_time - Counts the total time spent (ms) in RTO
   recoveries (time spent in CA_Loss and CA_Recovery states).
To compute total_rto_time, we add a new u32 rto_stamp field to
tcp_sock. rto_stamp records the start timestamp (ms) of the last RTO
recovery (CA_Loss).
Corresponding fields are also added to the tcp_info struct.
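The accounting described above can be sketched as a small userspace model.
It mirrors the shape of tcp_update_rto_stats(), tcp_update_rto_time() and
the tcp_get_info() addition in this patch, but the struct and function
names are hypothetical, time is passed in explicitly as milliseconds, and
the reset of the retransmit count (done elsewhere in the kernel) is folded
into the recovery helper for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-connection state; retransmits stands in for
 * icsk_retransmits.
 */
struct toy_rto {
	uint32_t rto_stamp;		/* start (ms) of current recovery, 0 if none */
	uint16_t total_rto;
	uint16_t total_rto_recoveries;
	uint32_t total_rto_time;	/* ms in completed recoveries */
	uint32_t retransmits;
};

/* An RTO fired: the first timeout of an episode opens a new recovery. */
static void toy_rto_fired(struct toy_rto *s, uint32_t now_ms)
{
	assert(now_ms != 0);	/* 0 is reserved for "no recovery in progress" */
	if (!s->retransmits) {
		s->total_rto_recoveries++;
		s->rto_stamp = now_ms;
	}
	s->retransmits++;
	s->total_rto++;
}

/* The recovery ended: bank the elapsed time and clear the stamp. */
static void toy_rto_recovered(struct toy_rto *s, uint32_t now_ms)
{
	if (s->rto_stamp) {
		s->total_rto_time += now_ms - s->rto_stamp;
		s->rto_stamp = 0;
	}
	s->retransmits = 0;	/* simplification, see the note above */
}

/* Value reported to userspace: completed time plus any ongoing recovery. */
static uint32_t toy_rto_time_reported(const struct toy_rto *s, uint32_t now_ms)
{
	uint32_t t = s->total_rto_time;

	if (s->rto_stamp)
		t += now_ms - s->rto_stamp;
	return t;
}
```

Two back-to-back timeouts therefore count as two RTOs but one recovery, and
a reader sampling mid-recovery sees the time accrued so far rather than
zero.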
Signed-off-by: Aananth V <aananthv@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
include/linux/tcp.h | 8 ++++++++
include/uapi/linux/tcp.h | 12 ++++++++++++
net/ipv4/tcp.c | 9 +++++++++
net/ipv4/tcp_input.c | 15 +++++++++++++++
net/ipv4/tcp_minisocks.c | 4 ++++
net/ipv4/tcp_timer.c | 17 +++++++++++++++--
6 files changed, 63 insertions(+), 2 deletions(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 3c5efeeb024f..9b371aa7c796 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -377,6 +377,14 @@ struct tcp_sock {
* Total data bytes retransmitted
*/
u32 total_retrans; /* Total retransmits for entire connection */
+ u32 rto_stamp; /* Start time (ms) of last CA_Loss recovery */
+ u16 total_rto; /* Total number of RTO timeouts, including
+ * SYN/SYN-ACK and recurring timeouts.
+ */
+ u16 total_rto_recoveries; /* Total number of RTO recoveries,
+ * including any unfinished recovery.
+ */
+ u32 total_rto_time; /* ms spent in (completed) RTO recoveries. */
u32 urg_seq; /* Seq of received urgent pointer */
unsigned int keepalive_time; /* time before keep alive takes place */
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 879eeb0a084b..d1d08da6331a 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -289,6 +289,18 @@ struct tcp_info {
*/
__u32 tcpi_rehash; /* PLB or timeout triggered rehash attempts */
+
+ __u16 tcpi_total_rto; /* Total number of RTO timeouts, including
+ * SYN/SYN-ACK and recurring timeouts.
+ */
+ __u16 tcpi_total_rto_recoveries; /* Total number of RTO
+ * recoveries, including any
+ * unfinished recovery.
+ */
+ __u32 tcpi_total_rto_time; /* Total time spent in RTO recoveries
+ * in milliseconds, including any
+ * unfinished recovery.
+ */
};
/* netlink attributes types for SCM_TIMESTAMPING_OPT_STATS */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0c3040a63ebd..69b8d7073708 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3818,6 +3818,15 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
info->tcpi_rcv_wnd = tp->rcv_wnd;
info->tcpi_rehash = tp->plb_rehash + tp->timeout_rehash;
info->tcpi_fastopen_client_fail = tp->fastopen_client_fail;
+
+ info->tcpi_total_rto = tp->total_rto;
+ info->tcpi_total_rto_recoveries = tp->total_rto_recoveries;
+ info->tcpi_total_rto_time = tp->total_rto_time;
+ if (tp->rto_stamp) {
+ info->tcpi_total_rto_time += tcp_time_stamp_raw() -
+ tp->rto_stamp;
+ }
+
unlock_sock_fast(sk, slow);
}
EXPORT_SYMBOL_GPL(tcp_get_info);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index fe2ab0db2eb7..f199e0ca0786 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2088,6 +2088,10 @@ void tcp_clear_retrans(struct tcp_sock *tp)
tp->undo_marker = 0;
tp->undo_retrans = -1;
tp->sacked_out = 0;
+ tp->rto_stamp = 0;
+ tp->total_rto = 0;
+ tp->total_rto_recoveries = 0;
+ tp->total_rto_time = 0;
}
static inline void tcp_init_undo(struct tcp_sock *tp)
@@ -2825,6 +2829,14 @@ void tcp_enter_recovery(struct sock *sk, bool ece_ack)
tcp_set_ca_state(sk, TCP_CA_Recovery);
}
+static void tcp_update_rto_time(struct tcp_sock *tp)
+{
+ if (tp->rto_stamp) {
+ tp->total_rto_time += tcp_time_stamp(tp) - tp->rto_stamp;
+ tp->rto_stamp = 0;
+ }
+}
+
/* Process an ACK in CA_Loss state. Move to CA_Open if lost data are
* recovered or spurious. Otherwise retransmits more on partial ACKs.
*/
@@ -3029,6 +3041,8 @@ static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
break;
case TCP_CA_Loss:
tcp_process_loss(sk, flag, num_dupack, rexmit);
+ if (icsk->icsk_ca_state != TCP_CA_Loss)
+ tcp_update_rto_time(tp);
tcp_identify_packet_loss(sk, ack_flag);
if (!(icsk->icsk_ca_state == TCP_CA_Open ||
(*ack_flag & FLAG_LOST_RETRANS)))
@@ -6446,6 +6460,7 @@ static void tcp_rcv_synrecv_state_fastopen(struct sock *sk)
tcp_try_undo_recovery(sk);
/* Reset rtx states to prevent spurious retransmits_timed_out() */
+ tcp_update_rto_time(tp);
tp->retrans_stamp = 0;
inet_csk(sk)->icsk_retransmits = 0;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index b98d476f1594..eee8ab1bfa0e 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -565,6 +565,10 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
newtp->undo_marker = treq->snt_isn;
newtp->retrans_stamp = div_u64(treq->snt_synack,
USEC_PER_SEC / TCP_TS_HZ);
+ newtp->total_rto = req->num_timeout;
+ newtp->total_rto_recoveries = 1;
+ newtp->total_rto_time = tcp_time_stamp_raw() -
+ newtp->retrans_stamp;
}
newtp->tsoffset = treq->ts_off;
#ifdef CONFIG_TCP_MD5SIG
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 984ab4a0421e..9e0c34175cfb 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -415,6 +415,19 @@ abort: tcp_write_err(sk);
}
}
+static void tcp_update_rto_stats(struct sock *sk)
+{
+ struct inet_connection_sock *icsk = inet_csk(sk);
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ if (!icsk->icsk_retransmits) {
+ tp->total_rto_recoveries++;
+ tp->rto_stamp = tcp_time_stamp(tp);
+ }
+ icsk->icsk_retransmits++;
+ tp->total_rto++;
+}
+
/*
* Timer for Fast Open socket to retransmit SYNACK. Note that the
* sk here is the child socket, not the parent (listener) socket.
@@ -447,7 +460,7 @@ static void tcp_fastopen_synack_timer(struct sock *sk, struct request_sock *req)
*/
inet_rtx_syn_ack(sk, req);
req->num_timeout++;
- icsk->icsk_retransmits++;
+ tcp_update_rto_stats(sk);
if (!tp->retrans_stamp)
tp->retrans_stamp = tcp_time_stamp(tp);
inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
@@ -575,7 +588,7 @@ void tcp_retransmit_timer(struct sock *sk)
tcp_enter_loss(sk);
- icsk->icsk_retransmits++;
+ tcp_update_rto_stats(sk);
if (tcp_retransmit_skb(sk, tcp_rtx_queue_head(sk), 1) > 0) {
/* Retransmission failed because of local congestion,
* Let senders fight for local resources conservatively.
--
2.42.0.283.g2d96d420d3-goog
* Re: [PATCH net-next v2 0/2] tcp: new TCP_INFO stats for RTO events
From: patchwork-bot+netdevbpf @ 2023-09-16 12:50 UTC (permalink / raw)
To: Aananth V; +Cc: edumazet, netdev, pabeni, davem, kuba, ncardwell, ycheng
Hello:
This series was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:
On Thu, 14 Sep 2023 14:36:19 +0000 you wrote:
> The 2023 SIGCOMM paper "Improving Network Availability with Protective
> ReRoute" has indicated Linux TCP's RTO-triggered txhash rehashing can
> effectively reduce application disruption during outages. To better
> measure the efficacy of this feature, this patch set adds three more
> detailed stats during RTO recovery and exports via TCP_INFO.
> Applications and monitoring systems can leverage this data to measure
> the network path diversity and end-to-end repair latency during network
> outages to improve their network infrastructure.
>
> [...]
Here is the summary with links:
- [net-next,v2,1/2] tcp: call tcp_try_undo_recovery when an RTOd TFO SYNACK is ACKed
https://git.kernel.org/netdev/net-next/c/e326578a2141
- [net-next,v2,2/2] tcp: new TCP_INFO stats for RTO events
https://git.kernel.org/netdev/net-next/c/3868ab0f1925
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html