From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Yousuk Seung <ysseung@google.com>,
Neal Cardwell <ncardwell@google.com>,
Yuchung Cheng <ycheng@google.com>,
Soheil Hassas Yeganeh <soheil@google.com>,
Eric Dumazet <edumazet@google.com>,
Priyaranjan Jha <priyarjha@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.9 46/75] tcp: invalidate rate samples during SACK reneging
Date: Mon, 1 Jan 2018 15:32:23 +0100
Message-ID: <20180101140104.203366823@linuxfoundation.org>
In-Reply-To: <20180101140056.475827799@linuxfoundation.org>
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yousuk Seung <ysseung@google.com>
[ Upstream commit d4761754b4fb2ef8d9a1e9d121c4bec84e1fe292 ]
Mark tcp_sock during a SACK reneging event and invalidate rate samples
while marked. Such rate samples may overestimate bw by including packets
that were SACKed before reneging.
   < ack 6001 win 10000 sack 7001:38001
   < ack 7001 win 0 sack 8001:38001 // Reneg detected
   > seq 7001:8001 // RTO, SACK cleared.
   < ack 38001 win 10000
In the above example, the rate sample taken after the last ack will count
7001-38001 as delivered, while the actual delivery rate is likely much
lower, i.e. 7001-8001.
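
As a rough illustration of the size of the error in the trace above, the
following user-space sketch compares the range credited to the last ACK
with the range actually retransmitted after the RTO. It is not kernel
code: the helper names are hypothetical, and sequence ranges are treated
as byte counts for simplicity, whereas the kernel's rate sampling counts
packets.

```c
#include <assert.h>
#include <stdint.h>

/* Sequence ranges copied from the trace above; treated as byte counts
 * (an assumption made only for this illustration).
 */
struct seq_range {
	uint32_t start;	/* first sequence number in the range */
	uint32_t end;	/* one past the last sequence number */
};

/* Number of bytes covered by the half-open range [start, end). */
static uint32_t range_len(struct seq_range r)
{
	assert(r.end >= r.start);	/* no wraparound handling in this sketch */
	return r.end - r.start;
}

/* What the final ACK appears to deliver: everything from 7001 to 38001. */
static uint32_t bytes_credited_to_last_ack(void)
{
	struct seq_range credited = { 7001, 38001 };

	return range_len(credited);
}

/* What was actually sent after the RTO: the retransmit 7001:8001. */
static uint32_t bytes_sent_after_rto(void)
{
	struct seq_range retransmit = { 7001, 8001 };

	return range_len(retransmit);
}

/* Integer overestimation factor over the same sampling interval. */
static uint32_t overestimation_factor(void)
{
	return bytes_credited_to_last_ack() / bytes_sent_after_rto();
}
```

With the numbers from the trace, the credited range is 31000 bytes
against 1000 bytes actually retransmitted, so a sample over that interval
could be inflated by a factor of about 31.
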
This patch adds a new field tcp_sock.is_sack_reneg, marks it when we
declare SACK reneging and enter TCP_CA_Loss, and unmarks it after the
last rate sample has been taken before moving back to TCP_CA_Open. This
patch also invalidates rate samples taken while tcp_sock.is_sack_reneg
is set.
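
The lifecycle of the flag can be sketched in user-space C as below. This
is a simplified model, not the kernel implementation: the sketch_* types
and functions are hypothetical stand-ins, and only the is_sack_reneg
field and the -1 "invalid sample" convention come from the patch. The
point it illustrates is that the flag is read at the start of ACK
processing, so the ACK that ends recovery still yields an invalid sample.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * needed for this sketch are modelled.
 */
struct sketch_sock {
	bool is_sack_reneg;	/* mirrors tcp_sock.is_sack_reneg */
};

struct sketch_rate_sample {
	int32_t delivered;	/* -1 marks an invalid sample */
	long interval_us;	/* -1 marks an invalid sample */
};

/* Entering loss recovery; the flag is set only if reneging was detected. */
static void sketch_enter_loss(struct sketch_sock *sk, bool reneg_detected)
{
	if (reneg_detected)
		sk->is_sack_reneg = true;
}

/* Returning to the open state clears the flag. */
static void sketch_back_to_open(struct sketch_sock *sk)
{
	sk->is_sack_reneg = false;
}

/* Process one ACK. The flag is snapshotted before any state change, so
 * the ACK that ends recovery is still treated as tainted. Returns true
 * if the generated sample is valid.
 */
static bool sketch_ack(struct sketch_sock *sk, bool recovery_done,
		       int32_t delivered, long interval_us,
		       struct sketch_rate_sample *rs)
{
	bool was_reneg;

	assert(sk && rs);
	was_reneg = sk->is_sack_reneg;	/* snapshot first */

	if (recovery_done)
		sketch_back_to_open(sk);

	if (was_reneg) {
		rs->delivered = -1;
		rs->interval_us = -1;
		return false;
	}
	rs->delivered = delivered;
	rs->interval_us = interval_us;
	return true;
}
```

In this model, an ACK handled right after sketch_enter_loss(sk, true)
produces an invalid sample even when it completes recovery, while the
next ACK on the same socket produces a valid one.
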
Fixes: b9f64820fb22 ("tcp: track data delivery rate for a TCP connection")
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/tcp.h  |    3 ++-
 include/net/tcp.h    |    2 +-
 net/ipv4/tcp.c       |    1 +
 net/ipv4/tcp_input.c |   10 ++++++++--
 net/ipv4/tcp_rate.c  |   10 +++++++---
 5 files changed, 19 insertions(+), 7 deletions(-)

--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -219,7 +219,8 @@ struct tcp_sock {
 	} rack;
 	u16	advmss;		/* Advertised MSS */
 	u8	rate_app_limited:1,  /* rate_{delivered,interval_us} limited? */
-		unused:7;
+		is_sack_reneg:1,    /* in recovery from loss with SACK reneg? */
+		unused:6;
 	u8	nonagle     : 4,/* Disable Nagle algorithm? */
 		thin_lto    : 1,/* Use linear timeouts for thin streams */
 		thin_dupack : 1,/* Fast retransmit on first dupack */
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1001,7 +1001,7 @@ void tcp_rate_skb_sent(struct sock *sk,
 void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
 			    struct rate_sample *rs);
 void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
-		  struct skb_mstamp *now, struct rate_sample *rs);
+		  bool is_sack_reneg, struct skb_mstamp *now, struct rate_sample *rs);
 void tcp_rate_check_app_limited(struct sock *sk);
 /* These functions determine how the current flow behaves in respect of SACK
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2297,6 +2297,7 @@ int tcp_disconnect(struct sock *sk, int
 	tp->snd_cwnd_cnt = 0;
 	tp->window_clamp = 0;
 	tcp_set_ca_state(sk, TCP_CA_Open);
+	tp->is_sack_reneg = 0;
 	tcp_clear_retrans(tp);
 	inet_csk_delack_init(sk);
 	/* Initialize rcv_mss to TCP_MIN_MSS to avoid division by 0
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1966,6 +1966,8 @@ void tcp_enter_loss(struct sock *sk)
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSACKRENEGING);
 		tp->sacked_out = 0;
 		tp->fackets_out = 0;
+		/* Mark SACK reneging until we recover from this loss event. */
+		tp->is_sack_reneg = 1;
 	}
 	tcp_clear_all_retrans_hints(tp);
@@ -2463,6 +2465,7 @@ static bool tcp_try_undo_recovery(struct
 		return true;
 	}
 	tcp_set_ca_state(sk, TCP_CA_Open);
+	tp->is_sack_reneg = 0;
 	return false;
 }
@@ -2494,8 +2497,10 @@ static bool tcp_try_undo_loss(struct soc
 			NET_INC_STATS(sock_net(sk),
 					LINUX_MIB_TCPSPURIOUSRTOS);
 		inet_csk(sk)->icsk_retransmits = 0;
-		if (frto_undo || tcp_is_sack(tp))
+		if (frto_undo || tcp_is_sack(tp)) {
 			tcp_set_ca_state(sk, TCP_CA_Open);
+			tp->is_sack_reneg = 0;
+		}
 		return true;
 	}
 	return false;
@@ -3589,6 +3594,7 @@ static int tcp_ack(struct sock *sk, cons
 	struct tcp_sacktag_state sack_state;
 	struct rate_sample rs = { .prior_delivered = 0 };
 	u32 prior_snd_una = tp->snd_una;
+	bool is_sack_reneg = tp->is_sack_reneg;
 	u32 ack_seq = TCP_SKB_CB(skb)->seq;
 	u32 ack = TCP_SKB_CB(skb)->ack_seq;
 	bool is_dupack = false;
@@ -3711,7 +3717,7 @@ static int tcp_ack(struct sock *sk, cons
 		tcp_schedule_loss_probe(sk);
 	delivered = tp->delivered - delivered;	/* freshly ACKed or SACKed */
 	lost = tp->lost - lost;			/* freshly marked lost */
-	tcp_rate_gen(sk, delivered, lost, &now, &rs);
+	tcp_rate_gen(sk, delivered, lost, is_sack_reneg, &now, &rs);
 	tcp_cong_control(sk, ack, delivered, flag, &rs);
 	tcp_xmit_recovery(sk, rexmit);
 	return 1;
--- a/net/ipv4/tcp_rate.c
+++ b/net/ipv4/tcp_rate.c
@@ -106,7 +106,7 @@ void tcp_rate_skb_delivered(struct sock
 /* Update the connection delivery information and generate a rate sample. */
 void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
-		  struct skb_mstamp *now, struct rate_sample *rs)
+		  bool is_sack_reneg, struct skb_mstamp *now, struct rate_sample *rs)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	u32 snd_us, ack_us;
@@ -124,8 +124,12 @@ void tcp_rate_gen(struct sock *sk, u32 d
 	rs->acked_sacked = delivered;	/* freshly ACKed or SACKed */
 	rs->losses = lost;		/* freshly marked lost */
-	/* Return an invalid sample if no timing information is available. */
-	if (!rs->prior_mstamp.v64) {
+	/* Return an invalid sample if no timing information is available or
+	 * in recovery from loss with SACK reneging. Rate samples taken during
+	 * a SACK reneging event may overestimate bw by including packets that
+	 * were SACKed before the reneg.
+	 */
+	if (!rs->prior_mstamp.v64 || is_sack_reneg) {
 		rs->delivered = -1;
 		rs->interval_us = -1;
 		return;