From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Yousuk Seung <ysseung@google.com>,
Neal Cardwell <ncardwell@google.com>,
Yuchung Cheng <ycheng@google.com>,
Soheil Hassas Yeganeh <soheil@google.com>,
Eric Dumazet <edumazet@google.com>,
Priyaranjan Jha <priyarjha@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.9 46/75] tcp: invalidate rate samples during SACK reneging
Date: Mon, 1 Jan 2018 15:32:23 +0100
Message-ID: <20180101140104.203366823@linuxfoundation.org>
In-Reply-To: <20180101140056.475827799@linuxfoundation.org>
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yousuk Seung <ysseung@google.com>
[ Upstream commit d4761754b4fb2ef8d9a1e9d121c4bec84e1fe292 ]
Mark tcp_sock during a SACK reneging event and invalidate rate samples
while marked. Such rate samples may overestimate bw by including packets
that were SACKed before reneging.
< ack 6001 win 10000 sack 7001:38001
< ack 7001 win 0 sack 8001:38001 // Reneg detected
> seq 7001:8001 // RTO, SACK cleared.
< ack 38001 win 10000
In the above example, the rate sample taken after the last ACK counts
7001-38001 as delivered, while the actual delivery rate is likely much
lower, i.e. only 7001-8001.
This patch adds a new field tcp_sock.is_sack_reneg, marks it when we
declare SACK reneging and enter TCP_CA_Loss, and unmarks it after
the last rate sample has been taken before moving back to TCP_CA_Open.
This patch also invalidates rate samples taken while
tcp_sock.is_sack_reneg is set.
Fixes: b9f64820fb22 ("tcp: track data delivery rate for a TCP connection")
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
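Note (illustration only, not part of the patch): the marking and
invalidation described above can be sketched in standalone C. Everything
prefixed demo_ is hypothetical and heavily simplified: it counts sequence
space rather than packets, ignores sequence wraparound, and treats
"ack reached high_seq" as the point where the connection returns to
TCP_CA_Open. The one detail it tries to mirror is that the flag is
snapshotted on ACK entry, so the ACK that ends recovery still yields an
invalid sample even though the flag is cleared while processing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rate_sample; -1 marks an invalid sample. */
struct demo_rate_sample {
	int32_t delivered;	/* sequence space newly ACKed in this sample */
	int64_t interval_us;	/* sampling interval in microseconds */
};

/* Hypothetical stand-in for the few tcp_sock fields this sketch needs. */
struct demo_conn {
	uint32_t snd_una;	/* first unacknowledged sequence number */
	uint32_t high_seq;	/* highest sequence sent when loss was declared */
	bool is_sack_reneg;	/* in recovery from loss with SACK reneging? */
};

/* Entering loss recovery: mark the connection if reneging was detected. */
void demo_enter_loss(struct demo_conn *c, bool reneg_detected)
{
	if (reneg_detected)
		c->is_sack_reneg = true;
}

/* Process one cumulative ACK and produce a rate sample in *rs. */
void demo_ack(struct demo_conn *c, uint32_t ack_seq, int64_t interval_us,
	      struct demo_rate_sample *rs)
{
	/* Snapshot first: recovery may end (and clear the flag) below. */
	bool was_reneg = c->is_sack_reneg;
	uint32_t newly_acked;

	assert(ack_seq >= c->snd_una);	/* sketch ignores wraparound */
	newly_acked = ack_seq - c->snd_una;
	c->snd_una = ack_seq;

	/* Recovery is complete once everything sent before the loss is ACKed. */
	if (ack_seq >= c->high_seq)
		c->is_sack_reneg = false;

	/* No timing information, or sample taken during reneg recovery. */
	if (interval_us <= 0 || was_reneg) {
		rs->delivered = -1;
		rs->interval_us = -1;
		return;
	}

	rs->delivered = (int32_t)newly_acked;
	rs->interval_us = interval_us;
}
```

With snd_una = 7001 and high_seq = 38001 as in the trace above, calling
demo_enter_loss() with reneging detected and then demo_ack() for ack 38001
returns an invalid sample instead of crediting 31000 units of sequence
space to that single ACK; the next ACK after recovery produces a normal
sample again.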
include/linux/tcp.h | 3 ++-
include/net/tcp.h | 2 +-
net/ipv4/tcp.c | 1 +
net/ipv4/tcp_input.c | 10 ++++++++--
net/ipv4/tcp_rate.c | 10 +++++++---
5 files changed, 19 insertions(+), 7 deletions(-)
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -219,7 +219,8 @@ struct tcp_sock {
} rack;
u16 advmss; /* Advertised MSS */
u8 rate_app_limited:1, /* rate_{delivered,interval_us} limited? */
- unused:7;
+ is_sack_reneg:1, /* in recovery from loss with SACK reneg? */
+ unused:6;
u8 nonagle : 4,/* Disable Nagle algorithm? */
thin_lto : 1,/* Use linear timeouts for thin streams */
thin_dupack : 1,/* Fast retransmit on first dupack */
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1001,7 +1001,7 @@ void tcp_rate_skb_sent(struct sock *sk,
void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
struct rate_sample *rs);
void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
- struct skb_mstamp *now, struct rate_sample *rs);
+ bool is_sack_reneg, struct skb_mstamp *now, struct rate_sample *rs);
void tcp_rate_check_app_limited(struct sock *sk);
/* These functions determine how the current flow behaves in respect of SACK
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2297,6 +2297,7 @@ int tcp_disconnect(struct sock *sk, int
tp->snd_cwnd_cnt = 0;
tp->window_clamp = 0;
tcp_set_ca_state(sk, TCP_CA_Open);
+ tp->is_sack_reneg = 0;
tcp_clear_retrans(tp);
inet_csk_delack_init(sk);
/* Initialize rcv_mss to TCP_MIN_MSS to avoid division by 0
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1966,6 +1966,8 @@ void tcp_enter_loss(struct sock *sk)
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSACKRENEGING);
tp->sacked_out = 0;
tp->fackets_out = 0;
+ /* Mark SACK reneging until we recover from this loss event. */
+ tp->is_sack_reneg = 1;
}
tcp_clear_all_retrans_hints(tp);
@@ -2463,6 +2465,7 @@ static bool tcp_try_undo_recovery(struct
return true;
}
tcp_set_ca_state(sk, TCP_CA_Open);
+ tp->is_sack_reneg = 0;
return false;
}
@@ -2494,8 +2497,10 @@ static bool tcp_try_undo_loss(struct soc
NET_INC_STATS(sock_net(sk),
LINUX_MIB_TCPSPURIOUSRTOS);
inet_csk(sk)->icsk_retransmits = 0;
- if (frto_undo || tcp_is_sack(tp))
+ if (frto_undo || tcp_is_sack(tp)) {
tcp_set_ca_state(sk, TCP_CA_Open);
+ tp->is_sack_reneg = 0;
+ }
return true;
}
return false;
@@ -3589,6 +3594,7 @@ static int tcp_ack(struct sock *sk, cons
struct tcp_sacktag_state sack_state;
struct rate_sample rs = { .prior_delivered = 0 };
u32 prior_snd_una = tp->snd_una;
+ bool is_sack_reneg = tp->is_sack_reneg;
u32 ack_seq = TCP_SKB_CB(skb)->seq;
u32 ack = TCP_SKB_CB(skb)->ack_seq;
bool is_dupack = false;
@@ -3711,7 +3717,7 @@ static int tcp_ack(struct sock *sk, cons
tcp_schedule_loss_probe(sk);
delivered = tp->delivered - delivered; /* freshly ACKed or SACKed */
lost = tp->lost - lost; /* freshly marked lost */
- tcp_rate_gen(sk, delivered, lost, &now, &rs);
+ tcp_rate_gen(sk, delivered, lost, is_sack_reneg, &now, &rs);
tcp_cong_control(sk, ack, delivered, flag, &rs);
tcp_xmit_recovery(sk, rexmit);
return 1;
--- a/net/ipv4/tcp_rate.c
+++ b/net/ipv4/tcp_rate.c
@@ -106,7 +106,7 @@ void tcp_rate_skb_delivered(struct sock
/* Update the connection delivery information and generate a rate sample. */
void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
- struct skb_mstamp *now, struct rate_sample *rs)
+ bool is_sack_reneg, struct skb_mstamp *now, struct rate_sample *rs)
{
struct tcp_sock *tp = tcp_sk(sk);
u32 snd_us, ack_us;
@@ -124,8 +124,12 @@ void tcp_rate_gen(struct sock *sk, u32 d
rs->acked_sacked = delivered; /* freshly ACKed or SACKed */
rs->losses = lost; /* freshly marked lost */
- /* Return an invalid sample if no timing information is available. */
- if (!rs->prior_mstamp.v64) {
+ /* Return an invalid sample if no timing information is available or
+ * in recovery from loss with SACK reneging. Rate samples taken during
+ * a SACK reneging event may overestimate bw by including packets that
+ * were SACKed before the reneg.
+ */
+ if (!rs->prior_mstamp.v64 || is_sack_reneg) {
rs->delivered = -1;
rs->interval_us = -1;
return;