public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <Alexander.Levin@microsoft.com>
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>
Cc: Neal Cardwell <ncardwell@google.com>,
	Yuchung Cheng <ycheng@google.com>,
	"David S . Miller" <davem@davemloft.net>,
	Sasha Levin <Alexander.Levin@microsoft.com>
Subject: [PATCH AUTOSEL for 4.15 15/78] tcp: allow TLP in ECN CWR
Date: Thu, 8 Mar 2018 04:56:07 +0000	[thread overview]
Message-ID: <20180308045525.7662-15-alexander.levin@microsoft.com> (raw)
In-Reply-To: <20180308045525.7662-1-alexander.levin@microsoft.com>

From: Neal Cardwell <ncardwell@google.com>

[ Upstream commit b4f70c3d4ec32a2ff4c62e1e2da0da5f55fe12bd ]

This patch enables tail loss probe in cwnd reduction (CWR) state
to detect potential losses. Prior to this patch, since the sender
uses PRR to determine the cwnd in CWR state, the combination of
CWR+PRR plus tcp_tso_should_defer() could cause unnecessary stalls
upon losses: PRR makes cwnd so gentle that tcp_tso_should_defer()
defers sending wait for more ACKs. The ACKs may not come due to
packet losses.

Disallowing TLP when there is unused cwnd had the primary effect
of disallowing TLP when there is TSO deferral, Nagle deferral,
or we hit the rwin limit. Because basically every application
write() or incoming ACK will cause us to run tcp_write_xmit()
to see if we can send more, and then if we sent something we call
tcp_schedule_loss_probe() to see if we should schedule a TLP. At
that point, there are a few common reasons why some cwnd budget
could still be unused:

(a) rwin limit
(b) nagle check
(c) TSO deferral
(d) TSQ

For (d), after the next packet tx completion the TSQ mechanism
will allow us to send more packets, so we don't really need a
TLP (in practice it shouldn't matter whether we schedule one
or not). But for (a), (b), (c) the sender won't send any more
packets until it gets another ACK. But if the whole flight was
lost, or all the ACKs were lost, then we won't get any more ACKs,
and ideally we should schedule and send a TLP to get more feedback.
In particular for a long time we have wanted some kind of timer for
TSO deferral, and at least this would give us some kind of timer

Reported-by: Steve Ibanez <sibanez@stanford.edu>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Nandita Dukkipati <nanditad@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
---
 net/ipv4/tcp_output.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index a4d214c7b506..04be9f833927 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2414,15 +2414,12 @@ bool tcp_schedule_loss_probe(struct sock *sk, bool advancing_rto)
 
 	early_retrans = sock_net(sk)->ipv4.sysctl_tcp_early_retrans;
 	/* Schedule a loss probe in 2*RTT for SACK capable connections
-	 * in Open state, that are either limited by cwnd or application.
+	 * not in loss recovery, that are either limited by cwnd or application.
 	 */
 	if ((early_retrans != 3 && early_retrans != 4) ||
 	    !tp->packets_out || !tcp_is_sack(tp) ||
-	    icsk->icsk_ca_state != TCP_CA_Open)
-		return false;
-
-	if ((tp->snd_cwnd > tcp_packets_in_flight(tp)) &&
-	     !tcp_write_queue_empty(sk))
+	    (icsk->icsk_ca_state != TCP_CA_Open &&
+	     icsk->icsk_ca_state != TCP_CA_CWR))
 		return false;
 
 	/* Probe timeout is 2*rtt. Add minimum RTO to account
-- 
2.14.1

  parent reply	other threads:[~2018-03-08  4:56 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-08  4:56 [PATCH AUTOSEL for 4.15 01/78] ipmi_si: Fix error handling of platform device Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 03/78] Bluetooth: hci_qca: Avoid setup failure on missing rampatch Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 02/78] drm/amdgpu: use polling mem to set SDMA3 wptr for VF Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 05/78] cpufreq: longhaul: Revert transition_delay_us to 200 ms Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 04/78] Bluetooth: btqcomsmd: Fix skb double free corruption Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 06/78] dt-bindings: net: add TI CC2560 Bluetooth chip Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 07/78] media: c8sectpfe: fix potential NULL pointer dereference in c8sectpfe_timer_interrupt Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 08/78] drm/msm: fix leak in failed get_pages Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 09/78] net: fec: add phy_reset_after_clk_enable() support Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 10/78] dm: ensure bio submission follows a depth-first tree walk Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 11/78] IB/ipoib: Warn when one port fails to initialize Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 13/78] hv_netvsc: Fix the receive buffer size limit Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 12/78] RDMA/iwpm: Fix uninitialized error code in iwpm_send_mapinfo() Sasha Levin
2018-03-08  4:56 ` Sasha Levin [this message]
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 14/78] hv_netvsc: Fix the TX/RX buffer default sizes Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 16/78] KVM: x86: add support for emulating UMIP Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 17/78] spi: sh-msiof: Avoid writing to registers from spi_master.setup() Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 18/78] libbpf: prefer global symbols as bpf program name source Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 19/78] rtlwifi: rtl_pci: Fix the bug when inactiveps is enabled Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 20/78] rtlwifi: always initialize variables given to RT_TRACE() Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 22/78] ath10k: handling qos at STA side based on AP WMM enable/disable Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 21/78] media: bt8xx: Fix err 'bt878_probe()' Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 23/78] media: [RESEND] media: dvb-frontends: Add delay to Si2168 restart Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 24/78] qmi_wwan: set FLAG_SEND_ZLP to avoid network initiated disconnect Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 25/78] tty: goldfish: Enable 'earlycon' only if built-in Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 26/78] serial: 8250_dw: Disable clock on error Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 28/78] watchdog: Fix potential kref imbalance when opening watchdog Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 27/78] cros_ec: fix nul-termination for firmware build info Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 30/78] platform/chrome: Use proper protocol transfer function Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 29/78] watchdog: Fix kref imbalance seen if handle_boot_enabled=0 Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 31/78] dmaengine: zynqmp_dma: Fix race condition in the probe Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 32/78] drm/tilcdc: ensure nonatomic iowrite64 is not used Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 33/78] mmc: avoid removing non-removable hosts during suspend Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 34/78] mmc: block: fix logical error to avoid memory leak Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 35/78] /dev/mem: Add bounce buffer for copy-out Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 36/78] net: phy: meson-gxl: check phy_write return value Sasha Levin
2018-03-08 10:18   ` Jerome Brunet
2018-03-08 12:34     ` Greg KH
2018-03-19 15:28       ` Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 37/78] sfp: fix EEPROM reading in the case of non-SFF8472 SFPs Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 39/78] media: s5p-mfc: Fix lock contention - request_firmware() once Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 38/78] sfp: fix non-detection of PHY Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 41/78] IB/ipoib: Avoid memory leak if the SA returns a different DGID Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 42/78] RDMA/cma: Use correct size when writing netlink stats Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 40/78] rtc: ac100: Fix multiple race conditions Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 43/78] IB/umem: Fix use of npages/nmap fields Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 44/78] iser-target: avoid reinitializing rdma contexts for isert commands Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 46/78] PCI/ASPM: Calculate LTR_L1.2_THRESHOLD from device characteristics Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 45/78] bpf/cgroup: fix a verification error for a CGROUP_DEVICE type prog Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 47/78] vgacon: Set VGA struct resource types Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 48/78] omapdrm: panel: fix compatible vendor string for td028ttec1 Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 50/78] drm/omap: DMM: Check for DMM readiness after successful transaction commit Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 49/78] mmc: sdhci-xenon: wait 5ms after set 1.8V signal enable Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 51/78] pty: cancel pty slave port buf's work in tty_release Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 53/78] PCI: designware-ep: Fix ->get_msi() to check MSI_EN bit Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 52/78] coresight: Fix disabling of CoreSight TPIU Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 56/78] media: davinci: fix a debug printk Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 55/78] PCI: rcar: Handle rcar_pcie_parse_request_of_pci_ranges() failures Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 54/78] PCI: endpoint: Fix find_first_zero_bit() usage Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 58/78] dt-bindings: display: panel: Fix compatible string for Toshiba LT089AC29000 Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 57/78] clk: check ops pointer on clock register Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 60/78] pinctrl: Really force states during suspend/resume Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 59/78] clk: use round rate to bail out early in set_rate Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 62/78] iommu/vt-d: clean up pr_irq if request_threaded_irq fails Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 61/78] pinctrl: rockchip: enable clock when reading pin direction register Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 63/78] ip6_vti: adjust vti mtu according to mtu of lower device Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 64/78] ip_gre: fix error path when erspan_rcv failed Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 65/78] ip_gre: fix potential memory leak in erspan_rcv Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 66/78] soc: qcom: smsm: fix child-node lookup Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 68/78] scsi: lpfc: Fix issues connecting with nvme initiator Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 67/78] scsi: lpfc: Fix SCSI LUN discovery when SCSI and NVME enabled Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 69/78] RDMA/ocrdma: Fix permissions for OCRDMA_RESET_STATS Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 71/78] nfsd4: permit layoutget of executable-only files Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 70/78] ARM: dts: aspeed-evb: Add unit name to memory node Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 73/78] clk: Don't touch hardware when reparenting during registration Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 72/78] clk: at91: pmc: Wait for clocks when resuming Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 75/78] clk: si5351: Rename internal plls to avoid name collisions Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 74/78] clk: axi-clkgen: Correctly handle nocount bit in recalc_rate() Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 76/78] crypto: artpec6 - set correct iv size for gcm(aes) Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 77/78] hwrng: core - Clean up RNG list when last hwrng is unregistered Sasha Levin
2018-03-08  4:56 ` [PATCH AUTOSEL for 4.15 78/78] dmaengine: ti-dma-crossbar: Fix event mapping for TPCC_EVT_MUX_60_63 Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180308045525.7662-15-alexander.levin@microsoft.com \
    --to=alexander.levin@microsoft.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ncardwell@google.com \
    --cc=stable@vger.kernel.org \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox