From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
	Yuchung Cheng <ycheng@google.com>, Van Jacobson <vanj@google.com>,
	Neal Cardwell <ncardwell@google.com>,
	Nandita Dukkipati <nanditad@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [ 43/56] tcp: preserve ACK clocking in TSO
Date: Tue,  2 Apr 2013 15:50:10 -0700
Message-ID: <20130402224716.860227467@linuxfoundation.org>
In-Reply-To: <20130402224711.840825715@linuxfoundation.org>

3.0-stable review patch.  If anyone has any objections, please let me know.

------------------


From: Eric Dumazet <edumazet@google.com>

[ Upstream commit f4541d60a449afd40448b06496dcd510f505928e ]

A long-standing problem with TSO is that tcp_tso_should_defer()
rearms the deferred timer when it should not.

The current code leads to the following bursty behavior:

20:11:24.484333 IP A > B: . 297161:316921(19760) ack 1 win 119
20:11:24.484337 IP B > A: . ack 263721 win 1117
20:11:24.485086 IP B > A: . ack 265241 win 1117
20:11:24.485925 IP B > A: . ack 266761 win 1117
20:11:24.486759 IP B > A: . ack 268281 win 1117
20:11:24.487594 IP B > A: . ack 269801 win 1117
20:11:24.488430 IP B > A: . ack 271321 win 1117
20:11:24.489267 IP B > A: . ack 272841 win 1117
20:11:24.490104 IP B > A: . ack 274361 win 1117
20:11:24.490939 IP B > A: . ack 275881 win 1117
20:11:24.491775 IP B > A: . ack 277401 win 1117
20:11:24.491784 IP A > B: . 316921:332881(15960) ack 1 win 119
20:11:24.492620 IP B > A: . ack 278921 win 1117
20:11:24.493448 IP B > A: . ack 280441 win 1117
20:11:24.494286 IP B > A: . ack 281961 win 1117
20:11:24.495122 IP B > A: . ack 283481 win 1117
20:11:24.495958 IP B > A: . ack 285001 win 1117
20:11:24.496791 IP B > A: . ack 286521 win 1117
20:11:24.497628 IP B > A: . ack 288041 win 1117
20:11:24.498459 IP B > A: . ack 289561 win 1117
20:11:24.499296 IP B > A: . ack 291081 win 1117
20:11:24.500133 IP B > A: . ack 292601 win 1117
20:11:24.500970 IP B > A: . ack 294121 win 1117
20:11:24.501388 IP B > A: . ack 295641 win 1117
20:11:24.501398 IP A > B: . 332881:351881(19000) ack 1 win 119

The expected behavior is more like:

20:19:49.259620 IP A > B: . 197601:202161(4560) ack 1 win 119
20:19:49.260446 IP B > A: . ack 154281 win 1212
20:19:49.261282 IP B > A: . ack 155801 win 1212
20:19:49.262125 IP B > A: . ack 157321 win 1212
20:19:49.262136 IP A > B: . 202161:206721(4560) ack 1 win 119
20:19:49.262958 IP B > A: . ack 158841 win 1212
20:19:49.263795 IP B > A: . ack 160361 win 1212
20:19:49.264628 IP B > A: . ack 161881 win 1212
20:19:49.264637 IP A > B: . 206721:211281(4560) ack 1 win 119
20:19:49.265465 IP B > A: . ack 163401 win 1212
20:19:49.265886 IP B > A: . ack 164921 win 1212
20:19:49.266722 IP B > A: . ack 166441 win 1212
20:19:49.266732 IP A > B: . 211281:215841(4560) ack 1 win 119
20:19:49.267559 IP B > A: . ack 167961 win 1212
20:19:49.268394 IP B > A: . ack 169481 win 1212
20:19:49.269232 IP B > A: . ack 171001 win 1212
20:19:49.269241 IP A > B: . 215841:221161(5320) ack 1 win 119
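
The difference between the two traces can be reproduced in a toy model.
The sketch below is not the kernel function: struct tso_state,
should_defer(), deferrals_before_send() and the rearm flag are
hypothetical names introduced for illustration, the expiry check is an
assumption modelled on a "defer for less than two clock ticks" test, and
one call per tick stands in for one incoming ACK per jiffy. Only the
encoding 1 | (jiffies << 1) and the "arm only if not already set" rule
are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the single tcp_sock field the patch touches. */
struct tso_state {
	uint32_t tso_deferred;	/* 0 = not armed, else 1 | (tick << 1) */
};

/* Returns 1 to defer, 0 to send now. 'now' plays the role of jiffies.
 * rearm != 0 models the old code, rearm == 0 models the patched code.
 */
static int should_defer(struct tso_state *st, uint32_t now, int rearm)
{
	/* Assumed expiry rule: stop deferring once more than one tick
	 * has elapsed since the timestamp was stored.
	 */
	if (st->tso_deferred &&
	    ((now << 1) >> 1) - (st->tso_deferred >> 1) > 1) {
		st->tso_deferred = 0;
		return 0;
	}

	/* Old code: always restamp. Patched code: stamp only if unarmed. */
	if (rearm || !st->tso_deferred)
		st->tso_deferred = 1 | (now << 1);

	return 1;
}

/* Calls should_defer() once per tick and returns how many calls deferred
 * before a send, or 'limit' if the timer never expired.
 */
static int deferrals_before_send(int rearm, int limit)
{
	struct tso_state st = { 0 };
	uint32_t now = 100;	/* arbitrary starting tick */
	int n = 0;

	assert(limit >= 0);
	while (n < limit && should_defer(&st, now++, rearm))
		n++;
	return n;
}
```

Under these assumptions, deferrals_before_send(0, 50) returns 2: the
timestamp stays at the first deferral and the third call sends. With
rearming, deferrals_before_send(1, 50) runs to the limit of 50, because
every call pushes the timestamp forward and the timer never expires on
its own, which is consistent with the long gaps and large bursts in the
first trace.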

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_output.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1579,8 +1579,11 @@ static int tcp_tso_should_defer(struct s
 			goto send_now;
 	}
 
-	/* Ok, it looks like it is advisable to defer.  */
-	tp->tso_deferred = 1 | (jiffies << 1);
+	/* Ok, it looks like it is advisable to defer.
+	 * Do not rearm the timer if already set to not break TCP ACK clocking.
+	 */
+	if (!tp->tso_deferred)
+		tp->tso_deferred = 1 | (jiffies << 1);
 
 	return 1;
 


