From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
Yuchung Cheng <ycheng@google.com>, Van Jacobson <vanj@google.com>,
Neal Cardwell <ncardwell@google.com>,
Nandita Dukkipati <nanditad@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [ 43/56] tcp: preserve ACK clocking in TSO
Date: Tue, 2 Apr 2013 15:50:10 -0700 [thread overview]
Message-ID: <20130402224716.860227467@linuxfoundation.org> (raw)
In-Reply-To: <20130402224711.840825715@linuxfoundation.org>
3.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit f4541d60a449afd40448b06496dcd510f505928e ]
A long standing problem with TSO is the fact that tcp_tso_should_defer()
rearms the deferred timer, while it should not.
Current code leads to following bad bursty behavior :
20:11:24.484333 IP A > B: . 297161:316921(19760) ack 1 win 119
20:11:24.484337 IP B > A: . ack 263721 win 1117
20:11:24.485086 IP B > A: . ack 265241 win 1117
20:11:24.485925 IP B > A: . ack 266761 win 1117
20:11:24.486759 IP B > A: . ack 268281 win 1117
20:11:24.487594 IP B > A: . ack 269801 win 1117
20:11:24.488430 IP B > A: . ack 271321 win 1117
20:11:24.489267 IP B > A: . ack 272841 win 1117
20:11:24.490104 IP B > A: . ack 274361 win 1117
20:11:24.490939 IP B > A: . ack 275881 win 1117
20:11:24.491775 IP B > A: . ack 277401 win 1117
20:11:24.491784 IP A > B: . 316921:332881(15960) ack 1 win 119
20:11:24.492620 IP B > A: . ack 278921 win 1117
20:11:24.493448 IP B > A: . ack 280441 win 1117
20:11:24.494286 IP B > A: . ack 281961 win 1117
20:11:24.495122 IP B > A: . ack 283481 win 1117
20:11:24.495958 IP B > A: . ack 285001 win 1117
20:11:24.496791 IP B > A: . ack 286521 win 1117
20:11:24.497628 IP B > A: . ack 288041 win 1117
20:11:24.498459 IP B > A: . ack 289561 win 1117
20:11:24.499296 IP B > A: . ack 291081 win 1117
20:11:24.500133 IP B > A: . ack 292601 win 1117
20:11:24.500970 IP B > A: . ack 294121 win 1117
20:11:24.501388 IP B > A: . ack 295641 win 1117
20:11:24.501398 IP A > B: . 332881:351881(19000) ack 1 win 119
While the expected behavior is more like :
20:19:49.259620 IP A > B: . 197601:202161(4560) ack 1 win 119
20:19:49.260446 IP B > A: . ack 154281 win 1212
20:19:49.261282 IP B > A: . ack 155801 win 1212
20:19:49.262125 IP B > A: . ack 157321 win 1212
20:19:49.262136 IP A > B: . 202161:206721(4560) ack 1 win 119
20:19:49.262958 IP B > A: . ack 158841 win 1212
20:19:49.263795 IP B > A: . ack 160361 win 1212
20:19:49.264628 IP B > A: . ack 161881 win 1212
20:19:49.264637 IP A > B: . 206721:211281(4560) ack 1 win 119
20:19:49.265465 IP B > A: . ack 163401 win 1212
20:19:49.265886 IP B > A: . ack 164921 win 1212
20:19:49.266722 IP B > A: . ack 166441 win 1212
20:19:49.266732 IP A > B: . 211281:215841(4560) ack 1 win 119
20:19:49.267559 IP B > A: . ack 167961 win 1212
20:19:49.268394 IP B > A: . ack 169481 win 1212
20:19:49.269232 IP B > A: . ack 171001 win 1212
20:19:49.269241 IP A > B: . 215841:221161(5320) ack 1 win 119
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/ipv4/tcp_output.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1579,8 +1579,11 @@ static int tcp_tso_should_defer(struct s
goto send_now;
}
- /* Ok, it looks like it is advisable to defer. */
- tp->tso_deferred = 1 | (jiffies << 1);
+ /* Ok, it looks like it is advisable to defer.
+ * Do not rearm the timer if already set to not break TCP ACK clocking.
+ */
+ if (!tp->tso_deferred)
+ tp->tso_deferred = 1 | (jiffies << 1);
return 1;
next prev parent reply other threads:[~2013-04-02 22:54 UTC|newest]
Thread overview: 62+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-02 22:49 [ 00/56] 3.0.72-stable review Greg Kroah-Hartman
2013-04-02 22:49 ` [ 01/56] signal: Define __ARCH_HAS_SA_RESTORER so we know whether to clear sa_restorer Greg Kroah-Hartman
2013-04-02 22:49 ` [ 02/56] kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER Greg Kroah-Hartman
2013-04-02 22:49 ` [ 03/56] SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked Greg Kroah-Hartman
2013-04-02 22:49 ` [ 04/56] Bluetooth: Fix not closing SCO sockets in the BT_CONNECT2 state Greg Kroah-Hartman
2013-04-02 22:49 ` [ 05/56] Bluetooth: Add support for Dell[QCA 0cf3:0036] Greg Kroah-Hartman
2013-04-02 22:49 ` [ 06/56] Bluetooth: Add support for Dell[QCA 0cf3:817a] Greg Kroah-Hartman
2013-04-02 22:49 ` [ 07/56] staging: comedi: s626: fix continuous acquisition Greg Kroah-Hartman
2013-04-02 22:49 ` [ 08/56] sysfs: fix race between readdir and lseek Greg Kroah-Hartman
2013-04-02 22:49 ` [ 09/56] sysfs: handle failure path correctly for readdir() Greg Kroah-Hartman
2013-04-02 22:49 ` [ 10/56] b43: A fix for DMA transmission sequence errors Greg Kroah-Hartman
2013-04-02 22:49 ` [ 11/56] xen-blkback: fix dispatch_rw_block_io() error path Greg Kroah-Hartman
2013-04-02 22:49 ` [ 12/56] usb: ftdi_sio: Add support for Mitsubishi FX-USB-AW/-BD Greg Kroah-Hartman
2013-04-02 22:49 ` [ 13/56] vt: synchronize_rcu() under spinlock is not nice Greg Kroah-Hartman
2013-04-02 22:49 ` [ 14/56] mwifiex: cancel cmd timer and free curr_cmd in shutdown process Greg Kroah-Hartman
2013-04-06 19:55 ` Ben Hutchings
2013-04-08 17:58 ` Bing Zhao
2013-04-08 17:58 ` Bing Zhao
2013-04-02 22:49 ` [ 15/56] net/irda: add missing error path release_sock call Greg Kroah-Hartman
2013-04-02 22:49 ` [ 16/56] usb: xhci: Fix TRB transfer length macro used for Event TRB Greg Kroah-Hartman
2013-04-02 22:49 ` [ 17/56] Btrfs: limit the global reserve to 512mb Greg Kroah-Hartman
2013-04-02 22:49 ` [ 18/56] KVM: Clean up error handling during VCPU creation Greg Kroah-Hartman
2013-04-02 22:49 ` [ 19/56] x25: Validate incoming call user data lengths Greg Kroah-Hartman
2013-04-02 22:49 ` [ 20/56] x25: Handle undersized/fragmented skbs Greg Kroah-Hartman
2013-04-02 22:49 ` [ 21/56] batman-adv: bat_socket_read missing checks Greg Kroah-Hartman
2013-04-02 22:49 ` [ 22/56] batman-adv: Only write requested number of byte to user buffer Greg Kroah-Hartman
2013-04-02 22:49 ` [ 23/56] KVM: x86: Prevent starting PIT timers in the absence of irqchip support Greg Kroah-Hartman
2013-04-02 22:49 ` [ 24/56] NFSv4: include bitmap in nfsv4 get acl data Greg Kroah-Hartman
2013-04-02 22:49 ` [ 25/56] NFSv4: Fix an Oops in the NFSv4 getacl code Greg Kroah-Hartman
2013-04-02 22:49 ` [ 26/56] NFS: nfs_getaclargs.acl_len is a size_t Greg Kroah-Hartman
2013-04-02 22:49 ` [ 27/56] KVM: Ensure all vcpus are consistent with in-kernel irqchip settings Greg Kroah-Hartman
2013-04-02 22:49 ` [ 28/56] macvtap: zerocopy: validate vectors before building skb Greg Kroah-Hartman
2013-04-02 22:49 ` [ 29/56] KVM: Fix buffer overflow in kvm_set_irq() Greg Kroah-Hartman
2013-04-02 22:49 ` [ 30/56] mm/hotplug: correctly add new zone to all other nodes zone lists Greg Kroah-Hartman
2013-04-02 22:49 ` [ 31/56] KVM: x86: invalid opcode oops on SET_SREGS with OSXSAVE bit set (CVE-2012-4461) Greg Kroah-Hartman
2013-04-02 22:49 ` [ 32/56] loop: prevent bdev freeing while device in use Greg Kroah-Hartman
2013-04-02 22:50 ` [ 33/56] nfsd4: reject "negative" acl lengths Greg Kroah-Hartman
2013-04-02 22:50 ` [ 34/56] drm/i915: dont set unpin_work if vblank_get fails Greg Kroah-Hartman
2013-04-02 22:50 ` [ 35/56] drm/i915: Dont clobber crtc->fb when queue_flip fails Greg Kroah-Hartman
2013-04-02 22:50 ` [ 36/56] efivars: explicitly calculate length of VariableName Greg Kroah-Hartman
2013-04-02 22:50 ` [ 37/56] efivars: Handle duplicate names from get_next_variable() Greg Kroah-Hartman
2013-04-02 22:50 ` [ 38/56] ext4: use atomic64_t for the per-flexbg free_clusters count Greg Kroah-Hartman
2013-04-02 22:50 ` [ 39/56] tracing: Protect tracer flags with trace_types_lock Greg Kroah-Hartman
2013-04-02 22:50 ` [ 40/56] tracing: Prevent buffer overwrite disabled for latency tracers Greg Kroah-Hartman
2013-04-02 22:50 ` [ 41/56] sky2: Receive Overflows not counted Greg Kroah-Hartman
2013-04-02 22:50 ` [ 42/56] sky2: Threshold for Pause Packet is set wrong Greg Kroah-Hartman
2013-04-02 22:50 ` Greg Kroah-Hartman [this message]
2013-04-02 22:50 ` [ 44/56] tcp: undo spurious timeout after SACK reneging Greg Kroah-Hartman
2013-04-02 22:50 ` [ 45/56] 8021q: fix a potential use-after-free Greg Kroah-Hartman
2013-04-02 22:50 ` [ 46/56] thermal: shorten too long mcast group name Greg Kroah-Hartman
2013-04-02 22:50 ` [ 47/56] unix: fix a race condition in unix_release() Greg Kroah-Hartman
2013-04-02 22:50 ` [ 48/56] aoe: reserve enough headroom on skbs Greg Kroah-Hartman
2013-04-02 22:50 ` [ 49/56] drivers: net: ethernet: davinci_emac: use netif_wake_queue() while restarting tx queue Greg Kroah-Hartman
2013-04-02 22:50 ` [ 50/56] atl1e: drop pci-msi support because of packet corruption Greg Kroah-Hartman
2013-04-02 22:50 ` Greg Kroah-Hartman
2013-04-02 22:50 ` [ 51/56] ipv6: fix bad free of addrconf_init_net Greg Kroah-Hartman
2013-04-02 22:50 ` [ 52/56] ks8851: Fix interpretation of rxlen field Greg Kroah-Hartman
2013-04-02 22:50 ` [ 53/56] net: add a synchronize_net() in netdev_rx_handler_unregister() Greg Kroah-Hartman
2013-04-02 22:50 ` [ 54/56] pch_gbe: fix ip_summed checksum reporting on rx Greg Kroah-Hartman
2013-04-02 22:50 ` [ 55/56] smsc75xx: fix jumbo frame support Greg Kroah-Hartman
2013-04-02 22:50 ` [ 56/56] iommu/amd: Make sure dma_ops are set for hotplug devices Greg Kroah-Hartman
2013-04-03 15:19 ` [ 00/56] 3.0.72-stable review Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130402224716.860227467@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=nanditad@google.com \
--cc=ncardwell@google.com \
--cc=stable@vger.kernel.org \
--cc=vanj@google.com \
--cc=ycheng@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.