public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Willy Tarreau <w@1wt.eu>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: "Ilpo JÀrvinen" <ilpo.jarvinen@helsinki.fi>,
	"David S. Miller" <davem@davemloft.net>,
	"Greg Kroah-Hartman" <gregkh@suse.de>
Subject: [2.6.20.21 review 11/35] TCP: Fix TCP rate-halving on bidirectional flows.
Date: Sat, 13 Oct 2007 17:28:33 +0200	[thread overview]
Message-ID: <20071013143451.%N@1wt.eu> (raw)
In-Reply-To: 20071013142822.%N@1wt.eu

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: 0039-TCP-Fix-TCP-rate-halving-on-bidirectional-flows.patch --]
[-- Type: text/plain, Size: 2073 bytes --]

Actually, the ratehalving seems to work too well, as cwnd is
reduced on every second ACK even though the packets in flight
remains unchanged. Recoveries in a bidirectional flows suffer
quite badly because of this, both NewReno and SACK are affected.

After this patch, rate halving is performed for ACK only if
packets in flight was supposedly changed too.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 net/ipv4/tcp_input.c |   21 ++++++++++++---------
 1 files changed, 12 insertions(+), 9 deletions(-)

Index: 2.6/net/ipv4/tcp_input.c
===================================================================
--- 2.6.orig/net/ipv4/tcp_input.c
+++ 2.6/net/ipv4/tcp_input.c
@@ -1702,19 +1702,22 @@ static inline u32 tcp_cwnd_min(const str
 }
 
 /* Decrease cwnd each second ack. */
-static void tcp_cwnd_down(struct sock *sk)
+static void tcp_cwnd_down(struct sock *sk, int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	int decr = tp->snd_cwnd_cnt + 1;
 
-	tp->snd_cwnd_cnt = decr&1;
-	decr >>= 1;
+	if ((flag&FLAG_FORWARD_PROGRESS) ||
+	    (IsReno(tp) && !(flag&FLAG_NOT_DUP))) {
+		tp->snd_cwnd_cnt = decr&1;
+		decr >>= 1;
 
-	if (decr && tp->snd_cwnd > tcp_cwnd_min(sk))
-		tp->snd_cwnd -= decr;
+		if (decr && tp->snd_cwnd > tcp_cwnd_min(sk))
+			tp->snd_cwnd -= decr;
 
-	tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp)+1);
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+		tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp)+1);
+		tp->snd_cwnd_stamp = tcp_time_stamp;
+	}
 }
 
 /* Nothing was retransmitted or returned timestamp is less
@@ -1899,7 +1902,7 @@ static void tcp_try_to_open(struct sock 
 		}
 		tcp_moderate_cwnd(tp);
 	} else {
-		tcp_cwnd_down(sk);
+		tcp_cwnd_down(sk, flag);
 	}
 }
 
@@ -2100,7 +2103,7 @@ tcp_fastretrans_alert(struct sock *sk, u
 
 	if (is_dupack || tcp_head_timedout(sk, tp))
 		tcp_update_scoreboard(sk, tp);
-	tcp_cwnd_down(sk);
+	tcp_cwnd_down(sk, flag);
 	tcp_xmit_retransmit_queue(sk);
 }
 

-- 

  parent reply	other threads:[~2007-10-13 14:46 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 01/35] ACPICA: Fixed possible corruption of global GPE list Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 02/35] AVR32: Fix atomic_add_unless() and atomic_sub_unless() Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 03/35] r8169: avoid needless NAPI poll scheduling Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 04/35] i386: allow debuggers to access the vsyscall page with compat vDSO Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 05/35] DCCP: Fix DCCP GFP_KERNEL allocation in atomic context Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 06/35] Netfilter: Missing Kbuild entry for netfilter Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 07/35] SNAP: Fix SNAP protocol header accesses Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 09/35] SPARC64: Fix sparc64 task stack traces Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 10/35] TCP: Do not autobind ports for TCP sockets Willy Tarreau
2007-10-13 15:28 ` Willy Tarreau [this message]
2007-10-13 15:28 ` [2.6.20.21 review 13/35] USB: allow retry on descriptor fetch errors Willy Tarreau
2007-10-13 17:15   ` [2.6.20.21 review 12/35] TCP: Fix TCP handling of SACK in bidirectional flows Ilpo Järvinen
2007-10-13 17:22     ` Willy Tarreau
2007-10-13 17:50       ` Adrian Bunk
2007-10-13 18:10         ` Willy Tarreau
2007-10-14  8:55           ` Ilpo Järvinen
2007-10-13 15:28 ` [2.6.20.21 review 14/35] USB: fix DoS in pwc USB video driver Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 15/35] Convert snd-page-alloc proc file to use seq_file Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 16/35] setpgid(child) fails if the child was forked by sub-thread Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 17/35] sigqueue_free: fix the race with collect_signal() Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 18/35] USB: fix linked list insertion bugfix for usb core Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 19/35] POWERPC: Flush registers to proper task context Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 21/35] V4L: cx88: Avoid a NULL pointer dereference during mpeg_open() Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 22/35] Fix "Fix DAC960 driver on machines which dont support 64-bit DMA" Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 23/35] futex_compat: fix list traversal bugs Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 24/35] Leases can be hidden by flocks Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 25/35] nfs: fix oops re sysctls and V4 support Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 26/35] dir_index: error out instead of BUG on corrupt dx dirs Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 27/35] ieee1394: ohci1394: fix initialization if built non-modular Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 28/35] Fix race with shared tag queue maps Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 29/35] crypto: blkcipher_get_spot() handling of buffer at end of page Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 30/35] fix realtek phy id in forcedeth Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 31/35] Fix IPV6 append OOPS Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 33/35] Fix ipv6 double-sock-release with MSG_CONFIRM Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 34/35] Fix datagram recvmsg NULL iov handling regression Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 35/35] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses Willy Tarreau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20071013143451.%N@1wt.eu \
    --to=w@1wt.eu \
    --cc=davem@davemloft.net \
    --cc=gregkh@suse.de \
    --cc=ilpo.jarvinen@helsinki.fi \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox