public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: "Justin Forbes" <jmforbes@linuxtx.org>,
	"Zwane Mwaikambo" <zwane@arm.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	"Randy Dunlap" <rdunlap@xenotime.net>,
	"Dave Jones" <davej@redhat.com>,
	"Chuck Wolber" <chuckw@quantumlinux.com>,
	"Chris Wedgwood" <reviews@ml.cw.f00f.org>,
	"Michael Krufky" <mkrufky@linuxtv.org>,
	"Chuck Ebbert" <cebbert@redhat.com>,
	"Domenico Andreoli" <cavokz@gmail.com>,
	"Willy Tarreau" <w@1wt.eu>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk,
	"Ilpo JÀrvinen" <ilpo.jarvinen@helsinki.fi>,
	"David S. Miller" <davem@davemloft.net>,
	"Chris Wright" <chrisw@sous-sol.org>
Subject: [patch 33/47] tcp FRTO: work-around inorder receivers
Date: Fri, 13 Jun 2008 17:11:49 -0700	[thread overview]
Message-ID: <20080614001149.GG24698@suse.de> (raw)
In-Reply-To: <20080614000840.GA24659@suse.de>

[-- Attachment #1: tcp-frto-work-around-inorder-receivers.patch --]
[-- Type: text/plain, Size: 2715 bytes --]

-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

[ upstream commit: 79d44516b4b178ffb6e2159c75584cfcfc097914 ]

If receiver consumes segments successfully only in-order, FRTO
fallback to conventional recovery produces RTO loop because
FRTO's forward transmissions will always get dropped and need to
be resent, yet by default they're not marked as lost (which are
the only segments we will retransmit in CA_Loss).

Price to pay about this is occassionally unnecessarily
retransmitting the forward transmission(s). SACK blocks help
a bit to avoid this, so it's mainly a concern for NewReno case
though SACK is not fully immune either.

This change has a side-effect of fixing SACKFRTO problem where
it didn't have snd_nxt of the RTO time available anymore when
fallback become necessary (this problem would have only occured
when RTO would occur for two or more segments and ECE arrives
in step 3; no need to figure out how to fix that unless the
TODO item of selective behavior is considered in future).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Damon L. Chesser <damon@damtek.com>
Tested-by: Damon L. Chesser <damon@damtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
---
 net/ipv4/tcp_input.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1841,9 +1841,16 @@ static void tcp_enter_frto_loss(struct s
 			TCP_SKB_CB(skb)->sacked &= ~TCPCB_SACKED_RETRANS;
 		}
 
-		/* Don't lost mark skbs that were fwd transmitted after RTO */
-		if (!(TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) &&
-		    !after(TCP_SKB_CB(skb)->end_seq, tp->frto_highmark)) {
+		/* Marking forward transmissions that were made after RTO lost
+		 * can cause unnecessary retransmissions in some scenarios,
+		 * SACK blocks will mitigate that in some but not in all cases.
+		 * We used to not mark them but it was causing break-ups with
+		 * receivers that do only in-order receival.
+		 *
+		 * TODO: we could detect presence of such receiver and select
+		 * different behavior per flow.
+		 */
+		if (!(TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED)) {
 			TCP_SKB_CB(skb)->sacked |= TCPCB_LOST;
 			tp->lost_out += tcp_skb_pcount(skb);
 		}
@@ -1859,7 +1866,7 @@ static void tcp_enter_frto_loss(struct s
 	tp->reordering = min_t(unsigned int, tp->reordering,
 			       sysctl_tcp_reordering);
 	tcp_set_ca_state(sk, TCP_CA_Loss);
-	tp->high_seq = tp->frto_highmark;
+	tp->high_seq = tp->snd_nxt;
 	TCP_ECN_queue_cwr(tp);
 
 	tcp_clear_retrans_hints_partial(tp);

-- 

  parent reply	other threads:[~2008-06-14  0:23 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20080613234753.235721454@mini.kroah.org>
2008-06-14  0:08 ` [patch 00/47] 2.6.25-stable review Greg KH
2008-06-14  0:10   ` [patch 01/47] b43: Fix controller restart crash Greg KH
2008-06-14  0:10   ` [patch 02/47] ipwireless: Fix blocked sending Greg KH
2008-06-14  0:10   ` [patch 03/47] Add rd alias to new brd ramdisk driver Greg KH
2008-06-14  0:10   ` [patch 04/47] ssb: Fix context assertion in ssb_pcicore_dev_irqvecs_enable Greg KH
2008-06-14  0:10   ` [patch 05/47] double-free of inode on alloc_file() failure exit in create_write_pipe() Greg KH
2008-06-14  0:10   ` [patch 06/47] ALSA: hda - Fix resume of auto-config mode with Realtek codecs Greg KH
2008-06-14  0:10   ` [patch 07/47] sound: emu10k1 - fix system hang with Audigy2 ZS Notebook PCMCIA card Greg KH
2008-06-14  0:10   ` [patch 08/47] ecryptfs: add missing lock around notify_change Greg KH
2008-06-14  0:10   ` [patch 09/47] eCryptfs: protect crypt_stat->flags in ecryptfs_open() Greg KH
2008-06-14  0:10   ` [patch 10/47] ecryptfs: clean up (un)lock_parent Greg KH
2008-06-14  0:10   ` [patch 11/47] ecryptfs: fix missed mutex_unlock Greg KH
2008-06-14  0:10   ` [patch 12/47] fbdev: export symbol fb_mode_option Greg KH
2008-06-14  0:10   ` [patch 13/47] sunhv: Fix locking in non-paged I/O case Greg KH
2008-06-14  0:10   ` [patch 14/47] af_key: Fix selector family initialization Greg KH
2008-06-14  0:10   ` [patch 15/47] ax25: Fix NULL pointer dereference and lockup Greg KH
2008-06-14  0:10   ` [patch 16/47] bluetooth: fix locking bug in the rfcomm socket cleanup handling Greg KH
2008-06-14  2:50     ` Dave Young
2008-06-14  3:46       ` David Miller
2008-06-16 19:40         ` [stable] " Greg KH
2008-06-14 12:26       ` Marcel Holtmann
2008-06-15  3:35         ` Dave Young
2008-06-14  0:11   ` [patch 17/47] can: Fix copy_from_user() results interpretation Greg KH
2008-06-14  0:11   ` [patch 18/47] cassini: Only use chip checksum for ipv4 packets Greg KH
2008-06-14  0:11   ` [patch 19/47] net: Fix call to ->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags() Greg KH
2008-06-14  0:11   ` [patch 20/47] net_sched: cls_api: fix return value for non-existant classifiers Greg KH
2008-06-14  0:11   ` [patch 21/47] ipsec: Use the correct ip_local_out function Greg KH
2008-06-14  0:11   ` [patch 22/47] netlink: Fix nla_parse_nested_compat() to call nla_parse() directly Greg KH
2008-06-14  0:11   ` [patch 23/47] l2tp: avoid skb truesize bug if headroom is increased Greg KH
2008-06-14  0:11   ` [patch 24/47] vlan: Correctly handle device notifications for layered VLAN devices Greg KH
2008-06-14  0:11   ` [patch 25/47] tcp: TCP connection times out if ICMP frag needed is delayed Greg KH
2008-06-14  0:11   ` [patch 26/47] tcp: Allow send-limited cwnd to grow up to max_burst when gso disabled Greg KH
2008-06-14  0:11   ` [patch 27/47] tcp: Limit cwnd growth when deferring for GSO Greg KH
2008-06-14  0:11   ` [patch 28/47] l2tp: Fix possible WARN_ON from socket code when UDP socket is closed Greg KH
2008-06-14  0:11   ` [patch 29/47] l2tp: Fix possible oops if transmitting or receiving when tunnel goes down Greg KH
2008-06-14  0:11   ` [patch 30/47] tcp: fix skb vs fack_count out-of-sync condition Greg KH
2008-06-14  0:11   ` [patch 31/47] tcp FRTO: Fix fallback to conventional recovery Greg KH
2008-06-14  0:11   ` [patch 32/47] tcp FRTO: SACK variant is errorneously used with NewReno Greg KH
2008-06-14  0:11   ` Greg KH [this message]
2008-06-14  0:11   ` [patch 34/47] mmc: wbsd: initialize tasklets before requesting interrupt Greg KH
2008-06-14  0:11   ` [patch 35/47] cciss: add new hardware support Greg KH
2008-06-14  0:11   ` [patch 36/47] hgafb: resource management fix Greg KH
2008-06-14  0:11   ` [patch 37/47] forcedeth: msi interrupts Greg KH
2008-06-14  0:12   ` [patch 38/47] tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp)) Greg KH
2008-06-14  0:12   ` [patch 39/47] IB/umem: Avoid sign problems when demoting npages to integer Greg KH
2008-06-14  0:12   ` [patch 40/47] m68k: Add ext2_find_{first,next}_bit() for ext4 Greg KH
2008-06-14  0:12   ` [patch 41/47] cifs: fix oops on mount when CONFIG_CIFS_DFS_UPCALL is enabled Greg KH
2008-06-14  0:12   ` [patch 42/47] CPUFREQ: Fix format string bug Greg KH
2008-06-14  0:12   ` [patch 43/47] serial: fix enable_irq_wake/disable_irq_wake imbalance in serial_core.c Greg KH
2008-06-14  0:12   ` [patch 44/47] Kconfig: introduce ARCH_DEFCONFIG to DEFCONFIG_LIST Greg KH
2008-06-14  0:12   ` [patch 45/47] bttv: Fix a deadlock in the bttv driver Greg KH
2008-06-14  0:12   ` [patch 46/47] x86: fix recursive dependencies Greg KH
2008-06-14  0:12   ` [patch 47/47] mac80211: send association event on IBSS create Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080614001149.GG24698@suse.de \
    --to=gregkh@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=cavokz@gmail.com \
    --cc=cebbert@redhat.com \
    --cc=chrisw@sous-sol.org \
    --cc=chuckw@quantumlinux.com \
    --cc=davej@redhat.com \
    --cc=davem@davemloft.net \
    --cc=ilpo.jarvinen@helsinki.fi \
    --cc=jmforbes@linuxtx.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mkrufky@linuxtv.org \
    --cc=rdunlap@xenotime.net \
    --cc=reviews@ml.cw.f00f.org \
    --cc=stable@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=w@1wt.eu \
    --cc=zwane@arm.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox