From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: "Justin Forbes" <jmforbes@linuxtx.org>,
"Zwane Mwaikambo" <zwane@arm.linux.org.uk>,
"Theodore Ts'o" <tytso@mit.edu>,
"Randy Dunlap" <rdunlap@xenotime.net>,
"Dave Jones" <davej@redhat.com>,
"Chuck Wolber" <chuckw@quantumlinux.com>,
"Chris Wedgwood" <reviews@ml.cw.f00f.org>,
"Michael Krufky" <mkrufky@linuxtv.org>,
"Chuck Ebbert" <cebbert@redhat.com>,
"Domenico Andreoli" <cavokz@gmail.com>,
"Willy Tarreau" <w@1wt.eu>,
torvalds@linux-foundation.org, akpm@linux-foundation.org,
alan@lxorguk.ukuu.org.uk,
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>,
"David S. Miller" <davem@davemloft.net>
Subject: [patch 38/47] tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))
Date: Fri, 13 Jun 2008 17:12:02 -0700
Message-ID: <20080614001202.GL24698@suse.de>
In-Reply-To: <20080614000840.GA24659@suse.de>
[-- Attachment #1: tcp-fix-inconsistency-source.patch --]
[-- Type: text/plain, Size: 3347 bytes --]
-stable review patch. If anyone has any objections, please let us know.
------------------
From: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
[ upstream commit: 8aca6cb1179ed9bef9351028c8d8af852903eae2 ]
The old_ack skip path can leave TCP in an invalid state in which
ca_state is still CA_Open although some segments have already
been counted in sacked_out. If the next valid ACK doesn't contain
new SACK information, TCP fails to enter
tcp_fastretrans_alert(). As a result, at least high_seq is set
to too high a seqno, because some new data segments could be
sent in between (limited transmit is not invoked correctly there
either). Reordering in both directions can easily cause this
situation.
I guess we would want to use tcp_moderate_cwnd(tp) there as well,
since an old ACK carrying a huge amount of SACK info could
otherwise be used to trigger an oversized burst into the network.
I'm a bit unsure about its effects (mainly on FlightSize),
though, so to be on the safe side I have for now made only the
minimal fix that keeps TCP's state consistent (obviously, such
nasty ACKs have been possible all along). It also seems that
FlightSize is already underestimated by some amount, so in the
long term we might want to trigger recovery there too, if
appropriate, so that the FlightSize calculation resembles
reality at the time the losses were discovered (but such a
change scares me too much right now, and working out how to do
it needs more thought anyway, as it likely involves some code
shuffling).
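The burst concern above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct burst_model and the model_* helpers are hypothetical names, the in-flight formula follows tcp_packets_in_flight() (packets_out - left_out + retrans_out, with left_out = sacked_out + lost_out), and max_burst is a caller-supplied parameter rather than whatever the kernel uses.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the tcp_sock counters involved. */
struct burst_model {
	unsigned int packets_out;	/* segments sent and not yet cumulatively ACKed */
	unsigned int sacked_out;	/* segments tagged as SACKed */
	unsigned int lost_out;		/* segments marked lost */
	unsigned int retrans_out;	/* retransmitted segments still outstanding */
	unsigned int snd_cwnd;		/* congestion window, in segments */
};

/* Assumes sacked_out + lost_out <= packets_out, as the kernel verifies. */
static unsigned int model_in_flight(const struct burst_model *m)
{
	return m->packets_out - (m->sacked_out + m->lost_out) + m->retrans_out;
}

/* Number of segments the sender could emit back to back right now. */
static unsigned int model_burst_room(const struct burst_model *m)
{
	unsigned int in_flight = model_in_flight(m);

	return m->snd_cwnd > in_flight ? m->snd_cwnd - in_flight : 0;
}

/* Clamp cwnd to in-flight plus an allowed burst (max_burst is an assumption). */
static void model_moderate_cwnd(struct burst_model *m, unsigned int max_burst)
{
	unsigned int limit = model_in_flight(m) + max_burst;

	if (m->snd_cwnd > limit)
		m->snd_cwnd = limit;
}

/* Walks through the scenario: a SACK-heavy old ACK opens a large burst. */
static void model_burst_demo(void)
{
	struct burst_model m = { .packets_out = 40, .snd_cwnd = 40 };

	assert(model_burst_room(&m) == 0);	/* window is full */
	m.sacked_out = 30;			/* old ACK tags 30 segments */
	assert(model_burst_room(&m) == 30);	/* room for a 30-segment burst */
	model_moderate_cwnd(&m, 3);
	assert(model_burst_room(&m) == 3);	/* burst limited to max_burst */
}
```

In this model, tagging segments as SACKed shrinks the in-flight estimate without touching cwnd, which is what would allow the burst; clamping cwnd afterwards bounds it.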
This bug was found by Brian Vowell while running my TCP debug
patch to find the cause of another TCP issue (a fackets_out
miscount).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
net/ipv4/tcp_input.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2469,6 +2469,20 @@ static inline void tcp_complete_cwr(stru
 	tcp_ca_event(sk, CA_EVENT_COMPLETE_CWR);
 }

+static void tcp_try_keep_open(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	int state = TCP_CA_Open;
+
+	if (tcp_left_out(tp) || tp->retrans_out || tp->undo_marker)
+		state = TCP_CA_Disorder;
+
+	if (inet_csk(sk)->icsk_ca_state != state) {
+		tcp_set_ca_state(sk, state);
+		tp->high_seq = tp->snd_nxt;
+	}
+}
+
 static void tcp_try_to_open(struct sock *sk, int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
@@ -2482,15 +2496,7 @@ static void tcp_try_to_open(struct sock
 		tcp_enter_cwr(sk, 1);

 	if (inet_csk(sk)->icsk_ca_state != TCP_CA_CWR) {
-		int state = TCP_CA_Open;
-
-		if (tcp_left_out(tp) || tp->retrans_out || tp->undo_marker)
-			state = TCP_CA_Disorder;
-
-		if (inet_csk(sk)->icsk_ca_state != state) {
-			tcp_set_ca_state(sk, state);
-			tp->high_seq = tp->snd_nxt;
-		}
+		tcp_try_keep_open(sk);
 		tcp_moderate_cwnd(tp);
 	} else {
 		tcp_cwnd_down(sk, flag);
@@ -3296,8 +3302,11 @@ no_queue:
 	return 1;

 old_ack:
-	if (TCP_SKB_CB(skb)->sacked)
+	if (TCP_SKB_CB(skb)->sacked) {
 		tcp_sacktag_write_queue(sk, skb, prior_snd_una);
+		if (icsk->icsk_ca_state == TCP_CA_Open)
+			tcp_try_keep_open(sk);
+	}

 uninteresting_ack:
 	SOCK_DEBUG(sk, "Ack %u out of %u:%u\n", ack, tp->snd_una, tp->snd_nxt);
--
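For readers who want to step through the state selection outside the kernel, the helper factored out by this patch can be mirrored in a userspace model. This is a hedged sketch, not the kernel implementation: struct model_tp, the MODEL_CA_* constants and the model_* functions are hypothetical stand-ins, and only the fields that the helper reads or writes are modelled.

```c
#include <assert.h>

/* Hypothetical stand-ins for TCP_CA_Open and TCP_CA_Disorder. */
enum model_ca_state { MODEL_CA_OPEN, MODEL_CA_DISORDER };

/* Only the tcp_sock fields that tcp_try_keep_open() touches. */
struct model_tp {
	unsigned int sacked_out;
	unsigned int lost_out;
	unsigned int retrans_out;
	unsigned int undo_marker;
	unsigned int snd_nxt;
	unsigned int high_seq;
	enum model_ca_state ca_state;
};

/* Mirrors tcp_left_out(): SACKed plus lost segments. */
static unsigned int model_left_out(const struct model_tp *tp)
{
	return tp->sacked_out + tp->lost_out;
}

/* Same decision as the patch's helper: Open only when nothing is left out. */
static void model_try_keep_open(struct model_tp *tp)
{
	enum model_ca_state state = MODEL_CA_OPEN;

	if (model_left_out(tp) || tp->retrans_out || tp->undo_marker)
		state = MODEL_CA_DISORDER;

	if (tp->ca_state != state) {
		tp->ca_state = state;
		tp->high_seq = tp->snd_nxt;	/* recorded at the transition */
	}
}

/* The old_ack case: SACK tagging happened while the state was still Open. */
static void model_old_ack_demo(void)
{
	struct model_tp tp = { .snd_nxt = 1000, .ca_state = MODEL_CA_OPEN };

	tp.sacked_out = 2;		/* old ACK with SACK blocks tags 2 segments */
	model_try_keep_open(&tp);
	assert(tp.ca_state == MODEL_CA_DISORDER);
	assert(tp.high_seq == 1000);	/* taken before any new data is sent */
}
```

Without the call in the old_ack path, the model would stay in MODEL_CA_OPEN with sacked_out non-zero, and high_seq would only be recorded later, after snd_nxt may have advanced.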