public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: "Justin Forbes" <jmforbes@linuxtx.org>,
	"Zwane Mwaikambo" <zwane@arm.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	"Randy Dunlap" <rdunlap@xenotime.net>,
	"Dave Jones" <davej@redhat.com>,
	"Chuck Wolber" <chuckw@quantumlinux.com>,
	"Chris Wedgwood" <reviews@ml.cw.f00f.org>,
	"Michael Krufky" <mkrufky@linuxtv.org>,
	"Chuck Ebbert" <cebbert@redhat.com>,
	"Domenico Andreoli" <cavokz@gmail.com>,
	"Willy Tarreau" <w@1wt.eu>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk,
	"Ilpo JÀrvinen" <ilpo.jarvinen@helsinki.fi>,
	"David S. Miller" <davem@davemloft.net>,
	"Chris Wright" <chrisw@sous-sol.org>
Subject: [patch 30/47] tcp: fix skb vs fack_count out-of-sync condition
Date: Fri, 13 Jun 2008 17:11:42 -0700	[thread overview]
Message-ID: <20080614001142.GD24698@suse.de> (raw)
In-Reply-To: <20080614000840.GA24659@suse.de>

[-- Attachment #1: tcp-fix-skb-vs-fack_count-out-of-sync-condition.patch --]
[-- Type: text/plain, Size: 2837 bytes --]

-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

[ upstream commit: a6604471db5e7a33474a7f16c64d6b118fae3e74 ]

This bug is able to corrupt fackets_out in very rare cases.
In order for this to cause corruption:
  1) DSACK in the middle of previous SACK block must be generated.
  2) In order to take that particular branch, part or all of the
     DSACKed segment must already be SACKed so that we have that
     in cache in the first place.
  3) The new info must be top enough so that fackets_out will be
     updated on this iteration.
...then fack_count is updated while skb wasn't, then we walk again
that particular segment thus updating fack_count twice for
a single skb and finally that value is assigned to fackets_out
by tcp_sacktag_one.

It is safe to call tcp_sacktag_one just once for a segment (at
DSACK), no need to call again for plain SACK.

Potential problem of the miscount are limited to premature entry
to recovery and to inflated reordering metric (which could even
cancel each other out in the most the luckiest scenarios :-)).
Both are quite insignificant in worst case too and there exists
also code to reset them (fackets_out once sacked_out becomes zero
and reordering metric on RTO).

This has been reported by a number of people, because it occurred
quite rarely, it has been very evasive. Andy Furniss was able to
get it to occur couple of times so that a bit more info was
collected about the problem using a debug patch, though it still
required lot of checking around. Thanks also to others who have
tried to help here.

This is listed as Bugzilla #10346. The bug was introduced by
me in commit 68f8353b48 ([TCP]: Rewrite SACK block processing &
sack_recv_cache use), I probably thought back then that there's
need to scan that entry twice or didn't dare to make it go
through it just once there. Going through twice would have
required restoring fack_count after the walk but as noted above,
I chose to drop the additional walk step altogether here.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
---
 net/ipv4/tcp_input.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1393,9 +1393,9 @@ static struct sk_buff *tcp_maybe_skippin
 
 	if (before(next_dup->start_seq, skip_to_seq)) {
 		skb = tcp_sacktag_skip(skb, sk, next_dup->start_seq, fack_count);
-		tcp_sacktag_walk(skb, sk, NULL,
-				 next_dup->start_seq, next_dup->end_seq,
-				 1, fack_count, reord, flag);
+		skb = tcp_sacktag_walk(skb, sk, NULL,
+				     next_dup->start_seq, next_dup->end_seq,
+				     1, fack_count, reord, flag);
 	}
 
 	return skb;

-- 

  parent reply	other threads:[~2008-06-14  0:22 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20080613234753.235721454@mini.kroah.org>
2008-06-14  0:08 ` [patch 00/47] 2.6.25-stable review Greg KH
2008-06-14  0:10   ` [patch 01/47] b43: Fix controller restart crash Greg KH
2008-06-14  0:10   ` [patch 02/47] ipwireless: Fix blocked sending Greg KH
2008-06-14  0:10   ` [patch 03/47] Add rd alias to new brd ramdisk driver Greg KH
2008-06-14  0:10   ` [patch 04/47] ssb: Fix context assertion in ssb_pcicore_dev_irqvecs_enable Greg KH
2008-06-14  0:10   ` [patch 05/47] double-free of inode on alloc_file() failure exit in create_write_pipe() Greg KH
2008-06-14  0:10   ` [patch 06/47] ALSA: hda - Fix resume of auto-config mode with Realtek codecs Greg KH
2008-06-14  0:10   ` [patch 07/47] sound: emu10k1 - fix system hang with Audigy2 ZS Notebook PCMCIA card Greg KH
2008-06-14  0:10   ` [patch 08/47] ecryptfs: add missing lock around notify_change Greg KH
2008-06-14  0:10   ` [patch 09/47] eCryptfs: protect crypt_stat->flags in ecryptfs_open() Greg KH
2008-06-14  0:10   ` [patch 10/47] ecryptfs: clean up (un)lock_parent Greg KH
2008-06-14  0:10   ` [patch 11/47] ecryptfs: fix missed mutex_unlock Greg KH
2008-06-14  0:10   ` [patch 12/47] fbdev: export symbol fb_mode_option Greg KH
2008-06-14  0:10   ` [patch 13/47] sunhv: Fix locking in non-paged I/O case Greg KH
2008-06-14  0:10   ` [patch 14/47] af_key: Fix selector family initialization Greg KH
2008-06-14  0:10   ` [patch 15/47] ax25: Fix NULL pointer dereference and lockup Greg KH
2008-06-14  0:10   ` [patch 16/47] bluetooth: fix locking bug in the rfcomm socket cleanup handling Greg KH
2008-06-14  2:50     ` Dave Young
2008-06-14  3:46       ` David Miller
2008-06-16 19:40         ` [stable] " Greg KH
2008-06-14 12:26       ` Marcel Holtmann
2008-06-15  3:35         ` Dave Young
2008-06-14  0:11   ` [patch 17/47] can: Fix copy_from_user() results interpretation Greg KH
2008-06-14  0:11   ` [patch 18/47] cassini: Only use chip checksum for ipv4 packets Greg KH
2008-06-14  0:11   ` [patch 19/47] net: Fix call to ->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags() Greg KH
2008-06-14  0:11   ` [patch 20/47] net_sched: cls_api: fix return value for non-existant classifiers Greg KH
2008-06-14  0:11   ` [patch 21/47] ipsec: Use the correct ip_local_out function Greg KH
2008-06-14  0:11   ` [patch 22/47] netlink: Fix nla_parse_nested_compat() to call nla_parse() directly Greg KH
2008-06-14  0:11   ` [patch 23/47] l2tp: avoid skb truesize bug if headroom is increased Greg KH
2008-06-14  0:11   ` [patch 24/47] vlan: Correctly handle device notifications for layered VLAN devices Greg KH
2008-06-14  0:11   ` [patch 25/47] tcp: TCP connection times out if ICMP frag needed is delayed Greg KH
2008-06-14  0:11   ` [patch 26/47] tcp: Allow send-limited cwnd to grow up to max_burst when gso disabled Greg KH
2008-06-14  0:11   ` [patch 27/47] tcp: Limit cwnd growth when deferring for GSO Greg KH
2008-06-14  0:11   ` [patch 28/47] l2tp: Fix possible WARN_ON from socket code when UDP socket is closed Greg KH
2008-06-14  0:11   ` [patch 29/47] l2tp: Fix possible oops if transmitting or receiving when tunnel goes down Greg KH
2008-06-14  0:11   ` Greg KH [this message]
2008-06-14  0:11   ` [patch 31/47] tcp FRTO: Fix fallback to conventional recovery Greg KH
2008-06-14  0:11   ` [patch 32/47] tcp FRTO: SACK variant is errorneously used with NewReno Greg KH
2008-06-14  0:11   ` [patch 33/47] tcp FRTO: work-around inorder receivers Greg KH
2008-06-14  0:11   ` [patch 34/47] mmc: wbsd: initialize tasklets before requesting interrupt Greg KH
2008-06-14  0:11   ` [patch 35/47] cciss: add new hardware support Greg KH
2008-06-14  0:11   ` [patch 36/47] hgafb: resource management fix Greg KH
2008-06-14  0:11   ` [patch 37/47] forcedeth: msi interrupts Greg KH
2008-06-14  0:12   ` [patch 38/47] tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp)) Greg KH
2008-06-14  0:12   ` [patch 39/47] IB/umem: Avoid sign problems when demoting npages to integer Greg KH
2008-06-14  0:12   ` [patch 40/47] m68k: Add ext2_find_{first,next}_bit() for ext4 Greg KH
2008-06-14  0:12   ` [patch 41/47] cifs: fix oops on mount when CONFIG_CIFS_DFS_UPCALL is enabled Greg KH
2008-06-14  0:12   ` [patch 42/47] CPUFREQ: Fix format string bug Greg KH
2008-06-14  0:12   ` [patch 43/47] serial: fix enable_irq_wake/disable_irq_wake imbalance in serial_core.c Greg KH
2008-06-14  0:12   ` [patch 44/47] Kconfig: introduce ARCH_DEFCONFIG to DEFCONFIG_LIST Greg KH
2008-06-14  0:12   ` [patch 45/47] bttv: Fix a deadlock in the bttv driver Greg KH
2008-06-14  0:12   ` [patch 46/47] x86: fix recursive dependencies Greg KH
2008-06-14  0:12   ` [patch 47/47] mac80211: send association event on IBSS create Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080614001142.GD24698@suse.de \
    --to=gregkh@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=cavokz@gmail.com \
    --cc=cebbert@redhat.com \
    --cc=chrisw@sous-sol.org \
    --cc=chuckw@quantumlinux.com \
    --cc=davej@redhat.com \
    --cc=davem@davemloft.net \
    --cc=ilpo.jarvinen@helsinki.fi \
    --cc=jmforbes@linuxtx.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mkrufky@linuxtv.org \
    --cc=rdunlap@xenotime.net \
    --cc=reviews@ml.cw.f00f.org \
    --cc=stable@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=w@1wt.eu \
    --cc=zwane@arm.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox