netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
To: Andy Furniss <lists@andyfurniss.entadsl.com>,
	David Miller <davem@davemloft.net>
Cc: Netdev <netdev@vger.kernel.org>, int 986 <int986@gmail.com>,
	Brian Vowell <brian.vowell@gmail.com>,
	Robin Johnson <robbat2@gentoo.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH] tcp: fix skb vs fack_count out-of-sync condition
Date: Wed, 4 Jun 2008 13:42:43 +0300 (EEST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0806041330030.16829@wrl-59.cs.helsinki.fi> (raw)
In-Reply-To: <483E0848.6030307@andyfurniss.entadsl.com>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3777 bytes --]

On Thu, 29 May 2008, Andy Furniss wrote:

> ? wrote:
> > On Sat, 3 May 2008, Andy Furniss wrote:
> > 
> > > Ilpo J?rvinen wrote:
> > >
> > > > Since I've not yet found any other path that could lead to it except the
> > > > yesterday's false-positive, getting the write queue state more
> > > > accurately
> > > > captured by the debug patch on the very first occassion might help.
> > > > Here's
> > > > less spammy version of the debug patch (I added one printout as well to
> > > > get
> > > > more usefulness to transitional state reported).
> > > >
> > > OK, increased CONFIG_LOG_BUF_SHIFT from 14 to 18 and reversed old /
> > > applied
> > > new patch.
> > 
> > Any luck so far?
> > 
> 
> Finally - looks like I got two together.
> 
> It doesn't start with WARNING - but dmesg isn't full.

Ok, here's finally the fix to those WARN_ON(!tp->sacked_out && 
tp->fackets_out) WARNINGs, both stable-2.6.25 and linus' tree affected.

-- 
 i.

--
[PATCH] tcp: fix skb vs fack_count out-of-sync condition

This bug is able to corrupt fackets_out in very rare cases.
In order for this to cause corruption:
  1) DSACK in the middle of previous SACK block must be generated.
  2) In order to take that particular branch, part or all of the
     DSACKed segment must already be SACKed so that we have that
     in cache in the first place.
  3) The new info must be top enough so that fackets_out will be
     updated on this iteration.
...then fack_count is updated while skb wasn't, then we walk again
that particular segment thus updating fack_count twice for
a single skb and finally that value is assigned to fackets_out
by tcp_sacktag_one.

It is safe to call tcp_sacktag_one just once for a segment (at
DSACK), no need to call again for plain SACK.

Potential problem of the miscount are limited to premature entry
to recovery and to inflated reordering metric (which could even
cancel each other out in the most the luckiest scenarios :-)).
Both are quite insignificant in worst case too and there exists
also code to reset them (fackets_out once sacked_out becomes zero
and reordering metric on RTO).

This has been reported by a number of people, because it occurred
quite rarely, it has been very evasive. Andy Furniss was able to
get it to occur couple of times so that a bit more info was
collected about the problem using a debug patch, though it still
required lot of checking around. Thanks also to others who have
tried to help here.

This is listed as Bugzilla #10346. The bug was introduced by
me in commit 68f8353b48 ([TCP]: Rewrite SACK block processing & 
sack_recv_cache use), I probably thought back then that there's
need to scan that entry twice or didn't dare to make it go
through it just once there. Going through twice would have
required restoring fack_count after the walk but as noted above,
I chose to drop the additional walk step altogether here.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: int 986 <int986@gmail.com>
Cc: Andy Furniss <lists@andyfurniss.entadsl.com>
Cc: Brian Vowell <brian.vowell@gmail.com>
---
 net/ipv4/tcp_input.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 5d5b08b..a00c7f8 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1392,9 +1392,9 @@ static struct sk_buff *tcp_maybe_skipping_dsack(struct sk_buff *skb,
 
 	if (before(next_dup->start_seq, skip_to_seq)) {
 		skb = tcp_sacktag_skip(skb, sk, next_dup->start_seq, fack_count);
-		tcp_sacktag_walk(skb, sk, NULL,
-				 next_dup->start_seq, next_dup->end_seq,
-				 1, fack_count, reord, flag);
+		skb = tcp_sacktag_walk(skb, sk, NULL,
+				     next_dup->start_seq, next_dup->end_seq,
+				     1, fack_count, reord, flag);
 	}
 
 	return skb;
-- 
1.5.2.2

  parent reply	other threads:[~2008-06-04 10:42 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-28 12:02 WARNING: at net/ipv4/tcp_input.c Andy Furniss
2008-04-28 15:57 ` Ilpo Järvinen
2008-04-28 19:07   ` Andy Furniss
2008-04-28 19:21     ` Ilpo Järvinen
2008-05-01  9:01       ` Andy Furniss
2008-05-01  9:37         ` Ilpo Järvinen
2008-05-01  9:54           ` Andy Furniss
2008-05-02 10:51             ` Ilpo Järvinen
2008-05-03 12:03               ` Andy Furniss
2008-05-12 10:50                 ` Ilpo Järvinen
2008-05-13 10:20                   ` Andy Furniss
2008-05-29  1:35                   ` Andy Furniss
2008-05-29  8:10                     ` Ilpo Järvinen
2008-06-04 10:42                     ` Ilpo Järvinen [this message]
2008-06-04 19:08                       ` [PATCH] tcp: fix skb vs fack_count out-of-sync condition David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0806041330030.16829@wrl-59.cs.helsinki.fi \
    --to=ilpo.jarvinen@helsinki.fi \
    --cc=akpm@linux-foundation.org \
    --cc=brian.vowell@gmail.com \
    --cc=davem@davemloft.net \
    --cc=int986@gmail.com \
    --cc=lists@andyfurniss.entadsl.com \
    --cc=netdev@vger.kernel.org \
    --cc=rjw@sisk.pl \
    --cc=robbat2@gentoo.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).