netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
To: "Damon L. Chesser" <damon@damtek.com>,
	David Miller <davem@davemloft.net>
Cc: Bug 213081 <213081@bugs.launchpad.net>,
	478062@bugs.debian.org, Netdev <netdev@vger.kernel.org>
Subject: Re: Fix FRTO+NewReno problem (Was: Re: This has a work around)
Date: Mon, 12 May 2008 13:08:09 +0300 (EEST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0805120859470.1041@wrl-59.cs.helsinki.fi> (raw)
In-Reply-To: <Pine.LNX.4.64.0805082336440.31776@wrl-59.cs.helsinki.fi>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3694 bytes --]

On Thu, 8 May 2008, Ilpo Järvinen wrote:

> On Thu, 8 May 2008, Damon L. Chesser wrote:
> 
> > reran the print job with the correct kernel (for control reasons) and
> > received
> > the same results:  tcp_frto=1 no print.  tcp_frto=0 I can print.  Attached
> > is
> > the output of tcpdump
> > 
> > uname -r = 2.6.24-1-amd64
> 
> Well, that was a surprise, there must be something else too I didn't yet
> notice. I don't think it's that necessary for you to test that patch I sent
> earlier (basically the code paths it would have fixed were already in use with
> tcp_frto=1). And that patch was "obviously correct" anyway though it wasn't
> enough to fix this issue.
> 
> ...I too can probably reproduce this locally with small amount of work because
> the receiver pattern is dead obvious from the logs.

Yes indeed, some hping3 tcl acting as a clone of that network printer did 
it :-). Below is the 2nd patch (both are necessary). Besides them there's 
still SACKFRTO snd_nxt != frto_highmark problem remaining but it is a lot 
less severe and rare than this problem was and I'm still trying to find a 
simple way to fix it w/o adding another u32 to tcp_sock. I may need to 
think this retrans_stamp usage more around the rest of TCP code too as 
it seems to be somewhat suspicious here and there.

-- 
 i.

ps. ...you could have at least considered reporting upstream a bit 
earlier if some problem goes away/appears by changing kernel version 
(especially since you already tried some non-distro kernels and found 
them non-working), it might help to catch devs attention who hardly
hang much around distro bug trackers :-).

--
[PATCH] [TCP] FRTO: Fix fallback to conventional recovery

It seems that commit 009a2e3e4ec ("[TCP] FRTO: Improve
interoperability with other undo_marker users") run into
another land-mine which caused fallback to conventional
recovery to break:

1. Cumulative ACK arrives after FRTO retransmission
2. tcp_try_to_open sees zero retrans_out, clears retrans_stamp
   which should be kept like in CA_Loss state it would be
3. undo_marker change allowed tcp_packet_delayed to return
   true because of the cleared retrans_stamp once FRTO is
   terminated causing LossUndo to occur, which means all loss
   markings FRTO made are reverted.

This means that the conventional recovery basically recovered
one loss per RTO, which is not that efficient. It becomes a
serious problem to progress of the flow if many segments were
lost or when losses will persist to the FRTO RTTs as well.
Retrans_stamp was incorrectly cleared even before that
particular change (though it's effect is not often significant).

It was quite unobvious that the undo_marker change broken
something like this, I had a quite long session to track it
down because of the non-intuitiviness of the bug (luckily I
had a trivial reproducer at hand and I was also able to learn
to use kprobes in the process as well :-)).

This together with the NewReno+FRTO fix (62ab22278308a)
should finally fix Damon's problems.

Compile tested (but I did experiment with a similar fix on
a live kernel with systemtap+kprobes).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Damon L. Chesser <damon@damtek.com>
---
 net/ipv4/tcp_input.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 81ece1f..4c2255c 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2481,7 +2481,7 @@ static void tcp_try_to_open(struct sock *sk, int flag)
 
 	tcp_verify_left_out(tp);
 
-	if (tp->retrans_out == 0)
+	if (!tp->frto_counter && tp->retrans_out == 0)
 		tp->retrans_stamp = 0;
 
 	if (flag & FLAG_ECE)
-- 
1.5.2.2

  reply	other threads:[~2008-05-12 10:08 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <48207F06.50306@damtek.com>
     [not found] ` <Pine.LNX.4.64.0805061912350.8965@wrl-59.cs.helsinki.fi>
     [not found]   ` <4821C37A.7040306@damtek.com>
2008-05-07 22:26     ` Fix FRTO+NewReno problem (Was: Re: This has a work around) Ilpo Järvinen
2008-05-08  8:10       ` Fix FRTO+NewReno problem David Miller
2008-05-08 16:50       ` Bug#478062: Fix FRTO+NewReno problem (Was: Re: This has a work around) Damon L. Chesser
2008-05-08 17:05       ` Damon L. Chesser
2008-05-08 18:16       ` Damon L. Chesser
2008-05-08 20:42         ` Ilpo Järvinen
2008-05-12 10:08           ` Ilpo Järvinen [this message]
     [not found]             ` <4828279C.3010102@damtek.com>
2008-05-12 11:32               ` Ilpo Järvinen
2008-05-12 11:55                 ` Damon L. Chesser
2008-05-12 12:07                   ` Ilpo Järvinen
2008-05-12 13:44                     ` Damon L. Chesser
2008-05-12 14:35                       ` Ilpo Järvinen
2008-05-12 16:40                         ` Ilpo Järvinen
2008-05-12 17:02                         ` Damon L. Chesser
2008-05-12 18:23                           ` Ilpo Järvinen
2008-05-12 18:39                         ` Damon L. Chesser
2008-05-12 19:12                           ` Ilpo Järvinen
2008-05-12 19:18                             ` Damon L. Chesser
2008-05-12 19:25                               ` Ilpo Järvinen
2008-05-12 22:48                             ` Fix FRTO+NewReno problem David Miller
2008-05-13  9:42                               ` Ilpo Järvinen
2008-05-13  9:49                                 ` David Miller
2008-05-13 10:04                                   ` Ilpo Järvinen
2008-05-13 10:08                                     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0805120859470.1041@wrl-59.cs.helsinki.fi \
    --to=ilpo.jarvinen@helsinki.fi \
    --cc=213081@bugs.launchpad.net \
    --cc=478062@bugs.debian.org \
    --cc=damon@damtek.com \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).