From mboxrd@z Thu Jan  1 00:00:00 1970
From: Rick Jones <rick.jones2@hp.com>
Subject: Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)
Date: Fri, 13 Jul 2007 09:55:10 -0700
Message-ID: <4697AE6E.4070600@hp.com>
References: <20070712.161510.26510093.noboru.obata.ar@hitachi.com>	<20070712.023710.36923635.davem@davemloft.net>	<20070712.225950.12335719.noboru.obata.ar@hitachi.com>	<20070712.132448.115910193.davem@davemloft.net>	<20070712141203.7350429a@freepuppy.rosehill.hemminger.net>	<46969CA9.8030406@hp.com> <Pine.LNX.4.64.0707130726550.30150@kivilampi-30.cs.helsinki.fi>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1;
	format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Stephen Hemminger <shemminger@linux-foundation.org>,
	noboru.obata.ar@hitachi.com, David Miller <davem@davemloft.net>,
	yoshfuji@linux-ipv6.org, Netdev <netdev@vger.kernel.org>
To: =?ISO-8859-1?Q?Ilpo_J=E4rvinen?= <ilpo.jarvinen@helsinki.fi>
Return-path: <netdev-owner@vger.kernel.org>
Received: from palrel10.hp.com ([156.153.255.245]:36267 "EHLO palrel10.hp.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752812AbXGMQ4L (ORCPT <rfc822;netdev@vger.kernel.org>);
	Fri, 13 Jul 2007 12:56:11 -0400
In-Reply-To: <Pine.LNX.4.64.0707130726550.30150@kivilampi-30.cs.helsinki.fi>
Sender: netdev-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

Ilpo Järvinen wrote:
> On Thu, 12 Jul 2007, Rick Jones wrote:
>
>
>>>One question is why the RTO gets so large that it limits failover?
>>>
>>>If Linux TCP is working correctly,  RTO should be srtt + 2*rttvar
>>>
>>>So either there is a huge srtt or variance, or something is going
>>>wrong with RTT estimation.  Given some reasonable maximums of
>>>Srtt = 500ms and rttvar = 250ms, that would cause RTO to be 1second.
>>
>>I suspect that what is happening here is that a link goes down in a trunk
>>somewhere for some number of seconds, resulting in a given TCP segment being
>>retransmitted several times, with the doubling of the RTO each time.
>
>
> But that's a back-off for the retransmissions, the doubling is
> temporary... Once you return to normal conditions, the accumulated backoff
> multiplier will be immediately cut back to normal. So you should then be
> back to 1 second (like in the example or whatever) again...

Fine, but so?  I suspect the point of the patch is to provide a lower cap on the
accumulated backoff so data starts flowing over the connection within that lower
cap once the link is restored/failed-over.
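
To put rough numbers on that, here is a minimal sketch of the backoff
arithmetic. It is a simplified model, not the kernel's code: it assumes the
segment is first sent just as the outage begins, that the un-backed-off RTO is
1 second (as in the example quoted above), and that the connection is never
aborted for exceeding a retry limit. The function name and parameters are made
up for illustration; the 120 second default is meant to mirror the stock
TCP_RTO_MAX.

```python
def recovery_delay(outage_s, base_rto_s=1.0, rto_max_s=120.0):
    """Seconds between link restoration and the first retransmission
    that can get through (hypothetical helper, simplified model)."""
    t = 0.0              # time of the original, lost transmission
    rto = base_rto_s     # assumed un-backed-off RTO
    while True:
        t += rto         # retransmission timer fires
        if t >= outage_s:
            # link is back: this retransmission is the first to succeed
            return t - outage_s
        # still down: double the timeout, but never beyond the cap
        rto = min(rto * 2, rto_max_s)


if __name__ == "__main__":
    for cap in (120.0, 10.0):
        delay = recovery_delay(outage_s=100.0, rto_max_s=cap)
        print(f"cap={cap:.0f}s -> data flows {delay:.0f}s after the link returns")
```

Under these assumptions a 100 second outage leaves the connection idle for 27
seconds after the link returns with a 120 second cap, versus 5 seconds with a
10 second cap. The backoff does collapse once an ACK arrives, but only after
that one long timer has expired.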

rick jones