From: Arnd Hannemann <arnd@arndnet.de>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Alexander Zimmermann <alexander.zimmermann@comsys.rwth-aachen.de>,
Yuchung Cheng <ycheng@google.com>,
Hagen Paul Pfeifer <hagen@jauu.net>,
netdev <netdev@vger.kernel.org>,
Lukowski Damian <damian@tvk.rwth-aachen.de>
Subject: Re: [PATCH] tcp: bound RTO to minimum
Date: Thu, 25 Aug 2011 11:46:02 +0200
Message-ID: <4E5619DA.6070902@arndnet.de>
In-Reply-To: <1314263389.2387.21.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

Hi Eric,

On 25.08.2011 11:09, Eric Dumazet wrote:
> On Thursday, 25 August 2011 at 10:46 +0200, Arnd Hannemann wrote:
>> On 25.08.2011 10:26, Eric Dumazet wrote:
>>> On Thursday, 25 August 2011 at 09:28 +0200, Alexander Zimmermann wrote:
>>>> On 25.08.2011 at 07:28, Eric Dumazet wrote:
>>>
>>>>> The real question is: do we really want to process ~1000 timer interrupts
>>>>> per tcp session, ~2000 skb alloc/free/build/handling, possibly ~1000 ARP
>>>>> requests, only to make tcp recover in ~1sec when connectivity
>>>>> returns? This just doesn't scale.
>>>>
>>>> Maybe a stupid question, but 1000? With a minRTO of 200ms and a maximum
>>>> probing time of 120s, we get 600 retransmits in a worst-case scenario
>>>> (assuming that we get an ICMP for every RTO retransmission). No?
>>>
>>> Where is the "max probing time of 120s" asserted?
>>>
>>> It is not the case on my machine:
>>> I have way more retransmits than that, even if spaced by 1600 ms
>>>
>>> 07:16:13.389331 write(3, "\350F\235JC\357\376\363&\3\374\270R\21L\26\324{\37p\342\244i\304\356\241I:\301\332\222\26"..., 48) = 48
>>> 07:16:13.389417 select(7, [3 4], [], NULL, NULL) = 1 (in [3])
>>> 07:31:39.901311 read(3, 0xff8c4c90, 8192) = -1 EHOSTUNREACH (No route to host)
>>>
>>> Old kernels were performing up to 15 retries, doing exponential backoff.
>>>
>>> Now it's kind of unlimited, according to experimental results.
>>
>> That shouldn't be. It should stop after the same time as a TCP connection whose
>> RTO equals the minimum RTO and which does 15 retries (tcp_retries2=15) with exponential backoff.
>> So it should be around 900s*. But it could be that, because of the icsk_retransmit wraparound,
>> this doesn't work as expected.
>>
>> * 200ms + 400ms + 800ms ...
>
> It is 924 seconds with retries2=15 (default value)
>
> I said ~1000 probes.
>
> If ICMP are not rate limited, that could be about 924*5 probes, instead
> of 15 probes on old kernels.

At a rate of 5 packets/s if RTT is zero, yes. I would like to say: so
what? But your example with millions of idle connections stands.

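To make the numbers in this exchange easy to check, here is a minimal sketch of the arithmetic only. It is not the kernel's code: the function names are made up for illustration, and it assumes a minimum RTO of 200 ms, a 120 s cap on the backed-off RTO, and one probe per interval (i.e. RTT close to zero and an ICMP for every retransmission).

```python
# Sketch of the arithmetic discussed above; names are hypothetical.
# All times are integer milliseconds to avoid floating-point surprises.
RTO_MIN_MS = 200        # assumed minimum RTO
RTO_MAX_MS = 120_000    # assumed cap on the backed-off RTO


def backoff_budget_ms(retries=15, rto_min=RTO_MIN_MS, rto_max=RTO_MAX_MS):
    """Time until a connection starting at rto_min gives up after `retries`
    retransmissions with exponential backoff: 200ms + 400ms + 800ms ...
    (the initial timeout plus one doubled, capped timeout per retry)."""
    total = 0
    rto = rto_min
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)
    return total


def max_probes(budget_ms, interval_ms):
    """Upper bound on probes if one is sent every interval_ms."""
    return budget_ms // interval_ms


budget = backoff_budget_ms()
print(budget / 1000)                 # total budget in seconds
print(max_probes(budget, 200))       # backoff always reverted to minRTO
print(max_probes(budget, 10_000))    # backoff not reverted below 10 s
```

Under these assumptions the budget comes out at roughly 924 s, which is where both the "924*5 probes" figure and the "at most 92 tries" figure for a 10 s threshold come from.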
> Maybe we should refine the thing a bit, to not reverse backoff unless
> rto is > some_threshold.
>
> Say 10s being the value, that would give at most 92 tries.

I personally think that 10s would be too large and would eliminate the benefit of the
algorithm, so I would prefer a different solution.

In the case of one bulk-data TCP session that was transmitting hundreds of packets/s
before the connectivity disruption, that worst-case rate of 5 packets/s really
seems conservative enough.

However, in the case of a lot of idle connections that were transmitting only
a small number of packets per minute, we might increase the rate drastically for
a certain period until it throttles down. You are saying that we have a problem here,
correct?

Do you think it would be possible, without much hassle, to use a kind of "global"
rate limiting only for these probe packets of a TCP connection?

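To illustrate what I mean by "global" rate limiting, here is a rough user-space sketch using a token bucket shared by all connections. Everything in it is hypothetical (the class name, the rate and burst values, and the idea that a refused probe simply falls back to the normal backed-off timer); it is not an existing kernel mechanism.

```python
class ProbeRateLimiter:
    """Token bucket shared by all connections (hypothetical sketch).

    Each probe sent with a reverted (short) RTO costs one token. Tokens
    refill at `rate_per_s` and are capped at `burst`.
    """

    def __init__(self, rate_per_s, burst):
        self.rate = float(rate_per_s)
        self.burst = float(burst)
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # time of the previous call, in seconds

    def allow(self, now):
        """Return True if a probe may be sent at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        # Assumption: the caller keeps its exponentially backed-off RTO
        # instead of reverting it when the probe is refused.
        return False


limiter = ProbeRateLimiter(rate_per_s=100, burst=10)
# 1000 idle connections all want to probe at t=0; only `burst` may do so.
sent_at_t0 = sum(limiter.allow(0.0) for _ in range(1000))
print(sent_at_t0)
```

The point of the sketch is that the aggregate probe rate stays bounded no matter how many idle connections exist, while a single bulk connection would rarely hit the limit.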
> I mean, what is the gain to be able to restart a frozen TCP session with
> a 1sec latency instead of 10s if it was blocked more than 60 seconds ?

I'm afraid the gain is large, especially in highly dynamic environments. You
don't just get the additional latency; you may actually miss the whole
period during which connectivity was available, and then just retransmit into the next
period of disrupted connectivity.

Best regards,
Arnd