From: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
To: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Cc: Netdev <netdev@vger.kernel.org>
Subject: Re: TCP IPv4 strange retransmits
Date: Wed, 05 Mar 2008 14:04:39 +0100
Message-ID: <47CE9A67.5010002@nets.rwth-aachen.de>
In-Reply-To: <Pine.LNX.4.64.0803050819020.15712@kivilampi-30.cs.helsinki.fi>

Ilpo Järvinen wrote:
> On Wed, 5 Mar 2008, Arnd Hannemann wrote:
> 
>> Ilpo Järvinen wrote:
>>
>>> No, if there's any skb which is more than fackets_out-tp->reordering from 
>>> the highest SACKed skb, it will be marked TCPCB_LOST (see 
>>> tcp_mark_head_lost & its caller), and all LOST segments are retransmitted 
>>> by the earlier loop (for a while still as I'm going to very likely change 
>>> that in net-2.6.26, commits for consolidating both, nearly identical loops 
>>> are already in my local git and await some testing).
>>>
>>> Forwardretrans is only incremented when there isn't TCPCB_LOST set for a 
>>> segment and it doesn't apply in this case anyway because you have new data 
>>> to send (see the decision making for forward retransmits, it's well 
>>> commented btw).
>> Ah, I see. Thank you for clarifying.
>> However fackets_out is not so well documented ;-)
> 
> I think I've fixed this for 2.6.25... :-) :
> 
> ...
> /* Heurestics to calculate number of duplicate ACKs. There's no dupACKs
>  * counter when SACK is enabled (without SACK, sacked_out is used for
>  * that purpose).
>  *
>  * Instead, with FACK TCP uses fackets_out that includes both SACKed
>  * segments up to the highest received SACK block so far and holes in
>  * between them.
>  *
>  * With reordering, holes may still be in flight, so RFC3517 recovery
>  * uses pure sacked_out (total number of SACKed segments) even though
>  * it violates the RFC that uses duplicate ACKs, often these are equal
>  * but when e.g. out-of-window ACKs or packet duplication occurs,
>  * they differ. Since neither occurs due to loss, TCP should really
>  * ignore them.
>  */
> static inline int tcp_dupack_heurestics(struct tcp_sock *tp)
> ...

Great :-) But shouldn't it read "heuristics"?
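
To make the quoted comment concrete, here is a minimal userspace sketch of the choice it describes. The struct and the function name dupack_estimate are hypothetical stand-ins for illustration, not the kernel's struct tcp_sock or the actual body of the function quoted above; only the field names sacked_out and fackets_out come from the comment.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the counters discussed in this thread.
 * This is NOT the kernel's struct tcp_sock. */
struct tcp_counters {
	bool fack_enabled;        /* assumption: true while FACK is in use */
	unsigned int sacked_out;  /* total number of SACKed segments */
	unsigned int fackets_out; /* SACKed segments up to the highest SACK
				   * block, plus the holes between them */
};

/* Number of segments to treat as "duplicate ACKs", following the
 * quoted comment: fackets_out with FACK, plain sacked_out otherwise. */
static unsigned int dupack_estimate(const struct tcp_counters *c)
{
	return c->fack_enabled ? c->fackets_out : c->sacked_out;
}
```

With two SACKed segments and four holes below the highest SACK block, the FACK estimate would be 6 while the sacked_out estimate would be 2, which is why the two modes can react so differently to the same SACK information.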

> ...Though some FACK comments seem to be saying something else still.
> 
>> But it now makes all sense (with dump order):
>> An ACK 19225 arrives with SACK block {27745:29165}, so fackets_out becomes 
>> ~6 ((27745-19225)/1450)
>> tp->reordering is 3 at this time so he starts to retransmit.
>> However some SACK ACK comes early enough so he stops at 4 retransmits.
>> Or something like that...
> 
> Another thing you should consider is reordering detection which hopefully 
> worked at 13:08:20.667529 through the newly discovered SACK block which is 
> _lower_ than the highest SACK block received so far. That results in 
> FACK -> RFC3517, FACK is built on inorder assumptions and whenever we find 
> that untrue, e.g., due to SACK/ACK for non-rexmit when something larger 
> has been confirmed received we disable it. Ah, but this was 2.6.24.y? It 

Yes, it was 2.6.24.2. Actually, you can see reordering detection at work here [3]:
the tool [4] we use to measure TCP throughput samples the tcp_info struct, and the
#reor column should reflect tp->reordering.
It starts at 3 and then grows to 16. Of course, this is only a hint, because
tcp_info is sampled only every 50 ms in this example, but at least it shows that some
reordering detection took place...
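
As a rough check of the numbers in this thread, the sketch below redoes the estimate (27745-19225)/1450 and compares the result against tp->reordering. Both helper names are hypothetical, and the comparison is only a simplified reading of "fackets_out exceeds reordering, so retransmission starts", not the kernel's loss-marking code.

```c
/* Estimate how many segments lie between the cumulative ACK and the
 * start of the highest SACK block, rounding up to whole segments.
 * Hypothetical helper; seg_size is the per-segment payload assumed
 * in the estimate (1450 in the example from this thread). */
static unsigned int segs_below_sack(unsigned int ack,
				    unsigned int sack_start,
				    unsigned int seg_size)
{
	if (seg_size == 0)
		return 0; /* avoid division by zero */
	return (sack_start - ack + seg_size - 1) / seg_size;
}

/* Simplified trigger: the FACK-style count is larger than the
 * current reordering estimate. */
static int exceeds_reordering(unsigned int fackets_out,
			      unsigned int reordering)
{
	return fackets_out > reordering;
}
```

For ACK 19225 and a SACK block starting at 27745, the estimate comes out at 6 segments. That exceeds a reordering value of 3, but not the value of 16 that the #reor column reaches later.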

> doesn't yet do RFC3517 IIRC, but has something remotely resembling 
> newreno, but only for the first packet because the next cumulative ACK may 
> often trigger timedout loop which basically marks everything lost (I don't 
> remember if the latter was changed to occur only with FACK ages ago or 
> not).

Not sure if I understood this. Will have to look into this some more.

> 
>>>> Tcpdump:
>> Sorry, this was just bogus. Just wanted to point out the timestamp 
>> differences and made a wrong example. Screen full of numbers... ;-)
> 
> I thought so :-).
> 
> ...Large, nearly equal numbers in two dimensions, maybe at some day 
> I wake up and notice I've read them too long noticing that capturing 
> this kind of things is no longer a problem to me... :-/
> 

[3] http://www.umic-mesh.net/~hannemann/strange-reorder/flowgrind.output
[4] http://www.umic-mesh.net/research/tcp/flowgrind.html


Thread overview: 11+ messages
2008-03-04 13:00 TCP IPv4 strange retransmits Arnd Hannemann
2008-03-04 13:36 ` Ilpo Järvinen
2008-03-04 14:31   ` Arnd Hannemann
2008-03-04 21:04     ` H. Willstrand
2008-03-04 22:41       ` Arnd Hannemann
2008-03-04 21:07     ` Ilpo Järvinen
2008-03-04 21:19       ` Ilpo Järvinen
2008-03-04 23:03       ` Arnd Hannemann
2008-03-05  7:00         ` Ilpo Järvinen
2008-03-05 13:04           ` Arnd Hannemann [this message]
2008-03-05 19:32             ` Ilpo Järvinen
