* Weird TCP SACK problem. in Linux...
@ 2006-07-18 19:38 Oumer Teyeb
2006-07-19 9:38 ` Xiaoliang (David) Wei
2006-07-19 13:27 ` Alexey Kuznetsov
0 siblings, 2 replies; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-18 19:38 UTC (permalink / raw)
To: netdev
Hello Guys,
I have some questions regarding the TCP SACK implementation in Linux.
As I am not a subscriber, could you please CC the reply to me? Thanks!
I am doing these experiments to find out the impact of reordering. So I
have the different TCP variants (NewReno, SACK, FACK, DSACK, FRTO, ...) as
implemented in Linux, and I am trying their combinations to see how they
behave. What struck me was that when I don't use timestamps, introducing
SACK increases the download time but decreases the total number of
retransmissions. When timestamps are used, SACK leads to an increase in
both the download time and the retransmissions.
So I looked further into the results, and what I found was that when
SACK (when I refer to SACK here, I mean SACK only, without FACK and
DSACK) is used, the retransmissions seem to happen earlier.
At www.kom.auc.dk/~oumer/first_transmission_times.pdf
you can find a plot of the CDF of the time when the first TCP
retransmission occurred, for the four combinations of SACK and timestamps,
after hundreds of downloads of a 100K file under network reordering.
This explains why the download time increases with SACK: the earlier we
go into fast recovery, the longer the time we spend in congestion
avoidance, and the longer the download time. I am not 100% sure that the
retransmissions are due only to reordering, as I am using tcptrace to get
my results, but I am guessing they are not caused by timeouts, because
when I used FRTO there was no improvement, showing that there were indeed
no timeouts (FRTO acts only on timeouts).
But I couldn't figure out why the retransmissions occur earlier for SACK
than for non-SACK TCP. As far as I know, for both the SACK and non-SACK
cases we need three (or more, depending on the setting) duplicate ACKs to
enter the fast retransmit/recovery state, which should result in the same
behaviour up to the first occurrence of a retransmission. Or is there some
undocumented enhancement in Linux TCP that makes it enter fast retransmit
earlier when using SACK? The only explanation I could imagine is something
like this:
non-SACK case
=============
Packets 1 2 3 4 5 6 7 8 9 10 ... were sent and 2 was reordered. Assume we
are using delayed ACKs, so we get the third duplicate ACK after packet 8
is received (i.e. 3&4 -- first duplicate ACK, 5&6 -- second duplicate ACK,
7&8 -- third duplicate ACK).

SACK case
=========
3&4 SACKed ... 2 packets received out of order
5&6 SACKed ... 4 packets received out of order ... start fast
retransmission, as the reordering is greater than 3. (This is true when it
comes to marking packets as lost during fast recovery, but is it also true
for the first retransmission?)
Any ideas why this is happening?
One more thing: say I have FRTO, DSACK, and timestamps enabled; which
algorithm takes precedence? If FRTO is enabled, is all spurious timeout
detection done through FRTO, or through a combination?
Thanks in advance,
Oumer
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Weird TCP SACK problem. in Linux...
2006-07-18 19:38 Weird TCP SACK problem. in Linux Oumer Teyeb
@ 2006-07-19 9:38 ` Xiaoliang (David) Wei
2006-07-19 10:00 ` Oumer Teyeb
2006-07-19 13:27 ` Alexey Kuznetsov
1 sibling, 1 reply; 10+ messages in thread
From: Xiaoliang (David) Wei @ 2006-07-19 9:38 UTC (permalink / raw)
To: Oumer Teyeb; +Cc: netdev
Hi Oumer,
Your result is interesting. Just a few questions (inline with your text):
> So I looked further into the results, and what I found was that when
> SACK (when I refer to SACK here, I mean SACK only without FACK and
> DSACK) is used, the retransmissions seem to happen earlier .
> at www.kom.auc.dk/~oumer/first_transmission_times.pdf
> you can find the pic of cdf of the time when the first TCP
> retransmission occured for the four combinations of SACK and timestamps
> after hundrends of downloads of a 100K file for the different conditions
> under network reordering...
Could you give a bit more detail on the scenario? For example: what is
your RTT, capacity, etc.? Which Linux version? Is the packet size 1.5K?
Then 100K is about 66 packets. Do the flows finish slow start or not?
Also, what is the reordering level? Are you using Dummynet or a real
network?
> ...but I couldnt figure out why the retransmissions occur earlier for
> SACK than no SACK TCP. As far as I know, for both SACK and non SACK
> cases, we need three (or more according to the setting) duplicate ACKs
> to enter the fast retransmission /recovery state.... which would have
> resulted in the same behaviour to the first occurance of a
> retransmission..... or is there some undocumented enhancment in Linux
> TCP when using SACK that makes it enter fast retransmit earlier... the
> ony explanation I could imagine is something like this
Are you sure FACK is turned off? FACK might retransmit earlier when
there is packet reordering, I think.
> non SACK case
> =============
> 1 2 3 4 5 6 7 8 9 10..... were sent and 2 was reorderd....and assume we
> are using delayed ACKs...and we get a triple duplicate ACK after pkt#8
> is received. (i.e 3&4--first duplicate ACK, 5&6..second duplicate ACK
> and 7&8...third duplicate ACK.....)...
>
> so if SACK behaved like this...
>
> 3&4 SACKEd.... 2 packets out of order received
> 5&6 SACKEd....4 packets out of order received.... start fast
> retransmission....as reorderd is greater than 3.... (this is true when
> it comes to marking packets as lost during fast recovery, but is it true
> als for the first retransmission?)
I guess delayed ACK is effectively turned off when there is packet
reordering: the receiver sends one ACK for each data packet whenever
there are out-of-order packets in its queue. So we will get duplicate
ACKs earlier than in your explanation above...
> One more thing, say I have FRTO, DSACK and timestamps enabled, which
> algorithm takes precedence ? if FRTO is enabled, then all spurious
> timeout detection are done through FRTO or a combination?..
They are compatible, I think?
When the retransmission timer times out, the sender first goes through
FRTO. If FRTO finds it is a real loss, it falls back to the traditional
timeout processing, as specified in the FRTO algorithm.
-David
--
Xiaoliang (David) Wei Graduate Student, CS@Caltech
http://davidwei.org
***********************************************
* Re: Weird TCP SACK problem. in Linux...
2006-07-19 9:38 ` Xiaoliang (David) Wei
@ 2006-07-19 10:00 ` Oumer Teyeb
0 siblings, 0 replies; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-19 10:00 UTC (permalink / raw)
To: Xiaoliang (David) Wei; +Cc: netdev
Hi David,
I am using an emulator that I developed using netfilter (see
http://kom.aau.dk/~oumer/publications/VTC05.pdf for a description of the
emulator). I emulate a UMTS network with an RTT of 150ms and a 384kbps
connection. There is a UMTS frame erasure rate of 10%, but I have
persistent link layer retransmission, which means nothing is actually
lost. Due to these link layer errors, some packets arrive out of order,
and the effect of that on TCP performance is what I am after. I am using
Linux 2.4.
I have put more detailed traces at
www.kom.auc.dk/~oumer/sackstuff.tar.gz
I have run the different cases 10 times each:
NT_NSACK[1-10].dat --- no timestamps, no SACK
NT_SACK[1-10].dat ---- no timestamps, SACK
T_NSACK[1-10].dat ---- timestamps, no SACK
T_SACK[1-10].dat ----- timestamps, SACK
(By no SACK I mean SACK, DSACK, and FACK all disabled; I also have
results when they are enabled, see below for curves illustrating the
different cases.)
The files without an extension are just two-column files that summarize
the ten runs for the four different cases: the first column is the number
of retransmissions, and the second column is the download time. The
values are gathered from tcptrace. The two eps files are plots
summarizing the above: average download time and average number of
retransmissions for each case.
One more thing: in the trace files you will find 3 TCP connections. The
first one is not modified by my emulator that causes the reordering
(actually, that is the connection through which I reset the destination
cache, which stores some metrics from previous runs, using some commands
via ssh). The second one is the FTP control channel and the third one is
the FTP data channel. The emulator affects the last two channels and
causes reordering once in a while. Please don't hesitate to ask me if
anything is not clear.
Also, I have put the final curves of all my emulations, showing the
download times and percentage of retransmissions (#retransmissions /
total packets sent), at
www.kom.auc.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_DT.pdf
www.kom.auc.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_ret.pdf
There are a lot of other things that I don't understand from these two
curves. However, the most bizarre one (apart from the SACK issue that
started this discussion) is: why does DSACK lead to increased
retransmissions when used without timestamps? (The behaviour is fine in
terms of download time, which is reduced, showing that DSACK-based
spurious retransmission detection is at work.)
Thanks a lot for taking the time
Regards,
Oumer
* Re: Weird TCP SACK problem. in Linux...
2006-07-18 19:38 Weird TCP SACK problem. in Linux Oumer Teyeb
2006-07-19 9:38 ` Xiaoliang (David) Wei
@ 2006-07-19 13:27 ` Alexey Kuznetsov
2006-07-19 15:02 ` Oumer Teyeb
1 sibling, 1 reply; 10+ messages in thread
From: Alexey Kuznetsov @ 2006-07-19 13:27 UTC (permalink / raw)
To: Oumer Teyeb; +Cc: netdev
Hello!
> DSACK) is used, the retransmissions seem to happen earlier .
Yes. With SACK/FACK, retransmissions can be triggered earlier,
if an ACK SACKs a segment which is far enough from the current snd.una.
That's what happens, e.g., in T_SACK_dump5.dat:
01:28:15.681050 < 192.38.55.34.51137 > 192.168.110.111.42238: P 18825:20273[31857](1448) ack 1/5841 win 5840/0 <nop,nop,timestamp 418948058 469778216> [|] (DF)(ttl 64, id 19165)
01:28:15.800946 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778229 418948031,nop,nop, sack 1 {10137:11585} > (DF) [tos 0x8] (ttl 62, id 45508)
01:28:15.860773 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778235 418948031,nop,nop, sack 2 {13033:14481}{10137:11585} > (DF) [tos 0x8] (ttl 62, id 45509)
01:28:15.860781 < 192.38.55.34.51137 > 192.168.110.111.42238: . 8689:10137[31857](1448) ack 1/5841 win 5840/0 <nop,nop,timestamp 418948076 469778235> [|] (DF) (ttl 64, id 19166)
The second SACK confirms that 13033..14481 has already arrived.
And this is not even a mistake; the third dupack arrived immediately:
01:28:15.901382 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778238 418948031,nop,nop, sack 2 {13033:15929}{10137:11585} > (DF) [tos 0x8] (ttl 62, id 45510)
Actually, this is the reason why the FACK heuristic is not disabled
even when FACK is disabled. Experiments showed that relaxing it severely
damages recovery in the presence of real multiple losses.
And when it happens to be reordering, undoing works really well.
There is one more thing which probably happens in your experiments,
though I did not find it in the dumps. If reordering exceeds the RTT,
i.e. we receive a SACK for a segment which was sent as forward
transmission after a hole was detected, fast retransmit is entered
immediately. Two dupacks are enough for this: the first triggers the
forward transmission; if the second SACKs the segment which has just
been sent, we are there.
> One more thing, say I have FRTO, DSACK and timestamps enabled, which
> algorithm takes precedence ?
They live together; essentially, they are not dependent on each other.
Alexey
* Re: Weird TCP SACK problem. in Linux...
2006-07-19 13:27 ` Alexey Kuznetsov
@ 2006-07-19 15:02 ` Oumer Teyeb
2006-07-19 15:49 ` Alexey Kuznetsov
0 siblings, 1 reply; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-19 15:02 UTC (permalink / raw)
To: Alexey Kuznetsov; +Cc: netdev
Hi,
Alexey Kuznetsov wrote:
>Hello!
>
>
>
>>DSACK) is used, the retransmissions seem to happen earlier .
>>
>>
>
>Yes. With SACK/FACK retransmissions can be triggered earlier,
>if an ACK SACKs a segment which is far enough from current snd.una.
>That's what happens f.e. in T_SACK_dump5.dat
>
>01:28:15.681050 < 192.38.55.34.51137 > 192.168.110.111.42238: P 18825:20273[31857](1448) ack 1/5841 win 5840/0 <nop,nop,timestamp 418948058 469778216> [|] (DF)(ttl 64, id 19165)
>01:28:15.800946 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778229 418948031,nop,nop, sack 1 {10137:11585} > (DF) [tos 0x8] (ttl 62, id 45508)
>01:28:15.860773 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778235 418948031,nop,nop, sack 2 {13033:14481}{10137:11585} > (DF) [tos 0x8] (ttl 62, id 45509)
>01:28:15.860781 < 192.38.55.34.51137 > 192.168.110.111.42238: . 8689:10137[31857](1448) ack 1/5841 win 5840/0 <nop,nop,timestamp 418948076 469778235> [|] (DF) (ttl 64, id 19166)
>
>The second sack confirms that 13033..14481 already arrived.
>
>And this is even not a mistake, the third dupack arrived immediately:
>01:28:15.901382 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778238 418948031,nop,nop, sack 2 {13033:15929}{10137:11585} > (DF) [tos 0x8] (ttl 62, id 45510)
>
>
Thanks a lot, Alexey, for pointing that out! That was more or less what
I was assuming... but is this feature of Linux TCP documented somewhere?
As far as I can see, I couldn't find it in Pasi's paper. In the
conservative SACK-based recovery RFC (RFC 3517), it is clearly stated
that:
Upon the receipt of the first (DupThresh - 1) duplicate ACKs, the
scoreboard is to be updated as normal. Note: The first and second
duplicate ACKs can also be used to trigger the transmission of
previously unsent segments using the Limited Transmit algorithm
[RFC3042].
When a TCP sender receives the duplicate ACK corresponding to
DupThresh ACKs, the scoreboard MUST be updated with the new SACK
information (via Update ()). If no previous loss event has occurred
on the connection or the cumulative acknowledgment point is beyond
the last value of RecoveryPoint, a loss recovery phase SHOULD be
initiated, per the fast retransmit algorithm outlined in [RFC2581].
Of course, once we are in the fast recovery phase, we are able to mark a packet as lost based on this criterion (also from the same RFC):
IsLost (SeqNum):
This routine returns whether the given sequence number is
considered to be lost. The routine returns true when either
DupThresh discontiguous SACKed sequences have arrived above
'SeqNum' or (DupThresh * SMSS) bytes with sequence numbers greater
than 'SeqNum' have been SACKed. Otherwise, the routine returns
false.
But from the trace portion you quoted, it seems the SACK implementation
in Linux simply checked the sequence number of the newly SACKed block
and, finding that there were two blocks in between, treated it as if it
were the dupthresh-th duplicate ACK and retransmitted. So if we were not
using SACK, the retransmission would have occurred after 01:28:15.90...,
meaning TCP SACK retransmitted in this case around 50ms earlier. But it
might be larger in some cases (I will try to look into the traces to
find larger time differences, but you can see there is a clear
difference by looking at the plots of the CDF of the time of occurrence
of the first retransmissions for the different cases at
http://kom.aau.dk/~oumer/first_transmission_times.pdf). So I am on the
verge of concluding that TCP SACK is worse than non-SACK TCP in case of
persistent reordering... if only I could find a reference about the
Linux TCP SACK behaviour we discussed above :-)
>Actually, it is the reason why the FACK heuristics is not disabled
>even when FACK disabled. Experiments showed that relaxing it severely
>damages recovery in presense of real multiple losses.
>And when it happens to be reordering, undoing works really well.
>
>
So you are saying it doesn't matter whether I disable FACK or not; it is
basically on by default? And it is disabled only when reordering is
detected (and this is done either through timestamps or DSACK, right?).
So if neither DSACK nor timestamps are enabled, we are unable to detect
reordering, so basically there should be no difference between SACK and
FACK, because it is always FACK being used. And that seems to make sense
from the results I have (i.e. referring to
http://kom.aau.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_DT.pdf
http://kom.aau.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_ret.pdf
).
Now let's introduce DSACK and no timestamps. That means we are able to
detect some reordering, and the download time should decrease; and it
does, as shown in the first of the figures I just gave the links to.
However, the number of retransmissions increases, as shown in the second
figure. Isn't that odd? Shouldn't it be the other way around?
Also, why does the number of retransmissions in the timestamp case
increase when we use SACK/FACK, as compared with the no-SACK case? As
you mentioned earlier, reordering undoing works very well (comparing the
curves with and without timestamps), but some of its benefit seems to be
lost when we use it along with SACK, FACK, and DSACK, even though the
differences are not that big.
>There is one more thing, which probably happens in your experiments,
>though I did not find it in dumps. If reordering exceeds RTT, i.e.
>we receive SACK for a segment, which was sent as part of forward
>retransmission after a hole was detected, fast retransmit entered immediately.
>Two dupacks is enough for this: first triggers forward transmission,
>if the second SACKs the segmetn which has just been sent, we are there.
>
>
This one I don't think I understood. Could you please make it a bit
clearer?
>>One more thing, say I have FRTO, DSACK and timestamps enabled, which
>>algorithm takes precedence ?
>>
>>
>
>They live together, essnetially, not dependant.
>
>
OK... but if timestamps are enabled, then I just couldn't figure out the
use of DSACK. Can it tell us something more than we can find using
timestamps?
>Alexey
>
>
Regards,
Oumer
* Re: Weird TCP SACK problem. in Linux...
2006-07-19 15:02 ` Oumer Teyeb
@ 2006-07-19 15:49 ` Alexey Kuznetsov
2006-07-19 16:32 ` Oumer Teyeb
0 siblings, 1 reply; 10+ messages in thread
From: Alexey Kuznetsov @ 2006-07-19 15:49 UTC (permalink / raw)
To: Oumer Teyeb; +Cc: netdev
Hello!
> IsLost (SeqNum):
> This routine returns whether the given sequence number is
> considered to be lost. The routine returns true when either
> DupThresh discontiguous SACKed sequences have arrived above
> 'SeqNum' or (DupThresh * SMSS) bytes with sequence numbers greater
> than 'SeqNum' have been SACKed. Otherwise, the routine returns
> false.
It is not used. The metric is just the distance between snd.una and
the most forward SACK.
It could be changed but, to be honest, counting "discontiguous SACKed
sequences" looks really weird and totally unjustified.
You can look at the function tcp_time_to_recover() and replace
tcp_fackets_out(tp) > tp->reordering with something like
tp->sacked_out + 1 > tp->reordering. It is not as weird as what the RFC
recommends, but it should make some difference.
> so you are saying, it doesnt matter whether I disable FACK or not, it is
> basically set by default?
The condition triggering the start of fast retransmit is the same.
The behaviour during retransmit is different: FACKless code
behaves more like NewReno.
> and it is disabled only when reordering is detected (and this is done
> either through timestamps or DSACK, right?)...
> so if neither DSACK and timestamps are enabled we are unable to detect
> disorder, so basically there should be no difference between SACK and
> FACK, cause it is always FACK used... and that seems to make sense from
> the results I have
Yes. But FACKless TCP still retransmits less aggressively.
> the # of retransmissions increases as shown in the second figure? isnt
> that odd? shouldnt it be the other way around?
The oddest thing is that I see no correlation between the number of
retransmits and the download time in your graphs. Actually, the
correlation is negative. :-)
> Also why does the # retransmissions in the timestamp case increases when
> we use SACK/FACK as compared with no SACK case?
Excessive retransmissions still happen. Undoing just restores cwnd
and tries to increase the reordering metric to avoid future false
retransmits.
> This one , I dont think I understood you. Could you please make it a bit
> more clearer?
1. Suppose some segments, but not all, were delayed.
2. The sender sees a dupack with a SACK. It is the first one; the SACK
   allows opening the window for one segment, so you send one new
   segment at snd.nxt.
3. The receiver receives it before the delayed segments arrive.
4. When the sender sees the SACK for this segment, it assumes that all
   the delayed segments are lost.
> OK ...but if timestamps are enabled, then I just couldnt figure out the
> use of DSACK, can it tell us something more than we can find using
> timestamps??
It depends. Normally, no. If the network is fast, timestamps are just
too coarse to detect redundant retransmissions.
Plus, the heuristics based on timestamps essentially rely on a bug
in our timestamp processing code. The other side could have it fixed. :-)
Alexey
* Re: Weird TCP SACK problem. in Linux...
2006-07-19 15:49 ` Alexey Kuznetsov
@ 2006-07-19 16:32 ` Oumer Teyeb
2006-07-19 17:32 ` Oumer Teyeb
2006-07-20 23:23 ` Alexey Kuznetsov
0 siblings, 2 replies; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-19 16:32 UTC (permalink / raw)
To: Alexey Kuznetsov; +Cc: netdev
Hi,
Alexey Kuznetsov wrote:
>Condition triggering start of fast retransmit is the same.
>The behaviour while retransmit is different. FACKless code
>behaves more like NewReno.
>
>
OK, that is a good point! Now at least I can convince myself that the
CDFs of the first retransmissions, showing that SACK leads to earlier
retransmissions than no SACK, are not wrong. And I can even convince
myself that this is the real reason behind SACK/FACK's performance
degradation in the case of no timestamps :-)
>>and it is disabled only when reordering is detected (and this is done
>>either through timestamps or DSACK, right?)...
>>so if neither DSACK and timestamps are enabled we are unable to detect
>>disorder, so basically there should be no difference between SACK and
>>FACK, cause it is always FACK used... and that seems to make sense from
>>the results I have
>>
>>
>
>Yes. But FACKless tcp still retransmits less aggressively.
>
>
>
>>the # of retransmissions increases as shown in the second figure? isnt
>>that odd? shouldnt it be the other way around?
>>
>>
>
>The most odd is that I see no correlation between #of retransmits
>and download time in you graphs. Actually, the correlation is negative. :-)
>
>
>
Yeah, that is what confuses me the most. At
www.kom.auc.dk/~oumer/ret_vs_download.pdf
I have a plot summarizing two hundred runs for the four combinations of
SACK (on/off) and timestamps (on/off). I collected the retransmission
count from each run and averaged the download time for each
retransmission count, and I see no clear pattern. That is why I was
focusing more on when retransmissions are triggered rather than how many
there are. First, the earlier you enter the fast recovery phase (if you
don't revert it), the more time you spend in congestion avoidance, which
hurts the throughput quite a lot. Also, the number of times you enter
fast retransmit is more harmful than the number of retransmissions,
because unnecessary retransmissions during fast recovery cost some
bandwidth, but they don't damage the "future" of the connection as much
as a retransmission that drives TCP into fast recovery.
>>Also why does the # retransmissions in the timestamp case increases when
>>we use SACK/FACK as compared with no SACK case?
>>
>>
>
>Excessive retransmissions still happen. Undoing just restores cwnd
>and tries to increase reordering metric to avoid false retransmits.
>
>
>
Hmmm... I don't understand this. So if reordering can be detected (i.e.
we use timestamps or DSACK), the dupthresh is increased temporarily? OK,
this adds to the explanation of why the retransmissions are fewer in the
timestamp case than in the non-timestamp case (in addition to the fact
that with timestamps we get out of fast recovery earlier than in the
non-timestamp case, and hence also fewer retransmissions). But what I
was referring to was: if you use timestamps, then why the increase in
the number of retransmissions when we use FACK, SACK, or DSACK as
compared to the no-SACK case? Is this dupthresh increase documented
properly somewhere? In the Linux congestion control paper by you and
Pasi, you mention it briefly in section 5: "Linux fast recovery does not
fully follow RFC 2582... the sender adjusts the threshold for triggering
fast retransmit dynamically, based on the observed reordering in the
network...", but it doesn't exactly say how this dynamic adjustment is
done.
>1. Suppose, some segments, but not all, were delayed.
>2. Senders sees dupack with a SACK. It is the first, SACK allows to open
> window for one segment, you send one segment with snd.nxt.
>3. Receivers receives it before delayed segments arrived.
>4. When senders sees this SACK, it assumes that all the delayed
> segments are lost.
>
>
Thanks! It is very clear now! But it is basically the same effect (for
the explanation that I am seeking) as in the trace you quoted, right?
Two duplicate ACKs leading to a retransmission...
>>OK ...but if timestamps are enabled, then I just couldnt figure out the
>>use of DSACK, can it tell us something more than we can find using
>>timestamps??
>>
>>
>
>It depends. Normally, no. If the network is fast, timestamps are just
>too coarse to detect redundant retransmissions.
>
>Plus, the heuristcs based on timestamps essentially relies on a bug
>in our timestamps processing code. Another side could have it fixed. :-)
>
>
OK, for my studies it shouldn't matter, because I am using the buggy
code on both the sender and receiver :-) (though I don't understand what
this bug you are referring to is about :-)
>Alexey
>
>
* Re: Weird TCP SACK problem. in Linux...
2006-07-19 16:32 ` Oumer Teyeb
@ 2006-07-19 17:32 ` Oumer Teyeb
2006-07-20 15:41 ` Oumer Teyeb
2006-07-20 23:23 ` Alexey Kuznetsov
1 sibling, 1 reply; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-19 17:32 UTC (permalink / raw)
To: Alexey Kuznetsov; +Cc: netdev
Oumer Teyeb wrote:
> Hi,
>
> Alexey Kuznetsov wrote:
>
>> Condition triggering start of fast retransmit is the same.
>> The behaviour while retransmit is different. FACKless code
>> behaves more like NewReno.
>>
>>
> Ok, that is a good point!! Now at least I can convince myself the
> CDFs for the first retransmissions showing that SACK leads to earlier
> retransmissions than no SACK are not wrong....and I can even convince
> myself that this is the real reason behind sack/fack's performance
> degredation for the case of no timestamps,:-)... ...
Actually, then the increase in the number of retransmissions and the
increase in the download time from no SACK to SACK in the timestamp case
seem to make sense too. My reasoning is like this: if there are
timestamps, there is reordering detection, hence the number of
retransmissions is reduced, because we avoid the time spent in fast
recovery. When we introduce SACK on top of timestamps, we enter fast
retransmit earlier than in the no-SACK case, as we seem to agree, and
since timestamps reduce the number of retransmissions once we are in
fast recovery, the retransmissions we see are basically the first few
retransmissions that made us enter the false fast retransmit. So we have
a small increase in retransmissions and a small increase in download
times. But when no timestamps are used, there is no reordering
detection, and so SACK leads to fewer retransmissions, because it
retransmits selectively, but it does not improve the download time,
because it enters fast retransmit earlier than no SACK, and in this case
the fast retransmits are very costly, because they are not detected as
spurious and lead to a window reduction. Am I making sense? :-) Still,
the DSACK case is puzzling me...
Regards,
Oumer
* Re: Weird TCP SACK problem. in Linux...
2006-07-19 17:32 ` Oumer Teyeb
@ 2006-07-20 15:41 ` Oumer Teyeb
0 siblings, 0 replies; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-20 15:41 UTC (permalink / raw)
To: Alexey Kuznetsov; +Cc: netdev
Hi Alexey,
Is there anything Linux-specific about the DSACK implementation that
might lead to an increase in the number of retransmissions but an
improvement in download time when timestamps are not used (and the
reverse effect when timestamps are used: fewer retransmissions but
longer download times)? I couldn't figure it out... Also, is the
reordering response of Linux TCP described anywhere? (It seems the
dupthreshold is dynamically adjusted based on the reordering history,
but I was not able to find out how...)
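For context on the dynamic adjustment asked about here: Linux keeps a per-connection reordering metric (tp->reordering, which plays the role of the dupACK threshold) and raises it toward the largest displacement observed, updated in tcp_update_reordering() in net/ipv4/tcp_input.c. A hedged Python sketch of the idea (the constant values and names below are assumptions for illustration, not the kernel's):

```python
# Sketch of the idea behind Linux's dynamic dupACK threshold: the
# per-connection reordering metric starts at the classic value of 3
# and is raised to the largest packet displacement seen so far, up to
# a cap. Constants are illustrative; the real logic lives in
# tcp_update_reordering() in net/ipv4/tcp_input.c.

REORDERING_INITIAL = 3    # classic dupthresh
REORDERING_CAP = 127      # assumed upper bound for this sketch

def update_reordering(current, observed_displacement):
    """Return the new reordering metric after observing a displacement."""
    return min(max(current, observed_displacement), REORDERING_CAP)

metric = REORDERING_INITIAL
for displacement in (2, 7, 5, 300):     # sample observed displacements
    metric = update_reordering(metric, displacement)
print(metric)  # 127: the metric ratchets up but is capped
```

The key point is that the metric only ratchets upward within a connection, so one detected reordering event makes the sender permanently more tolerant of dupACKs for that flow.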
Oumer Teyeb wrote:
> Oumer Teyeb wrote:
>
>> Hi,
>>
>> Alexey Kuznetsov wrote:
>>
>>> Condition triggering start of fast retransmit is the same.
>>> The behaviour while retransmit is different. FACKless code
>>> behaves more like NewReno.
>>>
>>>
>> Ok, that is a good point!! Now at least I can convince myself that the
>> CDFs for the first retransmissions, showing that SACK leads to earlier
>> retransmissions than no SACK, are not wrong... and I can even convince
>> myself that this is the real reason behind SACK/FACK's performance
>> degradation in the no-timestamps case :-)...
>
>
> Actually, the increase in the number of retransmissions and the
> increase in the download time from no SACK to SACK in the timestamp
> case seems to make sense as well. My reasoning is this: if timestamps
> are used, reordering can be detected, so the number of retransmissions
> is reduced because we avoid the time spent in fast recovery. When we
> introduce SACK on top of timestamps, we enter fast retransmit earlier
> than in the no-SACK case, as we seem to agree; and since timestamps
> reduce the number of retransmissions once we are in fast recovery, the
> retransmissions we see are basically the first few that made us enter
> the false fast retransmit. So we get a small increase in
> retransmissions and a small increase in download time. But when
> timestamps are not used, there is no reordering detection, so SACK
> leads to fewer retransmissions because it retransmits selectively; it
> doesn't improve the download time, though, because it enters fast
> retransmit earlier than no SACK, and in that case the fast retransmits
> are very costly: they are not detected as spurious and so lead to
> window reduction. Am I making sense? :-) ... still the DSACK case is
> puzzling me....
>
> Regards,
> Oumer
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Weird TCP SACK problem. in Linux...
2006-07-19 16:32 ` Oumer Teyeb
2006-07-19 17:32 ` Oumer Teyeb
@ 2006-07-20 23:23 ` Alexey Kuznetsov
1 sibling, 0 replies; 10+ messages in thread
From: Alexey Kuznetsov @ 2006-07-20 23:23 UTC (permalink / raw)
To: Oumer Teyeb; +Cc: netdev
Hello!
> Hmmm... I don't understand this... so if reordering can be detected (i.e.
> we use timestamps, DSACK), the dupthreshold is increased
Yes.
> implementation that might lead to an increase in the number of
> retransmissions, but leads to improvement in download time
Hmm... I have thought about it and still do not know.
> couldn't figure it out... also, is the reordering
> response of Linux TCP described anywhere? (it seems dupthreshold is
> dynamically adjusted based on the reordering history... but I was not
> able to find out how...)
That's comment from tcp_input:
* Reordering detection.
* --------------------
* Reordering metric is maximal distance, which a packet can be displaced
* in packet stream. With SACKs we can estimate it:
*
* 1. SACK fills old hole and the corresponding segment was not
* ever retransmitted -> reordering. Alas, we cannot use it
* when segment was retransmitted.
* 2. The last flaw is solved with D-SACK. D-SACK arrives
* for retransmitted and already SACKed segment -> reordering..
Alexey
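The two detection rules in the quoted comment can be restated as a tiny predicate (a sketch only; the boolean parameters are invented names for illustration, not kernel fields):

```python
def reordering_detected(fills_old_hole, was_retransmitted, dsack_arrived):
    """Restates the two rules from the tcp_input.c comment above."""
    # Rule 1: a SACK fills an old hole and the segment was never
    # retransmitted -> the segment was displaced, not lost.
    if fills_old_hole and not was_retransmitted:
        return True
    # Rule 2: a D-SACK arrives for a segment that was retransmitted and
    # already SACKed -> the original copy arrived late -> reordering.
    if dsack_arrived and was_retransmitted:
        return True
    return False

print(reordering_detected(True, False, False))   # True  (rule 1)
print(reordering_detected(True, True, False))    # False (rule 1 blocked by retransmit)
print(reordering_detected(False, True, True))    # True  (rule 2)
```

This also shows why D-SACK matters for the experiments above: without it (rule 2), a retransmitted segment can never contribute to the reordering estimate, so the dupthreshold stays lower and spurious fast retransmits remain likely.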
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2006-07-20 23:23 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-07-18 19:38 Weird TCP SACK problem. in Linux Oumer Teyeb
2006-07-19 9:38 ` Xiaoliang (David) Wei
2006-07-19 10:00 ` Oumer Teyeb
2006-07-19 13:27 ` Alexey Kuznetsov
2006-07-19 15:02 ` Oumer Teyeb
2006-07-19 15:49 ` Alexey Kuznetsov
2006-07-19 16:32 ` Oumer Teyeb
2006-07-19 17:32 ` Oumer Teyeb
2006-07-20 15:41 ` Oumer Teyeb
2006-07-20 23:23 ` Alexey Kuznetsov