netdev.vger.kernel.org archive mirror
* Weird TCP SACK problem. in Linux...
@ 2006-07-18 19:38 Oumer Teyeb
  2006-07-19  9:38 ` Xiaoliang (David) Wei
  2006-07-19 13:27 ` Alexey Kuznetsov
  0 siblings, 2 replies; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-18 19:38 UTC (permalink / raw)
  To: netdev

Hello Guys,

I have some questions regarding the TCP SACK implementation in Linux.
As I am not a subscriber, could you please CC the reply to me? Thanks!

I am doing these experiments to find out the impact of reordering. So I
have the different TCP variants (NewReno, SACK, FACK, DSACK, FRTO, ...)
as implemented in Linux, and I am trying their combinations to see how
they behave. What struck me was that when I don't use timestamps,
introducing SACK increases the download time but decreases the total
number of retransmissions. When timestamps are used, SACK leads to an
increase in both the download time and the retransmissions.

So I looked further into the results, and what I found was that when
SACK (when I refer to SACK here, I mean SACK only, without FACK and
DSACK) is used, the retransmissions seem to happen earlier. At
www.kom.auc.dk/~oumer/first_transmission_times.pdf you can find a plot
of the CDF of the time at which the first TCP retransmission occurred,
for the four combinations of SACK and timestamps, after hundreds of
downloads of a 100K file under network reordering.

This explains why the download time increases with SACK: the earlier we
go into fast recovery, the longer the time we spend in congestion
avoidance, and the longer the download time. I am not 100% sure that
the retransmissions are due only to reordering, as I am using tcptrace
to get my results, but I am guessing they are not caused by timeouts,
because when I used FRTO there was no improvement, showing that there
were indeed no timeouts (as FRTO acts only on timeouts).

But I couldn't figure out why the retransmissions occur earlier for
SACK than for non-SACK TCP. As far as I know, for both the SACK and
non-SACK cases we need three (or more, according to the setting)
duplicate ACKs to enter the fast retransmit/recovery state, which
should have resulted in the same behaviour up to the first occurrence
of a retransmission. Or is there some undocumented enhancement in Linux
TCP when using SACK that makes it enter fast retransmit earlier? The
only explanation I could imagine is something like this:

non SACK case
=============
1 2 3 4 5 6 7 8 9 10 were sent and 2 was reordered. Assume we are using
delayed ACKs, so we get the third duplicate ACK after packet 8 is
received (3&4 -> first duplicate ACK, 5&6 -> second duplicate ACK,
7&8 -> third duplicate ACK).

So if SACK behaved like this:

3&4 SACKed: 2 packets received out of order.
5&6 SACKed: 4 packets received out of order -> start fast retransmission,
as the reordering count is greater than 3. (This is true when it comes to
marking packets as lost during fast recovery, but is it also true for
triggering the first retransmission? A toy illustration of the two ways
of counting follows below.)
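
(For concreteness, here is a small stand-alone C sketch of the two ways
of counting in the hypothetical scenario above. It is purely
illustrative and not kernel code: the delayed-ACK pairing, the threshold
of 3, and the assumption that packet 2 arrives only after packet 10 are
all taken from the description.)

/* Packets 1..10 sent, packet 2 delayed; the receiver sees 1, 3, 4, ..., 10.
 * "classic" counts delayed-ACK duplicate ACKs (one per pair of segments);
 * "SACK-counting" counts segments SACKed above the hole, as hypothesised. */
#include <stdio.h>

int main(void)
{
    const int dupthresh = 3;
    int dupacks = 0, sacked = 0;
    int classic_at = 0, sack_at = 0;

    for (int pkt = 3; pkt <= 10; pkt++) {
        sacked++;                 /* one more segment above the hole */
        if (pkt % 2 == 0)         /* delayed ACK assumed: one dupack per pair */
            dupacks++;
        if (!classic_at && dupacks == dupthresh)
            classic_at = pkt;     /* third duplicate ACK */
        if (!sack_at && sacked > dupthresh)
            sack_at = pkt;        /* more than 3 segments out of order */
    }
    printf("classic rule: fast retransmit after packet %d arrives\n", classic_at);
    printf("SACK-counting rule: fast retransmit after packet %d arrives\n", sack_at);
    return 0;
}

Under these assumptions the hypothesised SACK-based count would trigger
fast retransmit after packet 6, two packets earlier than the classic
dupack count, which triggers after packet 8.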

Any ideas why this is happening?

One more thing: say I have FRTO, DSACK and timestamps enabled; which
algorithm takes precedence? If FRTO is enabled, is all spurious timeout
detection done through FRTO, or through a combination?

Thanks in advance,
Oumer


* Re: Weird TCP SACK problem. in Linux...
  2006-07-18 19:38 Weird TCP SACK problem. in Linux Oumer Teyeb
@ 2006-07-19  9:38 ` Xiaoliang (David) Wei
  2006-07-19 10:00   ` Oumer Teyeb
  2006-07-19 13:27 ` Alexey Kuznetsov
  1 sibling, 1 reply; 10+ messages in thread
From: Xiaoliang (David) Wei @ 2006-07-19  9:38 UTC (permalink / raw)
  To: Oumer Teyeb; +Cc: netdev

Hi Oumer,

    Your results are interesting. Just a few questions, inline with your text:

> So I looked further into the results, and what I found was that when
> SACK (when I refer to SACK here, I mean SACK only, without FACK and
> DSACK) is used, the retransmissions seem to happen earlier. At
> www.kom.auc.dk/~oumer/first_transmission_times.pdf you can find a plot
> of the CDF of the time at which the first TCP retransmission occurred,
> for the four combinations of SACK and timestamps, after hundreds of
> downloads of a 100K file under network reordering.

Could you give a bit more detail on the scenario? For example: what are
your RTT, capacity, etc.? Which Linux version? Is the packet size 1.5K?
Then 100K is about 66 packets. Do the flows finish slow start or not?
Also, what is the reordering level? Are you using Dummynet or a real
network?


> But I couldn't figure out why the retransmissions occur earlier for
> SACK than for non-SACK TCP. As far as I know, for both the SACK and
> non-SACK cases we need three (or more, according to the setting)
> duplicate ACKs to enter the fast retransmit/recovery state, which
> should have resulted in the same behaviour up to the first occurrence
> of a retransmission. Or is there some undocumented enhancement in Linux
> TCP when using SACK that makes it enter fast retransmit earlier? The
> only explanation I could imagine is something like this:

Are you sure FACK is turned OFF? FACK might retransmit earlier if you
have packet reordering, I think.


> non SACK case
> =============
> 1 2 3 4 5 6 7 8 9 10 were sent and 2 was reordered. Assume we are using
> delayed ACKs, so we get the third duplicate ACK after packet 8 is
> received (3&4 -> first duplicate ACK, 5&6 -> second duplicate ACK,
> 7&8 -> third duplicate ACK).
>
> So if SACK behaved like this:
>
> 3&4 SACKed: 2 packets received out of order.
> 5&6 SACKed: 4 packets received out of order -> start fast retransmission,
> as the reordering count is greater than 3. (This is true when it comes to
> marking packets as lost during fast recovery, but is it also true for
> triggering the first retransmission?)

I guess delayed ACK is effectively disabled when there is packet
reordering: the receiver sends one ACK for each data packet whenever
there are out-of-order packets in its queue. So we will get duplicate
ACKs earlier than in the scenario you describe above (a tiny sketch of
that receiver behaviour follows below).
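
(A minimal sketch, assuming standard receiver behaviour, of the ACK
policy being described; the names and the simplified bookkeeping here
are illustrative, and this is not the Linux receiver code.)

/* Sketch of when a receiver ACKs immediately instead of delaying:
 * any out-of-order arrival, and any arrival while the out-of-order
 * queue is non-empty, produces an immediate (duplicate) ACK. */
#include <stdbool.h>
#include <stdio.h>

static bool ack_immediately(bool out_of_order, bool ooo_queue_nonempty)
{
    return out_of_order || ooo_queue_nonempty;
}

int main(void)
{
    /* arrival order from the scenario above: packet 2 arrives last */
    unsigned int arrival[] = { 1, 3, 4, 5, 6, 7, 8, 2 };
    unsigned int expected = 1;
    bool ooo_pending = false;

    for (int i = 0; i < 8; i++) {
        bool ooo = arrival[i] != expected;
        printf("segment %u: %s ACK\n", arrival[i],
               ack_immediately(ooo, ooo_pending) ? "immediate dup" : "delayable");
        if (ooo)
            ooo_pending = true;
        else
            expected = arrival[i] + 1;
    }
    return 0;
}

With this policy every segment after the reordered one generates its own
duplicate ACK, so the third dupack arrives well before packet 8.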


> One more thing: say I have FRTO, DSACK and timestamps enabled; which
> algorithm takes precedence? If FRTO is enabled, is all spurious timeout
> detection done through FRTO, or through a combination?

They are compatible, I think.

When the retransmission timer times out, the sender first goes through
FRTO. If FRTO finds that it was a real loss, it falls back to the
traditional timeout processing, as specified in the FRTO algorithm.

-David

-- 
Xiaoliang (David) Wei      Graduate Student, CS@Caltech
http://davidwei.org
***********************************************


* Re: Weird TCP SACK problem. in Linux...
  2006-07-19  9:38 ` Xiaoliang (David) Wei
@ 2006-07-19 10:00   ` Oumer Teyeb
  0 siblings, 0 replies; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-19 10:00 UTC (permalink / raw)
  To: Xiaoliang (David) Wei; +Cc: netdev

Hi David,

I am using an emulator that I developed using netfilter (see
http://kom.aau.dk/~oumer/publications/VTC05.pdf for a description of the
emulator). I emulate a UMTS network with an RTT of 150 ms and a 384 kbps
connection. There is a UMTS frame erasure rate of 10%, but I have
persistent link-layer retransmission, which means nothing is actually
lost. Because of these link-layer errors, some packets arrive out of
order, and the effect of that on TCP performance is what I am after. I
am using Linux 2.4.

I have put more detailed traces at
www.kom.auc.dk/~oumer/sackstuff.tar.gz
I have run each of the different cases 10 times:

NT_NSACK[1-10].dat --- no timestamp, no SACK
NT_SACK[1-10].dat ---- no timestamp, SACK
T_NSACK[1-10].dat --- timestamp, no SACK
T_SACK[1-10].dat ---- timestamp, SACK

(By no SACK I mean SACK, DSACK and FACK all disabled. I also have
results when they are enabled; see below for curves illustrating the
different cases.)

The files without an extension are two-column files that summarize the
ten runs for the four different cases: the first column is the number
of retransmissions and the second column is the download time; the
values are gathered from tcptrace.

The two eps files are plots summarizing the average download time and
average number of retransmissions for each case.

One more thing: in the trace files you will find 3 TCP connections. The
first one is not modified by my emulator that causes the reordering (it
is actually the connection through which I reset the destination cache,
which stores some metrics from previous runs, using some commands via
ssh); the second one is the FTP control channel and the third one is
the FTP data channel. The emulator affects the last two channels and
causes reordering once in a while. Please don't hesitate to ask me if
anything is not clear.

Also, I have put the final curves of all my emulations, showing the
download times and the percentage of retransmissions (#retransmissions
/ total packets sent), at
www.kom.auc.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_DT.pdf
www.kom.auc.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_ret.pdf

There are a lot of other things that I don't understand from these two
curves. The most bizarre one (apart from the SACK issue that started
this discussion) is why DSACK leads to increased retransmissions when
used without timestamps. (The behaviour is fine in terms of download
time, which DSACK reduces, showing that DSACK-based spurious
retransmission detection is at work.)

Thanks a lot for taking the time

Regards,
Oumer








* Re: Weird TCP SACK problem. in Linux...
  2006-07-18 19:38 Weird TCP SACK problem. in Linux Oumer Teyeb
  2006-07-19  9:38 ` Xiaoliang (David) Wei
@ 2006-07-19 13:27 ` Alexey Kuznetsov
  2006-07-19 15:02   ` Oumer Teyeb
  1 sibling, 1 reply; 10+ messages in thread
From: Alexey Kuznetsov @ 2006-07-19 13:27 UTC (permalink / raw)
  To: Oumer Teyeb; +Cc: netdev

Hello!

> DSACK) is used, the retransmissions seem to happen earlier.

Yes. With SACK/FACK, retransmissions can be triggered earlier if an ACK
SACKs a segment which is far enough above the current snd.una. That is
what happens, for example, in T_SACK_dump5.dat:

01:28:15.681050 < 192.38.55.34.51137 > 192.168.110.111.42238: P 18825:20273[31857](1448) ack 1/5841 win 5840/0 <nop,nop,timestamp 418948058 469778216> [|] (DF)(ttl 64, id 19165)
01:28:15.800946 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778229 418948031,nop,nop, sack 1 {10137:11585} > (DF) [tos 0x8]  (ttl 62, id 45508)
01:28:15.860773 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778235 418948031,nop,nop, sack 2 {13033:14481}{10137:11585} > (DF) [tos 0x8]  (ttl 62, id 45509)
01:28:15.860781 < 192.38.55.34.51137 > 192.168.110.111.42238: . 8689:10137[31857](1448) ack 1/5841 win 5840/0 <nop,nop,timestamp 418948076 469778235> [|] (DF) (ttl 64, id 19166)

The second SACK confirms that 13033..14481 has already arrived.

And this is not even a mistake; the third dupack arrived immediately:
01:28:15.901382 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778238 418948031,nop,nop, sack 2 {13033:15929}{10137:11585} > (DF) [tos 0x8]  (ttl 62, id 45510)

Actually, this is the reason why the FACK heuristic is not disabled
even when FACK is disabled. Experiments showed that relaxing it
severely damages recovery in the presence of real multiple losses. And
when it turns out to be reordering, undoing works really well.


There is one more thing which probably happens in your experiments,
though I did not find it in the dumps. If reordering exceeds the RTT,
i.e. we receive a SACK for a segment which was sent as a forward
transmission after a hole was detected, fast retransmit is entered
immediately. Two dupacks are enough for this: the first triggers the
forward transmission; if the second SACKs the segment which has just
been sent, we are there.

> One more thing: say I have FRTO, DSACK and timestamps enabled; which
> algorithm takes precedence?

They live together; essentially, they are not dependent on each other.

Alexey


* Re: Weird TCP SACK problem. in Linux...
  2006-07-19 13:27 ` Alexey Kuznetsov
@ 2006-07-19 15:02   ` Oumer Teyeb
  2006-07-19 15:49     ` Alexey Kuznetsov
  0 siblings, 1 reply; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-19 15:02 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: netdev

Hi,

Alexey Kuznetsov wrote:

>Hello!
>
>  
>
>> DSACK) is used, the retransmissions seem to happen earlier.
>
> Yes. With SACK/FACK, retransmissions can be triggered earlier if an ACK
> SACKs a segment which is far enough above the current snd.una. That is
> what happens, for example, in T_SACK_dump5.dat:
>
>01:28:15.681050 < 192.38.55.34.51137 > 192.168.110.111.42238: P 18825:20273[31857](1448) ack 1/5841 win 5840/0 <nop,nop,timestamp 418948058 469778216> [|] (DF)(ttl 64, id 19165)
>01:28:15.800946 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778229 418948031,nop,nop, sack 1 {10137:11585} > (DF) [tos 0x8]  (ttl 62, id 45508)
>01:28:15.860773 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778235 418948031,nop,nop, sack 2 {13033:14481}{10137:11585} > (DF) [tos 0x8]  (ttl 62, id 45509)
>01:28:15.860781 < 192.38.55.34.51137 > 192.168.110.111.42238: . 8689:10137[31857](1448) ack 1/5841 win 5840/0 <nop,nop,timestamp 418948076 469778235> [|] (DF) (ttl 64, id 19166)
>
> The second SACK confirms that 13033..14481 has already arrived.
>
> And this is not even a mistake; the third dupack arrived immediately:
>01:28:15.901382 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0 <nop,nop,timestamp 469778238 418948031,nop,nop, sack 2 {13033:15929}{10137:11585} > (DF) [tos 0x8]  (ttl 62, id 45510)
>  
>
Thanks a lot, Alexey, for pointing that out! That was more or less what
I was assuming. But is this feature of Linux TCP documented somewhere?
As far as I can see, I could not find it in Pasi's paper. In the
conservative SACK-based loss recovery RFC (RFC 3517) it is clearly
stated that:

   Upon the receipt of the first (DupThresh - 1) duplicate ACKs, the
   scoreboard is to be updated as normal.  Note: The first and second
   duplicate ACKs can also be used to trigger the transmission of
   previously unsent segments using the Limited Transmit algorithm
   [RFC3042].

   When a TCP sender receives the duplicate ACK corresponding to
   DupThresh ACKs, the scoreboard MUST be updated with the new SACK
   information (via Update ()).  If no previous loss event has occurred
   on the connection or the cumulative acknowledgment point is beyond
   the last value of RecoveryPoint, a loss recovery phase SHOULD be
   initiated, per the fast retransmit algorithm outlined in [RFC2581].

Of course, once we are in the fast recovery phase we are able to mark a
packet as lost based on this criterion (also from the same RFC; a small
illustrative sketch of the routine follows the excerpt):

IsLost (SeqNum):
      This routine returns whether the given sequence number is
      considered to be lost.  The routine returns true when either
      DupThresh discontiguous SACKed sequences have arrived above
      'SeqNum' or (DupThresh * SMSS) bytes with sequence numbers greater
      than 'SeqNum' have been SACKed.  Otherwise, the routine returns
      false.
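
(For reference, a minimal stand-alone C sketch of how IsLost() could be
evaluated over a toy SACK scoreboard. The data structure and names are
made up for illustration, blocks straddling SeqNum are ignored for
simplicity, and this is not the Linux code; the example only plugs in
the two SACK blocks from the trace above.)

/* Toy IsLost() per RFC 3517: SeqNum counts as lost if DupThresh
 * discontiguous SACKed blocks lie above it, or if at least
 * DupThresh * SMSS SACKed bytes lie above it. */
#include <stdbool.h>
#include <stdio.h>

#define DUPTHRESH 3
#define SMSS      1448

struct sack_block { unsigned int start, end; };  /* [start, end) */

static bool is_lost(unsigned int seq, const struct sack_block *b, int n)
{
    int blocks_above = 0;
    unsigned int bytes_above = 0;

    for (int i = 0; i < n; i++) {
        if (b[i].start > seq) {
            blocks_above++;
            bytes_above += b[i].end - b[i].start;
        }
    }
    return blocks_above >= DUPTHRESH || bytes_above >= DUPTHRESH * SMSS;
}

int main(void)
{
    /* The two SACKed blocks from the trace: 10137-11585 and 13033-14481. */
    struct sack_block sb[] = { { 10137, 11585 }, { 13033, 14481 } };

    /* snd.una is 8689: only 2 blocks and 2896 bytes lie above it. */
    printf("IsLost(8689) = %d\n", is_lost(8689, sb, 2));
    return 0;
}

Under these assumptions IsLost(8689) is false, which supports the point
below: by the RFC's criterion the segment was not yet lost at the moment
Linux retransmitted it.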

But from the trace portion you quoted, it seems the SACK implementation
in Linux simply checked the sequence number of the newly SACKed block
and, finding that there are two blocks in between, treated it as if the
dupthresh duplicate ACK had arrived and retransmitted. So if we were not
using SACK, the retransmission would have occurred after 01:28:15.90,
i.e. TCP SACK retransmitted around 50 ms earlier in this case, and the
difference might be larger in other cases. (I will try to look through
the traces for larger time differences, but you can already see a clear
difference in the plots of the CDF of the time of the first
retransmission for the different cases at
http://kom.aau.dk/~oumer/first_transmission_times.pdf.) So I am on the
verge of concluding that TCP SACK is worse than non-SACK TCP in the
case of persistent reordering, if only I could find a reference for the
Linux TCP SACK behaviour we discussed above :-)

> Actually, this is the reason why the FACK heuristic is not disabled
> even when FACK is disabled. Experiments showed that relaxing it
> severely damages recovery in the presence of real multiple losses. And
> when it turns out to be reordering, undoing works really well.
>  
>
So you are saying it doesn't matter whether I disable FACK or not; it
is basically on by default, and it is disabled only when reordering is
detected (and this is done either through timestamps or DSACK, right?).
So if neither DSACK nor timestamps are enabled, we are unable to detect
reordering, so basically there should be no difference between SACK and
FACK, because it is always FACK that is used. That seems to make sense
from the results I have (i.e. referring to
http://kom.aau.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_DT.pdf
http://kom.aau.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_ret.pdf
).

Now let's introduce DSACK and no timestamps. That means we are able to
detect some reordering, and the download time should decrease; it does,
as shown in the first of the figures I just gave the links to. However,
the number of retransmissions increases, as shown in the second figure.
Isn't that odd? Shouldn't it be the other way around?

Also, why does the number of retransmissions in the timestamp case
increase when we use SACK/FACK compared with the no-SACK case? As you
mentioned earlier, reordering undoing works very well, judging by a
comparison of the curves with and without timestamps, but some of that
benefit seems to be lost when timestamps are used along with SACK, FACK
and DSACK, even though the differences are not that large.

> There is one more thing which probably happens in your experiments,
> though I did not find it in the dumps. If reordering exceeds the RTT,
> i.e. we receive a SACK for a segment which was sent as a forward
> transmission after a hole was detected, fast retransmit is entered
> immediately. Two dupacks are enough for this: the first triggers the
> forward transmission; if the second SACKs the segment which has just
> been sent, we are there.
>  
>
This one I don't think I understood. Could you please make it a bit
clearer?

>> One more thing: say I have FRTO, DSACK and timestamps enabled; which
>> algorithm takes precedence?
>
> They live together; essentially, they are not dependent on each other.
>  
>
OK, but if timestamps are enabled, then I just couldn't figure out the
use of DSACK. Can it tell us something more than we can find out using
timestamps?

>Alexey
>  
>
Regards,
Oumer



* Re: Weird TCP SACK problem. in Linux...
  2006-07-19 15:02   ` Oumer Teyeb
@ 2006-07-19 15:49     ` Alexey Kuznetsov
  2006-07-19 16:32       ` Oumer Teyeb
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Kuznetsov @ 2006-07-19 15:49 UTC (permalink / raw)
  To: Oumer Teyeb; +Cc: netdev

Hello!

> IsLost (SeqNum):
>      This routine returns whether the given sequence number is
>      considered to be lost.  The routine returns true when either
>      DupThresh discontiguous SACKed sequences have arrived above
>      'SeqNum' or (DupThresh * SMSS) bytes with sequence numbers greater
>      than 'SeqNum' have been SACKed.  Otherwise, the routine returns
>      false.

It is not used. The metric is just the distance between snd.una and the
most forward SACK.

It could be changed but, to be honest, counting "discontiguous SACKed
sequences" looks really weird and totally unjustified.

You can look at the function tcp_time_to_recover() and replace
tcp_fackets_out(tp) > tp->reordering with something like
tp->sacked_out + 1 > tp->reordering. It is not as weird as what the RFC
recommends, but it should make some difference.
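
(As a rough user-space sketch of the two conditions, with illustrative
numbers taken from the trace earlier in the thread. The struct and the
counters here are stand-ins, not the kernel's tcp_opt fields, and the
actual check should be read from net/ipv4/tcp_input.c rather than from
this.)

/* Compare the current trigger (distance in segments between snd.una and
 * the most forward SACKed byte) with the relaxed variant suggested
 * above (number of SACKed segments plus one). */
#include <stdbool.h>
#include <stdio.h>

struct toy_state {
    int fackets_out;    /* segments between snd.una and highest SACK */
    int sacked_out;     /* segments actually SACKed */
    int reordering;     /* dynamic dupthresh, 3 by default */
};

static bool recover_fack_rule(const struct toy_state *s)
{
    return s->fackets_out > s->reordering;      /* current rule */
}

static bool recover_sack_rule(const struct toy_state *s)
{
    return s->sacked_out + 1 > s->reordering;   /* suggested change */
}

int main(void)
{
    /* From the trace: snd.una = 8689, SACKed blocks 10137-11585 and
     * 13033-14481, MSS 1448 -> forward distance of about 4 segments,
     * but only 2 segments actually SACKed. */
    struct toy_state s = { .fackets_out = 4, .sacked_out = 2, .reordering = 3 };

    printf("FACK-distance rule fires: %d\n", recover_fack_rule(&s));
    printf("sacked_out+1 rule fires:  %d\n", recover_sack_rule(&s));
    return 0;
}

With these numbers the distance-based rule already fires on the second
dupack, while the sacked_out-based rule does not.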


> So you are saying it doesn't matter whether I disable FACK or not; it
> is basically on by default?

The condition triggering the start of fast retransmit is the same. The
behaviour during retransmission is different: FACKless code behaves
more like NewReno.


> and it is disabled only when reordering is detected (and this is done
> either through timestamps or DSACK, right?). So if neither DSACK nor
> timestamps are enabled, we are unable to detect reordering, so
> basically there should be no difference between SACK and FACK, because
> it is always FACK that is used. That seems to make sense from the
> results I have.

Yes. But FACKless TCP still retransmits less aggressively.


> the number of retransmissions increases, as shown in the second
> figure. Isn't that odd? Shouldn't it be the other way around?

The oddest thing is that I see no correlation between the number of
retransmits and the download time in your graphs. Actually, the
correlation is negative. :-)


> Also, why does the number of retransmissions in the timestamp case
> increase when we use SACK/FACK compared with the no-SACK case?

Excessive retransmissions still happen. Undoing just restores cwnd and
tries to increase the reordering metric to avoid future false
retransmits.


> This one I don't think I understood. Could you please make it a bit
> clearer?

1. Suppose some segments, but not all, were delayed.
2. The sender sees a dupack with a SACK. It is the first one; the SACK
   allows the window to open for one segment, so you send one new
   segment at snd.nxt.
3. The receiver receives it before the delayed segments arrive.
4. When the sender sees the SACK for this new segment, it assumes that
   all the delayed segments are lost.


> OK, but if timestamps are enabled, then I just couldn't figure out the
> use of DSACK. Can it tell us something more than we can find out using
> timestamps?

It depends. Normally, no. If the network is fast, timestamps are just
too coarse to detect redundant retransmissions.

Plus, the heuristic based on timestamps essentially relies on a bug in
our timestamp processing code. The other side could have it fixed. :-)

Alexey


* Re: Weird TCP SACK problem. in Linux...
  2006-07-19 15:49     ` Alexey Kuznetsov
@ 2006-07-19 16:32       ` Oumer Teyeb
  2006-07-19 17:32         ` Oumer Teyeb
  2006-07-20 23:23         ` Alexey Kuznetsov
  0 siblings, 2 replies; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-19 16:32 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: netdev

Hi,

Alexey Kuznetsov wrote:

> The condition triggering the start of fast retransmit is the same. The
> behaviour during retransmission is different: FACKless code behaves
> more like NewReno.
>  
>
OK, that is a good point! Now at least I can convince myself that the
CDFs of the first retransmissions, showing that SACK leads to earlier
retransmissions than no SACK, are not wrong, and I can even convince
myself that this is the real reason behind SACK/FACK's performance
degradation in the no-timestamps case :-)

>> and it is disabled only when reordering is detected (and this is done
>> either through timestamps or DSACK, right?). So if neither DSACK nor
>> timestamps are enabled, we are unable to detect reordering, so
>> basically there should be no difference between SACK and FACK, because
>> it is always FACK that is used. That seems to make sense from the
>> results I have.
>
> Yes. But FACKless TCP still retransmits less aggressively.
>
>> the number of retransmissions increases, as shown in the second
>> figure. Isn't that odd? Shouldn't it be the other way around?
>
> The oddest thing is that I see no correlation between the number of
> retransmits and the download time in your graphs. Actually, the
> correlation is negative. :-)
>
>  
>
Yeah, that was what confused me the most. At
www.kom.auc.dk/~oumer/ret_vs_download.pdf
I have a plot summarizing two hundred runs for the four combinations of
SACK (on/off) and timestamps (on/off). I collected the retransmission
count from each run and averaged the download time for each
retransmission count, and I see no clear pattern. That is why I was
focusing more on when retransmissions are triggered rather than on how
many there are. First, the earlier you enter the fast recovery phase
(if you don't revert it), the more time you spend in congestion
avoidance, and that hurts the throughput quite a lot. Second, the
number of times you enter fast retransmit is more harmful than the
number of retransmissions, because unnecessary retransmissions during
fast recovery cost some bandwidth but do not damage the "future" of the
connection as much as a retransmission that drives TCP into fast
recovery.

>> Also, why does the number of retransmissions in the timestamp case
>> increase when we use SACK/FACK compared with the no-SACK case?
>
> Excessive retransmissions still happen. Undoing just restores cwnd and
> tries to increase the reordering metric to avoid future false
> retransmits.
>
>  
>
Hmmm, I don't understand this. So if reordering can be detected (i.e.
we use timestamps or DSACK), the dupthresh is increased temporarily?
OK, this adds to the explanation of why the retransmissions are fewer
in the timestamp case than in the non-timestamp case (in addition to
the fact that with timestamps we get out of fast recovery earlier than
without timestamps, and hence also retransmit less). But what I was
referring to was: if you use timestamps, why does the number of
retransmissions increase when we use FACK, SACK or DSACK compared with
the no-SACK case? Is this dupthresh increase documented properly
somewhere? In the Linux congestion control paper by you and Pasi, you
mention it briefly in section 5: "Linux fast recovery does not fully
follow RFC 2582... the sender adjusts the threshold for triggering fast
retransmit dynamically, based on the observed reordering in the
network...", but it does not say exactly how this dynamic adjustment is
done.

> 1. Suppose some segments, but not all, were delayed.
> 2. The sender sees a dupack with a SACK. It is the first one; the SACK
>    allows the window to open for one segment, so you send one new
>    segment at snd.nxt.
> 3. The receiver receives it before the delayed segments arrive.
> 4. When the sender sees the SACK for this new segment, it assumes that
>    all the delayed segments are lost.
>  
>
Thanks! It is very clear now. But for the explanation that I am seeking
it is basically the same effect as in the trace you quoted, right? Two
duplicate ACKs leading to a retransmission.

>> OK, but if timestamps are enabled, then I just couldn't figure out the
>> use of DSACK. Can it tell us something more than we can find out using
>> timestamps?
>
> It depends. Normally, no. If the network is fast, timestamps are just
> too coarse to detect redundant retransmissions.
>
> Plus, the heuristic based on timestamps essentially relies on a bug in
> our timestamp processing code. The other side could have it fixed. :-)
>  
>
OK, for my studies it shouldn't matter, because I am using the buggy
code on both the sender and the receiver :-) (though I don't understand
what the bug you are referring to is about :-)).

>Alexey
>  
>




* Re: Weird TCP SACK problem. in Linux...
  2006-07-19 16:32       ` Oumer Teyeb
@ 2006-07-19 17:32         ` Oumer Teyeb
  2006-07-20 15:41           ` Oumer Teyeb
  2006-07-20 23:23         ` Alexey Kuznetsov
  1 sibling, 1 reply; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-19 17:32 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: netdev

Oumer Teyeb wrote:

> Hi,
>
> Alexey Kuznetsov wrote:
>
>> The condition triggering the start of fast retransmit is the same.
>> The behaviour during retransmission is different: FACKless code
>> behaves more like NewReno.
>
> OK, that is a good point! Now at least I can convince myself that the
> CDFs of the first retransmissions, showing that SACK leads to earlier
> retransmissions than no SACK, are not wrong, and I can even convince
> myself that this is the real reason behind SACK/FACK's performance
> degradation in the no-timestamps case :-)

Actually, the increase in the number of retransmissions and the
increase in the download time from no SACK to SACK in the timestamp
case then seem to make sense as well. My reasoning is this: if there
are timestamps, there is reordering detection, hence the number of
retransmissions is reduced because we avoid the time spent in fast
recovery. When we introduce SACK on top of timestamps, we enter fast
retransmit earlier than in the no-SACK case, as we seem to agree, and
since timestamps reduce the number of retransmissions once we are in
fast recovery, the retransmissions we see are basically the first few
that made us enter the false fast retransmit, so we get a small
increase in retransmissions and a small increase in download time. But
when no timestamps are used, there is no reordering detection, so SACK
leads to fewer retransmissions because it retransmits selectively, but
it does not improve the download time because it enters fast retransmit
earlier than no SACK, and in this case the fast retransmits are very
costly because they are not detected as spurious and lead to a window
reduction. Am I making sense? :-) Still, the DSACK case is puzzling me.

Regards,
Oumer


* Re: Weird TCP SACK problem. in Linux...
  2006-07-19 17:32         ` Oumer Teyeb
@ 2006-07-20 15:41           ` Oumer Teyeb
  0 siblings, 0 replies; 10+ messages in thread
From: Oumer Teyeb @ 2006-07-20 15:41 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: netdev

Hi Alexey, is there anything Linux-specific about the DSACK
implementation that might lead to an increase in the number of
retransmissions but an improvement in download time when timestamps are
not used (and the reverse effect when timestamps are used: fewer
retransmissions but longer download times)? I could not figure it out.
Also, is the reordering response of Linux TCP described anywhere? (It
seems dupthresh is dynamically adjusted based on the reordering history,
but I was not able to find out how.)

* Re: Weird TCP SACK problem. in Linux...
  2006-07-19 16:32       ` Oumer Teyeb
  2006-07-19 17:32         ` Oumer Teyeb
@ 2006-07-20 23:23         ` Alexey Kuznetsov
  1 sibling, 0 replies; 10+ messages in thread
From: Alexey Kuznetsov @ 2006-07-20 23:23 UTC (permalink / raw)
  To: Oumer Teyeb; +Cc: netdev

Hello!

> Hmmm, I don't understand this. So if reordering can be detected (i.e.
> we use timestamps or DSACK), the dupthresh is increased

Yes.

> implementation that might lead to an increase in the number of
> retransmissions but an improvement in download time

Hmm... I have thought about it and still do not know.


> I could not figure it out. Also, is the reordering response of Linux
> TCP described anywhere? (It seems dupthresh is dynamically adjusted
> based on the reordering history, but I was not able to find out how.)

That's the comment from tcp_input.c:

 * Reordering detection.
 * --------------------
 * Reordering metric is maximal distance, which a packet can be displaced
 * in packet stream. With SACKs we can estimate it:
 *
 * 1. SACK fills old hole and the corresponding segment was not
 *    ever retransmitted -> reordering. Alas, we cannot use it
 *    when segment was retransmitted.
 * 2. The last flaw is solved with D-SACK. D-SACK arrives
 *    for retransmitted and already SACKed segment -> reordering..
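
(To make that concrete, a small user-space sketch of the idea. The
names, the cap and the call site are illustrative only; the real update
lives in tcp_input.c and should be checked there rather than taken from
this.)

/* Conceptual sketch: when a SACK or D-SACK shows that a segment was
 * displaced by `metric` positions rather than lost, the reordering
 * metric is raised, and it is this metric that the fast-retransmit
 * trigger compares against instead of a fixed dupthresh of 3. */
#include <stdio.h>

#define MAX_REORDERING 127      /* illustrative cap on the metric */

static int reordering = 3;      /* starts at the classic dupthresh */

static void update_reordering(int metric)
{
    if (metric > reordering)
        reordering = metric < MAX_REORDERING ? metric : MAX_REORDERING;
}

int main(void)
{
    /* e.g. a hole was filled by a segment that was never retransmitted
     * and arrived 5 positions late */
    update_reordering(5);
    printf("effective dupthresh is now %d\n", reordering);
    return 0;
}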


Alexey



Thread overview: 10+ messages
2006-07-18 19:38 Weird TCP SACK problem. in Linux Oumer Teyeb
2006-07-19  9:38 ` Xiaoliang (David) Wei
2006-07-19 10:00   ` Oumer Teyeb
2006-07-19 13:27 ` Alexey Kuznetsov
2006-07-19 15:02   ` Oumer Teyeb
2006-07-19 15:49     ` Alexey Kuznetsov
2006-07-19 16:32       ` Oumer Teyeb
2006-07-19 17:32         ` Oumer Teyeb
2006-07-20 15:41           ` Oumer Teyeb
2006-07-20 23:23         ` Alexey Kuznetsov
