* TCP IPv4 strange retransmits
@ 2008-03-04 13:00 Arnd Hannemann
2008-03-04 13:36 ` Ilpo Järvinen
0 siblings, 1 reply; 11+ messages in thread
From: Arnd Hannemann @ 2008-03-04 13:00 UTC (permalink / raw)
To: Netdev
Hi,
I'm observing some retransmits with kernel 2.6.24.2, which I don't understand.
For instance in this cutout[1] of a sequence diagram which was captured[2]
on the TCP sender, 4 retransmits are made.
According to netstat -st output[3][4] all those 4 retransmits were "fast retransmit".
But there are no three DUPACKs which I expected would be needed for fast retransmit?
Also interesting all retransmits happen _after_ those segments were
already acked and sacked, internal queuing or latency issues?
It would be great if somebody could shed some light on this,
why those segments are retransmitted.
Dumps and xplots are available here[5].
Scenario details:
192.168.0.5 <---------------\ /------------------> 192.168.0.7
                             |
[ tc qdisc add dev wldev root netem delay 10ms reorder 25% ]
                             |
                             |
                        192.168.0.6
192.168.0.5 establishes connection to 192.168.0.7 via 192.168.0.6.
Then a bulk TCP transfer was performed from 192.168.0.5 to 192.168.0.7 for 500 ms.
Default tcp configuration of 2.6.24.2 was used (cc=cubic).
Best regards,
Arnd
[1] http://www.umic-mesh.net/~hannemann/strange-reorder/strange_reorder.png
[2] http://www.umic-mesh.net/~hannemann/strange-reorder/sender.dump
[3] http://www.umic-mesh.net/~hannemann/strange-reorder/netstat-sender.before
[4] http://www.umic-mesh.net/~hannemann/strange-reorder/netstat-sender.after
[5] http://www.umic-mesh.net/~hannemann/strange-reorder/
* Re: TCP IPv4 strange retransmits
2008-03-04 13:00 TCP IPv4 strange retransmits Arnd Hannemann
@ 2008-03-04 13:36 ` Ilpo Järvinen
2008-03-04 14:31 ` Arnd Hannemann
0 siblings, 1 reply; 11+ messages in thread
From: Ilpo Järvinen @ 2008-03-04 13:36 UTC (permalink / raw)
To: Arnd Hannemann; +Cc: Netdev

On Tue, 4 Mar 2008, Arnd Hannemann wrote:

> I'm observing some retransmits with kernel 2.6.24.2, which I don't
> understand. For instance in this cutout[1] of a sequence diagram which
> was captured[2] on the TCP sender, 4 retransmits are made.

They don't correspond to each other?

> According to netstat -st output[3][4] all those 4 retransmits were "fast
> retransmit".
> But there are no three DUPACKs which I expected would be needed for fast
> retransmit?

With FACK it's enough that you have fackets_out > tp->reordering
(=dupThresh).

> Also interesting all retransmits happen _after_ those segments were
> already acked and sacked, internal queuing or latency issues?

I think your viewer is doing something wrong, sender.dump is not giving
such information (or you draw that from wrong end?). Or it just draws
DSACK like that?

> It would be great if somebody could shed some light on this,
> why those segments are retransmitted.
> Dumps and xplots are available here[5].

...I quickly glanced over it and found no strange behavior in
the sender.dump.

-- 
 i.
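The two triggers contrasted in this reply can be sketched in a few lines of Python. This is not kernel code: the function names and the plain-integer state are made up for illustration. The only rule taken from the thread is that with FACK a fast retransmit may start once fackets_out > tp->reordering, where tp->reordering starts out at the classic dupThresh of 3.

```python
DUP_THRESH = 3  # classic duplicate-ACK threshold; initial tp->reordering per the thread


def classic_trigger(dupacks: int, dup_thresh: int = DUP_THRESH) -> bool:
    """Non-SACK rule: wait until dup_thresh duplicate ACKs have arrived."""
    return dupacks >= dup_thresh


def fack_trigger(fackets_out: int, reordering: int = DUP_THRESH) -> bool:
    """FACK rule as stated in the thread: fackets_out > tp->reordering."""
    return fackets_out > reordering


# A single ACK carrying a SACK block far ahead of the cumulative ACK:
# only one "duplicate" ACK so far, but fackets_out already exceeds 3.
print(classic_trigger(1), fack_trigger(6))
```

Under these assumptions a single ACK is enough for the FACK rule, which would explain why no run of three DUPACKs shows up in the trace.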
* Re: TCP IPv4 strange retransmits
2008-03-04 13:36 ` Ilpo Järvinen
@ 2008-03-04 14:31 ` Arnd Hannemann
2008-03-04 21:04 ` H. Willstrand
2008-03-04 21:07 ` Ilpo Järvinen
0 siblings, 2 replies; 11+ messages in thread
From: Arnd Hannemann @ 2008-03-04 14:31 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Netdev

Hi,

Ilpo Järvinen wrote:
> On Tue, 4 Mar 2008, Arnd Hannemann wrote:
>
>> I'm observing some retransmits with kernel 2.6.24.2, which I don't
>> understand. For instance in this cutout[1] of a sequence diagram which
>> was captured[2] on the TCP sender, 4 retransmits are made.
>
> They don't correspond to each other?

Hmm, they should.

>
>> According to netstat -st output[3][4] all those 4 retransmits were "fast
>> retransmit".
>> But there are no three DUPACKs which I expected would be needed for fast
>> retransmit?
>
> With FACK it's enough that you have fackets_out > tp->reordering
> (=dupThresh).

If it is FACK shouldn't it be accounted for LINUX_MIB_TCPFORWARDRETRANS
instead of LINUX_MIB_TCPFASTRETRANS?

>
>> Also interesting all retransmits happen _after_ those segments were
>> already acked and sacked, internal queuing or latency issues?
>
> I think your viewer is doing something wrong, sender.dump is not giving
> such information (or you draw that from wrong end?). Or it just draws
> DSACK like that?

Viewer is tcptrace and xplot. So nothing special at all.
You see it also in wireshark, if you draw a sequence diagram.
You also see it in wireshark if you sort by capture timestamp. I always thought
that capture timestamp order is correct and not dump order, but maybe I'm wrong?

Tcpdump:

12:08:20.667538 IP 192.168.0.7.33824 > 192.168.0.5.50139: . ack 23485 win 22720 <nop,nop,timestamp 969759 972885,nop,nop,sack 2 {24905:26325}{27745:29165}>
^^^^^ got acked at .667538

12:08:20.646749 IP 192.168.0.5.50139 > 192.168.0.7.33824: . 22065:23485(1420) ack 1 win 2864 <nop,nop,timestamp 972885 969754>
^^^^^ got retransmitted at .646749

>
>> It would be great if somebody could shed some light on this,
>> why those segments are retransmitted.
>> Dumps and xplots are available here[5].
>
> ...I quickly glanced over it and found no strange behavior in
> the sender.dump.

Best regards,
Arnd
* Re: TCP IPv4 strange retransmits
2008-03-04 14:31 ` Arnd Hannemann
@ 2008-03-04 21:04 ` H. Willstrand
2008-03-04 22:41 ` Arnd Hannemann
2008-03-04 21:07 ` Ilpo Järvinen
1 sibling, 1 reply; 11+ messages in thread
From: H. Willstrand @ 2008-03-04 21:04 UTC (permalink / raw)
To: Arnd Hannemann; +Cc: Ilpo Järvinen, Netdev

On Tue, Mar 4, 2008 at 3:31 PM, Arnd Hannemann
<hannemann@nets.rwth-aachen.de> wrote:
> Hi,
>
>
> Ilpo Järvinen wrote:
> > On Tue, 4 Mar 2008, Arnd Hannemann wrote:
> >
> >> I'm observing some retransmits with kernel 2.6.24.2, which I don't
> >> understand. For instance in this cutout[1] of a sequence diagram which
> >> was captured[2] on the TCP sender, 4 retransmits are made.
> >
> > They don't correspond to each other?
>
> Hmm, they should.
>
> >
> >> According to netstat -st output[3][4] all those 4 retransmits were "fast
> >> retransmit".
> >> But there are no three DUPACKs which I expected would be needed for fast
> >> retransmit?
> >
> > With FACK it's enough that you have fackets_out > tp->reordering
> > (=dupThresh).
>
> If it is FACK shouldn't it be accounted for LINUX_MIB_TCPFORWARDRETRANS
> instead of LINUX_MIB_TCPFASTRETRANS?
>
> >
> >> Also interesting all retransmits happen _after_ those segments were
> >> already acked and sacked, internal queuing or latency issues?
> >
> > I think your viewer is doing something wrong, sender.dump is not giving
> > such information (or you draw that from wrong end?). Or it just draws
> > DSACK like that?
>
> Viewer is tcptrace and xplot. So nothing special at all.
> You see it also in wireshark, if you draw a sequence diagram.
> You also see it in wireshark if you sort by capture timestamp. I always thought
> that capture timestamp order is correct and not dump order, but maybe I'm wrong?
>
> Tcpdump:
>
> 12:08:20.667538 IP 192.168.0.7.33824 > 192.168.0.5.50139: . ack 23485 win 22720 <nop,nop,timestamp 969759 972885,nop,nop,sack 2 {24905:26325}{27745:29165}>
> ^^^^^ got acked at .667538
>
> 12:08:20.646749 IP 192.168.0.5.50139 > 192.168.0.7.33824: . 22065:23485(1420) ack 1 win 2864 <nop,nop,timestamp 972885 969754>
> ^^^^^ got retransmitted at .646749
>
> >
> >> It would be great if somebody could shed some light on this,
> >> why those segments are retransmitted.
> >> Dumps and xplots are available here[5].
> >
> > ...I quickly glanced over it and found no strange behavior in
> > the sender.dump.
>
> Best regards,
> Arnd
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

Hi!

I recommend you to capture packets both on sender-side and
receiver-side to verify the tcpdump.

Regards,
Harri
* Re: TCP IPv4 strange retransmits
2008-03-04 21:04 ` H. Willstrand
@ 2008-03-04 22:41 ` Arnd Hannemann
0 siblings, 0 replies; 11+ messages in thread
From: Arnd Hannemann @ 2008-03-04 22:41 UTC (permalink / raw)
To: H. Willstrand; +Cc: Ilpo Järvinen, Netdev

H. Willstrand wrote:
[snip]
>
> Hi!
>
> I recommend you to capture packets both on sender-side and
> receiver-side to verify the tcpdump.

In fact I did:
http://www.umic-mesh.net/~hannemann/strange-reorder/receiver.dump

> Regards,
> Harri
>

Regards,
Arnd
* Re: TCP IPv4 strange retransmits
2008-03-04 14:31 ` Arnd Hannemann
2008-03-04 21:04 ` H. Willstrand
@ 2008-03-04 21:07 ` Ilpo Järvinen
2008-03-04 21:19 ` Ilpo Järvinen
2008-03-04 23:03 ` Arnd Hannemann
1 sibling, 2 replies; 11+ messages in thread
From: Ilpo Järvinen @ 2008-03-04 21:07 UTC (permalink / raw)
To: Arnd Hannemann; +Cc: Netdev

On Tue, 4 Mar 2008, Arnd Hannemann wrote:

> Hi,
>
> Ilpo Järvinen wrote:
> > On Tue, 4 Mar 2008, Arnd Hannemann wrote:
> >
> >> I'm observing some retransmits with kernel 2.6.24.2, which I don't
> >> understand. For instance in this cutout[1] of a sequence diagram which
> >> was captured[2] on the TCP sender, 4 retransmits are made.
> >
> > They don't correspond to each other?
>
> Hmm, they should.

Yeah, they probably do, I was just too hasty and failed to notice those
small negative offsets.

> >> According to netstat -st output[3][4] all those 4 retransmits were "fast
> >> retransmit".
> >> But there are no three DUPACKs which I expected would be needed for fast
> >> retransmit?
> >
> > With FACK it's enough that you have fackets_out > tp->reordering
> > (=dupThresh).
>
> If it is FACK shouldn't it be accounted for LINUX_MIB_TCPFORWARDRETRANS
> instead of LINUX_MIB_TCPFASTRETRANS?

No, if there's any skb which is more than fackets_out-tp->reordering from
the highest SACKed skb, it will be marked TCPCB_LOST (see
tcp_mark_head_lost & its caller), and all LOST segments are retransmitted
by the earlier loop (for a while still as I'm going to very likely change
that in net-2.6.26, commits for consolidating both, nearly identical loops
are already in my local git and await some testing).

Forwardretrans is only incremented when there isn't TCPCB_LOST set for a
segment and it doesn't apply in this case anyway because you have new data
to send (see the decision making for forward retransmits, it's well
commented btw).

> >> Also interesting all retransmits happen _after_ those segments were
> >> already acked and sacked, internal queuing or latency issues?
> >
> > I think your viewer is doing something wrong, sender.dump is not giving
> > such information (or you draw that from wrong end?). Or it just draws
> > DSACK like that?
>
> Viewer is tcptrace and xplot. So nothing special at all.
> You see it also in wireshark, if you draw a sequence diagram.

Ah, now I noticed those small timeleaps, very small enough to not
catch my eye earlier as the amount of numbers in such screen is just
overwhelming... :-)

> You also see it in wireshark if you sort by capture timestamp. I always
> thought that capture timestamp order is correct and not dump order, but
> maybe I'm wrong?

I'm not sure, in the other order they make very much sense. In addition,
the ACKs are processed in order and their effects are immediate even if
there's more information awaiting to be processed.

> Tcpdump:
>
> 12:08:20.667538 IP 192.168.0.7.33824 > 192.168.0.5.50139: . ack 23485 win 22720 <nop,nop,timestamp 969759 972885,nop,nop,sack 2 {24905:26325}{27745:29165}>
> ^^^^^ got acked at .667538

Did you paste wrong timestamp as 667538 == 667538? ...It just makes no
sense for me, what are you trying to say here?

> 12:08:20.646749 IP 192.168.0.5.50139 > 192.168.0.7.33824: . 22065:23485(1420) ack 1 win 2864 <nop,nop,timestamp 972885 969754>
> ^^^^^ got retransmitted at .646749

What's the problem here? At .646749 something was retransmitted, but only
after .667538 it was acked? Again, this makes very little sense for me...
Why did you copy them wrong way around from the tcpdump log? Or are these
two lines related at all?

-- 
 i.
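The counter question answered in this message can be restated as a small decision function. This is a simplified sketch of the reasoning given above, not the kernel's retransmit loops: the function name and its boolean inputs are hypothetical, and other cases (for example recovery after a timeout) are deliberately left out.

```python
from typing import Optional


def retransmit_counter(sacked: bool, marked_lost: bool,
                       can_send_new_data: bool) -> Optional[str]:
    """Which MIB counter a retransmission of this segment would bump, if any.

    Simplified from the explanation in the thread:
    - segments marked TCPCB_LOST are retransmitted by the first loop and
      counted as fast retransmits;
    - forward retransmits only concern segments that are NOT marked lost,
      and only when there is no new data to send.
    """
    if sacked:
        return None  # already received, nothing to retransmit
    if marked_lost:
        return "LINUX_MIB_TCPFASTRETRANS"
    if not can_send_new_data:
        return "LINUX_MIB_TCPFORWARDRETRANS"
    return None  # not lost and new data is available: send new data instead


# The case discussed here: segment marked lost during a bulk transfer.
print(retransmit_counter(sacked=False, marked_lost=True, can_send_new_data=True))
```

With the inputs from this thread (segment marked lost, sender still has new data), the sketch lands on the fast-retransmit counter, which matches what netstat reported.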
* Re: TCP IPv4 strange retransmits
2008-03-04 21:07 ` Ilpo Järvinen
@ 2008-03-04 21:19 ` Ilpo Järvinen
2008-03-04 23:03 ` Arnd Hannemann
1 sibling, 0 replies; 11+ messages in thread
From: Ilpo Järvinen @ 2008-03-04 21:19 UTC (permalink / raw)
To: Arnd Hannemann; +Cc: Netdev

On Tue, 4 Mar 2008, Ilpo Järvinen wrote:

> In addition, the ACKs are processed in order and their effects are
> immediate even if there's more information awaiting to be processed.

Before somebody asks or suggests (one might be tempted to think it's a
good idea), no, we likely don't want to do it other way around unless
somebody first proves that it won't negatively affect TCP's ACK clock, and
would benefits only some corner-case like this (and even in such case, one
might get bitten by the tcp_max_burst). It would of course be possible to
come up with a solution that reverse those _and_ fixes the ACK clock
problems caused by such approach. The problems are similar to what LRO is
causing btw, so it might not be complete waste of efforts to fix the ACK
clock problems and reuse the solution in LRO as well.

-- 
 i.
* Re: TCP IPv4 strange retransmits
2008-03-04 21:07 ` Ilpo Järvinen
2008-03-04 21:19 ` Ilpo Järvinen
@ 2008-03-04 23:03 ` Arnd Hannemann
2008-03-05 7:00 ` Ilpo Järvinen
1 sibling, 1 reply; 11+ messages in thread
From: Arnd Hannemann @ 2008-03-04 23:03 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Netdev

Ilpo Järvinen wrote:
> On Tue, 4 Mar 2008, Arnd Hannemann wrote:
>
>> Hi,
>>
>> Ilpo Järvinen wrote:
>>> On Tue, 4 Mar 2008, Arnd Hannemann wrote:
>>>
>>>> I'm observing some retransmits with kernel 2.6.24.2, which I don't
>>>> understand. For instance in this cutout[1] of a sequence diagram which
>>>> was captured[2] on the TCP sender, 4 retransmits are made.
>>> They don't correspond to each other?
>> Hmm, they should.
>
> Yeah, they probably do, I was just too hasty and failed to notice those
> small negative offsets.
>
>>>> According to netstat -st output[3][4] all those 4 retransmits were "fast
>>>> retransmit".
>>>> But there are no three DUPACKs which I expected would be needed for fast
>>>> retransmit?
>>> With FACK it's enough that you have fackets_out > tp->reordering
>>> (=dupThresh).
>> If it is FACK shouldn't it be accounted for LINUX_MIB_TCPFORWARDRETRANS
>> instead of LINUX_MIB_TCPFASTRETRANS?
>
> No, if there's any skb which is more than fackets_out-tp->reordering from
> the highest SACKed skb, it will be marked TCPCB_LOST (see
> tcp_mark_head_lost & its caller), and all LOST segments are retransmitted
> by the earlier loop (for a while still as I'm going to very likely change
> that in net-2.6.26, commits for consolidating both, nearly identical loops
> are already in my local git and await some testing).
>
> Forwardretrans is only incremented when there isn't TCPCB_LOST set for a
> segment and it doesn't apply in this case anyway because you have new data
> to send (see the decision making for forward retransmits, it's well
> commented btw).

Ah, I see. Thank you for clarifying.
However fackets_out is not so well documented ;-)

But it now makes all sense (with dump order):
An ACK 19225 arrives with SACK block {27745:29165}, so fackets_out becomes
~6 ((27745-19225)/1450)
tp->reordering is 3 at this time so he starts to retransmit.
However some SACK ACK comes early enough so he stops at 4 retransmits.
Or something like that...

>
>>>> Also interesting all retransmits happen _after_ those segments were
>>>> already acked and sacked, internal queuing or latency issues?
>>> I think your viewer is doing something wrong, sender.dump is not giving
>>> such information (or you draw that from wrong end?). Or it just draws
>>> DSACK like that?
>> Viewer is tcptrace and xplot. So nothing special at all.
>> You see it also in wireshark, if you draw a sequence diagram.
>
> Ah, now I noticed those small timeleaps, very small enough to not
> catch my eye earlier as the amount of numbers in such screen is just
> overwhelming... :-)

Very small indeed. Probably the time a packet travels in kernel through the
layers is higher than the difference between ACK and retransmit.

>
>> You also see it in wireshark if you sort by capture timestamp. I always
>> thought that capture timestamp order is correct and not dump order, but
>> maybe I'm wrong?
>
> I'm not sure, in the other order they make very much sense. In addition,
> the ACKs are processed in order and their effects are immediate even if
> there's more information awaiting to be processed.
>
>> Tcpdump:
>>
>> 12:08:20.667538 IP 192.168.0.7.33824 > 192.168.0.5.50139: . ack 23485 win 22720 <nop,nop,timestamp 969759 972885,nop,nop,sack 2 {24905:26325}{27745:29165}>
>> ^^^^^ got acked at .667538
>
> Did you paste wrong timestamp as 667538 == 667538? ...It just makes no
> sense for me, what are you trying to say here?
>
>> 12:08:20.646749 IP 192.168.0.5.50139 > 192.168.0.7.33824: . 22065:23485(1420) ack 1 win 2864 <nop,nop,timestamp 972885 969754>
>> ^^^^^ got retransmitted at .646749
>
> What's the problem here? At .646749 something was retransmitted, but only
> after .667538 it was acked? Again, this makes very little sense for me...
> Why did you copy them wrong way around from the tcpdump log? Or are these
> two lines related at all?

Sorry, this was just bogus. Just wanted to point out the timestamp
differences and made a wrong example. Screen full of numbers... ;-)

Thanks for your help.

Best regards,
Arnd
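The arithmetic in this message can be redone as a short worked example. The sequence numbers and tp->reordering = 3 come from the message; the segment size of 1420 bytes is taken from the dump line "22065:23485(1420)" rather than the 1450 used above. Whether fackets_out also counts the SACKed segment itself is an assumption here, and the helper name is made up.

```python
MSS = 1420           # payload size from the dump line "22065:23485(1420)"
SND_UNA = 19225      # cumulative ACK quoted in the message
SACK_START = 27745   # left edge of the SACK block
SACK_END = 29165     # right edge of the SACK block
REORDERING = 3       # tp->reordering at that time, per the message


def segments(start: int, end: int, mss: int = MSS) -> int:
    """Number of full-sized segments covering the byte range [start, end)."""
    return (end - start) // mss


holes = segments(SND_UNA, SACK_START)        # unSACKed segments below the block
up_to_highest = segments(SND_UNA, SACK_END)  # holes plus the SACKed segment

# prints: 6 7 4
print(holes, up_to_highest, up_to_highest - REORDERING)
```

Either count exceeds tp->reordering, so the FACK condition holds. If the SACKed segment is included (7 segments), then 7 - 3 = 4 head segments would be candidates for being marked lost, which would be consistent with the four retransmits observed, though that may be a coincidence.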
* Re: TCP IPv4 strange retransmits
2008-03-04 23:03 ` Arnd Hannemann
@ 2008-03-05 7:00 ` Ilpo Järvinen
2008-03-05 13:04 ` Arnd Hannemann
0 siblings, 1 reply; 11+ messages in thread
From: Ilpo Järvinen @ 2008-03-05 7:00 UTC (permalink / raw)
To: Arnd Hannemann; +Cc: Netdev

On Wed, 5 Mar 2008, Arnd Hannemann wrote:

> Ilpo Järvinen wrote:
>
> > No, if there's any skb which is more than fackets_out-tp->reordering from
> > the highest SACKed skb, it will be marked TCPCB_LOST (see
> > tcp_mark_head_lost & its caller), and all LOST segments are retransmitted
> > by the earlier loop (for a while still as I'm going to very likely change
> > that in net-2.6.26, commits for consolidating both, nearly identical loops
> > are already in my local git and await some testing).
> >
> > Forwardretrans is only incremented when there isn't TCPCB_LOST set for a
> > segment and it doesn't apply in this case anyway because you have new data
> > to send (see the decision making for forward retransmits, it's well
> > commented btw).
>
> Ah, I see. Thank you for clarifying.
> However fackets_out is not so well documented ;-)

I think I've fixed this for 2.6.25... :-) :

...
/* Heurestics to calculate number of duplicate ACKs. There's no dupACKs
 * counter when SACK is enabled (without SACK, sacked_out is used for
 * that purpose).
 *
 * Instead, with FACK TCP uses fackets_out that includes both SACKed
 * segments up to the highest received SACK block so far and holes in
 * between them.
 *
 * With reordering, holes may still be in flight, so RFC3517 recovery
 * uses pure sacked_out (total number of SACKed segments) even though
 * it violates the RFC that uses duplicate ACKs, often these are equal
 * but when e.g. out-of-window ACKs or packet duplication occurs,
 * they differ. Since neither occurs due to loss, TCP should really
 * ignore them.
 */
static inline int tcp_dupack_heurestics(struct tcp_sock *tp)
...

...Though some FACK comments seem to be saying something else still.

> But it now makes all sense (with dump order):
> An ACK 19225 arrives with SACK block {27745:29165}, so fackets_out becomes
> ~6 ((27745-19225)/1450)
> tp->reordering is 3 at this time so he starts to retransmit.
> However some SACK ACK comes early enough so he stops at 4 retransmits.
> Or something like that...

Another thing you should consider is reordering detection which hopefully
worked at 13:08:20.667529 through the newly discovered SACK block which is
_lower_ than the highestmost SACK block received so far. That results in
FACK -> RFC3517, FACK is built on inorder assumptions and whenever we find
that untrue, e.g., due to SACK/ACK for non-rexmit when something larger
has been confirmed received we disable it. Ah, but this was 2.6.24.y? It
doesn't yet do RFC3517 IIRC, but has something remotely resembling
newreno, but only for the first packet because the next cumulative ACK may
often trigger timedout loop which basically marks everything lost (I don't
remember if the latter was changed to occur only with FACK ages ago or
not).

> >> Tcpdump:
>
> Sorry, this was just bogus. Just wanted to point out the timestamp
> differences and made a wrong example. Screen full of numbers... ;-)

I thought so :-).

...Large, nearly equal numbers in two dimensions, maybe at some day
I wake up and notice I've read them too long noticing that capturing
this kind of things is no longer a problem to me... :-/

-- 
 i.
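The difference between the two counters named in the quoted comment can be illustrated on a toy scoreboard. This is a sketch built only from the comment's wording: the list-of-booleans scoreboard and the function names are invented, and the real kernel helper may apply adjustments that are not modeled here.

```python
from typing import List


def sacked_out(scoreboard: List[bool]) -> int:
    """Total number of SACKed segments (what the comment calls pure sacked_out)."""
    return sum(scoreboard)


def fackets_out(scoreboard: List[bool]) -> int:
    """SACKed segments up to the highest SACKed one, plus the holes between them."""
    for index in range(len(scoreboard) - 1, -1, -1):
        if scoreboard[index]:
            return index + 1
    return 0


def dupack_estimate(scoreboard: List[bool], fack_enabled: bool) -> int:
    """Stand-in for a duplicate-ACK count when SACK is in use."""
    return fackets_out(scoreboard) if fack_enabled else sacked_out(scoreboard)


# Outstanding segments, oldest first; True means SACKed, False is a hole.
board = [False, False, True, False, True]
print(dupack_estimate(board, fack_enabled=True),
      dupack_estimate(board, fack_enabled=False))
```

On this board the FACK-style count is 5 while the plain SACK count is 2, which shows why FACK crosses the threshold sooner and why it becomes too aggressive when the holes are merely reordered rather than lost.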
* Re: TCP IPv4 strange retransmits
2008-03-05 7:00 ` Ilpo Järvinen
@ 2008-03-05 13:04 ` Arnd Hannemann
2008-03-05 19:32 ` Ilpo Järvinen
0 siblings, 1 reply; 11+ messages in thread
From: Arnd Hannemann @ 2008-03-05 13:04 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Netdev

Ilpo Järvinen wrote:
> On Wed, 5 Mar 2008, Arnd Hannemann wrote:
>
>> Ilpo Järvinen wrote:
>>
>>> No, if there's any skb which is more than fackets_out-tp->reordering from
>>> the highest SACKed skb, it will be marked TCPCB_LOST (see
>>> tcp_mark_head_lost & its caller), and all LOST segments are retransmitted
>>> by the earlier loop (for a while still as I'm going to very likely change
>>> that in net-2.6.26, commits for consolidating both, nearly identical loops
>>> are already in my local git and await some testing).
>>>
>>> Forwardretrans is only incremented when there isn't TCPCB_LOST set for a
>>> segment and it doesn't apply in this case anyway because you have new data
>>> to send (see the decision making for forward retransmits, it's well
>>> commented btw).
>> Ah, I see. Thank you for clarifying.
>> However fackets_out is not so well documented ;-)
>
> I think I've fixed this for 2.6.25... :-) :
>
> ...
> /* Heurestics to calculate number of duplicate ACKs. There's no dupACKs
>  * counter when SACK is enabled (without SACK, sacked_out is used for
>  * that purpose).
>  *
>  * Instead, with FACK TCP uses fackets_out that includes both SACKed
>  * segments up to the highest received SACK block so far and holes in
>  * between them.
>  *
>  * With reordering, holes may still be in flight, so RFC3517 recovery
>  * uses pure sacked_out (total number of SACKed segments) even though
>  * it violates the RFC that uses duplicate ACKs, often these are equal
>  * but when e.g. out-of-window ACKs or packet duplication occurs,
>  * they differ. Since neither occurs due to loss, TCP should really
>  * ignore them.
>  */
> static inline int tcp_dupack_heurestics(struct tcp_sock *tp)
> ...

Great :-)
But shouldn't it read "heuristics" ?

> ...Though some FACK comments seem to be saying something else still.
>
>> But it now makes all sense (with dump order):
>> An ACK 19225 arrives with SACK block {27745:29165}, so fackets_out becomes
>> ~6 ((27745-19225)/1450)
>> tp->reordering is 3 at this time so he starts to retransmit.
>> However some SACK ACK comes early enough so he stops at 4 retransmits.
>> Or something like that...
>
> Another thing you should consider is reordering detection which hopefully
> worked at 13:08:20.667529 through the newly discovered SACK block which is
> _lower_ than the highestmost SACK block received so far. That results in
> FACK -> RFC3517, FACK is built on inorder assumptions and whenever we find
> that untrue, e.g., due to SACK/ACK for non-rexmit when something larger
> has been confirmed received we disable it. Ah, but this was 2.6.24.y? It

Yes, it was 2.6.24.2. Actually you can see reordering detection at work here[3],
the tool[4] we are using to measure TCP throughput samples the tcp_info struct and the
column #reor should reflect tp->reordering.
First it is 3 then it grows up to 16. Of course this is only a hint because
tcp_info is only sampled every 50ms in this example, but at least it shows that some
reordering detection took place...

> doesn't yet do RFC3517 IIRC, but has something remotely resembling
> newreno, but only for the first packet because the next cumulative ACK may
> often trigger timedout loop which basically marks everything lost (I don't
> remember if the latter was changed to occur only with FACK ages ago or
> not).

Not sure if I understood this. Will have to look into this some more.

>
>>>> Tcpdump:
>> Sorry, this was just bogus. Just wanted to point out the timestamp
>> differences and made a wrong example. Screen full of numbers... ;-)
>
> I thought so :-).
>
> ...Large, nearly equal numbers in two dimensions, maybe at some day
> I wake up and notice I've read them too long noticing that capturing
> this kind of things is no longer a problem to me... :-/
>

[3] http://www.umic-mesh.net/~hannemann/strange-reorder/flowgrind.output
[4] http://www.umic-mesh.net/research/tcp/flowgrind.html
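Sampling tp->reordering through tcp_info, as described in this message, can be approximated from user space. The sketch below is not taken from flowgrind. It assumes the struct tcp_info layout from linux/tcp.h in which tcpi_reordering sits after eight one-byte fields and twenty 32-bit fields (byte offset 88); please check that against the headers of the kernel in use. The function names are made up.

```python
import socket
import struct

# Assumed byte offset of tcpi_reordering inside struct tcp_info:
# 8 one-byte fields (including bitfields or padding) + 20 preceding __u32 fields.
TCPI_REORDERING_OFFSET = 8 + 20 * 4


def parse_reordering(buf: bytes) -> int:
    """Extract tcpi_reordering from a raw tcp_info buffer (native byte order)."""
    if len(buf) < TCPI_REORDERING_OFFSET + 4:
        raise ValueError("tcp_info buffer too short")
    (value,) = struct.unpack_from("=I", buf, TCPI_REORDERING_OFFSET)
    return value


def read_reordering(sock: socket.socket) -> int:
    """Linux only: fetch TCP_INFO from a connected TCP socket and parse it."""
    buf = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO,
                          TCPI_REORDERING_OFFSET + 4)
    return parse_reordering(buf)
```

Calling read_reordering() on the sending socket every 50 ms would give a coarse series like the #reor column mentioned above. As noted in the message, changes between two samples are invisible to this kind of polling.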
* Re: TCP IPv4 strange retransmits
2008-03-05 13:04 ` Arnd Hannemann
@ 2008-03-05 19:32 ` Ilpo Järvinen
0 siblings, 0 replies; 11+ messages in thread
From: Ilpo Järvinen @ 2008-03-05 19:32 UTC (permalink / raw)
To: Arnd Hannemann; +Cc: Netdev

On Wed, 5 Mar 2008, Arnd Hannemann wrote:

> Ilpo Järvinen wrote:
> > On Wed, 5 Mar 2008, Arnd Hannemann wrote:
> >
> >> Ilpo Järvinen wrote:
> >
> > ...
> > /* Heurestics to calculate number of duplicate ACKs. There's no dupACKs
>
> Great :-)
> But shouldn't it read "heuristics" ?

Sure, a Finnish vowel leaked into it. If somebody would have asked, I
wouldn't even have known which was the right form in English.

> > Another thing you should consider is reordering detection which hopefully
> > worked at 13:08:20.667529 through the newly discovered SACK block which is
> > _lower_ than the highestmost SACK block received so far. That results in
> > FACK -> RFC3517, FACK is built on inorder assumptions and whenever we find
> > that untrue, e.g., due to SACK/ACK for non-rexmit when something larger
> > has been confirmed received we disable it. Ah, but this was 2.6.24.y? It
>
> Yes, it was 2.6.24.2. Actually you can see reordering detection at work here[3],
> the tool[4] we are using to measure TCP throughput samples the tcp_info struct and the
> column #reor should reflect tp->reordering.
> First it is 3 then it grows up to 16. Of course this is only a hint because
> tcp_info is only sampled every 50ms in this example, but at least it shows that some
> reordering detection took place...

Ok. I usually can determine exact events from tcpdump, too used to them...
:-)

> > doesn't yet do RFC3517 IIRC, but has something remotely resembling
> > newreno, but only for the first packet because the next cumulative ACK may
> > often trigger timedout loop which basically marks everything lost (I don't
> > remember if the latter was changed to occur only with FACK ages ago or
> > not).
>
> Not sure if I understood this. Will have to look into this some more.

Before 2.6.25, the non-FACK SACK was quite strange mixture of things. It
won't resemble anything RFCish by any means, unless timedout loop (see the
loop that plays with scoreboard_skb_hint) was already changed to be used
with FACK only in 2.6.24 or before it (I don't remember if I ever
submitted that because making non-FACK SACK behave very close to what
RFC3517 does was just around the corner as well).

-- 
 i.