* Linux 2.6.22: Leak r=1 1
@ 2007-07-11 17:40 Sami Farin
  2007-07-11 19:01 ` Chuck Ebbert
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Sami Farin @ 2007-07-11 17:40 UTC (permalink / raw)
  To: Linux Networking Mailing List

That's right, so descriptive is the new Linux kernel 2.6.22.
Took a while to grep what is "leaking".

Linux safari.finland.fbi 2.6.22-cfs-v19 #3 SMP Tue Jul 10 00:22:25 EEST 2007 i686 i686 i386 GNU/Linux

Just normal Internet usage, azureus for example =)
I think this is easy to trigger.
But that printk is not very useful, or is it?
I am also using HTB+ESFQ to limit outgoing bandwidth...

# ss -n|wc -l
870

# ping -A 80.223.96.1 
PING 80.223.96.1 (80.223.96.1) 56(84) bytes of data.
64 bytes from 80.223.96.1: icmp_seq=1 ttl=255 time=431 ms
...
--- 80.223.96.1 ping statistics ---
40 packets transmitted, 25 received, 37% packet loss, time 17954ms
rtt min/avg/max/mdev = 406.000/467.758/530.983/29.384 ms, pipe 2, ipg/ewma 460.361/456.381 ms

But the packet loss is only temporary (while I am downloading with azureus =);
when only uploading (95% of bandwidth used), rtt avg = 32 ms.

# dmesg|grep Leak
[114992.191011] Leak r=1 4
[124231.713348] Leak r=1 4
[142807.938284] Leak r=1 4
[142999.674521] Leak r=1 1
[143177.462073] Leak r=1 4
[143230.001570] Leak r=1 4
[143232.982560] Leak r=1 4
[143234.537096] Leak r=1 4
[143297.927760] Leak r=1 4
[143300.633603] Leak r=1 4
[143302.172917] Leak r=1 4
[143357.083193] Leak r=1 1
[143361.780879] Leak r=1 4
[143413.706490] Leak r=1 4
[143552.996598] Leak r=1 1

[root@safari /proc/sys/net/ipv4]# grep . *
icmp_echo_ignore_all:0
icmp_echo_ignore_broadcasts:1
icmp_errors_use_inbound_ifaddr:0
icmp_ignore_bogus_error_responses:1
icmp_ratelimit:1000
icmp_ratemask:6168
igmp_max_memberships:20
igmp_max_msf:10
inet_peer_gc_maxtime:120
inet_peer_gc_mintime:10
inet_peer_maxttl:600
inet_peer_minttl:120
inet_peer_threshold:65664
ip_default_ttl:61
ip_dynaddr:0
ip_forward:0
ip_local_port_range:40000       65535
ip_no_pmtu_disc:1
ip_nonlocal_bind:0
ipfrag_high_thresh:262144
ipfrag_low_thresh:196608
ipfrag_max_dist:64
ipfrag_secret_interval:600
ipfrag_time:30
tcp_abc:2
tcp_abort_on_overflow:0
tcp_adv_win_scale:2
tcp_allowed_congestion_control:cubic bic reno
tcp_app_win:31
tcp_available_congestion_control:cubic bic reno westwood vegas scalable hybla htcp highspeed
tcp_base_mss:512
tcp_congestion_control:cubic
tcp_dma_copybreak:4096
tcp_dsack:1
tcp_ecn:1
tcp_fack:1
tcp_fin_timeout:30
tcp_frto:1
tcp_frto_response:0
tcp_keepalive_intvl:75
tcp_keepalive_probes:9
tcp_keepalive_time:3300
tcp_low_latency:0
tcp_max_orphans:2048
tcp_max_ssthresh:0
tcp_max_syn_backlog:1024
tcp_max_tw_buckets:180000
tcp_mem:95136   126848  190272
tcp_moderate_rcvbuf:1
tcp_mtu_probing:0
tcp_no_metrics_save:0
tcp_orphan_retries:0
tcp_reordering:3
tcp_retrans_collapse:0
tcp_retries1:3
tcp_retries2:15
tcp_rfc1337:0
tcp_rmem:4096   87380   262144
tcp_sack:1
tcp_slow_start_after_idle:0
tcp_stdurg:0
tcp_syn_retries:5
tcp_synack_retries:5
tcp_syncookies:0
tcp_timestamps:1
tcp_tso_win_divisor:3
tcp_tw_recycle:1
tcp_tw_reuse:0
tcp_window_scaling:1
tcp_wmem:4096   16384   262144
tcp_workaround_signed_windows:0

-- 
Do what you love because life is too short for anything else.



* Re: Linux 2.6.22: Leak r=1 1
  2007-07-11 17:40 Linux 2.6.22: Leak r=1 1 Sami Farin
@ 2007-07-11 19:01 ` Chuck Ebbert
  2007-07-12  7:53 ` Ilpo Järvinen
  2007-07-18  9:16 ` Ilpo Järvinen
  2 siblings, 0 replies; 8+ messages in thread
From: Chuck Ebbert @ 2007-07-11 19:01 UTC (permalink / raw)
  To: Linux Networking Mailing List

On 07/11/2007 01:40 PM, Sami Farin wrote:
> That's right, so descriptive is the new Linux kernel 2.6.22.
> Took a while to grep what is "leaking".
> 

You didn't post that:

$ find . -type f | xargs grep "Leak r=" /dev/null
./net/ipv4/tcp_input.c:                 printk(KERN_DEBUG "Leak r=%u %d\n",



* Re: Linux 2.6.22: Leak r=1 1
  2007-07-11 17:40 Linux 2.6.22: Leak r=1 1 Sami Farin
  2007-07-11 19:01 ` Chuck Ebbert
@ 2007-07-12  7:53 ` Ilpo Järvinen
  2007-07-12 10:04   ` Sami Farin
  2007-07-15  7:20   ` David Miller
  2007-07-18  9:16 ` Ilpo Järvinen
  2 siblings, 2 replies; 8+ messages in thread
From: Ilpo Järvinen @ 2007-07-12  7:53 UTC (permalink / raw)
  To: Sami Farin, David Miller; +Cc: Linux Networking Mailing List


On Wed, 11 Jul 2007, Sami Farin wrote:

> That's right, so descriptive is the new Linux kernel 2.6.22.
> Took a while to grep what is "leaking".
> 
> Linux safari.finland.fbi 2.6.22-cfs-v19 #3 SMP Tue Jul 10 00:22:25 EEST 2007 i686 i686 i386 GNU/Linux
> 
> Just normal Internet usage, azureus for example =)
> I think this is easy to trigger.

I guess those packet loss periods help you to reproduce it so easily.

> # ss -n|wc -l
> 870
> 
> # ping -A 80.223.96.1 
> PING 80.223.96.1 (80.223.96.1) 56(84) bytes of data.
> 64 bytes from 80.223.96.1: icmp_seq=1 ttl=255 time=431 ms
> ...
> --- 80.223.96.1 ping statistics ---
> 40 packets transmitted, 25 received, 37% packet loss, time 17954ms
> rtt min/avg/max/mdev = 406.000/467.758/530.983/29.384 ms, pipe 2, ipg/ewma 460.361/456.381 ms
> 
> But ploss is only temporary (when I am downloading with azureus =) ,
> when only uploading (95% of bandwidth used) rtt avg = 32ms).
> 
> # dmesg|grep Leak
> [114992.191011] Leak r=1 4
> [124231.713348] Leak r=1 4
> [142807.938284] Leak r=1 4
> [142999.674521] Leak r=1 1
> [143177.462073] Leak r=1 4
> [143230.001570] Leak r=1 4
> [143232.982560] Leak r=1 4
> [143234.537096] Leak r=1 4
> [143297.927760] Leak r=1 4
> [143300.633603] Leak r=1 4
> [143302.172917] Leak r=1 4
> [143357.083193] Leak r=1 1
> [143361.780879] Leak r=1 4
> [143413.706490] Leak r=1 4
> [143552.996598] Leak r=1 1
> 
> [root@safari /proc/sys/net/ipv4]# grep . *
[...snip...]

> tcp_frto:1

I suspect this is the main ingredient for triggering these leaks; well, 
I'm pretty sure of it... Sami, please test the patch included below; 
Dave can then put it into net-2.6 and stable too.

This is still sort of poking in the dark... But it's pretty much the only 
place where the SACKED_RETRANS bit is touched without first checking that 
the adjustment can safely be made (and all SACKED_RETRANS changes in 
2.6.22 are FRTO related as well). Most likely something cleared the 
SACKED_RETRANS bit underneath FRTO, and in tcp_enter_frto_loss I just 
blindly assumed that it's still there.

While the patch below probably works, and the leaks are no more, I'd like to 
get to the bottom of this by really figuring out what caused the SACKED_RETRANS 
bit to get cleared in the first place (I wasn't expecting this to happen when 
I wrote FRTO). I guess that it could be the "lost retransmit" loop in 
sacktag, but again I have no concrete proof of that yet, because for that to
trigger, something must have allowed sending skbs past the snd_nxt at the 
time of the RTO, which too must be prevented during FRTO! Thus there could 
be other issues, with this being just a symptom of the main problem.

I'd be interested to study some tcpdumps that relate to the Leak cases you're 
seeing. Could you record some, Sami? I'm not sure though how one can figure 
out the timestamp relation between the kernel log and a tcpdump log... 
Anyway, for this debugging, you should use a debug version of this patch 
with WARN_ON to get the exact timestamp of the event, since the leak print may 
occur much later on. I've made one available at 
http://www.cs.helsinki.fi/u/ijjarvin/patches/ .


Other candidates for the cause are even less likely. The first two are 
self-standing, so this patch is going to be necessary as long as 
fuzzy SACK blocks are allowed to be received and processed in sacktag 
(regardless of whether there turns out to be an additional problem triggering 
this one):
 - DSACK touching snd_una (receiver is pretty inconsistent with itself 
   because snd_una wasn't advanced).
 - 2xRTO and SACK @ snd_una (same note as above)
 - snd_una advanced full skb _without_ FLAG_DATA_ACKED being set 
   (unlikely)
 - Head not retransmitted on RTO when FRTO was enabled (no, it's not
   going to be this one)
It cannot be double SACKED_RETRANS because another msg would be printed
already in tcp_retransmit_skb.

Anyway, I'll be offline over the weekend starting from the Friday
morning.

...thanks for your report Sami.

-- 
 i.



[PATCH] [TCP]: Verify the presence of RETRANS bit when leaving FRTO

For yet unknown reason, something cleared SACKED_RETRANS bit
underneath FRTO.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 net/ipv4/tcp_input.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 69f9f1e..4e5884a 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1398,7 +1398,9 @@ static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 		 * waiting for the first ACK and did not get it)...
 		 */
 		if ((tp->frto_counter == 1) && !(flag&FLAG_DATA_ACKED)) {
-			tp->retrans_out += tcp_skb_pcount(skb);
+			/* For some reason this R-bit might get cleared? */
+			if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_RETRANS)
+				tp->retrans_out += tcp_skb_pcount(skb);
 			/* ...enter this if branch just for the first segment */
 			flag |= FLAG_DATA_ACKED;
 		} else {
-- 
1.5.0.6


* Re: Linux 2.6.22: Leak r=1 1
  2007-07-12  7:53 ` Ilpo Järvinen
@ 2007-07-12 10:04   ` Sami Farin
  2007-07-12 13:56     ` Ilpo Järvinen
  2007-07-15  7:20   ` David Miller
  1 sibling, 1 reply; 8+ messages in thread
From: Sami Farin @ 2007-07-12 10:04 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: David Miller, Linux Networking Mailing List

On Thu, Jul 12, 2007 at 10:53:57 +0300, Ilpo Järvinen wrote:
> On Wed, 11 Jul 2007, Sami Farin wrote:
> 
> > That's right, so descriptive is the new Linux kernel 2.6.22.
> > Took a while to grep what is "leaking".
> > 
> > Linux safari.finland.fbi 2.6.22-cfs-v19 #3 SMP Tue Jul 10 00:22:25 EEST 2007 i686 i686 i386 GNU/Linux
> > 
> > Just normal Internet usage, azureus for example =)
> > I think this is easy to trigger.
> 
> I guess those packet loss periods help you to reproduce it so easily.
...
> I'd be interested to study some tcpdumps that relate to Leak cases you're 
> seeing. Could you record some Sami? I'm not sure though how one can figure 

I now have 300 MB capture and several new&retarded music videos...
And 10 WARNINGs and 0 Leak printk's.

2007-07-12 12:03:18.910712500 <4>[ 1318.606826] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
2007-07-12 12:21:55.575049500 <4>[ 2434.941077] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
2007-07-12 12:25:56.626918500 <4>[ 2675.917531] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
2007-07-12 12:26:01.421714500 <4>[ 2680.710860] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
2007-07-12 12:27:55.996561500 <4>[ 2795.252008] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
2007-07-12 12:33:03.405492500 <4>[ 3102.570088] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
2007-07-12 12:33:59.837033500 <4>[ 3158.985152] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
2007-07-12 12:44:59.580682500 <4>[ 3818.697530] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
2007-07-12 12:45:06.146194500 <4>[ 3825.261028] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
2007-07-12 12:45:07.637015500 <4>[ 3826.751240] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()

This is MAYBE the guilty connection if timestamps are to be believed:

2007-07-12 12:02:35.311410 IP (tos 0x0, ttl  61, id 17078, offset 0, flags [none], proto: TCP (6), length: 60) 80.223.106.128.43771 > 62.203.174.236.24442: SWE, cksum 0x26f7 (correct), 1227344370:1227344370(0) win 5720 <mss 1430,sackOK,timestamp 934750 0,nop,wscale 3>
2007-07-12 12:02:38.281251 IP (tos 0x0, ttl  61, id 17079, offset 0, flags [none], proto: TCP (6), length: 60) 80.223.106.128.43771 > 62.203.174.236.24442: SWE, cksum 0x1b3f (correct), 1227344370:1227344370(0) win 5720 <mss 1430,sackOK,timestamp 937750 0,nop,wscale 3>
2007-07-12 12:02:38.792865 IP (tos 0x0, ttl 113, id 46391, offset 0, flags [DF], proto: TCP (6), length: 52) 62.203.174.236.24442 > 80.223.106.128.43771: ., cksum 0xc936 (correct), ack 1227344371 win 17640 <nop,nop,timestamp 2122974 934750>
2007-07-12 12:02:38.854298 IP (tos 0x0, ttl 113, id 46396, offset 0, flags [DF], proto: TCP (6), length: 64) 62.203.174.236.24442 > 80.223.106.128.43771: S, cksum 0x319e (correct), 602133927:602133927(0) ack 1227344371 win 17640 <mss 1260,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK>
2007-07-12 12:02:38.854335 IP (tos 0x0, ttl  61, id 17080, offset 0, flags [none], proto: TCP (6), length: 52) 80.223.106.128.43771 > 62.203.174.236.24442: ., cksum 0x6251 (correct), ack 602133928 win 715 <nop,nop,timestamp 938335 0>
2007-07-12 12:02:38.858231 IP (tos 0x0, ttl  61, id 17081, offset 0, flags [none], proto: TCP (6), length: 372) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xaa7d (incorrect (-> 0x006d), 1227344371:1227344691(320) ack 602133928 win 715 <nop,nop,timestamp 938339 0>
2007-07-12 12:02:39.305447 IP (tos 0x0, ttl 113, id 46441, offset 0, flags [DF], proto: TCP (6), length: 159) 62.203.174.236.24442 > 80.223.106.128.43771: P, cksum 0x18b6 (correct), 602133928:602134035(107) ack 1227344691 win 17320 <nop,nop,timestamp 2122980 938339>
2007-07-12 12:02:39.305482 IP (tos 0x0, ttl  61, id 17082, offset 0, flags [none], proto: TCP (6), length: 52) 80.223.106.128.43771 > 62.203.174.236.24442: ., cksum 0xf9de (correct), ack 602134035 win 715 <nop,nop,timestamp 938786 2122980>
2007-07-12 12:02:39.309403 IP (tos 0x0, ttl  61, id 17083, offset 0, flags [none], proto: TCP (6), length: 263) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xaa10 (incorrect (-> 0xf1b3), 1227344691:1227344902(211) ack 602134035 win 715 <nop,nop,timestamp 938790 2122980>
2007-07-12 12:02:40.649923 IP (tos 0x0, ttl  61, id 17084, offset 0, flags [none], proto: TCP (6), length: 263) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xaa10 (incorrect (-> 0xec76), 1227344691:1227344902(211) ack 602134035 win 715 <nop,nop,timestamp 940131 2122980>
2007-07-12 12:02:41.148856 IP (tos 0x0, ttl 113, id 46591, offset 0, flags [DF], proto: TCP (6), length: 52) 62.203.174.236.24442 > 80.223.106.128.43771: ., cksum 0xb73b (correct), ack 1227344902 win 17109 <nop,nop,timestamp 2122998 938790>
2007-07-12 12:02:42.679961 IP (tos 0x0, ttl 113, id 46707, offset 0, flags [DF], proto: TCP (6), length: 484) 62.203.174.236.24442 > 80.223.106.128.43771: P, cksum 0x3390 (correct), 602134035:602134467(432) ack 1227344902 win 17109 <nop,nop,timestamp 2123014 938790>
2007-07-12 12:02:42.703122 IP (tos 0x0, ttl  61, id 17085, offset 0, flags [none], proto: TCP (6), length: 120) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xa981 (incorrect (-> 0xd5f6), 1227344902:1227344970(68) ack 602134467 win 849 <nop,nop,timestamp 942184 2123014>
2007-07-12 12:02:43.188971 IP (tos 0x0, ttl 113, id 46763, offset 0, flags [DF], proto: TCP (6), length: 120) 62.203.174.236.24442 > 80.223.106.128.43771: P, cksum 0x9271 (correct), 602134467:602134535(68) ack 1227344970 win 17041 <nop,nop,timestamp 2123019 942184>
2007-07-12 12:02:43.204691 IP (tos 0x0, ttl  61, id 17086, offset 0, flags [none], proto: TCP (6), length: 73) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xa952 (incorrect (-> 0xbfff), 1227344970:1227344991(21) ack 602134535 win 849 <nop,nop,timestamp 942685 2123019>
2007-07-12 12:02:43.783551 IP (tos 0x0, ttl 113, id 46818, offset 0, flags [DF], proto: TCP (6), length: 669) 62.203.174.236.24442 > 80.223.106.128.43771: P, cksum 0x12ae (correct), 602134535:602135152(617) ack 1227344991 win 17020 <nop,nop,timestamp 2123024 942685>
2007-07-12 12:02:43.783611 IP (tos 0x0, ttl  61, id 17087, offset 0, flags [none], proto: TCP (6), length: 556) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xab35 (incorrect (-> 0x7c3e), 1227344991:1227345495(504) ack 602135152 win 1004 <nop,nop,timestamp 943264 2123024>
2007-07-12 12:02:44.298747 IP (tos 0x0, ttl 113, id 46880, offset 0, flags [DF], proto: TCP (6), length: 172) 62.203.174.236.24442 > 80.223.106.128.43771: P, cksum 0x1323 (correct), 602135152:602135272(120) ack 1227345495 win 16516 <nop,nop,timestamp 2123030 943264>
2007-07-12 12:02:44.298779 IP (tos 0x0, ttl  61, id 17088, offset 0, flags [none], proto: TCP (6), length: 172) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xa9b5 (incorrect (-> 0x07d5), 1227345495:1227345615(120) ack 602135272 win 1004 <nop,nop,timestamp 943779 2123030>
2007-07-12 12:02:44.957682 IP (tos 0x0, ttl 113, id 46936, offset 0, flags [DF], proto: TCP (6), length: 52) 62.203.174.236.24442 > 80.223.106.128.43771: ., cksum 0xa072 (correct), ack 1227345615 win 16396 <nop,nop,timestamp 2123037 943779>
2007-07-12 12:02:44.957710 IP (tos 0x0, ttl  61, id 17089, offset 0, flags [none], proto: TCP (6), length: 94) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xa967 (incorrect (-> 0xcb33), 1227345615:1227345657(42) ack 602135272 win 1004 <nop,nop,timestamp 944438 2123037>
2007-07-12 12:02:45.607790 IP (tos 0x0, ttl 113, id 46991, offset 0, flags [DF], proto: TCP (6), length: 52) 62.203.174.236.24442 > 80.223.106.128.43771: ., cksum 0x98d3 (correct), ack 1227345657 win 17640 <nop,nop,timestamp 2123043 944438>
2007-07-12 12:02:46.323334 IP (tos 0x0, ttl 113, id 47054, offset 0, flags [DF], proto: TCP (6), length: 71) 62.203.174.236.24442 > 80.223.106.128.43771: P, cksum 0xea0c (correct), 602135272:602135291(19) ack 1227345657 win 17640 <nop,nop,timestamp 2123050 944438>
2007-07-12 12:02:46.362849 IP (tos 0x0, ttl  61, id 17090, offset 0, flags [none], proto: TCP (6), length: 52) 80.223.106.128.43771 > 62.203.174.236.24442: ., cksum 0xd437 (correct), ack 602135291 win 1004 <nop,nop,timestamp 945844 2123050>
2007-07-12 12:03:11.745201 IP (tos 0x0, ttl  61, id 17091, offset 0, flags [none], proto: TCP (6), length: 76) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xa955 (incorrect (-> 0xad2a), 1227345657:1227345681(24) ack 602135291 win 1004 <nop,nop,timestamp 970864 2123050>
2007-07-12 12:03:12.568928 IP (tos 0x0, ttl 113, id 49328, offset 0, flags [DF], proto: TCP (6), length: 52) 62.203.174.236.24442 > 80.223.106.128.43771: ., cksum 0x3079 (correct), ack 1227345681 win 17616 <nop,nop,timestamp 2123312 970864>
2007-07-12 12:03:14.454877 IP (tos 0x0, ttl  61, id 17094, offset 0, flags [none], proto: TCP (6), length: 792) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xac21 (incorrect (-> 0x7cc5), 1227345681:1227346421(740) ack 602135291 win 1004 <nop,nop,timestamp 973936 2123312>
2007-07-12 12:03:14.934510 IP (tos 0x0, ttl 113, id 49559, offset 0, flags [DF], proto: TCP (6), length: 52) 62.203.174.236.24442 > 80.223.106.128.43771: F, cksum 0x245c (correct), 602135291:602135291(0) ack 1227346421 win 16876 <nop,nop,timestamp 2123340 973936>
2007-07-12 12:03:14.934558 IP (tos 0x0, ttl  61, id 17095, offset 0, flags [none], proto: TCP (6), length: 52) 80.223.106.128.43771 > 62.203.174.236.24442: ., cksum 0x607c (correct), ack 602135292 win 1004 <nop,nop,timestamp 974415 2123340>
2007-07-12 12:03:17.077239 IP (tos 0x0, ttl  61, id 17092, offset 0, flags [none], proto: TCP (6), length: 792) 80.223.106.128.43771 > 62.203.174.236.24442: P, cksum 0xac21 (incorrect (-> 0x8423), 1227345681:1227346421(740) ack 602135291 win 1004 <nop,nop,timestamp 972050 2123312>
2007-07-12 12:03:17.410043 IP (tos 0x0, ttl 113, id 49773, offset 0, flags [DF], proto: TCP (6), length: 52) 62.203.174.236.24442 > 80.223.106.128.43771: ., cksum 0x2443 (correct), ack 1227346421 win 16876 <nop,nop,timestamp 2123365 973936>
2007-07-12 12:03:18.585016 IP (tos 0x0, ttl  61, id 17093, offset 0, flags [none], proto: TCP (6), length: 52) 80.223.106.128.43771 > 62.203.174.236.24442: F, cksum 0x6993 (correct), 1227346421:1227346421(0) ack 602135291 win 1004 <nop,nop,timestamp 972117 2123312>
2007-07-12 12:03:18.910310 IP (tos 0x0, ttl 113, id 49888, offset 0, flags [DF], proto: TCP (6), length: 52) 62.203.174.236.24442 > 80.223.106.128.43771: ., cksum 0x2434 (correct), ack 1227346421 win 16876 <nop,nop,timestamp 2123380 973936>
2007-07-12 12:03:20.381849 IP (tos 0x0, ttl  61, id 17096, offset 0, flags [none], proto: TCP (6), length: 52) 80.223.106.128.43771 > 62.203.174.236.24442: F, cksum 0x4b0c (correct), 1227346421:1227346421(0) ack 602135292 win 1004 <nop,nop,timestamp 979863 2123380>
2007-07-12 12:03:20.456165 IP (tos 0x0, ttl 113, id 50000, offset 0, flags [DF], proto: TCP (6), length: 52) 62.203.174.236.24442 > 80.223.106.128.43771: ., cksum 0x0cfd (correct), ack 1227346422 win 16876 <nop,nop,timestamp 2123395 979863>

> out the timestamp relation between the kernel log and a tcpdump log... 
> Anyway, for this debugging, you should use a debug version of this patch 
> with WARN_ON to get exact timestamp of the event since the leak print may 
> occur much later on, I put one available at 
> http://www.cs.helsinki.fi/u/ijjarvin/patches/ .

Well, haven't gotten Leaks anymore after applying the patch.

Thanks for quick action.

-- 
Do what you love because life is too short for anything else.



* Re: Linux 2.6.22: Leak r=1 1
  2007-07-12 10:04   ` Sami Farin
@ 2007-07-12 13:56     ` Ilpo Järvinen
  0 siblings, 0 replies; 8+ messages in thread
From: Ilpo Järvinen @ 2007-07-12 13:56 UTC (permalink / raw)
  To: Sami Farin; +Cc: David Miller, Linux Networking Mailing List


On Thu, 12 Jul 2007, Sami Farin wrote:

> On Thu, Jul 12, 2007 at 10:53:57 +0300, Ilpo Järvinen wrote:
> > On Wed, 11 Jul 2007, Sami Farin wrote:
> > 
> > > That's right, so descriptive is the new Linux kernel 2.6.22.
> > > Took a while to grep what is "leaking".
> > > 
> > > Linux safari.finland.fbi 2.6.22-cfs-v19 #3 SMP Tue Jul 10 00:22:25 EEST 2007 i686 i686 i386 GNU/Linux
> > > 
> > > Just normal Internet usage, azureus for example =)
> > > I think this is easy to trigger.
> > 
> > I guess those packet loss periods help you to reproduce it so easily.
> ...
> > I'd be interested to study some tcpdumps that relate to Leak cases you're 
> > seeing. Could you record some Sami? I'm not sure though how one can figure 
> 
> I now have 300 MB capture and several new&retarded music videos...
> And 10 WARNINGs and 0 Leak printk's.

Thanks, every warning would have led to a Leak print later on (not 
necessarily with a 1-to-1 relation, but every pending warning would be 
"acknowledged" by a single Leak print). So every time the WARNING 
triggers, an inconsistency would have resulted without the patch.

> 2007-07-12 12:03:18.910712500 <4>[ 1318.606826] WARNING: at net/ipv4/tcp_input.c:1402 tcp_enter_frto_loss()
...snip...
 
> This is MAYBE the guilty connection if timestamps are to be believed:
 
...snip...

I think you got the correct connection... Thanks. The problem seems to
be related to FIN (a case that wouldn't have occurred to me without your 
log, thanks again :-))... I think that the patch I suggested should be 
fine (and it fixes the fuzzy SACK block issues as well), though I still 
have trouble figuring out the exact path of execution on each 
ACK near the end of the connection (the sent packets are misplaced in the 
shown dump but the original order can be reconstructed from IP identifiers 
and TCP timestamps).

> Well, haven't gotten Leaks anymore after applying the patch.

I'd have been a bit surprised if they would have still been there with 
the patch...

-- 
 i.


* Re: Linux 2.6.22: Leak r=1 1
  2007-07-12  7:53 ` Ilpo Järvinen
  2007-07-12 10:04   ` Sami Farin
@ 2007-07-15  7:20   ` David Miller
  1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2007-07-15  7:20 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: safari-kernel, netdev

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 12 Jul 2007 10:53:57 +0300 (EEST)

> Dave can then put that one to net-2.6 and to stable too.
 ...
> [PATCH] [TCP]: Verify the presence of RETRANS bit when leaving FRTO
> 
> For yet unknown reason, something cleared SACKED_RETRANS bit
> underneath FRTO.
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied, and I'll push to -stable.


* Re: Linux 2.6.22: Leak r=1 1
  2007-07-11 17:40 Linux 2.6.22: Leak r=1 1 Sami Farin
  2007-07-11 19:01 ` Chuck Ebbert
  2007-07-12  7:53 ` Ilpo Järvinen
@ 2007-07-18  9:16 ` Ilpo Järvinen
  2007-07-18 21:39   ` Sami Farin
  2 siblings, 1 reply; 8+ messages in thread
From: Ilpo Järvinen @ 2007-07-18  9:16 UTC (permalink / raw)
  To: Sami Farin; +Cc: Linux Networking Mailing List, Pasi Sarolahti

On Wed, 11 Jul 2007, Sami Farin wrote:

> That's right, so descriptive is the new Linux kernel 2.6.22.
> 
> Linux safari.finland.fbi 2.6.22-cfs-v19 #3 SMP Tue Jul 10 00:22:25 EEST 2007 i686 i686 i386 GNU/Linux
> 
> [root@safari /proc/sys/net/ipv4]# grep . *

...snip...

> tcp_frto:1

...This is fully unrelated to the issue but I'm a bit curious who enabled 
frto on your machine (since it's disabled by default), did you do it by 
yourself or the distribution perhaps?

This is interesting because for frto to be useful on a large scale, the sender 
must have it enabled, and therefore it's usually not under the control of the host 
that is attached to the wireless access link. ...In case some 
distribution (yours) is already enabling it, the deployment is somewhat 
proceeding already :-)... ...And we also get incremental testing of the 
code too...

-- 
 i.


* Re: Linux 2.6.22: Leak r=1 1
  2007-07-18  9:16 ` Ilpo Järvinen
@ 2007-07-18 21:39   ` Sami Farin
  0 siblings, 0 replies; 8+ messages in thread
From: Sami Farin @ 2007-07-18 21:39 UTC (permalink / raw)
  To: Linux Networking Mailing List; +Cc: Ilpo Järvinen, Pasi Sarolahti

On Wed, Jul 18, 2007 at 12:16:56 +0300, Ilpo Järvinen wrote:
> On Wed, 11 Jul 2007, Sami Farin wrote:
> 
> > That's right, so descriptive is the new Linux kernel 2.6.22.
> > 
> > Linux safari.finland.fbi 2.6.22-cfs-v19 #3 SMP Tue Jul 10 00:22:25 EEST 2007 i686 i686 i386 GNU/Linux
> > 
> > [root@safari /proc/sys/net/ipv4]# grep . *
> 
> ...snip...
> 
> > tcp_frto:1
> 
> ...This is fully unrelated to the issue but I'm a bit curious who enabled 
> frto on your machine (since it's disabled by default), did you do it by 
> yourself or the distribution perhaps?

I enabled it by myself...

If you'd like to get more widespread testing,
try suggesting that the Fedora project add the tuning
to /etc/sysctl.conf (or something like that).
Maybe fedora-devel-list:
http://www.redhat.com/mailman/listinfo/fedora-devel-list
Note that they have antispam on that list which requires
that email address found on From: header field must be
subscribed to the list or otherwise your email is
devnulled.

-- 
Do what you love because life is too short for anything else.


