Re: 2.6.23-rc8-mm2 - drivers/net/ibm_newemac/mal

netdev.vger.kernel.org archive mirror

* Re: 2.6.23-rc8-mm2 - drivers/net/ibm_newemac/mal - broken
       [not found] <20070927022220.c76a7a6e.akpm@linux-foundation.org>
@ 2007-09-27 10:52 ` Kamalesh Babulal
  2007-09-28 15:42 ` 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING Cedric Le Goater
  2007-09-28 16:30 ` /proc/net/ bad hard links count [Was: 2.6.23-rc8-mm2] Jiri Slaby
  2 siblings, 0 replies; 12+ messages in thread
From: Kamalesh Babulal @ 2007-09-27 10:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, netdev, eugene.surovegin

Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc8/2.6.23-rc8-mm2/

Hi Andrew,

drivers/net/ibm_newemac/mal also seems to be broken in 2.6.23-rc8-mm2; the same
problem was reported against 2.6.23-rc8-mm1 (http://lkml.org/lkml/2007/9/25/173).


-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.


* Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
       [not found] <20070927022220.c76a7a6e.akpm@linux-foundation.org>
  2007-09-27 10:52 ` 2.6.23-rc8-mm2 - drivers/net/ibm_newemac/mal - broken Kamalesh Babulal
@ 2007-09-28 15:42 ` Cedric Le Goater
  2007-09-28 19:10   ` Ilpo Järvinen
  2007-09-28 16:30 ` /proc/net/ bad hard links count [Was: 2.6.23-rc8-mm2] Jiri Slaby
  2 siblings, 1 reply; 12+ messages in thread
From: Cedric Le Goater @ 2007-09-28 15:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, netdev, Ilpo Järvinen, David Miller

Hello ! 

Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc8/2.6.23-rc8-mm2/

I just found that warning in my logs. It seems that it's been 
happening since rc7-mm1 at least. 

Thanks !

C.

WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert()

Call Trace:
 <IRQ>  [<ffffffff8040fdc3>] tcp_ack+0xcd6/0x1894
 [<ffffffff80411c79>] tcp_data_queue+0x5be/0xae7
 [<ffffffff80412b54>] tcp_rcv_established+0x61f/0x6df
 [<ffffffff80254146>] __lock_acquire+0x8a1/0xf1b
 [<ffffffff80419cfd>] tcp_v4_do_rcv+0x3e/0x394
 [<ffffffff8041a66f>] tcp_v4_rcv+0x61c/0x9a9
 [<ffffffff803ff1e3>] ip_local_deliver+0x1da/0x2a4
 [<ffffffff803ffb4e>] ip_rcv+0x583/0x5c9
 [<ffffffff8046d33f>] packet_rcv_spkt+0x19a/0x1a8
 [<ffffffff803e081c>] netif_receive_skb+0x2cf/0x2f5
 [<ffffffff88042505>] :tg3:tg3_poll+0x65d/0x8a4
 [<ffffffff803e09e8>] net_rx_action+0xb8/0x191
 [<ffffffff8023a927>] __do_softirq+0x5f/0xe0
 [<ffffffff8020c98c>] call_softirq+0x1c/0x28
 [<ffffffff8020e9c3>] do_softirq+0x3b/0xb8
 [<ffffffff8023aa1e>] irq_exit+0x4e/0x50
 [<ffffffff8020e7df>] do_IRQ+0xbd/0xd7
 [<ffffffff80209cb9>] mwait_idle+0x0/0x4d
 [<ffffffff8020bce6>] ret_from_intr+0x0/0xf
 <EOI>  [<ffffffff80209cfc>] mwait_idle+0x43/0x4d
 [<ffffffff802099fb>] enter_idle+0x22/0x24
 [<ffffffff80209c4f>] cpu_idle+0x9d/0xc0
 [<ffffffff80476a91>] rest_init+0x55/0x57
 [<ffffffff80630815>] start_kernel+0x2d6/0x2e2
 [<ffffffff80630134>] _sinittext+0x134/0x13b


* /proc/net/ bad hard links count [Was: 2.6.23-rc8-mm2]
       [not found] <20070927022220.c76a7a6e.akpm@linux-foundation.org>
  2007-09-27 10:52 ` 2.6.23-rc8-mm2 - drivers/net/ibm_newemac/mal - broken Kamalesh Babulal
  2007-09-28 15:42 ` 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING Cedric Le Goater
@ 2007-09-28 16:30 ` Jiri Slaby
  2007-09-28 17:03   ` Eric W. Biederman
  2 siblings, 1 reply; 12+ messages in thread
From: Jiri Slaby @ 2007-09-28 16:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, netdev, Eric Biederman

On 09/27/2007 11:22 AM, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc8/2.6.23-rc8-mm2/

# find /proc >/dev/null
find: WARNING: Hard link count is wrong for /proc/net: this may be a bug in your
filesystem driver.  Automatically turning on find's -noleaf option.  Earlier
results may have failed to include directories that should have been searched.
# stat net
  File: `net'
  Size: 0               Blocks: 0          IO Block: 1024   directory
Device: 3h/3d   Inode: 4026531864  Links: 2
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2007-09-28 18:21:24.651209759 +0200
Modify: 2007-09-28 18:21:24.651209759 +0200
Change: 2007-09-28 18:21:24.651209759 +0200
# stat net/
  File: `net/'
  Size: 0               Blocks: 0          IO Block: 1024   directory
Device: 3h/3d   Inode: 4026531909  Links: 4
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2007-09-28 18:26:48.813048220 +0200
Modify: 2007-09-28 18:26:48.813048220 +0200
Change: 2007-09-28 18:26:48.813048220 +0200

hmm, this is some kind of weirdness :)

regards,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University



* Re: /proc/net/ bad hard links count [Was: 2.6.23-rc8-mm2]
  2007-09-28 16:30 ` /proc/net/ bad hard links count [Was: 2.6.23-rc8-mm2] Jiri Slaby
@ 2007-09-28 17:03   ` Eric W. Biederman
  0 siblings, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2007-09-28 17:03 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Andrew Morton, linux-kernel, netdev, Linux Containers

Jiri Slaby <jirislaby@gmail.com> writes:

> On 09/27/2007 11:22 AM, Andrew Morton wrote:
>>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc8/2.6.23-rc8-mm2/

Yep.

> # find /proc >/dev/null
> find: WARNING: Hard link count is wrong for /proc/net: this may be a bug in your
> filesystem driver.  Automatically turning on find's -noleaf option.  Earlier
> results may have failed to include directories that should have been searched.
> # stat net
>   File: `net'
>   Size: 0               Blocks: 0          IO Block: 1024   directory
> Device: 3h/3d   Inode: 4026531864  Links: 2
> Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
> Access: 2007-09-28 18:21:24.651209759 +0200
> Modify: 2007-09-28 18:21:24.651209759 +0200
> Change: 2007-09-28 18:21:24.651209759 +0200
> # stat net/
>   File: `net/'
>   Size: 0               Blocks: 0          IO Block: 1024   directory
> Device: 3h/3d   Inode: 4026531909  Links: 4
> Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
> Access: 2007-09-28 18:26:48.813048220 +0200
> Modify: 2007-09-28 18:26:48.813048220 +0200
> Change: 2007-09-28 18:26:48.813048220 +0200
>
> hmm, this is some kind of weirdness :)

Yes.

I can explain it. For the network namespace stuff we need special handling
of /proc/net so that depending on the network namespace we are resolving
against you see a different behavior.  So you actually are observing
two different directories, one being a magic invisible symlink to the
other.

Currently I am resolving against current, which has a number of
limitations and causes the weird, ugly effect you are currently seeing.

So it looks like I need to either make /proc/net a symlink to
/proc/self/net or make the network namespace something that we capture
at mount time of /proc.

This was my "don't get hung up on this implementation detail" version.
Thanks for pointing out it has user visible problems.  I will see
what I can do to resolve this.
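
The "two different directories" effect can be observed from userspace by comparing the inode behind each spelling of the path. This is a minimal sketch, assuming that lstat() on both spellings mirrors what stat(1) does by default; same_inode() is a hypothetical helper name.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Compare two path spellings using lstat(), which does not follow a
 * final symlink. Returns 1 if both name the same inode on the same
 * device, 0 if they differ, and -1 if either path cannot be examined.
 */
int same_inode(const char *a, const char *b)
{
	struct stat sa, sb;

	if (lstat(a, &sa) != 0 || lstat(b, &sb) != 0)
		return -1;

	printf("%s: inode %llu, links %lu\n", a,
	       (unsigned long long)sa.st_ino, (unsigned long)sa.st_nlink);
	printf("%s: inode %llu, links %lu\n", b,
	       (unsigned long long)sb.st_ino, (unsigned long)sb.st_nlink);

	return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Judging from the stat output quoted above (inodes 4026531864 and 4026531909), same_inode("/proc/net", "/proc/net/") would be expected to return 0 on the kernel in this report.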

Eric


* Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
  2007-09-28 15:42 ` 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING Cedric Le Goater
@ 2007-09-28 19:10   ` Ilpo Järvinen
  2007-09-29 12:44     ` Ilpo Järvinen
  0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2007-09-28 19:10 UTC (permalink / raw)
  To: Cedric Le Goater; +Cc: Andrew Morton, LKML, Netdev, David Miller

On Fri, 28 Sep 2007, Cedric Le Goater wrote:

> Hello ! 
> 
> Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc8/2.6.23-rc8-mm2/
> 
> I just found that warning in my logs. It seems that it's been 
> happening since rc7-mm1 at least. 
> 
> Thanks !
> 
> C.
> 
> WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert()
>
> Call Trace:
>  <IRQ>  [<ffffffff8040fdc3>] tcp_ack+0xcd6/0x1894
> ...snip...

...Thanks for the report, I'll have a look at what could still break 
fackets_out...

-- 
 i.


* Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
  2007-09-28 19:10   ` Ilpo Järvinen
@ 2007-09-29 12:44     ` Ilpo Järvinen
  2007-09-29 14:55       ` Cedric Le Goater
  0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2007-09-29 12:44 UTC (permalink / raw)
  To: Cedric Le Goater; +Cc: Andrew Morton, LKML, Netdev, David Miller

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1533 bytes --]

On Fri, 28 Sep 2007, Ilpo Järvinen wrote:
> On Fri, 28 Sep 2007, Cedric Le Goater wrote:
>
> > I just found that warning in my logs. It seems that it's been 
> > happening since rc7-mm1 at least. 
> > 
> > WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert()
> >
> > Call Trace:
> >  <IRQ>  [<ffffffff8040fdc3>] tcp_ack+0xcd6/0x1894
> > ...snip...
> 
> ...Thanks for the report, I'll have look what could still break 
> fackets_out...

I think this one is now clear to me: tcp_fragment/collapse adjusts 
fackets_out (incorrectly) for reno flows as well when there were some dupACKs 
that made sacked_out != 0. Could you please check whether the patch below 
shows all of them to be of non-SACK origin? If so, it's rather 
harmless, and I'll send a fix on Monday or so (it would be needed anyway). 
If you find that they occur with a SACK-enabled flow, that would be
more interesting and would require more digging...

-- 
 i.



diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2286361..e642779 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2311,8 +2311,10 @@ tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
 	if (!tp->packets_out)
 		tp->sacked_out = 0;
 
-	if (WARN_ON(!tp->sacked_out && tp->fackets_out))
+	if (WARN_ON(!tp->sacked_out && tp->fackets_out)) {
+		printk(KERN_ERR "TCP %d\n", tcp_is_reno(tp));
 		tp->fackets_out = 0;
+	}
 
 	/* Now state machine starts.
 	 * A. ECE, hence prohibit cwnd undoing, the reduction is required. */


* Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
  2007-09-29 12:44     ` Ilpo Järvinen
@ 2007-09-29 14:55       ` Cedric Le Goater
  2007-09-29 20:49         ` Ilpo Järvinen
  0 siblings, 1 reply; 12+ messages in thread
From: Cedric Le Goater @ 2007-09-29 14:55 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Andrew Morton, LKML, Netdev, David Miller

Ilpo Järvinen wrote:
> On Fri, 28 Sep 2007, Ilpo Järvinen wrote:
>> On Fri, 28 Sep 2007, Cedric Le Goater wrote:
>>
>>> I just found that warning in my logs. It seems that it's been 
>>> happening since rc7-mm1 at least. 
>>>
>>> WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert()
>>>
>>> Call Trace:
>>>  <IRQ>  [<ffffffff8040fdc3>] tcp_ack+0xcd6/0x1894
>>> ...snip...
>> ...Thanks for the report, I'll have look what could still break 
>> fackets_out...
> 
> I think this one is now clear to me, tcp_fragment/collapse adjusts 
> fackets_out (incorrectly) also for reno flow when there were some dupACKs 
> that made sacked_out != 0. Could you please try if patch below proves all 
> them to be of non-SACK origin... In case that's true, it's rather 
> harmless, I'll send a fix on Monday or so (this would anyway be needed)... 
> If you find out that them occur with SACK enabled flow, that would be
> more interesting and requires more digging...

I'm trying now to reproduce this WARNING. 

It seems that the n/w behaves differently during weekends. Probably
taking a break. 

C.


* Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
  2007-09-29 14:55       ` Cedric Le Goater
@ 2007-09-29 20:49         ` Ilpo Järvinen
  2007-10-01  9:26           ` Cedric Le Goater
  0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2007-09-29 20:49 UTC (permalink / raw)
  To: Cedric Le Goater; +Cc: Andrew Morton, LKML, Netdev, David Miller

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2047 bytes --]

On Sat, 29 Sep 2007, Cedric Le Goater wrote:

> Ilpo Järvinen wrote:
> > On Fri, 28 Sep 2007, Ilpo Järvinen wrote:
> >> On Fri, 28 Sep 2007, Cedric Le Goater wrote:
> >>
> >>> I just found that warning in my logs. It seems that it's been 
> >>> happening since rc7-mm1 at least. 
> >>>
> >>> WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert()
> >>>
> >>> Call Trace:
> >>>  <IRQ>  [<ffffffff8040fdc3>] tcp_ack+0xcd6/0x1894
> >>> ...snip...
> >> ...Thanks for the report, I'll have look what could still break 
> >> fackets_out...
> > 
> > I think this one is now clear to me, tcp_fragment/collapse adjusts 
> > fackets_out (incorrectly) also for reno flow when there were some dupACKs 
> > that made sacked_out != 0. Could you please try if patch below proves all 
> > them to be of non-SACK origin... In case that's true, it's rather 
> > harmless, I'll send a fix on Monday or so (this would anyway be needed)... 
> > If you find out that them occur with SACK enabled flow, that would be
> > more interesting and requires more digging...
> 
> I'm trying now to reproduce this WARNING. 
> 
> It seems that the n/w behaves differently during the week ends. Probably
> taking a break. 

Thanks.

Of course there are other ways to determine whether the TCP flows negotiate 
SACK or not. Depending on your test case (which is completely unknown 
to me) they may or may not be usable... At the least, the value of the tcp_sack 
sysctl on both systems, or a tcpdump capture of the SYN packets, should give 
that detail. If you know which hosts TCP could be connected to (and active 
with) when the WARNING triggers, it's really easy to test what is being 
negotiated: it's unlikely to change at short notice, and any TCP flow to 
that host will give us the same information, even though that flow would not 
trigger the WARNING at this time. Obviously, if at least one of the remotes 
is not known, or the set ends up being a mixture of reno and SACK flows, then 
we'll just have to wait and see which fish we get...
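
Reading the tcp_sack sysctl mentioned above can be done from a small program as well as from the shell. This is a minimal sketch; read_sysctl_int() is a hypothetical helper, and it treats any unreadable or non-numeric file as an error by returning -1, which is adequate here only because tcp_sack is never negative.

```c
#include <stdio.h>

/*
 * Read a single integer from a sysctl-style file such as
 * /proc/sys/net/ipv4/tcp_sack. Returns the value, or -1 if the file
 * cannot be opened or does not start with an integer.
 */
int read_sysctl_int(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

A nonzero result from read_sysctl_int("/proc/sys/net/ipv4/tcp_sack") means the local host offers SACK; the remote side still has to agree during the handshake, so the sysctl alone does not prove that a given flow uses SACK.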

-- 
 i.


* Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
  2007-09-29 20:49         ` Ilpo Järvinen
@ 2007-10-01  9:26           ` Cedric Le Goater
  2007-10-02 10:26             ` Ilpo Järvinen
  0 siblings, 1 reply; 12+ messages in thread
From: Cedric Le Goater @ 2007-10-01  9:26 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Andrew Morton, LKML, Netdev, David Miller

Ilpo Järvinen wrote:
> On Sat, 29 Sep 2007, Cedric Le Goater wrote:
> 
>> Ilpo Järvinen wrote:
>>> On Fri, 28 Sep 2007, Ilpo Järvinen wrote:
>>>> On Fri, 28 Sep 2007, Cedric Le Goater wrote:
>>>>
>>>>> I just found that warning in my logs. It seems that it's been 
>>>>> happening since rc7-mm1 at least. 
>>>>>
>>>>> WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert()
>>>>>
>>>>> Call Trace:
>>>>>  <IRQ>  [<ffffffff8040fdc3>] tcp_ack+0xcd6/0x1894
>>>>> ...snip...
>>>> ...Thanks for the report, I'll have look what could still break 
>>>> fackets_out...
>>> I think this one is now clear to me, tcp_fragment/collapse adjusts 
>>> fackets_out (incorrectly) also for reno flow when there were some dupACKs 
>>> that made sacked_out != 0. Could you please try if patch below proves all 
>>> them to be of non-SACK origin... In case that's true, it's rather 
>>> harmless, I'll send a fix on Monday or so (this would anyway be needed)... 
>>> If you find out that them occur with SACK enabled flow, that would be
>>> more interesting and requires more digging...
>> I'm trying now to reproduce this WARNING. 
>>
>> It seems that the n/w behaves differently during the week ends. Probably
>> taking a break. 
> 
> Thanks.
> 
> Of course there are other means too to determine if TCP flows do negotiate 
> SACK enabled or not. Depending on your test case (which is fully unknown 
> to me) they may or may not be usable... At least the value of tcp_sack 
> sysctl on both systems or tcpdump catching SYN packets should give that 
> detail. ...If you know to which hosts TCP could be connected (and active) 
> to, while the WARNING triggers, it's really easy to test what is being 
> negotiated as it's unlikely to change at short notice and any TCP flow to 
> that host will get us the same information though the WARNING would not be 
> triggered with it at this time. Obviously if at least one of the remotes 
> is not known or the set ends up being mixture of reno and SACK flows, then 
> we'll just have to wait and see which fish we get...
 
got it !

r3-06.test.meiosys.com login: WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert()

Call Trace:
 <IRQ>  [<ffffffff8040fdc3>] tcp_ack+0xcd6/0x18af
 [<ffffffff80412b6f>] tcp_rcv_established+0x61f/0x6df
 [<ffffffff80254146>] __lock_acquire+0x8a1/0xf1b
 [<ffffffff80419d19>] tcp_v4_do_rcv+0x3e/0x394
 [<ffffffff8041a68b>] tcp_v4_rcv+0x61c/0x9a9
 [<ffffffff803ff1e3>] ip_local_deliver+0x1da/0x2a4
 [<ffffffff803ffb4e>] ip_rcv+0x583/0x5c9
 [<ffffffff8046d35b>] packet_rcv_spkt+0x19a/0x1a8
 [<ffffffff803e081c>] netif_receive_skb+0x2cf/0x2f5
 [<ffffffff88042505>] :tg3:tg3_poll+0x65d/0x8a4
 [<ffffffff803e09e8>] net_rx_action+0xb8/0x191
 [<ffffffff8023a927>] __do_softirq+0x5f/0xe0
 [<ffffffff8020c98c>] call_softirq+0x1c/0x28
 [<ffffffff8020e9c3>] do_softirq+0x3b/0xb8
 [<ffffffff8023aa1e>] irq_exit+0x4e/0x50
 [<ffffffff8020e7df>] do_IRQ+0xbd/0xd7
 [<ffffffff80209cb9>] mwait_idle+0x0/0x4d
 [<ffffffff8020bce6>] ret_from_intr+0x0/0xf
 <EOI>  [<ffffffff80209cfc>] mwait_idle+0x43/0x4d
 [<ffffffff802099fb>] enter_idle+0x22/0x24
 [<ffffffff80209c4f>] cpu_idle+0x9d/0xc0
 [<ffffffff80476aa1>] rest_init+0x55/0x57
 [<ffffffff80630815>] start_kernel+0x2d6/0x2e2
 [<ffffffff80630134>] _sinittext+0x134/0x13b

TCP 0


I wasn't running any particular network test, so it took me a while to figure 
out how I was triggering the WARNING. Apparently it happens when I 
run ketchup, but not always. This test machine is behind many firewalls and 
routers, which might be a reason.

tcpdump gave me this output for a wget on kernel.org:

10:51:14.835981 IP r3-06.test.meiosys.com.40322 > pub2.kernel.org.http: S 737836267:737836267(0) win 5840 <mss 1460,sackOK,timestamp 1309245 0,nop,wscale 7>
10:51:14.975153 IP pub2.kernel.org.http > r3-06.test.meiosys.com.40321: F 524:524(0) ack 166 win 5840
10:51:14.975177 IP r3-06.test.meiosys.com.40321 > pub2.kernel.org.http: . ack 525 win 7504
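
The sackOK shown in the SYN above is how tcpdump prints the TCP SACK-permitted option (kind 4, length 2). As a minimal sketch of how that option could be detected in a raw option block, assuming the caller has already located the options area of the TCP header; tcp_options_have_sack_permitted() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scan a TCP options block for the SACK-permitted option.
 * Returns 1 if present, 0 if absent, -1 if the block is malformed.
 */
int tcp_options_have_sack_permitted(const uint8_t *opts, size_t len)
{
	size_t i = 0;

	while (i < len) {
		uint8_t kind = opts[i];
		uint8_t optlen;

		if (kind == 0)		/* end of option list */
			break;
		if (kind == 1) {	/* NOP, a single padding byte */
			i++;
			continue;
		}

		/* Every other option carries a length byte. */
		if (i + 1 >= len)
			return -1;
		optlen = opts[i + 1];
		if (optlen < 2 || i + optlen > len)
			return -1;

		if (kind == 4 && optlen == 2)	/* SACK permitted */
			return 1;
		i += optlen;
	}
	return 0;
}
```

For the option list in the SYN above (mss, sackOK, timestamp, nop, wscale) this returns 1. A SYN carrying sackOK only shows that this side offered SACK; the SYN-ACK must carry it too for the flow to use SACK.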

I'm trying to get the WARNING and the tcpdump output for it but for the
moment, it seems it's beyond my reach :/

Hope it helps !

C. 



* Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
  2007-10-01  9:26           ` Cedric Le Goater
@ 2007-10-02 10:26             ` Ilpo Järvinen
  2007-10-02 20:06               ` Ilpo Järvinen
  0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2007-10-02 10:26 UTC (permalink / raw)
  To: Cedric Le Goater; +Cc: Andrew Morton, LKML, Netdev, David Miller

On Mon, 1 Oct 2007, Cedric Le Goater wrote:

> got it !
> 
> r3-06.test.meiosys.com login: WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert()
> 
> Call Trace:
>  <IRQ>  [<ffffffff8040fdc3>] tcp_ack+0xcd6/0x18af
[...snip...]
> 
> TCP 0

Hmm, so it's SACK then... 

> I wasn't doing any particular test on n/w so it took me a while to figure 
> out how I was triggering the WARNING. Apparently, this is happening when I 
> run ketchup, but not always. This test machine is behind many firewall & 
> routers so it might be a reason.
>
> I'm trying to get the WARNING and the tcpdump output for it but for the
> moment, it seems it's beyond my reach :/

I'm currently out of ideas about where it could come from... so let's try 
brute-force checking, as your test case is not very high-speed... This 
could hide it though... :-(

Please put the patch below on top of a clean rc8-mm2 (it includes the patch
I gave you last time) and try to reproduce. These counter bugs can
survive for some time until the !sacked_out condition occurs, so the patch
below tries to catch the inconsistency when it first occurs, 
regardless of sacked_out (I also removed some statics, which hopefully 
reduces compiler inlining and makes the output easier to read). I tried this 
myself (except for the verify()s in the frto funcs and minor printout 
modifications); it didn't trigger for me.
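
The idea behind the brute-force check can be illustrated outside the kernel: walk the queue, recount the packets up to and including the highest SACKed segment, and compare the result with the stored counter. The sketch below is a simplified userspace model, not the kernel code; struct seg and both function names are hypothetical, and it locates the highest SACKed segment by a per-segment flag rather than by comparing sequence numbers against highest_sack as the patch does.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one segment on the write queue. */
struct seg {
	unsigned int pcount;	/* number of packets this segment counts as */
	int sacked;		/* nonzero if the segment has been SACKed */
};

/*
 * Recount fackets_out for the model: the number of packets from the
 * head of the queue through the highest SACKed segment, or 0 if no
 * segment is SACKed.
 */
unsigned int recount_fackets(const struct seg *q, size_t n)
{
	unsigned int running = 0;
	unsigned int fackets = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		running += q[i].pcount;
		if (q[i].sacked)
			fackets = running;
	}
	return fackets;
}

/* Returns 1 if the stored counter agrees with the recount, else 0. */
int fackets_consistent(const struct seg *q, size_t n, unsigned int stored)
{
	return recount_fackets(q, n) == stored;
}
```

In this model a queue with nothing SACKed must have a stored counter of 0, which corresponds to the condition behind the WARNING in tcp_fastretrans_alert().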

-- 
 i.

---
 include/net/tcp.h     |    3 +
 net/ipv4/tcp_input.c  |   23 +++++++++--
 net/ipv4/tcp_ipv4.c   |  102 +++++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp_output.c |    6 ++-
 4 files changed, 128 insertions(+), 6 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 991ccdc..54a0d91 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -43,6 +43,9 @@
 
 #include <linux/seq_file.h>
 
+extern void tcp_verify_fackets(struct sock *sk);
+extern void tcp_print_queue(struct sock *sk);
+
 extern struct inet_hashinfo tcp_hashinfo;
 
 extern atomic_t tcp_orphan_count;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e22ffe7..1d7367d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1140,7 +1140,7 @@ static int tcp_check_dsack(struct tcp_sock *tp, struct sk_buff *ack_skb,
 	return dup_sack;
 }
 
-static int
+int
 tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_una)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
@@ -1160,6 +1160,8 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 	int first_sack_index;
 
 	if (!tp->sacked_out) {
+		if (WARN_ON(tp->fackets_out))
+			tcp_print_queue(sk);
 		tp->fackets_out = 0;
 		tp->highest_sack = tp->snd_una;
 	}
@@ -1420,6 +1422,7 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 			}
 		}
 	}
+	tcp_verify_fackets(sk);
 
 	/* Check for lost retransmit. This superb idea is
 	 * borrowed from "ratehalving". Event "C".
@@ -1632,13 +1635,14 @@ void tcp_enter_frto(struct sock *sk)
 	tcp_set_ca_state(sk, TCP_CA_Disorder);
 	tp->high_seq = tp->snd_nxt;
 	tp->frto_counter = 1;
+	tcp_verify_fackets(sk);
 }
 
 /* Enter Loss state after F-RTO was applied. Dupack arrived after RTO,
  * which indicates that we should follow the traditional RTO recovery,
  * i.e. mark everything lost and do go-back-N retransmission.
  */
-static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
+void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
@@ -1675,6 +1679,7 @@ static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 		}
 	}
 	tcp_verify_left_out(tp);
+	tcp_verify_fackets(sk);
 
 	tp->snd_cwnd = tcp_packets_in_flight(tp) + allowed_segments;
 	tp->snd_cwnd_cnt = 0;
@@ -1753,6 +1758,7 @@ void tcp_enter_loss(struct sock *sk, int how)
 		}
 	}
 	tcp_verify_left_out(tp);
+	tcp_verify_fackets(sk);
 
 	tp->reordering = min_t(unsigned int, tp->reordering,
 					     sysctl_tcp_reordering);
@@ -2308,7 +2314,7 @@ static void tcp_mtup_probe_success(struct sock *sk, struct sk_buff *skb)
  * It does _not_ decide what to send, it is made in function
  * tcp_xmit_retransmit_queue().
  */
-static void
+void
 tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
@@ -2322,8 +2328,11 @@ tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
 	if (!tp->packets_out)
 		tp->sacked_out = 0;
 
-	if (WARN_ON(!tp->sacked_out && tp->fackets_out))
+	if (WARN_ON(!tp->sacked_out && tp->fackets_out)) {
+		printk(KERN_ERR "TCP %d\n", tcp_is_reno(tp));
+		tcp_print_queue(sk);
 		tp->fackets_out = 0;
+	}
 
 	/* Now state machine starts.
 	 * A. ECE, hence prohibit cwnd undoing, the reduction is required. */
@@ -2333,6 +2342,8 @@ tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
 	/* B. In all the states check for reneging SACKs. */
 	if (tp->sacked_out && tcp_check_sack_reneging(sk))
 		return;
+	
+	tcp_verify_fackets(sk);
 
 	/* C. Process data loss notification, provided it is valid. */
 	if ((flag&FLAG_DATA_LOST) &&
@@ -2572,7 +2583,7 @@ static u32 tcp_tso_acked(struct sock *sk, struct sk_buff *skb)
  * is before the ack sequence we can discard it as it's confirmed to have
  * arrived at the other end.
  */
-static int tcp_clean_rtx_queue(struct sock *sk, s32 *seq_rtt_p)
+int tcp_clean_rtx_queue(struct sock *sk, s32 *seq_rtt_p)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	const struct inet_connection_sock *icsk = inet_csk(sk);
@@ -2694,6 +2705,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, s32 *seq_rtt_p)
 			ca_ops->pkts_acked(sk, pkts_acked, rtt_us);
 		}
 	}
+	tcp_verify_fackets(sk);
+
 
 #if FASTRETRANS_DEBUG > 0
 	BUG_TRAP((int)tp->sacked_out >= 0);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7fed0a6..8b18757 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -108,6 +108,108 @@ struct inet_hashinfo __cacheline_aligned tcp_hashinfo = {
 	.lhash_wait  = __WAIT_QUEUE_HEAD_INITIALIZER(tcp_hashinfo.lhash_wait),
 };
 
+void tcp_print_queue(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct sk_buff *skb;
+	char s[50+1];
+	char i[50+1];
+	int idx = 0;
+	u32 hs = tp->highest_sack;
+	
+	if (!tp->sacked_out)
+		hs = tp->snd_una;
+	
+	tcp_for_write_queue(skb, sk) {
+		if (skb == tcp_send_head(sk))
+			break;
+		
+		if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
+			if (skb->len < tp->mss_cache)
+				s[idx] = 's';
+			else
+				s[idx] = 'S';
+		} else {
+			s[idx] = '-';
+		}
+		if ((TCP_SKB_CB(skb)->seq == hs) && (tp->fastpath_skb_hint == skb))
+			i[idx] = 'x';
+		else if (tp->fastpath_skb_hint == skb)
+			i[idx] = 'f';
+		else if (TCP_SKB_CB(skb)->seq == hs)
+			i[idx] = 'h';
+		else
+			i[idx] = ' ';
+			
+		if (++idx >= 50) {
+			s[idx] = 0;
+			i[idx] = 0;
+			printk(KERN_ERR "TCP wq(s) %s\n", s);
+			printk(KERN_ERR "TCP wq(i) %s\n", i);
+			idx = 0;
+		}
+	}
+	if (idx) {
+		s[idx] = '<';
+		s[idx+1] = 0;
+		i[idx] = '<';
+		i[idx+1] = 0;
+		printk(KERN_ERR "TCP wq(s) %s\n", s);
+		printk(KERN_ERR "TCP wq(i) %s\n", i);
+	}
+	printk(KERN_ERR "s%u f%u p%u seq: su%u hs%u sn%u (%u)\n",
+		tp->sacked_out, tp->fackets_out, tp->packets_out,
+		tp->snd_una, tp->highest_sack, tp->snd_nxt,
+		((tp->fastpath_skb_hint == NULL) ? 0 :
+			TCP_SKB_CB(tp->fastpath_skb_hint)->seq));
+}
+
+void tcp_verify_fackets(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct sk_buff *skb;
+	u32 fackets = 0;
+	int hisack_valid = 0;
+	int err = 0;
+	
+	if (tcp_is_reno(tp))
+		return;
+	
+	if (!tp->sacked_out) {
+		if (WARN_ON(tp->fackets_out))
+			err = 1;
+		else if (tp->fastpath_skb_hint == NULL)
+			return;
+	}
+	
+	/* ...expensive processing here... */
+	tcp_for_write_queue(skb, sk) {
+		if (skb == tcp_send_head(sk))
+			break;
+
+		if (tp->sacked_out && (TCP_SKB_CB(skb)->seq == tp->highest_sack)) {
+			hisack_valid = 1;
+			if (WARN_ON(tp->fackets_out != fackets + tcp_skb_pcount(skb)))
+				err = 1;
+		}
+
+		if (skb == tp->fastpath_skb_hint)
+			if (WARN_ON(fackets != tp->fastpath_cnt_hint))
+				err = 1;
+
+		if (WARN_ON((fackets > tp->fackets_out) && (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED)))
+			err = 1;
+
+		fackets += tcp_skb_pcount(skb);
+	}
+	
+	if (WARN_ON(tp->sacked_out && !hisack_valid))
+		err = 1;
+	
+	if (err)
+		tcp_print_queue(sk);
+}
+
 static int tcp_v4_get_port(struct sock *sk, unsigned short snum)
 {
 	return inet_csk_get_port(&tcp_hashinfo, sk, snum,
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6199abe..4c70caf 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -773,6 +773,8 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss
 			tcp_verify_left_out(tp);
 		}
 		tcp_adjust_fackets_out(tp, skb, diff);
+		
+		tcp_verify_fackets(sk);
 	}
 
 	/* Link BUFF into the send queue. */
@@ -1688,7 +1690,7 @@ u32 __tcp_select_window(struct sock *sk)
 }
 
 /* Attempt to collapse two adjacent SKB's during retransmission. */
-static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int mss_now)
+void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int mss_now)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *next_skb = tcp_write_queue_next(sk, skb);
@@ -1764,6 +1766,8 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int m
 		if (tp->fastpath_skb_hint == next_skb)
 			tp->fastpath_skb_hint = skb;
 
+		tcp_verify_fackets(sk);
+
 		sk_stream_free_skb(sk, next_skb);
 	}
 }
-- 
1.5.0.6



* Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
  2007-10-02 10:26             ` Ilpo Järvinen
@ 2007-10-02 20:06               ` Ilpo Järvinen
  2007-10-02 21:48                 ` Ilpo Järvinen
  0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2007-10-02 20:06 UTC (permalink / raw)
  To: Cedric Le Goater; +Cc: Andrew Morton, LKML, Netdev, David Miller

[-- Attachment #1: Type: TEXT/PLAIN, Size: 9192 bytes --]

On Tue, 2 Oct 2007, Ilpo Järvinen wrote:

> I'm currently out of ideas where it could come from... so lets try 
> brute-force checking as your test case is not very high-speed... This 
> could hide it though... :-(
> 
> Please put the patch below on top of clean rc8-mm2 (it includes the patch
> I gave you last time) and try to reproduce.... These counter bugs can
> survive for sometime until !sacked_out condition occurs, so the patch
> below tries to find that out when inconsisteny occurs for the first time 
> regardless of sacked_out (I also removed some statics which hopefully 
> reduces compiler inlining for easier reading of the output). I tried this 
> myself (except for verify()s in frto funcs and minor printout 
> modifications), didn't trigger for me.

In case you haven't got started yet (or it's easy enough to replace), 
please use the one below instead (I forgot one counter from the printout
in the last patch, which might turn out to be useful...). 

-- 
 i.


---
 include/net/tcp.h     |    3 +
 net/ipv4/tcp_input.c  |   23 +++++++++--
 net/ipv4/tcp_ipv4.c   |  103 +++++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp_output.c |    6 ++-
 4 files changed, 129 insertions(+), 6 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 991ccdc..54a0d91 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -43,6 +43,9 @@
 
 #include <linux/seq_file.h>
 
+extern void tcp_verify_fackets(struct sock *sk);
+extern void tcp_print_queue(struct sock *sk);
+
 extern struct inet_hashinfo tcp_hashinfo;
 
 extern atomic_t tcp_orphan_count;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e22ffe7..1d7367d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1140,7 +1140,7 @@ static int tcp_check_dsack(struct tcp_sock *tp, struct sk_buff *ack_skb,
 	return dup_sack;
 }
 
-static int
+int
 tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_una)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
@@ -1160,6 +1160,8 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 	int first_sack_index;
 
 	if (!tp->sacked_out) {
+		if (WARN_ON(tp->fackets_out))
+			tcp_print_queue(sk);
 		tp->fackets_out = 0;
 		tp->highest_sack = tp->snd_una;
 	}
@@ -1420,6 +1422,7 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 			}
 		}
 	}
+	tcp_verify_fackets(sk);
 
 	/* Check for lost retransmit. This superb idea is
 	 * borrowed from "ratehalving". Event "C".
@@ -1632,13 +1635,14 @@ void tcp_enter_frto(struct sock *sk)
 	tcp_set_ca_state(sk, TCP_CA_Disorder);
 	tp->high_seq = tp->snd_nxt;
 	tp->frto_counter = 1;
+	tcp_verify_fackets(sk);
 }
 
 /* Enter Loss state after F-RTO was applied. Dupack arrived after RTO,
  * which indicates that we should follow the traditional RTO recovery,
  * i.e. mark everything lost and do go-back-N retransmission.
  */
-static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
+void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
@@ -1675,6 +1679,7 @@ static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 		}
 	}
 	tcp_verify_left_out(tp);
+	tcp_verify_fackets(sk);
 
 	tp->snd_cwnd = tcp_packets_in_flight(tp) + allowed_segments;
 	tp->snd_cwnd_cnt = 0;
@@ -1753,6 +1758,7 @@ void tcp_enter_loss(struct sock *sk, int how)
 		}
 	}
 	tcp_verify_left_out(tp);
+	tcp_verify_fackets(sk);
 
 	tp->reordering = min_t(unsigned int, tp->reordering,
 					     sysctl_tcp_reordering);
@@ -2308,7 +2314,7 @@ static void tcp_mtup_probe_success(struct sock *sk, struct sk_buff *skb)
  * It does _not_ decide what to send, it is made in function
  * tcp_xmit_retransmit_queue().
  */
-static void
+void
 tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
@@ -2322,8 +2328,11 @@ tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
 	if (!tp->packets_out)
 		tp->sacked_out = 0;
 
-	if (WARN_ON(!tp->sacked_out && tp->fackets_out))
+	if (WARN_ON(!tp->sacked_out && tp->fackets_out)) {
+		printk(KERN_ERR "TCP %d\n", tcp_is_reno(tp));
+		tcp_print_queue(sk);
 		tp->fackets_out = 0;
+	}
 
 	/* Now state machine starts.
 	 * A. ECE, hence prohibit cwnd undoing, the reduction is required. */
@@ -2333,6 +2342,8 @@ tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
 	/* B. In all the states check for reneging SACKs. */
 	if (tp->sacked_out && tcp_check_sack_reneging(sk))
 		return;
+	
+	tcp_verify_fackets(sk);
 
 	/* C. Process data loss notification, provided it is valid. */
 	if ((flag&FLAG_DATA_LOST) &&
@@ -2572,7 +2583,7 @@ static u32 tcp_tso_acked(struct sock *sk, struct sk_buff *skb)
  * is before the ack sequence we can discard it as it's confirmed to have
  * arrived at the other end.
  */
-static int tcp_clean_rtx_queue(struct sock *sk, s32 *seq_rtt_p)
+int tcp_clean_rtx_queue(struct sock *sk, s32 *seq_rtt_p)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	const struct inet_connection_sock *icsk = inet_csk(sk);
@@ -2694,6 +2705,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, s32 *seq_rtt_p)
 			ca_ops->pkts_acked(sk, pkts_acked, rtt_us);
 		}
 	}
+	tcp_verify_fackets(sk);
+
 
 #if FASTRETRANS_DEBUG > 0
 	BUG_TRAP((int)tp->sacked_out >= 0);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7fed0a6..c38acc1 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -108,6 +108,109 @@ struct inet_hashinfo __cacheline_aligned tcp_hashinfo = {
 	.lhash_wait  = __WAIT_QUEUE_HEAD_INITIALIZER(tcp_hashinfo.lhash_wait),
 };
 
+void tcp_print_queue(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct sk_buff *skb;
+	char s[50+1];
+	char i[50+1];
+	int idx = 0;
+	u32 hs = tp->highest_sack;
+	
+	if (!tp->sacked_out)
+		hs = tp->snd_una;
+	
+	tcp_for_write_queue(skb, sk) {
+		if (skb == tcp_send_head(sk))
+			break;
+		
+		if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
+			if (skb->len < tp->mss_cache)
+				s[idx] = 's';
+			else
+				s[idx] = 'S';
+		} else {
+			s[idx] = '-';
+		}
+		if ((TCP_SKB_CB(skb)->seq == hs) && (tp->fastpath_skb_hint == skb))
+			i[idx] = 'x';
+		else if (tp->fastpath_skb_hint == skb)
+			i[idx] = 'f';
+		else if (TCP_SKB_CB(skb)->seq == hs)
+			i[idx] = 'h';
+		else
+			i[idx] = ' ';
+			
+		if (++idx >= 50) {
+			s[idx] = 0;
+			i[idx] = 0;
+			printk(KERN_ERR "TCP wq(s) %s\n", s);
+			printk(KERN_ERR "TCP wq(i) %s\n", i);
+			idx = 0;
+		}
+	}
+	if (idx) {
+		s[idx] = '<';
+		s[idx+1] = 0;
+		i[idx] = '<';
+		i[idx+1] = 0;
+		printk(KERN_ERR "TCP wq(s) %s\n", s);
+		printk(KERN_ERR "TCP wq(i) %s\n", i);
+	}
+	printk(KERN_ERR "s%u f%u (%u) p%u seq: su%u hs%u sn%u (%u)\n",
+		tp->sacked_out, tp->fackets_out, tp->fastpath_cnt_hint,
+		tp->packets_out,
+		tp->snd_una, tp->highest_sack, tp->snd_nxt,
+		((tp->fastpath_skb_hint == NULL) ? 0 :
+			TCP_SKB_CB(tp->fastpath_skb_hint)->seq));
+}
+
+void tcp_verify_fackets(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct sk_buff *skb;
+	u32 fackets = 0;
+	int hisack_valid = 0;
+	int err = 0;
+	
+	if (tcp_is_reno(tp))
+		return;
+	
+	if (!tp->sacked_out) {
+		if (WARN_ON(tp->fackets_out))
+			err = 1;
+		else if (tp->fastpath_skb_hint == NULL)
+			return;
+	}
+	
+	/* ...expensive processing here... */
+	tcp_for_write_queue(skb, sk) {
+		if (skb == tcp_send_head(sk))
+			break;
+
+		if (tp->sacked_out && (TCP_SKB_CB(skb)->seq == tp->highest_sack)) {
+			hisack_valid = 1;
+			if (WARN_ON(tp->fackets_out != fackets + tcp_skb_pcount(skb)))
+				err = 1;
+		}
+
+		if (skb == tp->fastpath_skb_hint)
+			if (WARN_ON(fackets != tp->fastpath_cnt_hint))
+				err = 1;
+
+		if (WARN_ON((fackets > tp->fackets_out) && (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED)))
+			err = 1;
+
+		fackets += tcp_skb_pcount(skb);
+	}
+	
+	if (WARN_ON(tp->sacked_out && !hisack_valid))
+		err = 1;
+	
+	if (err)
+		tcp_print_queue(sk);
+}
+
 static int tcp_v4_get_port(struct sock *sk, unsigned short snum)
 {
 	return inet_csk_get_port(&tcp_hashinfo, sk, snum,
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6199abe..4c70caf 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -773,6 +773,8 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss
 			tcp_verify_left_out(tp);
 		}
 		tcp_adjust_fackets_out(tp, skb, diff);
+		
+		tcp_verify_fackets(sk);
 	}
 
 	/* Link BUFF into the send queue. */
@@ -1688,7 +1690,7 @@ u32 __tcp_select_window(struct sock *sk)
 }
 
 /* Attempt to collapse two adjacent SKB's during retransmission. */
-static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int mss_now)
+void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int mss_now)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *next_skb = tcp_write_queue_next(sk, skb);
@@ -1764,6 +1766,8 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int m
 		if (tp->fastpath_skb_hint == next_skb)
 			tp->fastpath_skb_hint = skb;
 
+		tcp_verify_fackets(sk);
+
 		sk_stream_free_skb(sk, next_skb);
 	}
 }
-- 
1.5.0.6

* Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
  2007-10-02 20:06               ` Ilpo Järvinen
@ 2007-10-02 21:48                 ` Ilpo Järvinen
  0 siblings, 0 replies; 12+ messages in thread
From: Ilpo Järvinen @ 2007-10-02 21:48 UTC (permalink / raw)
  To: Cedric Le Goater; +Cc: Andrew Morton, LKML, Netdev, David Miller

> On Tue, 2 Oct 2007, Ilpo Järvinen wrote:
> 
> > I'm currently out of ideas where it could come from...

Hmm, there seems to be an off-by-one in tcp_retrans_try_collapse after
all, or in fact, two of them. I'll post a patch for this tomorrow...


-- 
 i.

Thread overview: 12+ messages
     [not found] <20070927022220.c76a7a6e.akpm@linux-foundation.org>
2007-09-27 10:52 ` 2.6.23-rc8-mm2 - drivers/net/ibm_newemac/mal - broken Kamalesh Babulal
2007-09-28 15:42 ` 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING Cedric Le Goater
2007-09-28 19:10   ` Ilpo Järvinen
2007-09-29 12:44     ` Ilpo Järvinen
2007-09-29 14:55       ` Cedric Le Goater
2007-09-29 20:49         ` Ilpo Järvinen
2007-10-01  9:26           ` Cedric Le Goater
2007-10-02 10:26             ` Ilpo Järvinen
2007-10-02 20:06               ` Ilpo Järvinen
2007-10-02 21:48                 ` Ilpo Järvinen
2007-09-28 16:30 ` /proc/net/ bad hard links count [Was: 2.6.23-rc8-mm2] Jiri Slaby
2007-09-28 17:03   ` Eric W. Biederman
