* Re: 2.6.9 failed assertion in tcp_timer.c [not found] ` <20041021165224.GA25399@linuxace.com> @ 2004-10-21 21:57 ` Herbert Xu 2004-10-22 5:53 ` David S. Miller 2004-10-22 18:54 ` Phil Oester 0 siblings, 2 replies; 9+ messages in thread From: Herbert Xu @ 2004-10-21 21:57 UTC (permalink / raw) To: Phil Oester; +Cc: linux-net, netdev, David S. Miller [-- Attachment #1: Type: text/plain, Size: 633 bytes --] On Thu, Oct 21, 2004 at 09:52:24AM -0700, Phil Oester wrote: > > KERNEL: assertion (!tcp_get_pcount(&tp->packets_out) || !skb_ > queue_empty(&sk->sk_write_queue)) failed at net/ipv4/tcp_input.c (2496) Perhaps the MTU is growing upwards? Does this patch help? Dave, we need to use the skb's MSS in trim_head as otherwise the counters will be screwed up. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt [-- Attachment #2: p --] [-- Type: text/plain, Size: 603 bytes --] ===== net/ipv4/tcp_output.c 1.68 vs edited ===== --- 1.68/net/ipv4/tcp_output.c 2004-10-20 15:13:52 +10:00 +++ edited/net/ipv4/tcp_output.c 2004-10-22 07:56:45 +10:00 @@ -566,8 +566,6 @@ int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len) { - struct tcp_opt *tp = tcp_sk(sk); - if (skb_cloned(skb) && pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) return -ENOMEM; @@ -590,7 +588,7 @@ /* Any change of skb->len requires recalculation of tso * factor and mss. */ - tcp_set_skb_tso_segs(skb, tp->mss_cache_std); + tcp_set_skb_tso_segs(skb, tcp_skb_mss(skb)); return 0; } ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.9 failed assertion in tcp_timer.c 2004-10-21 21:57 ` 2.6.9 failed assertion in tcp_timer.c Herbert Xu @ 2004-10-22 5:53 ` David S. Miller 2004-10-22 18:54 ` Phil Oester 1 sibling, 0 replies; 9+ messages in thread From: David S. Miller @ 2004-10-22 5:53 UTC (permalink / raw) To: Herbert Xu; +Cc: kernel, linux-net, netdev On Fri, 22 Oct 2004 07:57:43 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > Dave, we need to use the skb's MSS in trim_head as otherwise the > counters will be screwed up. > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Excellent catch. Patch applied, thanks Herbert. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.9 failed assertion in tcp_timer.c 2004-10-21 21:57 ` 2.6.9 failed assertion in tcp_timer.c Herbert Xu 2004-10-22 5:53 ` David S. Miller @ 2004-10-22 18:54 ` Phil Oester 2004-10-22 21:50 ` Herbert Xu 1 sibling, 1 reply; 9+ messages in thread From: Phil Oester @ 2004-10-22 18:54 UTC (permalink / raw) To: Herbert Xu; +Cc: linux-net, netdev, David S. Miller On Fri, Oct 22, 2004 at 07:57:43AM +1000, Herbert Xu wrote: > On Thu, Oct 21, 2004 at 09:52:24AM -0700, Phil Oester wrote: > > > > KERNEL: assertion (!tcp_get_pcount(&tp->packets_out) || !skb_ > > queue_empty(&sk->sk_write_queue)) failed at net/ipv4/tcp_input.c (2496) > > Perhaps the MTU is growing upwards? Does this patch help? > > Dave, we need to use the skb's MSS in trim_head as otherwise the > counters will be screwed up. This patch gets it a bit further, but still panics (divide by 0?): divide error: 0000 [#1] SMP CPU: 0 EIP: 0060:[<c029e36c>] Not tainted VLI EFLAGS: 00010246 (2.6.9) EIP is at tcp_set_skb_tso_segs+0x2c/0x50 eax: 000000f3 ebx: d75f5b00 ecx: 00000000 edx: 00000000 esi: 00000001 edi: d6c22460 ebp: d6c22650 esp: c035decc ds: 007b es: 007b ss: 0068 Process swapper (pid: 0, threadinfo=c035c000 task=c030eac0) Stack: d75f5b00 c029e858 d75f5b00 d75f5b00 d6c22460 fffffff5 c029f3dd c02971a6 00000002 000005b4 d6c22460 d6c22650 d6c224c4 c035df58 c02a148d 00000002 00004100 00000080 f7c52260 c0393bec c038bda0 f710f130 00000010 c035df4c Call Trace: [<c029e858>] tcp_trim_head+0x88/0xc0 [<c029f3dd>] tcp_retransmit_skb+0x8d/0x320 [<c02971a6>] tcp_enter_loss+0x76/0x240 [<c02a148d>] tcp_retransmit_timer+0xfd/0x430 [<c02a1889>] tcp_write_timer+0xc9/0x100 [<c02a17c0>] tcp_write_timer+0x0/0x100 [<c011ee49>] run_timer_softirq+0xd9/0x170 [<c011b0f6>] __do_softirq+0xb6/0xd0 [<c011b13d>] do_softirq+0x2d/0x30 [<c010de35>] smp_apic_timer_interrupt+0x85/0xf0 [<c010492e>] apic_timer_interrupt+0x1a/0x20 [<c0101f40>] default_idle+0x0/0x40 [<c0101f6c>] default_idle+0x2c/0x40 [<c0101ffb>] cpu_idle+0x3b/0x50 [<c035ea66>] start_kernel+0x156/0x180 [<c035e530>] unknown_bootoption+0x0/0x180 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.9 failed assertion in tcp_timer.c 2004-10-22 18:54 ` Phil Oester @ 2004-10-22 21:50 ` Herbert Xu 2004-10-22 22:12 ` David S. Miller 2004-10-22 23:24 ` Herbert Xu 0 siblings, 2 replies; 9+ messages in thread From: Herbert Xu @ 2004-10-22 21:50 UTC (permalink / raw) To: Phil Oester; +Cc: linux-net, netdev, David S. Miller [-- Attachment #1: Type: text/plain, Size: 603 bytes --] On Fri, Oct 22, 2004 at 11:54:35AM -0700, Phil Oester wrote: > > This patch gets it a bit further, but still panics (divide by 0?): Oops, I forgot that this gets called on non-TSO packets as well. So we don't really know whether it went further or not :) We should only set the TSO size if the packet was TSO to start with. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt [-- Attachment #2: p --] [-- Type: text/plain, Size: 390 bytes --] --- linux-2.6/net/ipv4/tcp_output.c.orig 2004-10-22 07:56:45.000000000 +1000 +++ linux-2.6/net/ipv4/tcp_output.c 2004-10-23 07:48:15.000000000 +1000 @@ -588,7 +588,8 @@ /* Any change of skb->len requires recalculation of tso * factor and mss. */ - tcp_set_skb_tso_segs(skb, tcp_skb_mss(skb)); + if (tcp_skb_mss(skb)) + tcp_set_skb_tso_segs(skb, tcp_skb_mss(skb)); return 0; } ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.9 failed assertion in tcp_timer.c 2004-10-22 21:50 ` Herbert Xu @ 2004-10-22 22:12 ` David S. Miller 2004-10-22 23:24 ` Herbert Xu 1 sibling, 0 replies; 9+ messages in thread From: David S. Miller @ 2004-10-22 22:12 UTC (permalink / raw) To: Herbert Xu; +Cc: kernel, linux-net, netdev On Sat, 23 Oct 2004 07:50:40 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > On Fri, Oct 22, 2004 at 11:54:35AM -0700, Phil Oester wrote: > > > > This patch gets it a bit further, but still panics (divide by 0?): > > Oops, I forgot that this gets called on non-TSO packets as well. > So we don't really know whether it went further or not :) > > We should only set the TSO size if the packet was TSO to start with. > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Applied. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.9 failed assertion in tcp_timer.c 2004-10-22 21:50 ` Herbert Xu 2004-10-22 22:12 ` David S. Miller @ 2004-10-22 23:24 ` Herbert Xu 2004-10-25 19:58 ` Phil Oester 2004-10-26 4:09 ` David S. Miller 1 sibling, 2 replies; 9+ messages in thread From: Herbert Xu @ 2004-10-22 23:24 UTC (permalink / raw) To: Phil Oester; +Cc: linux-net, netdev, David S. Miller [-- Attachment #1: Type: text/plain, Size: 1079 bytes --] On Sat, Oct 23, 2004 at 07:50:40AM +1000, herbert wrote: > > So we don't really know whether it went further or not :) Actually, I think we've caught your crash now. If that code path is triggering at all, then it'll trigger with TSO packets too. If we get a truly partial ack on a TSO packet, then tcp_tso_acked will not trim it off. So we will fall through to this last-ditch trim call, which doesn't update packets_out. There are two solutions to this problem. I've taken the simpler approach for now. We simply trim off the partial bits in tcp_tso_acked and live with the fact that the packet counters may differ from what's on the netwrok by one. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Later on we can "fix" this by remembering where the original TSO packet started from, perhaps in skb->h or somewhere. Dave, is this worth it? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt [-- Attachment #2: p --] [-- Type: text/plain, Size: 1327 bytes --] ===== net/ipv4/tcp_input.c 1.81 vs edited ===== --- 1.81/net/ipv4/tcp_input.c 2004-10-04 07:31:39 +10:00 +++ edited/net/ipv4/tcp_input.c 2004-10-23 09:12:42 +10:00 @@ -2369,25 +2369,19 @@ { struct tcp_opt *tp = tcp_sk(sk); struct tcp_skb_cb *scb = TCP_SKB_CB(skb); - __u32 mss = tcp_skb_mss(skb); - __u32 snd_una = tp->snd_una; - __u32 orig_seq, seq; - __u32 packets_acked = 0; + __u32 seq = tp->snd_una; + __u32 packets_acked; int acked = 0; /* If we get here, the whole TSO packet has not been * acked. */ - BUG_ON(!after(scb->end_seq, snd_una)); + BUG_ON(!after(scb->end_seq, seq)); - seq = orig_seq = scb->seq; - while (!after(seq + mss, snd_una)) { - packets_acked++; - seq += mss; - } - - if (tcp_trim_head(sk, skb, (seq - orig_seq))) + packets_acked = tcp_skb_pcount(skb); + if (tcp_trim_head(sk, skb, seq - scb->seq)) return 0; + packets_acked -= tcp_skb_pcount(skb); if (packets_acked) { __u8 sacked = scb->sacked; --- linux-2.6/net/ipv4/tcp_output.c.orig 2004-10-23 08:44:16.000000000 +1000 +++ linux-2.6/net/ipv4/tcp_output.c 2004-10-23 09:10:13.000000000 +1000 @@ -588,7 +588,7 @@ /* Any change of skb->len requires recalculation of tso * factor and mss. */ - if (tcp_skb_mss(skb)) + if (tcp_skb_pcount(skb) > 1) tcp_set_skb_tso_segs(skb, tcp_skb_mss(skb)); return 0; ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.9 failed assertion in tcp_timer.c 2004-10-22 23:24 ` Herbert Xu @ 2004-10-25 19:58 ` Phil Oester 2004-10-25 20:25 ` David S. Miller 2004-10-26 4:09 ` David S. Miller 1 sibling, 1 reply; 9+ messages in thread From: Phil Oester @ 2004-10-25 19:58 UTC (permalink / raw) To: Herbert Xu; +Cc: linux-net, netdev, David S. Miller On Sat, Oct 23, 2004 at 09:24:03AM +1000, Herbert Xu wrote: > Actually, I think we've caught your crash now. If that code path > is triggering at all, then it'll trigger with TSO packets too. If > we get a truly partial ack on a TSO packet, then tcp_tso_acked will > not trim it off. So we will fall through to this last-ditch trim > call, which doesn't update packets_out. > > There are two solutions to this problem. I've taken the simpler > approach for now. We simply trim off the partial bits in tcp_tso_acked > and live with the fact that the packet counters may differ from > what's on the netwrok by one. > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> > > Later on we can "fix" this by remembering where the original TSO > packet started from, perhaps in skb->h or somewhere. Dave, is this > worth it? Looks like your most recent 2 patches did the trick. Uptime on the test box is over 4 hours, whereas before I could count on a panic within about an hour. I'll continue running 2.6.9 + your patches and let you know if this changes... Thanks, Phil ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.9 failed assertion in tcp_timer.c 2004-10-25 19:58 ` Phil Oester @ 2004-10-25 20:25 ` David S. Miller 0 siblings, 0 replies; 9+ messages in thread From: David S. Miller @ 2004-10-25 20:25 UTC (permalink / raw) To: Phil Oester; +Cc: herbert, linux-net, netdev On Mon, 25 Oct 2004 12:58:15 -0700 Phil Oester <kernel@linuxace.com> wrote: > > Later on we can "fix" this by remembering where the original TSO > > packet started from, perhaps in skb->h or somewhere. Dave, is this > > worth it? > > Looks like your most recent 2 patches did the trick. Uptime on the > test box is over 4 hours, whereas before I could count on a panic > within about an hour. I'll continue running 2.6.9 + your patches > and let you know if this changes... Thanks for testing Phil. Herbert I'll ponder over this patch today. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.9 failed assertion in tcp_timer.c 2004-10-22 23:24 ` Herbert Xu 2004-10-25 19:58 ` Phil Oester @ 2004-10-26 4:09 ` David S. Miller 1 sibling, 0 replies; 9+ messages in thread From: David S. Miller @ 2004-10-26 4:09 UTC (permalink / raw) To: Herbert Xu; +Cc: kernel, linux-net, netdev On Sat, 23 Oct 2004 09:24:03 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > Actually, I think we've caught your crash now. If that code path > is triggering at all, then it'll trigger with TSO packets too. If > we get a truly partial ack on a TSO packet, then tcp_tso_acked will > not trim it off. So we will fall through to this last-ditch trim > call, which doesn't update packets_out. > > There are two solutions to this problem. I've taken the simpler > approach for now. We simply trim off the partial bits in tcp_tso_acked > and live with the fact that the packet counters may differ from > what's on the netwrok by one. > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> > > Later on we can "fix" this by remembering where the original TSO > packet started from, perhaps in skb->h or somewhere. Dave, is this > worth it? I'm going to apply this patch for now. I think at this point we can generalize TSO processing a bit better in this code now. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2004-10-26 4:09 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20041019215054.GA17565@linuxace.com>
[not found] ` <E1CK3Az-0004P2-00@gondolin.me.apana.org.au>
[not found] ` <20041020164748.GA20905@linuxace.com>
[not found] ` <20041020212651.GA8534@gondor.apana.org.au>
[not found] ` <20041020225536.GA22179@linuxace.com>
[not found] ` <20041021130339.GA19345@gondor.apana.org.au>
[not found] ` <20041021165224.GA25399@linuxace.com>
2004-10-21 21:57 ` 2.6.9 failed assertion in tcp_timer.c Herbert Xu
2004-10-22 5:53 ` David S. Miller
2004-10-22 18:54 ` Phil Oester
2004-10-22 21:50 ` Herbert Xu
2004-10-22 22:12 ` David S. Miller
2004-10-22 23:24 ` Herbert Xu
2004-10-25 19:58 ` Phil Oester
2004-10-25 20:25 ` David S. Miller
2004-10-26 4:09 ` David S. Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).