* Re: 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918
@ 2005-08-01 8:33 Guillaume Pelat
2005-08-04 3:33 ` Herbert Xu
0 siblings, 1 reply; 9+ messages in thread
From: Guillaume Pelat @ 2005-08-01 8:33 UTC (permalink / raw)
To: davem; +Cc: akpm, netdev, linux-kernel
Hi,
> [TCP]: Fix two TSO sizing bugs
>
> MSS changes can be lost since we preemptively initialize
> the tso_segs count for an SKB before we %100 commit
> to sending it out.
> So, by the time we send it out, the tso_size information
> can be stale due to PMTU events. This mucks up all of the
> logic in our send engine, and can even result in the BUG()
> triggering in tcp_tso_should_defer().
> Another problem we have is that we're storing the tp->mss_cache,
> not the SACK block normalized MSS, as the tso_size. That's wrong
> too.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>
I just tried the patch attached. :)
The bug is still here (same symptoms), with a slightly different backtrace :
------------[ cut here ]------------
kernel BUG at net/ipv4/tcp_output.c:918!
invalid operand: 0000 [#1]
CPU: 0
EIP: 0060:[<c027dd66>] Not tainted VLI
EFLAGS: 00010297 (2.6.13-rc4-endy)
EIP is at tcp_tso_should_defer+0xd6/0xf0
eax: 00000005 ebx: f5032e80 ecx: 00000007 edx: f3b2fc00
esi: 00000006 edi: 00000006 ebp: c031fd78 esp: c031fd68
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c031e000 task=c02dbb80)
Stack: f129b2b8 f5032e80 00000006 f3b2fc00 c031fdb0 c027de7b f3b2fc00
f3b2fc00
f5032e80 0000000b f3b2fc00 b7773884 00000000 00000005 00000002
f3b2fc00
f3b2fc00 f5a26034 c031fdd4 c027e1b2 f3b2fc00 00000564 00000001
f5a26034
Call Trace:
[<c0102e5f>] show_stack+0x7f/0xa0
[<c0103002>] show_registers+0x152/0x1c0
[<c01031f8>] die+0xc8/0x140
[<c0103325>] do_trap+0xb5/0xc0
[<c010366c>] do_invalid_op+0xbc/0xd0
[<c0102aa3>] error_code+0x4f/0x54
[<c027de7b>] tcp_write_xmit+0xfb/0x400
[<c027e1b2>] __tcp_push_pending_frames+0x32/0xd0
[<c027b1cc>] tcp_rcv_established+0x27c/0x860 (was
tcp_rcv_state_process before).
[<c0283f8a>] tcp_v4_do_rcv+0x11a/0x120
[<c0284502>] tcp_v4_rcv+0x572/0x750
[<c026a62b>] ip_local_deliver+0xcb/0x1d0
[<c026aa52>] ip_rcv+0x322/0x4a0
[<c0256a97>] netif_receive_skb+0x137/0x1a0
[<c0256b8f>] process_backlog+0x8f/0x110
[<c0256c82>] net_rx_action+0x72/0x100
[<c01172dc>] __do_softirq+0x8c/0xa0
[<c011731a>] do_softirq+0x2a/0x30
[<c01173d5>] irq_exit+0x35/0x40
[<c01044fc>] do_IRQ+0x3c/0x70
[<c0102a46>] common_interrupt+0x1a/0x20
[<c0100997>] cpu_idle+0x57/0x60
[<c010024b>] _stext+0x2b/0x30
[<c0320847>] start_kernel+0x147/0x170
[<c0100199>] 0xc0100199
Code: 89 f8 0f af c2 3b 45 f0 0f 47 45 f0 31 d2 89 45 f0 f7 f3 31 d2 39
c1 73 ce ba 01 00 00 00 eb c7 6b c2 03 31 d239 c1 77 be eb ee <0f> 0b 96
03 ce 54 2d c0 e9 76 ff ff ff 8b ba 78 02 00 00 eb eb
<0>Kernel panic - not syncing: Fatal exception in interrupt
I guess it's the same bug :)
Just a few more infos about the problem :
- Turning off TSO with ethtool solves the problem.
- I tried 2.6.13-rc4 on another server (with the same configuration) and
the bug occured too (so i guess it's not due to some weird memory
problem :) )
- The problem dont seems to be present in 2.6.12.3 (but i only tried
2.6.12.3 during 2 days).
Best Regards,
Guillaume Pelat
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918
2005-08-01 8:33 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918 Guillaume Pelat
@ 2005-08-04 3:33 ` Herbert Xu
2005-08-04 10:35 ` Herbert Xu
0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2005-08-04 3:33 UTC (permalink / raw)
To: Guillaume Pelat; +Cc: davem, akpm, netdev, linux-kernel
On Mon, Aug 01, 2005 at 08:33:20AM +0000, Guillaume Pelat wrote:
>
> I just tried the patch attached. :)
>
> The bug is still here (same symptoms), with a slightly different backtrace :
> ------------[ cut here ]------------
> kernel BUG at net/ipv4/tcp_output.c:918!
OK, let's try again :)
I bet it's the tcp_enter_cwr() call in tcp_transmit_skb(). So
the sequence is:
tcp_write_xmit
cwnd_quota = tcp_cwnd_test
tcp_transmit_skb
tcp_enter_cwr
tp->snd_cwnd = min(tp->snd_cwnd, in_flight + 1)
At this point cwnd_quota is out-of-sync with tp->snd_cwnd.
cwnd_quota -= tcp_skb_pcount(skb)
cwnd_quota > 0
tcp_tso_should_defer
BUG since tp->snd_cwnd is smaller than what
cwnd_quota indicated.
So I suppose we should reset cwnd_quota after tcp_transmit_skb?
Perhaps we should only transmit one MSS in this case?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918
2005-08-04 3:33 ` Herbert Xu
@ 2005-08-04 10:35 ` Herbert Xu
2005-08-04 17:41 ` Guillaume Pelat
0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2005-08-04 10:35 UTC (permalink / raw)
To: Guillaume Pelat; +Cc: davem, akpm, netdev, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 413 bytes --]
On Thu, Aug 04, 2005 at 01:33:29PM +1000, herbert wrote:
>
> So I suppose we should reset cwnd_quota after tcp_transmit_skb?
Please try this patch to see if this is really the problem or not.
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[-- Attachment #2: p --]
[-- Type: text/plain, Size: 674 bytes --]
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1027,19 +1027,14 @@ static int tcp_write_xmit(struct sock *s
tcp_minshall_update(tp, mss_now, skb);
sent_pkts++;
- /* Do not optimize this to use tso_segs. If we chopped up
- * the packet above, tso_segs will no longer be valid.
- */
- cwnd_quota -= tcp_skb_pcount(skb);
-
- BUG_ON(cwnd_quota < 0);
- if (!cwnd_quota)
- break;
-
skb = sk->sk_send_head;
if (!skb)
break;
+
tso_segs = tcp_init_tso_segs(sk, skb, mss_now);
+ cwnd_quota = tcp_cwnd_test(tp, skb);
+ if (!cwnd_quota)
+ break;
}
if (likely(sent_pkts)) {
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918
2005-08-04 10:35 ` Herbert Xu
@ 2005-08-04 17:41 ` Guillaume Pelat
2005-08-04 18:23 ` [RFC] : SLAB : Could we have a process context only versions of kmem_cache_alloc(), and kmem_cache_free() Eric Dumazet
2005-08-04 23:58 ` 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918 Andrew Morton
0 siblings, 2 replies; 9+ messages in thread
From: Guillaume Pelat @ 2005-08-04 17:41 UTC (permalink / raw)
To: Herbert Xu; +Cc: davem, akpm, netdev, linux-kernel
Hi,
Herbert Xu wrote:
> On Thu, Aug 04, 2005 at 01:33:29PM +1000, herbert wrote:
>
>>So I suppose we should reset cwnd_quota after tcp_transmit_skb?
>
> Please try this patch to see if this is really the problem or not.
>
> Thanks,
I just applied your patch, and it seems to work :)
2 hours uptime, and no crash yet (without the patch, it was crashing a
few mins only after booting).
So i think the bug is crushed :)
Thanks a lot !
--
Guillaume Pelat
^ permalink raw reply [flat|nested] 9+ messages in thread
* [RFC] : SLAB : Could we have a process context only versions of kmem_cache_alloc(), and kmem_cache_free()
2005-08-04 17:41 ` Guillaume Pelat
@ 2005-08-04 18:23 ` Eric Dumazet
2005-08-04 23:58 ` 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918 Andrew Morton
1 sibling, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2005-08-04 18:23 UTC (permalink / raw)
To: linux-kernel
Hi
The cost of local_irq_save(flags)/local_irq_restore(flags) in slab functions is very high
popf, cli, pushf do stress the modern processors.
Maybe we could provide special functions for caches that are known to be used only from process context ?
These functions may use the local_irq_save(flags)/local_irq_restore(flags) only if needed (cache_alloc_refill() or cache_flusharray())
Something like :
void *kmem_cache_alloc_noirq(kmem_cache_t *cachep, unsigned int __nocast flags)
{
unsigned long save_flags;
void* objp;
struct array_cache *ac;
cache_alloc_debugcheck_before(cachep, flags);
check_irq_on();
preempt_disable();
ac = ac_data(cachep);
if (likely(ac->avail)) {
STATS_INC_ALLOCHIT(cachep);
ac->touched = 1;
objp = ac_entry(ac)[--ac->avail];
} else {
STATS_INC_ALLOCMISS(cachep);
local_irq_save(save_flags);
objp = cache_alloc_refill(cachep, flags);
local_irq_restore(save_flags);
}
preempt_enable();
objp = cache_alloc_debugcheck_after(cachep, flags, objp, __builtin_return_address(0));
prefetchw(objp);
return objp;
}
void kmem_cache_free_noirq(kmem_cache_t *cachep, void *objp)
{
struct array_cache *ac;
check_irq_on();
preempt_disable();
ac = ac_data(cachep);
objp = cache_free_debugcheck(cachep, objp, __builtin_return_address(0));
if (likely(ac->avail < ac->limit)) {
STATS_INC_FREEHIT(cachep);
} else {
unsigned long flags;
STATS_INC_FREEMISS(cachep);
local_irq_save(flags);
cache_flusharray(cachep, ac);
local_irq_restore(flags);
}
ac_entry(ac)[ac->avail++] = objp;
preempt_disable();
}
Thank you
Eric Dumazet
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918
2005-08-04 17:41 ` Guillaume Pelat
2005-08-04 18:23 ` [RFC] : SLAB : Could we have a process context only versions of kmem_cache_alloc(), and kmem_cache_free() Eric Dumazet
@ 2005-08-04 23:58 ` Andrew Morton
2005-08-05 0:57 ` [TCP]: Fix TSO cwnd caching bug Herbert Xu
1 sibling, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2005-08-04 23:58 UTC (permalink / raw)
To: Guillaume Pelat; +Cc: herbert, davem, netdev, linux-kernel
Guillaume Pelat <guillaume.pelat@winch-hebergement.net> wrote:
>
> Hi,
>
> Herbert Xu wrote:
> > On Thu, Aug 04, 2005 at 01:33:29PM +1000, herbert wrote:
> >
> >>So I suppose we should reset cwnd_quota after tcp_transmit_skb?
> >
> > Please try this patch to see if this is really the problem or not.
> >
> > Thanks,
>
> I just applied your patch, and it seems to work :)
> 2 hours uptime, and no crash yet (without the patch, it was crashing a
> few mins only after booting).
> So i think the bug is crushed :)
>
Thanks, Guillaume. Herbert, David is travelling and not able to do a lot
of patchmonkeying. Could you please prepare and submit a final patch?
Thanks.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [TCP]: Fix TSO cwnd caching bug
2005-08-04 23:58 ` 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918 Andrew Morton
@ 2005-08-05 0:57 ` Herbert Xu
2005-08-05 1:08 ` Andrew Morton
2005-08-05 8:33 ` David S. Miller
0 siblings, 2 replies; 9+ messages in thread
From: Herbert Xu @ 2005-08-05 0:57 UTC (permalink / raw)
To: Andrew Morton; +Cc: Guillaume Pelat, davem, netdev, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 928 bytes --]
On Thu, Aug 04, 2005 at 04:58:42PM -0700, Andrew Morton wrote:
>
> Thanks, Guillaume. Herbert, David is travelling and not able to do a lot
> of patchmonkeying. Could you please prepare and submit a final patch?
OK, here is the final version. It depends on the patch that David
posted earlier on in this thread. Please let me know if you need a
copy of that.
[TCP]: Fix TSO cwnd caching bug
tcp_write_xmit caches the cwnd value indirectly in cwnd_quota.
When tcp_transmit_skb reduces the cwnd because of tcp_enter_cwr,
the cached value becomes invalid.
This patch ensures that the cwnd value is always reread after
each tcp_transmit_skb call.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[-- Attachment #2: tso-1 --]
[-- Type: text/plain, Size: 1464 bytes --]
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -972,19 +972,18 @@ static int tcp_write_xmit(struct sock *s
if (unlikely(sk->sk_state == TCP_CLOSE))
return 0;
- skb = sk->sk_send_head;
- if (unlikely(!skb))
- return 0;
-
- tso_segs = tcp_init_tso_segs(sk, skb, mss_now);
- cwnd_quota = tcp_cwnd_test(tp, skb);
- if (unlikely(!cwnd_quota))
- goto out;
-
sent_pkts = 0;
- while (likely(tcp_snd_wnd_test(tp, skb, mss_now))) {
+ while ((skb = sk->sk_send_head)) {
+ tso_segs = tcp_init_tso_segs(sk, skb, mss_now);
BUG_ON(!tso_segs);
+ cwnd_quota = tcp_cwnd_test(tp, skb);
+ if (!cwnd_quota)
+ break;
+
+ if (unlikely(!tcp_snd_wnd_test(tp, skb, mss_now)))
+ break;
+
if (tso_segs == 1) {
if (unlikely(!tcp_nagle_test(tp, skb, mss_now,
(tcp_skb_is_last(sk, skb) ?
@@ -1026,27 +1025,12 @@ static int tcp_write_xmit(struct sock *s
tcp_minshall_update(tp, mss_now, skb);
sent_pkts++;
-
- /* Do not optimize this to use tso_segs. If we chopped up
- * the packet above, tso_segs will no longer be valid.
- */
- cwnd_quota -= tcp_skb_pcount(skb);
-
- BUG_ON(cwnd_quota < 0);
- if (!cwnd_quota)
- break;
-
- skb = sk->sk_send_head;
- if (!skb)
- break;
- tso_segs = tcp_init_tso_segs(sk, skb, mss_now);
}
if (likely(sent_pkts)) {
tcp_cwnd_validate(sk, tp);
return 0;
}
-out:
return !tp->packets_out && sk->sk_send_head;
}
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [TCP]: Fix TSO cwnd caching bug
2005-08-05 0:57 ` [TCP]: Fix TSO cwnd caching bug Herbert Xu
@ 2005-08-05 1:08 ` Andrew Morton
2005-08-05 8:33 ` David S. Miller
1 sibling, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2005-08-05 1:08 UTC (permalink / raw)
To: Herbert Xu; +Cc: guillaume.pelat, davem, netdev, linux-kernel
Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Thu, Aug 04, 2005 at 04:58:42PM -0700, Andrew Morton wrote:
> >
> > Thanks, Guillaume. Herbert, David is travelling and not able to do a lot
> > of patchmonkeying. Could you please prepare and submit a final patch?
>
> OK, here is the final version.
Thanks.
> It depends on the patch that David
> posted earlier on in this thread. Please let me know if you need a
> copy of that.
Yes please.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [TCP]: Fix TSO cwnd caching bug
2005-08-05 0:57 ` [TCP]: Fix TSO cwnd caching bug Herbert Xu
2005-08-05 1:08 ` Andrew Morton
@ 2005-08-05 8:33 ` David S. Miller
1 sibling, 0 replies; 9+ messages in thread
From: David S. Miller @ 2005-08-05 8:33 UTC (permalink / raw)
To: herbert; +Cc: akpm, guillaume.pelat, netdev, linux-kernel
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 5 Aug 2005 10:57:41 +1000
> OK, here is the final version. It depends on the patch that David
> posted earlier on in this thread. Please let me know if you need a
> copy of that.
>
> [TCP]: Fix TSO cwnd caching bug
Good catch Herbert :)
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2005-08-05 8:33 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-08-01 8:33 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918 Guillaume Pelat
2005-08-04 3:33 ` Herbert Xu
2005-08-04 10:35 ` Herbert Xu
2005-08-04 17:41 ` Guillaume Pelat
2005-08-04 18:23 ` [RFC] : SLAB : Could we have a process context only versions of kmem_cache_alloc(), and kmem_cache_free() Eric Dumazet
2005-08-04 23:58 ` 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918 Andrew Morton
2005-08-05 0:57 ` [TCP]: Fix TSO cwnd caching bug Herbert Xu
2005-08-05 1:08 ` Andrew Morton
2005-08-05 8:33 ` David S. Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox