From: george anzinger <george@mvista.com>
To: netdev@oss.sgi.com, linux-net@vger.kernel.org, davem@redhat.com,
ak@muc.de, kuznet@ms2.inr.ac.ru, pekkas@netcore.fi
Subject: System crash in tcp_fragment()
Date: Mon, 20 May 2002 12:42:08 -0700
Message-ID: <acbkpj$knv$2@main.gmane.org>
I wonder if you could help me squash a bug in the tcp code.
Here is what we know thus far:
An SMP (dual x86) 2.4.17 kernel crashes with an attempt to
dereference NULL at the end of tcp_fragment() (in
net/ipv4/tcp_output.c) while attempting to link in the newly
created fragment.
The bugzilla report is:
http://www.telecomlinux.org/bugzilla/show_bug.cgi?id=503
In case you cannot see this, it appears that the addresses
of each skb are all right, so the assumption is that the skb
passed to tcp_fragment() has been unlinked while
tcp_fragment() was doing its thing. This implies a need for
locking at some higher level, and we don't know enough about
the TCP code to divine where this might best be done.
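
To make that assumption concrete, here is a minimal user-space sketch of
linking a new node after an existing one in a circular doubly linked queue.
The names node, queue, queue_tail, node_unlink and append_after are
hypothetical stand-ins, not the kernel's skb helpers. The sketch assumes
that unlinking clears the node's next, prev and list pointers; under that
assumption, an unchecked append after an already unlinked node would store
through a NULL next pointer, which is the kind of failure described above.

```c
#include <assert.h>
#include <stddef.h>

struct queue;

/* Hypothetical stand-in for an skb: list linkage only. */
struct node {
    struct node *next;
    struct node *prev;
    struct queue *list;     /* NULL while the node is on no queue */
};

/* Hypothetical stand-in for a queue head, with a sentinel node. */
struct queue {
    struct node head;
    unsigned int qlen;
};

static void queue_init(struct queue *q)
{
    q->head.next = &q->head;
    q->head.prev = &q->head;
    q->head.list = q;
    q->qlen = 0;
}

/* Link n at the tail of q. */
static void queue_tail(struct queue *q, struct node *n)
{
    n->next = &q->head;
    n->prev = q->head.prev;
    q->head.prev->next = n;
    q->head.prev = n;
    n->list = q;
    q->qlen++;
}

/* Remove n from its queue and clear its linkage (assumed behaviour). */
static void node_unlink(struct node *n)
{
    assert(n->list != NULL);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->list->qlen--;
    n->next = NULL;
    n->prev = NULL;
    n->list = NULL;
}

/*
 * Link newn directly after old. Returns 0 on success, or -1 if old is
 * no longer on a queue. Without this check, the store to
 * old->next->prev would go through a NULL pointer.
 */
static int append_after(struct node *old, struct node *newn)
{
    if (old->list == NULL || old->next == NULL)
        return -1;
    newn->next = old->next;
    newn->prev = old;
    old->next->prev = newn;
    old->next = newn;
    newn->list = old->list;
    old->list->qlen++;
    return 0;
}
```

The check in append_after() only detects the problem after the fact; it
does nothing to stop another context from unlinking old between the check
and the stores, which is why the question of locking remains.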
Here is the call stack:
Panic screen:
<1>Unable to handle kernel NULL pointer deference at virtual address 00000004
<4> printing eip:
<4>c0256fb2
<1>*pde = 00000000
<4>Oops: 0002
<4>CPU: 1
<4>EIP: 0010:[<c0256fb2>] Not tainted
<4>EFLAGS: 00010296
<4>eax: 00000000 ebx: c4d3ada0 ecx: c4d3ada0 edx: 00000000
<4>esi: c4e60780 edi: 000005a8 ebp: 00000610 esp: c1219e78
<4>ds: 0018 es: 0018 ss: 0018
<4>Process swapper (pid: 0, stackpage=c1219000)
<4>Stack: c4c84478 00000064 c88937cd 00006270 00000010 c4e60780 c4c84478 000005a8
<4> 000005a8 c025787f c4c843a0 c4e60780 000005a8 c4c843a0 c4c84478 c4c843a0
<4> 004bd6a9 c0259a32 c4c843a0 c4e60780 c4c843a0 00000000 c1219ee8 00004050
<4>Call Trace: [<c88937cd>] [<c025787f>] [<c0259a32>] [<c01bedc5>] [<c0259c36>]
<4> [<c0128d6a>] [<c0259b50>] [<c0128e6d>] [<c01246fb>] [<c0109604>] [<c0105490>]
<4> [<c0105490>] [<c0105490>] [<c01054bc>] [<c0105542>] [<c011d3db>] [<c011d76d>]
<4>
<4>Code: 89 5a 04 89 1e 89 43 08 ff 40 08 31 c0 83 c4 14 5b 5e 5f 5d
<1>Dumping from interrupt handler !
<1>Uncertain scenario - but will try my best
<4>
<4>dump: Dumping to device 0x806 [sd(8,6)] on CPU 1 ...
<4>dump: Compression value is 0x0, Writing dump header
<4>
<4>dump: Pass 1: Saving Reserved Pages:
<4>dump: Memory Bank[0]: 0 ... 7feffff:
[...]
lcrash backtrace:
>> bt
================================================================
STACK TRACE FOR TASK: 0xc1218000(swapper)
0 tcp_fragment+674 [0xc0256fb2]
1 tcp_retransmit_skb+170 [0xc025787a]
2 tcp_retransmit_timer+493 [0xc0259a2d]
3 tcp_write_timer+225 [0xc0259c31]
4 timer_bh+710 [0xc0128d66]
5 timer_softirq+40 [0xc0128e68]
6 do_softirq+185 [0xc01246f9]
7 do_IRQ+511 [0xc01095ff]
8 do_IRQ+511 [0xc01095ff]
TRACE ERROR 0x1
================================================================
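
As a cross-check of the fault address, the first bytes on the Code: line
of the panic screen (89 5a 04) decode as "mov %ebx,0x4(%edx)", and the
register dump shows edx = 00000000. The sketch below works through that
arithmetic. It assumes, and this is an assumption rather than something
shown in the dump, that an skb on this 32-bit kernel begins with the
pointers next, prev and list in that order, each 4 bytes wide. The struct
skb_head_i386 and the helper store_target are hypothetical names used only
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the first three fields of an skb on i386.
 * Each pointer is represented as a 4-byte integer so that the layout
 * is the same on any build host.
 */
struct skb_head_i386 {
    uint32_t next;  /* offset 0 */
    uint32_t prev;  /* offset 4 */
    uint32_t list;  /* offset 8 */
};

/* Address written by "mov %ebx,0x4(%edx)" for a given value of edx. */
static uint32_t store_target(uint32_t edx)
{
    return edx + (uint32_t)offsetof(struct skb_head_i386, prev);
}

/* Self-check: with edx == 0 the store lands on address 4. */
static void check_model(void)
{
    assert(offsetof(struct skb_head_i386, prev) == 4);
    assert(store_target(0x00000000u) == 0x00000004u);
}
```

Under that layout assumption, a store to the prev field of a NULL next
pointer matches the reported virtual address 00000004.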
We assumed that this might be related to the preempt code in the
kernel; however, this now appears unlikely. The primary
cause of preempt-related failures is the use of
unprotected "cpu ids" to access "per cpu" data structures.
To this end we have made changes to the "skb" management
code to move the smp_processor_id() calls into the relevant
interrupt-off areas; however, this problem does not seem to
involve any such issue.
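
For readers unfamiliar with that failure mode, here is a small
deterministic user-space sketch of it. Nothing in it is kernel code:
fake_smp_processor_id, fake_migrate and the preempt_disabled flag are
hypothetical stand-ins, and "migration" is simulated by a plain function
call rather than by a real scheduler.

```c
#include <assert.h>

#define NR_CPUS 2

static int current_cpu;       /* CPU the simulated task is running on */
static int preempt_disabled;  /* nonzero while migration is blocked */

/* Hypothetical stand-in for smp_processor_id(). */
static int fake_smp_processor_id(void)
{
    return current_cpu;
}

/* Simulated preemption that moves the task, unless it is blocked. */
static void fake_migrate(int cpu)
{
    if (!preempt_disabled)
        current_cpu = cpu;
}

/*
 * Sample the CPU id, let a migration attempt happen, then report
 * whether the sampled id is stale (1) or still valid (0).
 * protect != 0 models sampling inside a protected region.
 */
static int id_is_stale(int protect)
{
    assert(protect == 0 || protect == 1);
    preempt_disabled = protect;
    int id = fake_smp_processor_id();
    fake_migrate((id + 1) % NR_CPUS);
    int stale = (id != fake_smp_processor_id());
    preempt_disabled = 0;
    return stale;
}
```

With protect set to 0 the sampled id no longer names the CPU the task is
running on, so indexing per-CPU data with it would touch another CPU's
slot; with protect set to 1 the id stays valid for the whole region.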
Is it possible for the other CPU (or even this one, given the
ksoftirqd stuff) to remove or alter the skb that
tcp_fragment() is processing? What locks, if any, are
needed to prevent this?
--
George Anzinger george@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml