public inbox for linux-kernel@vger.kernel.org
* oops in inet_bind/tcp_v4_get_port
@ 2003-09-11 11:42 Wichert Akkerman
  0 siblings, 0 replies; 5+ messages in thread
From: Wichert Akkerman @ 2003-09-11 11:42 UTC
  To: netdev, linux-kernel

I just had a kernel oops while restarting exim4. A hand-copied oops
report is below. This is using kernel 2.6.0-test3 on a UP box without
preempt after about a month of uptime. 

Wichert.

*pde = 00000000
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<c029cde6>]    Not tainted
EFLAGS: 00010246
EIP is at tcp_v4_get_port+0x296/0x2b0
eax: 00000000  ebx: dec3d740  ecx: c18ed860   edx: c18ed870
esi: 00000019  edi: 00000000  ebp: dfcc0064   esp: ddb07e70
ds: 007b   es: 007b   ss: 0068
Process exim4 (pid: 14228, threadinfo=ddb600 task=ddb1b300)
Stack: 00000000 00000000 00000000 00000000 c18ed860 00000000 00000000 00000001
       ddc8f508 00000000 00000246 00000001 ddc8f3e0 ffffffea ddc8f508 00000019
       c02adbac ddc8f3e0 00000019 ddb07ee8 00000002 ddaa99a0 ddb07ee8 00000010
Call Trace:
 [<c02adbac>] inet_bind+0x17c/0x230
 [<c025f24b>] sys_bind+0x7b/0xb0
 [<c011551c>] do_page_fault+0x12c/0x469
 [<c025f938>] sys_setsockopt+0x78/0xc0
 [<c026001b>] sys_socketcall+0xbb/0x2a0
 [<c010901f>] syscall_call+0x7/0xb

Code: 0f b6 40 49 24 20 84 c0 75 9c eb 8a 8d b4 26 00 00 00 00 8d
  <0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing

-- 
Wichert Akkerman <wichert@wiggy.net>    It is simple to make things.
http://www.wiggy.net/                   It is hard to make things simple.



* re: oops in inet_bind/tcp_v4_get_port
@ 2003-09-13  9:58 James Harper
  2003-09-13 10:25 ` YOSHIFUJI Hideaki / 吉藤英明
  0 siblings, 1 reply; 5+ messages in thread
From: James Harper @ 2003-09-13  9:58 UTC
  To: linux-kernel

I get a null pointer exception in the same routine when restarting slapd 
in 2.6.0-test5, and it hangs my system hard. I'm investigating now. If 
anyone has a patch already please send me a copy too!

thanks

James



* Re: oops in inet_bind/tcp_v4_get_port
  2003-09-13  9:58 James Harper
@ 2003-09-13 10:25 ` YOSHIFUJI Hideaki / 吉藤英明
  2003-09-13 12:23   ` James Harper
  0 siblings, 1 reply; 5+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2003-09-13 10:25 UTC
  To: james.harper; +Cc: linux-kernel, yoshfuji

In article <3F62EA61.1000804@bigpond.com> (at Sat, 13 Sep 2003 19:58:57 +1000), James Harper <james.harper@bigpond.com> says:

> I get a null pointer exception in the same routine when restarting slapd 
> in 2.6.0-test5, and it hangs my system hard. I'm investigating now. If 
> anyone has a patch already please send me a copy too!

Have you tried disabling kernel preemption?

--yoshfuji


* Re: oops in inet_bind/tcp_v4_get_port
  2003-09-13 10:25 ` YOSHIFUJI Hideaki / 吉藤英明
@ 2003-09-13 12:23   ` James Harper
  2003-09-13 13:18     ` YOSHIFUJI Hideaki / 吉藤英明
  0 siblings, 1 reply; 5+ messages in thread
From: James Harper @ 2003-09-13 12:23 UTC
  To: linux-kernel; +Cc: yoshfuji

I haven't disabled preemption, but I have pinned down where I'm getting 
the crash. It appears to be related to IPv6, and from what I can 
determine the following is happening:

When I stop slapd... netstat -an | grep 389 looks like this:

tcp 1 0 127.0.0.1:32973 127.0.0.1:389 CLOSE_WAIT
tcp 0 0 127.0.0.1:32974 127.0.0.1:389 TIME_WAIT
tcp6 0 0 ::ffff:127.0.0.1:389 ::ffff:127.0.0.1:32973 FIN_WAIT2
tcp6 0 0 ::ffff:127.0.0.1:389 ::ffff:127.0.0.1:32958 FIN_WAIT2

If I restart it immediately, I get the oops (it's not always a null pointer 
dereference; more often it's the one where you access memory that's out 
of bounds). If I wait until the tcp6 connections time out and then 
restart, I don't get the oops.

The crash is happening in net/ipv4/tcp_ipv4.c, in tcp_bind_conflict (it's 
inline, which I guess is why it isn't in the oops trace, but it's called 
from tcp_v4_get_port), specifically in the call to the macro 
ipv6_only_sock. My guess is that while the sock says it's PF_INET6, it 
doesn't have the extra IPv6 stuff (specifically the pointer to 
ipv6_pinfo), so it's reading past the end of the structure, or that the 
stuff past the main sock struct is getting corrupted. I think the former 
is more likely, but either possibility explains why I got a null pointer 
dereference one time and the out-of-bounds oops the other time.
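
To make the "reading past the end of the structure" guess concrete, here 
is a minimal userspace sketch. All struct and function names are 
hypothetical stand-ins, not the kernel's real definitions; the only 
assumption is that a shortened object shares its leading fields with the 
full socket but stops well before the IPv6 pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical leading fields shared by both object types. */
struct model_common {
	unsigned short family;	/* e.g. PF_INET6 */
	unsigned char  state;	/* connection state */
	unsigned short num;	/* local port */
};

/* Hypothetical IPv6-specific data reached through a pointer. */
struct model_ipv6_pinfo {
	unsigned char ipv6only;
};

/* Model of a full socket: the IPv6 pointer sits far into the object. */
struct model_sock {
	struct model_common c;
	char other_state[256];
	struct model_ipv6_pinfo *pinet6;
};

/* Model of a shortened object that only keeps the leading fields. */
struct model_tw_bucket {
	struct model_common c;
	int timeout;
};

/*
 * Returns 1 if loading the pinet6 field through a model_sock pointer
 * would touch bytes beyond an object that is only obj_size bytes long.
 */
static int pinet6_read_is_out_of_bounds(size_t obj_size)
{
	size_t end = offsetof(struct model_sock, pinet6) +
		     sizeof(struct model_ipv6_pinfo *);

	return end > obj_size;
}
```

With these model types, the check reports an out-of-bounds read for an 
object of sizeof(struct model_tw_bucket) and an in-bounds read for 
sizeof(struct model_sock). Whatever happens to lie past the shorter 
object would then decide whether the bogus pointer is NULL or garbage, 
which would fit seeing two different kinds of oops.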

This is the first time I've ever really looked at the networking code in 
the kernel, so I can't easily see how the above situation could arise, 
but if anyone wants me to test anything I'm more than happy to!

thanks

James


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: oops in inet_bind/tcp_v4_get_port
  2003-09-13 12:23   ` James Harper
@ 2003-09-13 13:18     ` YOSHIFUJI Hideaki / 吉藤英明
  0 siblings, 0 replies; 5+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2003-09-13 13:18 UTC
  To: james.harper; +Cc: linux-kernel, yoshfuji

In article <3F630C59.5090004@bigpond.com> (at Sat, 13 Sep 2003 22:23:53 +1000), James Harper <james.harper@bigpond.com> says:

> The crash is happening in net/ipv4/tcp_ipv4.c, in tcp_bind_conflict (it's 
> inline, which I guess is why it isn't in the oops trace, but it's called 
> from tcp_v4_get_port), specifically in the call to the macro 
> ipv6_only_sock. My guess is that while the sock says it's PF_INET6, it 
> doesn't have the extra IPv6 stuff (specifically the pointer to 
> ipv6_pinfo), so it's reading past the end of the structure, or that the 
> stuff past the main sock struct is getting corrupted. I think the former 
> is more likely, but either possibility explains why I got a null pointer 
> dereference one time and the out-of-bounds oops the other time.

Good point. Please try this patch.

Index: linux-2.6/include/net/tcp.h
===================================================================
RCS file: /home/cvs/linux-2.5/include/net/tcp.h,v
retrieving revision 1.45
diff -u -r1.45 tcp.h
--- linux-2.6/include/net/tcp.h	10 Jul 2003 02:18:15 -0000	1.45
+++ linux-2.6/include/net/tcp.h	13 Sep 2003 11:36:32 -0000
@@ -219,6 +219,7 @@
 #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
 	struct in6_addr		tw_v6_daddr;
 	struct in6_addr		tw_v6_rcv_saddr;
+	u8			tw_v6_v6only;
 #endif
 };
 
@@ -285,6 +286,18 @@
 extern void tcp_tw_schedule(struct tcp_tw_bucket *tw, int timeo);
 extern void tcp_tw_deschedule(struct tcp_tw_bucket *tw);
 
+/* Check if tcp socket is the "v6only" socket */
+static inline int tcp6_only_sock(const struct sock *sk)
+{
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+	if (sk->sk_state == TCP_TIME_WAIT)
+		return tcptw_sk(sk)->tw_v6_v6only;
+	else
+		return ipv6_only_sock(sk);
+#else
+	return 0;
+#endif
+}
 
 /* Socket demux engine toys. */
 #ifdef __BIG_ENDIAN
Index: linux-2.6/net/ipv4/tcp_ipv4.c
===================================================================
RCS file: /home/cvs/linux-2.5/net/ipv4/tcp_ipv4.c,v
retrieving revision 1.57
diff -u -r1.57 tcp_ipv4.c
--- linux-2.6/net/ipv4/tcp_ipv4.c	9 Sep 2003 23:24:51 -0000	1.57
+++ linux-2.6/net/ipv4/tcp_ipv4.c	13 Sep 2003 11:36:32 -0000
@@ -186,7 +186,7 @@
 
 	sk_for_each_bound(sk2, node, &tb->owners) {
 		if (sk != sk2 &&
-		    !ipv6_only_sock(sk2) &&
+		    !tcp6_only_sock(sk2) &&
 		    sk->sk_bound_dev_if == sk2->sk_bound_dev_if) {
 			if (!reuse || !sk2->sk_reuse ||
 			    sk2->sk_state == TCP_LISTEN) {
@@ -418,7 +418,7 @@
 	sk_for_each(sk, node, head) {
 		struct inet_opt *inet = inet_sk(sk);
 
-		if (inet->num == hnum && !ipv6_only_sock(sk)) {
+		if (inet->num == hnum && !tcp6_only_sock(sk)) {
 			__u32 rcv_saddr = inet->rcv_saddr;
 
 			score = (sk->sk_family == PF_INET ? 1 : 0);
@@ -457,7 +457,7 @@
 
 		if (inet->num == hnum && !sk->sk_node.next &&
 		    (!inet->rcv_saddr || inet->rcv_saddr == daddr) &&
-		    (sk->sk_family == PF_INET || !ipv6_only_sock(sk)) &&
+		    (sk->sk_family == PF_INET || !tcp6_only_sock(sk)) &&
 		    !sk->sk_bound_dev_if)
 			goto sherry_cache;
 		sk = __tcp_v4_lookup_listener(head, daddr, hnum, dif);
Index: linux-2.6/net/ipv4/tcp_minisocks.c
===================================================================
RCS file: /home/cvs/linux-2.5/net/ipv4/tcp_minisocks.c,v
retrieving revision 1.40
diff -u -r1.40 tcp_minisocks.c
--- linux-2.6/net/ipv4/tcp_minisocks.c	27 Jun 2003 16:44:43 -0000	1.40
+++ linux-2.6/net/ipv4/tcp_minisocks.c	13 Sep 2003 11:36:32 -0000
@@ -367,7 +367,9 @@
 
 			ipv6_addr_copy(&tw->tw_v6_daddr, &np->daddr);
 			ipv6_addr_copy(&tw->tw_v6_rcv_saddr, &np->rcv_saddr);
-		}
+			tw->tw_v6_v6only = np->ipv6only;
+		} else
+			tw->tw_v6_v6only = 0;
 #endif
 		/* Linkage updates. */
 		__tcp_tw_hashdance(sk, tw);
Index: linux-2.6/net/ipv6/addrconf.c
===================================================================
RCS file: /home/cvs/linux-2.5/net/ipv6/addrconf.c,v
retrieving revision 1.53
diff -u -r1.53 addrconf.c
--- linux-2.6/net/ipv6/addrconf.c	9 Sep 2003 23:24:51 -0000	1.53
+++ linux-2.6/net/ipv6/addrconf.c	13 Sep 2003 11:36:32 -0000
@@ -968,16 +968,16 @@
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	int addr_type = ipv6_addr_type(&np->rcv_saddr);
 
-	if (!inet_sk(sk2)->rcv_saddr && !ipv6_only_sock(sk))
+	if (!inet_sk(sk2)->rcv_saddr && !tcp6_only_sock(sk))
 		return 1;
 
 	if (sk2->sk_family == AF_INET6 &&
 	    ipv6_addr_any(&inet6_sk(sk2)->rcv_saddr) &&
-	    !(ipv6_only_sock(sk2) && addr_type == IPV6_ADDR_MAPPED))
+	    !(tcp6_only_sock(sk2) && addr_type == IPV6_ADDR_MAPPED))
 		return 1;
 
 	if (addr_type == IPV6_ADDR_ANY &&
-	    (!ipv6_only_sock(sk) ||
+	    (!tcp6_only_sock(sk) ||
 	     !(sk2->sk_family == AF_INET6 ?
 	       (ipv6_addr_type(&inet6_sk(sk2)->rcv_saddr) == IPV6_ADDR_MAPPED) :
 	        1)))
@@ -991,7 +991,7 @@
 		return 1;
 
 	if (addr_type == IPV6_ADDR_MAPPED &&
-	    !ipv6_only_sock(sk2) &&
+	    !tcp6_only_sock(sk2) &&
 	    (!inet_sk(sk2)->rcv_saddr ||
 	     !inet_sk(sk)->rcv_saddr ||
 	     inet_sk(sk)->rcv_saddr == inet_sk(sk2)->rcv_saddr))
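
The idea behind the patch above can be modelled in userspace roughly as 
follows: look at the state first, and only follow the IPv6 pointer when 
the object is a full socket; for a time-wait object, read a flag that 
was copied in when the object was created. This is only a sketch. The 
model_* names are hypothetical, and the constants 1 and 6 merely mirror 
the usual TCP_ESTABLISHED and TCP_TIME_WAIT values.

```c
#include <assert.h>
#include <stddef.h>

/* Model states; values chosen to mirror TCP_ESTABLISHED and TCP_TIME_WAIT. */
enum { MODEL_ESTABLISHED = 1, MODEL_TIME_WAIT = 6 };

struct model_pinfo { unsigned char ipv6only; };

/* Leading field shared by both object types. */
struct model_hdr { int state; };

/* Full socket: v6only lives behind a pointer. */
struct model_full { struct model_hdr h; struct model_pinfo *pinet6; };

/* Time-wait object: no pointer, so the flag is stored inline. */
struct model_tw { struct model_hdr h; unsigned char v6only; };

/* Pick the right source for the flag based on the object's state. */
static int model_v6only(const struct model_hdr *h)
{
	if (h->state == MODEL_TIME_WAIT) {
		return ((const struct model_tw *)h)->v6only;
	} else {
		const struct model_full *f = (const struct model_full *)h;

		return f->pinet6 != NULL && f->pinet6->ipv6only;
	}
}

/* Helper: query a full socket whose IPv6 info carries the given flag. */
static int model_check_full(unsigned char flag)
{
	struct model_pinfo p = { flag };
	struct model_full f = { { MODEL_ESTABLISHED }, &p };

	return model_v6only(&f.h);
}

/* Helper: query a time-wait object carrying the given flag inline. */
static int model_check_tw(unsigned char flag)
{
	struct model_tw t = { { MODEL_TIME_WAIT }, flag };

	return model_v6only(&t.h);
}
```

In this model, model_check_tw() never touches a pinet6 pointer, which is 
the property the tw_v6_v6only field is meant to provide for time-wait 
buckets.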

-- 
Hideaki YOSHIFUJI @ USAGI Project <yoshfuji@linux-ipv6.org>
GPG FP: 9022 65EB 1ECF 3AD1 0BDF  80D8 4807 F894 E062 0EEA
