netdev.vger.kernel.org archive mirror
* [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data"
@ 2013-05-16 22:23 Eric Dumazet
  2013-05-17  0:27 ` [PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data() Eric Dumazet
  2013-06-15 18:51 ` [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data" Sebastian Andrzej Siewior
  0 siblings, 2 replies; 11+ messages in thread
From: Eric Dumazet @ 2013-05-16 22:23 UTC
  To: David Miller, Herbert Xu; +Cc: netdev, Hideaki YOSHIFUJI, Neal Cardwell

Hi Herbert

Looking at the code added in commit 0178b695fd6b40a62a215cb
("ipv6: Copy cork options in ip6_append_data") it looks like we can have
either a memleak or corruption (later in ip6_cork_release()) in case one
of the sub-allocations (ip6_opt_dup()/ip6_rthdr_dup()) fails.

I would at least use a kzalloc() instead of kmalloc() in 

np->cork.opt = kmalloc(opt->tot_len, sk->sk_allocation);

Or maybe better, reuse the code in  ipv6_dup_options() so that we
perform a single memory allocation ?

Am I missing something obvious ?

Thanks !
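
To make the failure mode concrete, here is a minimal user-space sketch, not
code from the kernel tree. The names struct txopts, dup_or_fail(),
txopts_copy() and txopts_release() are hypothetical stand-ins for
struct ipv6_txoptions, ip6_opt_dup(), the copy done in ip6_append_data() and
ip6_cork_release(). It uses calloc() in the role of kzalloc(): when a
sub-copy fails midway, the slots not yet filled are still NULL, so the
release path only ever frees valid pointers. With an uninitialized container
those slots would hold garbage instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the four pointers held by struct ipv6_txoptions. */
struct txopts {
	void *hopopt;
	void *dst0opt;
	void *dst1opt;
	void *srcrt;
};

/* Stand-in for ip6_opt_dup(): copy len bytes, or fail on request. */
static void *dup_or_fail(const void *src, size_t len, int fail)
{
	void *p;

	if (!src || fail)
		return NULL;
	p = malloc(len);
	if (p)
		memcpy(p, src, len);
	return p;
}

/* Same shape as ip6_cork_release(): free every slot, then the container. */
void txopts_release(struct txopts *o)
{
	if (!o)
		return;
	free(o->dst0opt);	/* free(NULL) is a no-op, like kfree(NULL) */
	free(o->dst1opt);
	free(o->hopopt);
	free(o->srcrt);
	free(o);
}

/*
 * Copy src into a zero-filled container stored in *out.
 * fail_at picks which sub-copy fails (0..3), or -1 for none.
 * On failure *out stays partially filled, as np->cork.opt does.
 */
int txopts_copy(struct txopts **out, const struct txopts *src,
		size_t len, int fail_at)
{
	/* kzalloc() analogue; assumes all-zero bytes read back as NULL */
	struct txopts *o = calloc(1, sizeof(*o));

	*out = o;
	if (!o)
		return -1;

	o->dst0opt = dup_or_fail(src->dst0opt, len, fail_at == 0);
	if (src->dst0opt && !o->dst0opt)
		return -1;
	o->dst1opt = dup_or_fail(src->dst1opt, len, fail_at == 1);
	if (src->dst1opt && !o->dst1opt)
		return -1;
	o->hopopt = dup_or_fail(src->hopopt, len, fail_at == 2);
	if (src->hopopt && !o->hopopt)
		return -1;
	o->srcrt = dup_or_fail(src->srcrt, len, fail_at == 3);
	if (src->srcrt && !o->srcrt)
		return -1;
	return 0;
}
```

Calling txopts_copy() with fail_at set to 1 and then txopts_release() on the
result frees one real buffer and three NULL slots, which is the behaviour a
zeroing allocator is meant to guarantee.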

* [PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data()
  2013-05-16 22:23 [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data" Eric Dumazet
@ 2013-05-17  0:27 ` Eric Dumazet
  2013-05-17 13:58   ` Herbert Xu
  2013-06-15 18:51 ` [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data" Sebastian Andrzej Siewior
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2013-05-17  0:27 UTC
  To: David Miller; +Cc: Herbert Xu, netdev, Hideaki YOSHIFUJI, Neal Cardwell

On Thu, 2013-05-16 at 15:23 -0700, Eric Dumazet wrote:
> Hi Herbert
> 
> Looking at the code added in commit 0178b695fd6b40a62a215cb
> ("ipv6: Copy cork options in ip6_append_data") it looks like we can have
> either a memleak or corruption (later in ip6_cork_release()) in case one
> of the sub-allocations (ip6_opt_dup()/ip6_rthdr_dup()) fails.
> 
> I would at least use a kzalloc() instead of kmalloc() in 
> 
> np->cork.opt = kmalloc(opt->tot_len, sk->sk_allocation);
> 
> Or maybe better, reuse the code in  ipv6_dup_options() so that we
> perform a single memory allocation ?

Something like the following, maybe?

[PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data()

commit 0178b695fd6b4 ("ipv6: Copy cork options in ip6_append_data")
added some code duplication and bad error recovery, leading to potential
crash.

Allow ipv6_dup_options() to be called with a NULL socket argument
so that we can reuse it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Neal Cardwell <ncardwell@google.com>
---
Only compile-tested, I would appreciate a review from Herbert and/or
Hideaki

 net/ipv6/exthdrs.c    |    6 +++++-
 net/ipv6/ip6_output.c |   38 ++++----------------------------------
 2 files changed, 9 insertions(+), 35 deletions(-)

diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 07a7d65..905ec23 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -721,7 +721,11 @@ ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
 {
 	struct ipv6_txoptions *opt2;
 
-	opt2 = sock_kmalloc(sk, opt->tot_len, GFP_ATOMIC);
+	if (sk)
+		opt2 = sock_kmalloc(sk, opt->tot_len, GFP_ATOMIC);
+	else
+		opt2 = kmalloc(opt->tot_len, GFP_ATOMIC);
+
 	if (opt2) {
 		long dif = (char *)opt2 - (char *)opt;
 		memcpy(opt2, opt, opt->tot_len);
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index d2eedf1..fd44b9c 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1147,32 +1147,8 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 			if (WARN_ON(np->cork.opt))
 				return -EINVAL;
 
-			np->cork.opt = kmalloc(opt->tot_len, sk->sk_allocation);
-			if (unlikely(np->cork.opt == NULL))
-				return -ENOBUFS;
-
-			np->cork.opt->tot_len = opt->tot_len;
-			np->cork.opt->opt_flen = opt->opt_flen;
-			np->cork.opt->opt_nflen = opt->opt_nflen;
-
-			np->cork.opt->dst0opt = ip6_opt_dup(opt->dst0opt,
-							    sk->sk_allocation);
-			if (opt->dst0opt && !np->cork.opt->dst0opt)
-				return -ENOBUFS;
-
-			np->cork.opt->dst1opt = ip6_opt_dup(opt->dst1opt,
-							    sk->sk_allocation);
-			if (opt->dst1opt && !np->cork.opt->dst1opt)
-				return -ENOBUFS;
-
-			np->cork.opt->hopopt = ip6_opt_dup(opt->hopopt,
-							   sk->sk_allocation);
-			if (opt->hopopt && !np->cork.opt->hopopt)
-				return -ENOBUFS;
-
-			np->cork.opt->srcrt = ip6_rthdr_dup(opt->srcrt,
-							    sk->sk_allocation);
-			if (opt->srcrt && !np->cork.opt->srcrt)
+			np->cork.opt = ipv6_dup_options(NULL, opt);
+			if (unlikely(!np->cork.opt))
 				return -ENOBUFS;
 
 			/* need source address above miyazawa*/
@@ -1463,14 +1439,8 @@ EXPORT_SYMBOL_GPL(ip6_append_data);
 
 static void ip6_cork_release(struct inet_sock *inet, struct ipv6_pinfo *np)
 {
-	if (np->cork.opt) {
-		kfree(np->cork.opt->dst0opt);
-		kfree(np->cork.opt->dst1opt);
-		kfree(np->cork.opt->hopopt);
-		kfree(np->cork.opt->srcrt);
-		kfree(np->cork.opt);
-		np->cork.opt = NULL;
-	}
+	kfree(np->cork.opt);
+	np->cork.opt = NULL;
 
 	if (inet->cork.base.dst) {
 		dst_release(inet->cork.base.dst);

* Re: [PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data()
  2013-05-17  0:27 ` [PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data() Eric Dumazet
@ 2013-05-17 13:58   ` Herbert Xu
  2013-05-17 14:53     ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2013-05-17 13:58 UTC
  To: Eric Dumazet; +Cc: David Miller, netdev, Hideaki YOSHIFUJI, Neal Cardwell

On Thu, May 16, 2013 at 05:27:32PM -0700, Eric Dumazet wrote:
> On Thu, 2013-05-16 at 15:23 -0700, Eric Dumazet wrote:
> > Hi Herbert
> > 
> > Looking at the code added in commit 0178b695fd6b40a62a215cb
> > ("ipv6: Copy cork options in ip6_append_data") it looks like we can have
> > either a memleak or corruption (later in ip6_cork_release()) in case one
> > of the sub-allocations (ip6_opt_dup()/ip6_rthdr_dup()) fails.
> > 
> > I would at least use a kzalloc() instead of kmalloc() in 
> > 
> > np->cork.opt = kmalloc(opt->tot_len, sk->sk_allocation);
> > 
> > Or maybe better, reuse the code in  ipv6_dup_options() so that we
> > perform a single memory allocation ?
> 
> Something like the following, maybe?
> 
> [PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data()
> 
> commit 0178b695fd6b4 ("ipv6: Copy cork options in ip6_append_data")
> added some code duplication and bad error recovery, leading to potential
> crash.
> 
> Allow ipv6_dup_options() to be called with a NULL socket argument
> so that we can reuse it.

Yes you're quite right, my code was definitely buggy.

> @@ -721,7 +721,11 @@ ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
>  {
>  	struct ipv6_txoptions *opt2;
>  
> -	opt2 = sock_kmalloc(sk, opt->tot_len, GFP_ATOMIC);
> +	if (sk)
> +		opt2 = sock_kmalloc(sk, opt->tot_len, GFP_ATOMIC);
> +	else
> +		opt2 = kmalloc(opt->tot_len, GFP_ATOMIC);
> +

However, I think this function is just as buggy as the original
code that I replaced.  If you look at the code that fills in the
options in ip6_datagram_send_ctl, you'll find that the options do
not lie in the memory area of the opt + opt->tot_len.  They instead
point to data in the cmsg.

So I think we should

1) fix ipv6_dup_options to do what I tried to do but in a non-buggy way;
2) make the UDP path use it.

BTW, in the UDP path we also have a socket so we can just charge the
memory to it and avoid using kmalloc at all.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
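
The concern about ipv6_dup_options() can be illustrated in user space. The
sketch below is a simplified miniature under stated assumptions, not the
kernel function: struct mini_opts, points_inside() and mini_dup() are
hypothetical names. It copies a block with one memcpy() and rebases the
option pointer by its offset, which is only meaningful when the pointer
targets the block itself. A pointer into some other buffer (as with options
that point into the cmsg) is rejected here rather than rebased into a bogus
address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of struct ipv6_txoptions: header plus one pointer. */
struct mini_opts {
	size_t tot_len;		/* size of the whole block, header included */
	unsigned char *hopopt;	/* may point inside the block, or elsewhere */
};

/* Return 1 if p points into the tot_len bytes starting at base. */
int points_inside(const struct mini_opts *base, const unsigned char *p)
{
	uintptr_t start = (uintptr_t)base;
	uintptr_t addr = (uintptr_t)p;

	return p && addr >= start && addr - start < base->tot_len;
}

/*
 * Single-allocation duplicate: memcpy() the block, then rebase the
 * interior pointer. Returns NULL on allocation failure, or when the
 * pointer is external and so cannot be duplicated this way.
 */
struct mini_opts *mini_dup(const struct mini_opts *src)
{
	struct mini_opts *dst;

	if (src->hopopt && !points_inside(src, src->hopopt))
		return NULL;

	dst = malloc(src->tot_len);
	if (!dst)
		return NULL;

	memcpy(dst, src, src->tot_len);
	if (src->hopopt)
		dst->hopopt = (unsigned char *)dst +
			      (src->hopopt - (const unsigned char *)src);
	return dst;
}
```

Rejecting external pointers is a choice made for this sketch; a real fix
could instead copy the external data, which is closer to what points 1) and
2) above ask for.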

* Re: [PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data()
  2013-05-17 13:58   ` Herbert Xu
@ 2013-05-17 14:53     ` Eric Dumazet
  2013-05-17 23:36       ` Herbert Xu
  2013-05-18 19:57       ` David Miller
  0 siblings, 2 replies; 11+ messages in thread
From: Eric Dumazet @ 2013-05-17 14:53 UTC
  To: Herbert Xu; +Cc: David Miller, netdev, Hideaki YOSHIFUJI, Neal Cardwell

From: Eric Dumazet <edumazet@google.com>

On Fri, 2013-05-17 at 21:58 +0800, Herbert Xu wrote:

> However, I think this function is just as buggy as the original
> code that I replaced.  If you look at the code that fills in the
> options in ip6_datagram_send_ctl, you'll find that the options do
> not lie in the memory area of the opt + opt->tot_len.  They instead
> point to data in the cmsg.
> 
> So I think we should
> 
> 1) fix ipv6_dup_options to do what I tried to do but in a non-buggy way;
> 2) make the UDP path use it.
> 
> BTW, in the UDP path we also have a socket so we can just charge the
> memory to it and avoid using kmalloc at all.

OK, so I guess for stable we should use kzalloc(), and work on a cleanup
in net-next.

Thanks !

[PATCH] ipv6: fix possible crashes in ip6_cork_release()

commit 0178b695fd6b4 ("ipv6: Copy cork options in ip6_append_data")
added some code duplication and bad error recovery, leading to potential
crash in ip6_cork_release() as kfree() could be called with garbage.

Use kzalloc() to make sure this won't happen.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Neal Cardwell <ncardwell@google.com>
---
 net/ipv6/ip6_output.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index d2eedf1..dae1949 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1147,7 +1147,7 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 			if (WARN_ON(np->cork.opt))
 				return -EINVAL;
 
-			np->cork.opt = kmalloc(opt->tot_len, sk->sk_allocation);
+			np->cork.opt = kzalloc(opt->tot_len, sk->sk_allocation);
 			if (unlikely(np->cork.opt == NULL))
 				return -ENOBUFS;
 

* Re: [PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data()
  2013-05-17 14:53     ` Eric Dumazet
@ 2013-05-17 23:36       ` Herbert Xu
  2013-05-18 19:57       ` David Miller
  1 sibling, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2013-05-17 23:36 UTC
  To: Eric Dumazet; +Cc: David Miller, netdev, Hideaki YOSHIFUJI, Neal Cardwell

On Fri, May 17, 2013 at 07:53:13AM -0700, Eric Dumazet wrote:
>
> OK, so I guess for stable we should use kzalloc(), and work on a cleanup
> in net-next.

I agree.  Thank you.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data()
  2013-05-17 14:53     ` Eric Dumazet
  2013-05-17 23:36       ` Herbert Xu
@ 2013-05-18 19:57       ` David Miller
  1 sibling, 0 replies; 11+ messages in thread
From: David Miller @ 2013-05-18 19:57 UTC
  To: eric.dumazet; +Cc: herbert, netdev, yoshfuji, ncardwell

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 17 May 2013 07:53:13 -0700

> [PATCH] ipv6: fix possible crashes in ip6_cork_release()
> 
> commit 0178b695fd6b4 ("ipv6: Copy cork options in ip6_append_data")
> added some code duplication and bad error recovery, leading to potential
> crash in ip6_cork_release() as kfree() could be called with garbage.
> 
> Use kzalloc() to make sure this won't happen.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks.

* Re: [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data"
  2013-05-16 22:23 [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data" Eric Dumazet
  2013-05-17  0:27 ` [PATCH net-next] ipv6: use ipv6_dup_options() from ip6_append_data() Eric Dumazet
@ 2013-06-15 18:51 ` Sebastian Andrzej Siewior
  2013-06-16  9:12   ` Eric Dumazet
  1 sibling, 1 reply; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-15 18:51 UTC
  To: Eric Dumazet
  Cc: David Miller, Herbert Xu, netdev, Hideaki YOSHIFUJI,
	Neal Cardwell

On Thu, May 16, 2013 at 03:23:10PM -0700, Eric Dumazet wrote:
> Hi Herbert
Hi Eric,

> Looking at the code added in commit 0178b695fd6b40a62a215cb
> ("ipv6: Copy cork options in ip6_append_data") it looks like we can have
> either a memleak or corruption (later in ip6_cork_release()) in case one
> of the sub-allocations (ip6_opt_dup()/ip6_rthdr_dup()) fails.

Would this explain the following on 3.9.5?

| BUG: unable to handle kernel paging request at 00000000ffffc52c
| IP: [<ffffffff81342d2b>] ip6_append_data+0xb93/0xbea
| RIP: 0010:[<ffffffff81342d2b>]  [<ffffffff81342d2b>] ip6_append_data+0xb93/0xbea
| RSP: 0018:ffff880072cf7a28  EFLAGS: 00010202
| RAX: 00000000ffffc334 RBX: ffff88007c14cd80 RCX: 0000000000000008
| RDX: 00000000ffffffe0 RSI: 0000000000000048 RDI: ffff88007c14cd80
| RBP: 0000000000000000 R08: ffff880072cf7a98 R09: 0000000000000040
| R10: 0000000000000000 R11: ffff88007c14cd80 R12: ffff88007c6208c0
| R13: 0000000000000008 R14: 0000000000000000 R15: 000000000000fff0
| FS:  00007f2342014700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
| CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
| CR2: 00000000ffffc52c CR3: 0000000020799000 CR4: 00000000000006f0
| DR0: 00000000327ff15b DR1: 0000000000000000 DR2: 0000000000000000
| DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
| Process trinity-child0 (pid: 31667, threadinfo ffff880072cf6000, task ffff880037509830)
| Stack:
|  0000000000000001 0000000000000400 0000000800000028 0000ffe800000000
|  0000000000000000 0000000000000008 0000000000000008 ffff88007c14ce90
|  ffffffff812f9545 ffff880072cf7db8 0000000000000000 0000002000000010
| Call Trace:
|  [<ffffffff812f9545>] ? ip_skb_dst_mtu+0x32/0x32
|  [<ffffffff81390462>] ? _raw_spin_lock_bh+0xe/0x1c
|  [<ffffffff8106161c>] ? should_resched+0x5/0x23
|  [<ffffffff81356606>] ? udpv6_sendmsg+0x668/0x84d
|  [<ffffffff812be1ef>] ? sock_sendmsg+0x4f/0x6c
|  [<ffffffff812be3fe>] ? __sys_sendmsg+0x1f2/0x284
|  [<ffffffff813904bb>] ? _raw_spin_lock_irqsave+0x14/0x35
|  [<ffffffff81058710>] ? remove_wait_queue+0xe/0x48
|  [<ffffffff8139047c>] ? _raw_spin_unlock_irqrestore+0xc/0xd
|  [<ffffffff81257004>] ? n_tty_write+0x309/0x348
|  [<ffffffff8102f296>] ? kvm_clock_read+0x1c/0x1e
|  [<ffffffff811cf695>] ? timerqueue_add+0x79/0x98
|  [<ffffffff8105a352>] ? enqueue_hrtimer+0x36/0x6d
|  [<ffffffff8139047c>] ? _raw_spin_unlock_irqrestore+0xc/0xd
|  [<ffffffff811219bc>] ? fget_light+0x2e/0x7c
|  [<ffffffff812bf425>] ? sys_sendmsg+0x39/0x57
|  [<ffffffff81395869>] ? system_call_fastpath+0x16/0x1b
| Code: 00 0f 8f 12 fa ff ff e9 d9 f4 ff ff c7 44 24 70 f2 ff ff ff 8b 4c 24 14 29 8b e4 02 00 00 49 8b 84 24 48 01 00 00 48 85 c0 74 0c <48>  8b 80 f8 01 00 00 65 48 ff 40 70 48 8b 43 30 48 8b 80 70 01 
| RIP  [<ffffffff81342d2b>] ip6_append_data+0xb93/0xbea

Unfortunately I have no idea how this happened. trinity had been running for a
while and I failed to capture any logs due to a PEBKAC. The RIP is at

|IP6_INC_STATS(sock_net(sk), rt->rt6i_idev, IPSTATS_MIB_OUTDISCARDS);

|81342d1e:       49 8b 84 24 48 01 00    mov    0x148(%r12),%rax
|81342d25:       00 
|81342d26:       48 85 c0                test   %rax,%rax
|81342d29:       74 0c                   je     ffffffff81342d37 <ip6_append_data+0xb9f>
|81342d2b:       48 8b 80 f8 01 00 00    mov    0x1f8(%rax),%rax
^^^
|81342d32:       65 48 ff 40 70          incq   %gs:0x70(%rax)

It looks like rt6i_idev is not NULL, but it is also not a valid pointer, since the
upper 32 bits are zero.

Sebastian
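
The register dump above can be cross-checked with a little arithmetic. The
sketch below assumes x86-64 with 48-bit virtual addresses, where kernel
pointers sit at or above 0xffff800000000000; both helper names are
hypothetical. The faulting instruction is mov 0x1f8(%rax),%rax, so the
address touched should be RAX plus 0x1f8, which is what CR2 reports.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough plausibility test for a kernel pointer.
 * Assumption: x86-64, 4-level paging, kernel in the upper canonical half.
 */
int looks_like_kernel_pointer(uint64_t v)
{
	return v >= UINT64_C(0xffff800000000000);
}

/* Address touched by an instruction of the form "mov disp(%reg),%reg". */
uint64_t effective_address(uint64_t base, uint64_t disp)
{
	return base + disp;
}
```

With the values from the oops, effective_address(0x00000000ffffc334, 0x1f8)
gives 0x00000000ffffc52c, the reported fault address, and
looks_like_kernel_pointer() is false for RAX while it is true for a value
such as RBX (ffff88007c14cd80).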

* Re: [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data"
  2013-06-15 18:51 ` [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data" Sebastian Andrzej Siewior
@ 2013-06-16  9:12   ` Eric Dumazet
  2013-06-16 19:07     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2013-06-16  9:12 UTC
  To: Sebastian Andrzej Siewior
  Cc: David Miller, Herbert Xu, netdev, Hideaki YOSHIFUJI,
	Neal Cardwell

On Sat, 2013-06-15 at 20:51 +0200, Sebastian Andrzej Siewior wrote:
> On Thu, May 16, 2013 at 03:23:10PM -0700, Eric Dumazet wrote:
> > Hi Herbert
> Hi Eric,
> 
> > Looking at the code added in commit 0178b695fd6b40a62a215cb
> > ("ipv6: Copy cork options in ip6_append_data") it looks like we can have
> > either a memleak or corruption (later in ip6_cork_release()) in case one
> > of the sub-allocations (ip6_opt_dup()/ip6_rthdr_dup()) fails.
> 
> Would this explain the following on 3.9.5?

No, that's a different issue.

> 
> | BUG: unable to handle kernel paging request at 00000000ffffc52c
> | IP: [<ffffffff81342d2b>] ip6_append_data+0xb93/0xbea
> | RIP: 0010:[<ffffffff81342d2b>]  [<ffffffff81342d2b>] ip6_append_data+0xb93/0xbea
> | RSP: 0018:ffff880072cf7a28  EFLAGS: 00010202
> | RAX: 00000000ffffc334 RBX: ffff88007c14cd80 RCX: 0000000000000008
> | RDX: 00000000ffffffe0 RSI: 0000000000000048 RDI: ffff88007c14cd80
> | RBP: 0000000000000000 R08: ffff880072cf7a98 R09: 0000000000000040
> | R10: 0000000000000000 R11: ffff88007c14cd80 R12: ffff88007c6208c0
> | R13: 0000000000000008 R14: 0000000000000000 R15: 000000000000fff0
> | FS:  00007f2342014700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
> | CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> | CR2: 00000000ffffc52c CR3: 0000000020799000 CR4: 00000000000006f0
> | DR0: 00000000327ff15b DR1: 0000000000000000 DR2: 0000000000000000
> | DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> | Process trinity-child0 (pid: 31667, threadinfo ffff880072cf6000, task ffff880037509830)
> | Stack:
> |  0000000000000001 0000000000000400 0000000800000028 0000ffe800000000
> |  0000000000000000 0000000000000008 0000000000000008 ffff88007c14ce90
> |  ffffffff812f9545 ffff880072cf7db8 0000000000000000 0000002000000010
> | Call Trace:
> |  [<ffffffff812f9545>] ? ip_skb_dst_mtu+0x32/0x32
> |  [<ffffffff81390462>] ? _raw_spin_lock_bh+0xe/0x1c
> |  [<ffffffff8106161c>] ? should_resched+0x5/0x23
> |  [<ffffffff81356606>] ? udpv6_sendmsg+0x668/0x84d
> |  [<ffffffff812be1ef>] ? sock_sendmsg+0x4f/0x6c
> |  [<ffffffff812be3fe>] ? __sys_sendmsg+0x1f2/0x284
> |  [<ffffffff813904bb>] ? _raw_spin_lock_irqsave+0x14/0x35
> |  [<ffffffff81058710>] ? remove_wait_queue+0xe/0x48
> |  [<ffffffff8139047c>] ? _raw_spin_unlock_irqrestore+0xc/0xd
> |  [<ffffffff81257004>] ? n_tty_write+0x309/0x348
> |  [<ffffffff8102f296>] ? kvm_clock_read+0x1c/0x1e
> |  [<ffffffff811cf695>] ? timerqueue_add+0x79/0x98
> |  [<ffffffff8105a352>] ? enqueue_hrtimer+0x36/0x6d
> |  [<ffffffff8139047c>] ? _raw_spin_unlock_irqrestore+0xc/0xd
> |  [<ffffffff811219bc>] ? fget_light+0x2e/0x7c
> |  [<ffffffff812bf425>] ? sys_sendmsg+0x39/0x57
> |  [<ffffffff81395869>] ? system_call_fastpath+0x16/0x1b
> | Code: 00 0f 8f 12 fa ff ff e9 d9 f4 ff ff c7 44 24 70 f2 ff ff ff 8b 4c 24 14 29 8b e4 02 00 00 49 8b 84 24 48 01 00 00 48 85 c0 74 0c <48>  8b 80 f8 01 00 00 65 48 ff 40 70 48 8b 43 30 48 8b 80 70 01 
> | RIP  [<ffffffff81342d2b>] ip6_append_data+0xb93/0xbea
> 
> Unfortunately I have no idea how this happened. trinity had been running for a
> while and I failed to capture any logs due to a PEBKAC. The RIP is at
> 
> |IP6_INC_STATS(sock_net(sk), rt->rt6i_idev, IPSTATS_MIB_OUTDISCARDS);
> 
> |81342d1e:       49 8b 84 24 48 01 00    mov    0x148(%r12),%rax
> |81342d25:       00 
> |81342d26:       48 85 c0                test   %rax,%rax
> |81342d29:       74 0c                   je     ffffffff81342d37 <ip6_append_data+0xb9f>
> |81342d2b:       48 8b 80 f8 01 00 00    mov    0x1f8(%rax),%rax
> ^^^
> |81342d32:       65 48 ff 40 70          incq   %gs:0x70(%rax)
> 
> It looks like rt6i_idev is not NULL, but it is also not a valid pointer, since the
> upper 32 bits are zero.

Yep, this was discussed 2 months ago. Initial report from Dave Jones

http://comments.gmane.org/gmane.linux.network/264030

So far, I am not sure we solved the problem.
Could you try latest net-next tree ?

* Re: [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data"
  2013-06-16  9:12   ` Eric Dumazet
@ 2013-06-16 19:07     ` Sebastian Andrzej Siewior
  2013-06-16 20:10       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-16 19:07 UTC
  To: Eric Dumazet
  Cc: David Miller, Herbert Xu, netdev, Hideaki YOSHIFUJI,
	Neal Cardwell

On Sun, Jun 16, 2013 at 02:12:33AM -0700, Eric Dumazet wrote:
> 
> Yep, this was discussed 2 months ago. Initial report from Dave Jones
> 
> http://comments.gmane.org/gmane.linux.network/264030
> 
> So far, I am not sure we solved the problem.
> Could you try latest net-next tree ?

Yep. I pretty soon ran into

| BUG: unable to handle kernel paging request at 000000000e180200
| IP: [<ffffffff8131ff8c>] ip6_push_pending_frames+0x28a/0x428
| PGD 7a30f067 PUD 7a310067 PMD 0
| Oops: 0000 [#1] SMP
| Modules linked in: xfrm_user xfrm_algo ipt_ULOG x_tables can_bcm can irda crc_ccitt ax25 nfc rfkill ipx p8023 p8022 atm appletalk psnap llc nfnetlink cirrus ttm snd_pcm snd_page_alloc snd_timer snd soundcore parport_pc drm_kms_helper drm i2c_piix4 syscopyarea sysfillrect psmouse serio_raw sysimgblt parport processor button thermal_sys joydev evdev pcspkr i2c_core loop autofs4 hid_generic usbhid hid btrfs xor zlib_deflate raid6_pq crc32c libcrc32c sg sr_mod cdrom ata_generic virtio_blk virtio_net floppy ata_piix uhci_hcd ehci_hcd libata usbcore scsi_mod usb_common virtio_pci virtio_ring virtio
| CPU: 0 PID: 1034 Comm: trinity-child0 Not tainted 3.10.0-rc4-next-20130607 #1
| Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
| task: ffff880072e477f0 ti: ffff88007a31e000 task.ti: ffff88007a31e000
| RIP: 0010:[<ffffffff8131ff8c>]  [<ffffffff8131ff8c>] ip6_push_pending_frames+0x28a/0x428
| RSP: 0018:ffff88007a31fa40  EFLAGS: 00010206
| RAX: 000000000e180000 RBX: ffff88002ec6a880 RCX: ffff880061604c18
| RDX: 0f02000affff0000 RSI: 0000000000000028 RDI: ffff88007a374c80
| RBP: ffff88007a31fac0 R08: 0000000013fc42a0 R09: ffff880061604c48
| R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007b599180
| R13: ffffffff81676340 R14: 0000000000000000 R15: ffff880061604cc8
| FS:  00007f68082a3700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
| CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
| CR2: 000000000e180200 CR3: 000000007a30e000 CR4: 00000000000006f0
| DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
| DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
| Stack:
|  ffff88007a374c00 ffff88007b599290 ffff88007b5994e0 ffff88007a374c00
|  1100000000000000 ffff88007a31fa70 0000000000000000 0f02000affff0000
|  0000000000000000 000000000a49eee4 0000000000000000 ffff88007b599180
| Call Trace:
|  [<ffffffff81332a98>] ? udp_v6_push_pending_frames+0x25d/0x2d5
|  [<ffffffff813338f2>] ? udpv6_sendmsg+0x6db/0x8a0
|  [<ffffffff810b75c9>] ? get_page_from_freelist+0x5df/0x69f
|  [<ffffffff8129cc4e>] ? sock_sendmsg+0x54/0x70
|  [<ffffffff810f1048>] ? fatal_signal_pending+0x9/0x23
|  [<ffffffff812a637d>] ? verify_iovec+0x53/0xa0
|  [<ffffffff8129ce9f>] ? ___sys_sendmsg+0x1fe/0x28e
|  [<ffffffff810d0776>] ? handle_mm_fault+0x1ae/0x20b
|  [<ffffffff81064b23>] ? timekeeping_get_ns.constprop.10+0xd/0x31
|  [<ffffffff811b571d>] ? timerqueue_add+0x75/0x8f
|  [<ffffffff8104b6b9>] ? lock_hrtimer_base.isra.14+0x1b/0x3c
|  [<ffffffff8129db2f>] ? __sys_sendmsg+0x39/0x57
|  [<ffffffff813719d2>] ? system_call_fastpath+0x16/0x1b
| Code: 48 8b 44 24 18 48 85 c0 74 0c 48 8d b8 80 00 00 00 e8 e0 e2 ff ff 48 8b 44 24 18 48 89 43 58 48 8b 80 48 01 00 00 48 85 c0 74 14 <48> 8b 80 00 02 00 00 65 48 ff 40 28 8b 53 68 65 48 01 50 30 49
| RIP  [<ffffffff8131ff8c>] ip6_push_pending_frames+0x28a/0x428
|  RSP <ffff88007a31fa40>
| CR2: 000000000e180200
| ---[ end trace 9177219b59c3a20e ]---

I think this is different :) I will see if I can trigger the other issue.

Sebastian

* Re: [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data"
  2013-06-16 19:07     ` Sebastian Andrzej Siewior
@ 2013-06-16 20:10       ` Sebastian Andrzej Siewior
  2013-06-16 20:37         ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-16 20:10 UTC
  To: Eric Dumazet
  Cc: David Miller, Herbert Xu, netdev, Hideaki YOSHIFUJI,
	Neal Cardwell

On Sun, Jun 16, 2013 at 09:07:21PM +0200, Sebastian Andrzej Siewior wrote:
> On Sun, Jun 16, 2013 at 02:12:33AM -0700, Eric Dumazet wrote:
> > So far, I am not sure we solved the problem.
> > Could you try latest net-next tree ?
> 
> Yep. I pretty soon ran into
> 
> | BUG: unable to handle kernel paging request at 000000000e180200
> | IP: [<ffffffff8131ff8c>] ip6_push_pending_frames+0x28a/0x428

This is

|        IP6_UPD_PO_STATS(net, rt->rt6i_idev, IPSTATS_MIB_OUT, skb->len);

|31ff80:       48 8b 80 48 01 00 00    mov    0x148(%rax),%rax
|31ff87:       48 85 c0                test   %rax,%rax
|31ff8a:       74 14                   je     ffffffff8131ffa0 <ip6_push_pending_frames+0x29e>
|31ff8c:       48 8b 80 00 02 00 00    mov    0x200(%rax),%rax
^^^^^
|31ff93:       65 48 ff 40 28          incq   %gs:0x28(%rax)

Stupid me, it looks familiar.

While writing this email I also captured

| BUG: unable to handle kernel NULL pointer dereference at 0000000000000031
| IP: [<ffffffff813339aa>] udpv6_sendmsg+0x793/0x8a0
| task: ffff88007b7bc0c0 ti: ffff88007a2d4000 task.ti: ffff88007a2d4000
| RIP: 0010:[<ffffffff813339aa>]  [<ffffffff813339aa>] udpv6_sendmsg+0x793/0x8a0
| RSP: 0018:ffff88007a2d5b18  EFLAGS: 00010206
| RAX: 0000000000000005 RBX: ffff88007a1a1200 RCX: ffff88007a1a1560
| RDX: ffff88007a1a1580 RSI: ffff88007ae39f00 RDI: ffff88007ae39f00
| RBP: ffff88007a2d5c40 R08: ffff8800fa101be0 R09: ffff88002e8ec010
| R10: 0000003600000000 R11: 0000000000000001 R12: ffff88007a1a1560
| R13: 0000000000000000 R14: ffff88007ae39f00 R15: ffff88007a1a1560
| Call Trace:
|  [<ffffffff810b75c9>] ? get_page_from_freelist+0x5df/0x69f
|  [<ffffffff8129cc4e>] ? sock_sendmsg+0x54/0x70
|  [<ffffffff8136ceb2>] ? page_fault+0x22/0x30
|  [<ffffffff810f1048>] ? fatal_signal_pending+0x9/0x23
|  [<ffffffff812a637d>] ? verify_iovec+0x53/0xa0
|  [<ffffffff8129ce9f>] ? ___sys_sendmsg+0x1fe/0x28e
|  [<ffffffff810baf58>] ? __lru_cache_add+0x1a/0x39
|  [<ffffffff810cf82f>] ? handle_pte_fault+0x75a/0x79a
|  [<ffffffff810d0776>] ? handle_mm_fault+0x1ae/0x20b
|  [<ffffffff81064b23>] ? timekeeping_get_ns.constprop.10+0xd/0x31
|  [<ffffffff811b571d>] ? timerqueue_add+0x75/0x8f
|  [<ffffffff8104bdae>] ? __hrtimer_start_range_ns+0x263/0x297
|  [<ffffffff8104b6b9>] ? lock_hrtimer_base.isra.14+0x1b/0x3c
|  [<ffffffff8129db2f>] ? __sys_sendmsg+0x39/0x57
|  [<ffffffff813719d2>] ? system_call_fastpath+0x16/0x1b
| Code: df 4c 8b bb 90 02 00 00 e8 ba aa f6 ff 48 8b 54 24 48 48 8b 4c 24 40 49 89 57 48 49 89 4f 50 49 8b 86 a0 00 00 00 48 85  c0 74 05 <8b> 40 2c eb 02 31 c0 41 89 47 74 66 83 83 00 01 00 00 01 eb 08

This is from the last part of __ip6_dst_store():
| np->dst_cookie = rt->rt6i_node ? rt->rt6i_node->fn_sernum : 0;

|3399e:       49 8b 86 a0 00 00 00    mov    0xa0(%r14),%rax
|339a5:       48 85 c0                test   %rax,%rax
|339a8:       74 05                   je     ffffffff813339af <udpv6_sendmsg+0x798>
|339aa:       8b 40 2c                mov    0x2c(%rax),%eax
^^^^^
|339ad:       eb 02                   jmp    ffffffff813339b1 <udpv6_sendmsg+0x79a>
|339af:       31 c0                   xor    %eax,%eax

rt->rt6i_node seems to be five.

Sebastian
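
A small sketch of why the NULL test in that expression does not help. The
struct layout here is hypothetical and padded only so that fn_sernum lands
at offset 0x2c, the displacement seen in the disassembly; it is not the real
struct fib6_node. A bogus non-NULL value passes the guard, and the load then
touches the value plus the field offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: padding chosen so fn_sernum sits at offset 0x2c. */
struct fake_node {
	unsigned char pad[0x2c];
	uint32_t fn_sernum;
};

/* The guard used by "ptr ? ptr->field : 0": only NULL is filtered out. */
int null_guard_passes(uint64_t ptr)
{
	return ptr != 0;
}

/* Address a load of ->fn_sernum would touch for a given pointer value. */
uint64_t fault_address(uint64_t ptr)
{
	return ptr + offsetof(struct fake_node, fn_sernum);
}
```

For the value 5 held in RAX, null_guard_passes(5) is true and
fault_address(5) is 0x31, the address in the "NULL pointer dereference"
line of the oops.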

* Re: [RFC/BUG] ipv6: bug in "ipv6: Copy cork options in ip6_append_data"
  2013-06-16 20:10       ` Sebastian Andrzej Siewior
@ 2013-06-16 20:37         ` Eric Dumazet
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2013-06-16 20:37 UTC
  To: Sebastian Andrzej Siewior
  Cc: David Miller, Herbert Xu, netdev, Hideaki YOSHIFUJI,
	Neal Cardwell

On Sun, 2013-06-16 at 22:10 +0200, Sebastian Andrzej Siewior wrote:
> On Sun, Jun 16, 2013 at 09:07:21PM +0200, Sebastian Andrzej Siewior wrote:
> > On Sun, Jun 16, 2013 at 02:12:33AM -0700, Eric Dumazet wrote:
> > > So far, I am not sure we solved the problem.
> > > Could you try latest net-next tree ?
> > 
> > Yep. I pretty soon ran into
> > 
> > | BUG: unable to handle kernel paging request at 000000000e180200
> > | IP: [<ffffffff8131ff8c>] ip6_push_pending_frames+0x28a/0x428
> 
> This is
> 
> |        IP6_UPD_PO_STATS(net, rt->rt6i_idev, IPSTATS_MIB_OUT, skb->len);
> 
> |31ff80:       48 8b 80 48 01 00 00    mov    0x148(%rax),%rax
> |31ff87:       48 85 c0                test   %rax,%rax
> |31ff8a:       74 14                   je     ffffffff8131ffa0 <ip6_push_pending_frames+0x29e>
> |31ff8c:       48 8b 80 00 02 00 00    mov    0x200(%rax),%rax
> ^^^^^
> |31ff93:       65 48 ff 40 28          incq   %gs:0x28(%rax)
> 
> Stupid me, it looks familiar.
> 
> While writing this email I also captured
> 
> | BUG: unable to handle kernel NULL pointer dereference at 0000000000000031
> | IP: [<ffffffff813339aa>] udpv6_sendmsg+0x793/0x8a0
> | task: ffff88007b7bc0c0 ti: ffff88007a2d4000 task.ti: ffff88007a2d4000
> | RIP: 0010:[<ffffffff813339aa>]  [<ffffffff813339aa>] udpv6_sendmsg+0x793/0x8a0
> | RSP: 0018:ffff88007a2d5b18  EFLAGS: 00010206
> | RAX: 0000000000000005 RBX: ffff88007a1a1200 RCX: ffff88007a1a1560
> | RDX: ffff88007a1a1580 RSI: ffff88007ae39f00 RDI: ffff88007ae39f00
> | RBP: ffff88007a2d5c40 R08: ffff8800fa101be0 R09: ffff88002e8ec010
> | R10: 0000003600000000 R11: 0000000000000001 R12: ffff88007a1a1560
> | R13: 0000000000000000 R14: ffff88007ae39f00 R15: ffff88007a1a1560
> | Call Trace:
> |  [<ffffffff810b75c9>] ? get_page_from_freelist+0x5df/0x69f
> |  [<ffffffff8129cc4e>] ? sock_sendmsg+0x54/0x70
> |  [<ffffffff8136ceb2>] ? page_fault+0x22/0x30
> |  [<ffffffff810f1048>] ? fatal_signal_pending+0x9/0x23
> |  [<ffffffff812a637d>] ? verify_iovec+0x53/0xa0
> |  [<ffffffff8129ce9f>] ? ___sys_sendmsg+0x1fe/0x28e
> |  [<ffffffff810baf58>] ? __lru_cache_add+0x1a/0x39
> |  [<ffffffff810cf82f>] ? handle_pte_fault+0x75a/0x79a
> |  [<ffffffff810d0776>] ? handle_mm_fault+0x1ae/0x20b
> |  [<ffffffff81064b23>] ? timekeeping_get_ns.constprop.10+0xd/0x31
> |  [<ffffffff811b571d>] ? timerqueue_add+0x75/0x8f
> |  [<ffffffff8104bdae>] ? __hrtimer_start_range_ns+0x263/0x297
> |  [<ffffffff8104b6b9>] ? lock_hrtimer_base.isra.14+0x1b/0x3c
> |  [<ffffffff8129db2f>] ? __sys_sendmsg+0x39/0x57
> |  [<ffffffff813719d2>] ? system_call_fastpath+0x16/0x1b
> | Code: df 4c 8b bb 90 02 00 00 e8 ba aa f6 ff 48 8b 54 24 48 48 8b 4c 24 40 49 89 57 48 49 89 4f 50 49 8b 86 a0 00 00 00 48 85  c0 74 05 <8b> 40 2c eb 02 31 c0 41 89 47 74 66 83 83 00 01 00 00 01 eb 08
> 
> This is from the last part of __ip6_dst_store():
> | np->dst_cookie = rt->rt6i_node ? rt->rt6i_node->fn_sernum : 0;
> 
> |3399e:       49 8b 86 a0 00 00 00    mov    0xa0(%r14),%rax
> |339a5:       48 85 c0                test   %rax,%rax
> |339a8:       74 05                   je     ffffffff813339af <udpv6_sendmsg+0x798>
> |339aa:       8b 40 2c                mov    0x2c(%rax),%eax
> ^^^^^
> |339ad:       eb 02                   jmp    ffffffff813339b1 <udpv6_sendmsg+0x79a>
> |339af:       31 c0                   xor    %eax,%eax
> 
> rt->rt6i_node seems to be five.

Yes, that's really the same root cause.
