netdev.vger.kernel.org archive mirror
* [RFC ] netlink: limit large vmalloc() based skbs to NETLINK_NETFILTER
@ 2013-07-02 21:50 Sebastian Andrzej Siewior
  2013-07-02 22:07 ` Eric Dumazet
  2013-07-02 23:11 ` Pablo Neira Ayuso
  0 siblings, 2 replies; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-07-02 21:50 UTC
  To: Pablo Neira Ayuso; +Cc: Patrick McHardy, netdev, Dave Jones

Since commit c05cdb1b ("netlink: allow large data transfers from
user-space") the large skbs are allocated via vmalloc(). Trinity
triggered this in response:

| BUG: unable to handle kernel paging request at ffffc900001bf001
| IP: [<ffffffff8135270a>] skb_clone+0x1a/0xa0
| Call Trace:
|  [<ffffffff813cb107>] nl_fib_input+0x37/0x230
|  [<ffffffff8142c9b2>] ? _raw_read_unlock+0x22/0x40
|  [<ffffffff81380b1a>] netlink_unicast+0x13a/0x1f0
|  [<ffffffff81380ef7>] netlink_sendmsg+0x327/0x420

The problem is that the vmalloc() based skb ends exactly at size (where
->end is pointing), and skb_shinfo() starts past ->end, where we have our
guard page; hence we BUG().
The question is whether we should fix this or forbid the skb_clone().
Fixing this behaviour is tricky: even after we add space for struct
skb_shared_info, we release the memory from the destructor, so once the
first skb is gone, the memory in the clone is invalid.
The other case where skb_clone() is used is when we have multiple
destinations.
Since I assume the initial target was to extend the size for
NETLINK_NETFILTER, this patch limits it to this target only, and to a
single destination.
Is this okay?

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 net/netlink/af_netlink.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 68c1673..9926453 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2129,7 +2129,11 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	if (len > sk->sk_sndbuf - 32)
 		goto out;
 	err = -ENOBUFS;
-	skb = netlink_alloc_large_skb(len);
+	if (netlink_is_kernel(sk) && sk->sk_protocol == NETLINK_NETFILTER &&
+			!dst_group)
+		skb = netlink_alloc_large_skb(len);
+	else
+		skb = alloc_skb(len, GFP_KERNEL);
 	if (skb == NULL)
 		goto out;
 
-- 
1.8.3.1
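
The layout problem described in the report can be modelled in user space.
The following is only a sketch under stated assumptions, not kernel code:
struct model_skb, struct model_shinfo and the model_* functions are
hypothetical stand-ins, and the only property carried over from the report
is that skb_shinfo() lives directly behind ->end. It compares a buffer
that reserves room for the shared info with one that stops exactly at
->end, as described for the vmalloc() based skb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct skb_shared_info; only its size matters. */
struct model_shinfo {
	unsigned char nr_frags;
	unsigned int dataref;
};

/* Hypothetical stand-in for the few sk_buff properties involved. */
struct model_skb {
	size_t alloc_size;	/* bytes actually backing the buffer */
	size_t end;		/* offset of ->end from ->head */
};

/* Regular layout: payload plus room for the shared info behind ->end. */
static struct model_skb model_alloc_skb(size_t len)
{
	struct model_skb skb = {
		.alloc_size = len + sizeof(struct model_shinfo),
		.end = len,
	};
	return skb;
}

/* Layout from the report: the buffer stops exactly at ->end. */
static struct model_skb model_alloc_large_skb(size_t len)
{
	struct model_skb skb = { .alloc_size = len, .end = len };
	return skb;
}

/* True if the shared info a clone would touch lies inside the buffer. */
static bool model_shinfo_in_bounds(const struct model_skb *skb)
{
	return skb->end + sizeof(struct model_shinfo) <= skb->alloc_size;
}

/* Self-check of both layouts; the sizes are arbitrary example values. */
static void model_self_check(void)
{
	struct model_skb small = model_alloc_skb(4096);
	struct model_skb large = model_alloc_large_skb(65536);

	assert(model_shinfo_in_bounds(&small));
	assert(!model_shinfo_in_bounds(&large));
}
```

Calling model_self_check() from a main() exercises both cases. In the
second case the shared info would start at the first byte past the
allocation, which in the report is where the guard page sits. The model
does not cover the lifetime problem (the destructor freeing memory that a
clone still references).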


* Re: [RFC ] netlink: limit large vmalloc() based skbs to NETLINK_NETFILTER
  2013-07-02 21:50 [RFC ] netlink: limit large vmalloc() based skbs to NETLINK_NETFILTER Sebastian Andrzej Siewior
@ 2013-07-02 22:07 ` Eric Dumazet
  2013-07-03  6:59   ` Sebastian Andrzej Siewior
  2013-07-02 23:11 ` Pablo Neira Ayuso
  1 sibling, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2013-07-02 22:07 UTC
  To: Sebastian Andrzej Siewior
  Cc: Pablo Neira Ayuso, Patrick McHardy, netdev, Dave Jones

On Tue, 2013-07-02 at 23:50 +0200, Sebastian Andrzej Siewior wrote:
> Since commit c05cdb1b ("netlink: allow large data transfers from
> user-space") the large skbs are allocated via vmalloc(). Trinity
> triggered this in response:
> 
> | BUG: unable to handle kernel paging request at ffffc900001bf001
> | IP: [<ffffffff8135270a>] skb_clone+0x1a/0xa0
> | Call Trace:
> |  [<ffffffff813cb107>] nl_fib_input+0x37/0x230
> |  [<ffffffff8142c9b2>] ? _raw_read_unlock+0x22/0x40
> |  [<ffffffff81380b1a>] netlink_unicast+0x13a/0x1f0
> |  [<ffffffff81380ef7>] netlink_sendmsg+0x327/0x420
> 
> The problem is that the vmalloc() based skb ends exactly at size (where
> ->end is pointing), and skb_shinfo() starts past ->end, where we have our
> guard page; hence we BUG().
> The question is whether we should fix this or forbid the skb_clone().
> Fixing this behaviour is tricky: even after we add space for struct
> skb_shared_info, we release the memory from the destructor, so once the
> first skb is gone, the memory in the clone is invalid.
> The other case where skb_clone() is used is when we have multiple
> destinations.
> Since I assume the initial target was to extend the size for
> NETLINK_NETFILTER, this patch limits it to this target only, and to a
> single destination.
> Is this okay?
> 
> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
>  net/netlink/af_netlink.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 68c1673..9926453 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -2129,7 +2129,11 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
>  	if (len > sk->sk_sndbuf - 32)
>  		goto out;
>  	err = -ENOBUFS;
> -	skb = netlink_alloc_large_skb(len);
> +	if (netlink_is_kernel(sk) && sk->sk_protocol == NETLINK_NETFILTER &&
> +			!dst_group)
> +		skb = netlink_alloc_large_skb(len);
> +	else
> +		skb = alloc_skb(len, GFP_KERNEL);
>  	if (skb == NULL)
>  		goto out;
>  


I believe you came too late; this was hopefully fixed here after some
discussion last week:

http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=3a36515f729458c8efa0c124c7262d5843ad5c37


* Re: [RFC ] netlink: limit large vmalloc() based skbs to NETLINK_NETFILTER
  2013-07-02 21:50 [RFC ] netlink: limit large vmalloc() based skbs to NETLINK_NETFILTER Sebastian Andrzej Siewior
  2013-07-02 22:07 ` Eric Dumazet
@ 2013-07-02 23:11 ` Pablo Neira Ayuso
  1 sibling, 0 replies; 4+ messages in thread
From: Pablo Neira Ayuso @ 2013-07-02 23:11 UTC
  To: Sebastian Andrzej Siewior; +Cc: Patrick McHardy, netdev, Dave Jones

On Tue, Jul 02, 2013 at 11:50:15PM +0200, Sebastian Andrzej Siewior wrote:
> Since commit c05cdb1b ("netlink: allow large data transfers from
> user-space") the large skbs are allocated via vmalloc(). Trinity
> triggered this in response:
> 
> | BUG: unable to handle kernel paging request at ffffc900001bf001
> | IP: [<ffffffff8135270a>] skb_clone+0x1a/0xa0
> | Call Trace:
> |  [<ffffffff813cb107>] nl_fib_input+0x37/0x230
> |  [<ffffffff8142c9b2>] ? _raw_read_unlock+0x22/0x40
> |  [<ffffffff81380b1a>] netlink_unicast+0x13a/0x1f0
> |  [<ffffffff81380ef7>] netlink_sendmsg+0x327/0x420
> 
> The problem is that the vmalloc() based skb ends exactly at size (where
> ->end is pointing), and skb_shinfo() starts past ->end, where we have our
> guard page; hence we BUG().
> The question is whether we should fix this or forbid the skb_clone().
> Fixing this behaviour is tricky: even after we add space for struct
> skb_shared_info, we release the memory from the destructor, so once the
> first skb is gone, the memory in the clone is invalid.
> The other case where skb_clone() is used is when we have multiple
> destinations.
> Since I assume the initial target was to extend the size for
> NETLINK_NETFILTER, this patch limits it to this target only, and to a
> single destination.
> Is this okay?

Did you notice this patch?

3a36515 netlink: fix splat in skb_clone with large messages

Regards.


* Re: [RFC ] netlink: limit large vmalloc() based skbs to NETLINK_NETFILTER
  2013-07-02 22:07 ` Eric Dumazet
@ 2013-07-03  6:59   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-07-03  6:59 UTC
  To: Eric Dumazet; +Cc: Pablo Neira Ayuso, Patrick McHardy, netdev, Dave Jones

On Tue, Jul 02, 2013 at 03:07:14PM -0700, Eric Dumazet wrote:
> I believe you came too late; this was hopefully fixed here after some
> discussion last week:

Oh, not again. Okay, so this is gone.

Sebastian
