Netdev List
* Re: [net-next] net/phy: extra delay only for RGMII interfaces for IC+ IP 1001
From: David Miller @ 2011-10-19  3:50 UTC (permalink / raw)
  To: peppe.cavallaro; +Cc: netdev
In-Reply-To: <1318318676-4493-1-git-send-email-peppe.cavallaro@st.com>

From: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
Date: Tue, 11 Oct 2011 09:37:56 +0200

> The extra delay of 2ns to adjust RX clock phase is actually needed
> in RGMII mode. Tested on the HDK7108 (STx7108c2).
> 
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH] route:ip_rt_frag_needed always return unzero
From: Eric Dumazet @ 2011-10-19  3:49 UTC (permalink / raw)
  To: Gao feng; +Cc: davem, kuznet, jmorris, netdev
In-Reply-To: <4E9E2929.7070701@cn.fujitsu.com>

On Wednesday, 19 October 2011 at 09:34 +0800, Gao feng wrote:
> 2011.10.18 17:23, Eric Dumazet wrote:
> > On Tuesday, 18 October 2011 at 15:04 +0800, Gao feng wrote:
> >> In function ip_rt_frag_needed, if peer is null,
> >> there is no need to call ipprot->err_handler.
> >> Am I right?
> >>
> >> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> >> ---
> >>  net/ipv4/route.c |    2 +-
> >>  1 files changed, 1 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> >> index 075212e..6cde0fa 100644
> >> --- a/net/ipv4/route.c
> >> +++ b/net/ipv4/route.c
> >> @@ -1574,7 +1574,7 @@ unsigned short ip_rt_frag_needed(struct net *net, const struct iphdr *iph,
> >>  
> >>  		atomic_inc(&__rt_peer_genid);
> >>  	}
> >> -	return est_mtu ? : new_mtu;
> >> +	return est_mtu;
> >>  }
> >>  
> >>  static void check_peer_pmtu(struct dst_entry *dst, struct inet_peer *peer)
> > 
> > No idea why you want this, your changelog is a bit cryptic :)
> > 
> > Wont this bypass the raw_icmp_error(skb, protocol, info);
> > call in icmp_unreach() as well ?
> > 
> > 
> 
> thanks Eric!
> 
> I mean that, as far as I know, the pmtu is updated through inet_peer->pmtu_learned.
> So in function ip_rt_frag_needed,
> if inet_peer is null or something else makes setting inet_peer->pmtu_learned fail,
> there is no need to call function tcp_v4_err.
> 
> the call stack is
> icmp_unreach
>   |
>   |--->ip_rt_frag_needed(fill inet_peer)
>   |
>   |--->raw_icmp_error()
>   |
>   |--->ipprot->err_handler(tcp_v4_err or something else)
> 	|
> 	|--->tcp_v4_err(frag need icmp is triggered by tcp packet)
> 		|
> 		|--->do_pmtu_discovery
> 		(in this function both __sk_dst_check or dst->ops->update_pmtu
> 		need struct inet_peer to update pmtu)
> 
> So I think that when setting inet_peer->pmtu_learned fails,
> icmp_unreach should goto out immediately.
> 
> And it confuses me why ping_err and udp_err do not update the pmtu.
> What am I missing?

You did not answer my question: after your patch, we no longer call
raw_icmp_error(). Why is that valid?

Failing to find or create an inet_peer is very unlikely: it occurs only
under high stress and out-of-memory conditions. Is it really happening on
your machines?
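
To make the question above concrete, here is a minimal user-space sketch
of the return-value change under discussion. It is an illustration, not
kernel code: frag_needed_before(), frag_needed_after() and handlers_run()
are hypothetical names, and handlers_run() assumes, as this thread
implies, that icmp_unreach() skips raw_icmp_error() and the protocol
err_handler when ip_rt_frag_needed() returns zero.

```c
/* Hypothetical stand-in for the tail of ip_rt_frag_needed().
 * est_mtu stays 0 when no inet_peer entry could be updated. */
static unsigned short frag_needed_before(unsigned short est_mtu,
					 unsigned short new_mtu)
{
	/* "est_mtu ? : new_mtu" in the diff is the GNU shorthand for this. */
	return est_mtu ? est_mtu : new_mtu;
}

/* Same tail with the proposed patch applied: new_mtu is no longer a fallback. */
static unsigned short frag_needed_after(unsigned short est_mtu,
					unsigned short new_mtu)
{
	(void)new_mtu;
	return est_mtu;
}

/* Assumption: the caller bails out before raw_icmp_error() and the
 * protocol error handler whenever the returned value is zero. */
static int handlers_run(unsigned short ret)
{
	return ret != 0;
}
```

With est_mtu == 0 and new_mtu == 1400, the old expression returns 1400
and the handlers run; the patched one returns 0 and they are skipped,
which is the raw_icmp_error() bypass being asked about.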

^ permalink raw reply

* Re: [PATCH net-next] net: add skb frag size accessors
From: David Miller @ 2011-10-19  3:49 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1318279267.2567.19.camel@edumazet-laptop>


It seems that enough has changed that this patch no longer applies,
I'm sorry for taking so long to get to it as that is part of the
reason this situation was created.

I'd really appreciate it if you'd respin this patch, thanks!

^ permalink raw reply

* Re: [net-next PATCH] net: allow vlan traffic to be received under bond
From: David Miller @ 2011-10-19  3:47 UTC (permalink / raw)
  To: jesse; +Cc: john.r.fastabend, hans.schillstrom, jpirko, mbizon, netdev, fubar
In-Reply-To: <CAEP_g=9dk_ERdnw4Hw_8RO8Z23E1g2Q9G=AxPkiaVhwbHvo47A@mail.gmail.com>

From: Jesse Gross <jesse@nicira.com>
Date: Thu, 13 Oct 2011 17:22:02 -0700

> Actually, for most of 2.6.x the behavior was somewhat
> non-deterministic since it depended on kernel version and the NIC.  As
> a result, I think we can safely say that this wasn't a particularly
> firm interface that we have to be wedded to.  Based on overwhelming
> feedback, I think the interface in this patch is the preferred one and
> what we should stabilize on.

Agreed, and I've applied this patch to net-next, thanks everyone!

^ permalink raw reply

* Re: [net-next v2] cs89x0: Move the driver into the Cirrus dir
From: David Miller @ 2011-10-19  3:42 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, sassmann, nelson, akpm
In-Reply-To: <1318061264-25310-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Sat,  8 Oct 2011 01:07:44 -0700

> The cs89x0 driver was initially placed in apple/ when it
> should have been placed in cirrus/.  This resolves the
> issue by moving the driver and fixing up the respective
> Kconfig(s) and Makefile(s).
> 
> Thanks to Sascha for reporting the issue.
> 
> -v2 Fix a config error that was introduced with v1 by removing
>     the dependency on MACE for NET_VENDOR_APPLE.
> 
> CC: Russell Nelson <nelson@crynwr.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
> Reported-by: Sascha Hauer <s.hauer@pengutronix.de>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] bluetooth: Properly clone LSM attributes to newly created child connections
From: David Miller @ 2011-10-19  3:36 UTC (permalink / raw)
  To: pmoore; +Cc: netdev, linux-security-module, selinux
In-Reply-To: <20111007194059.12345.13398.stgit@sifl>

From: Paul Moore <pmoore@redhat.com>
Date: Fri, 07 Oct 2011 15:40:59 -0400

> The Bluetooth stack has internal connection handlers for all of the various
> Bluetooth protocols, and unfortunately, they are currently lacking the LSM
> hooks found in the core network stack's connection handlers.  I say
> unfortunately, because this can cause problems for users who have an
> LSM enabled and are using certain Bluetooth devices.  See one problem
> report below:
> 
>  * http://bugzilla.redhat.com/show_bug.cgi?id=741703
> 
> In order to keep things simple at this point in time, this patch fixes the
> problem by cloning the parent socket's LSM attributes to the newly created
> child socket.  If we decide we need a more elaborate LSM marking mechanism
> for Bluetooth (I somewhat doubt this) we can always revisit this decision
> in the future.
> 
> Reported-by: James M. Cape <jcape@ignore-your.tv>
> Signed-off-by: Paul Moore <pmoore@redhat.com>

Applied, thanks!

^ permalink raw reply

* Re: [PATCH net-next] l2tp: give proper headroom in pppol2tp_xmit()
From: David Miller @ 2011-10-19  3:34 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1318002357.3207.28.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 07 Oct 2011 17:45:57 +0200

> pppol2tp_xmit() calls skb_cow_head(skb, 2) before calling
> l2tp_xmit_skb()
> 
> Then l2tp_xmit_skb() calls again skb_cow_head(skb, large_headroom)
> 
> This patch changes the first skb_cow_head() call to supply the needed
> headroom to make sure at most one (expensive) pskb_expand_head() is
> done.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> Maybe we should add a counter to help diagnose too many
> pskb_expand_head() calls...

Applied.

I think it's the kind of event that deserves a tracepoint, this way one
could use perf to notice and diagnose such problems.
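
For readers following along, the effect of the patch can be sketched
with a toy model. This is a simplified illustration under stated
assumptions, not the kernel implementation: struct toy_skb,
toy_cow_head() and the two xmit helpers are hypothetical, and the model
ignores cloned buffers and the headroom rounding the real
skb_cow_head() performs.

```c
#include <stddef.h>

/* Toy packet buffer: only the headroom and a reallocation counter. */
struct toy_skb {
	size_t headroom;      /* free bytes in front of the data */
	unsigned expansions;  /* how many times the head was reallocated */
};

/* Models the contract only: guarantee at least `need` bytes of headroom,
 * paying for a reallocation when the current headroom is too small. */
static void toy_cow_head(struct toy_skb *skb, size_t need)
{
	if (skb->headroom >= need)
		return;
	skb->headroom = need;
	skb->expansions++;
}

/* Before: the caller reserves only its own header, the callee asks again. */
static unsigned xmit_two_step(size_t initial, size_t ppp_hdr, size_t tunnel_hdr)
{
	struct toy_skb skb = { initial, 0 };

	toy_cow_head(&skb, ppp_hdr);
	skb.headroom -= ppp_hdr;        /* push the PPP header */
	toy_cow_head(&skb, tunnel_hdr); /* may reallocate a second time */
	return skb.expansions;
}

/* After: the caller reserves room for both headers up front. */
static unsigned xmit_one_step(size_t initial, size_t ppp_hdr, size_t tunnel_hdr)
{
	struct toy_skb skb = { initial, 0 };

	toy_cow_head(&skb, ppp_hdr + tunnel_hdr);
	skb.headroom -= ppp_hdr;        /* push the PPP header */
	toy_cow_head(&skb, tunnel_hdr); /* already satisfied, no reallocation */
	return skb.expansions;
}
```

In this model a buffer with no headroom is reallocated twice on the old
path and once on the new one; a buffer that already has enough headroom
is never reallocated on either path.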

^ permalink raw reply

* Re: [PATCH] l2tp: fix a potential skb leak in l2tp_xmit_skb()
From: David Miller @ 2011-10-19  3:32 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1318001746.3207.21.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 07 Oct 2011 17:35:46 +0200

> l2tp_xmit_skb() can leak one skb if skb_cow_head() returns an error.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied and queued up for -stable, thanks Eric.

^ permalink raw reply

* Re: [PATCH] pch_gbe: compilation warning in pch_gbe_setup_rctl() fixed
From: David Miller @ 2011-10-19  3:28 UTC (permalink / raw)
  To: vvs; +Cc: netdev, toshiharu-linux, vvs
In-Reply-To: <1317974359-20548-1-git-send-email-vvs@parallels.com>

From: Vasily Averin <vvs@parallels.com>
Date: Fri, 7 Oct 2011 11:59:19 +0400

> From: Vasily Averin <vvs@sw.ru>
> 
> compilation warning fixed
> drivers/net/pch_gbe/pch_gbe_main.c: In function ‘pch_gbe_setup_rctl’:
> drivers/net/pch_gbe/pch_gbe_main.c:701:21: warning: unused variable ‘netdev’
> 
> Signed-off-by: Vasily Averin <vvs@sw.ru>

This patch is not appropriate for the 'net' tree, and in the
'net-next' tree not only is the driver in a completely different
directory but also this warning is already fixed.

^ permalink raw reply

* Re: [PATCH] bridge: fix hang on removal of bridge via netlink
From: David Miller @ 2011-10-19  3:24 UTC (permalink / raw)
  To: sri; +Cc: shemminger, netdev
In-Reply-To: <1317942169.6433.17.camel@w-sridhar.beaverton.ibm.com>

From: Sridhar Samudrala <sri@us.ibm.com>
Date: Thu, 06 Oct 2011 16:02:49 -0700

> On Thu, 2011-10-06 at 14:19 -0700, Stephen Hemminger wrote:
>> Need to clean up bridge device timers and ports when a bridge
>> device is being removed via netlink.
>> 
>> This fixes the problem observed when doing:
>>  ip link add br0 type bridge
>>  ip link set dev eth1 master br0
>>  ip link set br0 up
>>  ip link del br0
>> 
>> which would cause br0 to hang in unregister_netdev because
>> of leftover reference count.
>> 
>> Reported-by: Sridhar Samudrala <sri@us.ibm.com>
>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> Acked-by: Sridhar Samudrala <sri@us.ibm.com>

Applied to 'net' and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH net-next] macvlan: handle fragmented multicast frames
From: David Miller @ 2011-10-19  3:22 UTC (permalink / raw)
  To: eric.dumazet; +Cc: greearb, netdev
In-Reply-To: <1317932911.3457.31.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 06 Oct 2011 22:28:31 +0200

> [PATCH net-next] macvlan: handle fragmented multicast frames
> 
> Fragmented multicast frames are delivered to a single macvlan port,
> because ip defrag logic considers other samples are redundant.
> 
> Implement a defrag step before trying to send the multicast frame.
> 
> Reported-by: Ben Greear <greearb@candelatech.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH net] vlan:make mtu of vlan equal to physical dev
From: Ben Greear @ 2011-10-19  2:51 UTC (permalink / raw)
  To: WeipingPan; +Cc: herbert, open list:NETWORKING [GENERAL]
In-Reply-To: <4E9E30B8.5030904@gmail.com>

On 10/18/2011 07:06 PM, WeipingPan wrote:
> Hi, Herbert,
>
> What do you think of this patch ?
>
> thanks
> Weiping Pan
> On 10/08/2011 06:12 PM, Weiping Pan wrote:
>> The default mtu of a vlan device is the same as the mtu of the physical device,
>> for example 1500, but when the physical mtu is changed to 1600,
>> the VLAN device's mtu is still 1500.
>> Certainly, you can change the vlan device's mtu to 1600 manually,
>> but I think that when you change the physical device's mtu, the VLAN's mtu
>> should be changed automatically instead of manually.

I don't like the idea.  It's perfectly valid to have the physical dev with 9000 MTU
and have vlans with 1500 MTU.

>> Steps to Reproduce:
>> 1.vconfig add eth4 3
>> 2.ifconfig eth4 mtu 1600
>> 3.check mtu on eth4.3
>>
>> And what's worse is that if you decrease the mtu of the physical device
>> and later increase it again, the mtu of the vlan device is not changed.

That *might* be worth fixing, but even so, some NICs might handle that
just fine, so my opinion is that this should not change either.

Thanks,
Ben

>>
>> Steps to Reproduce:
>> 1.vconfig add eth4 3
>> 2.ifconfig eth4 mtu 100
>> 3.ifconfig eth4 mtu 1500
>> 4.the mtu of eth4.3 is still 100
>>
>> This bug is reported by Liang Zheng(lzheng@redhat.com).
>>
>> Signed-off-by: Weiping Pan<panweiping3@gmail.com>
>> ---
>> net/8021q/vlan.c | 3 ---
>> 1 files changed, 0 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
>> index 8970ba1..f6072b4 100644
>> --- a/net/8021q/vlan.c
>> +++ b/net/8021q/vlan.c
>> @@ -417,9 +417,6 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
>> if (!vlandev)
>> continue;
>>
>> - if (vlandev->mtu<= dev->mtu)
>> - continue;
>> -
>> dev_set_mtu(vlandev, dev->mtu);
>> }
>> break;
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
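
The two behaviours being debated can be compared with a small sketch.
It assumes the MTU-change handling is exactly what the quoted diff
context shows (skip the vlan device if its mtu already fits, otherwise
set it to the physical mtu); vlan_mtu_current() and vlan_mtu_patched()
are hypothetical helper names, not kernel functions.

```c
/* Current behaviour per the diff context: only shrink a vlan mtu that no
 * longer fits under the physical device's mtu; otherwise leave it alone. */
static unsigned vlan_mtu_current(unsigned vlan_mtu, unsigned dev_mtu)
{
	if (vlan_mtu <= dev_mtu)
		return vlan_mtu;
	return dev_mtu;
}

/* Behaviour with the proposed patch: always follow the physical device. */
static unsigned vlan_mtu_patched(unsigned vlan_mtu, unsigned dev_mtu)
{
	(void)vlan_mtu;
	return dev_mtu;
}
```

With a 1500-byte vlan on a device raised to 9000, the current code keeps
1500 (the configuration described above as valid) while the patched code
forces 9000. With the shrink-then-grow sequence from the report (100,
then 1500), the current code leaves the vlan at 100 and the patched code
restores 1500.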

^ permalink raw reply

* Re: [PATCH] route:ip_rt_frag_needed always return unzero
From: Gao feng @ 2011-10-19  2:33 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, kuznet, jmorris, netdev
In-Reply-To: <4E9E2929.7070701@cn.fujitsu.com>

On 2011-10-19 09:34, Gao feng wrote:
> 2011.10.18 17:23, Eric Dumazet wrote:
>> On Tuesday, 18 October 2011 at 15:04 +0800, Gao feng wrote:
>>> In function ip_rt_frag_needed, if peer is null,
>>> there is no need to call ipprot->err_handler.
>>> Am I right?
>>>
>>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>>> ---
>>>  net/ipv4/route.c |    2 +-
>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>>> index 075212e..6cde0fa 100644
>>> --- a/net/ipv4/route.c
>>> +++ b/net/ipv4/route.c
>>> @@ -1574,7 +1574,7 @@ unsigned short ip_rt_frag_needed(struct net *net, const struct iphdr *iph,
>>>  
>>>  		atomic_inc(&__rt_peer_genid);
>>>  	}
>>> -	return est_mtu ? : new_mtu;
>>> +	return est_mtu;
>>>  }
>>>  
>>>  static void check_peer_pmtu(struct dst_entry *dst, struct inet_peer *peer)
>>
>> No idea why you want this, your changelog is a bit cryptic :)
>>
>> Wont this bypass the raw_icmp_error(skb, protocol, info);
>> call in icmp_unreach() as well ?
>>
>>
> 
> thanks Eric!
> 
> I mean that, as far as I know, the pmtu is updated through inet_peer->pmtu_learned.
> So in function ip_rt_frag_needed,
> if inet_peer is null or something else makes setting inet_peer->pmtu_learned fail,
> there is no need to call function tcp_v4_err.
> 
> the call stack is
> icmp_unreach
>   |
>   |--->ip_rt_frag_needed(fill inet_peer)
>   |
>   |--->raw_icmp_error()
>   |
>   |--->ipprot->err_handler(tcp_v4_err or something else)
> 	|
> 	|--->tcp_v4_err(frag need icmp is triggered by tcp packet)
> 		|
> 		|--->do_pmtu_discovery
> 		(in this function both __sk_dst_check or dst->ops->update_pmtu
> 		need struct inet_peer to update pmtu)
> 
> So I think that when setting inet_peer->pmtu_learned fails,
> icmp_unreach should goto out immediately.
> 
> And it confuses me why ping_err and udp_err do not update the pmtu.
> What am I missing?
> --

And should we also move atomic_inc(&__rt_peer_genid), as is done in ip_rt_update_pmtu()?

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 6cde0fa..3e1aa5c 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1568,11 +1568,12 @@ unsigned short ip_rt_frag_needed(struct net *net, const
                        est_mtu = mtu;
                        peer->pmtu_learned = mtu;
                        peer->pmtu_expires = pmtu_expires;
+
+                       atomic_inc(&__rt_peer_genid);
                }

                inet_putpeer(peer);

-               atomic_inc(&__rt_peer_genid);
        }
        return est_mtu;
 }

^ permalink raw reply related

* [PATCH 9/9] make net/core/scm.c uid comparisons user namespace aware (v2)
From: Serge E. Hallyn @ 2011-10-19  2:25 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Joe Perches, linux-kernel, ebiederm, akpm, oleg, richard, mikevs,
	segoon, gregkh, dhowells, eparis, netdev
In-Reply-To: <20111018232242.GA22950@hallyn.com>

(Thanks for the suggestions, Joe.)

Currently uids are compared without regard for the user namespace.
Fix that to prevent tasks in a different user namespace from
wrongly matching on SCM_CREDENTIALS.

In the past, either your uids had to match, or you had to have
CAP_SETXID.  In a namespaced world, you must either (both be in the
same user namespace and have your uids match), or you must have
CAP_SETXID targeted at the other user namespace.  The latter can
happen for instance if uid 500 created a new user namespace and
now interacts with uid 0 in it.

Changelog: Oct 18:
	Per Joe Perches: don't mark uidequiv and gidequiv fns inline
	(let the compiler do that if appropriate), and change the flow
	of id comparisons to make it clearer.

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Joe Perches <joe@perches.com>
---
 net/core/scm.c |   43 ++++++++++++++++++++++++++++++++++++-------
 1 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/net/core/scm.c b/net/core/scm.c
index 811b53f..2261607 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -43,17 +43,46 @@
  *	setu(g)id.
  */
 
-static __inline__ int scm_check_creds(struct ucred *creds)
+static bool uidequiv(const struct cred *src, struct ucred *tgt,
+			       struct user_namespace *ns)
+{
+	if (src->user_ns != ns)
+		goto check_capable;
+	if (tgt->uid == src->uid ||
+	    tgt->uid == src->euid ||
+	    tgt->uid == src->suid)
+		return true;
+check_capable:
+	if (ns_capable(ns, CAP_SETUID))
+		return true;
+	return false;
+}
+
+static bool gidequiv(const struct cred *src, struct ucred *tgt,
+			       struct user_namespace *ns)
+{
+	if (src->user_ns != ns)
+		goto check_capable;
+	if (tgt->gid == src->gid ||
+	    tgt->gid == src->egid ||
+	    tgt->gid == src->sgid)
+		return true;
+check_capable:
+	if (ns_capable(ns, CAP_SETGID))
+		return true;
+	return false;
+}
+
+static int scm_check_creds(struct ucred *creds, struct socket *sock)
 {
 	const struct cred *cred = current_cred();
+	struct user_namespace *ns = sock_net(sock->sk)->user_ns;
 
-	if ((creds->pid == task_tgid_vnr(current) || capable(CAP_SYS_ADMIN)) &&
-	    ((creds->uid == cred->uid   || creds->uid == cred->euid ||
-	      creds->uid == cred->suid) || capable(CAP_SETUID)) &&
-	    ((creds->gid == cred->gid   || creds->gid == cred->egid ||
-	      creds->gid == cred->sgid) || capable(CAP_SETGID))) {
+	if ((creds->pid == task_tgid_vnr(current) || ns_capable(ns, CAP_SYS_ADMIN)) &&
+	     uidequiv(cred, creds, ns) && gidequiv(cred, creds, ns)) {
 	       return 0;
 	}
+
 	return -EPERM;
 }
 
@@ -169,7 +198,7 @@ int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *p)
 			if (cmsg->cmsg_len != CMSG_LEN(sizeof(struct ucred)))
 				goto error;
 			memcpy(&p->creds, CMSG_DATA(cmsg), sizeof(struct ucred));
-			err = scm_check_creds(&p->creds);
+			err = scm_check_creds(&p->creds, sock);
 			if (err)
 				goto error;
 
-- 
1.7.5.4
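
As a reading aid, the rule described in the changelog can be sketched in
user space. This is only a model under stated assumptions: struct
toy_cred and toy_uidequiv() are hypothetical, the namespace is reduced
to an integer id, and the ns_capable(ns, CAP_SETUID) check is replaced
by a boolean field saying whether the sender holds CAP_SETUID targeted
at the socket's namespace.

```c
#include <stdbool.h>

/* Toy credentials: a namespace id, the three uids, and the outcome of the
 * capability check against the target namespace. */
struct toy_cred {
	int ns;
	unsigned uid, euid, suid;
	bool cap_setuid_in_target;
};

/* Accept the claimed uid if the sender lives in the target namespace and
 * one of its uids matches; otherwise fall back to the capability check. */
static bool toy_uidequiv(const struct toy_cred *src, unsigned claimed_uid,
			 int target_ns)
{
	if (src->ns == target_ns &&
	    (claimed_uid == src->uid ||
	     claimed_uid == src->euid ||
	     claimed_uid == src->suid))
		return true;
	return src->cap_setuid_in_target;
}
```

In this model a uid 500 sender in namespace 1 may claim uid 500 on a
namespace 1 socket, but the same numeric uid in namespace 2 is rejected
unless the sender holds the capability over namespace 1.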

^ permalink raw reply related

* Re: [PATCH net] vlan:make mtu of vlan equal to physical dev
From: WeipingPan @ 2011-10-19  2:06 UTC (permalink / raw)
  To: herbert; +Cc: open list:NETWORKING [GENERAL]
In-Reply-To: <bc33a6142ce48b71b9c232a9154ed76f6048f0cb.1318068629.git.panweiping3@gmail.com>

Hi, Herbert,

What do you think of this patch ?

thanks
Weiping Pan
On 10/08/2011 06:12 PM, Weiping Pan wrote:
> The default mtu of a vlan device is the same as the mtu of the physical device,
> for example 1500, but when the physical mtu is changed to 1600,
> the VLAN device's mtu is still 1500.
> Certainly, you can change the vlan device's mtu to 1600 manually,
> but I think that when you change the physical device's mtu, the VLAN's mtu
> should be changed automatically instead of manually.
>
> Steps to Reproduce:
> 1.vconfig add eth4 3
> 2.ifconfig eth4 mtu 1600
> 3.check mtu on eth4.3
>
> And what's worse is that if you decrease the mtu of the physical device
> and later increase it again, the mtu of the vlan device is not changed.
>
> Steps to Reproduce:
> 1.vconfig add eth4 3
> 2.ifconfig eth4 mtu 100
> 3.ifconfig eth4 mtu 1500
> 4.the mtu of eth4.3 is still 100
>
> This bug is reported by Liang Zheng(lzheng@redhat.com).
>
> Signed-off-by: Weiping Pan<panweiping3@gmail.com>
> ---
>   net/8021q/vlan.c |    3 ---
>   1 files changed, 0 insertions(+), 3 deletions(-)
>
> diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
> index 8970ba1..f6072b4 100644
> --- a/net/8021q/vlan.c
> +++ b/net/8021q/vlan.c
> @@ -417,9 +417,6 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
>   			if (!vlandev)
>   				continue;
>
> -			if (vlandev->mtu<= dev->mtu)
> -				continue;
> -
>   			dev_set_mtu(vlandev, dev->mtu);
>   		}
>   		break;

^ permalink raw reply

* Re: [PATCH] route:ip_rt_frag_needed always return unzero
From: Gao feng @ 2011-10-19  1:34 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, kuznet, jmorris, netdev
In-Reply-To: <1318929797.2657.21.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

2011.10.18 17:23, Eric Dumazet wrote:
> On Tuesday, 18 October 2011 at 15:04 +0800, Gao feng wrote:
>> In function ip_rt_frag_needed, if peer is null,
>> there is no need to call ipprot->err_handler.
>> Am I right?
>>
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>>  net/ipv4/route.c |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>> index 075212e..6cde0fa 100644
>> --- a/net/ipv4/route.c
>> +++ b/net/ipv4/route.c
>> @@ -1574,7 +1574,7 @@ unsigned short ip_rt_frag_needed(struct net *net, const struct iphdr *iph,
>>  
>>  		atomic_inc(&__rt_peer_genid);
>>  	}
>> -	return est_mtu ? : new_mtu;
>> +	return est_mtu;
>>  }
>>  
>>  static void check_peer_pmtu(struct dst_entry *dst, struct inet_peer *peer)
> 
> No idea why you want this, your changelog is a bit cryptic :)
> 
> Wont this bypass the raw_icmp_error(skb, protocol, info);
> call in icmp_unreach() as well ?
> 
> 

thanks Eric!

I mean that, as far as I know, the pmtu is updated through inet_peer->pmtu_learned.
So in function ip_rt_frag_needed,
if inet_peer is null or something else makes setting inet_peer->pmtu_learned fail,
there is no need to call function tcp_v4_err.

the call stack is
icmp_unreach
  |
  |--->ip_rt_frag_needed(fill inet_peer)
  |
  |--->raw_icmp_error()
  |
  |--->ipprot->err_handler(tcp_v4_err or something else)
	|
	|--->tcp_v4_err(frag need icmp is triggered by tcp packet)
		|
		|--->do_pmtu_discovery
		(in this function both __sk_dst_check or dst->ops->update_pmtu
		need struct inet_peer to update pmtu)

So I think that when setting inet_peer->pmtu_learned fails,
icmp_unreach should goto out immediately.

And it confuses me why ping_err and udp_err do not update the pmtu.
What am I missing?

^ permalink raw reply

* Re: [RESEND] [PATCH] ll_temac: Add support for ethtool
From: Ben Hutchings @ 2011-10-18 23:29 UTC (permalink / raw)
  To: Ricardo Ribalda Delgado
  Cc: davem, grant.likely, sfr, u.kleine-koenig, netdev, linux-kernel
In-Reply-To: <1318960501-3544-1-git-send-email-ricardo.ribalda@gmail.com>

On Tue, 2011-10-18 at 19:55 +0200, Ricardo Ribalda Delgado wrote:
> This patch enables the ethtool interface. The implementation is done
> using the libphy helper functions.

All Ethernet drivers have been moved in net-next.  This driver is now
under drivers/net/ethernet/xilinx.

> Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
> Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
> ---
>  drivers/net/ll_temac_main.c |   27 +++++++++++++++++++++++++++
>  1 files changed, 27 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/ll_temac_main.c b/drivers/net/ll_temac_main.c
> index 728fe41..91a9804 100644
> --- a/drivers/net/ll_temac_main.c
> +++ b/drivers/net/ll_temac_main.c
> @@ -957,6 +957,32 @@ static const struct attribute_group temac_attr_group = {
>  	.attrs = temac_device_attrs,
>  };
>  
> +/* ethtool support */
> +static int temac_get_settings(struct net_device *ndev, struct ethtool_cmd *cmd)
> +{
> +	struct temac_local *lp = netdev_priv(ndev);
> +	return phy_ethtool_gset(lp->phy_dev, cmd);
> +}
> +
> +static int temac_set_settings(struct net_device *ndev, struct ethtool_cmd *cmd)
> +{
> +	struct temac_local *lp = netdev_priv(ndev);
> +	return phy_ethtool_sset(lp->phy_dev, cmd);
> +}
> +
> +static int temac_nway_reset(struct net_device *ndev)
> +{
> +	struct temac_local *lp = netdev_priv(ndev);
> +	return phy_start_aneg(lp->phy_dev);
> +}
> +
> +static const struct ethtool_ops temac_ethtool_ops = {
> +	.get_settings = temac_get_settings,
> +	.set_settings = temac_set_settings,
> +	.nway_reset = temac_nway_reset,
> +	.get_link = ethtool_op_get_link,
> +};
> +
>  static int __devinit temac_of_probe(struct platform_device *op)
>  {
>  	struct device_node *np;
> @@ -978,6 +1004,7 @@ static int __devinit temac_of_probe(struct platform_device *op)
>  	ndev->flags &= ~IFF_MULTICAST;  /* clear multicast */
>  	ndev->features = NETIF_F_SG | NETIF_F_FRAGLIST;
>  	ndev->netdev_ops = &temac_netdev_ops;
> +	ndev->ethtool_ops= &temac_ethtool_ops;

Missing space before '='.

Ben.

>  #if 0
>  	ndev->features |= NETIF_F_IP_CSUM; /* Can checksum TCP/UDP over IPv4. */
>  	ndev->features |= NETIF_F_HW_CSUM; /* Can checksum all the packets. */

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 9/9] make net/core/scm.c uid comparisons user namespace aware
From: Serge E. Hallyn @ 2011-10-18 23:22 UTC (permalink / raw)
  To: Joe Perches
  Cc: linux-kernel, ebiederm, akpm, oleg, richard, mikevs, segoon,
	gregkh, dhowells, eparis, Serge E. Hallyn, netdev
In-Reply-To: <1318976049.2273.7.camel@Joe-Laptop>

Quoting Joe Perches (joe@perches.com):
> On Tue, 2011-10-18 at 21:54 +0000, Serge Hallyn wrote:
> > From: "Serge E. Hallyn" <serge.hallyn@canonical.com>
> 
> Hi Serge.
> 
> Just some trivial style notes.
> 
> > Currently uids are compared without regard for the user namespace.
> > Fix that to prevent tasks in a different user namespace from
> > wrongly matching on SCM_CREDENTIALS.
> []
> > diff --git a/net/core/scm.c b/net/core/scm.c
> 
> > -static __inline__ int scm_check_creds(struct ucred *creds)
> > +static __inline__ bool uidequiv(const struct cred *src, struct ucred *tgt,
> > +			       struct user_namespace *ns)
> 
> Perhaps inline is better than __inline__ and do these
> functions really need to be marked inline at all?

Dunno, I was just sticking with the current style.

> > +{
> > +	if (src->user_ns != ns)
> > +		goto check_capable;
> > +	if (src->uid == tgt->uid || src->euid == tgt->uid ||
> > +	    src->suid == tgt->uid)
> 
> Perhaps this is less prone to typos and a bit
> more readable as:
> 
> 	if (tgt->uid == src->uid ||
> 	    tgt->uid == src->euid ||
> 	    tgt->uid == src->suid)

I do like that better.

thanks,
-serge

^ permalink raw reply

* Re: Comment on nf_queue NF_STOLEN patch
From: Jim Sansing @ 2011-10-18 21:34 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Network Development list
In-Reply-To: <1318973032.19139.5.camel@edumazet-laptop>

Eric Dumazet wrote:
> On Tuesday, 18 October 2011 at 15:08 -0400, Jim Sansing wrote:
>   
>> I have been working on a kernel module that registers with netfilter,
>> and I noticed that a patch was added to nf_queue that changed the
>> handling of return code NF_FILTER from 'do nothing' to 'free the skb'. 
>> I'm not sure which kernel version this went in, but the date of the
>> patch is Feb, 19, 2010.
>>
>> Everything I have read about netfilter states that it is up to the
>> netfilter hook to free the skb if NF_STOLEN is returned.  The
>> implications of this patch from a hook programming perspective are:
>>
>> 1) If the skb is used after the return from the hook, it must be cloned.
>> 2) The original skb must not be freed.
>>
>> I suggest that a comment be added to include/linux/netfilter.h that says
>> explicitly the skb will be freed if NF_STOLEN is returned.
>>     
>
> But it's not true. Just read the code.
>
> If you are working on this stuff I recommend you take a look at
> commits :
>
> c6675233f9015d3c0460c8aab53ed9b99d915c64
> (netfilter: nf_queue: reject NF_STOLEN verdicts from userspace)
>
> fad54440438a7c231a6ae347738423cbabc936d9
> (netfilter: avoid double free in nf_reinject)
>
> 64507fdbc29c3a622180378210ecea8659b14e40
> (netfilter: nf_queue: fix NF_STOLEN skb leak)
>
> 3bc38712e3a6e0596ccb6f8299043a826f983701
> ([NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in
> nf_reinject)
>
>   

I see that fad54440438a7c231a6ae347738423cbabc936d9 (netfilter: avoid
double free in nf_reinject) returns the switch case for NF_STOLEN back
to the original state, but I just downloaded 3.0.4, and the skb is still
freed.  So for some versions of the kernel, the situation exists. 
Hopefully anyone who runs into it will find this thread.

Later . . .   Jim
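
For anyone who finds this thread later, the ownership question can be
illustrated with a toy model. It is a sketch only and makes no claim
about what any particular kernel version does: the toy_* names are
hypothetical, and the "freed" counter simply shows why it matters which
side releases a stolen packet.

```c
enum toy_verdict { TOY_DROP, TOY_ACCEPT, TOY_STOLEN };

struct toy_pkt {
	int freed;      /* number of times the packet was released */
	int delivered;  /* set once the packet continues up the stack */
};

/* Hypothetical hook that keeps the packet: it releases the packet itself
 * and reports that it took ownership. */
static enum toy_verdict toy_hook_steal(struct toy_pkt *p)
{
	p->freed++;
	return TOY_STOLEN;
}

/* Core handling where a stolen packet is left entirely to the hook. */
static void toy_core_convention(struct toy_pkt *p, enum toy_verdict v)
{
	switch (v) {
	case TOY_ACCEPT: p->delivered = 1; break;
	case TOY_DROP:   p->freed++;       break;
	case TOY_STOLEN: /* owned by the hook, do nothing */ break;
	}
}

/* Variant where the core also releases a stolen packet. */
static void toy_core_frees_stolen(struct toy_pkt *p, enum toy_verdict v)
{
	if (v == TOY_STOLEN)
		p->freed++;
	else
		toy_core_convention(p, v);
}

/* Run the stealing hook through one of the two cores and report how many
 * times the packet ended up being released. */
static int toy_run(int core_frees_stolen)
{
	struct toy_pkt p = { 0, 0 };
	enum toy_verdict v = toy_hook_steal(&p);

	if (core_frees_stolen)
		toy_core_frees_stolen(&p, v);
	else
		toy_core_convention(&p, v);
	return p.freed;
}
```

In the first case the packet is released exactly once; in the second it
is released twice, which is why a hook written for one convention is
unsafe under the other.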

^ permalink raw reply

* Re: [PATCH 9/9] make net/core/scm.c uid comparisons user namespace aware
From: Joe Perches @ 2011-10-18 22:14 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: linux-kernel, ebiederm, akpm, oleg, richard, mikevs, segoon,
	gregkh, dhowells, eparis, Serge E. Hallyn, netdev
In-Reply-To: <1318974898-21431-10-git-send-email-serge@hallyn.com>

On Tue, 2011-10-18 at 21:54 +0000, Serge Hallyn wrote:
> From: "Serge E. Hallyn" <serge.hallyn@canonical.com>

Hi Serge.

Just some trivial style notes.

> Currently uids are compared without regard for the user namespace.
> Fix that to prevent tasks in a different user namespace from
> wrongly matching on SCM_CREDENTIALS.
[]
> diff --git a/net/core/scm.c b/net/core/scm.c

> -static __inline__ int scm_check_creds(struct ucred *creds)
> +static __inline__ bool uidequiv(const struct cred *src, struct ucred *tgt,
> +			       struct user_namespace *ns)

Perhaps inline is better than __inline__ and do these
functions really need to be marked inline at all?

> +{
> +	if (src->user_ns != ns)
> +		goto check_capable;
> +	if (src->uid == tgt->uid || src->euid == tgt->uid ||
> +	    src->suid == tgt->uid)

Perhaps this is less prone to typos and a bit
more readable as:

	if (tgt->uid == src->uid ||
	    tgt->uid == src->euid ||
	    tgt->uid == src->suid)

^ permalink raw reply

* Re: [PATCH] tc: fix parallel build file with lex/yacc
From: Stephen Hemminger @ 2011-10-18 22:02 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: stephen.hemminger, netdev
In-Reply-To: <1318973888-29496-1-git-send-email-vapier@gentoo.org>

On Tue, 18 Oct 2011 17:38:08 -0400
Mike Frysinger <vapier@gentoo.org> wrote:

> Building iproute2 in parallel might hit the race failure:
> 	emp_ematch.l:2:30: fatal error: emp_ematch.yacc.h:
> 		No such file or directory
> 	make[1]: *** [emp_ematch.lex.o] Error 1
> 
> This is because we currently allow the yacc/lex files to generate and
> compile in parallel.  So add a simple dependency to make sure yacc has
> finished before we attempt to compile the lex output.
> 
> Signed-off-by: Mike Frysinger <vapier@gentoo.org>

applied

^ permalink raw reply

* [PATCH 9/9] make net/core/scm.c uid comparisons user namespace aware
From: Serge Hallyn @ 2011-10-18 21:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: ebiederm, akpm, oleg, richard, mikevs, segoon, gregkh, dhowells,
	eparis, Serge E. Hallyn, netdev
In-Reply-To: <1318974898-21431-1-git-send-email-serge@hallyn.com>

From: "Serge E. Hallyn" <serge.hallyn@canonical.com>

Currently uids are compared without regard for the user namespace.
Fix that to prevent tasks in a different user namespace from
wrongly matching on SCM_CREDENTIALS.

In the past, either your uids had to match, or you had to have
CAP_SETXID.  In a namespaced world, you must either (both be in the
same user namespace and have your uids match), or you must have
CAP_SETXID targeted at the other user namespace.  The latter can
happen for instance if uid 500 created a new user namespace and
now interacts with uid 0 in it.

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: netdev@vger.kernel.org
---
 net/core/scm.c |   41 ++++++++++++++++++++++++++++++++++-------
 1 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/net/core/scm.c b/net/core/scm.c
index 811b53f..4f376bf 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -43,17 +43,44 @@
  *	setu(g)id.
  */
 
-static __inline__ int scm_check_creds(struct ucred *creds)
+static __inline__ bool uidequiv(const struct cred *src, struct ucred *tgt,
+			       struct user_namespace *ns)
+{
+	if (src->user_ns != ns)
+		goto check_capable;
+	if (src->uid == tgt->uid || src->euid == tgt->uid ||
+	    src->suid == tgt->uid)
+		return true;
+check_capable:
+	if (ns_capable(ns, CAP_SETUID))
+		return true;
+	return false;
+}
+
+static __inline__ bool gidequiv(const struct cred *src, struct ucred *tgt,
+			       struct user_namespace *ns)
+{
+	if (src->user_ns != ns)
+		goto check_capable;
+	if (src->gid == tgt->gid || src->egid == tgt->gid ||
+	    src->sgid == tgt->gid)
+		return true;
+check_capable:
+	if (ns_capable(ns, CAP_SETGID))
+		return true;
+	return false;
+}
+
+static __inline__ int scm_check_creds(struct ucred *creds, struct socket *sock)
 {
 	const struct cred *cred = current_cred();
+	struct user_namespace *ns = sock_net(sock->sk)->user_ns;
 
-	if ((creds->pid == task_tgid_vnr(current) || capable(CAP_SYS_ADMIN)) &&
-	    ((creds->uid == cred->uid   || creds->uid == cred->euid ||
-	      creds->uid == cred->suid) || capable(CAP_SETUID)) &&
-	    ((creds->gid == cred->gid   || creds->gid == cred->egid ||
-	      creds->gid == cred->sgid) || capable(CAP_SETGID))) {
+	if ((creds->pid == task_tgid_vnr(current) || ns_capable(ns, CAP_SYS_ADMIN)) &&
+	     uidequiv(cred, creds, ns) && gidequiv(cred, creds, ns)) {
 	       return 0;
 	}
+
 	return -EPERM;
 }
 
@@ -169,7 +196,7 @@ int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *p)
 			if (cmsg->cmsg_len != CMSG_LEN(sizeof(struct ucred)))
 				goto error;
 			memcpy(&p->creds, CMSG_DATA(cmsg), sizeof(struct ucred));
-			err = scm_check_creds(&p->creds);
+			err = scm_check_creds(&p->creds, sock);
 			if (err)
 				goto error;
 
-- 
1.7.5.4
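
For readers who want to poke at the rule outside the kernel, here is a
minimal userspace sketch of the decision uidequiv() makes in the patch
above.  Everything prefixed model_ is a made-up stand-in, not a kernel
type, and the cap_setuid_in_target flag only models the result of
ns_capable(ns, CAP_SETUID); it is an assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_ns {
	int id;
};

struct model_cred {
	const struct model_ns *user_ns;	/* namespace the sender lives in */
	unsigned int uid, euid, suid;
	bool cap_setuid_in_target;	/* models ns_capable(ns, CAP_SETUID) */
};

struct model_ucred {
	unsigned int uid;
};

/*
 * Same decision as uidequiv() in the patch: a uid match only counts when
 * the sender lives in the socket's user namespace.  In every other case
 * the result depends on the capability targeted at that namespace.
 */
static bool model_uidequiv(const struct model_cred *src,
			   const struct model_ucred *tgt,
			   const struct model_ns *ns)
{
	assert(src != NULL && tgt != NULL && ns != NULL);

	if (src->user_ns == ns &&
	    (tgt->uid == src->uid ||
	     tgt->uid == src->euid ||
	     tgt->uid == src->suid))
		return true;

	return src->cap_setuid_in_target;
}
```

With this model, a sender whose uid is 500 in one namespace does not
match a claimed uid of 500 when the socket belongs to another namespace,
unless the capability flag is set.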

^ permalink raw reply related

* [PATCH 8/9] protect cap_netlink_recv from user namespaces
From: Serge Hallyn @ 2011-10-18 21:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: ebiederm, akpm, oleg, richard, mikevs, segoon, gregkh, dhowells,
	eparis, Serge E. Hallyn, netdev
In-Reply-To: <1318974898-21431-1-git-send-email-serge@hallyn.com>

From: "Serge E. Hallyn" <serge.hallyn@canonical.com>

cap_netlink_recv() was granting privilege if a capability was in
current_cap(), regardless of the user namespace.  Fix that by
targeting the capability check against the user namespace which
owns the skb.

The caller passes the user namespace down because sock_net() is a static
inline defined in net/sock.h, which we would rather not #include where
cap_netlink_recv() is defined.

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: netdev@vger.kernel.org
---
 drivers/scsi/scsi_netlink.c     |    3 ++-
 include/linux/security.h        |   14 +++++++++-----
 kernel/audit.c                  |    6 ++++--
 net/core/rtnetlink.c            |    3 ++-
 net/decnet/netfilter/dn_rtmsg.c |    3 ++-
 net/ipv4/netfilter/ip_queue.c   |    3 ++-
 net/ipv6/netfilter/ip6_queue.c  |    3 ++-
 net/netfilter/nfnetlink.c       |    2 +-
 net/netlink/genetlink.c         |    2 +-
 net/xfrm/xfrm_user.c            |    2 +-
 security/commoncap.c            |    6 ++----
 security/security.c             |    4 ++--
 security/selinux/hooks.c        |    5 +++--
 13 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/scsi_netlink.c b/drivers/scsi/scsi_netlink.c
index 26a8a45..0aa2e57 100644
--- a/drivers/scsi/scsi_netlink.c
+++ b/drivers/scsi/scsi_netlink.c
@@ -111,7 +111,8 @@ scsi_nl_rcv_msg(struct sk_buff *skb)
 			goto next_msg;
 		}
 
-		if (security_netlink_recv(skb, CAP_SYS_ADMIN)) {
+		if (security_netlink_recv(skb, CAP_SYS_ADMIN,
+					  sock_net(skb->sk)->user_ns)) {
 			err = -EPERM;
 			goto next_msg;
 		}
diff --git a/include/linux/security.h b/include/linux/security.h
index ebd2a53..cfa1f47 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -95,7 +95,8 @@ struct xfrm_user_sec_ctx;
 struct seq_file;
 
 extern int cap_netlink_send(struct sock *sk, struct sk_buff *skb);
-extern int cap_netlink_recv(struct sk_buff *skb, int cap);
+extern int cap_netlink_recv(struct sk_buff *skb, int cap,
+			    struct user_namespace *ns);
 
 void reset_security_ops(void);
 
@@ -797,6 +798,7 @@ static inline void security_free_mnt_opts(struct security_mnt_opts *opts)
  *	@skb.
  *	@skb contains the sk_buff structure for the netlink message.
  *	@cap indicates the capability required
+ *	@ns is the user namespace which owns skb
  *	Return 0 if permission is granted.
  *
  * Security hooks for Unix domain networking.
@@ -1557,7 +1559,8 @@ struct security_operations {
 			  struct sembuf *sops, unsigned nsops, int alter);
 
 	int (*netlink_send) (struct sock *sk, struct sk_buff *skb);
-	int (*netlink_recv) (struct sk_buff *skb, int cap);
+	int (*netlink_recv) (struct sk_buff *skb, int cap,
+			     struct user_namespace *ns);
 
 	void (*d_instantiate) (struct dentry *dentry, struct inode *inode);
 
@@ -1806,7 +1809,7 @@ void security_d_instantiate(struct dentry *dentry, struct inode *inode);
 int security_getprocattr(struct task_struct *p, char *name, char **value);
 int security_setprocattr(struct task_struct *p, char *name, void *value, size_t size);
 int security_netlink_send(struct sock *sk, struct sk_buff *skb);
-int security_netlink_recv(struct sk_buff *skb, int cap);
+int security_netlink_recv(struct sk_buff *skb, int cap, struct user_namespace *ns);
 int security_secid_to_secctx(u32 secid, char **secdata, u32 *seclen);
 int security_secctx_to_secid(const char *secdata, u32 seclen, u32 *secid);
 void security_release_secctx(char *secdata, u32 seclen);
@@ -2498,9 +2501,10 @@ static inline int security_netlink_send(struct sock *sk, struct sk_buff *skb)
 	return cap_netlink_send(sk, skb);
 }
 
-static inline int security_netlink_recv(struct sk_buff *skb, int cap)
+static inline int security_netlink_recv(struct sk_buff *skb, int cap,
+					struct user_namespace *ns)
 {
-	return cap_netlink_recv(skb, cap);
+	return cap_netlink_recv(skb, cap, ns);
 }
 
 static inline int security_secid_to_secctx(u32 secid, char **secdata, u32 *seclen)
diff --git a/kernel/audit.c b/kernel/audit.c
index 0a1355c..48144c4 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -601,13 +601,15 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
 	case AUDIT_TTY_SET:
 	case AUDIT_TRIM:
 	case AUDIT_MAKE_EQUIV:
-		if (security_netlink_recv(skb, CAP_AUDIT_CONTROL))
+		if (security_netlink_recv(skb, CAP_AUDIT_CONTROL,
+					  sock_net(skb->sk)->user_ns))
 			err = -EPERM;
 		break;
 	case AUDIT_USER:
 	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
 	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
-		if (security_netlink_recv(skb, CAP_AUDIT_WRITE))
+		if (security_netlink_recv(skb, CAP_AUDIT_WRITE,
+					  sock_net(skb->sk)->user_ns))
 			err = -EPERM;
 		break;
 	default:  /* bad msg */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 99d9e95..4a444de 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1931,7 +1931,8 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	sz_idx = type>>2;
 	kind = type&3;
 
-	if (kind != 2 && security_netlink_recv(skb, CAP_NET_ADMIN))
+	if (kind != 2 && security_netlink_recv(skb, CAP_NET_ADMIN,
+					       net->user_ns))
 		return -EPERM;
 
 	if (kind == 2 && nlh->nlmsg_flags&NLM_F_DUMP) {
diff --git a/net/decnet/netfilter/dn_rtmsg.c b/net/decnet/netfilter/dn_rtmsg.c
index 69975e0..2d052ab 100644
--- a/net/decnet/netfilter/dn_rtmsg.c
+++ b/net/decnet/netfilter/dn_rtmsg.c
@@ -108,7 +108,8 @@ static inline void dnrmg_receive_user_skb(struct sk_buff *skb)
 	if (nlh->nlmsg_len < sizeof(*nlh) || skb->len < nlh->nlmsg_len)
 		return;
 
-	if (security_netlink_recv(skb, CAP_NET_ADMIN))
+	if (security_netlink_recv(skb, CAP_NET_ADMIN,
+	    sock_net(skb->sk)->user_ns))
 		RCV_SKB_FAIL(-EPERM);
 
 	/* Eventually we might send routing messages too */
diff --git a/net/ipv4/netfilter/ip_queue.c b/net/ipv4/netfilter/ip_queue.c
index e59aabd..d20bede 100644
--- a/net/ipv4/netfilter/ip_queue.c
+++ b/net/ipv4/netfilter/ip_queue.c
@@ -430,7 +430,8 @@ __ipq_rcv_skb(struct sk_buff *skb)
 	if (type <= IPQM_BASE)
 		return;
 
-	if (security_netlink_recv(skb, CAP_NET_ADMIN))
+	if (security_netlink_recv(skb, CAP_NET_ADMIN,
+				  sock_net(skb->sk)->user_ns))
 		RCV_SKB_FAIL(-EPERM);
 
 	spin_lock_bh(&queue_lock);
diff --git a/net/ipv6/netfilter/ip6_queue.c b/net/ipv6/netfilter/ip6_queue.c
index e63c397..09db01c 100644
--- a/net/ipv6/netfilter/ip6_queue.c
+++ b/net/ipv6/netfilter/ip6_queue.c
@@ -431,7 +431,8 @@ __ipq_rcv_skb(struct sk_buff *skb)
 	if (type <= IPQM_BASE)
 		return;
 
-	if (security_netlink_recv(skb, CAP_NET_ADMIN))
+	if (security_netlink_recv(skb, CAP_NET_ADMIN,
+				  sock_net(skb->sk)->user_ns))
 		RCV_SKB_FAIL(-EPERM);
 
 	spin_lock_bh(&queue_lock);
diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index 1905976..bcaff9d 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -130,7 +130,7 @@ static int nfnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	const struct nfnetlink_subsystem *ss;
 	int type, err;
 
-	if (security_netlink_recv(skb, CAP_NET_ADMIN))
+	if (security_netlink_recv(skb, CAP_NET_ADMIN, net->user_ns))
 		return -EPERM;
 
 	/* All the messages must at least contain nfgenmsg */
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index 482fa57..00a101c 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -516,7 +516,7 @@ static int genl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		return -EOPNOTSUPP;
 
 	if ((ops->flags & GENL_ADMIN_PERM) &&
-	    security_netlink_recv(skb, CAP_NET_ADMIN))
+	    security_netlink_recv(skb, CAP_NET_ADMIN, net->user_ns))
 		return -EPERM;
 
 	if (nlh->nlmsg_flags & NLM_F_DUMP) {
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 0256b8a..1808e1e 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2290,7 +2290,7 @@ static int xfrm_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	link = &xfrm_dispatch[type];
 
 	/* All operations require privileges, even GET */
-	if (security_netlink_recv(skb, CAP_NET_ADMIN))
+	if (security_netlink_recv(skb, CAP_NET_ADMIN, net->user_ns))
 		return -EPERM;
 
 	if ((type == (XFRM_MSG_GETSA - XFRM_MSG_BASE) ||
diff --git a/security/commoncap.c b/security/commoncap.c
index a93b3b7..1e48e6a 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -56,11 +56,9 @@ int cap_netlink_send(struct sock *sk, struct sk_buff *skb)
 	return 0;
 }
 
-int cap_netlink_recv(struct sk_buff *skb, int cap)
+int cap_netlink_recv(struct sk_buff *skb, int cap, struct user_namespace *ns)
 {
-	if (!cap_raised(current_cap(), cap))
-		return -EPERM;
-	return 0;
+	return security_capable(ns, current_cred(), cap);
 }
 EXPORT_SYMBOL(cap_netlink_recv);
 
diff --git a/security/security.c b/security/security.c
index 0e4fccf..0a1453e 100644
--- a/security/security.c
+++ b/security/security.c
@@ -941,9 +941,9 @@ int security_netlink_send(struct sock *sk, struct sk_buff *skb)
 	return security_ops->netlink_send(sk, skb);
 }
 
-int security_netlink_recv(struct sk_buff *skb, int cap)
+int security_netlink_recv(struct sk_buff *skb, int cap, struct user_namespace *ns)
 {
-	return security_ops->netlink_recv(skb, cap);
+	return security_ops->netlink_recv(skb, cap, ns);
 }
 EXPORT_SYMBOL(security_netlink_recv);
 
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 266a229..fe290bb 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4723,13 +4723,14 @@ static int selinux_netlink_send(struct sock *sk, struct sk_buff *skb)
 	return selinux_nlmsg_perm(sk, skb);
 }
 
-static int selinux_netlink_recv(struct sk_buff *skb, int capability)
+static int selinux_netlink_recv(struct sk_buff *skb, int capability,
+				struct user_namespace *ns)
 {
 	int err;
 	struct common_audit_data ad;
 	u32 sid;
 
-	err = cap_netlink_recv(skb, capability);
+	err = cap_netlink_recv(skb, capability, ns);
 	if (err)
 		return err;
 
-- 
1.7.5.4
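
To make the behaviour change concrete, the following is a simplified
userspace sketch contrasting the old check with a namespace-targeted
one.  All model_ names are hypothetical.  The parent walk is an
assumption based on the rule stated in patch 7/9 (privilege in the
owning user namespace or any parent of it); the real security_capable()
path may apply further rules that are not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_ns {
	const struct model_ns *parent;	/* NULL for the initial namespace */
};

struct model_cred {
	const struct model_ns *user_ns;	/* namespace the task lives in */
	unsigned long cap_effective;	/* one bit per capability */
};

static bool model_cap_raised(const struct model_cred *cred, int cap)
{
	return (cred->cap_effective >> cap) & 1UL;
}

/* Old behaviour: only the task's own capability set is consulted. */
static int model_recv_old(const struct model_cred *cred, int cap)
{
	return model_cap_raised(cred, cap) ? 0 : -EPERM;
}

/*
 * New behaviour, simplified: walk from the namespace that owns the skb
 * up through its parents.  The capability only counts if the task's own
 * namespace is found on that path.
 */
static int model_recv_new(const struct model_cred *cred, int cap,
			  const struct model_ns *target)
{
	assert(cred != NULL && cred->user_ns != NULL);

	for (; target != NULL; target = target->parent) {
		if (target == cred->user_ns)
			return model_cap_raised(cred, cap) ? 0 : -EPERM;
	}
	return -EPERM;
}
```

In this model a task that holds the capability only inside a child
namespace passes the old check unconditionally, but fails the new one
when the skb is owned by the initial namespace.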

^ permalink raw reply related

* [PATCH 7/9] user namespace: make each net (net_ns) belong to a user_ns
From: Serge Hallyn @ 2011-10-18 21:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: ebiederm, akpm, oleg, richard, mikevs, segoon, gregkh, dhowells,
	eparis, Serge E. Hallyn, netdev
In-Reply-To: <1318974898-21431-1-git-send-email-serge@hallyn.com>

From: "Serge E. Hallyn" <serge.hallyn@canonical.com>

The user namespace which creates a new network namespace owns that
namespace and all resources created in it.  This way we can target
capability checks for privileged operations against network resources to
the user_ns which created the network namespace in which the resource
lives.  Privilege to the user namespace which owns the network
namespace, or any parent user namespace thereof, provides the same
privilege to the network resource.

Changelog:
   jul 8: nsproxy: don't assign netns->userns if not cloning.

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: netdev@vger.kernel.org
---
 include/net/net_namespace.h |    2 ++
 kernel/nsproxy.c            |    2 ++
 net/core/net_namespace.c    |    3 +++
 3 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 3bb6fa0..d91fe5f 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -29,6 +29,7 @@ struct ctl_table_header;
 struct net_generic;
 struct sock;
 struct netns_ipvs;
+struct user_namespace;
 
 
 #define NETDEV_HASHBITS    8
@@ -101,6 +102,7 @@ struct net {
 	struct netns_xfrm	xfrm;
 #endif
 	struct netns_ipvs	*ipvs;
+	struct user_namespace	*user_ns;
 };
 
 
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 9aeab4b..0d5bf8d 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -95,6 +95,8 @@ static struct nsproxy *create_new_namespaces(unsigned long flags,
 		err = PTR_ERR(new_nsp->net_ns);
 		goto out_net;
 	}
+	if (flags & CLONE_NEWNET)
+		new_nsp->net_ns->user_ns = get_user_ns(task_cred_xxx(tsk, user_ns));
 
 	return new_nsp;
 
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 5bbdbf0..791c19c 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -10,6 +10,7 @@
 #include <linux/nsproxy.h>
 #include <linux/proc_fs.h>
 #include <linux/file.h>
+#include <linux/user_namespace.h>
 #include <net/net_namespace.h>
 #include <net/netns/generic.h>
 
@@ -209,6 +210,7 @@ static void net_free(struct net *net)
 	}
 #endif
 	kfree(net->gen);
+	put_user_ns(net->user_ns);
 	kmem_cache_free(net_cachep, net);
 }
 
@@ -389,6 +391,7 @@ static int __init net_ns_init(void)
 	rcu_assign_pointer(init_net.gen, ng);
 
 	mutex_lock(&net_mutex);
+	init_net.user_ns = &init_user_ns;
 	if (setup_net(&init_net))
 		panic("Could not setup the initial network namespace");
 
-- 
1.7.5.4
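
The lifetime side of this patch can be sketched in userspace as below: a
new network namespace takes a reference on the user namespace that
created it, and drops it again when the network namespace is freed.
The model_ types and the plain integer refcount are assumptions for
illustration only; the kernel's get_user_ns()/put_user_ns() are not
reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_user_ns {
	int refcount;
};

struct model_net {
	struct model_user_ns *user_ns;	/* owner of this network namespace */
};

/* Take a reference, mirroring the role of get_user_ns() in the patch. */
static struct model_user_ns *model_get_user_ns(struct model_user_ns *ns)
{
	ns->refcount++;
	return ns;
}

/* Drop a reference, mirroring the role of put_user_ns() in the patch. */
static void model_put_user_ns(struct model_user_ns *ns)
{
	assert(ns->refcount > 0);
	ns->refcount--;
}

/* Models the CLONE_NEWNET case: the creator's user_ns becomes the owner. */
static struct model_net *model_net_create(struct model_user_ns *creator)
{
	struct model_net *net = calloc(1, sizeof(*net));

	if (net == NULL)
		return NULL;
	net->user_ns = model_get_user_ns(creator);
	return net;
}

/* Models net_free(): release the owner reference before freeing. */
static void model_net_free(struct model_net *net)
{
	if (net == NULL)
		return;
	model_put_user_ns(net->user_ns);
	free(net);
}
```

The point of the pairing is that the owning user namespace cannot go
away while a network namespace still points at it.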

^ permalink raw reply related

* [PATCH] tc: fix parallel build file with lex/yacc
From: Mike Frysinger @ 2011-10-18 21:38 UTC (permalink / raw)
  To: stephen.hemminger, netdev

Building iproute2 in parallel might hit the race failure:
	emp_ematch.l:2:30: fatal error: emp_ematch.yacc.h:
		No such file or directory
	make[1]: *** [emp_ematch.lex.o] Error 1

This is because we currently allow the yacc/lex files to generate and
compile in parallel.  So add a simple dependency to make sure yacc has
finished before we attempt to compile the lex output.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
---
 tc/Makefile |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/tc/Makefile b/tc/Makefile
index 08aa4ce..b2ca165 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -136,6 +136,11 @@ m_xt_old.so: m_xt_old.c
 %.lex.c: %.l
 	$(LEX) $(LEXFLAGS) -o$@ $<
 
+# our lexer includes the header from yacc, so make sure
+# we don't attempt to compile it before the header has
+# been generated as part of the yacc step.
+emp_ematch.lex.o: emp_ematch.yacc.c
+
 ifneq ($(SHARED_LIBS),y)
 
 tc: static-syms.o
-- 
1.7.6.1

^ permalink raw reply related

