Netdev List
 help / color / mirror / Atom feed
* Re: [PATCHv2 net-next] add DOVE extensions for VXLAN
From: David Miller @ 2012-11-19 23:55 UTC (permalink / raw)
  To: dlstevens; +Cc: shemminger, netdev
In-Reply-To: <201211151347.qAFDkR8G023134@lab1.dls>

From: David L Stevens <dlstevens@us.ibm.com>
Date: Thu, 15 Nov 2012 08:46:27 -0500

> @@ -111,6 +113,10 @@ struct vxlan_dev {
>  	__u8		  tos;		/* TOS override */
>  	__u8		  ttl;
>  	bool		  learn;
> +	bool		  proxy;	/* ARP reduction */
> +	bool		  rsc;		/* route short-circuit */
> +	bool		  l2miss;	/* L2 miss notifications */
> +	bool		  l3miss;	/* L3 miss notifications */
>  
>  	unsigned long	  age_interval;
>  	struct timer_list age_timer;

Perhaps we've accumulated enough boolean states that a "u32 flags" are
in order?

> +		for (i=0; i<ETH_ALEN; ++i)
> +			send_eth |= !!fdb->eth_addr[i];

Please use is_zero_ether_addr().

Otherwise these changes look fine to me, thanks.

^ permalink raw reply

* Re: [PATCH v2 3.7.0-rc4] of/net/mdio-gpio: Fix pdev->id issue when using devicetrees.
From: David Miller @ 2012-11-19 23:58 UTC (permalink / raw)
  To: srinivas.kandagatla; +Cc: netdev, devicetree-discuss
In-Reply-To: <1353062039-31336-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas KANDAGATLA <srinivas.kandagatla@st.com>
Date: Fri, 16 Nov 2012 10:33:59 +0000

> From: Srinivas Kandagatla <srinivas.kandagatla@st.com>
> 
> When the mdio-gpio driver is probed via device trees, the platform
> device id is set as -1, However the pdev->id is re-used as bus-id for
> while creating mdio gpio bus.
> So
> For device tree case the mdio-gpio bus name appears as "gpio-ffffffff"
> where as
> for non-device tree case the bus name appears as "gpio-<bus-num>"
> 
> Which means the bus_id is fixed in device tree case, so we can't have
> two mdio gpio buses via device trees. Assigning a logical bus number
> via device tree solves the problem and the bus name is much consistent
> with non-device tree bus name.
> 
> Without this patch
> 1. we can't support two mdio-gpio buses via device trees.
> 2. we should always pass gpio-ffffffff as bus name to phy_connect, very
> different to non-device tree bus name.
> 
> So, setting up the bus_id via aliases from device tree is the right
> solution and other drivers do similar thing.
> 
> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH v4 1/9] net: core: use this_cpu_ptr per-cpu helper
From: David Miller @ 2012-11-20  0:00 UTC (permalink / raw)
  To: shanwei88; +Cc: timo.teras, steffen.klassert, netdev, linux-kernel, cl, htejun
In-Reply-To: <50A5FB69.8000907@gmail.com>

From: Shan Wei <shanwei88@gmail.com>
Date: Fri, 16 Nov 2012 16:38:01 +0800

> Shan Wei said, at 2012/11/13 9:51:
>> From: Shan Wei <davidshan@tencent.com>
>> 
>> flush_tasklet is a struct, not a pointer in percpu var.
>> so use this_cpu_ptr to get the member pointer.
>> 
>> Signed-off-by: Shan Wei <davidshan@tencent.com>
>> Reviewed-by: Christoph Lameter <cl@linux.com>
> 
> David Miller,  would you like to pick it up to your net-next tree?

Applied.

^ permalink raw reply

* Re: [PATCH v4 2/9] net: rds: use this_cpu_* per-cpu helper
From: David Miller @ 2012-11-20  0:00 UTC (permalink / raw)
  To: shanwei88
  Cc: venkat.x.venkatsubra, rds-devel, netdev, linux-kernel, cl, htejun
In-Reply-To: <50A5FB89.4000400@gmail.com>

From: Shan Wei <shanwei88@gmail.com>
Date: Fri, 16 Nov 2012 16:38:33 +0800

> Shan Wei said, at 2012/11/13 9:52:
>> From: Shan Wei <davidshan@tencent.com>
>> 
>> 
>> Signed-off-by: Shan Wei <davidshan@tencent.com>
>> Reviewed-by: Christoph Lameter <cl@linux.com>
> 
> David Miller,  would you like to pick it up to your net-next tree?

Applied.

^ permalink raw reply

* Re: [net-next PATCH v2] net/ethernet: remove useless is_valid_ether_addr from drivers ndo_open
From: David Miller @ 2012-11-20  0:03 UTC (permalink / raw)
  To: manabian
  Cc: nicolas.ferre, shemminger, steve.glendinning, stigge, msink,
	netdev
In-Reply-To: <1353077235-342-1-git-send-email-manabian@gmail.com>

From: Joachim Eastwood <manabian@gmail.com>
Date: Fri, 16 Nov 2012 15:47:15 +0100

> If ndo_validate_addr is set to the generic eth_validate_addr
> function there is no point in calling is_valid_ether_addr
> from driver ndo_open if ndo_open is not used elsewhere in
> the driver.
> 
> With this change is_valid_ether_addr will be called from the
> generic eth_validate_addr function. So there should be no change
> in the actual behavior.
> 
> Signed-off-by: Joachim Eastwood <manabian@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next 1/4] ipip: allow to deactivate the creation of fb dev
From: David Miller @ 2012-11-20  0:06 UTC (permalink / raw)
  To: nicolas.dichtel; +Cc: netdev
In-Reply-To: <1353082456-21234-2-git-send-email-nicolas.dichtel@6wind.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Fri, 16 Nov 2012 17:14:13 +0100

> Now that tunnels can be configured via rtnetlink, this device is not mandatory.
> The default is conservative.
> 
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

I'm not too thrilled about this change, mostly because of my dislike of
module parameters in general.

But in this case there appears to be real bugs in the two sets of changes
where you add this setup_fb thing.

> @@ -1057,7 +1066,8 @@ static void __net_exit ipip_exit_net(struct net *net)
>  
>  	rtnl_lock();
>  	ipip_destroy_tunnels(ipn, &list);
> -	unregister_netdevice_queue(ipn->fb_tunnel_dev, &list);
> +	if (setup_fb)
> +		unregister_netdevice_queue(ipn->fb_tunnel_dev, &list);
>  	unregister_netdevice_many(&list);
>  	rtnl_unlock();
>  }

Users can modify module parameter values via sysfs after the module
is loaded, so you need a more internal and protected state to use
to decide whether you really need to unregister the thing or not.

But to me it's just symptomatic of what a bad idea this is in the
first place.

^ permalink raw reply

* Re: [PATCH] netfilter: Remove the spurious \ in __ip_vs_lblc_init
From: Simon Horman @ 2012-11-20  0:06 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: David Miller, netdev, Pablo Neira Ayuso
In-Reply-To: <87zk2d7gwp.fsf@xmission.com>

On Mon, Nov 19, 2012 at 07:26:30AM -0800, Eric W. Biederman wrote:
> 
> In (464dc801c76a net: Don't export sysctls to unprivileged users)
> I typoed and introduced a spurious backslash.  Delete it.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Signed-off-by: Simon Horman <horms@verge.net.au>

Pablo, can you take this directly through your tree?

> ---
>  net/netfilter/ipvs/ip_vs_lblc.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
> index d742aa9..fdd89b9 100644
> --- a/net/netfilter/ipvs/ip_vs_lblc.c
> +++ b/net/netfilter/ipvs/ip_vs_lblc.c
> @@ -574,7 +574,7 @@ static int __net_init __ip_vs_lblc_init(struct net *net)
>  		register_net_sysctl(net, "net/ipv4/vs", ipvs->lblc_ctl_table);
>  	if (!ipvs->lblc_ctl_header) {
>  		if (!net_eq(net, &init_net))
> -			kfree(ipvs->lblc_ctl_table);\
> +			kfree(ipvs->lblc_ctl_table);
>  		return -ENOMEM;
>  	}
>  
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* Re: [PATCH net-next 1/4] ipip: allow to deactivate the creation of fb dev
From: David Miller @ 2012-11-20  0:07 UTC (permalink / raw)
  To: nicolas.dichtel; +Cc: shemminger, netdev
In-Reply-To: <50A66DD3.20807@6wind.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Fri, 16 Nov 2012 17:46:11 +0100

> By default, the fb device is created, so there is no change if you
> don't set explicitly setup_fb to 0.


Nicolas, this idea is contentous to me too.

Why not put this aside and submit the other parts of your
patch set on their own, since those looked fine to me?

Thanks.

^ permalink raw reply

* Re: [PATCH net-next] gro: Handle inline VLAN tags
From: David Miller @ 2012-11-20  0:10 UTC (permalink / raw)
  To: eric.dumazet; +Cc: bhutchings, netdev, linux-net-drivers, gallatin, herbert
In-Reply-To: <1353114559.10798.66.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 16 Nov 2012 17:09:19 -0800

> On Sat, 2012-11-17 at 00:32 +0000, Ben Hutchings wrote:
>> On Fri, 2012-11-16 at 16:16 -0800, Eric Dumazet wrote:
>> > On Sat, 2012-11-17 at 00:00 +0000, Ben Hutchings wrote:
>> > 
>> > > I'm not sure what you mean by this.  Is your point that the
>> > > copy-on-write is never needed?  It is still possible for pskb_may_pull()
>> > > to fail.
>> > > 
>> > 
>> > A packet sniffer should have a copy of bad frames, even if dropped later
>> > in our stacks.
>> > 
>> > GRO layer is not allowed to drop a frame, even if not 'correct'.
>> 
>> What do you think the accelerated hardware does with frames that have a
>> truncated VLAN tag?
> 
> The hardware should send us the frame, exactly like when RX checksum is
> wrong.

I agree with Eric, and therefore will not apply this patch.

^ permalink raw reply

* Re: [PATCH] net: remove unnecessary wireless includes
From: David Miller @ 2012-11-20  0:10 UTC (permalink / raw)
  To: johannes; +Cc: linux-wireless, netdev, johannes.berg
In-Reply-To: <1353099561-14339-1-git-send-email-johannes@sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Fri, 16 Nov 2012 21:59:21 +0100

> From: Johannes Berg <johannes.berg@intel.com>
> 
> The wireless and wext includes in net-sysfs.c aren't
> needed, so remove them.
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>

Applied, thanks Johannes.

^ permalink raw reply

* Re: [PATCH] netfilter: Remove the spurious \ in __ip_vs_lblc_init
From: Simon Horman @ 2012-11-20  0:12 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: David Miller, netdev, Pablo Neira Ayuso
In-Reply-To: <20121120000649.GC13829@verge.net.au>

On Tue, Nov 20, 2012 at 09:06:49AM +0900, Simon Horman wrote:
> On Mon, Nov 19, 2012 at 07:26:30AM -0800, Eric W. Biederman wrote:
> > 
> > In (464dc801c76a net: Don't export sysctls to unprivileged users)
> > I typoed and introduced a spurious backslash.  Delete it.
> > 
> > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Signed-off-by: Simon Horman <horms@verge.net.au>
> 
> Pablo, can you take this directly through your tree?

Scratch that, I see that David has already applied this change.

^ permalink raw reply

* Re: [PATCH] netfilter: Remove the spurious \ in __ip_vs_lblc_init
From: David Miller @ 2012-11-20  0:12 UTC (permalink / raw)
  To: horms; +Cc: ebiederm, netdev, pablo
In-Reply-To: <20121120000649.GC13829@verge.net.au>

From: Simon Horman <horms@verge.net.au>
Date: Tue, 20 Nov 2012 09:06:49 +0900

> On Mon, Nov 19, 2012 at 07:26:30AM -0800, Eric W. Biederman wrote:
>> 
>> In (464dc801c76a net: Don't export sysctls to unprivileged users)
>> I typoed and introduced a spurious backslash.  Delete it.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Signed-off-by: Simon Horman <horms@verge.net.au>
> 
> Pablo, can you take this directly through your tree?

I already applied this to net-next yesterday.

^ permalink raw reply

* Re: [PATCH net-next 0/3] Remove inet_lro remnants
From: David Miller @ 2012-11-20  0:14 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, eric.dumazet
In-Reply-To: <1353105635.2743.49.camel@bwh-desktop.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Fri, 16 Nov 2012 22:40:35 +0000

> Eric's comments on LRO/GRO conversion prompted me to check what was
> still using inet_lro.  Not very many drivers, but there are a few
> unnecessary references outside the code.

Series applied, thanks Ben.

^ permalink raw reply

* Re: [PATCH] bonding: rlb mode of bond should not alter ARP originating via bridge
From: Jay Vosburgh @ 2012-11-20  1:02 UTC (permalink / raw)
  To: Zheng Li; +Cc: netdev, andy, linux-kernel, davem, joe.jin
In-Reply-To: <1352972861-17577-1-git-send-email-zheng.x.li@oracle.com>

Zheng Li <zheng.x.li@oracle.com> wrote:

>ARP traffic passing through a bridge and out via the bond (when the bond is a 
>port of the bridge) should not have its source MAC address adjusted by the 
>receive load balance code in rlb_arp_xmit.

	This patch differs from prior versions in that it does more than
what's described here; it also disables the receive load balance logic
for any ARPs (request or reply) that are passing through the bond (not
of local origin).  For ARP replies, that's mostly harmless, as the ARPs
passing through will simply always be sent from one particular slave
(the active slave) instead of being balanced.

	For ARP requests, though, they are already always sent via the
active slave, but there is also some logic in rlb_arp_xmit to limit the
side effects from the broadcast ARP, in particular this part:

		/* The ARP reply packets must be delayed so that
		 * they can cancel out the influence of the ARP request.
		 */
		bond->alb_info.rlb_update_delay_counter = RLB_UPDATE_DELAY;

		/* arp requests are broadcast and are sent on the primary
		 * the arp request will collapse all clients on the subnet to
		 * the primary slave. We must register these clients to be
		 * updated with their assigned mac.
		 */
		rlb_req_update_subnet_clients(bond, arp->ip_src);

	that arranges for clients to be given ARP updates for their
slave assignments (which may change to the active slave, due to the ARP
broadcast being sent via the active slave).

	I think the ARP reply side of this is fine (and is what is
described in teh changelog), but the ARP request behavior change is new
with this version.

	Since prior versions of the patch didn't cause this code to be
skipped, is this change intentional?

	Did you check to see if the above logic is necessary for ARP
requests passing through via a bridge to prevent peers from "stacking"
(in terms of load balance assignment) on the active slave due to bridged
ARP traffic?

	-J

>Signed-off-by: Zheng Li <zheng.x.li@oracle.com>
>Cc: Jay Vosburgh <fubar@us.ibm.com>
>Cc: Andy Gospodarek <andy@greyhouse.net>
>Cc: "David S. Miller" <davem@davemloft.net>
>
>---
> drivers/net/bonding/bond_alb.c |    6 ++++++
> drivers/net/bonding/bonding.h  |   13 +++++++++++++
> 2 files changed, 19 insertions(+), 0 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
>index e15cc11..75f6f0d 100644
>--- a/drivers/net/bonding/bond_alb.c
>+++ b/drivers/net/bonding/bond_alb.c
>@@ -694,6 +694,12 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
> 	struct arp_pkt *arp = arp_pkt(skb);
> 	struct slave *tx_slave = NULL;
>
>+	/* Only modify ARP's MAC if it originates locally;
>+	 * don't change ARPs arriving via a bridge.
>+	 */
>+	if (!bond_slave_has_mac(bond, arp->mac_src))
>+		return NULL;
>+
> 	if (arp->op_code == htons(ARPOP_REPLY)) {
> 		/* the arp must be sent on the selected
> 		* rx channel
>diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
>index f8af2fc..6dded56 100644
>--- a/drivers/net/bonding/bonding.h
>+++ b/drivers/net/bonding/bonding.h
>@@ -22,6 +22,7 @@
> #include <linux/in6.h>
> #include <linux/netpoll.h>
> #include <linux/inetdevice.h>
>+#include <linux/etherdevice.h>
> #include "bond_3ad.h"
> #include "bond_alb.h"
>
>@@ -450,6 +451,18 @@ static inline void bond_destroy_proc_dir(struct bond_net *bn)
> }
> #endif
>
>+static inline struct slave *bond_slave_has_mac(struct bonding *bond,
>+					       const u8 *mac)
>+{
>+	int i = 0;
>+	struct slave *tmp;
>+
>+	bond_for_each_slave(bond, tmp, i)
>+		if (ether_addr_equal_64bits(mac, tmp->dev->dev_addr))
>+			return tmp;
>+
>+	return NULL;
>+}
>
> /* exported from bond_main.c */
> extern int bond_net_id;
>-- 
>1.7.6.5

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* [Suggestion] net/tipc: can delete checking: (if_local_len > TIPC_MAX_IF_NAME) || (if_peer_len  > TIPC_MAX_IF_NAME)
From: Chen Gang @ 2012-11-20  4:08 UTC (permalink / raw)
  To: David Miller; +Cc: Shan Wei, Eric Dumazet, netdev

Hello David Miller:

  at net/tipc/link.c:212,213, we can delete the check (if_local_len > TIPC_MAX_IF_NAME) and (if_peer_len  > TIPC_MAX_IF_NAME)
  for the total buffer length is no more than TIPC_MAX_IF_NAME, already (at line 182..186).

  suggest to modify it, although it is minor.

thanks.


 167 static int link_name_validate(const char *name,
 168                                 struct tipc_link_name *name_parts)
 169 {
 170         char name_copy[TIPC_MAX_LINK_NAME];
 171         char *addr_local;
 172         char *if_local;
 173         char *addr_peer;
 174         char *if_peer;
 175         char dummy;
 176         u32 z_local, c_local, n_local;
 177         u32 z_peer, c_peer, n_peer;
 178         u32 if_local_len;
 179         u32 if_peer_len;
 180 
 181         /* copy link name & ensure length is OK */
 182         name_copy[TIPC_MAX_LINK_NAME - 1] = 0;
 183         /* need above in case non-Posix strncpy() doesn't pad with nulls */
 184         strncpy(name_copy, name, TIPC_MAX_LINK_NAME);
 185         if (name_copy[TIPC_MAX_LINK_NAME - 1] != 0)
 186                 return 0;
 187 
 188         /* ensure all component parts of link name are present */
 189         addr_local = name_copy;
 190         if_local = strchr(addr_local, ':');
 191         if (if_local == NULL)
 192                 return 0;
 193         *(if_local++) = 0;
 194         addr_peer = strchr(if_local, '-');
 195         if (addr_peer == NULL)
 196                 return 0;
 197         *(addr_peer++) = 0;
 198         if_local_len = addr_peer - if_local;
 199         if_peer = strchr(addr_peer, ':');
 200         if (if_peer == NULL)
 201                 return 0;
 202         *(if_peer++) = 0;
 203         if_peer_len = strlen(if_peer) + 1;
 204 
 205         /* validate component parts of link name */
 206         if ((sscanf(addr_local, "%u.%u.%u%c",
 207                     &z_local, &c_local, &n_local, &dummy) != 3) ||
 208             (sscanf(addr_peer, "%u.%u.%u%c",
 209                     &z_peer, &c_peer, &n_peer, &dummy) != 3) ||
 210             (z_local > 255) || (c_local > 4095) || (n_local > 4095) ||
 211             (z_peer  > 255) || (c_peer  > 4095) || (n_peer  > 4095) ||
 212             (if_local_len <= 1) || (if_local_len > TIPC_MAX_IF_NAME) ||
 213             (if_peer_len  <= 1) || (if_peer_len  > TIPC_MAX_IF_NAME))
 214                 return 0;
 215 
 216         /* return link name components, if necessary */
 217         if (name_parts) {
 218                 name_parts->addr_local = tipc_addr(z_local, c_local, n_local);
 219                 strcpy(name_parts->if_local, if_local);
 220                 name_parts->addr_peer = tipc_addr(z_peer, c_peer, n_peer);
 221                 strcpy(name_parts->if_peer, if_peer);
 222         }
 223         return 1;
 224 }



-- 
Chen Gang

Asianux Corporation

^ permalink raw reply

* Re: [Suggestion] net/tipc: can delete checking: (if_local_len > TIPC_MAX_IF_NAME) || (if_peer_len  > TIPC_MAX_IF_NAME)
From: Xue Ying @ 2012-11-20  5:19 UTC (permalink / raw)
  To: Chen Gang; +Cc: David Miller, Shan Wei, Eric Dumazet, netdev
In-Reply-To: <50AB0249.20802@asianux.com>

Chen Gang wrote:
> Hello David Miller:
>
>   at net/tipc/link.c:212,213, we can delete the check (if_local_len > TIPC_MAX_IF_NAME) and (if_peer_len  > TIPC_MAX_IF_NAME)
>   for the total buffer length is no more than TIPC_MAX_IF_NAME, already (at line 182..186).
>
>   

The length of name_copy array is TIPC_MAX_LINK_NAME(i.e, 60) while
TIPC_MAX_IF_NAME is defined to 16, why does the former is more than the
latter?

Regards,
Ying

>   suggest to modify it, although it is minor.
>
> thanks.
>
>
>  167 static int link_name_validate(const char *name,
>  168                                 struct tipc_link_name *name_parts)
>  169 {
>  170         char name_copy[TIPC_MAX_LINK_NAME];
>  171         char *addr_local;
>  172         char *if_local;
>  173         char *addr_peer;
>  174         char *if_peer;
>  175         char dummy;
>  176         u32 z_local, c_local, n_local;
>  177         u32 z_peer, c_peer, n_peer;
>  178         u32 if_local_len;
>  179         u32 if_peer_len;
>  180 
>  181         /* copy link name & ensure length is OK */
>  182         name_copy[TIPC_MAX_LINK_NAME - 1] = 0;
>  183         /* need above in case non-Posix strncpy() doesn't pad with nulls */
>  184         strncpy(name_copy, name, TIPC_MAX_LINK_NAME);
>  185         if (name_copy[TIPC_MAX_LINK_NAME - 1] != 0)
>  186                 return 0;
>  187 
>  188         /* ensure all component parts of link name are present */
>  189         addr_local = name_copy;
>  190         if_local = strchr(addr_local, ':');
>  191         if (if_local == NULL)
>  192                 return 0;
>  193         *(if_local++) = 0;
>  194         addr_peer = strchr(if_local, '-');
>  195         if (addr_peer == NULL)
>  196                 return 0;
>  197         *(addr_peer++) = 0;
>  198         if_local_len = addr_peer - if_local;
>  199         if_peer = strchr(addr_peer, ':');
>  200         if (if_peer == NULL)
>  201                 return 0;
>  202         *(if_peer++) = 0;
>  203         if_peer_len = strlen(if_peer) + 1;
>  204 
>  205         /* validate component parts of link name */
>  206         if ((sscanf(addr_local, "%u.%u.%u%c",
>  207                     &z_local, &c_local, &n_local, &dummy) != 3) ||
>  208             (sscanf(addr_peer, "%u.%u.%u%c",
>  209                     &z_peer, &c_peer, &n_peer, &dummy) != 3) ||
>  210             (z_local > 255) || (c_local > 4095) || (n_local > 4095) ||
>  211             (z_peer  > 255) || (c_peer  > 4095) || (n_peer  > 4095) ||
>  212             (if_local_len <= 1) || (if_local_len > TIPC_MAX_IF_NAME) ||
>  213             (if_peer_len  <= 1) || (if_peer_len  > TIPC_MAX_IF_NAME))
>  214                 return 0;
>  215 
>  216         /* return link name components, if necessary */
>  217         if (name_parts) {
>  218                 name_parts->addr_local = tipc_addr(z_local, c_local, n_local);
>  219                 strcpy(name_parts->if_local, if_local);
>  220                 name_parts->addr_peer = tipc_addr(z_peer, c_peer, n_peer);
>  221                 strcpy(name_parts->if_peer, if_peer);
>  222         }
>  223         return 1;
>  224 }
>
>
>
>   

^ permalink raw reply

* Re: [Suggestion] net/tipc: can delete checking: (if_local_len > TIPC_MAX_IF_NAME) || (if_peer_len  > TIPC_MAX_IF_NAME)
From: Chen Gang @ 2012-11-20  5:25 UTC (permalink / raw)
  To: Xue Ying; +Cc: David Miller, Shan Wei, Eric Dumazet, netdev
In-Reply-To: <50AB12EE.6050802@gmail.com>

于 2012年11月20日 13:19, Xue Ying 写道:
> Chen Gang wrote:
>> Hello David Miller:
>>
>>   at net/tipc/link.c:212,213, we can delete the check (if_local_len > TIPC_MAX_IF_NAME) and (if_peer_len  > TIPC_MAX_IF_NAME)
>>   for the total buffer length is no more than TIPC_MAX_IF_NAME, already (at line 182..186).
>>
>>   
> 
> The length of name_copy array is TIPC_MAX_LINK_NAME(i.e, 60) while
> TIPC_MAX_IF_NAME is defined to 16, why does the former is more than the
> latter?
> 
> Regards,
> Ying
> 

  It is my fault. thanks.

  Regards,

  gchen


>>   suggest to modify it, although it is minor.
>>
>> thanks.
>>
>>
>>  167 static int link_name_validate(const char *name,
>>  168                                 struct tipc_link_name *name_parts)
>>  169 {
>>  170         char name_copy[TIPC_MAX_LINK_NAME];
>>  171         char *addr_local;
>>  172         char *if_local;
>>  173         char *addr_peer;
>>  174         char *if_peer;
>>  175         char dummy;
>>  176         u32 z_local, c_local, n_local;
>>  177         u32 z_peer, c_peer, n_peer;
>>  178         u32 if_local_len;
>>  179         u32 if_peer_len;
>>  180 
>>  181         /* copy link name & ensure length is OK */
>>  182         name_copy[TIPC_MAX_LINK_NAME - 1] = 0;
>>  183         /* need above in case non-Posix strncpy() doesn't pad with nulls */
>>  184         strncpy(name_copy, name, TIPC_MAX_LINK_NAME);
>>  185         if (name_copy[TIPC_MAX_LINK_NAME - 1] != 0)
>>  186                 return 0;
>>  187 
>>  188         /* ensure all component parts of link name are present */
>>  189         addr_local = name_copy;
>>  190         if_local = strchr(addr_local, ':');
>>  191         if (if_local == NULL)
>>  192                 return 0;
>>  193         *(if_local++) = 0;
>>  194         addr_peer = strchr(if_local, '-');
>>  195         if (addr_peer == NULL)
>>  196                 return 0;
>>  197         *(addr_peer++) = 0;
>>  198         if_local_len = addr_peer - if_local;
>>  199         if_peer = strchr(addr_peer, ':');
>>  200         if (if_peer == NULL)
>>  201                 return 0;
>>  202         *(if_peer++) = 0;
>>  203         if_peer_len = strlen(if_peer) + 1;
>>  204 
>>  205         /* validate component parts of link name */
>>  206         if ((sscanf(addr_local, "%u.%u.%u%c",
>>  207                     &z_local, &c_local, &n_local, &dummy) != 3) ||
>>  208             (sscanf(addr_peer, "%u.%u.%u%c",
>>  209                     &z_peer, &c_peer, &n_peer, &dummy) != 3) ||
>>  210             (z_local > 255) || (c_local > 4095) || (n_local > 4095) ||
>>  211             (z_peer  > 255) || (c_peer  > 4095) || (n_peer  > 4095) ||
>>  212             (if_local_len <= 1) || (if_local_len > TIPC_MAX_IF_NAME) ||
>>  213             (if_peer_len  <= 1) || (if_peer_len  > TIPC_MAX_IF_NAME))
>>  214                 return 0;
>>  215 
>>  216         /* return link name components, if necessary */
>>  217         if (name_parts) {
>>  218                 name_parts->addr_local = tipc_addr(z_local, c_local, n_local);
>>  219                 strcpy(name_parts->if_local, if_local);
>>  220                 name_parts->addr_peer = tipc_addr(z_peer, c_peer, n_peer);
>>  221                 strcpy(name_parts->if_peer, if_peer);
>>  222         }
>>  223         return 1;
>>  224 }
>>
>>
>>
>>   
> 
> 
> 


-- 
Chen Gang

Asianux Corporation

^ permalink raw reply

* Re: unable to handle paging request, arm, at aio/tcp code, only 3.6
From: Lluís Batlle i Rossell @ 2012-11-20  6:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1353360858.10798.86.camel@edumazet-glaptop>

On Mon, Nov 19, 2012 at 01:34:18PM -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> > Thanks for the report.
> > 
> > I believe this is a regression of commit
> > 35ad9b9cf7d8a2e6259a0d24022e910adb6f3489
> > (ipv6: Add helper inet6_csk_update_pmtu().)
> > 
> > I'll prepare a patch to fix this.
> 
> Please try the following fix.
> 
> Thanks !

Ok, I'm running it. Let's see if it crashes today.

Thank you!

^ permalink raw reply

* Re: New ASIX USB3 product
From: Christian Riesch @ 2012-11-20  8:15 UTC (permalink / raw)
  To: Jérôme Poulin; +Cc: netdev
In-Reply-To: <CALJXSJp+m7FewpW2i6wJoFAN0oGwao4-KjWuUX0F_dcE5vsoPQ@mail.gmail.com>

Jérôme,

On 2012-11-19 20:02, Jérôme Poulin wrote:
> I just bought a gigabit ethernet USB3 adapter, it is a StarTech
> USB31000S adapter and it does not seem to be un ASIX product database
> yet.

The AX88179 device is not yet supported by the mainline asix drivers. 
You could try to use the driver from the ASIX website [1]. Or you could 
compare this driver to the code in drivers/net/usb/a* and see if one of 
these drivers would work for the AX88179 and add an entry for the 
AX88179 in usb_device_id struct in asix_devices.c.

Regards, Christian

[1] 
http://www.asix.com.tw/products.php?op=pItemdetail&PItemID=131;71;112&PLine=71

^ permalink raw reply

* [PATCHSET REPOST v2 cgroup/for-3.8] netcls/prio_cgroup: update hierarchy support
From: Tejun Heo @ 2012-11-20  8:30 UTC (permalink / raw)
  To: daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	tgraf-G/eBtMaohhA
  Cc: davem-fT/PcQaiUtIeIZ0/mPfg9Q, lizefan-hv44wF8Li93QT0dZR+AlfA,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA

(Reposting w/ cc to netdev added as requested by David.  Sorry about
the noise.)

Hello, guys.

This patchset replaces the following two patchsets.

 [1] "[PATCHSET cgroup/for-3.8] netprio_cgroup: implement hierarchy support"
 [2] "[PATCHSET cgroup/for-3.8] netcls_cgroup: implement hierarchy support"

It turns out that we'll have to live with attributes which are
inherited only while creating a new cgroup - such attributes are
already in wide use in cpuset, and there doesn't seem to be much point
in trying to implement fully hierarchical behavior on all attributes
especially as it requires an otherwise-unnecessary userland behavior
change on netcls_cgroup.

This patchset drop .broken_hierarchy from netcls_cgroup while
retaining the current behavior, and implements the same behavior on
netprio_cgroup.

 0001-netcls_cgroup-move-config-inheritance-to-css_online-.patch
 0002-netprio_cgroup-simplify-write_priomap.patch
 0003-netprio_cgroup-shorten-variable-names-in-extend_netd.patch
 0004-netprio_cgroup-reimplement-priomap-expansion.patch
 0005-netprio_cgroup-use-cgroup-id-instead-of-cgroup_netpr.patch
 0006-netprio_cgroup-implement-netprio-_set-_prio-helpers.patch
 0007-netprio_cgroup-allow-nesting-and-inherit-config-on-c.patch

0001 drops .broken_hierarchy from netcls_cgroup.

0002-0006 prepare for inheritance implementation.  These patches are
identical from the previous posting[1].

0007 implements inheritance on cgroup creation.

This patchset is on top of cgroup/for-3.8 0a950f65e1 ("cgroup: add
cgroup->id").

and available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-cgroup-net-cls-prio-hierarchy

diffstat follows.

 Documentation/cgroups/net_prio.txt |    2
 include/net/netprio_cgroup.h       |   11 -
 net/core/netprio_cgroup.c          |  254 ++++++++++++++++---------------------
 net/sched/cls_cgroup.c             |   20 +-
 4 files changed, 124 insertions(+), 163 deletions(-)

Thanks.

--
tejun

[1] https://lkml.org/lkml/2012/11/16/514
[2] http://thread.gmane.org/gmane.linux.kernel.cgroups/5094

^ permalink raw reply

* [PATCH 1/7] netcls_cgroup: move config inheritance to ->css_online() and remove .broken_hierarchy marking
From: Tejun Heo @ 2012-11-20  8:30 UTC (permalink / raw)
  To: daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	tgraf-G/eBtMaohhA
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <1353400211-5182-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

It turns out that we'll have to live with attributes which are
inherited at cgroup creation time but not affected by further updates
to the parent afterwards - such attributes are already in wide use
e.g. for cpuset.

So, there's nothing to do for netcls_cgroup for hierarchy support.
Its current behavior - inherit only during creation - is good enough.

Move config inheriting from ->css_alloc() to ->css_online() for
consistency, which doesn't change behavior at all, and remove
.broken_hierarchy marking.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 net/sched/cls_cgroup.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 8cdc18e..31f06b6 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -41,11 +41,15 @@ static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
 	cs = kzalloc(sizeof(*cs), GFP_KERNEL);
 	if (!cs)
 		return ERR_PTR(-ENOMEM);
+	return &cs->css;
+}
 
+static int cgrp_css_online(struct cgroup *cgrp)
+{
 	if (cgrp->parent)
-		cs->classid = cgrp_cls_state(cgrp->parent)->classid;
-
-	return &cs->css;
+		cgrp_cls_state(cgrp)->classid =
+			cgrp_cls_state(cgrp->parent)->classid;
+	return 0;
 }
 
 static void cgrp_css_free(struct cgroup *cgrp)
@@ -76,19 +80,11 @@ static struct cftype ss_files[] = {
 struct cgroup_subsys net_cls_subsys = {
 	.name		= "net_cls",
 	.css_alloc	= cgrp_css_alloc,
+	.css_online	= cgrp_css_online,
 	.css_free	= cgrp_css_free,
 	.subsys_id	= net_cls_subsys_id,
 	.base_cftypes	= ss_files,
 	.module		= THIS_MODULE,
-
-	/*
-	 * While net_cls cgroup has the rudimentary hierarchy support of
-	 * inheriting the parent's classid on cgroup creation, it doesn't
-	 * properly propagates config changes in ancestors to their
-	 * descendents.  A child should follow the parent's configuration
-	 * but be allowed to override it.  Fix it and remove the following.
-	 */
-	.broken_hierarchy = true,
 };
 
 struct cls_cgroup_head {
-- 
1.7.11.7

^ permalink raw reply related

* [PATCH 2/7] netprio_cgroup: simplify write_priomap()
From: Tejun Heo @ 2012-11-20  8:30 UTC (permalink / raw)
  To: daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	tgraf-G/eBtMaohhA
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <1353400211-5182-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

sscanf() doesn't bite.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
---
 net/core/netprio_cgroup.c | 56 ++++++++++-------------------------------------
 1 file changed, 11 insertions(+), 45 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index f0b6b0d..66d98da 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -176,66 +176,32 @@ static int read_priomap(struct cgroup *cont, struct cftype *cft,
 static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 			 const char *buffer)
 {
-	char *devname = kstrdup(buffer, GFP_KERNEL);
-	int ret = -EINVAL;
 	u32 prioidx = cgrp_netprio_state(cgrp)->prioidx;
-	unsigned long priority;
-	char *priostr;
+	char devname[IFNAMSIZ + 1];
 	struct net_device *dev;
 	struct netprio_map *map;
+	u32 prio;
+	int ret;
 
-	if (!devname)
-		return -ENOMEM;
-
-	/*
-	 * Minimally sized valid priomap string
-	 */
-	if (strlen(devname) < 3)
-		goto out_free_devname;
-
-	priostr = strstr(devname, " ");
-	if (!priostr)
-		goto out_free_devname;
-
-	/*
-	 *Separate the devname from the associated priority
-	 *and advance the priostr pointer to the priority value
-	 */
-	*priostr = '\0';
-	priostr++;
-
-	/*
-	 * If the priostr points to NULL, we're at the end of the passed
-	 * in string, and its not a valid write
-	 */
-	if (*priostr == '\0')
-		goto out_free_devname;
-
-	ret = kstrtoul(priostr, 10, &priority);
-	if (ret < 0)
-		goto out_free_devname;
-
-	ret = -ENODEV;
+	if (sscanf(buffer, "%"__stringify(IFNAMSIZ)"s %u", devname, &prio) != 2)
+		return -EINVAL;
 
 	dev = dev_get_by_name(&init_net, devname);
 	if (!dev)
-		goto out_free_devname;
+		return -ENODEV;
 
 	rtnl_lock();
+
 	ret = write_update_netdev_table(dev);
-	if (ret < 0)
-		goto out_put_dev;
+	if (ret)
+		goto out_unlock;
 
 	map = rtnl_dereference(dev->priomap);
 	if (map)
-		map->priomap[prioidx] = priority;
-
-out_put_dev:
+		map->priomap[prioidx] = prio;
+out_unlock:
 	rtnl_unlock();
 	dev_put(dev);
-
-out_free_devname:
-	kfree(devname);
 	return ret;
 }
 
-- 
1.7.11.7

^ permalink raw reply related

* [PATCH 3/7] netprio_cgroup: shorten variable names in extend_netdev_table()
From: Tejun Heo @ 2012-11-20  8:30 UTC (permalink / raw)
  To: daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	tgraf-G/eBtMaohhA
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <1353400211-5182-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

The function is about to go through a rewrite.  In preparation,
shorten the variable names so that we don't repeat "priomap" so often.

This patch is cosmetic.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
---
 net/core/netprio_cgroup.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 66d98da..92cc54c 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -71,26 +71,25 @@ static int extend_netdev_table(struct net_device *dev, u32 new_len)
 {
 	size_t new_size = sizeof(struct netprio_map) +
 			   ((sizeof(u32) * new_len));
-	struct netprio_map *new_priomap = kzalloc(new_size, GFP_KERNEL);
-	struct netprio_map *old_priomap;
+	struct netprio_map *new = kzalloc(new_size, GFP_KERNEL);
+	struct netprio_map *old;
 
-	old_priomap  = rtnl_dereference(dev->priomap);
+	old = rtnl_dereference(dev->priomap);
 
-	if (!new_priomap) {
+	if (!new) {
 		pr_warn("Unable to alloc new priomap!\n");
 		return -ENOMEM;
 	}
 
-	if (old_priomap)
-		memcpy(new_priomap->priomap, old_priomap->priomap,
-		       old_priomap->priomap_len *
-		       sizeof(old_priomap->priomap[0]));
+	if (old)
+		memcpy(new->priomap, old->priomap,
+		       old->priomap_len * sizeof(old->priomap[0]));
 
-	new_priomap->priomap_len = new_len;
+	new->priomap_len = new_len;
 
-	rcu_assign_pointer(dev->priomap, new_priomap);
-	if (old_priomap)
-		kfree_rcu(old_priomap, rcu);
+	rcu_assign_pointer(dev->priomap, new);
+	if (old)
+		kfree_rcu(old, rcu);
 	return 0;
 }
 
-- 
1.7.11.7

^ permalink raw reply related

* [PATCH 4/7] netprio_cgroup: reimplement priomap expansion
From: Tejun Heo @ 2012-11-20  8:30 UTC (permalink / raw)
  To: daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	tgraf-G/eBtMaohhA
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <1353400211-5182-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

netprio kept track of the highest prioidx allocated and resized
priomaps accordingly when necessary.  This makes it necessary to keep
track of prioidx allocation and may end up resizing on every new
prioidx.

Update extend_netdev_table() such that it takes @target_idx which the
priomap should be able to accomodate.  If the priomap is large enough,
nothing happens; otherwise, the size is doubled until @target_idx can
be accomodated.

This makes max_prioidx and write_update_netdev_table() unnecessary.
write_priomap() now calls extend_netdev_table() directly.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
---
 net/core/netprio_cgroup.c | 56 ++++++++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 23 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 92cc54c..569d83d 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -27,11 +27,11 @@
 
 #include <linux/fdtable.h>
 
+#define PRIOMAP_MIN_SZ		128
 #define PRIOIDX_SZ 128
 
 static unsigned long prioidx_map[PRIOIDX_SZ];
 static DEFINE_SPINLOCK(prioidx_map_lock);
-static atomic_t max_prioidx = ATOMIC_INIT(0);
 
 static inline struct cgroup_netprio_state *cgrp_netprio_state(struct cgroup *cgrp)
 {
@@ -51,8 +51,6 @@ static int get_prioidx(u32 *prio)
 		return -ENOSPC;
 	}
 	set_bit(prioidx, prioidx_map);
-	if (atomic_read(&max_prioidx) < prioidx)
-		atomic_set(&max_prioidx, prioidx);
 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
 	*prio = prioidx;
 	return 0;
@@ -67,15 +65,40 @@ static void put_prioidx(u32 idx)
 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
 }
 
-static int extend_netdev_table(struct net_device *dev, u32 new_len)
+/*
+ * Extend @dev->priomap so that it's large enough to accomodate
+ * @target_idx.  @dev->priomap.priomap_len > @target_idx after successful
+ * return.  Must be called under rtnl lock.
+ */
+static int extend_netdev_table(struct net_device *dev, u32 target_idx)
 {
-	size_t new_size = sizeof(struct netprio_map) +
-			   ((sizeof(u32) * new_len));
-	struct netprio_map *new = kzalloc(new_size, GFP_KERNEL);
-	struct netprio_map *old;
+	struct netprio_map *old, *new;
+	size_t new_sz, new_len;
 
+	/* is the existing priomap large enough? */
 	old = rtnl_dereference(dev->priomap);
+	if (old && old->priomap_len > target_idx)
+		return 0;
+
+	/*
+	 * Determine the new size.  Let's keep it power-of-two.  We start
+	 * from PRIOMAP_MIN_SZ and double it until it's large enough to
+	 * accommodate @target_idx.
+	 */
+	new_sz = PRIOMAP_MIN_SZ;
+	while (true) {
+		new_len = (new_sz - offsetof(struct netprio_map, priomap)) /
+			sizeof(new->priomap[0]);
+		if (new_len > target_idx)
+			break;
+		new_sz *= 2;
+		/* overflowed? */
+		if (WARN_ON(new_sz < PRIOMAP_MIN_SZ))
+			return -ENOSPC;
+	}
 
+	/* allocate & copy */
+	new = kzalloc(new_sz, GFP_KERNEL);
 	if (!new) {
 		pr_warn("Unable to alloc new priomap!\n");
 		return -ENOMEM;
@@ -87,26 +110,13 @@ static int extend_netdev_table(struct net_device *dev, u32 new_len)
 
 	new->priomap_len = new_len;
 
+	/* install the new priomap */
 	rcu_assign_pointer(dev->priomap, new);
 	if (old)
 		kfree_rcu(old, rcu);
 	return 0;
 }
 
-static int write_update_netdev_table(struct net_device *dev)
-{
-	int ret = 0;
-	u32 max_len;
-	struct netprio_map *map;
-
-	max_len = atomic_read(&max_prioidx) + 1;
-	map = rtnl_dereference(dev->priomap);
-	if (!map || map->priomap_len < max_len)
-		ret = extend_netdev_table(dev, max_len);
-
-	return ret;
-}
-
 static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
 {
 	struct cgroup_netprio_state *cs;
@@ -191,7 +201,7 @@ static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 
 	rtnl_lock();
 
-	ret = write_update_netdev_table(dev);
+	ret = extend_netdev_table(dev, prioidx);
 	if (ret)
 		goto out_unlock;
 
-- 
1.7.11.7

^ permalink raw reply related

* [PATCH 5/7] netprio_cgroup: use cgroup->id instead of cgroup_netprio_state->prioidx
From: Tejun Heo @ 2012-11-20  8:30 UTC (permalink / raw)
  To: daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	tgraf-G/eBtMaohhA
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <1353400211-5182-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

With priomap expansion no longer depending on knowing max id
allocated, netprio_cgroup can use cgroup->id insted of cs->prioidx.
Drop prioidx alloc/free logic and convert all uses to cgroup->id.

* In cgrp_css_alloc(), parent->id test is moved above @cs allocation
  to simplify error path.

* In cgrp_css_free(), @cs assignment is made initialization.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
---
 include/net/netprio_cgroup.h | 11 +++-----
 net/core/netprio_cgroup.c    | 65 ++++++++------------------------------------
 2 files changed, 15 insertions(+), 61 deletions(-)

diff --git a/include/net/netprio_cgroup.h b/include/net/netprio_cgroup.h
index 2760f4f..1d04b6f 100644
--- a/include/net/netprio_cgroup.h
+++ b/include/net/netprio_cgroup.h
@@ -27,7 +27,6 @@ struct netprio_map {
 
 struct cgroup_netprio_state {
 	struct cgroup_subsys_state css;
-	u32 prioidx;
 };
 
 extern void sock_update_netprioidx(struct sock *sk, struct task_struct *task);
@@ -36,13 +35,12 @@ extern void sock_update_netprioidx(struct sock *sk, struct task_struct *task);
 
 static inline u32 task_netprioidx(struct task_struct *p)
 {
-	struct cgroup_netprio_state *state;
+	struct cgroup_subsys_state *css;
 	u32 idx;
 
 	rcu_read_lock();
-	state = container_of(task_subsys_state(p, net_prio_subsys_id),
-			     struct cgroup_netprio_state, css);
-	idx = state->prioidx;
+	css = task_subsys_state(p, net_prio_subsys_id);
+	idx = css->cgroup->id;
 	rcu_read_unlock();
 	return idx;
 }
@@ -57,8 +55,7 @@ static inline u32 task_netprioidx(struct task_struct *p)
 	rcu_read_lock();
 	css = task_subsys_state(p, net_prio_subsys_id);
 	if (css)
-		idx = container_of(css,
-				   struct cgroup_netprio_state, css)->prioidx;
+		idx = css->cgroup->id;
 	rcu_read_unlock();
 	return idx;
 }
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 569d83d..9409cdf 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -28,10 +28,6 @@
 #include <linux/fdtable.h>
 
 #define PRIOMAP_MIN_SZ		128
-#define PRIOIDX_SZ 128
-
-static unsigned long prioidx_map[PRIOIDX_SZ];
-static DEFINE_SPINLOCK(prioidx_map_lock);
 
 static inline struct cgroup_netprio_state *cgrp_netprio_state(struct cgroup *cgrp)
 {
@@ -39,32 +35,6 @@ static inline struct cgroup_netprio_state *cgrp_netprio_state(struct cgroup *cgr
 			    struct cgroup_netprio_state, css);
 }
 
-static int get_prioidx(u32 *prio)
-{
-	unsigned long flags;
-	u32 prioidx;
-
-	spin_lock_irqsave(&prioidx_map_lock, flags);
-	prioidx = find_first_zero_bit(prioidx_map, sizeof(unsigned long) * PRIOIDX_SZ);
-	if (prioidx == sizeof(unsigned long) * PRIOIDX_SZ) {
-		spin_unlock_irqrestore(&prioidx_map_lock, flags);
-		return -ENOSPC;
-	}
-	set_bit(prioidx, prioidx_map);
-	spin_unlock_irqrestore(&prioidx_map_lock, flags);
-	*prio = prioidx;
-	return 0;
-}
-
-static void put_prioidx(u32 idx)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&prioidx_map_lock, flags);
-	clear_bit(idx, prioidx_map);
-	spin_unlock_irqrestore(&prioidx_map_lock, flags);
-}
-
 /*
  * Extend @dev->priomap so that it's large enough to accomodate
  * @target_idx.  @dev->priomap.priomap_len > @target_idx after successful
@@ -120,62 +90,50 @@ static int extend_netdev_table(struct net_device *dev, u32 target_idx)
 static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
 {
 	struct cgroup_netprio_state *cs;
-	int ret = -EINVAL;
+
+	if (cgrp->parent && cgrp->parent->id)
+		return ERR_PTR(-EINVAL);
 
 	cs = kzalloc(sizeof(*cs), GFP_KERNEL);
 	if (!cs)
 		return ERR_PTR(-ENOMEM);
 
-	if (cgrp->parent && cgrp_netprio_state(cgrp->parent)->prioidx)
-		goto out;
-
-	ret = get_prioidx(&cs->prioidx);
-	if (ret < 0) {
-		pr_warn("No space in priority index array\n");
-		goto out;
-	}
-
 	return &cs->css;
-out:
-	kfree(cs);
-	return ERR_PTR(ret);
 }
 
 static void cgrp_css_free(struct cgroup *cgrp)
 {
-	struct cgroup_netprio_state *cs;
+	struct cgroup_netprio_state *cs = cgrp_netprio_state(cgrp);
 	struct net_device *dev;
 	struct netprio_map *map;
 
-	cs = cgrp_netprio_state(cgrp);
 	rtnl_lock();
 	for_each_netdev(&init_net, dev) {
 		map = rtnl_dereference(dev->priomap);
-		if (map && cs->prioidx < map->priomap_len)
-			map->priomap[cs->prioidx] = 0;
+		if (map && cgrp->id < map->priomap_len)
+			map->priomap[cgrp->id] = 0;
 	}
 	rtnl_unlock();
-	put_prioidx(cs->prioidx);
 	kfree(cs);
 }
 
 static u64 read_prioidx(struct cgroup *cgrp, struct cftype *cft)
 {
-	return (u64)cgrp_netprio_state(cgrp)->prioidx;
+	return cgrp->id;
 }
 
 static int read_priomap(struct cgroup *cont, struct cftype *cft,
 			struct cgroup_map_cb *cb)
 {
 	struct net_device *dev;
-	u32 prioidx = cgrp_netprio_state(cont)->prioidx;
+	u32 id = cont->id;
 	u32 priority;
 	struct netprio_map *map;
 
 	rcu_read_lock();
 	for_each_netdev_rcu(&init_net, dev) {
 		map = rcu_dereference(dev->priomap);
-		priority = (map && prioidx < map->priomap_len) ? map->priomap[prioidx] : 0;
+		priority = (map && id < map->priomap_len) ? map->priomap[id] : 0;
 		cb->fill(cb, dev->name, priority);
 	}
 	rcu_read_unlock();
@@ -185,7 +143,6 @@ static int read_priomap(struct cgroup *cont, struct cftype *cft,
 static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 			 const char *buffer)
 {
-	u32 prioidx = cgrp_netprio_state(cgrp)->prioidx;
 	char devname[IFNAMSIZ + 1];
 	struct net_device *dev;
 	struct netprio_map *map;
@@ -201,13 +158,13 @@ static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 
 	rtnl_lock();
 
-	ret = extend_netdev_table(dev, prioidx);
+	ret = extend_netdev_table(dev, cgrp->id);
 	if (ret)
 		goto out_unlock;
 
 	map = rtnl_dereference(dev->priomap);
 	if (map)
-		map->priomap[prioidx] = prio;
+		map->priomap[cgrp->id] = prio;
 out_unlock:
 	rtnl_unlock();
 	dev_put(dev);
-- 
1.7.11.7

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox