Netdev List
* [PATCH 2/2] net/neighbour: queue work on power efficient wq
From: Viresh Kumar @ 2014-01-22  6:53 UTC (permalink / raw)
  To: davem; +Cc: linaro-kernel, patches, netdev, linux-kernel, Viresh Kumar
In-Reply-To: <c24546776d8ec32afd1c084dd1d2a72fc96c6519.1390373223.git.viresh.kumar@linaro.org>

Work items queued by the neighbour layer have no real dependency on running on
the CPU that scheduled them.

On an idle system, it is observed that an idle CPU wakes up many times just to
service this work. It would be better to schedule it on a CPU which the
scheduler believes to be the most appropriate one.

This patch replaces normal workqueues with power efficient versions. This
doesn't change existing behavior of code unless CONFIG_WQ_POWER_EFFICIENT is
enabled.
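
The "no change unless enabled" behaviour can be pictured with a small
userspace model. This is only a sketch, assuming the workqueue core turns a
power efficient queue into an unbound one when the feature is switched on and
otherwise leaves it as a normal per-CPU queue. The MODEL_* flag names, their
values, and both helper functions are hypothetical and are not kernel API.

```c
#include <stdbool.h>

/* Illustrative flag bits; names and values are assumptions for this model. */
#define MODEL_WQ_UNBOUND         (1u << 1)
#define MODEL_WQ_POWER_EFFICIENT (1u << 7)

/*
 * Model of the flag fix-up done when a workqueue is allocated: a
 * power efficient queue becomes unbound only if the feature is enabled,
 * otherwise it stays a normal per-CPU queue.
 */
static unsigned int model_effective_flags(unsigned int flags,
					  bool power_efficient_enabled)
{
	if ((flags & MODEL_WQ_POWER_EFFICIENT) && power_efficient_enabled)
		flags |= MODEL_WQ_UNBOUND;
	return flags;
}

/* True when the scheduler, not the queueing CPU, picks where the work runs. */
static bool model_runs_on_any_cpu(unsigned int flags,
				  bool power_efficient_enabled)
{
	return (model_effective_flags(flags, power_efficient_enabled) &
		MODEL_WQ_UNBOUND) != 0;
}
```

In this model a queue created with MODEL_WQ_POWER_EFFICIENT behaves exactly
like a per-CPU queue while the feature is off, which is the property the
commit message relies on.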

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 net/core/neighbour.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index f8012fe..b9e9e0d 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -828,7 +828,7 @@ out:
 	 * ARP entry timeouts range from 1/2 BASE_REACHABLE_TIME to 3/2
 	 * BASE_REACHABLE_TIME.
 	 */
-	schedule_delayed_work(&tbl->gc_work,
+	queue_delayed_work(system_power_efficient_wq, &tbl->gc_work,
 			      NEIGH_VAR(&tbl->parms, BASE_REACHABLE_TIME) >> 1);
 	write_unlock_bh(&tbl->lock);
 }
@@ -1565,7 +1565,8 @@ static void neigh_table_init_no_netlink(struct neigh_table *tbl)
 
 	rwlock_init(&tbl->lock);
 	INIT_DEFERRABLE_WORK(&tbl->gc_work, neigh_periodic_work);
-	schedule_delayed_work(&tbl->gc_work, tbl->parms.reachable_time);
+	queue_delayed_work(system_power_efficient_wq, &tbl->gc_work,
+			tbl->parms.reachable_time);
 	setup_timer(&tbl->proxy_timer, neigh_proxy_process, (unsigned long)tbl);
 	skb_queue_head_init_class(&tbl->proxy_queue,
 			&neigh_table_proxy_queue_class);
-- 
1.7.12.rc2.18.g61b472e

^ permalink raw reply related

* Re: [PATCH] net: Fix some fallout from the ether_addr_copy() changes.
From: David Miller @ 2014-01-22  6:54 UTC (permalink / raw)
  To: billfink; +Cc: netdev
In-Reply-To: <20140121232331.130f34fc.billfink@mindspring.com>

From: Bill Fink <billfink@mindspring.com>
Date: Tue, 21 Jan 2014 23:23:31 -0500

> The commit message indicates problems with appletalk/aarp.c,
> atm/lec.c, and caif/caif_usb.c, but the diffstat and patch only
> address the first two and not caif/caif_usb.c.  Is that intended
> or am I missing something?

Thanks for catching that, I just pushed the missing change.

^ permalink raw reply

* Re: [PATCH RFC 00/73] tree-wide: clean up some no longer required #include <linux/init.h>
From: Stephen Rothwell @ 2014-01-22  7:00 UTC (permalink / raw)
  To: Paul Gortmaker
  Cc: linux-kernel, linux-arch, linux-alpha, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linuxppc-dev, linux-s390,
	sparclinux, x86, netdev, kvm, rusty, gregkh, akpm, torvalds
In-Reply-To: <1390339396-3479-1-git-send-email-paul.gortmaker@windriver.com>

[-- Attachment #1: Type: text/plain, Size: 2351 bytes --]

Hi Paul,

On Tue, 21 Jan 2014 16:22:03 -0500 Paul Gortmaker <paul.gortmaker@windriver.com> wrote:
>
> Where: This work exists as a queue of patches that I apply to
> linux-next; since the changes are fixing some things that currently
> can only be found there.  The patch series can be found at:
> 
>    http://git.kernel.org/cgit/linux/kernel/git/paulg/init.git
>    git://git.kernel.org/pub/scm/linux/kernel/git/paulg/init.git
> 
> I've avoided annoying Stephen with another queue of patches for
> linux-next while the development content was in flux, but now that
> the merge window has opened, and new additions are fewer, perhaps he
> wouldn't mind tacking it on the end...  Stephen?

OK, I have added this to the end of linux-next today - we will see how we
go.  It is called "init".

Thanks for adding your subsystem tree as a participant of linux-next.  As
you may know, this is not a judgment of your code.  The purpose of
linux-next is for integration testing and to lower the impact of
conflicts between subsystems in the next merge window. 

You will need to ensure that the patches/commits in your tree/series have
been:
     * submitted under GPL v2 (or later) and include the Contributor's
	Signed-off-by,
     * posted to the relevant mailing list,
     * reviewed by you (or another maintainer of your subsystem tree),
     * successfully unit tested, and 
     * destined for the current or next Linux merge window.

Basically, this should be just what you would send to Linus (or ask him
to fetch).  It is allowed to be rebased if you deem it necessary.

-- 
Cheers,
Stephen Rothwell 
sfr@canb.auug.org.au

Legal Stuff:
By participating in linux-next, your subsystem tree contributions are
public and will be included in the linux-next trees.  You may be sent
e-mail messages indicating errors or other issues when the
patches/commits from your subsystem tree are merged and tested in
linux-next.  These messages may also be cross-posted to the linux-next
mailing list, the linux-kernel mailing list, etc.  The linux-next tree
project and IBM (my employer) make no warranties regarding the linux-next
project, the testing procedures, the results, the e-mails, etc.  If you
don't agree to these ground rules, let me know and I'll remove your tree
from participation in linux-next.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* [PATCH net-next v4] ipv6: enable anycast addresses as source addresses for datagrams
From: Francois-Xavier Le Bail @ 2014-01-22  6:42 UTC (permalink / raw)
  To: netdev; +Cc: Hannes Frederic Sowa, David Stevens, David Miller

This change allows an anycast address to be considered valid as a source
address when given via an IPV6_PKTINFO or IPV6_2292PKTINFO ancillary data item.
So, when sending a datagram with ancillary data, the unicast and anycast
addresses are handled in the same way.

- Adds ipv6_chk_acast_addr_src() to check if an anycast address is link-local
  on given interface or is global.
- Uses it in ip6_datagram_send_ctl().
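
The scope rule described in the list above can be sketched in userspace as
follows. This is a model under stated assumptions, not the kernel code: the
table type and the model_* functions are hypothetical, and an ifindex of 0
stands in for the kernel's "dev == NULL, match on any interface".

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical anycast table entry: an address joined on one interface. */
struct model_acast_entry {
	int ifindex;
	struct in6_addr addr;
};

/* Match addr on ifindex, or on any interface when ifindex is 0. */
static bool model_chk_acast_addr(const struct model_acast_entry *tbl, size_t n,
				 int ifindex, const struct in6_addr *addr)
{
	for (size_t i = 0; i < n; i++) {
		if (ifindex && tbl[i].ifindex != ifindex)
			continue;
		if (memcmp(&tbl[i].addr, addr, sizeof(*addr)) == 0)
			return true;
	}
	return false;
}

/* Link-local sources must belong to the given interface; others may be on any. */
static bool model_chk_acast_addr_src(const struct model_acast_entry *tbl,
				     size_t n, int ifindex,
				     const struct in6_addr *addr)
{
	return model_chk_acast_addr(tbl, n,
				    IN6_IS_ADDR_LINKLOCAL(addr) ? ifindex : 0,
				    addr);
}
```

With a link-local anycast address joined on interface 2 only, the check
succeeds for interface 2 and fails for interface 3, while a global anycast
address is accepted regardless of the interface it was joined on.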

Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com>
---
v4: better style
v3: Consideration of Hannes's review (thanks!):
    - Uses only ipv6_chk_acast_addr (rcu_read_lock needed).

Typical usage:
A server uses the IPV6_RECVPKTINFO socket option to get ancillary data with
recvmsg() and can use sendmsg() to reply with an anycast address as the source
address, in the same way it does for unicast.
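
For the sending half of that usage, a userspace sketch of attaching an
IPV6_PKTINFO item to a message might look like the following. The helper name
set_pktinfo_src() is hypothetical; the structures and macros are the standard
socket API. It only prepares the msghdr, the caller still fills in the
destination and payload and calls sendmsg().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc exposes struct in6_pktinfo under this macro */
#endif
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Attach an IPV6_PKTINFO ancillary item to msg so that sendmsg() uses
 * src (for example the anycast address a request arrived on) as the
 * source address. ctrl must be suitably aligned for struct cmsghdr and
 * hold at least CMSG_SPACE(sizeof(struct in6_pktinfo)) bytes.
 * Returns 0 on success, -1 if the control buffer is too small.
 */
static int set_pktinfo_src(struct msghdr *msg, void *ctrl, size_t ctrl_len,
			   const struct in6_addr *src, unsigned int ifindex)
{
	struct in6_pktinfo info;
	struct cmsghdr *cmsg;

	if (ctrl_len < CMSG_SPACE(sizeof(info)))
		return -1;

	memset(ctrl, 0, CMSG_SPACE(sizeof(info)));
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(info));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = IPPROTO_IPV6;
	cmsg->cmsg_type = IPV6_PKTINFO;
	cmsg->cmsg_len = CMSG_LEN(sizeof(info));

	memset(&info, 0, sizeof(info));
	info.ipi6_addr = *src;
	info.ipi6_ifindex = ifindex;
	memcpy(CMSG_DATA(cmsg), &info, sizeof(info));
	return 0;
}
```

A server would typically copy ipi6_addr and ipi6_ifindex from the
IPV6_PKTINFO item it received with recvmsg() and pass them straight to this
helper before replying.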

 include/net/addrconf.h |    5 +++--
 net/ipv6/anycast.c     |   11 +++++++++++
 net/ipv6/datagram.c    |    4 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 66c4a44..50e39a8 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -205,8 +205,9 @@ void ipv6_sock_ac_close(struct sock *sk);
 int ipv6_dev_ac_inc(struct net_device *dev, const struct in6_addr *addr);
 int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr);
 bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
-				const struct in6_addr *addr);
-
+			 const struct in6_addr *addr);
+bool ipv6_chk_acast_addr_src(struct net *net, struct net_device *dev,
+			     const struct in6_addr *addr);
 
 /* Device notifier */
 int register_inet6addr_notifier(struct notifier_block *nb);
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 5a80f15..2101832 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -383,6 +383,17 @@ bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
 	return found;
 }
 
+/*	check if this anycast address is link-local on given interface or
+ *	is global
+ */
+bool ipv6_chk_acast_addr_src(struct net *net, struct net_device *dev,
+			     const struct in6_addr *addr)
+{
+	return ipv6_chk_acast_addr(net,
+				   (ipv6_addr_type(addr) & IPV6_ADDR_LINKLOCAL ?
+				    dev : NULL),
+				   addr);
+}
 
 #ifdef CONFIG_PROC_FS
 struct ac6_iter_state {
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2f5e2f1..c3bf2d2 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -699,7 +699,9 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 				int strict = __ipv6_addr_src_scope(addr_type) <= IPV6_ADDR_SCOPE_LINKLOCAL;
 				if (!(inet_sk(sk)->freebind || inet_sk(sk)->transparent) &&
 				    !ipv6_chk_addr(net, &src_info->ipi6_addr,
-						   strict ? dev : NULL, 0))
+						   strict ? dev : NULL, 0) &&
+				    !ipv6_chk_acast_addr_src(net, dev,
+							     &src_info->ipi6_addr))
 					err = -EINVAL;
 				else
 					fl6->saddr = src_info->ipi6_addr;

^ permalink raw reply related

* Re: IPV6 routing problem
From: Hannes Frederic Sowa @ 2014-01-22  7:07 UTC (permalink / raw)
  To: Sharat Masetty; +Cc: netdev
In-Reply-To: <CAJzFV37rPo6trvCwWyErchen_MvTzqabSU+BE3JZjNXT5-5H7Q@mail.gmail.com>

Hi!

On Tue, Jan 21, 2014 at 06:41:58PM -0700, Sharat Masetty wrote:
>  I have an IPV6 routing problem that has only surfaced on a 3.10
> kernel version. This problem is not seen on 3.4 kernel. I will keep
> the problem statement as brief as possible.

Could you do me a favor and test this on a recent 3.13 kernel? Thanks!
Please also state the specific kernel version (I guess you use one).

> I have two interfaces say eth0 and eth1. I have a host route to a
> destination over eth1, but this route has a global gateway
> address(via) of the router on the link. When this global gateway is
> present in the route, the kernel does not seem to pick up the route,
> but falls back to the default route which is eth0. When this gateway
> is removed from the route or if the gateway is changed to a link local
> address of the router, instead of the global address, then routing
> seems to work as expected and kernel picks up interface eth1.
> 
> == does not work ==
> <dest-addr> via <global scope -gateway addr>dev eth1 metric 1024
> <global scope -gateway addr> dev eth1  metric 1024

Do you have CONFIG_IPV6_ROUTER_PREF activated?

> 
> I checked the git commits diff between 3.4 and 3.10 for ipv6/route.c
> and ipv6/ip6_fib.c, but it looks like they are close to a year apart
> with lots of changes. Hence I wanted to reach out to you to see if any
> recent changes in IPV6 routing are causing this difference in behavior.

It should be a problem in ipv6/route.c, ip6_fib.c does not really care about
interfaces that much.

Thanks,

  Hannes

^ permalink raw reply

* Re: [PATCH net] tcp: metrics: Handle v6/v4-mapped sockets in tcp-metrics
From: David Miller @ 2014-01-22  7:09 UTC (permalink / raw)
  To: christoph.paasch; +Cc: netdev
In-Reply-To: <1390288632-28729-1-git-send-email-christoph.paasch@uclouvain.be>

From: Christoph Paasch <christoph.paasch@uclouvain.be>
Date: Tue, 21 Jan 2014 08:17:12 +0100

> A socket may be v6/v4-mapped. In that case sk->sk_family is AF_INET6,
> but the IP being used is actually an IPv4-address.
> The current tcp-metrics code will thus represent it as an IPv6-address:
> 
> root@server:~# ip tcp_metrics
> ::ffff:10.1.1.2 age 22.920sec rtt 18750us rttvar 15000us cwnd 10
> 10.1.1.2 age 47.970sec rtt 16250us rttvar 10000us cwnd 10
> 
> This patch modifies the tcp-metrics so that they are able to handle the
> v6/v4-mapped sockets correctly.
> 
> Fixes: 51c5d0c4b169b (tcp: Maintain dynamic metrics in local cache.)
> Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>

Please respin this against net-next, thanks.
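
The v4-mapped case quoted above can be illustrated with a short userspace
sketch. The patch itself is not shown in this message, so this is not its
code: the key structure and the model_* helpers are hypothetical, and the
approach (store a ::ffff:a.b.c.d peer under its IPv4 address) is an assumption
about one way to make both sockets share a single cache entry.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical cache key: an address family plus the address itself. */
struct model_metrics_key {
	sa_family_t family;
	union {
		struct in_addr v4;
		struct in6_addr v6;
	} addr;
};

/*
 * Build the key for a peer reached through an AF_INET6 socket. A
 * v4-mapped peer (::ffff:a.b.c.d) is stored under its IPv4 address so it
 * shares one entry with plain AF_INET sockets.
 */
static struct model_metrics_key model_key_from_v6(const struct in6_addr *peer)
{
	struct model_metrics_key key;

	memset(&key, 0, sizeof(key));
	if (IN6_IS_ADDR_V4MAPPED(peer)) {
		key.family = AF_INET;
		memcpy(&key.addr.v4, &peer->s6_addr[12], sizeof(key.addr.v4));
	} else {
		key.family = AF_INET6;
		key.addr.v6 = *peer;
	}
	return key;
}

/* Compare field by field so struct padding never affects the result. */
static bool model_key_equal(const struct model_metrics_key *a,
			    const struct model_metrics_key *b)
{
	if (a->family != b->family)
		return false;
	if (a->family == AF_INET)
		return a->addr.v4.s_addr == b->addr.v4.s_addr;
	return memcmp(&a->addr.v6, &b->addr.v6, sizeof(a->addr.v6)) == 0;
}
```

With this normalisation the two entries from the quoted `ip tcp_metrics`
output, ::ffff:10.1.1.2 and 10.1.1.2, would map to the same key.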

^ permalink raw reply

* Re: IPV6 routing problem
From: Hannes Frederic Sowa @ 2014-01-22  7:11 UTC (permalink / raw)
  To: Sharat Masetty, netdev
In-Reply-To: <20140122070759.GB28225@order.stressinduktion.org>

On Wed, Jan 22, 2014 at 08:07:59AM +0100, Hannes Frederic Sowa wrote:
> Hi!
> 
> On Tue, Jan 21, 2014 at 06:41:58PM -0700, Sharat Masetty wrote:
> >  I have an IPV6 routing problem that has only surfaced on a 3.10
> > kernel version. This problem is not seen on 3.4 kernel. I will keep
> > the problem statement as brief as possible.
> 
> Could you do me a favor and test this on a recent 3.13 kernel? Thanks!
> Please also state the specific kernel version (I guess you use one).

+stable.

A quick test worked for me. Maybe you could try with CONFIG_IPV6_ROUTER_PREF
enabled and one time with it disabled.

We should make this knob a runtime setting soon...

Thanks,

  Hannes

^ permalink raw reply

* Re: [PATCH net] bnx2x: Fix VF flr flow
From: David Miller @ 2014-01-22  7:13 UTC (permalink / raw)
  To: yuvalmin; +Cc: netdev, ariele
In-Reply-To: <1390293080-21276-1-git-send-email-yuvalmin@broadcom.com>

From: Yuval Mintz <yuvalmin@broadcom.com>
Date: Tue, 21 Jan 2014 10:31:20 +0200

> From: Ariel Elior <ariele@broadcom.com>
> 
> When a VF originating from a given PF is flr-ed, that PF gets an interrupt
> from the chip management and takes a part in the flr process.
> 
> This patch fixes several corner cases in which the driver performs its part
> of the flr flow out-of-order, causing the FW to assert due to badly timed
> messages received from the driver.
> 
> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
> Signed-off-by: Ariel Elior <ariele@broadcom.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] be2net: Fix be_vlan_add/rem_vid() routines
From: David Miller @ 2014-01-22  7:13 UTC (permalink / raw)
  To: somnath.kotur; +Cc: netdev, kalesh.purayil
In-Reply-To: <e4c76822-2f55-4a40-a0fa-c4cacb2a480f@CMEXHTCAS2.ad.emulex.com>

From: Somnath Kotur <somnath.kotur@emulex.com>
Date: Tue, 21 Jan 2014 15:50:55 +0530

> The current logic to put interface into VLAN Promiscuous mode is not correct.
> We should increment "adapter->vlans_added" before calling be_vid_config().
> Also removed some unwanted log messages.
> 
> Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net/mlx4_core: Remove unnecessary validation for port number
From: David Miller @ 2014-01-22  7:14 UTC (permalink / raw)
  To: amirv; +Cc: netdev, ogerlitz, monis, matanb
In-Reply-To: <1390292377-11806-1-git-send-email-amirv@mellanox.com>

From: Amir Vadai <amirv@mellanox.com>
Date: Tue, 21 Jan 2014 10:19:37 +0200

> From: Moni Shoua <monis@mellanox.co.il>
> 
> This is a fix to a regression introduced by commit:
> "982290a net/mlx4_core: Check port number for validity
> before accessing data"
> 
> IPoIB could not attach to multicast group and we get this in dmesg:
> [144214.145008] ib0: failed to attach to multicast group, ret = -22
> [144214.145016] ib0: couldn't attach QP to multicast group ff12:401b:ffff:0000:0000:0000:ffff:ffff
> [144214.145019] ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
> 
> The problem is caused by the port being extracted from gid[5], which is
> only valid for Ethernet.
> Removed this validation from mlx4_qp_attach_common(), which is reached
> from both Ethernet and IB flows.
> An error flow for a bad port value in Ethernet already exists in that
> function.
> 
> Signed-off-by: Moni Shoua <monis@mellanox.co.il>
> Signed-off-by: Matan Barak <matanb@mellanox.com>
> Signed-off-by: Amir Vadai <amirv@mellanox.com>

Applied, thank you.
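
The failure mode described in the quoted commit message can be reproduced
with a small userspace model. This is a sketch under assumptions: the
model_* names and the protocol enum are hypothetical, and the range check
(ports numbered from 1 to num_ports) is an assumed form of the validation,
not a copy of the driver code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical protocol tag for the model; not the driver's own enum. */
enum model_prot { MODEL_PROT_IB, MODEL_PROT_ETH };

/*
 * Model of the regressed check: byte 5 of the GID is treated as a port
 * number for every protocol, although only Ethernet stores a port there.
 */
static int model_attach_check_all(const uint8_t gid[16], int num_ports)
{
	int port = gid[5];

	if (port < 1 || port > num_ports)
		return -EINVAL;
	return 0;
}

/* Model of the fixed behaviour: only the Ethernet flow validates the port. */
static int model_attach_check_eth_only(const uint8_t gid[16],
				       enum model_prot prot, int num_ports)
{
	if (prot == MODEL_PROT_ETH)
		return model_attach_check_all(gid, num_ports);
	return 0;
}
```

For the IPoIB group in the log, ff12:401b:ffff:0000:0000:0000:ffff:ffff,
byte 5 is 0xff, so the unconditional check rejects it with -EINVAL, which
matches the "ret = -22" in dmesg.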

^ permalink raw reply

* Re: [PATCH 1/2] net: dm9000: Read GPR, modify and write
From: David Miller @ 2014-01-22  7:15 UTC (permalink / raw)
  To: chris.ruehl; +Cc: netdev, linux-kernel
In-Reply-To: <1390360813-29185-1-git-send-email-chris.ruehl@gtsys.com.hk>


Please do not mix coding style and functional changes.

Please resubmit this entire series once you have addressed
all feedback.

Thank you.

^ permalink raw reply

* Re: [PATCH net-next v5 0/0] reciprocal_divide update
From: David Miller @ 2014-01-22  7:20 UTC (permalink / raw)
  To: hannes; +Cc: netdev
In-Reply-To: <1390354181-5080-1-git-send-email-hannes@stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Wed, 22 Jan 2014 02:29:38 +0100

> This patch is on top of aee636c4809fa5 ("bpf: do not use reciprocal
> divide") from Eric that sits in net tree. It will not create a merge
> conflict, but it depends on this one, so we suggest, if possible, to
> merge net into net-next.
> 
> We are proposing this change with only small modifications from the
> v2 version, namely updating the name of trim to reciprocal_scale
> (as commented on by Ben Hutchings and Eric Dumazet, thanks!).
> 
> We thought about introducing the reciprocal_divide algorithm in
> parallel to the one already used by the kernel but faced organizational
> issues, leading us to the conclusion that it is best to just replace
> the old one: We could not come up with names for the different
> implementations and also with a way to describe the differences to
> guide developers which one to choose in which situation. This is
> because we cannot specify the correct semantics for the version
> which is currently used by the kernel. Although it does not seem to be
> causing problems in the kernel today, we cannot say so with certainty in
> the case of flex_array for the future. Current usage seems OK, but
> future users could run into problems.

Series applied, thanks.
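
For readers unfamiliar with the reciprocal_scale helper named in the quoted
cover letter, the following is a userspace sketch of the idea as I understand
it: a 32-bit value is mapped onto the interval [0, ep_ro) with a multiply and
a shift instead of a division. The model_ prefix marks this as an
illustration rather than the kernel's own definition.

```c
#include <stdint.h>

/*
 * Map a 32-bit value (for example a hash) onto the interval [0, ep_ro)
 * with one multiply and one shift instead of a division.
 */
static inline uint32_t model_reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}
```

The result is always strictly below ep_ro, which makes the helper a
convenient replacement for "hash % n" when picking one of n buckets from a
uniformly distributed 32-bit hash.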

^ permalink raw reply

* Re: [PATCH net-next 03/25] bonding: convert packets_per_slave to use the new option API
From: Hannes Frederic Sowa @ 2014-01-22  7:25 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev
In-Reply-To: <1390316114-17815-4-git-send-email-nikolay@redhat.com>

Hi Nikolay!

On Tue, Jan 21, 2014 at 03:54:52PM +0100, Nikolay Aleksandrov wrote:
> This patch adds the necessary changes so packets_per_slave would use the
> new bonding option API.

Just want to warn you that because of the reciprocal_divide merge in net-next
there will be some conflicts with this patch in net-next.

I actually looked into rebasing Daniel's and my series on your patchset, but
now David pulled ours first.

Greetings,

  Hannes

^ permalink raw reply

* Re: [PATCH 1/2] net: dm9000: Read GPR, modify and write
From: Chris Ruehl @ 2014-01-22  7:41 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel
In-Reply-To: <20140121.231537.1961204373265189658.davem@davemloft.net>

On Wednesday, January 22, 2014 03:15 PM, David Miller wrote:
> Please do not mix coding style and functional changes.
>
> Please resubmit this entire series once you have addressed
> all feedback.
>
> Thank you.
Thanks for the advice. I will do so.

Chris

^ permalink raw reply

* [patch net-next 0/5] fix bonding slave info API (sysfs and netlink)
From: Jiri Pirko @ 2014-01-22  8:05 UTC (permalink / raw)
  To: netdev
  Cc: davem, fubar, vfalico, andy, sfeldma, stephen, vyasevic,
	nicolas.dichtel, john.r.fastabend

The main part of this patchset is the introduction of a generic way
to get/set slave info using rtnl_link_ops.

Jiri Pirko (5):
  bonding: change name of sysfs dir for bonding slaves
  rtnetlink: put "BOND" into nl attribute names which are related to
    bonding
  rtnetlink: provide api for getting and setting slave info
  bonding: convert netlink to use slave data info api
  rtnetlink: remove ndo_get_slave

 drivers/net/bonding/bond_main.c        |   1 -
 drivers/net/bonding/bond_netlink.c     |  33 ++++--
 drivers/net/bonding/bond_sysfs_slave.c |   2 +-
 drivers/net/bonding/bonding.h          |   1 -
 include/linux/netdevice.h              |   5 -
 include/net/rtnetlink.h                |  14 +++
 include/uapi/linux/if_link.h           |  21 ++--
 net/core/rtnetlink.c                   | 209 ++++++++++++++++++++++-----------
 8 files changed, 190 insertions(+), 96 deletions(-)

-- 
1.8.3.1

^ permalink raw reply

* [patch net-next 1/5] bonding: change name of sysfs dir for bonding slaves
From: Jiri Pirko @ 2014-01-22  8:05 UTC (permalink / raw)
  To: netdev
  Cc: davem, fubar, vfalico, andy, sfeldma, stephen, vyasevic,
	nicolas.dichtel, john.r.fastabend
In-Reply-To: <1390377957-31466-1-git-send-email-jiri@resnulli.us>

Allow users to easily identify what the attributes relate to. Change
the name of the group to "bonding_slave" to be similar to the master's,
which is named "bonding".
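
From userspace, the rename changes the directory component of the slave
attribute paths. A minimal sketch of building such a path follows; the helper
name is hypothetical, and both the /sys/class/net/<ifname>/ prefix and any
attribute name passed in (such as "state") are assumptions about the sysfs
layout rather than something stated in this patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format the sysfs path of one per-slave attribute after the rename.
 * Returns 0 on success, -1 if buf is too small or formatting failed.
 */
static int bonding_slave_attr_path(char *buf, size_t len,
				   const char *ifname, const char *attr)
{
	int n = snprintf(buf, len, "/sys/class/net/%s/bonding_slave/%s",
			 ifname, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Scripts that previously read from the "slave" directory would need the same
one-word change.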

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 drivers/net/bonding/bond_sysfs_slave.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_sysfs_slave.c b/drivers/net/bonding/bond_sysfs_slave.c
index 7cb97de..84b0e38 100644
--- a/drivers/net/bonding/bond_sysfs_slave.c
+++ b/drivers/net/bonding/bond_sysfs_slave.c
@@ -118,7 +118,7 @@ int bond_sysfs_slave_add(struct slave *slave)
 	int err;
 
 	err = kobject_init_and_add(&slave->kobj, &slave_ktype,
-				   &(slave->dev->dev.kobj), "slave");
+				   &(slave->dev->dev.kobj), "bonding_slave");
 	if (err)
 		return err;
 
-- 
1.8.3.1

^ permalink raw reply related

* [patch net-next 2/5] rtnetlink: put "BOND" into nl attribute names which are related to bonding
From: Jiri Pirko @ 2014-01-22  8:05 UTC (permalink / raw)
  To: netdev
  Cc: davem, fubar, vfalico, andy, sfeldma, stephen, vyasevic,
	nicolas.dichtel, john.r.fastabend
In-Reply-To: <1390377957-31466-1-git-send-email-jiri@resnulli.us>

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 drivers/net/bonding/bond_netlink.c | 12 ++++++------
 include/uapi/linux/if_link.h       | 19 ++++++++++---------
 net/core/rtnetlink.c               | 16 ++++++++--------
 3 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index 21c6488..dd786a3 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -27,27 +27,27 @@ int bond_get_slave(struct net_device *slave_dev, struct sk_buff *skb)
 	struct slave *slave = bond_slave_get_rtnl(slave_dev);
 	const struct aggregator *agg;
 
-	if (nla_put_u8(skb, IFLA_SLAVE_STATE, bond_slave_state(slave)))
+	if (nla_put_u8(skb, IFLA_BOND_SLAVE_STATE, bond_slave_state(slave)))
 		goto nla_put_failure;
 
-	if (nla_put_u8(skb, IFLA_SLAVE_MII_STATUS, slave->link))
+	if (nla_put_u8(skb, IFLA_BOND_SLAVE_MII_STATUS, slave->link))
 		goto nla_put_failure;
 
-	if (nla_put_u32(skb, IFLA_SLAVE_LINK_FAILURE_COUNT,
+	if (nla_put_u32(skb, IFLA_BOND_SLAVE_LINK_FAILURE_COUNT,
 			slave->link_failure_count))
 		goto nla_put_failure;
 
-	if (nla_put(skb, IFLA_SLAVE_PERM_HWADDR,
+	if (nla_put(skb, IFLA_BOND_SLAVE_PERM_HWADDR,
 		    slave_dev->addr_len, slave->perm_hwaddr))
 		goto nla_put_failure;
 
-	if (nla_put_u16(skb, IFLA_SLAVE_QUEUE_ID, slave->queue_id))
+	if (nla_put_u16(skb, IFLA_BOND_SLAVE_QUEUE_ID, slave->queue_id))
 		goto nla_put_failure;
 
 	if (slave->bond->params.mode == BOND_MODE_8023AD) {
 		agg = SLAVE_AD_INFO(slave).port.aggregator;
 		if (agg)
-			if (nla_put_u16(skb, IFLA_SLAVE_AD_AGGREGATOR_ID,
+			if (nla_put_u16(skb, IFLA_BOND_SLAVE_AD_AGGREGATOR_ID,
 					agg->aggregator_identifier))
 				goto nla_put_failure;
 	}
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index ba2f3bf..1f30b85 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -144,7 +144,7 @@ enum {
 	IFLA_NUM_RX_QUEUES,
 	IFLA_CARRIER,
 	IFLA_PHYS_PORT_ID,
-	IFLA_SLAVE,
+	IFLA_BOND_SLAVE,
 	__IFLA_MAX
 };
 
@@ -370,16 +370,17 @@ enum {
 #define IFLA_BOND_AD_INFO_MAX	(__IFLA_BOND_AD_INFO_MAX - 1)
 
 enum {
-	IFLA_SLAVE_STATE,
-	IFLA_SLAVE_MII_STATUS,
-	IFLA_SLAVE_LINK_FAILURE_COUNT,
-	IFLA_SLAVE_PERM_HWADDR,
-	IFLA_SLAVE_QUEUE_ID,
-	IFLA_SLAVE_AD_AGGREGATOR_ID,
-	__IFLA_SLAVE_MAX,
+	IFLA_BOND_SLAVE_UNSPEC,
+	IFLA_BOND_SLAVE_STATE,
+	IFLA_BOND_SLAVE_MII_STATUS,
+	IFLA_BOND_SLAVE_LINK_FAILURE_COUNT,
+	IFLA_BOND_SLAVE_PERM_HWADDR,
+	IFLA_BOND_SLAVE_QUEUE_ID,
+	IFLA_BOND_SLAVE_AD_AGGREGATOR_ID,
+	__IFLA_BOND_SLAVE_MAX,
 };
 
-#define IFLA_SLAVE_MAX	(__IFLA_SLAVE_MAX - 1)
+#define IFLA_BOND_SLAVE_MAX	(__IFLA_BOND_SLAVE_MAX - 1)
 
 /* SR-IOV virtual function management section */
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 4f85de7..cace149 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -725,13 +725,13 @@ static size_t rtnl_bond_slave_size(const struct net_device *dev)
 {
 	struct net_device *bond;
 	size_t slave_size =
-		nla_total_size(sizeof(struct nlattr)) +	/* IFLA_SLAVE */
-		nla_total_size(1) +	/* IFLA_SLAVE_STATE */
-		nla_total_size(1) +	/* IFLA_SLAVE_MII_STATUS */
-		nla_total_size(4) +	/* IFLA_SLAVE_LINK_FAILURE_COUNT */
-		nla_total_size(MAX_ADDR_LEN) +	/* IFLA_SLAVE_PERM_HWADDR */
-		nla_total_size(2) +	/* IFLA_SLAVE_QUEUE_ID */
-		nla_total_size(2) +	/* IFLA_SLAVE_AD_AGGREGATOR_ID */
+		nla_total_size(sizeof(struct nlattr)) +	/* IFLA_BOND_SLAVE */
+		nla_total_size(1) +	/* IFLA_BOND_SLAVE_STATE */
+		nla_total_size(1) +	/* IFLA_BOND_SLAVE_MII_STATUS */
+		nla_total_size(4) +	/* IFLA_BOND_SLAVE_LINK_FAILURE_COUNT */
+		nla_total_size(MAX_ADDR_LEN) +	/* IFLA_BOND_SLAVE_PERM_HWADDR */
+		nla_total_size(2) +	/* IFLA_BOND_SLAVE_QUEUE_ID */
+		nla_total_size(2) +	/* IFLA_BOND_SLAVE_AD_AGGREGATOR_ID */
 		0;
 
 	if (netif_is_bond_slave((struct net_device *)dev)) {
@@ -883,7 +883,7 @@ static size_t rtnl_bond_slave_fill(struct sk_buff *skb, struct net_device *dev)
 	if (!bond || !bond->netdev_ops->ndo_get_slave)
 		return 0;
 
-	nest = nla_nest_start(skb, IFLA_SLAVE);
+	nest = nla_nest_start(skb, IFLA_BOND_SLAVE);
 	if (!nest)
 		return -EMSGSIZE;
 
-- 
1.8.3.1

^ permalink raw reply related

* [patch net-next 3/5] rtnetlink: provide api for getting and setting slave info
From: Jiri Pirko @ 2014-01-22  8:05 UTC (permalink / raw)
  To: netdev
  Cc: davem, fubar, vfalico, andy, sfeldma, stephen, vyasevic,
	nicolas.dichtel, john.r.fastabend
In-Reply-To: <1390377957-31466-1-git-send-email-jiri@resnulli.us>

The recent patch
bonding: add netlink attributes to slave link dev (1d3ee88ae0d6)

introduced yet another device-specific way to access slave information
over rtnetlink. There is already one for bridge.

This patch introduces a generic way to do this, for both getting and
setting info, by extending link_ops. Later on, this new interface will
be used for bridge ports as well.
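
The dispatch idea can be sketched in userspace with plain function pointers.
This is a simplified model, not the kernel interface: the model_* types and
functions are hypothetical stand-ins, and the callback returns an arbitrary
value only so the example has something to observe.

```c
#include <stddef.h>

struct model_dev;

/* Hypothetical, trimmed-down stand-in for the extended link ops. */
struct model_link_ops {
	const char *kind;
	/* Optional: report slave_dev's state as seen by its master dev. */
	int (*fill_slave_info)(const struct model_dev *dev,
			       const struct model_dev *slave_dev);
};

struct model_dev {
	const struct model_link_ops *link_ops;	/* may be NULL */
	const struct model_dev *master;		/* NULL when not enslaved */
};

/*
 * Generic dispatch: the core looks up the master of a device and lets
 * the master's ops describe the slave, so no per-driver hook is needed.
 * Returns 0 when there is nothing to report.
 */
static int model_slave_info_fill(const struct model_dev *dev)
{
	const struct model_dev *master = dev->master;

	if (!master || !master->link_ops || !master->link_ops->fill_slave_info)
		return 0;
	return master->link_ops->fill_slave_info(master, dev);
}

/* Example driver callback: a bonding-like master reports one attribute. */
static int model_bond_fill_slave_info(const struct model_dev *dev,
				      const struct model_dev *slave_dev)
{
	(void)dev;
	(void)slave_dev;
	return 1;	/* attributes written, in this model only */
}

static const struct model_link_ops model_bond_ops = {
	.kind = "bond",
	.fill_slave_info = model_bond_fill_slave_info,
};
```

A bridge-like driver would plug in by providing its own ops structure with
the same callback, without any change to the dispatch function.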

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 include/net/rtnetlink.h      |  14 ++++
 include/uapi/linux/if_link.h |   2 +
 net/core/rtnetlink.c         | 158 +++++++++++++++++++++++++++++++++++++------
 3 files changed, 154 insertions(+), 20 deletions(-)

diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 8fb4207..661e45d 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -79,6 +79,20 @@ struct rtnl_link_ops {
 					       const struct net_device *dev);
 	unsigned int		(*get_num_tx_queues)(void);
 	unsigned int		(*get_num_rx_queues)(void);
+
+	int			slave_maxtype;
+	const struct nla_policy	*slave_policy;
+	int			(*slave_validate)(struct nlattr *tb[],
+						  struct nlattr *data[]);
+	int			(*slave_changelink)(struct net_device *dev,
+						    struct net_device *slave_dev,
+						    struct nlattr *tb[],
+						    struct nlattr *data[]);
+	size_t			(*get_slave_size)(const struct net_device *dev,
+						  const struct net_device *slave_dev);
+	int			(*fill_slave_info)(struct sk_buff *skb,
+						   const struct net_device *dev,
+						   const struct net_device *slave_dev);
 };
 
 int __rtnl_link_register(struct rtnl_link_ops *ops);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 1f30b85..b8fb352 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -241,6 +241,8 @@ enum {
 	IFLA_INFO_KIND,
 	IFLA_INFO_DATA,
 	IFLA_INFO_XSTATS,
+	IFLA_INFO_SLAVE_KIND,
+	IFLA_INFO_SLAVE_DATA,
 	__IFLA_INFO_MAX,
 };
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index cace149..a56bccf 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -365,6 +365,22 @@ void rtnl_link_unregister(struct rtnl_link_ops *ops)
 }
 EXPORT_SYMBOL_GPL(rtnl_link_unregister);
 
+static size_t rtnl_link_get_slave_info_data_size(const struct net_device *dev)
+{
+	struct net_device *master_dev;
+	const struct rtnl_link_ops *ops;
+
+	master_dev = netdev_master_upper_dev_get((struct net_device *) dev);
+	if (!master_dev)
+		return 0;
+	ops = master_dev->rtnl_link_ops;
+	if (!ops || !ops->get_slave_size)
+		return 0;
+	/* IFLA_INFO_SLAVE_DATA + nested data */
+	return nla_total_size(sizeof(struct nlattr)) +
+	       ops->get_slave_size(master_dev, dev);
+}
+
 static size_t rtnl_link_get_size(const struct net_device *dev)
 {
 	const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
@@ -385,6 +401,8 @@ static size_t rtnl_link_get_size(const struct net_device *dev)
 		/* IFLA_INFO_XSTATS */
 		size += nla_total_size(ops->get_xstats_size(dev));
 
+	size += rtnl_link_get_slave_info_data_size(dev);
+
 	return size;
 }
 
@@ -459,40 +477,101 @@ static size_t rtnl_link_get_af_size(const struct net_device *dev)
 	return size;
 }
 
-static int rtnl_link_fill(struct sk_buff *skb, const struct net_device *dev)
+static bool rtnl_have_link_slave_info(const struct net_device *dev)
 {
-	const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
-	struct nlattr *linkinfo, *data;
-	int err = -EMSGSIZE;
+	struct net_device *master_dev;
 
-	linkinfo = nla_nest_start(skb, IFLA_LINKINFO);
-	if (linkinfo == NULL)
-		goto out;
+	master_dev = netdev_master_upper_dev_get((struct net_device *) dev);
+	if (master_dev && master_dev->rtnl_link_ops &&
+	    master_dev->rtnl_link_ops->fill_slave_info)
+		return true;
+	return false;
+}
+
+static int rtnl_link_slave_info_fill(struct sk_buff *skb,
+				     const struct net_device *dev)
+{
+	struct net_device *master_dev;
+	const struct rtnl_link_ops *ops;
+	struct nlattr *slave_data;
+	int err;
 
+	master_dev = netdev_master_upper_dev_get((struct net_device *) dev);
+	if (!master_dev)
+		return 0;
+	ops = master_dev->rtnl_link_ops;
+	if (!ops)
+		return 0;
+	if (nla_put_string(skb, IFLA_INFO_SLAVE_KIND, ops->kind) < 0)
+		return -EMSGSIZE;
+	if (ops->fill_slave_info) {
+		slave_data = nla_nest_start(skb, IFLA_INFO_SLAVE_DATA);
+		if (!slave_data)
+			return -EMSGSIZE;
+		err = ops->fill_slave_info(skb, master_dev, dev);
+		if (err < 0)
+			goto err_cancel_slave_data;
+		nla_nest_end(skb, slave_data);
+	}
+	return 0;
+
+err_cancel_slave_data:
+	nla_nest_cancel(skb, slave_data);
+	return err;
+}
+
+static int rtnl_link_info_fill(struct sk_buff *skb,
+			       const struct net_device *dev)
+{
+	const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
+	struct nlattr *data;
+	int err;
+
+	if (!ops)
+		return 0;
 	if (nla_put_string(skb, IFLA_INFO_KIND, ops->kind) < 0)
-		goto err_cancel_link;
+		return -EMSGSIZE;
 	if (ops->fill_xstats) {
 		err = ops->fill_xstats(skb, dev);
 		if (err < 0)
-			goto err_cancel_link;
+			return err;
 	}
 	if (ops->fill_info) {
 		data = nla_nest_start(skb, IFLA_INFO_DATA);
-		if (data == NULL) {
-			err = -EMSGSIZE;
-			goto err_cancel_link;
-		}
+		if (data == NULL)
+			return -EMSGSIZE;
 		err = ops->fill_info(skb, dev);
 		if (err < 0)
 			goto err_cancel_data;
 		nla_nest_end(skb, data);
 	}
-
-	nla_nest_end(skb, linkinfo);
 	return 0;
 
 err_cancel_data:
 	nla_nest_cancel(skb, data);
+	return err;
+}
+
+static int rtnl_link_fill(struct sk_buff *skb, const struct net_device *dev)
+{
+	struct nlattr *linkinfo;
+	int err = -EMSGSIZE;
+
+	linkinfo = nla_nest_start(skb, IFLA_LINKINFO);
+	if (linkinfo == NULL)
+		goto out;
+
+	err = rtnl_link_info_fill(skb, dev);
+	if (err < 0)
+		goto err_cancel_link;
+
+	err = rtnl_link_slave_info_fill(skb, dev);
+	if (err < 0)
+		goto err_cancel_link;
+
+	nla_nest_end(skb, linkinfo);
+	return 0;
+
 err_cancel_link:
 	nla_nest_cancel(skb, linkinfo);
 out:
@@ -1052,10 +1131,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 	if (rtnl_port_fill(skb, dev))
 		goto nla_put_failure;
 
-	if (rtnl_bond_slave_fill(skb, dev))
-		goto nla_put_failure;
-
-	if (dev->rtnl_link_ops) {
+	if (dev->rtnl_link_ops || rtnl_have_link_slave_info(dev)) {
 		if (rtnl_link_fill(skb, dev) < 0)
 			goto nla_put_failure;
 	}
@@ -1178,6 +1254,8 @@ EXPORT_SYMBOL(ifla_policy);
 static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
 	[IFLA_INFO_KIND]	= { .type = NLA_STRING },
 	[IFLA_INFO_DATA]	= { .type = NLA_NESTED },
+	[IFLA_INFO_SLAVE_KIND]	= { .type = NLA_STRING },
+	[IFLA_INFO_SLAVE_DATA]	= { .type = NLA_NESTED },
 };
 
 static const struct nla_policy ifla_vfinfo_policy[IFLA_VF_INFO_MAX+1] = {
@@ -1765,7 +1843,9 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
 	struct net *net = sock_net(skb->sk);
 	const struct rtnl_link_ops *ops;
+	const struct rtnl_link_ops *m_ops = NULL;
 	struct net_device *dev;
+	struct net_device *master_dev = NULL;
 	struct ifinfomsg *ifm;
 	char kind[MODULE_NAME_LEN];
 	char ifname[IFNAMSIZ];
@@ -1795,6 +1875,12 @@ replay:
 			dev = NULL;
 	}
 
+	if (dev) {
+		master_dev = netdev_master_upper_dev_get(dev);
+		if (master_dev)
+			m_ops = master_dev->rtnl_link_ops;
+	}
+
 	err = validate_linkmsg(dev, tb);
 	if (err < 0)
 		return err;
@@ -1816,7 +1902,10 @@ replay:
 	}
 
 	if (1) {
-		struct nlattr *attr[ops ? ops->maxtype + 1 : 0], **data = NULL;
+		struct nlattr *attr[ops ? ops->maxtype + 1 : 0];
+		struct nlattr *slave_attr[m_ops ? m_ops->slave_maxtype + 1 : 0];
+		struct nlattr **data = NULL;
+		struct nlattr **slave_data = NULL;
 		struct net *dest_net;
 
 		if (ops) {
@@ -1835,6 +1924,24 @@ replay:
 			}
 		}
 
+		if (m_ops) {
+			if (m_ops->slave_maxtype &&
+			    linkinfo[IFLA_INFO_SLAVE_DATA]) {
+				err = nla_parse_nested(slave_attr,
+						       m_ops->slave_maxtype,
+						       linkinfo[IFLA_INFO_SLAVE_DATA],
+						       m_ops->slave_policy);
+				if (err < 0)
+					return err;
+				slave_data = slave_attr;
+			}
+			if (m_ops->slave_validate) {
+				err = m_ops->slave_validate(tb, slave_data);
+				if (err < 0)
+					return err;
+			}
+		}
+
 		if (dev) {
 			int modified = 0;
 
@@ -1854,6 +1961,17 @@ replay:
 				modified = 1;
 			}
 
+			if (linkinfo[IFLA_INFO_SLAVE_DATA]) {
+				if (!m_ops || !m_ops->slave_changelink)
+					return -EOPNOTSUPP;
+
+				err = m_ops->slave_changelink(master_dev, dev,
+							      tb, slave_data);
+				if (err < 0)
+					return err;
+				modified = 1;
+			}
+
 			return do_setlink(dev, ifm, tb, ifname, modified);
 		}
 
-- 
1.8.3.1

^ permalink raw reply related

* [patch net-next 4/5] bonding: convert netlink to use slave data info api
From: Jiri Pirko @ 2014-01-22  8:05 UTC (permalink / raw)
  To: netdev
  Cc: davem, fubar, vfalico, andy, sfeldma, stephen, vyasevic,
	nicolas.dichtel, john.r.fastabend
In-Reply-To: <1390377957-31466-1-git-send-email-jiri@resnulli.us>

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 drivers/net/bonding/bond_main.c    |  1 -
 drivers/net/bonding/bond_netlink.c | 21 ++++++++++++++--
 drivers/net/bonding/bonding.h      |  1 -
 net/core/rtnetlink.c               | 51 --------------------------------------
 4 files changed, 19 insertions(+), 55 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 3220b48..df85cec 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3883,7 +3883,6 @@ static const struct net_device_ops bond_netdev_ops = {
 #endif
 	.ndo_add_slave		= bond_enslave,
 	.ndo_del_slave		= bond_release,
-	.ndo_get_slave		= bond_get_slave,
 	.ndo_fix_features	= bond_fix_features,
 };
 
diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index dd786a3..524d7ce 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -22,10 +22,23 @@
 #include <linux/reciprocal_div.h>
 #include "bonding.h"
 
-int bond_get_slave(struct net_device *slave_dev, struct sk_buff *skb)
+static size_t bond_get_slave_size(const struct net_device *bond_dev,
+				  const struct net_device *slave_dev)
+{
+	return nla_total_size(sizeof(u8)) +	/* IFLA_BOND_SLAVE_STATE */
+		nla_total_size(sizeof(u8)) +	/* IFLA_BOND_SLAVE_MII_STATUS */
+		nla_total_size(sizeof(u32)) +	/* IFLA_BOND_SLAVE_LINK_FAILURE_COUNT */
+		nla_total_size(MAX_ADDR_LEN) +	/* IFLA_BOND_SLAVE_PERM_HWADDR */
+		nla_total_size(sizeof(u16)) +	/* IFLA_BOND_SLAVE_QUEUE_ID */
+		nla_total_size(sizeof(u16)) +	/* IFLA_BOND_SLAVE_AD_AGGREGATOR_ID */
+		0;
+}
+
+static int bond_fill_slave_info(struct sk_buff *skb,
+				const struct net_device *bond_dev,
+				const struct net_device *slave_dev)
 {
 	struct slave *slave = bond_slave_get_rtnl(slave_dev);
-	const struct aggregator *agg;
 
 	if (nla_put_u8(skb, IFLA_BOND_SLAVE_STATE, bond_slave_state(slave)))
 		goto nla_put_failure;
@@ -45,6 +58,8 @@ int bond_get_slave(struct net_device *slave_dev, struct sk_buff *skb)
 		goto nla_put_failure;
 
 	if (slave->bond->params.mode == BOND_MODE_8023AD) {
+		const struct aggregator *agg;
+
 		agg = SLAVE_AD_INFO(slave).port.aggregator;
 		if (agg)
 			if (nla_put_u16(skb, IFLA_BOND_SLAVE_AD_AGGREGATOR_ID,
@@ -518,6 +533,8 @@ struct rtnl_link_ops bond_link_ops __read_mostly = {
 	.get_num_tx_queues	= bond_get_num_tx_queues,
 	.get_num_rx_queues	= bond_get_num_tx_queues, /* Use the same number
 							     as for TX queues */
+	.get_slave_size		= bond_get_slave_size,
+	.fill_slave_info	= bond_fill_slave_info,
 };
 
 int __init bond_netlink_init(void)
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 8a935f8..5033e83 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -432,7 +432,6 @@ int bond_sysfs_slave_add(struct slave *slave);
 void bond_sysfs_slave_del(struct slave *slave);
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev);
 int bond_release(struct net_device *bond_dev, struct net_device *slave_dev);
-int bond_get_slave(struct net_device *slave_dev, struct sk_buff *skb);
 int bond_xmit_hash(struct bonding *bond, struct sk_buff *skb, int count);
 int bond_parse_parm(const char *mode_arg, const struct bond_parm_tbl *tbl);
 int bond_parm_tbl_lookup(int mode, const struct bond_parm_tbl *tbl);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a56bccf..db6a239 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -800,28 +800,6 @@ static size_t rtnl_port_size(const struct net_device *dev)
 		return port_self_size;
 }
 
-static size_t rtnl_bond_slave_size(const struct net_device *dev)
-{
-	struct net_device *bond;
-	size_t slave_size =
-		nla_total_size(sizeof(struct nlattr)) +	/* IFLA_BOND_SLAVE */
-		nla_total_size(1) +	/* IFLA_BOND_SLAVE_STATE */
-		nla_total_size(1) +	/* IFLA_BOND_SLAVE_MII_STATUS */
-		nla_total_size(4) +	/* IFLA_BOND_SLAVE_LINK_FAILURE_COUNT */
-		nla_total_size(MAX_ADDR_LEN) +	/* IFLA_BOND_SLAVE_PERM_HWADDR */
-		nla_total_size(2) +	/* IFLA_BOND_SLAVE_QUEUE_ID */
-		nla_total_size(2) +	/* IFLA_BOND_SLAVE_AD_AGGREGATOR_ID */
-		0;
-
-	if (netif_is_bond_slave((struct net_device *)dev)) {
-		bond = netdev_master_upper_dev_get((struct net_device *)dev);
-		if (bond && bond->netdev_ops->ndo_get_slave)
-			return slave_size;
-	}
-
-	return 0;
-}
-
 static noinline size_t if_nlmsg_size(const struct net_device *dev,
 				     u32 ext_filter_mask)
 {
@@ -851,7 +829,6 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev,
 	       + rtnl_port_size(dev) /* IFLA_VF_PORTS + IFLA_PORT_SELF */
 	       + rtnl_link_get_size(dev) /* IFLA_LINKINFO */
 	       + rtnl_link_get_af_size(dev) /* IFLA_AF_SPEC */
-	       + rtnl_bond_slave_size(dev) /* IFLA_SLAVE */
 	       + nla_total_size(MAX_PHYS_PORT_ID_LEN); /* IFLA_PHYS_PORT_ID */
 }
 
@@ -949,34 +926,6 @@ static int rtnl_phys_port_id_fill(struct sk_buff *skb, struct net_device *dev)
 	return 0;
 }
 
-static size_t rtnl_bond_slave_fill(struct sk_buff *skb, struct net_device *dev)
-{
-	struct net_device *bond;
-	struct nlattr *nest;
-	int err;
-
-	if (!netif_is_bond_slave(dev))
-		return 0;
-
-	bond = netdev_master_upper_dev_get(dev);
-	if (!bond || !bond->netdev_ops->ndo_get_slave)
-		return 0;
-
-	nest = nla_nest_start(skb, IFLA_BOND_SLAVE);
-	if (!nest)
-		return -EMSGSIZE;
-
-	err = bond->netdev_ops->ndo_get_slave(dev, skb);
-	if (err) {
-		nla_nest_cancel(skb, nest);
-		return (err == -EMSGSIZE) ? err : 0;
-	}
-
-	nla_nest_end(skb, nest);
-
-	return 0;
-}
-
 static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 			    int type, u32 pid, u32 seq, u32 change,
 			    unsigned int flags, u32 ext_filter_mask)
-- 
1.8.3.1

^ permalink raw reply related

* [patch net-next 5/5] rtnetlink: remove ndo_get_slave
From: Jiri Pirko @ 2014-01-22  8:05 UTC (permalink / raw)
  To: netdev
  Cc: davem, fubar, vfalico, andy, sfeldma, stephen, vyasevic,
	nicolas.dichtel, john.r.fastabend
In-Reply-To: <1390377957-31466-1-git-send-email-jiri@resnulli.us>

The bond-specific ndo_get_slave API is no longer used and can be removed
now. Slave info is now handled in a generic way in rtnl_link_ops.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 include/linux/netdevice.h | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 83ce2ae..e985231 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -921,9 +921,6 @@ struct netdev_phys_port_id {
  * int (*ndo_del_slave)(struct net_device *dev, struct net_device *slave_dev);
  *	Called to release previously enslaved netdev.
  *
- * int (*ndo_get_slave)(struct net_device *slave_dev, struct sk_buff *skb);
- *	Called to fill netlink skb with slave info.
- *
  *      Feature/offload setting functions.
  * netdev_features_t (*ndo_fix_features)(struct net_device *dev,
  *		netdev_features_t features);
@@ -1096,8 +1093,6 @@ struct net_device_ops {
 						 struct net_device *slave_dev);
 	int			(*ndo_del_slave)(struct net_device *dev,
 						 struct net_device *slave_dev);
-	int			(*ndo_get_slave)(struct net_device *slave_dev,
-						 struct sk_buff *skb);
 	netdev_features_t	(*ndo_fix_features)(struct net_device *dev,
 						    netdev_features_t features);
 	int			(*ndo_set_features)(struct net_device *dev,
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH net-next v2 2/2] bonding: add netlink attributes to slave link dev
From: Jiri Pirko @ 2014-01-22  8:22 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek, Netdev,
	Roopa Prabhu, Shrijeet Mukherjee, Ding Tianhong
In-Reply-To: <C5AA1F6D-4FC7-42F6-A59F-A0114E1B2E8B@cumulusnetworks.com>

Tue, Jan 21, 2014 at 11:42:58PM CET, sfeldma@cumulusnetworks.com wrote:
>
>On Jan 21, 2014, at 2:00 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>
>> Tue, Jan 21, 2014 at 10:36:58PM CET, sfeldma@cumulusnetworks.com wrote:
>>> 
>>> On Jan 21, 2014, at 5:34 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>> 
>>>>> +	if (rtnl_bond_slave_fill(skb, dev))
>>>>> +		goto nla_put_failure;
>>>>> +
>>>> 
>>>> I must say I do not like this at all. This should be done in a generic
>>>> way. By a callback registered by bonding and possibly other master-slave
>>>> device types.
>>> 
>>> The bond was registered with the ndo_get_slave op.  ndo_get_slave could be used for other master-slave device types.  I’ll agree that rtnl_bond_slave_fill() could have been written more generically.  Is that the objection?
>> 
>> I think it should rather be done in rtnl_link_ops. It's the natural place
>> for this op. I have a patchset prepared. Will send it very soon.
>
>Ok, cool.
>
>Also, right now I have IFLA_SLAVE as a nest for IFLA_SLAVE_xxx attrs.  Do you think we should have a two-layer nest so we can capture other master-slave devices rather than just bond slaves?  I.e.:
>
>	IFLA_SLAVE
>		IFLA_BOND_SLAVE
>			IFLA_BOND_SLAVE_xxx
>			IFLA_BOND_SLAVE_yyy
>			IFLA_BOND_SLAVE_zzz
>		IFLA_FOO_SLAVE			// FOO is some other non-bond master
>			IFLA_FOO_SLAVE_xxx
>			IFLA_FOO_SLAVE_yyy
>			IFLA_FOO_SLAVE_zzz
>
>(Of course, slave wouldn’t be bond and foo slave at same time).

I would rather do this in the LINKINFO nest, the same way the IFLA_BOND_*
attributes are done. Please see the following patch:

http://patchwork.ozlabs.org/patch/313156/

>
>-scott

^ permalink raw reply

* [PATCH v2 1/2] can: Decrease default size of CAN_RAW socket send queue
From: Michal Sojka @ 2014-01-22  8:27 UTC (permalink / raw)
  To: linux-can; +Cc: netdev, Marc Kleine-Budde, Michal Sojka

This fixes the infamous ENOBUFS problem, which appears when an
application sends CAN frames faster than they leave the interface.

Packets to be sent can be queued at the queueing discipline. The qdisc
queue has two limits: a maximum length and a per-socket byte limit
(SO_SNDBUF). Only the latter limit can cause the sender to block. If the
maximum queue length is reached before the per-socket limit, the
application receives ENOBUFS and has no way to wait for the queue to
become free again. Since the length of the qdisc queue was set by
default to 10 packets, this is exactly what was happening.

This patch decreases the default per-socket limit to approximately 3
CAN frames and increases the length of the qdisc queue to 100 frames.
This setting allows for at least 33 CAN_RAW sockets to send
simultaneously to the same CAN interface without getting ENOBUFS
errors.

The exact maximum number of CAN frames that fit into the per-socket
limit is: 1+floor(sk_sndbuf/skb->truesize)

The calculation of the default sk_sndbuf value in the patch is only
approximate, because skb->truesize can be slightly greater than
SKB_TRUESIZE(). For example, for CAN frames on my 32 bit PowerPC
system, SKB_TRUESIZE() = 408, but skb->truesize = 448. Therefore, on
my system the per-socket limit allows 1+floor(3*408/448) =
1+floor(2.73) = 3 CAN frames to be queued.

Without this patch, the default per-socket limit is
/proc/sys/net/core/wmem_default, which is 163840 on my system. This
limit allows for queuing of 1+163840/448 = 366 CAN frames, which is
clearly more than the queue length (10 frames).
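
The arithmetic above can be reproduced with a small user-space sketch.
This is not kernel code; demo_frames_that_fit() and demo_max_senders()
are hypothetical helper names used only for illustration, and the
inputs are the numbers quoted in this message.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of frames that fit under a per-socket
 * byte limit, following 1 + floor(sk_sndbuf / skb->truesize).
 */
static size_t demo_frames_that_fit(size_t sndbuf, size_t truesize)
{
	assert(truesize > 0);
	return 1 + sndbuf / truesize;	/* integer division is the floor */
}

/*
 * Hypothetical helper: number of sockets that can each fill their
 * per-socket limit before the qdisc queue length is exceeded.
 */
static size_t demo_max_senders(size_t tx_queue_len, size_t frames_per_socket)
{
	assert(frames_per_socket > 0);
	return tx_queue_len / frames_per_socket;
}
```

With the values from my system, demo_frames_that_fit(3 * 408, 448)
gives 3, demo_frames_that_fit(163840, 448) gives 366, and
demo_max_senders(100, 3) gives 33.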

Since the per-socket limit is expressed in bytes, the number of queued
CANFD frames, which are bigger than CAN frames, may be lower. This is
not a big problem, because at least one frame can always be queued.

Changes since v1:
- Improved the commit message, added some numbers from my system.

Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
---
 drivers/net/can/dev.c | 2 +-
 net/can/raw.c         | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 1870c47..a0bce83 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -492,7 +492,7 @@ static void can_setup(struct net_device *dev)
 	dev->mtu = CAN_MTU;
 	dev->hard_header_len = 0;
 	dev->addr_len = 0;
-	dev->tx_queue_len = 10;
+	dev->tx_queue_len = 100;
 
 	/* New-style flags. */
 	dev->flags = IFF_NOARP;
diff --git a/net/can/raw.c b/net/can/raw.c
index fdda5f6..4ad0bb2 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -291,6 +291,10 @@ static int raw_init(struct sock *sk)
 {
 	struct raw_sock *ro = raw_sk(sk);
 
+	/* allow at most 3 frames to wait for transmission in socket queue */
+	sk->sk_sndbuf = 3 * SKB_TRUESIZE(sizeof(struct can_frame) +
+					 sizeof(struct can_skb_priv));
+
 	ro->bound            = 0;
 	ro->ifindex          = 0;
 
-- 
1.8.5.2


^ permalink raw reply related

* [PATCH v2 2/2] net: Make minimum SO_SNDBUF size dependent on the protocol family
From: Michal Sojka @ 2014-01-22  8:27 UTC (permalink / raw)
  To: linux-can; +Cc: netdev, Marc Kleine-Budde, Michal Sojka
In-Reply-To: <1390379257-9040-1-git-send-email-sojkam1@fel.cvut.cz>

For the CAN bus it is desirable to have a per-socket send queue limit
much smaller than for Ethernet-based protocols (see the previous patch).
This patch lowers the minimum setsockopt(SO_SNDBUF) value accepted
for PF_CAN sockets.

The value of SOCK_MIN_SNDBUF is kept unchanged, because it is only
used in two functions (sk_stream_moderate_sndbuf, tcp_out_of_memory)
that seem to be called for TCP/IP related protocols only and the
change from a constant to sk->sk_prot->min_sndbuf could have negative
performance impact.
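
The intended clamping can be sketched in user space as follows. This
is only an illustration of the logic in the sock_setsockopt() hunk of
this patch: demo_clamp_sndbuf() is a hypothetical name, and
DEMO_SOCK_MIN_SNDBUF is a placeholder value, not necessarily what the
kernel defines.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder for SOCK_MIN_SNDBUF; the real value is an assumption here. */
#define DEMO_SOCK_MIN_SNDBUF 2048u

/*
 * Hypothetical helper mirroring the SO_SNDBUF path: cap the request at
 * wmem_max, double it, then apply the protocol-specific minimum if the
 * protocol sets one, otherwise the generic minimum.
 */
static uint32_t demo_clamp_sndbuf(uint32_t val, uint32_t wmem_max,
				  uint32_t proto_min_sndbuf)
{
	uint32_t min = proto_min_sndbuf ? proto_min_sndbuf
					: DEMO_SOCK_MIN_SNDBUF;
	uint32_t doubled;

	/* Keep the doubling below from overflowing in this sketch. */
	assert(wmem_max <= UINT32_MAX / 2);

	if (val > wmem_max)
		val = wmem_max;
	doubled = val * 2;
	return doubled > min ? doubled : min;
}
```

A protocol that leaves min_sndbuf at zero keeps the old behaviour,
while a protocol with a small min_sndbuf (408 in the example from the
previous patch) can be given a correspondingly small send buffer.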

Changes since v1:
- SOCK_MIN_SNDBUF changed back to a constant.

Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
---
 include/net/sock.h | 1 +
 net/can/raw.c      | 1 +
 net/core/sock.c    | 6 ++++--
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 808cbc2..9ca3c0c 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -969,6 +969,7 @@ struct proto {
 	int			*sysctl_rmem;
 	int			max_header;
 	bool			no_autobind;
+	int			min_sndbuf;
 
 	struct kmem_cache	*slab;
 	unsigned int		obj_size;
diff --git a/net/can/raw.c b/net/can/raw.c
index 4ad0bb2..b58f53f 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -818,6 +818,7 @@ static struct proto raw_proto __read_mostly = {
 	.owner      = THIS_MODULE,
 	.obj_size   = sizeof(struct raw_sock),
 	.init       = raw_init,
+	.min_sndbuf = SKB_TRUESIZE(sizeof(struct can_frame) + sizeof(struct can_skb_priv)),
 };
 
 static const struct can_proto raw_can_proto = {
diff --git a/net/core/sock.c b/net/core/sock.c
index 0b39e7a..fddb6be 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -625,7 +625,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		    char __user *optval, unsigned int optlen)
 {
 	struct sock *sk = sock->sk;
-	int val;
+	int val, min;
 	int valbool;
 	struct linger ling;
 	int ret = 0;
@@ -681,7 +681,9 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		val = min_t(u32, val, sysctl_wmem_max);
 set_sndbuf:
 		sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
-		sk->sk_sndbuf = max_t(u32, val * 2, SOCK_MIN_SNDBUF);
+		min = sk->sk_prot->min_sndbuf ?
+			sk->sk_prot->min_sndbuf : SOCK_MIN_SNDBUF;
+		sk->sk_sndbuf = max_t(u32, val * 2, min);
 		/* Wake up sending tasks if we upped the value. */
 		sk->sk_write_space(sk);
 		break;
-- 
1.8.5.2


^ permalink raw reply related

* Re: [PATCH net-next v3] tuntap: Fix for a race in accessing numqueues
From: Jason Wang @ 2014-01-22  8:53 UTC (permalink / raw)
  To: Dominic Curran, netdev; +Cc: Maxim Krasnyansky
In-Reply-To: <1390359803-27989-1-git-send-email-dominic.curran@citrix.com>

On 01/22/2014 11:03 AM, Dominic Curran wrote:
> A patch for fixing a race between queue selection and changing queues
> was introduced in commit 92bb73ea2("tuntap: fix a possible race between
> queue selection and changing queues").
>
> The fix was to prevent the driver from re-reading the tun->numqueues
> more than once within tun_select_queue() using ACCESS_ONCE().
>
> We have been experiencing 'Divide-by-zero' errors in tun_net_xmit()
> since we moved from 3.6 to 3.10, and believe that they come from a
> similar source where the value of tun->numqueues changes to zero
> between the first and a subsequent read of tun->numqueues.
>
> The fix is a similar use of ACCESS_ONCE(), as well as a multiply
> instead of a divide in the if statement.
>
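
The idea described above can be sketched in user space like this. It
is not the driver code: demo_should_drop() is a hypothetical name, a
volatile read stands in for ACCESS_ONCE(), and the 64-bit widening of
the product is a choice made for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether to drop a packet using a single
 * snapshot of the queue count. Because the snapshot is taken once, a
 * concurrent change to zero cannot slip in between the range check and
 * the limit check, and the multiply form never divides at all.
 */
static bool demo_should_drop(uint32_t txq, uint32_t queued,
			     uint32_t tx_queue_len,
			     const volatile uint32_t *numqueues_ptr)
{
	uint32_t numqueues;

	assert(numqueues_ptr != NULL);
	numqueues = *numqueues_ptr;	/* read exactly once */

	/* Interface not attached, or queue index out of range. */
	if (txq >= numqueues)
		return true;

	/* Same test as queued >= tx_queue_len / numqueues, minus the divide. */
	return (uint64_t)queued * numqueues >= tx_queue_len;
}
```

With a snapshot of zero the first check already drops the packet, so
the second comparison is never reached with a zero queue count.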
> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com>
> ---
> V3: Rebase against net-next. Include all numqueues in function.
> V2: Use multiply instead of divide. Suggested by Eric Dumazet.
>     Fixed email address for maxk. Rebase against net tree.
> ---
>  drivers/net/tun.c |   10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> Index: net-next/drivers/net/tun.c
> ===================================================================
> --- net-next.orig/drivers/net/tun.c	2014-01-22 02:50:01.000000000 +0000
> +++ net-next/drivers/net/tun.c	2014-01-22 02:59:42.000000000 +0000
> @@ -738,15 +738,17 @@ static netdev_tx_t tun_net_xmit(struct s
>  	struct tun_struct *tun = netdev_priv(dev);
>  	int txq = skb->queue_mapping;
>  	struct tun_file *tfile;
> +	u32 numqueues = 0;
>  
>  	rcu_read_lock();
>  	tfile = rcu_dereference(tun->tfiles[txq]);
> +	numqueues = ACCESS_ONCE(tun->numqueues);
>  
>  	/* Drop packet if interface is not attached */
> -	if (txq >= tun->numqueues)
> +	if (txq >= numqueues)
>  		goto drop;
>  
> -	if (tun->numqueues == 1) {
> +	if (numqueues == 1) {
>  		/* Select queue was not called for the skbuff, so we extract the
>  		 * RPS hash and save it into the flow_table here.
>  		 */
> @@ -779,8 +781,8 @@ static netdev_tx_t tun_net_xmit(struct s
>  	/* Limit the number of packets queued by dividing txq length with the
>  	 * number of queues.
>  	 */
> -	if (skb_queue_len(&tfile->socket.sk->sk_receive_queue)
> -			  >= dev->tx_queue_len / tun->numqueues)
> +	if (skb_queue_len(&tfile->socket.sk->sk_receive_queue) * numqueues
> +			  >= dev->tx_queue_len)
>  		goto drop;
>  
>  	if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))

Acked-by: Jason Wang <jasowang@redhat.com>

^ permalink raw reply

* [PATCH] bonding: Don't allow bond devices to change network namespaces.
From: Chen Weilong @ 2014-01-22  9:16 UTC (permalink / raw)
  To: fubar, vfalico, andy, davem; +Cc: netdev

From: Weilong Chen <chenweilong@huawei.com>

Like a bridge, a bonding netdevice doesn't cross netns boundaries.

Bonding ports and the bond itself live in the same netns.
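
As a rough user-space sketch of the effect, assuming the core rejects
a namespace move with -EINVAL when the flag is set (my understanding
of dev_change_net_namespace()): struct demo_netdev,
DEMO_F_NETNS_LOCAL and demo_change_netns() are hypothetical stand-ins,
not kernel symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for NETIF_F_NETNS_LOCAL; the bit position is arbitrary here. */
#define DEMO_F_NETNS_LOCAL (1u << 0)

/* Hypothetical, heavily reduced model of a net_device. */
struct demo_netdev {
	uint32_t features;
	int netns_id;
};

/*
 * Hypothetical helper: refuse to move a namespace-local device,
 * otherwise record the new namespace.
 */
static int demo_change_netns(struct demo_netdev *dev, int new_netns_id)
{
	assert(dev != NULL);

	if (dev->features & DEMO_F_NETNS_LOCAL)
		return -EINVAL;

	dev->netns_id = new_netns_id;
	return 0;
}
```

A device set up the way bond_setup() does after this patch would take
the -EINVAL branch and stay in its original namespace.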

Signed-off-by: Weilong Chen <chenweilong@huawei.com>
---
 drivers/net/bonding/bond_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f00dd45..897d153 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3916,6 +3916,9 @@ void bond_setup(struct net_device *bond_dev)
 	 * capable
 	 */
 
+	/* Don't allow bond devices to change network namespaces. */
+	bond_dev->features |= NETIF_F_NETNS_LOCAL;
+
 	bond_dev->hw_features = BOND_VLAN_FEATURES |
 				NETIF_F_HW_VLAN_CTAG_TX |
 				NETIF_F_HW_VLAN_CTAG_RX |
-- 
1.7.12

^ permalink raw reply related

