Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next 1/5] bonding: The fail_over_mac should be set only in ACTIVE_BACKUP mode
From: Jay Vosburgh @ 2014-01-22  0:25 UTC (permalink / raw)
  To: Ding Tianhong; +Cc: Veaceslav Falico, David S. Miller, Netdev, Andy Gospodarek
In-Reply-To: <52DE4161.3050801@huawei.com>

Ding Tianhong <dingtianhong@huawei.com> wrote:

>According the bonding.txt, the option fail_over_mac only affect for
>AB mode, but in currect code, the parameter could be set to active
>or follow in every mode, this will cause bonding could not set all
>slaves of an RR or XOR mode to the same MAC address at enslavement
>time, so reset fail_over_mac to 0 if the mode is not ACTIVE_BACKUP.

	The correct way to fix this in general is to permit setting an
option at any time, but only have it take effect in active-backup mode.
This minimizes ordering requirements when setting options.

	I would instead modify the bond enslave and removal processing
to check the mode in addition to fail_over_mac when setting a slave's
MAC during enslavement.  The change active slave processing already only
calls the fail_over_mac function when in active-backup mode.  This
should also be a simpler change set.

>Fix the wrong variables for pr_err().
>
>Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>---
> drivers/net/bonding/bond_main.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 3220b48..ecff04e 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -4307,12 +4307,14 @@ static int bond_check_params(struct bond_params *params)
> 						      fail_over_mac_tbl);
> 		if (fail_over_mac_value == -1) {
> 			pr_err("Error: invalid fail_over_mac \"%s\"\n",
>-			       arp_validate == NULL ? "NULL" : arp_validate);
>+			       fail_over_mac == NULL ? "NULL" : fail_over_mac);

	This part is ok.

> 			return -EINVAL;
> 		}
>
>-		if (bond_mode != BOND_MODE_ACTIVEBACKUP)
>-			pr_warning("Warning: fail_over_mac only affects active-backup mode.\n");
>+		if (bond_mode != BOND_MODE_ACTIVEBACKUP) {
>+			pr_warning("Warning: fail_over_mac only affects active-backup mode, set it to 0.\n");
>+			fail_over_mac_value = BOND_FOM_NONE;
>+		}

	This part is not.

	I would additionally NAK patches 2, 3, and 4 (noting that 4
inhibits the change to fail_over_mac, but not the warning message that
we're changing it).

	Patch 5 is ok, although it really has nothing to do with
fail_over_mac.

	-J

> 	} else {
> 		fail_over_mac_value = BOND_FOM_NONE;
> 	}
>-- 
>1.8.0

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* Re: [PATCH net-next v2] ipv6: enable anycast addresses as source addresses for datagrams
From: Hannes Frederic Sowa @ 2014-01-22  0:16 UTC (permalink / raw)
  To: Francois-Xavier Le Bail; +Cc: netdev, David Stevens, David Miller
In-Reply-To: <1390320070-12735-1-git-send-email-fx.lebail@yahoo.com>

On Tue, Jan 21, 2014 at 05:01:10PM +0100, Francois-Xavier Le Bail wrote:
> diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
> index 5a80f15..d3a6e2d 100644
> --- a/net/ipv6/anycast.c
> +++ b/net/ipv6/anycast.c
> @@ -383,6 +383,17 @@ bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
>  	return found;
>  }
>  
> +/*	check if this anycast address is link-local on given interface or
> + *	is global
> + */
> +bool ipv6_chk_acast_addr_src(struct net *net, struct net_device *dev,
> +			     const struct in6_addr *addr)
> +{
> +	if (ipv6_addr_type(addr) & IPV6_ADDR_LINKLOCAL)
> +		return ipv6_chk_acast_dev(dev, addr);
> +	else
> +		return ipv6_chk_acast_addr(net, NULL, addr);
> +}

You need to do the check with ipv6_chk_acast_addr in both cases, as only
ipv6_chk_acast_addr does a rcu_read_lock and it is needed for the dereference
of inet6_dev (__in6_dev_get is not safe in ipv6_chk_acast_dev without
rcu_read_lock).

Otherwise I am fine with this patch, thanks!

^ permalink raw reply

* Re: [PATCH] net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer
From: Vlad Yasevich @ 2014-01-22  0:08 UTC (permalink / raw)
  To: David Miller, matija.glavinic-pecotic.ext
  Cc: linux-sctp, alexander.sverdlin, netdev
In-Reply-To: <20140121.143620.1777441472756091252.davem@davemloft.net>

On 01/21/2014 05:36 PM, David Miller wrote:
> From: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
> Date: Fri, 17 Jan 2014 08:01:24 +0100
> 
>> Implementation of (a)rwnd calculation might lead to severe performance issues
>> and associations completely stalling. These problems are described and solution
>> is proposed which improves lksctp's robustness in congestion state.
> 
> It would be fantastic if an SCTP expert would review this patch, it's
> been rotting in patchwork for 4 days.
> 

Apologies.  It's been on my to do list.

-vlad

^ permalink raw reply

* Re: r8169 won't transmit with 3.12
From: Francois Romieu @ 2014-01-21 23:36 UTC (permalink / raw)
  To: Craig Small; +Cc: Realtek linux nic maintainers, netdev
In-Reply-To: <20140121000615.GA30495@enc.com.au>

Craig Small <csmall@enc.com.au> :
[...]
> It's a new setup so it might of never worked. It's not likely to be a
> hardware problem as its three different devices.

You may check that the onboard nic device does not even work when the
extra PCI-e card aren't plugged.

> I've sent what I think you might need for starters, but if there
> is extra stuff you'd like to see, let me know.

A complete dmesg including the XID lines that the r8169 driver prints
would be welcome.

> The problem shows up the same, the TX dropped counter increments.
> I'm not sure why 42 packets made it out (or even if they really did)
> Receive works fine, I can even start up wireshark and see packets
> pass by.
> 
> eth0      Link encap:Ethernet  HWaddr 00:e0:4c:80:66:57  
>           inet addr:192.168.1.222  Bcast:192.168.1.255  Mask:255.255.255.0
>           inet6 addr: fe80::2e0:4cff:fe80:6657/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:8252 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:42 errors:0 dropped:8029 overruns:0 carrier:0

There are very few places where the r8169 driver modifies tx_dropped.

Please increase the driver verbosity with the 'msglvl' option of ethtool.

[...]
> ethtool -S shows a similar story, not sure if the rx_missed counter
> is another problem:

It seems safe to ignore as long as it does not change after startup.

-- 
Ueimor

^ permalink raw reply

* Re: [PATCH] [trivial] ixgbe: Fix format string in ixgbe_fcoe.c
From: Brown, Aaron F @ 2014-01-21 23:04 UTC (permalink / raw)
  To: standby24x7@gmail.com
  Cc: Kirsher, Jeffrey T, Brandeburg, Jesse, Allan, Bruce W,
	e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, trivial@kernel.org
In-Reply-To: <1389716082-23395-1-git-send-email-standby24x7@gmail.com>

On Wed, 2014-01-15 at 01:14 +0900, Masanari Iida wrote:
> cppcheck detected following warning in ixgbe_fcoe.c
> (warning) %d in format string (no. 1) requires 'int' but the
> argument type is 'unsigned int'.
> 
> Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>

> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
> index f58db45..0872617 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
> @@ -585,7 +585,7 @@ static int ixgbe_fcoe_dma_pool_alloc(struct ixgbe_fcoe *fcoe,
>  	struct dma_pool *pool;
>  	char pool_name[32];
>  
> -	snprintf(pool_name, 32, "ixgbe_fcoe_ddp_%d", cpu);
> +	snprintf(pool_name, 32, "ixgbe_fcoe_ddp_%u", cpu);
>  
>  	pool = dma_pool_create(pool_name, dev, IXGBE_FCPTR_MAX,
>  			       IXGBE_FCPTR_ALIGN, PAGE_SIZE);



^ permalink raw reply

* Re: [PATCH] net: Correctly sync addresses from multiple sources to single device
From: David Miller @ 2014-01-21 22:53 UTC (permalink / raw)
  To: vyasevic; +Cc: netdev, andrey.dmitrov
In-Reply-To: <1389988332-22472-1-git-send-email-vyasevic@redhat.com>

From: Vlad Yasevich <vyasevic@redhat.com>
Date: Fri, 17 Jan 2014 14:52:12 -0500

> When we have multiple devices attempting to sync the same address
> to a single destination, each device should be permitted to sync
> it once.  To accomplish this, pass the sync count of the source
> address to __hw_addr_add_ex().  'sync_cnt' tracks how many time
> a given address has been successfully synced.  If the address
> is found in the destination list, but the 'sync_cnt' of the source
> is 0, then this address has not been synced from this interface
> and we need to allow the sync operation to succeed thus incrementing
> reference counts.
> 
> Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru>
> CC: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru>
> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>

I applied this to net-next since 3.13 just got released, and it doesn't
compile.

net/core/dev_addr_lists.c: In function ‘__hw_addr_sync_one’:
net/core/dev_addr_lists.c:144:26: error: ‘struct netdev_hw_addr’ has no member named ‘sync_count’

^ permalink raw reply

* Re: [PATCH net-next 0/2] Bug Fixes for SFC driver
From: David Miller @ 2014-01-21 22:47 UTC (permalink / raw)
  To: sshah; +Cc: netdev, linux-net-drivers
In-Reply-To: <52D98860.4020402@solarflare.com>

From: Shradha Shah <sshah@solarflare.com>
Date: Fri, 17 Jan 2014 19:45:36 +0000

> I am taking over the upstream patch submission work for the 
> sfc driver from Ben Hutchings.
> 
> These patches are bug fixes to the sfc driver involving 
> replacement of the PORT RESET MC command and fixing transposed 
> ptp_{under,over}size_sync_window_statistics
> 
> The PORT_RESET bug fix is needed for all versions supporting EF10
> i.e all versions including and after 3.12.

Series applied, thanks.

^ permalink raw reply

* Re: [Patch net-next 2/6] net_sched: act: export tcf_hash_search() instead of tcf_hash_lookup()
From: David Miller @ 2014-01-21 22:44 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, jhs
In-Reply-To: <1389987427-14085-3-git-send-email-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Fri, 17 Jan 2014 11:37:03 -0800

> So that we will not expose struct tcf_common to modules.
> 
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied.

^ permalink raw reply

* Re: [Patch net-next 1/6] net_sched: act: fetch hinfo from a->ops->hinfo
From: David Miller @ 2014-01-21 22:43 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, jhs
In-Reply-To: <1389987427-14085-2-git-send-email-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Fri, 17 Jan 2014 11:37:02 -0800

> Every action ops has a pointer to hash info, so we don't need to
> hard-code it in each module.
> 
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] bonding: add netlink attributes to slave link dev
From: Scott Feldman @ 2014-01-21 22:42 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek, Netdev,
	Roopa Prabhu, Shrijeet Mukherjee, Ding Tianhong
In-Reply-To: <20140121220016.GF3015@minipsycho.orion>


On Jan 21, 2014, at 2:00 PM, Jiri Pirko <jiri@resnulli.us> wrote:

> Tue, Jan 21, 2014 at 10:36:58PM CET, sfeldma@cumulusnetworks.com wrote:
>> 
>> On Jan 21, 2014, at 5:34 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>> 
>>>> +	if (rtnl_bond_slave_fill(skb, dev))
>>>> +		goto nla_put_failure;
>>>> +
>>> 
>>> I must say I do not like this at all. This should be done in a generic
>>> way. By a callback registered by bonding and possibly other master-slave
>>> device types.
>> 
>> The bond was registered with the ndo_get_slave op.  ndo_get_slave could be used for other master-slave device types.  I’ll agree that rtnl_bond_slave_fill() could have been written more generically.  Is that the objection?
> 
> I think is should be done rather in rtnl_link_ops. It's the natural point
> for this ops. I have patchset prepared. Will send it very soon.

Ok, cool.

Also, right now I have IFLA_SLAVE as a nest for IFLA_SLAVE_xxx attrs.  Do you think we should have a two-layer nest so we can capture other master-slave devices rather than just bond slaves?  I.e.:

	IFLA_SLAVE
		IFLA_BOND_SLAVE
			IFLA_BOND_SLAVE_xxx
			IFLA_BOND_SLAVE_yyy
			IFLA_BOND_SLAVE_zzz
		IFLA_FOO_SLAVE			// FOO is some other non-bond master
			IFLA_FOO_SLAVE_xxx
			IFLA_FOO_SLAVE_yyy
			IFLA_FOO_SLAVE_zzz

(Of course, slave wouldn’t be bond and foo slave at same time).

-scott

^ permalink raw reply

* Re: [PATCH v3] sch_htb: let skb->priority refer to non-leaf class
From: David Miller @ 2014-01-21 22:36 UTC (permalink / raw)
  To: harry.mason; +Cc: sergei.shtylyov, hadi, eric.dumazet, netdev
In-Reply-To: <1389964952.4698.20.camel@azathoth.dev.smoothwall.net>

From: Harry Mason <harry.mason@smoothwall.net>
Date: Fri, 17 Jan 2014 13:22:32 +0000

> If the class in skb->priority is not a leaf, apply filters from the
> selected class, not the qdisc. This lets netfilter or user space
> partially classify the packet.
> 
> Signed-off-by: Harry Mason <harry.mason@smoothwall.net>

Can I please get some reviews of this patch?

Thanks.

^ permalink raw reply

* Re: [PATCH] net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer
From: David Miller @ 2014-01-21 22:36 UTC (permalink / raw)
  To: matija.glavinic-pecotic.ext; +Cc: linux-sctp, alexander.sverdlin, netdev
In-Reply-To: <52D8D544.5050501@nsn.com>

From: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Date: Fri, 17 Jan 2014 08:01:24 +0100

> Implementation of (a)rwnd calculation might lead to severe performance issues
> and associations completely stalling. These problems are described and solution
> is proposed which improves lksctp's robustness in congestion state.

It would be fantastic if an SCTP expert would review this patch, it's
been rotting in patchwork for 4 days.

Thanks.

^ permalink raw reply

* Re: [PATCH 53/73] drivers/isdn: delete non-required instances of include <linux/init.h>
From: David Miller @ 2014-01-21 22:35 UTC (permalink / raw)
  To: paul.gortmaker; +Cc: linux-kernel, linux-arch, isdn, mac, netdev
In-Reply-To: <1390339396-3479-54-git-send-email-paul.gortmaker@windriver.com>

From: Paul Gortmaker <paul.gortmaker@windriver.com>
Date: Tue, 21 Jan 2014 16:22:56 -0500

> None of these files are actually using any __init type directives
> and hence don't need to include <linux/init.h>.  Most are just a
> left over from __devinit and __cpuinit removal, or simply due to
> code getting copied from one driver to the next.
> 
> Cc: Karsten Keil <isdn@linux-pingi.de>
> Cc: Armin Schindler <mac@melware.de> (maintainer:ISDN SUBSYSTEM
> Cc: netdev@vger.kernel.org
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply

* Re: issues with vxlan RX checksum offload under OVS datapath
From: Joseph Gasparakis @ 2014-01-21 22:35 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Joseph Gasparakis, Pravin Shelar, Or Gerlitz, netdev
In-Reply-To: <CAJZOPZKdSmuEQv6dcuLb8E8CqBuTvpywQXZypFTp+j3Xnw2E0w@mail.gmail.com>



On Tue, 21 Jan 2014, Or Gerlitz wrote:

> On Tue, Jan 21, 2014, Joseph Gasparakis <joseph.gasparakis@intel.com> wrote:
> > On Tue, 21 Jan 2014, Or Gerlitz wrote:
> 
> >> To be a bit more precise/concrete here, do we agree that the both paths must >>    skb->encapsulation = 0;
> >> which is done now only by the non-ovs path
> 
> > Originally skb->encapsulation had (and still has) the meaning of "does
> > this skb have outer *and* (valid) inner headers? If so, it is an
> > encapsulated packet".
> > So based on this skb->encapsulation should be set as soon an (inner)
> > packet gets encapsulated and unset when decapsulation takes place, and
> > ideally this should happen in the ovs path too.
> 
> I would say critically not ideally... e.g when the packet is
> destinat-ed to a VM and goes through tap netdevice plugged to OVS --
> without this de-assignment weird things happen after the point in time
> where the ovs TX path calls dev_queue_xmit() in
> net/openvswitch/vport-netdev.c
> 
> > Together with skb->encapsulation the inner headers should be correctly set.
> 
> That's interesting point, what might break if this isn't done?

Well, the inner headers, same as the generic (outer) headers are pointers 
or offsets that are going to be added to the packet data pointer. If for 
any reason we try to access these pointers when they are invalid anything 
can go wrong... The way to indicate if they are valid or not should be 
skb->encapsulation.

> 

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] bonding: add netlink attributes to slave link dev
From: Jiri Pirko @ 2014-01-21 22:00 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek, Netdev,
	Roopa Prabhu, Shrijeet Mukherjee, Ding Tianhong
In-Reply-To: <0A99CC69-DBFF-46DF-9300-D2C6DF10A965@cumulusnetworks.com>

Tue, Jan 21, 2014 at 10:36:58PM CET, sfeldma@cumulusnetworks.com wrote:
>
>On Jan 21, 2014, at 5:34 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>
>>> +	if (rtnl_bond_slave_fill(skb, dev))
>>> +		goto nla_put_failure;
>>> +
>> 
>> I must say I do not like this at all. This should be done in a generic
>> way. By a callback registered by bonding and possibly other master-slave
>> device types.
>
>The bond was registered with the ndo_get_slave op.  ndo_get_slave could be used for other master-slave device types.  I’ll agree that rtnl_bond_slave_fill() could have been written more generically.  Is that the objection?

I think is should be done rather in rtnl_link_ops. It's the natural point
for this ops. I have patchset prepared. Will send it very soon.

>
>> I have something in mind, will try to prepare patch soon.
>
>

^ permalink raw reply

* [PATCH] checkpatch: Prefer ether_addr_copy to memcpy(foo, bar, ETH_ALEN)
From: Joe Perches @ 2014-01-21 21:51 UTC (permalink / raw)
  To: Andrew Morton, Andy Whitcroft; +Cc: David Miller, netdev, LKML

ether_addr_copy was added for kernel version 3.14.
It's slightly smaller/faster for some arches.
Encourage its use.

Signed-off-by: Joe Perches <joe@perches.com>
---
 scripts/checkpatch.pl | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index be53ba7..0ea2a1e 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -4096,6 +4096,16 @@ sub process {
 			}
 		}
 
+# Check for memcpy(foo, bar, ETH_ALEN) that could be ether_addr_copy(foo, bar)
+		if ($^V && $^V ge 5.10.0 &&
+		    $line =~ /^\+(?:.*?)\bmemcpy\s*\(\s*$FuncArg\s*,\s*$FuncArg\s*\,\s*ETH_ALEN\s*\)/s) {
+			if (WARN("PREFER_ETHER_ADDR_COPY",
+				 "Prefer ether_addr_copy() over memcpy() if the Ethernet addresses are __aligned(2)\n" . $herecurr) &&
+			    $fix) {
+				$fixed[$linenr - 1] =~ s/\bmemcpy\s*\(\s*$FuncArg\s*,\s*$FuncArg\s*\,\s*ETH_ALEN\s*\)/ether_addr_copy($2, $7)/;
+			}
+		}
+
 # typecasts on min/max could be min_t/max_t
 		if ($^V && $^V ge 5.10.0 &&
 		    defined $stat &&

^ permalink raw reply related

* Re: [PATCH net-next 00/25] bonding: introduce new option API
From: Scott Feldman @ 2014-01-21 21:51 UTC (permalink / raw)
  To: Dan Williams
  Cc: Nikolay Aleksandrov, Netdev, Andy Gospodarek, Jay Vosburgh,
	Veaceslav Falico, David S. Miller
In-Reply-To: <1390319983.26902.4.camel@dcbw.local>


On Jan 21, 2014, at 7:59 AM, Dan Williams <dcbw@redhat.com> wrote:

> On Tue, 2014-01-21 at 15:54 +0100, Nikolay Aleksandrov wrote:
>> Hi,
>> This patchset's goal is to introduce a new option API which should be used
>> to properly describe the bonding options with their mode dependcies and
>> requirements. With this patchset applied we get centralized option
>> manipulation, automatic RTNL acquire per option setting, automatic option
>> range checking, mode dependcy checking and other various flags which are
>> described in detail in patch 01's commit message and comments.
>> Also the parameter passing is changed to use a specialized structure which
>> is initialized to a value depending on the needs.
> 
> Currently userspace has to encode a lot of the same logic the kernel has
> for option validation, for example when creating a user interface for
> this stuff, you have to know that miimon and arp are incompatible, and
> that certain options are only relevant with certain bond modes, and it's
> a mess.  And this also sometimes changes when new kernel options or
> capabilities are added.

Note that if using the new netlink support for bonding via an updated ip link command, incompatible options will be caught and the command will fail.  For example, if you try to create a new bond and set both miimon and arp, the bond create will fail (and err msg logged).  Once the bond is created, again trying to set incompatible options with one command will fail in the same way.  If options are set one at a time on existing bond, then last set option takes precedence and may override earlier settings.

None of this is true if using sysfs.  So user-space tools should move to ip link (or netlink directly) rather than using sysfs for bond config.

> 
> So, is there any good way to describe the valid value ranges or
> capabilities for userspace?  One idea is to send a package of bond
> options to the kernel and see if they validate before actually applying
> them, though this only works when actually configuring the interface.
> An additional idea would be to somehow describe the available options
> (eg, value ranges for numeric values, list-of-strings for bond modes,
> etc) that the kernel supports before any bonds are even created (thus
> probably through netlink, not sysfs).
> 
> Thoughts?

^ permalink raw reply

* Re: issues with vxlan RX checksum offload under OVS datapath
From: Or Gerlitz @ 2014-01-21 21:43 UTC (permalink / raw)
  To: Joseph Gasparakis; +Cc: Pravin Shelar, Or Gerlitz, netdev
In-Reply-To: <alpine.LFD.2.03.1401211336550.16080@intel.com>

On Tue, Jan 21, 2014, Joseph Gasparakis <joseph.gasparakis@intel.com> wrote:
> On Tue, 21 Jan 2014, Or Gerlitz wrote:

>> To be a bit more precise/concrete here, do we agree that the both paths must >>    skb->encapsulation = 0;
>> which is done now only by the non-ovs path

> Originally skb->encapsulation had (and still has) the meaning of "does
> this skb have outer *and* (valid) inner headers? If so, it is an
> encapsulated packet".
> So based on this skb->encapsulation should be set as soon an (inner)
> packet gets encapsulated and unset when decapsulation takes place, and
> ideally this should happen in the ovs path too.

I would say critically not ideally... e.g when the packet is
destinat-ed to a VM and goes through tap netdevice plugged to OVS --
without this de-assignment weird things happen after the point in time
where the ovs TX path calls dev_queue_xmit() in
net/openvswitch/vport-netdev.c

> Together with skb->encapsulation the inner headers should be correctly set.

That's interesting point, what might break if this isn't done?

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] bonding: add netlink attributes to slave link dev
From: Scott Feldman @ 2014-01-21 21:36 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek, Netdev,
	Roopa Prabhu, Shrijeet Mukherjee, Ding Tianhong
In-Reply-To: <20140121133426.GC3015@minipsycho.orion>


On Jan 21, 2014, at 5:34 AM, Jiri Pirko <jiri@resnulli.us> wrote:

>> +	if (rtnl_bond_slave_fill(skb, dev))
>> +		goto nla_put_failure;
>> +
> 
> I must say I do not like this at all. This should be done in a generic
> way. By a callback registered by bonding and possibly other master-slave
> device types.

The bond was registered with the ndo_get_slave op.  ndo_get_slave could be used for other master-slave device types.  I’ll agree that rtnl_bond_slave_fill() could have been written more generically.  Is that the objection?

> I have something in mind, will try to prepare patch soon.

^ permalink raw reply

* Re: issues with vxlan RX checksum offload under OVS datapath
From: Joseph Gasparakis @ 2014-01-21 21:47 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Pravin Shelar, Or Gerlitz, Joseph Gasparakis, netdev
In-Reply-To: <CAJZOPZJ6xj=fOAY9rkXZZs99ONkHwcW8RMXb+H2FjJzV3eoAPg@mail.gmail.com>



On Tue, 21 Jan 2014, Or Gerlitz wrote:

> On Tue, Jan 21, 2014 at 7:30 PM, Pravin Shelar <pshelar@nicira.com> wrote:
> > On Sun, Jan 19, 2014 at 2:05 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> 
> >> While testing the gro udp patches over a setup with openvswitch I noted that
> >> the RX checksum offload support introduced by Joseph's commit 0afb166
> >> "vxlan: Add capability of Rx checksum offload for inner packet" works fine
> >> when you use a setup made of
> >> NIC --> IP stack --> vxlan device --> bridge --> tap
> >> but not when its
> >> NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap
> >> I narrowed it down to the fact the when going the OVS pathskb->encapsulation
> >> remains true also after the decap is done. Basically, this is the original hunk
> [...]
> >>> +       skb->encapsulation = 0;
> [...]
> 
> >> Moving this to shared code (while removing the check for
> >> vxlan->dev->features) made things to work on my setup, but this misses one
> >> of the original conditions, ideas?
> 
> > I kept csum check in vxlan-device recv path for same reason. As of now
> > there is no efficient way to get ovs-dev features.
> > May be we can cache device features in struct datapath from datapath-netdev.
> 
> To be a bit more precise/concrete here, do we agree that the both paths must do
> 
>    skb->encapsulation = 0;
> 
> which is done now only by the non-ovs path

Originally skb->encapsulation had (and still has) the meaning of "does 
this skb have outer *and* (valid) inner headers? If so, it is an 
encapsulated packet". 
So based on this skb->encapsulation should be set as soon an (inner) 
packet gets encapsulated and unset when decapsulation takes place, and 
ideally this should happen in the ovs path too. Together with 
skb->encapsulation the inner headers should be correctly set.
 
> 
> Or.
> 

^ permalink raw reply

* [PATCH 04/73] netfilter: don't use module_init/exit in core IPV4 code
From: Paul Gortmaker @ 2014-01-21 21:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, Paul Gortmaker, Pablo Neira Ayuso, Patrick McHardy,
	Jozsef Kadlecsik, David S. Miller, netfilter-devel, netdev
In-Reply-To: <1390339396-3479-1-git-send-email-paul.gortmaker@windriver.com>

The file net/ipv4/netfilter.o is created based on whether
CONFIG_NETFILTER is set.  However that is defined as a bool, and
hence this file with the core netfilter hooks will never be
modular.  So using module_init as an alias for __initcall can be
somewhat misleading.

Fix this up now, so that we can relocate module_init from
init.h into module.h in the future.  If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.  Also add an inclusion of init.h, as
that was previously implicit here in the netfilter.c file.

Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups.  As __initcall gets
mapped onto device_initcall, our use of subsys_initcall (which
seems to make sense for netfilter code) will thus change this
registration from level 6-device to level 4-subsys (i.e. slightly
earlier).  However no observable impact of that small difference
has been observed during testing, or is expected. (i.e. the
location of the netfilter messages in dmesg remains unchanged
with respect to all the other surrounding messages.)

As for the module_exit, rather than replace it with __exitcall,
we simply remove it, since it appears only UML does anything
with those, and even for UML, there is no relevant cleanup
to be done here.

Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netfilter-devel@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 net/ipv4/netfilter.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
index c3e0ade..31abf96 100644
--- a/net/ipv4/netfilter.c
+++ b/net/ipv4/netfilter.c
@@ -197,11 +197,4 @@ static int __init ipv4_netfilter_init(void)
 {
 	return nf_register_afinfo(&nf_ip_afinfo);
 }
-
-static void __exit ipv4_netfilter_fini(void)
-{
-	nf_unregister_afinfo(&nf_ip_afinfo);
-}
-
-module_init(ipv4_netfilter_init);
-module_exit(ipv4_netfilter_fini);
+device_initcall(ipv4_netfilter_init);
-- 
1.8.4.1

^ permalink raw reply related

* [PATCH 53/73] drivers/isdn: delete non-required instances of include <linux/init.h>
From: Paul Gortmaker @ 2014-01-21 21:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, Paul Gortmaker, Karsten Keil, Armin Schindler, netdev
In-Reply-To: <1390339396-3479-1-git-send-email-paul.gortmaker@windriver.com>

None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>.  Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: Armin Schindler <mac@melware.de> (maintainer:ISDN SUBSYSTEM
Cc: netdev@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 drivers/isdn/hardware/avm/avm_cs.c     | 1 -
 drivers/isdn/hardware/eicon/platform.h | 1 -
 drivers/isdn/hisax/amd7930_fn.c        | 1 -
 drivers/isdn/hisax/asuscom.c           | 1 -
 drivers/isdn/hisax/avm_a1.c            | 1 -
 drivers/isdn/hisax/avm_a1p.c           | 1 -
 drivers/isdn/hisax/avma1_cs.c          | 1 -
 drivers/isdn/hisax/bkm_a4t.c           | 1 -
 drivers/isdn/hisax/bkm_a8.c            | 1 -
 drivers/isdn/hisax/diva.c              | 1 -
 drivers/isdn/hisax/elsa.c              | 1 -
 drivers/isdn/hisax/elsa_cs.c           | 1 -
 drivers/isdn/hisax/enternow_pci.c      | 1 -
 drivers/isdn/hisax/fsm.c               | 1 -
 drivers/isdn/hisax/gazel.c             | 1 -
 drivers/isdn/hisax/hfc_2bds0.c         | 1 -
 drivers/isdn/hisax/hfc_2bs0.c          | 1 -
 drivers/isdn/hisax/hfc_pci.c           | 1 -
 drivers/isdn/hisax/hfc_sx.c            | 1 -
 drivers/isdn/hisax/hfcscard.c          | 1 -
 drivers/isdn/hisax/hscx.c              | 1 -
 drivers/isdn/hisax/icc.c               | 1 -
 drivers/isdn/hisax/ipacx.c             | 1 -
 drivers/isdn/hisax/isac.c              | 1 -
 drivers/isdn/hisax/isar.c              | 1 -
 drivers/isdn/hisax/isurf.c             | 1 -
 drivers/isdn/hisax/ix1_micro.c         | 1 -
 drivers/isdn/hisax/jade.c              | 1 -
 drivers/isdn/hisax/mic.c               | 1 -
 drivers/isdn/hisax/netjet.c            | 1 -
 drivers/isdn/hisax/niccy.c             | 1 -
 drivers/isdn/hisax/nj_s.c              | 1 -
 drivers/isdn/hisax/nj_u.c              | 1 -
 drivers/isdn/hisax/s0box.c             | 1 -
 drivers/isdn/hisax/saphir.c            | 1 -
 drivers/isdn/hisax/sedlbauer.c         | 1 -
 drivers/isdn/hisax/sedlbauer_cs.c      | 1 -
 drivers/isdn/hisax/sportster.c         | 1 -
 drivers/isdn/hisax/st5481_b.c          | 1 -
 drivers/isdn/hisax/st5481_usb.c        | 1 -
 drivers/isdn/hisax/teleint.c           | 1 -
 drivers/isdn/hisax/teles0.c            | 1 -
 drivers/isdn/hisax/teles3.c            | 1 -
 drivers/isdn/hisax/teles_cs.c          | 1 -
 drivers/isdn/hisax/telespci.c          | 1 -
 drivers/isdn/hisax/w6692.c             | 1 -
 drivers/isdn/i4l/isdnhdlc.c            | 1 -
 47 files changed, 47 deletions(-)

diff --git a/drivers/isdn/hardware/avm/avm_cs.c b/drivers/isdn/hardware/avm/avm_cs.c
index 62b8030..4332bd8 100644
--- a/drivers/isdn/hardware/avm/avm_cs.c
+++ b/drivers/isdn/hardware/avm/avm_cs.c
@@ -11,7 +11,6 @@
 
 #include <linux/module.h>
 #include <linux/kernel.h>
-#include <linux/init.h>
 #include <linux/ptrace.h>
 #include <linux/string.h>
 #include <linux/tty.h>
diff --git a/drivers/isdn/hardware/eicon/platform.h b/drivers/isdn/hardware/eicon/platform.h
index b2edb75..b034ac1 100644
--- a/drivers/isdn/hardware/eicon/platform.h
+++ b/drivers/isdn/hardware/eicon/platform.h
@@ -19,7 +19,6 @@
 #endif
 
 #include <linux/module.h>
-#include <linux/init.h>
 #include <linux/kernel.h>
 #include <linux/sched.h>
 #include <linux/skbuff.h>
diff --git a/drivers/isdn/hisax/amd7930_fn.c b/drivers/isdn/hisax/amd7930_fn.c
index 36817e0..4d437f2 100644
--- a/drivers/isdn/hisax/amd7930_fn.c
+++ b/drivers/isdn/hisax/amd7930_fn.c
@@ -58,7 +58,6 @@
 #include "isac.h"
 #include "amd7930_fn.h"
 #include <linux/interrupt.h>
-#include <linux/init.h>
 #include <linux/gfp.h>
 
 static void Amd7930_new_ph(struct IsdnCardState *cs);
diff --git a/drivers/isdn/hisax/asuscom.c b/drivers/isdn/hisax/asuscom.c
index 62f9c43..cdbbec7 100644
--- a/drivers/isdn/hisax/asuscom.c
+++ b/drivers/isdn/hisax/asuscom.c
@@ -12,7 +12,6 @@
  *
  */
 
-#include <linux/init.h>
 #include <linux/isapnp.h>
 #include "hisax.h"
 #include "isac.h"
diff --git a/drivers/isdn/hisax/avm_a1.c b/drivers/isdn/hisax/avm_a1.c
index 7dd7408..d3dbf02 100644
--- a/drivers/isdn/hisax/avm_a1.c
+++ b/drivers/isdn/hisax/avm_a1.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/avm_a1p.c b/drivers/isdn/hisax/avm_a1p.c
index bc52d54..50f1b1a 100644
--- a/drivers/isdn/hisax/avm_a1p.c
+++ b/drivers/isdn/hisax/avm_a1p.c
@@ -13,7 +13,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/avma1_cs.c b/drivers/isdn/hisax/avma1_cs.c
index baad94e..cf305b5 100644
--- a/drivers/isdn/hisax/avma1_cs.c
+++ b/drivers/isdn/hisax/avma1_cs.c
@@ -13,7 +13,6 @@
 
 
 #include <linux/kernel.h>
-#include <linux/init.h>
 #include <linux/ptrace.h>
 #include <linux/slab.h>
 #include <linux/string.h>
diff --git a/drivers/isdn/hisax/bkm_a4t.c b/drivers/isdn/hisax/bkm_a4t.c
index c360164..8ac171b 100644
--- a/drivers/isdn/hisax/bkm_a4t.c
+++ b/drivers/isdn/hisax/bkm_a4t.c
@@ -11,7 +11,6 @@
  */
 
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/bkm_a8.c b/drivers/isdn/hisax/bkm_a8.c
index dd663ea..20ad1f7 100644
--- a/drivers/isdn/hisax/bkm_a8.c
+++ b/drivers/isdn/hisax/bkm_a8.c
@@ -11,7 +11,6 @@
  */
 
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "ipac.h"
diff --git a/drivers/isdn/hisax/diva.c b/drivers/isdn/hisax/diva.c
index 4fc90de..c8d2682 100644
--- a/drivers/isdn/hisax/diva.c
+++ b/drivers/isdn/hisax/diva.c
@@ -15,7 +15,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/elsa.c b/drivers/isdn/hisax/elsa.c
index 2be1c8a..bdf952b 100644
--- a/drivers/isdn/hisax/elsa.c
+++ b/drivers/isdn/hisax/elsa.c
@@ -18,7 +18,6 @@
  *
  */
 
-#include <linux/init.h>
 #include <linux/slab.h>
 #include "hisax.h"
 #include "arcofi.h"
diff --git a/drivers/isdn/hisax/elsa_cs.c b/drivers/isdn/hisax/elsa_cs.c
index 40f6fad..ca35784 100644
--- a/drivers/isdn/hisax/elsa_cs.c
+++ b/drivers/isdn/hisax/elsa_cs.c
@@ -37,7 +37,6 @@
 
 #include <linux/module.h>
 #include <linux/kernel.h>
-#include <linux/init.h>
 #include <linux/ptrace.h>
 #include <linux/slab.h>
 #include <linux/string.h>
diff --git a/drivers/isdn/hisax/enternow_pci.c b/drivers/isdn/hisax/enternow_pci.c
index e8d431a..2fd3886 100644
--- a/drivers/isdn/hisax/enternow_pci.c
+++ b/drivers/isdn/hisax/enternow_pci.c
@@ -67,7 +67,6 @@
 #include <linux/interrupt.h>
 #include <linux/ppp_defs.h>
 #include <linux/pci.h>
-#include <linux/init.h>
 #include "netjet.h"
 
 
diff --git a/drivers/isdn/hisax/fsm.c b/drivers/isdn/hisax/fsm.c
index c7a9471..61db6fa 100644
--- a/drivers/isdn/hisax/fsm.c
+++ b/drivers/isdn/hisax/fsm.c
@@ -16,7 +16,6 @@
 
 #include <linux/module.h>
 #include <linux/slab.h>
-#include <linux/init.h>
 #include "hisax.h"
 
 #define FSM_TIMER_DEBUG 0
diff --git a/drivers/isdn/hisax/gazel.c b/drivers/isdn/hisax/gazel.c
index 35c6df6..5157ac7 100644
--- a/drivers/isdn/hisax/gazel.c
+++ b/drivers/isdn/hisax/gazel.c
@@ -11,7 +11,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/hfc_2bds0.c b/drivers/isdn/hisax/hfc_2bds0.c
index a756e5c..17fcbbd 100644
--- a/drivers/isdn/hisax/hfc_2bds0.c
+++ b/drivers/isdn/hisax/hfc_2bds0.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include "hisax.h"
diff --git a/drivers/isdn/hisax/hfc_2bs0.c b/drivers/isdn/hisax/hfc_2bs0.c
index 838531b..2e2c319 100644
--- a/drivers/isdn/hisax/hfc_2bs0.c
+++ b/drivers/isdn/hisax/hfc_2bs0.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "hfc_2bs0.h"
 #include "isac.h"
diff --git a/drivers/isdn/hisax/hfc_pci.c b/drivers/isdn/hisax/hfc_pci.c
index 4a48255..5d8a1c7 100644
--- a/drivers/isdn/hisax/hfc_pci.c
+++ b/drivers/isdn/hisax/hfc_pci.c
@@ -15,7 +15,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "hfc_pci.h"
 #include "isdnl1.h"
diff --git a/drivers/isdn/hisax/hfc_sx.c b/drivers/isdn/hisax/hfc_sx.c
index fa1fefd..99c7f09 100644
--- a/drivers/isdn/hisax/hfc_sx.c
+++ b/drivers/isdn/hisax/hfc_sx.c
@@ -11,7 +11,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "hfc_sx.h"
 #include "isdnl1.h"
diff --git a/drivers/isdn/hisax/hfcscard.c b/drivers/isdn/hisax/hfcscard.c
index 394da64..1b69f0c 100644
--- a/drivers/isdn/hisax/hfcscard.c
+++ b/drivers/isdn/hisax/hfcscard.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include <linux/isapnp.h>
 #include "hisax.h"
 #include "hfc_2bds0.h"
diff --git a/drivers/isdn/hisax/hscx.c b/drivers/isdn/hisax/hscx.c
index 3e305fe..8c237dc 100644
--- a/drivers/isdn/hisax/hscx.c
+++ b/drivers/isdn/hisax/hscx.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "hscx.h"
 #include "isac.h"
diff --git a/drivers/isdn/hisax/icc.c b/drivers/isdn/hisax/icc.c
index 51dae91..575700f 100644
--- a/drivers/isdn/hisax/icc.c
+++ b/drivers/isdn/hisax/icc.c
@@ -14,7 +14,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "icc.h"
 // #include "arcofi.h"
diff --git a/drivers/isdn/hisax/ipacx.c b/drivers/isdn/hisax/ipacx.c
index 5faa5de..d522dfb 100644
--- a/drivers/isdn/hisax/ipacx.c
+++ b/drivers/isdn/hisax/ipacx.c
@@ -11,7 +11,6 @@
  */
 #include <linux/kernel.h>
 #include <linux/slab.h>
-#include <linux/init.h>
 #include "hisax_if.h"
 #include "hisax.h"
 #include "isdnl1.h"
diff --git a/drivers/isdn/hisax/isac.c b/drivers/isdn/hisax/isac.c
index 7fdf78f..e671728 100644
--- a/drivers/isdn/hisax/isac.c
+++ b/drivers/isdn/hisax/isac.c
@@ -19,7 +19,6 @@
 #include "isdnl1.h"
 #include <linux/interrupt.h>
 #include <linux/slab.h>
-#include <linux/init.h>
 
 #define DBUSY_TIMER_VALUE 80
 #define ARCOFI_USE 1
diff --git a/drivers/isdn/hisax/isar.c b/drivers/isdn/hisax/isar.c
index f4956c7..993c6f3 100644
--- a/drivers/isdn/hisax/isar.c
+++ b/drivers/isdn/hisax/isar.c
@@ -8,7 +8,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isar.h"
 #include "isdnl1.h"
diff --git a/drivers/isdn/hisax/isurf.c b/drivers/isdn/hisax/isurf.c
index 1399ddd..0d5849f 100644
--- a/drivers/isdn/hisax/isurf.c
+++ b/drivers/isdn/hisax/isurf.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "isar.h"
diff --git a/drivers/isdn/hisax/ix1_micro.c b/drivers/isdn/hisax/ix1_micro.c
index 7ae39f5..7d3b35c 100644
--- a/drivers/isdn/hisax/ix1_micro.c
+++ b/drivers/isdn/hisax/ix1_micro.c
@@ -17,7 +17,6 @@
  * Germany
  */
 
-#include <linux/init.h>
 #include <linux/isapnp.h>
 #include "hisax.h"
 #include "isac.h"
diff --git a/drivers/isdn/hisax/jade.c b/drivers/isdn/hisax/jade.c
index e2ae787..f2c24f7 100644
--- a/drivers/isdn/hisax/jade.c
+++ b/drivers/isdn/hisax/jade.c
@@ -11,7 +11,6 @@
  */
 
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "hscx.h"
 #include "jade.h"
diff --git a/drivers/isdn/hisax/mic.c b/drivers/isdn/hisax/mic.c
index 9339867..6cf1367 100644
--- a/drivers/isdn/hisax/mic.c
+++ b/drivers/isdn/hisax/mic.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/netjet.c b/drivers/isdn/hisax/netjet.c
index 233e432..93b10a5 100644
--- a/drivers/isdn/hisax/netjet.c
+++ b/drivers/isdn/hisax/netjet.c
@@ -14,7 +14,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/niccy.c b/drivers/isdn/hisax/niccy.c
index e4c33cf..480cc66 100644
--- a/drivers/isdn/hisax/niccy.c
+++ b/drivers/isdn/hisax/niccy.c
@@ -13,7 +13,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/nj_s.c b/drivers/isdn/hisax/nj_s.c
index 32b4bbd..3c43f52 100644
--- a/drivers/isdn/hisax/nj_s.c
+++ b/drivers/isdn/hisax/nj_s.c
@@ -5,7 +5,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "isdnl1.h"
diff --git a/drivers/isdn/hisax/nj_u.c b/drivers/isdn/hisax/nj_u.c
index 4e8adbe..9f6eaab 100644
--- a/drivers/isdn/hisax/nj_u.c
+++ b/drivers/isdn/hisax/nj_u.c
@@ -5,7 +5,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "icc.h"
 #include "isdnl1.h"
diff --git a/drivers/isdn/hisax/s0box.c b/drivers/isdn/hisax/s0box.c
index 4e7d0aa..3749b09 100644
--- a/drivers/isdn/hisax/s0box.c
+++ b/drivers/isdn/hisax/s0box.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/saphir.c b/drivers/isdn/hisax/saphir.c
index 6b2d0ec..dc36783 100644
--- a/drivers/isdn/hisax/saphir.c
+++ b/drivers/isdn/hisax/saphir.c
@@ -12,7 +12,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/sedlbauer.c b/drivers/isdn/hisax/sedlbauer.c
index f16a47b..1b9459f 100644
--- a/drivers/isdn/hisax/sedlbauer.c
+++ b/drivers/isdn/hisax/sedlbauer.c
@@ -38,7 +38,6 @@
  * For example: hisaxctrl <DriverID> 9 ISAR.BIN
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "ipac.h"
diff --git a/drivers/isdn/hisax/sedlbauer_cs.c b/drivers/isdn/hisax/sedlbauer_cs.c
index 92ef62d..f738f26 100644
--- a/drivers/isdn/hisax/sedlbauer_cs.c
+++ b/drivers/isdn/hisax/sedlbauer_cs.c
@@ -37,7 +37,6 @@
 
 #include <linux/kernel.h>
 #include <linux/module.h>
-#include <linux/init.h>
 #include <linux/ptrace.h>
 #include <linux/slab.h>
 #include <linux/string.h>
diff --git a/drivers/isdn/hisax/sportster.c b/drivers/isdn/hisax/sportster.c
index 18cee63..943a186 100644
--- a/drivers/isdn/hisax/sportster.c
+++ b/drivers/isdn/hisax/sportster.c
@@ -12,7 +12,6 @@
  *
  *
  */
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/st5481_b.c b/drivers/isdn/hisax/st5481_b.c
index 4098491..ca141d6 100644
--- a/drivers/isdn/hisax/st5481_b.c
+++ b/drivers/isdn/hisax/st5481_b.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include <linux/gfp.h>
 #include <linux/usb.h>
 #include <linux/netdevice.h>
diff --git a/drivers/isdn/hisax/st5481_usb.c b/drivers/isdn/hisax/st5481_usb.c
index ead0a4f..ee78014 100644
--- a/drivers/isdn/hisax/st5481_usb.c
+++ b/drivers/isdn/hisax/st5481_usb.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include <linux/usb.h>
 #include <linux/slab.h>
 #include "st5481.h"
diff --git a/drivers/isdn/hisax/teleint.c b/drivers/isdn/hisax/teleint.c
index bf64754..d947abc 100644
--- a/drivers/isdn/hisax/teleint.c
+++ b/drivers/isdn/hisax/teleint.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hfc_2bs0.h"
diff --git a/drivers/isdn/hisax/teles0.c b/drivers/isdn/hisax/teles0.c
index ce9eabd..bd439a9 100644
--- a/drivers/isdn/hisax/teles0.c
+++ b/drivers/isdn/hisax/teles0.c
@@ -15,7 +15,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isdnl1.h"
 #include "isac.h"
diff --git a/drivers/isdn/hisax/teles3.c b/drivers/isdn/hisax/teles3.c
index 38fb2c1..c608d80 100644
--- a/drivers/isdn/hisax/teles3.c
+++ b/drivers/isdn/hisax/teles3.c
@@ -13,7 +13,6 @@
  *              Beat Doebeli
  *
  */
-#include <linux/init.h>
 #include <linux/isapnp.h>
 #include "hisax.h"
 #include "isac.h"
diff --git a/drivers/isdn/hisax/teles_cs.c b/drivers/isdn/hisax/teles_cs.c
index b8dd149..c0e532e 100644
--- a/drivers/isdn/hisax/teles_cs.c
+++ b/drivers/isdn/hisax/teles_cs.c
@@ -18,7 +18,6 @@
 
 #include <linux/kernel.h>
 #include <linux/module.h>
-#include <linux/init.h>
 #include <linux/ptrace.h>
 #include <linux/slab.h>
 #include <linux/string.h>
diff --git a/drivers/isdn/hisax/telespci.c b/drivers/isdn/hisax/telespci.c
index 33eeb46..f9b2169 100644
--- a/drivers/isdn/hisax/telespci.c
+++ b/drivers/isdn/hisax/telespci.c
@@ -12,7 +12,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "isac.h"
 #include "hscx.h"
diff --git a/drivers/isdn/hisax/w6692.c b/drivers/isdn/hisax/w6692.c
index a858955..7e0c60c 100644
--- a/drivers/isdn/hisax/w6692.c
+++ b/drivers/isdn/hisax/w6692.c
@@ -10,7 +10,6 @@
  *
  */
 
-#include <linux/init.h>
 #include "hisax.h"
 #include "w6692.h"
 #include "isdnl1.h"
diff --git a/drivers/isdn/i4l/isdnhdlc.c b/drivers/isdn/i4l/isdnhdlc.c
index 027d1c5..5daace1 100644
--- a/drivers/isdn/i4l/isdnhdlc.c
+++ b/drivers/isdn/i4l/isdnhdlc.c
@@ -23,7 +23,6 @@
  */
 
 #include <linux/module.h>
-#include <linux/init.h>
 #include <linux/crc-ccitt.h>
 #include <linux/isdn/hdlc.h>
 #include <linux/bitrev.h>
-- 
1.8.4.1

^ permalink raw reply related

* [PATCH RFC 00/73] tree-wide: clean up some no longer required #include <linux/init.h>
From: Paul Gortmaker @ 2014-01-21 21:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, Paul Gortmaker, linux-alpha, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linuxppc-dev, linux-s390,
	sparclinux, x86, netdev, kvm, sfr, rusty, gregkh, akpm, torvalds

TL;DR - We removed cpuinit and devinit, which left ~2000 instances of
include <linux/init.h> that were no longer needed.  To fully enable
this removal/cleanup, we relocate module_init() from init.h into
module.h.  Multi arch/multi config build testing on linux-next has
been used to find and fix any implicit header dependencies prior to
deploying the actual init.h --> module.h move, to preserve bisection.

Additional details beyond TL;DR:

module_init/module_exit and friends moved to module.h
=====================================================
Aside from enabling this init.h cleanup to extend into modular files,
it actually does make sense.  For all modules will use some form of
our initfunc processing/categorization, but not all initfunc users
will be necessarily using modular functionality.  So we move these
module related macros to module.h and ensure module.h sources init.h


module_init in non modular code:
================================
This series uncovered that we are enabling people to use module_init
in non-modular code.  While that works fine, there are at least three
reasons why it probably should not be encouraged:

 1) it makes a casual reader of the code assume the code is modular
    even though it is obj-y (builtin) or controlled by a bool Kconfig.

 2) it makes it too easy to add dead code in a function that is handed
    to module_exit() -- [more on that below]

 3) it breaks our ability to use priority sorted initcalls properly
    [more on that below.]

After this change, a new coder who tries to make use of module_init in
non modular code would find themselves also needing to include the
module.h header.  At which point the odds are greater that they would
ask themselves "Am I doing this right?  I shouldn't need this."

Note that existing non-modular code that already includes module.h and
uses module_init doesn't get fixed here, since they already build w/o
errors triggered by this change; we'll have to hunt them down later.


module_init and initcall ordering:
==================================
We have a group of about ten priority sorted initcalls, that are
called in init/main.c after most of the hard coded/direct calls
have been processed.  These serve the purpose of avoiding everyone
poking at init/main.c to hook in their init sequence.  The bins are:

        pure_initcall               0
        core_initcall               1
        postcore_initcall           2
        arch_initcall               3
        subsys_initcall             4
        fs_initcall                 5
        device_initcall             6
        late_initcall               7

These are meant to eventually replace users of the non specific
priority "__initcall" which currently maps onto device_initcall.
This is of interest, because in non-modular code, cpp does this:

    module_init -->  __initcall --> device_initcall

So all module_init() land in the device_initcall bucket, rather late
in the sequence.  That makes sense, since if it was a module, the init
could be real late (days, weeks after boot).  But now imagine you add
support for some non-modular bus/arch/infrastructure (say for e.g. PCI)
and you use module_init for it.  That means anybody else who wants
to use your subsystem can't do so if they use an initcall of 0 --> 5
priority.  For a real world example of this, see patch #1 in this series:

	https://lkml.org/lkml/2014/1/14/809

We don't want to force code that is clearly arch or subsys or fs
specific to have to use the device_initcall just because something
else has been mistakenly put (or left) in that bucket.  So a couple of
changes do actually change the initcall level where it is inevitably
appropriate to do so.  Those are called out explicitly in their
respective commit logs.


module_exit and dead code
=========================
Built in code will never have an opportunity to call functions that
are registered with module_exit(), so any cases of that uncovered in
this series delete that dead code.  Note that any built-in code that
was already including module.h and using module_exit won't have shown
up as breakage on the build coverage of this series, so we'll have to
find those independently later.  It looks like there may be quite a
few that are invisibly created via module_platform_driver -- a macro
that creates module_init and module_exit automatically.  We may want
to consider relocating module_platform_driver into module.h later...


cpuinit
=======
To finalize the removal of cpuinit, which was done several releases
ago, we remove the remaining stub functions from init.h in this
series.  We've seen one or two "users" try to creep back in, so this
will close the door on that chapter and prevent creep.


When, what and where?
=====================
When: Ideally, barring any objections or massive oversights on my
part, this will go in at or around rc1, i.e. in about 2wks.  In the
meantime I will continue daily re-test on linux-next across ~10 different
arch, using allyesconfig, allmodconfig and arch specific defconfigs
for things like mips/arm/powerpc; as I have been doing for a while.

Where: This work exists as a queue of patches that I apply to
linux-next; since the changes are fixing some things that currently
can only be found there.  The patch series can be found at:

   http://git.kernel.org/cgit/linux/kernel/git/paulg/init.git
   git://git.kernel.org/pub/scm/linux/kernel/git/paulg/init.git

The patches are not in strict chronological order, since when I've
found a header change causes a build regression that is due to an
implicit dependency/inclusion, I place the dependency fix before the
header change that caused it, so that bisection is preserved.

I've avoided annoying Stephen with another queue of patches for
linux-next while the development content was in flux, but now that
the merge window has opened, and new additions are fewer, perhaps he
wouldn't mind tacking it on the end...  Stephen?

In order to reduce the size of the overall queue here, I have already
put some dependency-free changes through maintainer trees after
re-testing them on whatever their development baseline was.  That made
sense for the larger ones (drivers/[net,usb,input] some arch trees...)
and for the kernel/ mm/ and fs/ ones where the changes were less
trivial and an earlier review was desired. But that independent treatment
doesn't scale for handling all the commits -- hence ~1400 of the
full ~2k of init.h removals remain here in this series.

What: The audit for removal of extra init.h lines has covered
drivers/, all of the main architectures (and some of the more fringe
ones), and core dirs like mm/ fs/ and kernel/ too.  The removals from
include/ itself are probably the most valuable, in terms of reducing
the amount of stuff we needlessly feed CPP.  There is probably more
fringe ones to be found, but this covers the majority of them.
Additional ones can be fed in later (through the trivial tree perhaps)
as desired.

Build coverage (from memory) has included, but is not limited to:

  allyesconfig, allmodconfig:
	x86, x86_64, ia64, s390, arm, mips, sparc, powerpc
  arch specifc arch/<name>/config/*config files:
	arm, mips, powerpc
  defconfig:
	(all of the above), c6x, parisc, uml, tile, c6x, blackfin, ...

and it will continue to take place for the next ~2wks, until I can
reliably apply the queue to master and submit a pull request.

Thanks for reading this far, and thanks to those who have merged init.h
cleanup commits already!  Additional comments, reviews and acks welcomed.

Paul.
---

Cc: linux-alpha@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: x86@kernel.org
Cc: netdev@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: sfr@canb.auug.org.au
Cc: rusty@rustcorp.com.au
Cc: gregkh@linuxfoundation.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org

Paul Gortmaker (73):
  init: delete the __cpuinit related stubs
  mm: replace module_init usages with subsys_initcall in nommu.c
  fs/notify: don't use module_init for non-modular inotify_user code
  netfilter: don't use module_init/exit in core IPV4 code
  x86: don't use module_init in non-modular intel_mid_vrtc.c
  x86: don't use module_init for non-modular core bootflag code
  x86: replace __init_or_module with __init in non-modular vsmp_64.c
  drivers/tty/hvc: don't use module_init in non-modular hyp. console code
  staging: don't use module_init in non-modular ion_dummy_driver.c
  powerpc: use device_initcall for registering rtc devices
  powerpc: book3s KVM can be modular so it should use module.h
  powerpc: kvm e500/44x is not modular, so don't use module_init
  powerpc: use subsys_initcall for Freescale Local Bus
  powerpc: don't use module_init for non-modular core hugetlb code
  powerpc: don't use module_init in non-modular 83xx suspend code
  arm: include module.h in drivers/bus/omap_l3_smx.c
  arm: fix implicit module.h use in mach-at91 gpio.h
  arm: fix implicit #include <linux/init.h> in entry asm.
  arm: mach-s3c64xx mach-crag6410-module.c is not modular
  arm: use subsys_initcall in non-modular pl320 IPC code
  arm: don't use module_init in non-modular mach-vexpress/spc.c code
  alpha: don't use module_init for non-modular core code
  sparc: don't use module_init in non-modular pci.c code
  m68k: don't use module_init in non-modular mvme16x/rtc.c code
  ia64: don't use module_init for non-modular core kernel/mca.c code
  ia64: don't use module_init in non-modular sim/simscsi.c code
  drivers/clk: don't use module_init in clk-nomadik.c which is non-modular
  cpuidle: don't use modular platform register in non-modular ARM drivers
  drivers/platform: don't use modular register in non-modular pdev_bus.c
  drivers/i2c: busses/i2c-acorn.c is tristate and should use module.h
  module: relocate module_init from init.h to module.h
  logo: emit "#include <linux/init.h> in autogenerated C file
  arm: delete non-required instances of include <linux/init.h>
  mips: delete non-required instances of include <linux/init.h>
  sparc: delete non-required instances of include <linux/init.h>
  s390: delete non-required instances of include <linux/init.h>
  alpha: delete non-required instances of <linux/init.h>
  blackfin: delete non-required instances of <linux/init.h>
  powerpc: delete another unrequired instance of <linux/init.h>
  watchdog: delete non-required instances of include <linux/init.h>
  video: delete non-required instances of include <linux/init.h>
  rtc: delete non-required instances of include <linux/init.h>
  scsi: delete non-required instances of include <linux/init.h>
  spi: delete non-required instances of include <linux/init.h>
  acpi: delete non-required instances of include <linux/init.h>
  drivers/power: delete non-required instances of include <linux/init.h>
  drivers/media: delete non-required instances of include <linux/init.h>
  drivers/ata: delete non-required instances of include <linux/init.h>
  drivers/mtd: delete non-required instances of include <linux/init.h>
  drivers/hwmon: delete non-required instances of include <linux/init.h>
  drivers/i2c: delete non-required instances of include <linux/init.h>
  drivers/pinctrl: delete non-required instances of include <linux/init.h>
  drivers/isdn: delete non-required instances of include <linux/init.h>
  drivers/leds: delete non-required instances of include <linux/init.h>
  drivers/pcmcia: delete non-required instances of include <linux/init.h>
  drivers/char: delete non-required instances of include <linux/init.h>
  drivers/infiniband: delete non-required instances of include <linux/init.h>
  drivers/mfd: delete non-required instances of include <linux/init.h>
  drivers/gpio: delete non-required instances of include <linux/init.h>
  drivers/bluetooth: delete non-required instances of include <linux/init.h>
  drivers/mmc: delete non-required instances of include <linux/init.h>
  drivers/crypto: delete non-required instances of include <linux/init.h>
  drivers/platform: delete non-required instances of include <linux/init.h>
  drivers/misc: delete non-required instances of include <linux/init.h>
  drivers/edac: delete non-required instances of include <linux/init.h>
  drivers/macintosh: delete non-required instances of include <linux/init.h>
  drivers/base: delete non-required instances of include <linux/init.h>
  drivers/cpufreq: delete non-required instances of <linux/init.h>
  drivers/pci: delete non-required instances of <linux/init.h>
  drivers/dma: delete non-required instances of <linux/init.h>
  drivers/gpu: delete non-required instances of <linux/init.h>
  drivers: delete remaining non-required instances of <linux/init.h>
  include: remove needless instances of <linux/init.h>

 arch/alpha/kernel/err_ev6.c                        |  1 -
 arch/alpha/kernel/irq.c                            |  1 -
 arch/alpha/kernel/srmcons.c                        |  3 +-
 arch/alpha/kernel/traps.c                          |  1 -
 arch/alpha/oprofile/op_model_ev4.c                 |  1 -
 arch/alpha/oprofile/op_model_ev5.c                 |  1 -
 arch/alpha/oprofile/op_model_ev6.c                 |  1 -
 arch/alpha/oprofile/op_model_ev67.c                |  1 -
 arch/arm/common/dmabounce.c                        |  1 -
 arch/arm/firmware/trusted_foundations.c            |  1 -
 arch/arm/include/asm/arch_timer.h                  |  1 -
 arch/arm/kernel/entry-armv.S                       |  2 +
 arch/arm/kernel/entry-header.S                     |  1 -
 arch/arm/kernel/hyp-stub.S                         |  1 -
 arch/arm/kernel/suspend.c                          |  1 -
 arch/arm/kernel/unwind.c                           |  1 -
 arch/arm/mach-at91/include/mach/gpio.h             |  1 +
 arch/arm/mach-cns3xxx/pm.c                         |  1 -
 arch/arm/mach-exynos/headsmp.S                     |  1 -
 arch/arm/mach-footbridge/personal.c                |  1 -
 arch/arm/mach-imx/headsmp.S                        |  1 -
 arch/arm/mach-imx/iomux-v3.c                       |  1 -

 [.... snip ~1300 lines ...]

 drivers/watchdog/stmp3xxx_rtc_wdt.c                |  1 -
 drivers/watchdog/wdt_pci.c                         |  1 -
 drivers/xen/xen-stub.c                             |  1 -
 fs/notify/inotify/inotify_user.c                   |  4 +-
 include/drm/drmP.h                                 |  2 +-
 include/linux/fb.h                                 |  1 -
 include/linux/ide.h                                |  1 -
 include/linux/init.h                               | 77 ----------------------
 include/linux/kdb.h                                |  1 -
 include/linux/linux_logo.h                         |  3 -
 include/linux/lsm_audit.h                          |  1 -
 include/linux/module.h                             | 72 ++++++++++++++++++++
 include/linux/moduleparam.h                        |  1 -
 include/linux/netfilter.h                          |  1 -
 include/linux/nls.h                                |  2 +-
 include/linux/percpu_ida.h                         |  1 -
 include/linux/profile.h                            |  1 -
 include/linux/pstore_ram.h                         |  1 -
 include/linux/usb/gadget.h                         |  1 -
 include/linux/zorro.h                              |  1 -
 include/xen/xenbus.h                               |  1 -
 mm/nommu.c                                         |  4 +-
 net/ipv4/netfilter.c                               |  9 +--
 scripts/pnmtologo.c                                |  1 +
 scripts/tags.sh                                    |  2 +-
 1254 files changed, 131 insertions(+), 1431 deletions(-)
 mode change 100755 => 100644 scripts/tags.sh

-- 
1.8.4.1

^ permalink raw reply

* Re: Fwd: [RFC PATCH net-next 0/3] virtio_net: add aRFS support
From: Stefan Hajnoczi @ 2014-01-20 16:49 UTC (permalink / raw)
  To: Jason Wang
  Cc: Zhi Yong Wu, Linux Netdev List, Tom Herbert, Eric Dumazet,
	David S. Miller, Zhi Yong Wu, Michael S. Tsirkin, Rusty Russell
In-Reply-To: <52D89DA0.1020804@redhat.com>

On Fri, Jan 17, 2014 at 11:04:00AM +0800, Jason Wang wrote:
> On 01/16/2014 04:52 PM, Stefan Hajnoczi wrote:
> > On Thu, Jan 16, 2014 at 04:34:10PM +0800, Zhi Yong Wu wrote:
> >> CC: stefanha, MST, Rusty Russel
> >>
> >> ---------- Forwarded message ----------
> >> From: Jason Wang <jasowang@redhat.com>
> >> Date: Thu, Jan 16, 2014 at 12:23 PM
> >> Subject: Re: [RFC PATCH net-next 0/3] virtio_net: add aRFS support
> >> To: Zhi Yong Wu <zwu.kernel@gmail.com>
> >> Cc: netdev@vger.kernel.org, therbert@google.com, edumazet@google.com,
> >> davem@davemloft.net, Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> >>
> >>
> >> On 01/15/2014 10:20 PM, Zhi Yong Wu wrote:
> >>> From: Zhi Yong Wu<wuzhy@linux.vnet.ibm.com>
> >>>
> >>> HI, folks
> >>>
> >>> The patchset is trying to integrate aRFS support to virtio_net. In this case,
> >>> aRFS will be used to select the RX queue. To make sure that it's going ahead
> >>> in the correct direction, although it is still one RFC and isn't tested, it's
> >>> post out ASAP. Any comment are appreciated, thanks.
> >>>
> >>> If anyone is interested in playing with it, you can get this patchset from my
> >>> dev git on github:
> >>>    git://github.com/wuzhy/kernel.git virtnet_rfs
> >>>
> >>> Zhi Yong Wu (3):
> >>>    virtio_pci: Introduce one new config api vp_get_vq_irq()
> >>>    virtio_net: Introduce one dummy function virtnet_filter_rfs()
> >>>    virtio-net: Add accelerated RFS support
> >>>
> >>>   drivers/net/virtio_net.c      |   67 ++++++++++++++++++++++++++++++++++++++++-
> >>>   drivers/virtio/virtio_pci.c   |   11 +++++++
> >>>   include/linux/virtio_config.h |   12 +++++++
> >>>   3 files changed, 89 insertions(+), 1 deletions(-)
> >>>
> >> Please run get_maintainter.pl before sending the patch. You'd better
> >> at least cc virtio maintainer/list for this.
> >>
> >> The core aRFS method is a noop in this RFC which make this series no
> >> much sense to discuss. You should at least mention the big picture
> >> here in the cover letter. I suggest you should post a RFC which can
> >> run and has expected result or you can just raise a thread for the
> >> design discussion.
> >>
> >> And this method has been discussed before, you can search "[net-next
> >> RFC PATCH 5/5] virtio-net: flow director support" in netdev archive
> >> for a very old prototype implemented by me. It can work and looks like
> >> most of this RFC have already done there.
> >>
> >> A basic question is whether or not we need this, not all the mq cards
> >> use aRFS (see ixgbe ATR). And whether or not it can bring extra
> >> overheads? For virtio, we want to reduce the vmexits as much as
> >> possible but this aRFS seems introduce a lot of more of this. Making a
> >> complex interfaces just for an virtual device may not be good, simple
> >> method may works for most of the cases.
> >>
> >> We really should consider to offload this to real nic. VMDq and L2
> >> forwarding offload may help in this case.
> > Zhi Yong and I had an IRC chat.  I wanted to post my questions on the
> > list - it's still the same concern I had in the old email thread that
> > Jason mentioned.
> >
> > In order for virtio-net aRFS to make sense there needs to be an overall
> > plan for pushing flow mapping information down to the physical NIC.
> > That's the only way to actually achieve the benefit of steering:
> > processing the packet on the CPU where the application is running.
> 
> There are several places that the packet needs to be processing:
> 1) If you mean send the interrupt of virtio-net to the vcpu that the
> application is running, current virito-net has already had the ability
> with the help of flow caches of tun and the vcpu private queue
> configuration of XPS and irq affinity hint which is automatically done
> through the driver itself.
> 2) If you mean send the interrupt or doing the softirq of the host card
> to the cpu where vhost thread is running, recent RFS support of tun can
> do this.
> 3) And about the affinity of vhost thread and vcpu thread, this seems
> beyond the scope of aRFS.

I guess you're right, the vhost thread affinity is a factor if we're
trying to take the interrupt on the host CPU that will eventually
execute the vcpu.  It's beyond the scope of this effort but something
that would be nice to eventually solve.

Stefan

^ permalink raw reply

* Re: issues with vxlan RX checksum offload under OVS datapath
From: Or Gerlitz @ 2014-01-21 20:55 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Or Gerlitz, Joseph Gasparakis, netdev
In-Reply-To: <CALnjE+rUTejbU_d39G7WSn=k0aY2xELCE3UxjDr_Zdq3iwHbNA@mail.gmail.com>

On Tue, Jan 21, 2014 at 7:30 PM, Pravin Shelar <pshelar@nicira.com> wrote:
> On Sun, Jan 19, 2014 at 2:05 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:

>> While testing the gro udp patches over a setup with openvswitch I noted that
>> the RX checksum offload support introduced by Joseph's commit 0afb166
>> "vxlan: Add capability of Rx checksum offload for inner packet" works fine
>> when you use a setup made of
>> NIC --> IP stack --> vxlan device --> bridge --> tap
>> but not when its
>> NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap
>> I narrowed it down to the fact the when going the OVS pathskb->encapsulation
>> remains true also after the decap is done. Basically, this is the original hunk
[...]
>>> +       skb->encapsulation = 0;
[...]

>> Moving this to shared code (while removing the check for
>> vxlan->dev->features) made things to work on my setup, but this misses one
>> of the original conditions, ideas?

> I kept csum check in vxlan-device recv path for same reason. As of now
> there is no efficient way to get ovs-dev features.
> May be we can cache device features in struct datapath from datapath-netdev.

To be a bit more precise/concrete here, do we agree that the both paths must do

   skb->encapsulation = 0;

which is done now only by the non-ovs path

Or.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox