Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net] netem: update backlog after drop
From: David Miller @ 2013-10-11 21:31 UTC (permalink / raw)
  To: stephen; +Cc: eric.dumazet, netdev
In-Reply-To: <20131006151533.52988624@nehalam.linuxnetplumber.net>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Sun, 6 Oct 2013 15:15:33 -0700

> When packet is dropped from rb-tree netem the backlog statistic should
> also be updated.
> 
> Reported-by: Сергеев Сергей <adron@yapic.net>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Applied.

^ permalink raw reply

* Re: [PATCH net] netem: free skb's in tree on reset
From: David Miller @ 2013-10-11 21:31 UTC (permalink / raw)
  To: stephen; +Cc: eric.dumazet, netdev
In-Reply-To: <20131006151649.38038c0e@nehalam.linuxnetplumber.net>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Sun, 6 Oct 2013 15:16:49 -0700

> Netem can leak memory because packets get stored in red-black
> tree and it is not cleared on reset.
> 
> Reported by: Сергеев Сергей <adron@yapic.net>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] tcp: tcp_transmit_skb() optimizations
From: David Miller @ 2013-10-11 21:48 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, fitz, ycheng, ncardwell
In-Reply-To: <1381419780.4971.84.camel@edumazet-glaptop.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 10 Oct 2013 08:43:00 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> 1) We need to take a timestamp only for skb that should be cloned.
> 
> Other skbs are not in write queue and no rtt estimation is done on them.
> 
> 2) the unlikely() hint is wrong for receivers (they send pure ACK)
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

This patch is correct, so I've applied it, but it points out a bug.

The __tcp_retransmit_skb() code that does a __pskb_copy() to handle
NET_IP_ALIGN violations and skb_headroom() overflows is buggy because
it needs to store a congestion control timestamp in the original 'skb'
since that's what we'll look at in the retransmit queue.

^ permalink raw reply

* Re: [PATCH] ipv6: Initialize ip6_tnl.hlen in gre tunnel even if no route is found
From: David Miller @ 2013-10-11 21:55 UTC (permalink / raw)
  To: hannes; +Cc: ou.ghorbel, kuznet, jmorris, yoshfuji, kaber, netdev,
	linux-kernel
In-Reply-To: <20131011150216.GA18601@order.stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Fri, 11 Oct 2013 17:02:17 +0200

> On Thu, Oct 10, 2013 at 06:50:27PM +0100, Oussama Ghorbel wrote:
>> The ip6_tnl.hlen (gre and ipv6 headers length) is independent from the
>> outgoing interface, so it would be better to initialize it even when no
>> route is found, otherwise its value will be zero.
>> While I'm not sure if this could happen in real life, but doing that
>> will avoid to call the skb_push function with a zero in ip6gre_header
>> function.
>> 
>> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Signed-off-by: Oussama Ghorbel <ou.ghorbel@gmail.com>
> 
> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Applied.

^ permalink raw reply

* Re: [patch] farsync: fix info leak in ioctl
From: David Miller @ 2013-10-11 21:55 UTC (permalink / raw)
  To: dan.carpenter; +Cc: kevin.curtis, speiro, security, netdev
In-Reply-To: <20131011095003.GD6247@mwanda>

From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Fri, 11 Oct 2013 12:50:03 +0300

> From: Salva Peiró <speiro@ai2.upv.es>
> 
> The fst_get_iface() code fails to initialize the two padding bytes of
> struct sync_serial_settings after the ->loopback member. Add an explicit
> memset(0) before filling the structure to avoid the info leak.
> 
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH 14/14] net: smc91x: dont't use SMC_outw for fixing up halfword-aligned data
From: David Miller @ 2013-10-11 21:56 UTC (permalink / raw)
  To: matthew.leach
  Cc: linux-arm-kernel, ankit.jindal, steve.mcintyre, tushar.jagad,
	will.deacon, catalin.marinas, netdev, nico
In-Reply-To: <1381499540-28794-15-git-send-email-matthew.leach@arm.com>

From: Matthew Leach <matthew.leach@arm.com>
Date: Fri, 11 Oct 2013 14:52:20 +0100

> From: Will Deacon <will.deacon@arm.com>
> 
> SMC_outw invokes an endian-aware I/O accessor, which may change the data
> endianness before writing to the device. This is not suitable for data
> transfers where the memory buffer is simply a string of bytes that does
> not require any byte-swapping.
> 
> This patches fixes the smc91x SMC_PUSH_DATA macro so that it uses the
> string I/O accessor for outputting the leading or trailing halfwords on
> halfword-aligned buffers.
> 
> Cc: <netdev@vger.kernel.org>
> Cc: Nicolas Pitre <nico@fluxnic.net>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister
From: Jesse Gross @ 2013-10-11 22:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev, Jiri Pirko,
	David S. Miller
In-Reply-To: <CAMEtUuwAJr0uMv4b_SfYG378p06os3bvT3kyvVGpKguAZbNcSw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Fri, Oct 11, 2013 at 1:03 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote:
> On Fri, Oct 11, 2013 at 11:11 AM, Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote:
>> On Thu, Oct 10, 2013 at 9:48 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote:
>>> On Thu, Oct 10, 2013 at 8:56 PM, Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote:
>>>> However, the check dev->reg_state in netdev_destroy() looks racy to
>>>> me, as it could already be in NETREG_UNREGISTERED even if we already
>>>> processed this device.
>>>
>>> you mean that netdev_destroy() will see reg_state == netreg_unregistered,
>>> while dp_device_event() didn't see reg_state == netreg_unregistering yet?
>>> or dp_device_event() saw it, proceeded to do unlink and
>>> netdev_destroy() ran in parallel?
>>> well, that's why reg_state == netreg_unregistering check in netdev_destroy()
>>> is done with rtnl_lock() held.
>>> reg_state cannot go into netreg_unregistered state skipping
>>> netreg_unregistering and notifier.
>>> therefore I don't think it's racy.
>>>
>>> In ovs_dp_notify_wq() you're checking for both unregistering and
>>> unregistered and that makes
>>> sense, since workq can run after unregistering notifier called and
>>> netdev_run_todo()
>>> already changed the state to unregistered.
>>> But here it's not the case.
>>
>> ovs_dp_notify_wq() calls ovs_dp_detach_port(), which indirectly calls
>> netdev_destroy() so it seems like it actually is the same case to me.
>
> yes. makes sense.
> how about:
> -       if (netdev_vport->dev->reg_state != NETREG_UNREGISTERING)
> +       if (netdev_vport->dev->priv_flags & IFF_OVS_DATAPATH)

Yes, this seems safer. Is the check for NETREG_UNREGISTERING in
dp_device_event() still needed given that we are checking the event?

> ovs_netdev_destroy_dev() name instead ovs_netdev_unlink_dev() name?

How about detach_dev?

^ permalink raw reply

* 3.10 stable request: iwlwifi: add new 7260 and 3160 series device IDs
From: Bjørn Mork @ 2013-10-11 22:26 UTC (permalink / raw)
  To: netdev; +Cc: Oren Givon, Johannes Berg

Hello,

could you please add this commit to your 3.10 stable queue?

commit 93fc64114b994f9ef6901697f9b0de00762680e9
Author: Oren Givon <oren.givon@intel.com>
Date:   Tue Apr 23 18:19:11 2013 +0300

    iwlwifi: add new 7260 and 3160 series device IDs
    
    Add new device IDs and configurations to support
    all the devices.
    
    Signed-off-by: Oren Givon <oren.givon@intel.com>
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>



I just installed Debian testing on a Sony Vaio Pro and that was a bit
more hassle than necessary due to this patch missing in their 3.10.11
based kernel...

The patch applies cleanly on top of v3.10.15, and only add new device
IDs, so it should be a clear stable candidate.



Thanks,
Bjørn

^ permalink raw reply

* Re: [PATCH net-next] tcp: tcp_transmit_skb() optimizations
From: Eric Dumazet @ 2013-10-11 22:43 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, fitz, ycheng, ncardwell
In-Reply-To: <20131011.174858.1461090000704613432.davem@davemloft.net>

On Fri, 2013-10-11 at 17:48 -0400, David Miller wrote:

> This patch is correct, so I've applied it, but it points out a bug.
> 
> The __tcp_retransmit_skb() code that does a __pskb_copy() to handle
> NET_IP_ALIGN violations and skb_headroom() overflows is buggy because
> it needs to store a congestion control timestamp in the original 'skb'
> since that's what we'll look at in the retransmit queue.

Yes, I saw that, indeed.

I added it as low priority bug for the moment, as the default congestion
module do not really care, and this case is really unlikely ;)

^ permalink raw reply

* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister
From: Alexei Starovoitov @ 2013-10-11 22:48 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Pravin Shelar, David S. Miller, Jiri Pirko, dev@openvswitch.org,
	netdev
In-Reply-To: <CAEP_g=8JaTstBhpSSJnbFiLZ4V+jKwpomJy4ZSCe2trhYJBzTw@mail.gmail.com>

On Fri, Oct 11, 2013 at 3:02 PM, Jesse Gross <jesse@nicira.com> wrote:
> On Fri, Oct 11, 2013 at 1:03 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
>> On Fri, Oct 11, 2013 at 11:11 AM, Jesse Gross <jesse@nicira.com> wrote:
>>> On Thu, Oct 10, 2013 at 9:48 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
>>>> On Thu, Oct 10, 2013 at 8:56 PM, Jesse Gross <jesse@nicira.com> wrote:
>>>>> However, the check dev->reg_state in netdev_destroy() looks racy to
>>>>> me, as it could already be in NETREG_UNREGISTERED even if we already
>>>>> processed this device.
>>>>
>>>> you mean that netdev_destroy() will see reg_state == netreg_unregistered,
>>>> while dp_device_event() didn't see reg_state == netreg_unregistering yet?
>>>> or dp_device_event() saw it, proceeded to do unlink and
>>>> netdev_destroy() ran in parallel?
>>>> well, that's why reg_state == netreg_unregistering check in netdev_destroy()
>>>> is done with rtnl_lock() held.
>>>> reg_state cannot go into netreg_unregistered state skipping
>>>> netreg_unregistering and notifier.
>>>> therefore I don't think it's racy.
>>>>
>>>> In ovs_dp_notify_wq() you're checking for both unregistering and
>>>> unregistered and that makes
>>>> sense, since workq can run after unregistering notifier called and
>>>> netdev_run_todo()
>>>> already changed the state to unregistered.
>>>> But here it's not the case.
>>>
>>> ovs_dp_notify_wq() calls ovs_dp_detach_port(), which indirectly calls
>>> netdev_destroy() so it seems like it actually is the same case to me.
>>
>> yes. makes sense.
>> how about:
>> -       if (netdev_vport->dev->reg_state != NETREG_UNREGISTERING)
>> +       if (netdev_vport->dev->priv_flags & IFF_OVS_DATAPATH)
>
> Yes, this seems safer. Is the check for NETREG_UNREGISTERING in
> dp_device_event() still needed given that we are checking the event?

at least some check is needed, since NETDEV_UNREGISTER event can
be received again as rebroadcast with reg_state=netreg_unregistered
if wq got delayed.
Probably better to combine checks event == unreg and state == unregistering
under one 'if' to avoid unnecessary workq wakeup.
Or may be better to do it as
if (event == NETDEV_UNREGISTER && dev->priv_flags & IFF_OVS_DATAPATH) {
  ovs_netdev_detach_dev();
  queue_work();
}

since we're at it... what should be the behavior for namespace moves?
If dev attached to ovs and being moved into a different net namespace, I think
ovs should detach and forget the dev...

Today ovs ignores this notification and we may have ovs-dp in one net
and attached dev
in a different net.
So if you do:
   ovs-dpctl add-if test tap1
   ip link set tap1 netns 3512
and then try to remove tap1 inside the namespace:
   ip tuntap del dev tap1 mode tap
it will just hang:
[  852.572476] unregister_netdevice: waiting for tap1 to become free.
Usage count = 3
[  862.578769] unregister_netdevice: waiting for tap1 to become free.
Usage count = 3

>> ovs_netdev_destroy_dev() name instead ovs_netdev_unlink_dev() name?
>
> How about detach_dev?

that's better name indeed. Will respin V2.

Thanks
Alexei

^ permalink raw reply

* [PATCH v3 tip/core/rcu 0/14] Sparse-related updates for 3.13
From: Paul E. McKenney @ 2013-10-11 23:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, fweisbec, dhowells, edumazet, gaofeng, mingo, bridge,
	jmorris, dipankar, darren, josh, rostedt, niv, mathieu.desnoyers,
	kuznet, tglx, johannes, laijs, yoshfuji, netdev,
	linux-decnet-user, kaber, stephen, sbw, tgraf, akpm, fengguang.wu,
	davem

Hello!

This series features updates to allow sparse to do a better job of
statically analyzing RCU usage:

1.	Add a comment indicating that despite appearances,
	rcu_assign_pointer() really only evaluates its arguments once,
	as a cpp macro should.

2-13.	Apply ACCESS_ONCE() to avoid a number of rcu_assign_pointer()
	calls that would otherwise suffer sparse false positives given
	patch #13 below.

14.	Apply ACCESS_ONCE() to rcu_assign_pointer()'s target to prevent
	comiler mischief.  Also require that the source pointer be from
	the kernel address space.  Sometimes it can be from the RCU address
	space, which necessitates the remaining patches in this series.
	Which, it must be admitted, apply to a very small fraction of
	the rcu_assign_pointer() invocations in the kernel.  This commit
	courtesy of Josh Triplett.

Changes from v2:

o	Switch from rcu_assign_pointer() to ACCESS_ONCE() given that
	the pointers are all --rcu and already visible to readers,
	as suggested by Eric Dumazet and Josh Triplett.

o	Place the commit adding the rcu_assign_pointer()'s ACCESS_ONCE()
	at the end to allow better bisectability, as suggested by Josh
	Triplett.

o	Add a comment to rcu_assign_pointer() noting that it only evaluates
	its arguments once, as suggested by Josh Triplett.

Changes from v1:

o	Fix grammar nit in commit logs.

							Thanx, Paul


 b/drivers/net/bonding/bond_alb.c  |    3 ++-
 b/drivers/net/bonding/bond_main.c |    8 +++++---
 b/include/linux/rcupdate.h        |   20 +++++++++++++++++++-
 b/kernel/notifier.c               |    3 ++-
 b/net/bridge/br_mdb.c             |    2 +-
 b/net/bridge/br_multicast.c       |    4 ++--
 b/net/decnet/dn_route.c           |    8 +++++---
 b/net/ipv4/ip_sockglue.c          |    3 ++-
 b/net/ipv6/ip6_gre.c              |    3 ++-
 b/net/ipv6/ip6_tunnel.c           |    3 ++-
 b/net/ipv6/sit.c                  |    3 ++-
 b/net/mac80211/sta_info.c         |    7 ++++---
 b/net/wireless/scan.c             |   32 ++++++++++++++++++--------------
 13 files changed, 66 insertions(+), 33 deletions(-)

^ permalink raw reply

* [PATCH v3 tip/core/rcu 03/14] bridge: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: Stephen Hemminger, tglx, laijs, edumazet, peterz, fweisbec,
	bridge, josh, rostedt, David S. Miller, dhowells, sbw, niv,
	netdev, mathieu.desnoyers, dipankar, darren, akpm,
	Paul E. McKenney, mingo
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the uses in
br_multicast_del_pg() and br_multicast_new_port_group() are legitimate:
They are assigning a pointer to an element from an RCU-protected list,
and all elements of this list are already visible to caller.

This commit therefore silences these false positives by laundering
the pointers using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
---
 net/bridge/br_multicast.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index d1c578630678..bcc4bbc16498 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -267,7 +267,7 @@ static void br_multicast_del_pg(struct net_bridge *br,
 		if (p != pg)
 			continue;
 
-		rcu_assign_pointer(*pp, p->next);
+		ACCESS_ONCE(*pp) = p->next; /* OK: Both --rcu and visible. */
 		hlist_del_init(&p->mglist);
 		del_timer(&p->timer);
 		call_rcu_bh(&p->rcu, br_multicast_free_pg);
@@ -646,7 +646,7 @@ struct net_bridge_port_group *br_multicast_new_port_group(
 	p->addr = *group;
 	p->port = port;
 	p->state = state;
-	rcu_assign_pointer(p->next, next);
+	ACCESS_ONCE(p->next) = next; /* OK: Both --rcu and visible. */
 	hlist_add_head(&p->mglist, &port->mglist);
 	setup_timer(&p->timer, br_multicast_port_group_expired,
 		    (unsigned long)p);
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 04/14] wireless: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: Stephen Hemminger, tglx, laijs, edumazet, peterz, fweisbec,
	bridge, josh, rostedt, David S. Miller, dhowells, sbw, niv,
	netdev, mathieu.desnoyers, dipankar, darren, akpm,
	Paul E. McKenney, mingo
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the uses in
cfg80211_combine_bsses() and cfg80211_bss_update() are legitimate:
They are assigning a pointer to an element from an RCU-protected list,
and all elements of this list are already visible to caller.

This commit therefore silences these false positives by laundering
the pointers using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
---
 net/wireless/scan.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/net/wireless/scan.c b/net/wireless/scan.c
index eeb71480f1af..ac3a47abf195 100644
--- a/net/wireless/scan.c
+++ b/net/wireless/scan.c
@@ -670,8 +670,8 @@ static bool cfg80211_combine_bsses(struct cfg80211_registered_device *dev,
 		list_add(&bss->hidden_list, &new->hidden_list);
 		bss->pub.hidden_beacon_bss = &new->pub;
 		new->refcount += bss->refcount;
-		rcu_assign_pointer(bss->pub.beacon_ies,
-				   new->pub.beacon_ies);
+		/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+		ACCESS_ONCE(bss->pub.beacon_ies) = new->pub.beacon_ies;
 	}
 
 	return true;
@@ -705,11 +705,12 @@ cfg80211_bss_update(struct cfg80211_registered_device *dev,
 
 			old = rcu_access_pointer(found->pub.proberesp_ies);
 
-			rcu_assign_pointer(found->pub.proberesp_ies,
-					   tmp->pub.proberesp_ies);
+			/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+			ACCESS_ONCE(found->pub.proberesp_ies) =
+					   tmp->pub.proberesp_ies;
 			/* Override possible earlier Beacon frame IEs */
-			rcu_assign_pointer(found->pub.ies,
-					   tmp->pub.proberesp_ies);
+			/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+			ACCESS_ONCE(found->pub.ies) = tmp->pub.proberesp_ies;
 			if (old)
 				kfree_rcu((struct cfg80211_bss_ies *)old,
 					  rcu_head);
@@ -739,13 +740,14 @@ cfg80211_bss_update(struct cfg80211_registered_device *dev,
 
 			old = rcu_access_pointer(found->pub.beacon_ies);
 
-			rcu_assign_pointer(found->pub.beacon_ies,
-					   tmp->pub.beacon_ies);
+			/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+			ACCESS_ONCE(found->pub.beacon_ies) = tmp->pub.beacon_ies;
 
 			/* Override IEs if they were from a beacon before */
 			if (old == rcu_access_pointer(found->pub.ies))
-				rcu_assign_pointer(found->pub.ies,
-						   tmp->pub.beacon_ies);
+				/* Both --rcu & visible, ACCESS_ONCE() is OK. */
+				ACCESS_ONCE(found->pub.ies) =
+						   tmp->pub.beacon_ies;
 
 			/* Assign beacon IEs to all sub entries */
 			list_for_each_entry(bss, &found->hidden_list,
@@ -755,8 +757,9 @@ cfg80211_bss_update(struct cfg80211_registered_device *dev,
 				ies = rcu_access_pointer(bss->pub.beacon_ies);
 				WARN_ON(ies != old);
 
-				rcu_assign_pointer(bss->pub.beacon_ies,
-						   tmp->pub.beacon_ies);
+				/* Both --rcu & visible, ACCESS_ONCE() is OK. */
+				ACCESS_ONCE(bss->pub.beacon_ies) =
+						   tmp->pub.beacon_ies;
 			}
 
 			if (old)
@@ -803,8 +806,9 @@ cfg80211_bss_update(struct cfg80211_registered_device *dev,
 				list_add(&new->hidden_list,
 					 &hidden->hidden_list);
 				hidden->refcount++;
-				rcu_assign_pointer(new->pub.beacon_ies,
-						   hidden->pub.beacon_ies);
+				/* Both --rcu & visible, ACCESS_ONCE() is OK. */
+				ACCESS_ONCE(new->pub.beacon_ies) =
+						   hidden->pub.beacon_ies;
 			}
 		} else {
 			/*
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 05/14] decnet: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney, David S. Miller, Thomas Graf, Gao feng,
	Stephen Hemminger, linux-decnet-user, netdev
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the use in
dn_insert_route() is legitimate: It is assigning a pointer to an element
from an RCU-protected list, and all elements of this list are already
visible to caller.

This commit therefore silences this false positive by laundering the
pointer using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: linux-decnet-user@lists.sourceforge.net
Cc: netdev@vger.kernel.org
---
 net/decnet/dn_route.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index fe32388ea24f..a6ef8b025035 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -344,8 +344,9 @@ static int dn_insert_route(struct dn_route *rt, unsigned int hash, struct dn_rou
 		if (compare_keys(&rth->fld, &rt->fld)) {
 			/* Put it first */
 			*rthp = rth->dst.dn_next;
-			rcu_assign_pointer(rth->dst.dn_next,
-					   dn_rt_hash_table[hash].chain);
+			/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+			ACCESS_ONCE(rth->dst.dn_next) =
+					   dn_rt_hash_table[hash].chain;
 			rcu_assign_pointer(dn_rt_hash_table[hash].chain, rth);
 
 			dst_use(&rth->dst, now);
@@ -358,7 +359,8 @@ static int dn_insert_route(struct dn_route *rt, unsigned int hash, struct dn_rou
 		rthp = &rth->dst.dn_next;
 	}
 
-	rcu_assign_pointer(rt->dst.dn_next, dn_rt_hash_table[hash].chain);
+	/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+	ACCESS_ONCE(rt->dst.dn_next) = dn_rt_hash_table[hash].chain;
 	rcu_assign_pointer(dn_rt_hash_table[hash].chain, rt);
 
 	dst_use(&rt->dst, now);
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 06/14] ipv4/ip_socketglue: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the use in
ip_ra_control() is legitimate: It is assigning a pointer to an element
from an RCU-protected list, and all elements of this list are already
visible to caller.

This commit therefore silences this false positive by laundering the
pointer using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
---
 net/ipv4/ip_sockglue.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index d9c4f113d709..a0e7f176e9c8 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -269,7 +269,8 @@ int ip_ra_control(struct sock *sk, unsigned char on,
 			}
 			/* dont let ip_call_ra_chain() use sk again */
 			ra->sk = NULL;
-			rcu_assign_pointer(*rap, ra->next);
+			/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+			ACCESS_ONCE(*rap) = ra->next;
 			spin_unlock_bh(&ip_ra_lock);
 
 			if (ra->destructor)
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 07/14] ipv6/ip6_tunnel: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the use in
ip6_tnl_unlink() is legitimate: It is assigning a pointer to an element
from an RCU-protected list, and all elements of this list are already
visible to caller.

This commit therefore silences this false positive by laundering the
pointer using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
---
 net/ipv6/ip6_tunnel.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 61355f7f4da5..2bea7a4e49ed 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -245,7 +245,8 @@ ip6_tnl_unlink(struct ip6_tnl_net *ip6n, struct ip6_tnl *t)
 	     (iter = rtnl_dereference(*tp)) != NULL;
 	     tp = &iter->next) {
 		if (t == iter) {
-			rcu_assign_pointer(*tp, t->next);
+			/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+			ACCESS_ONCE(*tp) = t->next;
 			break;
 		}
 	}
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 09/14] ipv6/sit: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the use in
ipip6_tunnel_unlink() is legitimate: It is assigning a pointer to an
element from an RCU-protected list, and all elements of this list are
already visible to caller.

This commit therefore silences this false positive by laundering the
pointer using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
---
 net/ipv6/sit.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 7ee5cb96db34..9b976a4b463d 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -157,7 +157,8 @@ static void ipip6_tunnel_unlink(struct sit_net *sitn, struct ip_tunnel *t)
 	     (iter = rtnl_dereference(*tp)) != NULL;
 	     tp = &iter->next) {
 		if (t == iter) {
-			rcu_assign_pointer(*tp, t->next);
+			/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+			ACCESS_ONCE(*tp) = t->next;
 			break;
 		}
 	}
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 10/14] mac80211: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney, John W. Linville, Johannes Berg,
	David S. Miller, linux-wireless, netdev
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the uses in
sta_info_hash_del() are legitimate: They are assigning a pointer to an
element from an RCU-protected list, and all elements of this list are
already visible to caller.

This commit therefore silences this false positive by laundering the
pointer using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
---
 net/mac80211/sta_info.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index aeb967a0aeed..494f03c0831f 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -74,8 +74,8 @@ static int sta_info_hash_del(struct ieee80211_local *local,
 	if (!s)
 		return -ENOENT;
 	if (s == sta) {
-		rcu_assign_pointer(local->sta_hash[STA_HASH(sta->sta.addr)],
-				   s->hnext);
+		/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+		ACCESS_ONCE(local->sta_hash[STA_HASH(sta->sta.addr)]) = s->hnext;
 		return 0;
 	}
 
@@ -84,7 +84,8 @@ static int sta_info_hash_del(struct ieee80211_local *local,
 		s = rcu_dereference_protected(s->hnext,
 					lockdep_is_held(&local->sta_mtx));
 	if (rcu_access_pointer(s->hnext)) {
-		rcu_assign_pointer(s->hnext, sta->hnext);
+		/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+		ACCESS_ONCE(s->hnext) = sta->hnext;
 		return 0;
 	}
 
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 11/14] bridge/br_mdb: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: Stephen Hemminger, tglx, laijs, edumazet, peterz, fweisbec,
	bridge, josh, rostedt, David S. Miller, dhowells, sbw, niv,
	netdev, mathieu.desnoyers, dipankar, darren, akpm,
	Paul E. McKenney, mingo
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the use in
__br_mdb_del() is legitimate: They are assigning a pointer to an element
from an RCU-protected list, and all elements of this list are already
visible to caller.

This commit therefore silences these false positives by laundering
the pointers using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
---
 net/bridge/br_mdb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 85a09bb5ca51..de7197ba8695 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -447,7 +447,7 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry)
 		if (p->port->state == BR_STATE_DISABLED)
 			goto unlock;
 
-		rcu_assign_pointer(*pp, p->next);
+		ACCESS_ONCE(*pp) = p->next; /* OK: Both --rcu and visible. */
 		hlist_del_init(&p->mglist);
 		del_timer(&p->timer);
 		call_rcu_bh(&p->rcu, br_multicast_free_pg);
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 12/14] bonding/bond_main: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: Stephen Hemminger, tglx, laijs, edumazet, peterz, fweisbec,
	bridge, josh, rostedt, David S. Miller, dhowells, sbw, niv,
	netdev, mathieu.desnoyers, dipankar, darren, akpm,
	Paul E. McKenney, mingo
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the uses in
bond_change_active_slave(), bond_enslave(), and __bond_release_one()
are legitimate: They are assigning a pointer to an element from an
RCU-protected list (or a NULL pointer), and all elements of this list
are already visible to caller.

This commit therefore silences these false positives either by laundering
the pointers using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett, or by using RCU_INIT_POINTER() for NULL pointer assignments.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
---
 drivers/net/bonding/bond_main.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 72df399c4ab3..e4270ae1c0a8 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -890,7 +890,8 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
 		if (new_active)
 			bond_set_slave_active_flags(new_active);
 	} else {
-		rcu_assign_pointer(bond->curr_active_slave, new_active);
+		/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+		ACCESS_ONCE(bond->curr_active_slave) = new_active;
 	}
 
 	if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) {
@@ -1601,7 +1602,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 		 * so we can change it without calling change_active_interface()
 		 */
 		if (!bond->curr_active_slave && new_slave->link == BOND_LINK_UP)
-			rcu_assign_pointer(bond->curr_active_slave, new_slave);
+			/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+			ACCESS_ONCE(bond->curr_active_slave) = new_slave;
 
 		break;
 	} /* switch(bond_mode) */
@@ -1801,7 +1803,7 @@ static int __bond_release_one(struct net_device *bond_dev,
 	}
 
 	if (all) {
-		rcu_assign_pointer(bond->curr_active_slave, NULL);
+		RCU_INIT_POINTER(bond->curr_active_slave, NULL);
 	} else if (oldcurrent == slave) {
 		/*
 		 * Note that we hold RTNL over this sequence, so there
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 08/14] ipv6/ip6_gre: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the use in
ip6gre_tunnel_unlink() is legitimate: It is assigning a pointer to an
element from an RCU-protected list, and all elements of this list are
already visible to caller.

This commit therefore silences this false positive by laundering the
pointer using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
---
 net/ipv6/ip6_gre.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 6b26e9feafb9..7bc9e1b3283e 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -276,7 +276,8 @@ static void ip6gre_tunnel_unlink(struct ip6gre_net *ign, struct ip6_tnl *t)
 	     (iter = rtnl_dereference(*tp)) != NULL;
 	     tp = &iter->next) {
 		if (t == iter) {
-			rcu_assign_pointer(*tp, t->next);
+			/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+			ACCESS_ONCE(*tp) = t->next;
 			break;
 		}
 	}
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v3 tip/core/rcu 13/14] bonding/bond_alb.c: Apply ACCESS_ONCE() to avoid sparse false positive
From: Paul E. McKenney @ 2013-10-11 23:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev
In-Reply-To: <1381533451-29018-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The sparse checking for rcu_assign_pointer() was recently upgraded
to reject non-__kernel address spaces.  This also rejects __rcu,
which is almost always the right thing to do.  However, the use in
bond_alb_handle_active_change() is legitimate: It is assigning a pointer
to an element from an RCU-protected list, and all elements of this list
are already visible to caller.

This commit therefore silences this false positive by laundering the
pointer using ACCESS_ONCE() as suggested by Eric Dumazet and Josh
Triplett.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
---
 drivers/net/bonding/bond_alb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 91f179d5135c..67d3b2893aa3 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1667,7 +1667,8 @@ void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave
 	}
 
 	swap_slave = bond->curr_active_slave;
-	rcu_assign_pointer(bond->curr_active_slave, new_slave);
+	/* Both --rcu and visible, so ACCESS_ONCE() is OK. */
+	ACCESS_ONCE(bond->curr_active_slave) = new_slave;
 
 	if (!new_slave || list_empty(&bond->slave_list))
 		return;
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH v2 net-next] openvswitch: fix vport-netdev unregister
From: Alexei Starovoitov @ 2013-10-12  1:12 UTC (permalink / raw)
  To: David S. Miller
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	Jiri Pirko

The combination of two commits:
commit 8e4e1713e4
("openvswitch: Simplify datapath locking.")
commit 2537b4dd0a
("openvswitch:: link upper device for port devices")

introduced a bug where upper_dev wasn't unlinked upon
netdev_unregister notification

The following steps:

  modprobe openvswitch
  ovs-dpctl add-dp test
  ip tuntap add dev tap1 mode tap
  ovs-dpctl add-if test tap1
  ip tuntap del dev tap1 mode tap

are causing multiple warnings:

[   62.747557] gre: GRE over IPv4 demultiplexor driver
[   62.749579] openvswitch: Open vSwitch switching datapath
[   62.755087] device test entered promiscuous mode
[   62.765911] device tap1 entered promiscuous mode
[   62.766033] IPv6: ADDRCONF(NETDEV_UP): tap1: link is not ready
[   62.769017] ------------[ cut here ]------------
[   62.769022] WARNING: CPU: 1 PID: 3267 at net/core/dev.c:5501 rollback_registered_many+0x20f/0x240()
[   62.769023] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[   62.769051] CPU: 1 PID: 3267 Comm: ip Not tainted 3.12.0-rc3+ #60
[   62.769052] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[   62.769053]  0000000000000009 ffff8807f25cbd28 ffffffff8175e575 0000000000000006
[   62.769055]  0000000000000000 ffff8807f25cbd68 ffffffff8105314c ffff8807f25cbd58
[   62.769057]  ffff8807f2634000 ffff8807f25cbdc8 ffff8807f25cbd88 ffff8807f25cbdc8
[   62.769059] Call Trace:
[   62.769062]  [<ffffffff8175e575>] dump_stack+0x55/0x76
[   62.769065]  [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[   62.769067]  [<ffffffff8105319a>] warn_slowpath_null+0x1a/0x20
[   62.769069]  [<ffffffff8162a04f>] rollback_registered_many+0x20f/0x240
[   62.769071]  [<ffffffff8162a101>] rollback_registered+0x31/0x40
[   62.769073]  [<ffffffff8162a488>] unregister_netdevice_queue+0x58/0x90
[   62.769075]  [<ffffffff8154f900>] __tun_detach+0x140/0x340
[   62.769077]  [<ffffffff8154fb36>] tun_chr_close+0x36/0x60
[   62.769080]  [<ffffffff811bddaf>] __fput+0xff/0x260
[   62.769082]  [<ffffffff811bdf5e>] ____fput+0xe/0x10
[   62.769084]  [<ffffffff8107b515>] task_work_run+0xb5/0xe0
[   62.769087]  [<ffffffff810029b9>] do_notify_resume+0x59/0x80
[   62.769089]  [<ffffffff813a41fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   62.769091]  [<ffffffff81770f5a>] int_signal+0x12/0x17
[   62.769093] ---[ end trace 838756c62e156ffb ]---
[   62.769481] ------------[ cut here ]------------
[   62.769485] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
[   62.769486] sysfs: can not remove 'master', no directory
[   62.769486] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[   62.769514] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G        W    3.12.0-rc3+ #60
[   62.769515] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[   62.769518] Workqueue: events ovs_dp_notify_wq [openvswitch]
[   62.769519]  0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
[   62.769521]  ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b28
[   62.769523]  0000000000000000 ffffffff81a87a1f ffff8807f2634000 ffff880037038500
[   62.769525] Call Trace:
[   62.769528]  [<ffffffff8175e575>] dump_stack+0x55/0x76
[   62.769529]  [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[   62.769531]  [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50
[   62.769533]  [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0
[   62.769535]  [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30
[   62.769538]  [<ffffffff81631ef7>] __netdev_adjacent_dev_remove+0xf7/0x150
[   62.769540]  [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50
[   62.769542]  [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
[   62.769544]  [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140
[   62.769548]  [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch]
[   62.769550]  [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch]
[   62.769552]  [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch]
[   62.769555]  [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
[   62.769557]  [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0
[   62.769559]  [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0
[   62.769562]  [<ffffffff8107659b>] worker_thread+0x11b/0x370
[   62.769564]  [<ffffffff81076480>] ? rescuer_thread+0x350/0x350
[   62.769566]  [<ffffffff8107f44a>] kthread+0xea/0xf0
[   62.769568]  [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[   62.769570]  [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0
[   62.769572]  [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[   62.769573] ---[ end trace 838756c62e156ffc ]---
[   62.769574] ------------[ cut here ]------------
[   62.769576] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
[   62.769577] sysfs: can not remove 'upper_test', no directory
[   62.769577] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[   62.769603] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G        W    3.12.0-rc3+ #60
[   62.769604] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[   62.769606] Workqueue: events ovs_dp_notify_wq [openvswitch]
[   62.769607]  0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
[   62.769609]  ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b58
[   62.769611]  0000000000000000 ffff880807ad3bd9 ffff8807f2634000 ffff880037038500
[   62.769613] Call Trace:
[   62.769615]  [<ffffffff8175e575>] dump_stack+0x55/0x76
[   62.769617]  [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[   62.769619]  [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50
[   62.769621]  [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0
[   62.769622]  [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30
[   62.769624]  [<ffffffff81631f22>] __netdev_adjacent_dev_remove+0x122/0x150
[   62.769627]  [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50
[   62.769629]  [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
[   62.769631]  [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140
[   62.769633]  [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch]
[   62.769636]  [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch]
[   62.769638]  [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch]
[   62.769640]  [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
[   62.769642]  [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0
[   62.769644]  [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0
[   62.769646]  [<ffffffff8107659b>] worker_thread+0x11b/0x370
[   62.769648]  [<ffffffff81076480>] ? rescuer_thread+0x350/0x350
[   62.769650]  [<ffffffff8107f44a>] kthread+0xea/0xf0
[   62.769652]  [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[   62.769654]  [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0
[   62.769656]  [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[   62.769657] ---[ end trace 838756c62e156ffd ]---
[   62.769724] device tap1 left promiscuous mode

This patch also affects moving devices between net namespaces.

OVS used to ignore netns move notifications which caused problems.
Like:
  ovs-dpctl add-if test tap1
  ip link set tap1 netns 3512
and then removing tap1 inside the namespace will cause hang on missing dev_put.

With this patch OVS will detach dev upon receiving netns move event.

Signed-off-by: Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>
---
 net/openvswitch/dp_notify.c    |   14 ++++++--------
 net/openvswitch/vport-netdev.c |   16 +++++++++++++---
 net/openvswitch/vport-netdev.h |    1 +
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c
index c323567..49cea88 100644
--- a/net/openvswitch/dp_notify.c
+++ b/net/openvswitch/dp_notify.c
@@ -59,15 +59,9 @@ void ovs_dp_notify_wq(struct work_struct *work)
 			struct hlist_node *n;
 
 			hlist_for_each_entry_safe(vport, n, &dp->ports[i], dp_hash_node) {
-				struct netdev_vport *netdev_vport;
-
 				if (vport->ops->type != OVS_VPORT_TYPE_NETDEV)
 					continue;
-
-				netdev_vport = netdev_vport_priv(vport);
-				if (netdev_vport->dev->reg_state == NETREG_UNREGISTERED ||
-				    netdev_vport->dev->reg_state == NETREG_UNREGISTERING)
-					dp_detach_port_notify(vport);
+				dp_detach_port_notify(vport);
 			}
 		}
 	}
@@ -87,7 +81,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event,
 	if (!vport)
 		return NOTIFY_DONE;
 
-	if (event == NETDEV_UNREGISTER) {
+	if (event == NETDEV_UNREGISTER && dev->priv_flags & IFF_OVS_DATAPATH) {
+		/* upper_dev_unlink and decrement promisc immediately */
+		ovs_netdev_detach_dev(vport);
+
+		/* schedule vport destroy, dev_put and genl notification */
 		ovs_net = net_generic(dev_net(dev), ovs_net_id);
 		queue_work(system_wq, &ovs_net->dp_notify_work);
 	}
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 09d93c1..d21f77d 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -150,15 +150,25 @@ static void free_port_rcu(struct rcu_head *rcu)
 	ovs_vport_free(vport_from_priv(netdev_vport));
 }
 
-static void netdev_destroy(struct vport *vport)
+void ovs_netdev_detach_dev(struct vport *vport)
 {
 	struct netdev_vport *netdev_vport = netdev_vport_priv(vport);
 
-	rtnl_lock();
+	ASSERT_RTNL();
 	netdev_vport->dev->priv_flags &= ~IFF_OVS_DATAPATH;
 	netdev_rx_handler_unregister(netdev_vport->dev);
-	netdev_upper_dev_unlink(netdev_vport->dev, get_dpdev(vport->dp));
+	netdev_upper_dev_unlink(netdev_vport->dev,
+				netdev_master_upper_dev_get(netdev_vport->dev));
 	dev_set_promiscuity(netdev_vport->dev, -1);
+}
+
+static void netdev_destroy(struct vport *vport)
+{
+	struct netdev_vport *netdev_vport = netdev_vport_priv(vport);
+
+	rtnl_lock();
+	if (netdev_vport->dev->priv_flags & IFF_OVS_DATAPATH)
+		ovs_netdev_detach_dev(vport);
 	rtnl_unlock();
 
 	call_rcu(&netdev_vport->rcu, free_port_rcu);
diff --git a/net/openvswitch/vport-netdev.h b/net/openvswitch/vport-netdev.h
index dd298b5..21e3770 100644
--- a/net/openvswitch/vport-netdev.h
+++ b/net/openvswitch/vport-netdev.h
@@ -39,5 +39,6 @@ netdev_vport_priv(const struct vport *vport)
 }
 
 const char *ovs_netdev_get_name(const struct vport *);
+void ovs_netdev_unlink_dev(struct vport *);
 
 #endif /* vport_netdev.h */
-- 
1.7.9.5

^ permalink raw reply related

* Re: [PATCH net-next] tcp: tcp_transmit_skb() optimizations
From: Eric Dumazet @ 2013-10-12  1:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, fitz, ycheng, ncardwell
In-Reply-To: <1381531436.4971.114.camel@edumazet-glaptop.roam.corp.google.com>

On Fri, 2013-10-11 at 15:43 -0700, Eric Dumazet wrote:
> On Fri, 2013-10-11 at 17:48 -0400, David Miller wrote:
> 
> > This patch is correct, so I've applied it, but it points out a bug.
> > 
> > The __tcp_retransmit_skb() code that does a __pskb_copy() to handle
> > NET_IP_ALIGN violations and skb_headroom() overflows is buggy because
> > it needs to store a congestion control timestamp in the original 'skb'
> > since that's what we'll look at in the retransmit queue.
> 
> Yes, I saw that, indeed.
> 
> I added it as low priority bug for the moment, as the default congestion
> module do not really care, and this case is really unlikely ;)
> 

Ah, I remember now that the conclusion was :
the timestamp is not taken into account for retransmits.

( FLAG_RETRANS_DATA_ACKED )

^ permalink raw reply

* [PATCH v2] Stmmac: fix a bug when clk_csr is euqal to 0x0
From: Wan ZongShun @ 2013-10-12  2:04 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro

According to spec, if csr clock freq is 60-100Mhz, we have to set CR[5:2] = 0000
but when I set the 'plat_dat.clk_csr = 0',acctually, this value is not used
since the driver code judge 'if (!priv->plat->clk_csr)' then go to dynamic tune
the MDC clock. So this patch is to add other judge condition.

Signed-off-by: Wan Zongshun <mcuos.com@gmail.com>
---
 Documentation/networking/stmmac.txt               | 3 +++
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
 include/linux/stmmac.h                            | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/stmmac.txt
b/Documentation/networking/stmmac.txt
index 457b8bb..3ed0ea9 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -116,6 +116,7 @@ struct plat_stmmacenet_data {
     struct stmmac_mdio_bus_data *mdio_bus_data;
     struct stmmac_dma_cfg *dma_cfg;
     int clk_csr;
+    unsigned int dynamic_mdc_clk_en;
     int has_gmac;
     int enh_desc;
     int tx_coe;
@@ -148,6 +149,8 @@ Where:
        GMAC also enables the 4xPBL by default.
    o fixed_burst/mixed_burst/burst_len
  o clk_csr: fixed CSR Clock range selection.
+ o dynamic_mdc_clk_en: If it is set to >=1 MDC clk will be selected
dynamically,
+               or else you must set a fixed CSR Clock range to clk_src.
  o has_gmac: uses the GMAC core.
  o enh_desc: if sets the MAC will use the enhanced descriptor structure.
  o tx_coe: core is able to perform the tx csum in HW.
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 8d4ccd3..93fc6bc 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2741,7 +2741,7 @@ struct stmmac_priv *stmmac_dvr_probe(struct
device *device,
      * set the MDC clock dynamically according to the csr actual
      * clock input.
      */
-    if (!priv->plat->clk_csr)
+    if (!!priv->plat->dynamic_mdc_clk_en)
         stmmac_clk_csr_set(priv);
     else
         priv->clk_csr = priv->plat->clk_csr;
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index bb5deb0..1b9f0b5 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -101,6 +101,7 @@ struct plat_stmmacenet_data {
     struct stmmac_mdio_bus_data *mdio_bus_data;
     struct stmmac_dma_cfg *dma_cfg;
     int clk_csr;
+    unsigned int dynamic_mdc_clk_en;
     int has_gmac;
     int enh_desc;
     int tx_coe;
-- 
1.8.1.2


-- 
Wan ZongShun.
www.mcuos.com

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox