netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: core: Correctly iterate over lower adjacency list
@ 2016-10-19 13:57 idosch
  2016-10-19 14:33 ` David Ahern
  2016-10-19 14:38 ` David Miller
  0 siblings, 2 replies; 5+ messages in thread
From: idosch @ 2016-10-19 13:57 UTC (permalink / raw)
  To: davem; +Cc: netdev, jiri, eladr, yotamg, nogahf, ogerlitz, dsa, Ido Schimmel

From: Ido Schimmel <idosch@mellanox.com>

Tamir reported the following trace when processing ARP requests received
via a vlan device on top of a VLAN-aware bridge:

 NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
[...]
 CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.8.0-rc7 #1
 Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
 task: ffff88017edfea40 task.stack: ffff88017ee10000
 RIP: 0010:[<ffffffff815dcc73>]  [<ffffffff815dcc73>] netdev_all_lower_get_next_rcu+0x33/0x60
[...]
 Call Trace:
  <IRQ>
  [<ffffffffa015de0a>] mlxsw_sp_port_lower_dev_hold+0x5a/0xa0 [mlxsw_spectrum]
  [<ffffffffa016f1b0>] mlxsw_sp_router_netevent_event+0x80/0x150 [mlxsw_spectrum]
  [<ffffffff810ad07a>] notifier_call_chain+0x4a/0x70
  [<ffffffff810ad13a>] atomic_notifier_call_chain+0x1a/0x20
  [<ffffffff815ee77b>] call_netevent_notifiers+0x1b/0x20
  [<ffffffff815f2eb6>] neigh_update+0x306/0x740
  [<ffffffff815f38ce>] neigh_event_ns+0x4e/0xb0
  [<ffffffff8165ea3f>] arp_process+0x66f/0x700
  [<ffffffff8170214c>] ? common_interrupt+0x8c/0x8c
  [<ffffffff8165ec29>] arp_rcv+0x139/0x1d0
  [<ffffffff816e505a>] ? vlan_do_receive+0xda/0x320
  [<ffffffff815e3794>] __netif_receive_skb_core+0x524/0xab0
  [<ffffffff815e6830>] ? dev_queue_xmit+0x10/0x20
  [<ffffffffa06d612d>] ? br_forward_finish+0x3d/0xc0 [bridge]
  [<ffffffffa06e5796>] ? br_handle_vlan+0xf6/0x1b0 [bridge]
  [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
  [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
  [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
  [<ffffffffa06d7856>] br_pass_frame_up+0xc6/0x160 [bridge]
  [<ffffffffa06d63d7>] ? deliver_clone+0x37/0x50 [bridge]
  [<ffffffffa06d656c>] ? br_flood+0xcc/0x160 [bridge]
  [<ffffffffa06d7b14>] br_handle_frame_finish+0x224/0x4f0 [bridge]
  [<ffffffffa06d7f94>] br_handle_frame+0x174/0x300 [bridge]
  [<ffffffff815e3599>] __netif_receive_skb_core+0x329/0xab0
  [<ffffffff81374815>] ? find_next_bit+0x15/0x20
  [<ffffffff8135e802>] ? cpumask_next_and+0x32/0x50
  [<ffffffff810c9968>] ? load_balance+0x178/0x9b0
  [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
  [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
  [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
  [<ffffffffa01544a1>] mlxsw_sp_rx_listener_func+0x61/0xb0 [mlxsw_spectrum]
  [<ffffffffa005c9f7>] mlxsw_core_skb_receive+0x187/0x200 [mlxsw_core]
  [<ffffffffa007332a>] mlxsw_pci_cq_tasklet+0x63a/0x9b0 [mlxsw_pci]
  [<ffffffff81091986>] tasklet_action+0xf6/0x110
  [<ffffffff81704556>] __do_softirq+0xf6/0x280
  [<ffffffff8109213f>] irq_exit+0xdf/0xf0
  [<ffffffff817042b4>] do_IRQ+0x54/0xd0
  [<ffffffff8170214c>] common_interrupt+0x8c/0x8c

The problem is that netdev_all_lower_get_next_rcu() never advances the
iterator, thereby causing the loop over the lower adjacency list to run
forever.

Fix this by advancing the iterator and avoid the infinite loop.

Fixes: 7ce856aaaf13 ("mlxsw: spectrum: Add couple of lower device helper functions")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
---
Note that DavidA's series removes this code in net-next, but it wasn't
picked up for net, so I'm sending this patch.
Please consider queueing this for 4.8.y
---
 include/linux/netdevice.h |  2 +-
 net/core/dev.c            | 10 +++++++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 136ae6bb..465e128 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3877,7 +3877,7 @@ struct net_device *netdev_all_lower_get_next_rcu(struct net_device *dev,
 	     ldev = netdev_all_lower_get_next(dev, &(iter)))
 
 #define netdev_for_each_all_lower_dev_rcu(dev, ldev, iter) \
-	for (iter = (dev)->all_adj_list.lower.next, \
+	for (iter = &(dev)->all_adj_list.lower, \
 	     ldev = netdev_all_lower_get_next_rcu(dev, &(iter)); \
 	     ldev; \
 	     ldev = netdev_all_lower_get_next_rcu(dev, &(iter)))
diff --git a/net/core/dev.c b/net/core/dev.c
index f1fe26f..b09ac57 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5511,10 +5511,14 @@ struct net_device *netdev_all_lower_get_next_rcu(struct net_device *dev,
 {
 	struct netdev_adjacent *lower;
 
-	lower = list_first_or_null_rcu(&dev->all_adj_list.lower,
-				       struct netdev_adjacent, list);
+	lower = list_entry_rcu((*iter)->next, struct netdev_adjacent, list);
+
+	if (&lower->list == &dev->all_adj_list.lower)
+		return NULL;
+
+	*iter = &lower->list;
 
-	return lower ? lower->dev : NULL;
+	return lower->dev;
 }
 EXPORT_SYMBOL(netdev_all_lower_get_next_rcu);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net] net: core: Correctly iterate over lower adjacency list
  2016-10-19 13:57 [PATCH net] net: core: Correctly iterate over lower adjacency list idosch
@ 2016-10-19 14:33 ` David Ahern
  2016-10-19 14:38 ` David Miller
  1 sibling, 0 replies; 5+ messages in thread
From: David Ahern @ 2016-10-19 14:33 UTC (permalink / raw)
  To: idosch, davem; +Cc: netdev, jiri, eladr, yotamg, nogahf, ogerlitz, Ido Schimmel

On 10/19/16 7:57 AM, idosch@idosch.org wrote:
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 136ae6bb..465e128 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3877,7 +3877,7 @@ struct net_device *netdev_all_lower_get_next_rcu(struct net_device *dev,
>  	     ldev = netdev_all_lower_get_next(dev, &(iter)))
>  
>  #define netdev_for_each_all_lower_dev_rcu(dev, ldev, iter) \
> -	for (iter = (dev)->all_adj_list.lower.next, \
> +	for (iter = &(dev)->all_adj_list.lower, \
>  	     ldev = netdev_all_lower_get_next_rcu(dev, &(iter)); \
>  	     ldev; \
>  	     ldev = netdev_all_lower_get_next_rcu(dev, &(iter)))
> diff --git a/net/core/dev.c b/net/core/dev.c
> index f1fe26f..b09ac57 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5511,10 +5511,14 @@ struct net_device *netdev_all_lower_get_next_rcu(struct net_device *dev,
>  {
>  	struct netdev_adjacent *lower;
>  
> -	lower = list_first_or_null_rcu(&dev->all_adj_list.lower,
> -				       struct netdev_adjacent, list);
> +	lower = list_entry_rcu((*iter)->next, struct netdev_adjacent, list);
> +
> +	if (&lower->list == &dev->all_adj_list.lower)
> +		return NULL;
> +
> +	*iter = &lower->list;
>  
> -	return lower ? lower->dev : NULL;
> +	return lower->dev;
>  }
>  EXPORT_SYMBOL(netdev_all_lower_get_next_rcu);


When I converted this function in my series I wondered how the current code worked at all. I guess I didn't. This is inline with what I did and matches the form used for the all_upper variant, so for 4.9 and 4.8.x

Acked-by: David Ahern <dsa@cumulusnetworks.com>

I would like to see my series applied to 4.9 at some point.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] net: core: Correctly iterate over lower adjacency list
  2016-10-19 13:57 [PATCH net] net: core: Correctly iterate over lower adjacency list idosch
  2016-10-19 14:33 ` David Ahern
@ 2016-10-19 14:38 ` David Miller
  2016-10-19 15:36   ` Ido Schimmel
  1 sibling, 1 reply; 5+ messages in thread
From: David Miller @ 2016-10-19 14:38 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, eladr, yotamg, nogahf, ogerlitz, dsa, idosch

From: idosch@idosch.org
Date: Wed, 19 Oct 2016 16:57:08 +0300

> From: Ido Schimmel <idosch@mellanox.com>
> 
> Tamir reported the following trace when processing ARP requests received
> via a vlan device on top of a VLAN-aware bridge:
 ...
> The problem is that netdev_all_lower_get_next_rcu() never advances the
> iterator, thereby causing the loop over the lower adjacency list to run
> forever.
> 
> Fix this by advancing the iterator and avoid the infinite loop.
> 
> Fixes: 7ce856aaaf13 ("mlxsw: spectrum: Add couple of lower device helper functions")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Reported-by: Tamir Winetroub <tamirw@mellanox.com>
> Reviewed-by: Jiri Pirko <jiri@mellanox.com>

Applied.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] net: core: Correctly iterate over lower adjacency list
  2016-10-19 14:38 ` David Miller
@ 2016-10-19 15:36   ` Ido Schimmel
  2016-10-19 16:28     ` David Miller
  0 siblings, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2016-10-19 15:36 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, jiri, eladr, yotamg, nogahf, ogerlitz, dsa, idosch

Hi Dave,

On Wed, Oct 19, 2016 at 10:38:29AM -0400, David Miller wrote:
> From: idosch@idosch.org
> Date: Wed, 19 Oct 2016 16:57:08 +0300
> 
> > From: Ido Schimmel <idosch@mellanox.com>
> > 
> > Tamir reported the following trace when processing ARP requests received
> > via a vlan device on top of a VLAN-aware bridge:
>  ...
> > The problem is that netdev_all_lower_get_next_rcu() never advances the
> > iterator, thereby causing the loop over the lower adjacency list to run
> > forever.
> > 
> > Fix this by advancing the iterator and avoid the infinite loop.
> > 
> > Fixes: 7ce856aaaf13 ("mlxsw: spectrum: Add couple of lower device helper functions")
> > Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> > Reported-by: Tamir Winetroub <tamirw@mellanox.com>
> > Reviewed-by: Jiri Pirko <jiri@mellanox.com>
> 
> Applied.

Below the patch I noted "Please consider queueing this for 4.8.y", but
you didn't reply and I don't see it here:
https://patchwork.ozlabs.org/bundle/davem/stable/?state=*

So I'm not sure if this was rejected or you just missed my note.

Thanks

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] net: core: Correctly iterate over lower adjacency list
  2016-10-19 15:36   ` Ido Schimmel
@ 2016-10-19 16:28     ` David Miller
  0 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2016-10-19 16:28 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, eladr, yotamg, nogahf, ogerlitz, dsa, idosch

From: Ido Schimmel <idosch@idosch.org>
Date: Wed, 19 Oct 2016 18:36:27 +0300

> Below the patch I noted "Please consider queueing this for 4.8.y", but
> you didn't reply and I don't see it here:
> https://patchwork.ozlabs.org/bundle/davem/stable/?state=*
> 
> So I'm not sure if this was rejected or you just missed my note.

Sorry, I thought I added it, should be fixes now.

Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-10-19 16:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-19 13:57 [PATCH net] net: core: Correctly iterate over lower adjacency list idosch
2016-10-19 14:33 ` David Ahern
2016-10-19 14:38 ` David Miller
2016-10-19 15:36   ` Ido Schimmel
2016-10-19 16:28     ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).