* [PATCHv1 net 0/2] xen-netfront: multi-queue related locking fixes
@ 2014-07-02 15:09 David Vrabel
2014-07-02 15:09 ` [PATCH 1/2] xen-netfront: don't nest queue locks in xennet_connect() David Vrabel
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: David Vrabel @ 2014-07-02 15:09 UTC (permalink / raw)
To: netdev; +Cc: David Vrabel, xen-devel, Konrad Rzeszutek Wilk, Boris Ostrovsky
Two fixes to the per-queue locking bugs in xen-netfront that were
introduced in 3.16-rc1 with the multi-queue support.
David
* [PATCH 1/2] xen-netfront: don't nest queue locks in xennet_connect()
2014-07-02 15:09 [PATCHv1 net 0/2] xen-netfront: multi-queue related locking fixes David Vrabel
@ 2014-07-02 15:09 ` David Vrabel
2014-07-02 17:36 ` Boris Ostrovsky
2014-07-02 15:09 ` [PATCH 2/2] xen-netfront: call netif_carrier_off() only once when disconnecting David Vrabel
2014-07-08 18:21 ` [PATCHv1 net 0/2] xen-netfront: multi-queue related locking fixes David Miller
2 siblings, 1 reply; 6+ messages in thread
From: David Vrabel @ 2014-07-02 15:09 UTC (permalink / raw)
To: netdev; +Cc: David Vrabel, xen-devel, Konrad Rzeszutek Wilk, Boris Ostrovsky
The nesting of the per-queue rx_lock and tx_lock in xennet_connect()
is confusing to both humans and lockdep. The locking is safe because
this is the only place where the locks are nested in this way but
lockdep still warns.
Instead of adding the missing lockdep annotations, refactor the
locking to avoid the confusing nesting. This is still safe, because
the xenbus connection state changes are all serialized by the xenwatch
thread.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
---
drivers/net/xen-netfront.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 2ccb4a0..6a37d62 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -2046,13 +2046,15 @@ static int xennet_connect(struct net_device *dev)
/* By now, the queue structures have been set up */
for (j = 0; j < num_queues; ++j) {
queue = &np->queues[j];
- spin_lock_bh(&queue->rx_lock);
- spin_lock_irq(&queue->tx_lock);
/* Step 1: Discard all pending TX packet fragments. */
+ spin_lock_irq(&queue->tx_lock);
xennet_release_tx_bufs(queue);
+ spin_unlock_irq(&queue->tx_lock);
/* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
+ spin_lock_bh(&queue->rx_lock);
+
for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) {
skb_frag_t *frag;
const struct page *page;
@@ -2076,6 +2078,8 @@ static int xennet_connect(struct net_device *dev)
}
queue->rx.req_prod_pvt = requeue_idx;
+
+ spin_unlock_bh(&queue->rx_lock);
}
/*
@@ -2087,13 +2091,17 @@ static int xennet_connect(struct net_device *dev)
netif_carrier_on(np->netdev);
for (j = 0; j < num_queues; ++j) {
queue = &np->queues[j];
+
notify_remote_via_irq(queue->tx_irq);
if (queue->tx_irq != queue->rx_irq)
notify_remote_via_irq(queue->rx_irq);
- xennet_tx_buf_gc(queue);
- xennet_alloc_rx_buffers(queue);
+ spin_lock_irq(&queue->tx_lock);
+ xennet_tx_buf_gc(queue);
spin_unlock_irq(&queue->tx_lock);
+
+ spin_lock_bh(&queue->rx_lock);
+ xennet_alloc_rx_buffers(queue);
spin_unlock_bh(&queue->rx_lock);
}
--
1.7.10.4
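The before/after locking pattern in this patch can be sketched in plain user-space C (a hypothetical model, not driver code: pthread mutexes stand in for the kernel spinlocks, and demo_queue/demo_reconnect are invented names). The point is that each step now takes only the lock it needs, so no ordering between rx_lock and tx_lock is ever established for a lock validator to warn about:

```c
#include <pthread.h>

/* Hypothetical stand-in for struct netfront_queue: independent locks
 * guarding the TX and RX sides of one queue. */
struct demo_queue {
	pthread_mutex_t tx_lock;
	pthread_mutex_t rx_lock;
	int tx_pending;		/* models pending TX fragments */
	int rx_reqs;		/* models rx.req_prod_pvt */
};

/* After the patch: the two steps are already serialized by the caller
 * (the xenwatch thread), so nesting the locks is unnecessary.  Taking
 * them one at a time creates no rx_lock -> tx_lock ordering. */
void demo_reconnect(struct demo_queue *q)
{
	/* Step 1: discard pending TX work under tx_lock only. */
	pthread_mutex_lock(&q->tx_lock);
	q->tx_pending = 0;
	pthread_mutex_unlock(&q->tx_lock);

	/* Step 2: rebuild RX state under rx_lock only. */
	pthread_mutex_lock(&q->rx_lock);
	q->rx_reqs = 256;	/* arbitrary stand-in for the requeued count */
	pthread_mutex_unlock(&q->rx_lock);
}
```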
* [PATCH 2/2] xen-netfront: call netif_carrier_off() only once when disconnecting
2014-07-02 15:09 [PATCHv1 net 0/2] xen-netfront: multi-queue related locking fixes David Vrabel
2014-07-02 15:09 ` [PATCH 1/2] xen-netfront: don't nest queue locks in xennet_connect() David Vrabel
@ 2014-07-02 15:09 ` David Vrabel
2014-07-08 18:21 ` [PATCHv1 net 0/2] xen-netfront: multi-queue related locking fixes David Miller
2 siblings, 0 replies; 6+ messages in thread
From: David Vrabel @ 2014-07-02 15:09 UTC (permalink / raw)
To: netdev; +Cc: David Vrabel, xen-devel, Konrad Rzeszutek Wilk, Boris Ostrovsky
In xennet_disconnect_backend(), netif_carrier_off() was called once
per queue, when it needs to be called only once.
The queue locking around the netif_carrier_off() call looked very
odd. I think the locks were supposed to synchronize with any running
NAPI instances, with the expectation that no further NAPI instances
would be scheduled because the carrier is off (see the check in
xennet_rx_interrupt()), but I can't easily tell whether this works
correctly.
Instead, add a napi_synchronize() call after disabling the interrupts.
This is obviously correct: with no Rx interrupts, no further NAPI
instances will be scheduled.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
drivers/net/xen-netfront.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 6a37d62..055222b 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1439,16 +1439,11 @@ static void xennet_disconnect_backend(struct netfront_info *info)
unsigned int i = 0;
unsigned int num_queues = info->netdev->real_num_tx_queues;
+ netif_carrier_off(info->netdev);
+
for (i = 0; i < num_queues; ++i) {
struct netfront_queue *queue = &info->queues[i];
- /* Stop old i/f to prevent errors whilst we rebuild the state. */
- spin_lock_bh(&queue->rx_lock);
- spin_lock_irq(&queue->tx_lock);
- netif_carrier_off(queue->info->netdev);
- spin_unlock_irq(&queue->tx_lock);
- spin_unlock_bh(&queue->rx_lock);
-
if (queue->tx_irq && (queue->tx_irq == queue->rx_irq))
unbind_from_irqhandler(queue->tx_irq, queue);
if (queue->tx_irq && (queue->tx_irq != queue->rx_irq)) {
@@ -1458,6 +1453,8 @@ static void xennet_disconnect_backend(struct netfront_info *info)
queue->tx_evtchn = queue->rx_evtchn = 0;
queue->tx_irq = queue->rx_irq = 0;
+ napi_synchronize(&queue->napi);
+
/* End access and free the pages */
xennet_end_access(queue->tx_ring_ref, queue->tx.sring);
xennet_end_access(queue->rx_ring_ref, queue->rx.sring);
--
1.7.10.4
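The teardown ordering this patch relies on — first make sure no new NAPI instances can be scheduled (the irqs are unbound), then wait out any in-flight instance, and only then free the rings — can be modelled in user space as stop-flag-then-join (a hypothetical sketch with invented names; pthread_join plays the role of napi_synchronize()):

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* 'running' models the rx interrupt being bound: while it is set, new
 * "NAPI" work keeps being scheduled. */
static atomic_int running;
static int *ring;	/* shared state that must outlive the poller */

static void *poller(void *arg)
{
	(void)arg;
	while (atomic_load(&running))
		ring[0]++;	/* NAPI-style work touching the ring */
	return NULL;
}

/* Teardown in the same order as the patched
 * xennet_disconnect_backend(): stop the source of new work, wait for
 * the worker to finish, and only then free what it was using. */
int demo_disconnect(void)
{
	pthread_t t;

	ring = calloc(1, sizeof(*ring));
	if (!ring)
		return -1;
	atomic_store(&running, 1);
	if (pthread_create(&t, NULL, poller, NULL))
		return -1;

	atomic_store(&running, 0);	/* "unbind the irq": no new instances */
	pthread_join(t, NULL);		/* "napi_synchronize()": wait them out */
	free(ring);			/* now safe to end access / free pages */
	return 0;
}
```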
* Re: [PATCH 1/2] xen-netfront: don't nest queue locks in xennet_connect()
2014-07-02 15:09 ` [PATCH 1/2] xen-netfront: don't nest queue locks in xennet_connect() David Vrabel
@ 2014-07-02 17:36 ` Boris Ostrovsky
2014-07-03 16:19 ` David Vrabel
0 siblings, 1 reply; 6+ messages in thread
From: Boris Ostrovsky @ 2014-07-02 17:36 UTC (permalink / raw)
To: David Vrabel
Cc: netdev, xen-devel, Konrad Rzeszutek Wilk, Tülin İzer
On 07/02/2014 11:09 AM, David Vrabel wrote:
> The nesting of the per-queue rx_lock and tx_lock in xennet_connect()
> is confusing to both humans and lockdep. The locking is safe because
> this is the only place where the locks are nested in this way but
> lockdep still warns.
>
> Instead of adding the missing lockdep annotations, refactor the
> locking to avoid the confusing nesting. This is still safe, because
> the xenbus connection state changes are all serialized by the xenwatch
> thread.
Tulin (a GSoC student) is working on adding some concurrency to the
xenwatch thread (possibly eliminating it and replacing it with a
workqueue). This means that we will need to add some sort of
synchronization to the watch callbacks, wherever it is required.
-boris
* Re: [PATCH 1/2] xen-netfront: don't nest queue locks in xennet_connect()
2014-07-02 17:36 ` Boris Ostrovsky
@ 2014-07-03 16:19 ` David Vrabel
0 siblings, 0 replies; 6+ messages in thread
From: David Vrabel @ 2014-07-03 16:19 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: netdev, xen-devel, Konrad Rzeszutek Wilk, Tülin İzer
On 02/07/14 18:36, Boris Ostrovsky wrote:
> On 07/02/2014 11:09 AM, David Vrabel wrote:
>> The nesting of the per-queue rx_lock and tx_lock in xennet_connect()
>> is confusing to both humans and lockdep. The locking is safe because
>> this is the only place where the locks are nested in this way but
>> lockdep still warns.
>>
>> Instead of adding the missing lockdep annotations, refactor the
>> locking to avoid the confusing nesting. This is still safe, because
>> the xenbus connection state changes are all serialized by the xenwatch
>> thread.
>
> Tulin (a GSoC student) is working on adding some concurrency to the
> xenwatch thread (possibly eliminating it and replacing it with a
> workqueue). This means that we will need to add some sort of
> synchronization to the watch callbacks, wherever it is required.
The back/frontend changed callbacks should be serialized with the device
mutex, in the same way that the driver core does for probe, remove, etc.
David
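If the watch callbacks did start running concurrently, the per-device serialization described above could be sketched like this (a hypothetical user-space model with invented names; a pthread mutex stands in for the driver-core device lock):

```c
#include <pthread.h>

/* Per-device state whose transitions must be serialized. */
struct demo_device {
	pthread_mutex_t lock;	/* stand-in for the device lock */
	int state;		/* stand-in for the xenbus state */
};

/* A watch callback takes the per-device lock for the duration of the
 * state change, just as the driver core holds the device lock across
 * probe and remove. */
void demo_otherend_changed(struct demo_device *dev, int new_state)
{
	pthread_mutex_lock(&dev->lock);
	dev->state = new_state;	/* transition is atomic w.r.t. other callbacks */
	pthread_mutex_unlock(&dev->lock);
}
```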
* Re: [PATCHv1 net 0/2] xen-netfront: multi-queue related locking fixes
2014-07-02 15:09 [PATCHv1 net 0/2] xen-netfront: multi-queue related locking fixes David Vrabel
2014-07-02 15:09 ` [PATCH 1/2] xen-netfront: don't nest queue locks in xennet_connect() David Vrabel
2014-07-02 15:09 ` [PATCH 2/2] xen-netfront: call netif_carrier_off() only once when disconnecting David Vrabel
@ 2014-07-08 18:21 ` David Miller
2 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2014-07-08 18:21 UTC (permalink / raw)
To: david.vrabel; +Cc: netdev, xen-devel, konrad.wilk, boris.ostrovsky
From: David Vrabel <david.vrabel@citrix.com>
Date: Wed, 2 Jul 2014 16:09:13 +0100
> Two fixes to the per-queue locking bugs in xen-netfront that were
> introduced in 3.16-rc1 with the multi-queue support.
Series applied, thanks.