netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* xen-netback: fix oopes during shutdown and error handling
@ 2013-02-14 13:18 David Vrabel
  2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: David Vrabel @ 2013-02-14 13:18 UTC (permalink / raw)
  To: xen-devel
  Cc: David Vrabel, Konrad Rzeszutek Wilk, Ian Campbell, Jan Beulich,
	Wei Liu, netdev

These two netback patches fix oopes that may occur during shutdown or
if a specific fatal error occurs.

They are stable candidants.

David

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests()
  2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
@ 2013-02-14 13:18 ` David Vrabel
  2013-02-14 16:39   ` Ian Campbell
  2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: David Vrabel @ 2013-02-14 13:18 UTC (permalink / raw)
  To: xen-devel
  Cc: David Vrabel, Konrad Rzeszutek Wilk, Ian Campbell, Jan Beulich,
	Wei Liu, netdev, Wei Liu, Jan Beulich

From: David Vrabel <david.vrabel@citrix.com>

netbk_count_requests() could detect an error, call
netbk_fatal_tx_error() but return 0.  The vif may then be used
afterwards (e.g., in a call to netbk_tx_error().

Since netbk_fatal_tx_error() could set vif->refcnt to 1, the vif may
be freed immediately after the call to netbk_fatal_tx_error() (e.g.,
if the vif is also removed).

Netback thread              Xenwatch thread
-------------------------------------------
netbk_fatal_tx_err()        netback_remove()
                              xenvif_disconnect()
                                ...
                                free_netdev()
netbk_tx_err() Oops!

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/net/xen-netback/netback.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 2b9520c..cd49ba9 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -911,13 +911,13 @@ static int netbk_count_requests(struct xenvif *vif,
 		if (frags >= work_to_do) {
 			netdev_err(vif->dev, "Need more frags\n");
 			netbk_fatal_tx_err(vif);
-			return -frags;
+			return -ENODATA;
 		}
 
 		if (unlikely(frags >= MAX_SKB_FRAGS)) {
 			netdev_err(vif->dev, "Too many frags\n");
 			netbk_fatal_tx_err(vif);
-			return -frags;
+			return -E2BIG;
 		}
 
 		memcpy(txp, RING_GET_REQUEST(&vif->tx, cons + frags),
@@ -925,7 +925,7 @@ static int netbk_count_requests(struct xenvif *vif,
 		if (txp->size > first->size) {
 			netdev_err(vif->dev, "Frag is bigger than frame.\n");
 			netbk_fatal_tx_err(vif);
-			return -frags;
+			return -EIO;
 		}
 
 		first->size -= txp->size;
@@ -935,7 +935,7 @@ static int netbk_count_requests(struct xenvif *vif,
 			netdev_err(vif->dev, "txp->offset: %x, size: %u\n",
 				 txp->offset, txp->size);
 			netbk_fatal_tx_err(vif);
-			return -frags;
+			return -EINVAL;
 		}
 	} while ((txp++)->flags & XEN_NETTXF_more_data);
 	return frags;
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
  2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
  2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
@ 2013-02-14 13:18 ` David Vrabel
  2013-02-14 13:53   ` [Xen-devel] " Wei Liu
  2013-02-14 16:39   ` Ian Campbell
  2013-02-14 18:21 ` xen-netback: fix oopes during shutdown and error handling David Miller
  2013-02-14 21:57 ` Christopher S. Aker
  3 siblings, 2 replies; 14+ messages in thread
From: David Vrabel @ 2013-02-14 13:18 UTC (permalink / raw)
  To: xen-devel
  Cc: David Vrabel, Konrad Rzeszutek Wilk, Ian Campbell, Jan Beulich,
	Wei Liu, netdev

From: David Vrabel <david.vrabel@citrix.com>

If the credit timer is left armed after calling
xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
the vif which will then oops as vif->netbk == NULL.

This may happen both in the fatal error path and during normal
disconnection from the front end.

The sequencing during shutdown is critical to ensure that: a)
vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
is not freed.

1. Mark as unschedulable (netif_carrier_off()).
2. Synchronously cancel the timer.
3. Remove the vif from the schedule list.
4. Remove it from it netback thread group.
5. Wait for vif->refcnt to become 0.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/net/xen-netback/interface.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index b8c5193..221f426 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
 static void xenvif_down(struct xenvif *vif)
 {
 	disable_irq(vif->irq);
+	del_timer_sync(&vif->credit_timeout);
 	xen_netbk_deschedule_xenvif(vif);
 	xen_netbk_remove_xenvif(vif);
 }
@@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
 	atomic_dec(&vif->refcnt);
 	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
 
-	del_timer_sync(&vif->credit_timeout);
-
 	if (vif->irq)
 		unbind_from_irqhandler(vif->irq, vif);
 
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
  2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
@ 2013-02-14 13:53   ` Wei Liu
  2013-02-14 13:56     ` David Vrabel
  2013-02-14 14:15     ` Jan Beulich
  2013-02-14 16:39   ` Ian Campbell
  1 sibling, 2 replies; 14+ messages in thread
From: Wei Liu @ 2013-02-14 13:53 UTC (permalink / raw)
  To: David Vrabel
  Cc: wei.liu2, xen-devel@lists.xen.org, Ian Campbell,
	Konrad Rzeszutek Wilk, netdev@vger.kernel.org, Jan Beulich,
	Wei Liu (3P)

On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
> 
> If the credit timer is left armed after calling
> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> the vif which will then oops as vif->netbk == NULL.
> 
> This may happen both in the fatal error path and during normal
> disconnection from the front end.
> 
> The sequencing during shutdown is critical to ensure that: a)
> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> is not freed.
> 
> 1. Mark as unschedulable (netif_carrier_off()).
> 2. Synchronously cancel the timer.
> 3. Remove the vif from the schedule list.
> 4. Remove it from it netback thread group.
> 5. Wait for vif->refcnt to become 0.
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

You would need to reinitialize the timer in xenvif_up, given that user
might `ifconfig vifX.X down; ifconfig vifX.X up`.

Another less desired but simpler fix would be leave the timer alone but
check for vif->netbk != NULL in the timer callback.


Wei.

> ---
>  drivers/net/xen-netback/interface.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index b8c5193..221f426 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
>  static void xenvif_down(struct xenvif *vif)
>  {
>  	disable_irq(vif->irq);
> +	del_timer_sync(&vif->credit_timeout);
>  	xen_netbk_deschedule_xenvif(vif);
>  	xen_netbk_remove_xenvif(vif);
>  }
> @@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
>  	atomic_dec(&vif->refcnt);
>  	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
>  
> -	del_timer_sync(&vif->credit_timeout);
> -
>  	if (vif->irq)
>  		unbind_from_irqhandler(vif->irq, vif);
>  

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
  2013-02-14 13:53   ` [Xen-devel] " Wei Liu
@ 2013-02-14 13:56     ` David Vrabel
  2013-02-14 14:10       ` Wei Liu
  2013-02-14 14:15     ` Jan Beulich
  1 sibling, 1 reply; 14+ messages in thread
From: David Vrabel @ 2013-02-14 13:56 UTC (permalink / raw)
  To: Wei Liu
  Cc: xen-devel@lists.xen.org, Ian Campbell, Konrad Rzeszutek Wilk,
	netdev@vger.kernel.org, Jan Beulich, Wei Liu (3P)

On 14/02/13 13:53, Wei Liu wrote:
> On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> If the credit timer is left armed after calling
>> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
>> the vif which will then oops as vif->netbk == NULL.
>>
>> This may happen both in the fatal error path and during normal
>> disconnection from the front end.
>>
>> The sequencing during shutdown is critical to ensure that: a)
>> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
>> is not freed.
>>
>> 1. Mark as unschedulable (netif_carrier_off()).
>> 2. Synchronously cancel the timer.
>> 3. Remove the vif from the schedule list.
>> 4. Remove it from it netback thread group.
>> 5. Wait for vif->refcnt to become 0.
>>
>> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> 
> You would need to reinitialize the timer in xenvif_up, given that user
> might `ifconfig vifX.X down; ifconfig vifX.X up`.

No.  Deleted timers do not need to be reinitialized.  The timer will be
armed as usual with mod_timer() when credit is next exhausted.

David

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
  2013-02-14 13:56     ` David Vrabel
@ 2013-02-14 14:10       ` Wei Liu
  0 siblings, 0 replies; 14+ messages in thread
From: Wei Liu @ 2013-02-14 14:10 UTC (permalink / raw)
  To: David Vrabel
  Cc: wei.liu2, xen-devel@lists.xen.org, Ian Campbell,
	Konrad Rzeszutek Wilk, netdev@vger.kernel.org, Jan Beulich,
	Wei Liu (3P)

On Thu, 2013-02-14 at 13:56 +0000, David Vrabel wrote:
> On 14/02/13 13:53, Wei Liu wrote:
> > On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> >> From: David Vrabel <david.vrabel@citrix.com>
> >>
> >> If the credit timer is left armed after calling
> >> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> >> the vif which will then oops as vif->netbk == NULL.
> >>
> >> This may happen both in the fatal error path and during normal
> >> disconnection from the front end.
> >>
> >> The sequencing during shutdown is critical to ensure that: a)
> >> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> >> is not freed.
> >>
> >> 1. Mark as unschedulable (netif_carrier_off()).
> >> 2. Synchronously cancel the timer.
> >> 3. Remove the vif from the schedule list.
> >> 4. Remove it from it netback thread group.
> >> 5. Wait for vif->refcnt to become 0.
> >>
> >> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> > 
> > You would need to reinitialize the timer in xenvif_up, given that user
> > might `ifconfig vifX.X down; ifconfig vifX.X up`.
> 
> No.  Deleted timers do not need to be reinitialized.  The timer will be

What I really meant was to "rearm"...

> armed as usual with mod_timer() when credit is next exhausted.
> 

Ah, ok.


Wei.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
  2013-02-14 13:53   ` [Xen-devel] " Wei Liu
  2013-02-14 13:56     ` David Vrabel
@ 2013-02-14 14:15     ` Jan Beulich
  2013-02-14 14:21       ` Wei Liu
  1 sibling, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2013-02-14 14:15 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: David Vrabel, Wei Liu (3P), xen-devel@lists.xen.org,
	KonradRzeszutek Wilk, netdev@vger.kernel.org

>>> On 14.02.13 at 14:53, Wei Liu <wei.liu2@citrix.com> wrote:
> On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>> 
>> If the credit timer is left armed after calling
>> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
>> the vif which will then oops as vif->netbk == NULL.
>> 
>> This may happen both in the fatal error path and during normal
>> disconnection from the front end.
>> 
>> The sequencing during shutdown is critical to ensure that: a)
>> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
>> is not freed.
>> 
>> 1. Mark as unschedulable (netif_carrier_off()).
>> 2. Synchronously cancel the timer.
>> 3. Remove the vif from the schedule list.
>> 4. Remove it from it netback thread group.
>> 5. Wait for vif->refcnt to become 0.
>> 
>> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> 
> You would need to reinitialize the timer in xenvif_up, given that user
> might `ifconfig vifX.X down; ifconfig vifX.X up`.

Which gets us to another aspect of the original fix that I don't
think was considered: Is there anything preventing the interface
to be brought back up after fatal_tx_err() shut it down?

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
  2013-02-14 14:15     ` Jan Beulich
@ 2013-02-14 14:21       ` Wei Liu
  0 siblings, 0 replies; 14+ messages in thread
From: Wei Liu @ 2013-02-14 14:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, Ian Campbell, David Vrabel, xen-devel@lists.xen.org,
	KonradRzeszutek Wilk, netdev@vger.kernel.org

On Thu, 2013-02-14 at 14:15 +0000, Jan Beulich wrote:
> >>> On 14.02.13 at 14:53, Wei Liu <wei.liu2@citrix.com> wrote:
> > On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> >> From: David Vrabel <david.vrabel@citrix.com>
> >> 
> >> If the credit timer is left armed after calling
> >> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> >> the vif which will then oops as vif->netbk == NULL.
> >> 
> >> This may happen both in the fatal error path and during normal
> >> disconnection from the front end.
> >> 
> >> The sequencing during shutdown is critical to ensure that: a)
> >> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> >> is not freed.
> >> 
> >> 1. Mark as unschedulable (netif_carrier_off()).
> >> 2. Synchronously cancel the timer.
> >> 3. Remove the vif from the schedule list.
> >> 4. Remove it from it netback thread group.
> >> 5. Wait for vif->refcnt to become 0.
> >> 
> >> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> > 
> > You would need to reinitialize the timer in xenvif_up, given that user
> > might `ifconfig vifX.X down; ifconfig vifX.X up`.
> 
> Which gets us to another aspect of the original fix that I don't
> think was considered: Is there anything preventing the interface
> to be brought back up after fatal_tx_err() shut it down?
> 

I don't think so. Code could / should not prevent host admin from doing
anything he wants - even it is re-enabling a malicious vif. ;-)


Wei.

> Jan
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests()
  2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
@ 2013-02-14 16:39   ` Ian Campbell
  0 siblings, 0 replies; 14+ messages in thread
From: Ian Campbell @ 2013-02-14 16:39 UTC (permalink / raw)
  To: David Vrabel, Christopher S. Aker
  Cc: xen-devel@lists.xen.org, Konrad Rzeszutek Wilk, Jan Beulich,
	Wei Liu (3P), netdev@vger.kernel.org, Wei Liu

On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
> 
> netbk_count_requests() could detect an error, call
> netbk_fatal_tx_error() but return 0.  The vif may then be used
> afterwards (e.g., in a call to netbk_tx_error().
> 
> Since netbk_fatal_tx_error() could set vif->refcnt to 1, the vif may
> be freed immediately after the call to netbk_fatal_tx_error() (e.g.,
> if the vif is also removed).
> 
> Netback thread              Xenwatch thread
> -------------------------------------------
> netbk_fatal_tx_err()        netback_remove()
>                               xenvif_disconnect()
>                                 ...
>                                 free_netdev()
> netbk_tx_err() Oops!
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Jan Beulich <JBeulich@suse.com>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Reported-by: Christopher S. Aker <caker@theshore.net>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

Although I would like to add a Tested-by: Christopher too if possible.

David (M) this should go to stable please.

> ---
>  drivers/net/xen-netback/netback.c |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 2b9520c..cd49ba9 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -911,13 +911,13 @@ static int netbk_count_requests(struct xenvif *vif,
>  		if (frags >= work_to_do) {
>  			netdev_err(vif->dev, "Need more frags\n");
>  			netbk_fatal_tx_err(vif);
> -			return -frags;
> +			return -ENODATA;
>  		}
>  
>  		if (unlikely(frags >= MAX_SKB_FRAGS)) {
>  			netdev_err(vif->dev, "Too many frags\n");
>  			netbk_fatal_tx_err(vif);
> -			return -frags;
> +			return -E2BIG;
>  		}
>  
>  		memcpy(txp, RING_GET_REQUEST(&vif->tx, cons + frags),
> @@ -925,7 +925,7 @@ static int netbk_count_requests(struct xenvif *vif,
>  		if (txp->size > first->size) {
>  			netdev_err(vif->dev, "Frag is bigger than frame.\n");
>  			netbk_fatal_tx_err(vif);
> -			return -frags;
> +			return -EIO;
>  		}
>  
>  		first->size -= txp->size;
> @@ -935,7 +935,7 @@ static int netbk_count_requests(struct xenvif *vif,
>  			netdev_err(vif->dev, "txp->offset: %x, size: %u\n",
>  				 txp->offset, txp->size);
>  			netbk_fatal_tx_err(vif);
> -			return -frags;
> +			return -EINVAL;
>  		}
>  	} while ((txp++)->flags & XEN_NETTXF_more_data);
>  	return frags;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
  2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
  2013-02-14 13:53   ` [Xen-devel] " Wei Liu
@ 2013-02-14 16:39   ` Ian Campbell
  2013-02-14 16:48     ` [Xen-devel] " Wei Liu
  1 sibling, 1 reply; 14+ messages in thread
From: Ian Campbell @ 2013-02-14 16:39 UTC (permalink / raw)
  To: David Vrabel, Christopher S. Aker
  Cc: xen-devel@lists.xen.org, Konrad Rzeszutek Wilk, Jan Beulich,
	Wei Liu (3P), netdev@vger.kernel.org

On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
> 
> If the credit timer is left armed after calling
> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> the vif which will then oops as vif->netbk == NULL.
> 
> This may happen both in the fatal error path and during normal
> disconnection from the front end.
> 
> The sequencing during shutdown is critical to ensure that: a)
> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> is not freed.
> 
> 1. Mark as unschedulable (netif_carrier_off()).
> 2. Synchronously cancel the timer.
> 3. Remove the vif from the schedule list.
> 4. Remove it from it netback thread group.
> 5. Wait for vif->refcnt to become 0.
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Was this one also Reported-by Christopher S. Aker or was it just
discovered in the process of investigating?

Another stable candidate please Dave.

> ---
>  drivers/net/xen-netback/interface.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index b8c5193..221f426 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
>  static void xenvif_down(struct xenvif *vif)
>  {
>  	disable_irq(vif->irq);
> +	del_timer_sync(&vif->credit_timeout);
>  	xen_netbk_deschedule_xenvif(vif);
>  	xen_netbk_remove_xenvif(vif);
>  }
> @@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
>  	atomic_dec(&vif->refcnt);
>  	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
>  
> -	del_timer_sync(&vif->credit_timeout);
> -
>  	if (vif->irq)
>  		unbind_from_irqhandler(vif->irq, vif);
>  

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
  2013-02-14 16:39   ` Ian Campbell
@ 2013-02-14 16:48     ` Wei Liu
  0 siblings, 0 replies; 14+ messages in thread
From: Wei Liu @ 2013-02-14 16:48 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, David Vrabel, Christopher S. Aker,
	netdev@vger.kernel.org, Konrad Rzeszutek Wilk, Wei Liu (3P),
	Jan Beulich, xen-devel@lists.xen.org

On Thu, 2013-02-14 at 16:39 +0000, Ian Campbell wrote:
> On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> > From: David Vrabel <david.vrabel@citrix.com>
> > 
> > If the credit timer is left armed after calling
> > xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> > the vif which will then oops as vif->netbk == NULL.
> > 
> > This may happen both in the fatal error path and during normal
> > disconnection from the front end.
> > 
> > The sequencing during shutdown is critical to ensure that: a)
> > vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> > is not freed.
> > 
> > 1. Mark as unschedulable (netif_carrier_off()).
> > 2. Synchronously cancel the timer.
> > 3. Remove the vif from the schedule list.
> > 4. Remove it from it netback thread group.
> > 5. Wait for vif->refcnt to become 0.
> > 
> > Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Was this one also Reported-by Christopher S. Aker or was it just
> discovered in the process of investigating?
> 

His bug report did prod me to look into this, so I think it is worth
adding

Reported-by: Christopher S. Aker <caker@theshore.net>


Wei.

> Another stable candidate please Dave.
> 
> > ---
> >  drivers/net/xen-netback/interface.c |    3 +--
> >  1 files changed, 1 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> > index b8c5193..221f426 100644
> > --- a/drivers/net/xen-netback/interface.c
> > +++ b/drivers/net/xen-netback/interface.c
> > @@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
> >  static void xenvif_down(struct xenvif *vif)
> >  {
> >  	disable_irq(vif->irq);
> > +	del_timer_sync(&vif->credit_timeout);
> >  	xen_netbk_deschedule_xenvif(vif);
> >  	xen_netbk_remove_xenvif(vif);
> >  }
> > @@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
> >  	atomic_dec(&vif->refcnt);
> >  	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
> >  
> > -	del_timer_sync(&vif->credit_timeout);
> > -
> >  	if (vif->irq)
> >  		unbind_from_irqhandler(vif->irq, vif);
> >  
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xen-netback: fix oopes during shutdown and error handling
  2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
  2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
  2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
@ 2013-02-14 18:21 ` David Miller
  2013-02-14 21:57 ` Christopher S. Aker
  3 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2013-02-14 18:21 UTC (permalink / raw)
  To: david.vrabel
  Cc: xen-devel, konrad.wilk, ian.campbell, jbeulich, wei.liu, netdev

From: David Vrabel <david.vrabel@citrix.com>
Date: Thu, 14 Feb 2013 13:18:56 +0000

> These two netback patches fix oopes that may occur during shutdown or
> if a specific fatal error occurs.
> 
> They are stable candidants.

Both applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xen-netback: fix oopes during shutdown and error handling
  2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
                   ` (2 preceding siblings ...)
  2013-02-14 18:21 ` xen-netback: fix oopes during shutdown and error handling David Miller
@ 2013-02-14 21:57 ` Christopher S. Aker
  2013-02-15  8:45   ` [Xen-devel] " Ian Campbell
  3 siblings, 1 reply; 14+ messages in thread
From: Christopher S. Aker @ 2013-02-14 21:57 UTC (permalink / raw)
  To: David Vrabel
  Cc: Ian Campbell, Konrad Rzeszutek Wilk, netdev, xen-devel,
	Jan Beulich, Wei Liu

On 2/14/13 8:18 AM, David Vrabel wrote:
> These two netback patches fix oopes that may occur during shutdown or
> if a specific fatal error occurs.

On 2/14/13 11:39 AM, Ian Campbell wrote:
> Although I would like to add a Tested-by: Christopher too if
> possible.

Tested-by: Christopher S. Aker <caker@theshore.net> (and team)

We are no longer able to trigger the last OOPs we provided by downing 
the vif.  Further testing of the other issues will take time, but we're 
deploying a few patched hosts now.

Thanks!
-Chris

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Xen-devel] xen-netback: fix oopes during shutdown and error handling
  2013-02-14 21:57 ` Christopher S. Aker
@ 2013-02-15  8:45   ` Ian Campbell
  0 siblings, 0 replies; 14+ messages in thread
From: Ian Campbell @ 2013-02-15  8:45 UTC (permalink / raw)
  To: Christopher S. Aker
  Cc: David Vrabel, xen-devel@lists.xen.org, Konrad Rzeszutek Wilk,
	netdev@vger.kernel.org, Jan Beulich, Wei Liu (3P)

On Thu, 2013-02-14 at 21:57 +0000, Christopher S. Aker wrote:
> On 2/14/13 8:18 AM, David Vrabel wrote:
> > These two netback patches fix oopes that may occur during shutdown or
> > if a specific fatal error occurs.
> 
> On 2/14/13 11:39 AM, Ian Campbell wrote:
> > Although I would like to add a Tested-by: Christopher too if
> > possible.
> 
> Tested-by: Christopher S. Aker <caker@theshore.net> (and team)

Thank you!

> We are no longer able to trigger the last OOPs we provided by downing 
> the vif.  Further testing of the other issues will take time, but we're 
> deploying a few patched hosts now.

Great, please let me know how you get on.

Ian.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-02-15  8:45 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
2013-02-14 16:39   ` Ian Campbell
2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
2013-02-14 13:53   ` [Xen-devel] " Wei Liu
2013-02-14 13:56     ` David Vrabel
2013-02-14 14:10       ` Wei Liu
2013-02-14 14:15     ` Jan Beulich
2013-02-14 14:21       ` Wei Liu
2013-02-14 16:39   ` Ian Campbell
2013-02-14 16:48     ` [Xen-devel] " Wei Liu
2013-02-14 18:21 ` xen-netback: fix oopes during shutdown and error handling David Miller
2013-02-14 21:57 ` Christopher S. Aker
2013-02-15  8:45   ` [Xen-devel] " Ian Campbell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).