* xen-netback: fix oopes during shutdown and error handling
@ 2013-02-14 13:18 David Vrabel
2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: David Vrabel @ 2013-02-14 13:18 UTC (permalink / raw)
To: xen-devel
Cc: David Vrabel, Konrad Rzeszutek Wilk, Ian Campbell, Jan Beulich,
Wei Liu, netdev
These two netback patches fix oopes that may occur during shutdown or
if a specific fatal error occurs.
They are stable candidants.
David
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests()
2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
@ 2013-02-14 13:18 ` David Vrabel
2013-02-14 16:39 ` Ian Campbell
2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: David Vrabel @ 2013-02-14 13:18 UTC (permalink / raw)
To: xen-devel
Cc: David Vrabel, Konrad Rzeszutek Wilk, Ian Campbell, Jan Beulich,
Wei Liu, netdev, Wei Liu, Jan Beulich
From: David Vrabel <david.vrabel@citrix.com>
netbk_count_requests() could detect an error, call
netbk_fatal_tx_error() but return 0. The vif may then be used
afterwards (e.g., in a call to netbk_tx_error().
Since netbk_fatal_tx_error() could set vif->refcnt to 1, the vif may
be freed immediately after the call to netbk_fatal_tx_error() (e.g.,
if the vif is also removed).
Netback thread Xenwatch thread
-------------------------------------------
netbk_fatal_tx_err() netback_remove()
xenvif_disconnect()
...
free_netdev()
netbk_tx_err() Oops!
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
drivers/net/xen-netback/netback.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 2b9520c..cd49ba9 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -911,13 +911,13 @@ static int netbk_count_requests(struct xenvif *vif,
if (frags >= work_to_do) {
netdev_err(vif->dev, "Need more frags\n");
netbk_fatal_tx_err(vif);
- return -frags;
+ return -ENODATA;
}
if (unlikely(frags >= MAX_SKB_FRAGS)) {
netdev_err(vif->dev, "Too many frags\n");
netbk_fatal_tx_err(vif);
- return -frags;
+ return -E2BIG;
}
memcpy(txp, RING_GET_REQUEST(&vif->tx, cons + frags),
@@ -925,7 +925,7 @@ static int netbk_count_requests(struct xenvif *vif,
if (txp->size > first->size) {
netdev_err(vif->dev, "Frag is bigger than frame.\n");
netbk_fatal_tx_err(vif);
- return -frags;
+ return -EIO;
}
first->size -= txp->size;
@@ -935,7 +935,7 @@ static int netbk_count_requests(struct xenvif *vif,
netdev_err(vif->dev, "txp->offset: %x, size: %u\n",
txp->offset, txp->size);
netbk_fatal_tx_err(vif);
- return -frags;
+ return -EINVAL;
}
} while ((txp++)->flags & XEN_NETTXF_more_data);
return frags;
--
1.7.2.5
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
@ 2013-02-14 13:18 ` David Vrabel
2013-02-14 13:53 ` [Xen-devel] " Wei Liu
2013-02-14 16:39 ` Ian Campbell
2013-02-14 18:21 ` xen-netback: fix oopes during shutdown and error handling David Miller
2013-02-14 21:57 ` Christopher S. Aker
3 siblings, 2 replies; 14+ messages in thread
From: David Vrabel @ 2013-02-14 13:18 UTC (permalink / raw)
To: xen-devel
Cc: David Vrabel, Konrad Rzeszutek Wilk, Ian Campbell, Jan Beulich,
Wei Liu, netdev
From: David Vrabel <david.vrabel@citrix.com>
If the credit timer is left armed after calling
xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
the vif which will then oops as vif->netbk == NULL.
This may happen both in the fatal error path and during normal
disconnection from the front end.
The sequencing during shutdown is critical to ensure that: a)
vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
is not freed.
1. Mark as unschedulable (netif_carrier_off()).
2. Synchronously cancel the timer.
3. Remove the vif from the schedule list.
4. Remove it from it netback thread group.
5. Wait for vif->refcnt to become 0.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
drivers/net/xen-netback/interface.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index b8c5193..221f426 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
static void xenvif_down(struct xenvif *vif)
{
disable_irq(vif->irq);
+ del_timer_sync(&vif->credit_timeout);
xen_netbk_deschedule_xenvif(vif);
xen_netbk_remove_xenvif(vif);
}
@@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
atomic_dec(&vif->refcnt);
wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
- del_timer_sync(&vif->credit_timeout);
-
if (vif->irq)
unbind_from_irqhandler(vif->irq, vif);
--
1.7.2.5
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
@ 2013-02-14 13:53 ` Wei Liu
2013-02-14 13:56 ` David Vrabel
2013-02-14 14:15 ` Jan Beulich
2013-02-14 16:39 ` Ian Campbell
1 sibling, 2 replies; 14+ messages in thread
From: Wei Liu @ 2013-02-14 13:53 UTC (permalink / raw)
To: David Vrabel
Cc: wei.liu2, xen-devel@lists.xen.org, Ian Campbell,
Konrad Rzeszutek Wilk, netdev@vger.kernel.org, Jan Beulich,
Wei Liu (3P)
On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> If the credit timer is left armed after calling
> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> the vif which will then oops as vif->netbk == NULL.
>
> This may happen both in the fatal error path and during normal
> disconnection from the front end.
>
> The sequencing during shutdown is critical to ensure that: a)
> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> is not freed.
>
> 1. Mark as unschedulable (netif_carrier_off()).
> 2. Synchronously cancel the timer.
> 3. Remove the vif from the schedule list.
> 4. Remove it from it netback thread group.
> 5. Wait for vif->refcnt to become 0.
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
You would need to reinitialize the timer in xenvif_up, given that user
might `ifconfig vifX.X down; ifconfig vifX.X up`.
Another less desired but simpler fix would be leave the timer alone but
check for vif->netbk != NULL in the timer callback.
Wei.
> ---
> drivers/net/xen-netback/interface.c | 3 +--
> 1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index b8c5193..221f426 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
> static void xenvif_down(struct xenvif *vif)
> {
> disable_irq(vif->irq);
> + del_timer_sync(&vif->credit_timeout);
> xen_netbk_deschedule_xenvif(vif);
> xen_netbk_remove_xenvif(vif);
> }
> @@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
> atomic_dec(&vif->refcnt);
> wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
>
> - del_timer_sync(&vif->credit_timeout);
> -
> if (vif->irq)
> unbind_from_irqhandler(vif->irq, vif);
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
2013-02-14 13:53 ` [Xen-devel] " Wei Liu
@ 2013-02-14 13:56 ` David Vrabel
2013-02-14 14:10 ` Wei Liu
2013-02-14 14:15 ` Jan Beulich
1 sibling, 1 reply; 14+ messages in thread
From: David Vrabel @ 2013-02-14 13:56 UTC (permalink / raw)
To: Wei Liu
Cc: xen-devel@lists.xen.org, Ian Campbell, Konrad Rzeszutek Wilk,
netdev@vger.kernel.org, Jan Beulich, Wei Liu (3P)
On 14/02/13 13:53, Wei Liu wrote:
> On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> If the credit timer is left armed after calling
>> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
>> the vif which will then oops as vif->netbk == NULL.
>>
>> This may happen both in the fatal error path and during normal
>> disconnection from the front end.
>>
>> The sequencing during shutdown is critical to ensure that: a)
>> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
>> is not freed.
>>
>> 1. Mark as unschedulable (netif_carrier_off()).
>> 2. Synchronously cancel the timer.
>> 3. Remove the vif from the schedule list.
>> 4. Remove it from it netback thread group.
>> 5. Wait for vif->refcnt to become 0.
>>
>> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
>
> You would need to reinitialize the timer in xenvif_up, given that user
> might `ifconfig vifX.X down; ifconfig vifX.X up`.
No. Deleted timers do not need to be reinitialized. The timer will be
armed as usual with mod_timer() when credit is next exhausted.
David
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
2013-02-14 13:56 ` David Vrabel
@ 2013-02-14 14:10 ` Wei Liu
0 siblings, 0 replies; 14+ messages in thread
From: Wei Liu @ 2013-02-14 14:10 UTC (permalink / raw)
To: David Vrabel
Cc: wei.liu2, xen-devel@lists.xen.org, Ian Campbell,
Konrad Rzeszutek Wilk, netdev@vger.kernel.org, Jan Beulich,
Wei Liu (3P)
On Thu, 2013-02-14 at 13:56 +0000, David Vrabel wrote:
> On 14/02/13 13:53, Wei Liu wrote:
> > On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> >> From: David Vrabel <david.vrabel@citrix.com>
> >>
> >> If the credit timer is left armed after calling
> >> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> >> the vif which will then oops as vif->netbk == NULL.
> >>
> >> This may happen both in the fatal error path and during normal
> >> disconnection from the front end.
> >>
> >> The sequencing during shutdown is critical to ensure that: a)
> >> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> >> is not freed.
> >>
> >> 1. Mark as unschedulable (netif_carrier_off()).
> >> 2. Synchronously cancel the timer.
> >> 3. Remove the vif from the schedule list.
> >> 4. Remove it from it netback thread group.
> >> 5. Wait for vif->refcnt to become 0.
> >>
> >> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> >
> > You would need to reinitialize the timer in xenvif_up, given that user
> > might `ifconfig vifX.X down; ifconfig vifX.X up`.
>
> No. Deleted timers do not need to be reinitialized. The timer will be
What I really meant was to "rearm"...
> armed as usual with mod_timer() when credit is next exhausted.
>
Ah, ok.
Wei.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
2013-02-14 13:53 ` [Xen-devel] " Wei Liu
2013-02-14 13:56 ` David Vrabel
@ 2013-02-14 14:15 ` Jan Beulich
2013-02-14 14:21 ` Wei Liu
1 sibling, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2013-02-14 14:15 UTC (permalink / raw)
To: Ian Campbell, Wei Liu
Cc: David Vrabel, Wei Liu (3P), xen-devel@lists.xen.org,
KonradRzeszutek Wilk, netdev@vger.kernel.org
>>> On 14.02.13 at 14:53, Wei Liu <wei.liu2@citrix.com> wrote:
> On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> If the credit timer is left armed after calling
>> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
>> the vif which will then oops as vif->netbk == NULL.
>>
>> This may happen both in the fatal error path and during normal
>> disconnection from the front end.
>>
>> The sequencing during shutdown is critical to ensure that: a)
>> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
>> is not freed.
>>
>> 1. Mark as unschedulable (netif_carrier_off()).
>> 2. Synchronously cancel the timer.
>> 3. Remove the vif from the schedule list.
>> 4. Remove it from it netback thread group.
>> 5. Wait for vif->refcnt to become 0.
>>
>> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
>
> You would need to reinitialize the timer in xenvif_up, given that user
> might `ifconfig vifX.X down; ifconfig vifX.X up`.
Which gets us to another aspect of the original fix that I don't
think was considered: Is there anything preventing the interface
to be brought back up after fatal_tx_err() shut it down?
Jan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
2013-02-14 14:15 ` Jan Beulich
@ 2013-02-14 14:21 ` Wei Liu
0 siblings, 0 replies; 14+ messages in thread
From: Wei Liu @ 2013-02-14 14:21 UTC (permalink / raw)
To: Jan Beulich
Cc: wei.liu2, Ian Campbell, David Vrabel, xen-devel@lists.xen.org,
KonradRzeszutek Wilk, netdev@vger.kernel.org
On Thu, 2013-02-14 at 14:15 +0000, Jan Beulich wrote:
> >>> On 14.02.13 at 14:53, Wei Liu <wei.liu2@citrix.com> wrote:
> > On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> >> From: David Vrabel <david.vrabel@citrix.com>
> >>
> >> If the credit timer is left armed after calling
> >> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> >> the vif which will then oops as vif->netbk == NULL.
> >>
> >> This may happen both in the fatal error path and during normal
> >> disconnection from the front end.
> >>
> >> The sequencing during shutdown is critical to ensure that: a)
> >> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> >> is not freed.
> >>
> >> 1. Mark as unschedulable (netif_carrier_off()).
> >> 2. Synchronously cancel the timer.
> >> 3. Remove the vif from the schedule list.
> >> 4. Remove it from it netback thread group.
> >> 5. Wait for vif->refcnt to become 0.
> >>
> >> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> >
> > You would need to reinitialize the timer in xenvif_up, given that user
> > might `ifconfig vifX.X down; ifconfig vifX.X up`.
>
> Which gets us to another aspect of the original fix that I don't
> think was considered: Is there anything preventing the interface
> to be brought back up after fatal_tx_err() shut it down?
>
I don't think so. Code could / should not prevent host admin from doing
anything he wants - even it is re-enabling a malicious vif. ;-)
Wei.
> Jan
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests()
2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
@ 2013-02-14 16:39 ` Ian Campbell
0 siblings, 0 replies; 14+ messages in thread
From: Ian Campbell @ 2013-02-14 16:39 UTC (permalink / raw)
To: David Vrabel, Christopher S. Aker
Cc: xen-devel@lists.xen.org, Konrad Rzeszutek Wilk, Jan Beulich,
Wei Liu (3P), netdev@vger.kernel.org, Wei Liu
On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> netbk_count_requests() could detect an error, call
> netbk_fatal_tx_error() but return 0. The vif may then be used
> afterwards (e.g., in a call to netbk_tx_error().
>
> Since netbk_fatal_tx_error() could set vif->refcnt to 1, the vif may
> be freed immediately after the call to netbk_fatal_tx_error() (e.g.,
> if the vif is also removed).
>
> Netback thread Xenwatch thread
> -------------------------------------------
> netbk_fatal_tx_err() netback_remove()
> xenvif_disconnect()
> ...
> free_netdev()
> netbk_tx_err() Oops!
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Jan Beulich <JBeulich@suse.com>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reported-by: Christopher S. Aker <caker@theshore.net>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Although I would like to add a Tested-by: Christopher too if possible.
David (M) this should go to stable please.
> ---
> drivers/net/xen-netback/netback.c | 8 ++++----
> 1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 2b9520c..cd49ba9 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -911,13 +911,13 @@ static int netbk_count_requests(struct xenvif *vif,
> if (frags >= work_to_do) {
> netdev_err(vif->dev, "Need more frags\n");
> netbk_fatal_tx_err(vif);
> - return -frags;
> + return -ENODATA;
> }
>
> if (unlikely(frags >= MAX_SKB_FRAGS)) {
> netdev_err(vif->dev, "Too many frags\n");
> netbk_fatal_tx_err(vif);
> - return -frags;
> + return -E2BIG;
> }
>
> memcpy(txp, RING_GET_REQUEST(&vif->tx, cons + frags),
> @@ -925,7 +925,7 @@ static int netbk_count_requests(struct xenvif *vif,
> if (txp->size > first->size) {
> netdev_err(vif->dev, "Frag is bigger than frame.\n");
> netbk_fatal_tx_err(vif);
> - return -frags;
> + return -EIO;
> }
>
> first->size -= txp->size;
> @@ -935,7 +935,7 @@ static int netbk_count_requests(struct xenvif *vif,
> netdev_err(vif->dev, "txp->offset: %x, size: %u\n",
> txp->offset, txp->size);
> netbk_fatal_tx_err(vif);
> - return -frags;
> + return -EINVAL;
> }
> } while ((txp++)->flags & XEN_NETTXF_more_data);
> return frags;
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
2013-02-14 13:53 ` [Xen-devel] " Wei Liu
@ 2013-02-14 16:39 ` Ian Campbell
2013-02-14 16:48 ` [Xen-devel] " Wei Liu
1 sibling, 1 reply; 14+ messages in thread
From: Ian Campbell @ 2013-02-14 16:39 UTC (permalink / raw)
To: David Vrabel, Christopher S. Aker
Cc: xen-devel@lists.xen.org, Konrad Rzeszutek Wilk, Jan Beulich,
Wei Liu (3P), netdev@vger.kernel.org
On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> If the credit timer is left armed after calling
> xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> the vif which will then oops as vif->netbk == NULL.
>
> This may happen both in the fatal error path and during normal
> disconnection from the front end.
>
> The sequencing during shutdown is critical to ensure that: a)
> vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> is not freed.
>
> 1. Mark as unschedulable (netif_carrier_off()).
> 2. Synchronously cancel the timer.
> 3. Remove the vif from the schedule list.
> 4. Remove it from it netback thread group.
> 5. Wait for vif->refcnt to become 0.
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Was this one also Reported-by Christopher S. Aker or was it just
discovered in the process of investigating?
Another stable candidate please Dave.
> ---
> drivers/net/xen-netback/interface.c | 3 +--
> 1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index b8c5193..221f426 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
> static void xenvif_down(struct xenvif *vif)
> {
> disable_irq(vif->irq);
> + del_timer_sync(&vif->credit_timeout);
> xen_netbk_deschedule_xenvif(vif);
> xen_netbk_remove_xenvif(vif);
> }
> @@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
> atomic_dec(&vif->refcnt);
> wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
>
> - del_timer_sync(&vif->credit_timeout);
> -
> if (vif->irq)
> unbind_from_irqhandler(vif->irq, vif);
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down
2013-02-14 16:39 ` Ian Campbell
@ 2013-02-14 16:48 ` Wei Liu
0 siblings, 0 replies; 14+ messages in thread
From: Wei Liu @ 2013-02-14 16:48 UTC (permalink / raw)
To: Ian Campbell
Cc: wei.liu2, David Vrabel, Christopher S. Aker,
netdev@vger.kernel.org, Konrad Rzeszutek Wilk, Wei Liu (3P),
Jan Beulich, xen-devel@lists.xen.org
On Thu, 2013-02-14 at 16:39 +0000, Ian Campbell wrote:
> On Thu, 2013-02-14 at 13:18 +0000, David Vrabel wrote:
> > From: David Vrabel <david.vrabel@citrix.com>
> >
> > If the credit timer is left armed after calling
> > xen_netbk_remove_xenvif(), then it may fire and attempt to schedule
> > the vif which will then oops as vif->netbk == NULL.
> >
> > This may happen both in the fatal error path and during normal
> > disconnection from the front end.
> >
> > The sequencing during shutdown is critical to ensure that: a)
> > vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif
> > is not freed.
> >
> > 1. Mark as unschedulable (netif_carrier_off()).
> > 2. Synchronously cancel the timer.
> > 3. Remove the vif from the schedule list.
> > 4. Remove it from it netback thread group.
> > 5. Wait for vif->refcnt to become 0.
> >
> > Signed-off-by: David Vrabel <david.vrabel@citrix.com>
>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>
> Was this one also Reported-by Christopher S. Aker or was it just
> discovered in the process of investigating?
>
His bug report did prod me to look into this, so I think it is worth
adding
Reported-by: Christopher S. Aker <caker@theshore.net>
Wei.
> Another stable candidate please Dave.
>
> > ---
> > drivers/net/xen-netback/interface.c | 3 +--
> > 1 files changed, 1 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> > index b8c5193..221f426 100644
> > --- a/drivers/net/xen-netback/interface.c
> > +++ b/drivers/net/xen-netback/interface.c
> > @@ -132,6 +132,7 @@ static void xenvif_up(struct xenvif *vif)
> > static void xenvif_down(struct xenvif *vif)
> > {
> > disable_irq(vif->irq);
> > + del_timer_sync(&vif->credit_timeout);
> > xen_netbk_deschedule_xenvif(vif);
> > xen_netbk_remove_xenvif(vif);
> > }
> > @@ -363,8 +364,6 @@ void xenvif_disconnect(struct xenvif *vif)
> > atomic_dec(&vif->refcnt);
> > wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
> >
> > - del_timer_sync(&vif->credit_timeout);
> > -
> > if (vif->irq)
> > unbind_from_irqhandler(vif->irq, vif);
> >
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: xen-netback: fix oopes during shutdown and error handling
2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
@ 2013-02-14 18:21 ` David Miller
2013-02-14 21:57 ` Christopher S. Aker
3 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2013-02-14 18:21 UTC (permalink / raw)
To: david.vrabel
Cc: xen-devel, konrad.wilk, ian.campbell, jbeulich, wei.liu, netdev
From: David Vrabel <david.vrabel@citrix.com>
Date: Thu, 14 Feb 2013 13:18:56 +0000
> These two netback patches fix oopes that may occur during shutdown or
> if a specific fatal error occurs.
>
> They are stable candidants.
Both applied and queued up for -stable, thanks.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: xen-netback: fix oopes during shutdown and error handling
2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
` (2 preceding siblings ...)
2013-02-14 18:21 ` xen-netback: fix oopes during shutdown and error handling David Miller
@ 2013-02-14 21:57 ` Christopher S. Aker
2013-02-15 8:45 ` [Xen-devel] " Ian Campbell
3 siblings, 1 reply; 14+ messages in thread
From: Christopher S. Aker @ 2013-02-14 21:57 UTC (permalink / raw)
To: David Vrabel
Cc: Ian Campbell, Konrad Rzeszutek Wilk, netdev, xen-devel,
Jan Beulich, Wei Liu
On 2/14/13 8:18 AM, David Vrabel wrote:
> These two netback patches fix oopes that may occur during shutdown or
> if a specific fatal error occurs.
On 2/14/13 11:39 AM, Ian Campbell wrote:
> Although I would like to add a Tested-by: Christopher too if
> possible.
Tested-by: Christopher S. Aker <caker@theshore.net> (and team)
We are no longer able to trigger the last OOPs we provided by downing
the vif. Further testing of the other issues will take time, but we're
deploying a few patched hosts now.
Thanks!
-Chris
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] xen-netback: fix oopes during shutdown and error handling
2013-02-14 21:57 ` Christopher S. Aker
@ 2013-02-15 8:45 ` Ian Campbell
0 siblings, 0 replies; 14+ messages in thread
From: Ian Campbell @ 2013-02-15 8:45 UTC (permalink / raw)
To: Christopher S. Aker
Cc: David Vrabel, xen-devel@lists.xen.org, Konrad Rzeszutek Wilk,
netdev@vger.kernel.org, Jan Beulich, Wei Liu (3P)
On Thu, 2013-02-14 at 21:57 +0000, Christopher S. Aker wrote:
> On 2/14/13 8:18 AM, David Vrabel wrote:
> > These two netback patches fix oopes that may occur during shutdown or
> > if a specific fatal error occurs.
>
> On 2/14/13 11:39 AM, Ian Campbell wrote:
> > Although I would like to add a Tested-by: Christopher too if
> > possible.
>
> Tested-by: Christopher S. Aker <caker@theshore.net> (and team)
Thank you!
> We are no longer able to trigger the last OOPs we provided by downing
> the vif. Further testing of the other issues will take time, but we're
> deploying a few patched hosts now.
Great, please let me know how you get on.
Ian.
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2013-02-15 8:45 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-02-14 13:18 xen-netback: fix oopes during shutdown and error handling David Vrabel
2013-02-14 13:18 ` [PATCH 1/2] xen-netback: correctly return errors from netbk_count_requests() David Vrabel
2013-02-14 16:39 ` Ian Campbell
2013-02-14 13:18 ` [PATCH 2/2] xen-netback: cancel the credit timer when taking the vif down David Vrabel
2013-02-14 13:53 ` [Xen-devel] " Wei Liu
2013-02-14 13:56 ` David Vrabel
2013-02-14 14:10 ` Wei Liu
2013-02-14 14:15 ` Jan Beulich
2013-02-14 14:21 ` Wei Liu
2013-02-14 16:39 ` Ian Campbell
2013-02-14 16:48 ` [Xen-devel] " Wei Liu
2013-02-14 18:21 ` xen-netback: fix oopes during shutdown and error handling David Miller
2013-02-14 21:57 ` Christopher S. Aker
2013-02-15 8:45 ` [Xen-devel] " Ian Campbell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).