* [PATCH v2 1/3] xen/blkback: fix disconnect while I/Os in flight
2017-05-18 15:28 [PATCH v2 0/3] xen/blkback: several fixes of resource management Juergen Gross
@ 2017-05-18 15:28 ` Juergen Gross
2017-05-18 15:28 ` [PATCH v2 2/3] xen/blkback: don't free be structure too early Juergen Gross
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2017-05-18 15:28 UTC (permalink / raw)
To: linux-kernel, xen-devel
Cc: konrad.wilk, roger.pau, netwiz, Juergen Gross, stable
Today disconnecting xen-blkback is broken in case there are still
I/Os in flight: xen_blkif_disconnect() will bail out early without
releasing all resources in the hope it will be called again when
the last request has terminated. This, however, won't happen as
xen_blkif_free() won't be called on termination of the last running
request: xen_blkif_put() won't decrement the blkif refcnt to 0 as
xen_blkif_disconnect() didn't finish before thus some xen_blkif_put()
calls in xen_blkif_disconnect() didn't happen.
To solve this deadlock xen_blkif_disconnect() and
xen_blkif_alloc_rings() shouldn't use xen_blkif_put() and
xen_blkif_get() but use some other way to do their accounting of
resources.
This at once fixes another error in xen_blkif_disconnect(): when it
returned early with -EBUSY for another ring than 0 it would call
xen_blkif_put() again for already handled rings on a subsequent call.
This will lead to inconsistencies in the refcnt handling.
Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
drivers/block/xen-blkback/common.h | 1 +
drivers/block/xen-blkback/xenbus.c | 7 +++++--
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index dea61f6ab8cb..638597b17a38 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -281,6 +281,7 @@ struct xen_blkif_ring {
wait_queue_head_t wq;
atomic_t inflight;
+ bool active;
/* One thread per blkif ring. */
struct task_struct *xenblkd;
unsigned int waiting_reqs;
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 1f3dfaa54d87..998915174bb8 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -159,7 +159,7 @@ static int xen_blkif_alloc_rings(struct xen_blkif *blkif)
init_waitqueue_head(&ring->shutdown_wq);
ring->blkif = blkif;
ring->st_print = jiffies;
- xen_blkif_get(blkif);
+ ring->active = true;
}
return 0;
@@ -249,6 +249,9 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
struct xen_blkif_ring *ring = &blkif->rings[r];
unsigned int i = 0;
+ if (!ring->active)
+ continue;
+
if (ring->xenblkd) {
kthread_stop(ring->xenblkd);
wake_up(&ring->shutdown_wq);
@@ -296,7 +299,7 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
BUG_ON(ring->free_pages_num != 0);
BUG_ON(ring->persistent_gnt_c != 0);
WARN_ON(i != (XEN_BLKIF_REQS_PER_PAGE * blkif->nr_ring_pages));
- xen_blkif_put(blkif);
+ ring->active = false;
}
blkif->nr_ring_pages = 0;
/*
--
2.12.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH v2 2/3] xen/blkback: don't free be structure too early
2017-05-18 15:28 [PATCH v2 0/3] xen/blkback: several fixes of resource management Juergen Gross
2017-05-18 15:28 ` [PATCH v2 1/3] xen/blkback: fix disconnect while I/Os in flight Juergen Gross
@ 2017-05-18 15:28 ` Juergen Gross
2017-05-18 15:28 ` [PATCH v2 3/3] xen/blkback: don't use xen_blkif_get() in xen-blkback kthread Juergen Gross
2017-06-07 12:36 ` [PATCH v2 0/3] xen/blkback: several fixes of resource management Steven Haigh
3 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2017-05-18 15:28 UTC (permalink / raw)
To: linux-kernel, xen-devel; +Cc: konrad.wilk, roger.pau, netwiz, Juergen Gross
The be structure must nor be freed when freeing the blkif structure
isn't done. Otherwise a use-after-free of be when unmapping the ring
used for communicating with the frontend will occur in case of a
late call of xenblk_disconnect() (e.g. due to an I/O still active
when trying to disconnect).
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
drivers/block/xen-blkback/xenbus.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 998915174bb8..4cdf0490983e 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -315,9 +315,10 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
static void xen_blkif_free(struct xen_blkif *blkif)
{
-
- xen_blkif_disconnect(blkif);
+ WARN_ON(xen_blkif_disconnect(blkif));
xen_vbd_free(&blkif->vbd);
+ kfree(blkif->be->mode);
+ kfree(blkif->be);
/* Make sure everything is drained before shutting down */
kmem_cache_free(xen_blkif_cachep, blkif);
@@ -514,8 +515,6 @@ static int xen_blkbk_remove(struct xenbus_device *dev)
xen_blkif_put(be->blkif);
}
- kfree(be->mode);
- kfree(be);
return 0;
}
--
2.12.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH v2 3/3] xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
2017-05-18 15:28 [PATCH v2 0/3] xen/blkback: several fixes of resource management Juergen Gross
2017-05-18 15:28 ` [PATCH v2 1/3] xen/blkback: fix disconnect while I/Os in flight Juergen Gross
2017-05-18 15:28 ` [PATCH v2 2/3] xen/blkback: don't free be structure too early Juergen Gross
@ 2017-05-18 15:28 ` Juergen Gross
2017-06-07 12:36 ` [PATCH v2 0/3] xen/blkback: several fixes of resource management Steven Haigh
3 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2017-05-18 15:28 UTC (permalink / raw)
To: linux-kernel, xen-devel; +Cc: konrad.wilk, roger.pau, netwiz, Juergen Gross
There is no need to use xen_blkif_get()/xen_blkif_put() in the kthread
of xen-blkback. Thread stopping is synchronous and using the blkif
reference counting in the kthread will avoid to ever let the reference
count drop to zero at the end of an I/O running concurrent to
disconnecting and multiple rings.
Setting ring->xenblkd to NULL after stopping the kthread isn't needed
as the kthread does this already.
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
drivers/block/xen-blkback/blkback.c | 3 ---
drivers/block/xen-blkback/xenbus.c | 1 -
2 files changed, 4 deletions(-)
diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 726c32e35db9..6b14c509f3c7 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -609,8 +609,6 @@ int xen_blkif_schedule(void *arg)
unsigned long timeout;
int ret;
- xen_blkif_get(blkif);
-
set_freezable();
while (!kthread_should_stop()) {
if (try_to_freeze())
@@ -665,7 +663,6 @@ int xen_blkif_schedule(void *arg)
print_stats(ring);
ring->xenblkd = NULL;
- xen_blkif_put(blkif);
return 0;
}
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 4cdf0490983e..792da683e70d 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -255,7 +255,6 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
if (ring->xenblkd) {
kthread_stop(ring->xenblkd);
wake_up(&ring->shutdown_wq);
- ring->xenblkd = NULL;
}
/* The above kthread_stop() guarantees that at this point we
--
2.12.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH v2 0/3] xen/blkback: several fixes of resource management
2017-05-18 15:28 [PATCH v2 0/3] xen/blkback: several fixes of resource management Juergen Gross
` (2 preceding siblings ...)
2017-05-18 15:28 ` [PATCH v2 3/3] xen/blkback: don't use xen_blkif_get() in xen-blkback kthread Juergen Gross
@ 2017-06-07 12:36 ` Steven Haigh
2017-06-07 13:52 ` [Xen-devel] " Konrad Rzeszutek Wilk
3 siblings, 1 reply; 7+ messages in thread
From: Steven Haigh @ 2017-06-07 12:36 UTC (permalink / raw)
To: xen-devel; +Cc: Juergen Gross, linux-kernel, xen-devel, roger.pau
[-- Attachment #1: Type: text/plain, Size: 1348 bytes --]
On Friday, 19 May 2017 1:28:46 AM AEST Juergen Gross wrote:
> Destroying a Xen guest domain while it was doing I/Os via xen-blkback
> leaked several resources, including references of the guest's memory
> pages.
>
> This patch series addresses those leaks by correcting usage of
> reference counts and the sequence when to free which resource.
>
> The series applies on top of commit 2d4456c73a487abe ("block:
> xen-blkback: add null check to avoid null pointer dereference") in
> Jens Axboe's tree kernel/git/axboe/linux-block.git
>
> V2: changed flag to type bool in patch 1 (Dietmar Hahn)
>
> Juergen Gross (3):
> xen/blkback: fix disconnect while I/Os in flight
> xen/blkback: don't free be structure too early
> xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
>
> drivers/block/xen-blkback/blkback.c | 3 ---
> drivers/block/xen-blkback/common.h | 1 +
> drivers/block/xen-blkback/xenbus.c | 15 ++++++++-------
> 3 files changed, 9 insertions(+), 10 deletions(-)
Just wanted to give this a bit of a prod.
Is there any plans in having this hit the kernel.org kernels?
My testing was purely on kernel 4.9 branch - but it doesn't look like this has
shown up there yet?
--
Steven Haigh
📧 netwiz@crc.id.au 💻 http://www.crc.id.au
📞 +61 (3) 9001 6090 📱 0412 935 897
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [Xen-devel] [PATCH v2 0/3] xen/blkback: several fixes of resource management
2017-06-07 12:36 ` [PATCH v2 0/3] xen/blkback: several fixes of resource management Steven Haigh
@ 2017-06-07 13:52 ` Konrad Rzeszutek Wilk
2017-06-07 14:08 ` Steven Haigh
0 siblings, 1 reply; 7+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-06-07 13:52 UTC (permalink / raw)
To: Steven Haigh; +Cc: xen-devel, Juergen Gross, xen-devel, linux-kernel, roger.pau
On Wed, Jun 07, 2017 at 10:36:58PM +1000, Steven Haigh wrote:
> On Friday, 19 May 2017 1:28:46 AM AEST Juergen Gross wrote:
> > Destroying a Xen guest domain while it was doing I/Os via xen-blkback
> > leaked several resources, including references of the guest's memory
> > pages.
> >
> > This patch series addresses those leaks by correcting usage of
> > reference counts and the sequence when to free which resource.
> >
> > The series applies on top of commit 2d4456c73a487abe ("block:
> > xen-blkback: add null check to avoid null pointer dereference") in
> > Jens Axboe's tree kernel/git/axboe/linux-block.git
> >
> > V2: changed flag to type bool in patch 1 (Dietmar Hahn)
> >
> > Juergen Gross (3):
> > xen/blkback: fix disconnect while I/Os in flight
> > xen/blkback: don't free be structure too early
> > xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
> >
> > drivers/block/xen-blkback/blkback.c | 3 ---
> > drivers/block/xen-blkback/common.h | 1 +
> > drivers/block/xen-blkback/xenbus.c | 15 ++++++++-------
> > 3 files changed, 9 insertions(+), 10 deletions(-)
>
> Just wanted to give this a bit of a prod.
<ugh> Ouch!
>
> Is there any plans in having this hit the kernel.org kernels?
Yes.
>
> My testing was purely on kernel 4.9 branch - but it doesn't look like this has
> shown up there yet?
Correct. I am thinking to send these to Jens around June 20th or so.
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [Xen-devel] [PATCH v2 0/3] xen/blkback: several fixes of resource management
2017-06-07 13:52 ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2017-06-07 14:08 ` Steven Haigh
0 siblings, 0 replies; 7+ messages in thread
From: Steven Haigh @ 2017-06-07 14:08 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk, Juergen Gross; +Cc: xen-devel, linux-kernel, roger.pau
[-- Attachment #1: Type: text/plain, Size: 1954 bytes --]
On Wednesday, 7 June 2017 11:52:34 PM AEST Konrad Rzeszutek Wilk wrote:
> On Wed, Jun 07, 2017 at 10:36:58PM +1000, Steven Haigh wrote:
> > On Friday, 19 May 2017 1:28:46 AM AEST Juergen Gross wrote:
> > > Destroying a Xen guest domain while it was doing I/Os via xen-blkback
> > > leaked several resources, including references of the guest's memory
> > > pages.
> > >
> > > This patch series addresses those leaks by correcting usage of
> > > reference counts and the sequence when to free which resource.
> > >
> > > The series applies on top of commit 2d4456c73a487abe ("block:
> > > xen-blkback: add null check to avoid null pointer dereference") in
> > > Jens Axboe's tree kernel/git/axboe/linux-block.git
> > >
> > > V2: changed flag to type bool in patch 1 (Dietmar Hahn)
> > >
> > > Juergen Gross (3):
> > > xen/blkback: fix disconnect while I/Os in flight
> > > xen/blkback: don't free be structure too early
> > > xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
> > >
> > > drivers/block/xen-blkback/blkback.c | 3 ---
> > > drivers/block/xen-blkback/common.h | 1 +
> > > drivers/block/xen-blkback/xenbus.c | 15 ++++++++-------
> > > 3 files changed, 9 insertions(+), 10 deletions(-)
> >
> > Just wanted to give this a bit of a prod.
>
> <ugh> Ouch!
>
> > Is there any plans in having this hit the kernel.org kernels?
>
> Yes.
>
> > My testing was purely on kernel 4.9 branch - but it doesn't look like this
> > has shown up there yet?
>
> Correct. I am thinking to send these to Jens around June 20th or so.
Ok, all understood. Thanks for the clarifications.
At the moment, I'm just including them in my kernel builds - then expecting
them to fail at one point in the future when the patch fails due to
upstreaming. I'll just keep doing this.
--
Steven Haigh
📧 netwiz@crc.id.au 💻 http://www.crc.id.au
📞 +61 (3) 9001 6090 📱 0412 935 897
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread