* [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
@ 2016-04-01 13:57 Fam Zheng
2016-04-01 14:14 ` Laurent Vivier
2016-04-04 11:57 ` Stefan Hajnoczi
0 siblings, 2 replies; 8+ messages in thread
From: Fam Zheng @ 2016-04-01 13:57 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, lvivier, qemu-block, Stefan Hajnoczi, pbonzini
Using a nested aio_poll() in a coroutine is a bad idea. This patch
replaces the aio_poll() loop in bdrv_drain() with a BH when it is called
from coroutine context.

For example, the bdrv_drain() in mirror.c can hang when a guest-issued
request is pending on it in qemu_co_mutex_lock().

In that case the mirror coroutine has just finished a request, and the
block job is about to complete. It calls bdrv_drain(), which waits for
the other coroutine to complete. The other coroutine is a scsi-disk
request. The deadlock happens when the latter is in turn waiting for the
former to yield/terminate, in qemu_co_mutex_lock(). The state flow is as
below (assuming a qcow2 image):
       mirror coroutine                   scsi-disk coroutine
  -------------------------------------------------------------
  do last write

    qcow2:qemu_co_mutex_lock()
    ...
                                          scsi disk read

                                            tracked request begin

                                            qcow2:qemu_co_mutex_lock.enter

    qcow2:qemu_co_mutex_unlock()

  bdrv_drain
    while (has tracked request)
      aio_poll()
In the scsi-disk coroutine, the qemu_co_mutex_lock() will never return
because the mirror coroutine is blocked in the aio_poll(blocking=true).
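For reference, the pre-patch bdrv_drain() loop that the mirror coroutine is
stuck in has roughly this shape (abridged and partly paraphrased from
block/io.c before this patch; only the structure matters here):

    void bdrv_drain(BlockDriverState *bs)
    {
        bool busy = true;

        bdrv_drain_recurse(bs);
        while (busy) {
            /* Keep iterating until no tracked requests remain */
            bdrv_flush_io_queue(bs);
            busy = bdrv_requests_pending(bs);
            busy |= aio_poll(bdrv_get_aio_context(bs), busy);
        }
    }

Because the blocking aio_poll() never lets the mirror coroutine yield or
terminate, the scsi-disk coroutine that qemu_co_mutex_unlock() queued for
restart is never resumed.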
With this patch, the added qemu_coroutine_yield() allows the scsi-disk
coroutine to make progress as expected:
       mirror coroutine                   scsi-disk coroutine
  -------------------------------------------------------------
  do last write

    qcow2:qemu_co_mutex_lock()
    ...
                                          scsi disk read

                                            tracked request begin

                                            qcow2:qemu_co_mutex_lock.enter

    qcow2:qemu_co_mutex_unlock()

  bdrv_drain.enter
  >   schedule BH
  >   qemu_coroutine_yield()
  >                                         qcow2:qemu_co_mutex_lock.return
  >                                         ...
                                          tracked request end
    ...
  (resumed from BH callback)
  bdrv_drain.return
  ...
Reported-by: Laurent Vivier <lvivier@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
v2: Call qemu_bh_delete() in BH callback. [Paolo]
    Change the loop to an assertion. [Paolo]
    Elaborate a bit about the fix in commit log. [Paolo]
---
block/io.c | 41 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/block/io.c b/block/io.c
index c4869b9..ddcfb4c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -253,6 +253,43 @@ static void bdrv_drain_recurse(BlockDriverState *bs)
}
}
+typedef struct {
+ Coroutine *co;
+ BlockDriverState *bs;
+ QEMUBH *bh;
+ bool done;
+} BdrvCoDrainData;
+
+static void bdrv_co_drain_bh_cb(void *opaque)
+{
+ BdrvCoDrainData *data = opaque;
+ Coroutine *co = data->co;
+
+ qemu_bh_delete(data->bh);
+ bdrv_drain(data->bs);
+ data->done = true;
+ qemu_coroutine_enter(co, NULL);
+}
+
+static void coroutine_fn bdrv_co_drain(BlockDriverState *bs)
+{
+ BdrvCoDrainData data;
+
+ assert(qemu_in_coroutine());
+ data = (BdrvCoDrainData) {
+ .co = qemu_coroutine_self(),
+ .bs = bs,
+ .done = false,
+ .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb, &data),
+ };
+ qemu_bh_schedule(data.bh);
+
+ qemu_coroutine_yield();
+ /* If we are resumed from some other event (such as an aio completion or a
+ * timer callback), it is a bug in the caller that should be fixed. */
+ assert(data.done);
+}
+
/*
* Wait for pending requests to complete on a single BlockDriverState subtree,
* and suspend block driver's internal I/O until next request arrives.
@@ -269,6 +306,10 @@ void bdrv_drain(BlockDriverState *bs)
bool busy = true;
bdrv_drain_recurse(bs);
+ if (qemu_in_coroutine()) {
+ bdrv_co_drain(bs);
+ return;
+ }
while (busy) {
/* Keep iterating */
bdrv_flush_io_queue(bs);
--
2.7.4
* Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
2016-04-01 13:57 [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine Fam Zheng
@ 2016-04-01 14:14 ` Laurent Vivier
2016-04-04 11:57 ` Stefan Hajnoczi
1 sibling, 0 replies; 8+ messages in thread
From: Laurent Vivier @ 2016-04-01 14:14 UTC (permalink / raw)
To: Fam Zheng, qemu-devel; +Cc: Kevin Wolf, pbonzini, qemu-block, Stefan Hajnoczi
On 01/04/2016 15:57, Fam Zheng wrote:
> Using the nested aio_poll() in coroutine is a bad idea. This patch
> replaces the aio_poll loop in bdrv_drain with a BH, if called in
> coroutine.
>
> For example, the bdrv_drain() in mirror.c can hang when a guest issued
> request is pending on it in qemu_co_mutex_lock().
>
> Mirror coroutine in this case has just finished a request, and the block
> job is about to complete. It calls bdrv_drain() which waits for the
> other coroutine to complete. The other coroutine is a scsi-disk request.
> The deadlock happens when the latter is in turn pending on the former to
> yield/terminate, in qemu_co_mutex_lock(). The state flow is as below
> (assuming a qcow2 image):
>
> mirror coroutine scsi-disk coroutine
> -------------------------------------------------------------
> do last write
>
> qcow2:qemu_co_mutex_lock()
> ...
> scsi disk read
>
> tracked request begin
>
> qcow2:qemu_co_mutex_lock.enter
>
> qcow2:qemu_co_mutex_unlock()
>
> bdrv_drain
> while (has tracked request)
> aio_poll()
>
> In the scsi-disk coroutine, the qemu_co_mutex_lock() will never return
> because the mirror coroutine is blocked in the aio_poll(blocking=true).
>
> With this patch, the added qemu_coroutine_yield() allows the scsi-disk
> coroutine to make progress as expected:
>
> mirror coroutine scsi-disk coroutine
> -------------------------------------------------------------
> do last write
>
> qcow2:qemu_co_mutex_lock()
> ...
> scsi disk read
>
> tracked request begin
>
> qcow2:qemu_co_mutex_lock.enter
>
> qcow2:qemu_co_mutex_unlock()
>
> bdrv_drain.enter
>> schedule BH
>> qemu_coroutine_yield()
>> qcow2:qemu_co_mutex_lock.return
>> ...
> tracked request end
> ...
> (resumed from BH callback)
> bdrv_drain.return
> ...
>
> Reported-by: Laurent Vivier <lvivier@redhat.com>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
* Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
2016-04-01 13:57 [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine Fam Zheng
2016-04-01 14:14 ` Laurent Vivier
@ 2016-04-04 11:57 ` Stefan Hajnoczi
2016-04-04 14:47 ` Paolo Bonzini
2016-04-05 1:27 ` Fam Zheng
1 sibling, 2 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2016-04-04 11:57 UTC (permalink / raw)
To: Fam Zheng; +Cc: Kevin Wolf, lvivier, qemu-devel, qemu-block, pbonzini
On Fri, Apr 01, 2016 at 09:57:38PM +0800, Fam Zheng wrote:
> Using the nested aio_poll() in coroutine is a bad idea. This patch
> replaces the aio_poll loop in bdrv_drain with a BH, if called in
> coroutine.
>
> For example, the bdrv_drain() in mirror.c can hang when a guest issued
> request is pending on it in qemu_co_mutex_lock().
>
> Mirror coroutine in this case has just finished a request, and the block
> job is about to complete. It calls bdrv_drain() which waits for the
> other coroutine to complete. The other coroutine is a scsi-disk request.
> The deadlock happens when the latter is in turn pending on the former to
> yield/terminate, in qemu_co_mutex_lock(). The state flow is as below
> (assuming a qcow2 image):
>
> mirror coroutine scsi-disk coroutine
> -------------------------------------------------------------
> do last write
>
> qcow2:qemu_co_mutex_lock()
> ...
> scsi disk read
>
> tracked request begin
>
> qcow2:qemu_co_mutex_lock.enter
>
> qcow2:qemu_co_mutex_unlock()
>
> bdrv_drain
> while (has tracked request)
> aio_poll()
>
> In the scsi-disk coroutine, the qemu_co_mutex_lock() will never return
> because the mirror coroutine is blocked in the aio_poll(blocking=true).
>
> With this patch, the added qemu_coroutine_yield() allows the scsi-disk
> coroutine to make progress as expected:
>
> mirror coroutine scsi-disk coroutine
> -------------------------------------------------------------
> do last write
>
> qcow2:qemu_co_mutex_lock()
> ...
> scsi disk read
>
> tracked request begin
>
> qcow2:qemu_co_mutex_lock.enter
>
> qcow2:qemu_co_mutex_unlock()
>
> bdrv_drain.enter
> > schedule BH
> > qemu_coroutine_yield()
> > qcow2:qemu_co_mutex_lock.return
> > ...
> tracked request end
> ...
> (resumed from BH callback)
> bdrv_drain.return
> ...
>
> Reported-by: Laurent Vivier <lvivier@redhat.com>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
>
> ---
>
> v2: Call qemu_bh_delete() in BH callback. [Paolo]
> Change the loop to an assertion. [Paolo]
> Elaborate a bit about the fix in commit log. [Paolo]
> ---
> block/io.c | 41 +++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 41 insertions(+)
>
> diff --git a/block/io.c b/block/io.c
> index c4869b9..ddcfb4c 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -253,6 +253,43 @@ static void bdrv_drain_recurse(BlockDriverState *bs)
> }
> }
>
> +typedef struct {
> + Coroutine *co;
> + BlockDriverState *bs;
> + QEMUBH *bh;
> + bool done;
> +} BdrvCoDrainData;
> +
> +static void bdrv_co_drain_bh_cb(void *opaque)
> +{
> + BdrvCoDrainData *data = opaque;
> + Coroutine *co = data->co;
> +
> + qemu_bh_delete(data->bh);
> + bdrv_drain(data->bs);
> + data->done = true;
> + qemu_coroutine_enter(co, NULL);
> +}
> +
> +static void coroutine_fn bdrv_co_drain(BlockDriverState *bs)
> +{
Please document why a BH is needed:
/* Calling bdrv_drain() from a BH ensures the
 * current coroutine yields and other coroutines run if they were
 * queued from qemu_co_queue_run_restart().
 */
> + BdrvCoDrainData data;
> +
> + assert(qemu_in_coroutine());
> + data = (BdrvCoDrainData) {
> + .co = qemu_coroutine_self(),
> + .bs = bs,
> + .done = false,
> + .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb, &data),
> + };
> + qemu_bh_schedule(data.bh);
> +
> + qemu_coroutine_yield();
> + /* If we are resumed from some other event (such as an aio completion or a
> + * timer callback), it is a bug in the caller that should be fixed. */
> + assert(data.done);
> +}
> +
> /*
> * Wait for pending requests to complete on a single BlockDriverState subtree,
> * and suspend block driver's internal I/O until next request arrives.
> @@ -269,6 +306,10 @@ void bdrv_drain(BlockDriverState *bs)
> bool busy = true;
>
> bdrv_drain_recurse(bs);
> + if (qemu_in_coroutine()) {
> + bdrv_co_drain(bs);
> + return;
> + }
> while (busy) {
> /* Keep iterating */
> bdrv_flush_io_queue(bs);
> --
> 2.7.4
block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain()
should assert(!qemu_in_coroutine()).
The reason existing bdrv_read() and friends detect coroutine context at
runtime is that they are meant for legacy code that runs in both
coroutine and non-coroutine contexts.

Modern coroutine code calls the coroutine interfaces explicitly instead.
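A sketch of that suggestion (hypothetical, not part of this patch; the header
placement and details are guesses):

    /* include/block/block.h */
    void coroutine_fn bdrv_co_drain(BlockDriverState *bs);

    /* block/io.c */
    void bdrv_drain(BlockDriverState *bs)
    {
        assert(!qemu_in_coroutine());  /* coroutine callers use bdrv_co_drain() */
        bdrv_drain_recurse(bs);
        /* ... existing aio_poll() loop unchanged ... */
    }

    /* block/mirror.c, inside the coroutine_fn mirror_run() */
    bdrv_co_drain(bs);                 /* instead of bdrv_drain(bs) */

The point is just that coroutine callers would be forced onto the
coroutine-native entry point instead of relying on runtime detection.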
Stefan
* Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
2016-04-04 11:57 ` Stefan Hajnoczi
@ 2016-04-04 14:47 ` Paolo Bonzini
2016-04-05 1:27 ` Fam Zheng
1 sibling, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2016-04-04 14:47 UTC (permalink / raw)
To: Stefan Hajnoczi, Fam Zheng; +Cc: Kevin Wolf, lvivier, qemu-devel, qemu-block
On 04/04/2016 13:57, Stefan Hajnoczi wrote:
> On Fri, Apr 01, 2016 at 09:57:38PM +0800, Fam Zheng wrote:
>> Using the nested aio_poll() in coroutine is a bad idea. This patch
>> replaces the aio_poll loop in bdrv_drain with a BH, if called in
>> coroutine.
>>
>> For example, the bdrv_drain() in mirror.c can hang when a guest issued
>> request is pending on it in qemu_co_mutex_lock().
>>
>> Mirror coroutine in this case has just finished a request, and the block
>> job is about to complete. It calls bdrv_drain() which waits for the
>> other coroutine to complete. The other coroutine is a scsi-disk request.
>> The deadlock happens when the latter is in turn pending on the former to
>> yield/terminate, in qemu_co_mutex_lock(). The state flow is as below
>> (assuming a qcow2 image):
>>
>> mirror coroutine scsi-disk coroutine
>> -------------------------------------------------------------
>> do last write
>>
>> qcow2:qemu_co_mutex_lock()
>> ...
>> scsi disk read
>>
>> tracked request begin
>>
>> qcow2:qemu_co_mutex_lock.enter
>>
>> qcow2:qemu_co_mutex_unlock()
>>
>> bdrv_drain
>> while (has tracked request)
>> aio_poll()
>>
>> In the scsi-disk coroutine, the qemu_co_mutex_lock() will never return
>> because the mirror coroutine is blocked in the aio_poll(blocking=true).
>>
>> With this patch, the added qemu_coroutine_yield() allows the scsi-disk
>> coroutine to make progress as expected:
>>
>> mirror coroutine scsi-disk coroutine
>> -------------------------------------------------------------
>> do last write
>>
>> qcow2:qemu_co_mutex_lock()
>> ...
>> scsi disk read
>>
>> tracked request begin
>>
>> qcow2:qemu_co_mutex_lock.enter
>>
>> qcow2:qemu_co_mutex_unlock()
>>
>> bdrv_drain.enter
>>> schedule BH
>>> qemu_coroutine_yield()
>>> qcow2:qemu_co_mutex_lock.return
>>> ...
>> tracked request end
>> ...
>> (resumed from BH callback)
>> bdrv_drain.return
>> ...
>>
>> Reported-by: Laurent Vivier <lvivier@redhat.com>
>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Fam Zheng <famz@redhat.com>
>>
>> ---
>>
>> v2: Call qemu_bh_delete() in BH callback. [Paolo]
>> Change the loop to an assertion. [Paolo]
>> Elaborate a bit about the fix in commit log. [Paolo]
>> ---
>> block/io.c | 41 +++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 41 insertions(+)
>>
>> diff --git a/block/io.c b/block/io.c
>> index c4869b9..ddcfb4c 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -253,6 +253,43 @@ static void bdrv_drain_recurse(BlockDriverState *bs)
>> }
>> }
>>
>> +typedef struct {
>> + Coroutine *co;
>> + BlockDriverState *bs;
>> + QEMUBH *bh;
>> + bool done;
>> +} BdrvCoDrainData;
>> +
>> +static void bdrv_co_drain_bh_cb(void *opaque)
>> +{
>> + BdrvCoDrainData *data = opaque;
>> + Coroutine *co = data->co;
>> +
>> + qemu_bh_delete(data->bh);
>> + bdrv_drain(data->bs);
>> + data->done = true;
>> + qemu_coroutine_enter(co, NULL);
>> +}
>> +
>> +static void coroutine_fn bdrv_co_drain(BlockDriverState *bs)
>> +{
>
> Please document why a BH is needed:
>
> /* Calling bdrv_drain() from a BH ensures the
> * current coroutine yields and other coroutines run if they were
> * queued from qemu_co_queue_run_restart().
> */
>
>> + BdrvCoDrainData data;
>> +
>> + assert(qemu_in_coroutine());
>> + data = (BdrvCoDrainData) {
>> + .co = qemu_coroutine_self(),
>> + .bs = bs,
>> + .done = false,
>> + .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb, &data),
>> + };
>> + qemu_bh_schedule(data.bh);
>> +
>> + qemu_coroutine_yield();
>> + /* If we are resumed from some other event (such as an aio completion or a
>> + * timer callback), it is a bug in the caller that should be fixed. */
>> + assert(data.done);
>> +}
>> +
>> /*
>> * Wait for pending requests to complete on a single BlockDriverState subtree,
>> * and suspend block driver's internal I/O until next request arrives.
>> @@ -269,6 +306,10 @@ void bdrv_drain(BlockDriverState *bs)
>> bool busy = true;
>>
>> bdrv_drain_recurse(bs);
>> + if (qemu_in_coroutine()) {
>> + bdrv_co_drain(bs);
>> + return;
>> + }
>> while (busy) {
>> /* Keep iterating */
>> bdrv_flush_io_queue(bs);
>> --
>> 2.7.4
>
> block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain()
> should assert(!qemu_in_coroutine()).
>
> The reason why existing bdrv_read() and friends detect coroutine context
> at runtime is because it is meant for legacy code that runs in both
> coroutine and non-coroutine contexts.
>
> Modern coroutine code calls the coroutine interfaces explicitly instead.
For what it's worth, I suspect Fam's patch removes the need for
http://article.gmane.org/gmane.comp.emulators.qemu/401375. That's a
nice bonus. :)
Paolo
* Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
2016-04-04 11:57 ` Stefan Hajnoczi
2016-04-04 14:47 ` Paolo Bonzini
@ 2016-04-05 1:27 ` Fam Zheng
2016-04-05 9:39 ` Stefan Hajnoczi
1 sibling, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2016-04-05 1:27 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Kevin Wolf, lvivier, qemu-devel, qemu-block, pbonzini
On Mon, 04/04 12:57, Stefan Hajnoczi wrote:
> On Fri, Apr 01, 2016 at 09:57:38PM +0800, Fam Zheng wrote:
> > Using the nested aio_poll() in coroutine is a bad idea. This patch
> > replaces the aio_poll loop in bdrv_drain with a BH, if called in
> > coroutine.
> >
> > For example, the bdrv_drain() in mirror.c can hang when a guest issued
> > request is pending on it in qemu_co_mutex_lock().
> >
> > Mirror coroutine in this case has just finished a request, and the block
> > job is about to complete. It calls bdrv_drain() which waits for the
> > other coroutine to complete. The other coroutine is a scsi-disk request.
> > The deadlock happens when the latter is in turn pending on the former to
> > yield/terminate, in qemu_co_mutex_lock(). The state flow is as below
> > (assuming a qcow2 image):
> >
> > mirror coroutine scsi-disk coroutine
> > -------------------------------------------------------------
> > do last write
> >
> > qcow2:qemu_co_mutex_lock()
> > ...
> > scsi disk read
> >
> > tracked request begin
> >
> > qcow2:qemu_co_mutex_lock.enter
> >
> > qcow2:qemu_co_mutex_unlock()
> >
> > bdrv_drain
> > while (has tracked request)
> > aio_poll()
> >
> > In the scsi-disk coroutine, the qemu_co_mutex_lock() will never return
> > because the mirror coroutine is blocked in the aio_poll(blocking=true).
> >
> > With this patch, the added qemu_coroutine_yield() allows the scsi-disk
> > coroutine to make progress as expected:
> >
> > mirror coroutine scsi-disk coroutine
> > -------------------------------------------------------------
> > do last write
> >
> > qcow2:qemu_co_mutex_lock()
> > ...
> > scsi disk read
> >
> > tracked request begin
> >
> > qcow2:qemu_co_mutex_lock.enter
> >
> > qcow2:qemu_co_mutex_unlock()
> >
> > bdrv_drain.enter
> > > schedule BH
> > > qemu_coroutine_yield()
> > > qcow2:qemu_co_mutex_lock.return
> > > ...
> > tracked request end
> > ...
> > (resumed from BH callback)
> > bdrv_drain.return
> > ...
> >
> > Reported-by: Laurent Vivier <lvivier@redhat.com>
> > Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> >
> > ---
> >
> > v2: Call qemu_bh_delete() in BH callback. [Paolo]
> > Change the loop to an assertion. [Paolo]
> > Elaborate a bit about the fix in commit log. [Paolo]
> > ---
> > block/io.c | 41 +++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 41 insertions(+)
> >
> > diff --git a/block/io.c b/block/io.c
> > index c4869b9..ddcfb4c 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -253,6 +253,43 @@ static void bdrv_drain_recurse(BlockDriverState *bs)
> > }
> > }
> >
> > +typedef struct {
> > + Coroutine *co;
> > + BlockDriverState *bs;
> > + QEMUBH *bh;
> > + bool done;
> > +} BdrvCoDrainData;
> > +
> > +static void bdrv_co_drain_bh_cb(void *opaque)
> > +{
> > + BdrvCoDrainData *data = opaque;
> > + Coroutine *co = data->co;
> > +
> > + qemu_bh_delete(data->bh);
> > + bdrv_drain(data->bs);
> > + data->done = true;
> > + qemu_coroutine_enter(co, NULL);
> > +}
> > +
> > +static void coroutine_fn bdrv_co_drain(BlockDriverState *bs)
> > +{
>
> Please document why a BH is needed:
>
> /* Calling bdrv_drain() from a BH ensures the
> * current coroutine yields and other coroutines run if they were
> * queued from qemu_co_queue_run_restart().
> */
OK.
>
> > + BdrvCoDrainData data;
> > +
> > + assert(qemu_in_coroutine());
> > + data = (BdrvCoDrainData) {
> > + .co = qemu_coroutine_self(),
> > + .bs = bs,
> > + .done = false,
> > + .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb, &data),
> > + };
> > + qemu_bh_schedule(data.bh);
> > +
> > + qemu_coroutine_yield();
> > + /* If we are resumed from some other event (such as an aio completion or a
> > + * timer callback), it is a bug in the caller that should be fixed. */
> > + assert(data.done);
> > +}
> > +
> > /*
> > * Wait for pending requests to complete on a single BlockDriverState subtree,
> > * and suspend block driver's internal I/O until next request arrives.
> > @@ -269,6 +306,10 @@ void bdrv_drain(BlockDriverState *bs)
> > bool busy = true;
> >
> > bdrv_drain_recurse(bs);
> > + if (qemu_in_coroutine()) {
> > + bdrv_co_drain(bs);
> > + return;
> > + }
> > while (busy) {
> > /* Keep iterating */
> > bdrv_flush_io_queue(bs);
> > --
> > 2.7.4
>
> block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain()
> should assert(!qemu_in_coroutine()).
>
> The reason why existing bdrv_read() and friends detect coroutine context
> at runtime is because it is meant for legacy code that runs in both
> coroutine and non-coroutine contexts.
blk_unref()/bdrv_unref() are called in coroutine context (the .bdrv_create
implementations), and that doesn't just work with the assertion. Should I
clean up this "legacy" code first, i.e. move the bdrv_unref() calls to BHs in
the callers and assert(!qemu_in_coroutine()) there too? I didn't do that
because it complicates the code somewhat.
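To make that concrete, here is a simplified, hypothetical sketch of the path
(not the exact QEMU code; the open helper is invented):

    /* bdrv_create() runs the driver's .bdrv_create callback inside a
     * coroutine, and a typical implementation releases its BlockBackend at
     * the end: */
    static int example_driver_create(const char *filename, QemuOpts *opts,
                                     Error **errp)
    {
        BlockBackend *blk;

        blk = example_open_for_create(filename, errp); /* hypothetical helper */
        if (!blk) {
            return -EIO;
        }

        /* ... format the image: header, metadata, ... */

        blk_unref(blk);   /* -> bdrv_unref() -> bdrv_close() -> bdrv_drain(),
                           * still inside the creation coroutine, so a bare
                           * assert(!qemu_in_coroutine()) would trip here */
        return 0;
    }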
>
> Modern coroutine code calls the coroutine interfaces explicitly instead.
* Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
2016-04-05 1:27 ` Fam Zheng
@ 2016-04-05 9:39 ` Stefan Hajnoczi
2016-04-05 11:15 ` Fam Zheng
0 siblings, 1 reply; 8+ messages in thread
From: Stefan Hajnoczi @ 2016-04-05 9:39 UTC (permalink / raw)
To: Fam Zheng; +Cc: Kevin Wolf, lvivier, qemu-devel, qemu-block, pbonzini
On Tue, Apr 05, 2016 at 09:27:39AM +0800, Fam Zheng wrote:
> On Mon, 04/04 12:57, Stefan Hajnoczi wrote:
> > On Fri, Apr 01, 2016 at 09:57:38PM +0800, Fam Zheng wrote:
> > > + BdrvCoDrainData data;
> > > +
> > > + assert(qemu_in_coroutine());
> > > + data = (BdrvCoDrainData) {
> > > + .co = qemu_coroutine_self(),
> > > + .bs = bs,
> > > + .done = false,
> > > + .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb, &data),
> > > + };
> > > + qemu_bh_schedule(data.bh);
> > > +
> > > + qemu_coroutine_yield();
> > > + /* If we are resumed from some other event (such as an aio completion or a
> > > + * timer callback), it is a bug in the caller that should be fixed. */
> > > + assert(data.done);
> > > +}
> > > +
> > > /*
> > > * Wait for pending requests to complete on a single BlockDriverState subtree,
> > > * and suspend block driver's internal I/O until next request arrives.
> > > @@ -269,6 +306,10 @@ void bdrv_drain(BlockDriverState *bs)
> > > bool busy = true;
> > >
> > > bdrv_drain_recurse(bs);
> > > + if (qemu_in_coroutine()) {
> > > + bdrv_co_drain(bs);
> > > + return;
> > > + }
> > > while (busy) {
> > > /* Keep iterating */
> > > bdrv_flush_io_queue(bs);
> > > --
> > > 2.7.4
> >
> > block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain()
> > should assert(!qemu_in_coroutine()).
> >
> > The reason why existing bdrv_read() and friends detect coroutine context
> > at runtime is because it is meant for legacy code that runs in both
> > coroutine and non-coroutine contexts.
>
> blk_unref/bdrv_unref are called in coroutine (.bdrv_create implementations),
> and this doesn't just work with the assertion. Should I clean up this "legacy"
> code first, i.e. move bdrv_unref calls to BHs in the callers and
> assert(!qemu_in_coroutine()) there too? I didn't think this because it
> complicates the code somehow.
This is a messy problem.
In general I don't like introducing yields into non-coroutine_fn
functions because it can lead to bugs when the caller didn't expect a
yield point.
For example, I myself wouldn't have expected bdrv_unref() to be a yield
point. So maybe coroutine code I've written would be vulnerable to
interference (I won't call it a race condition) from another coroutine
across the bdrv_unref() call. This could mean that another coroutine
now sees intermediate state that would never be visible without the new
yield point.
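A hypothetical illustration of that kind of interference (types and names are
invented for the example):

    typedef struct {
        BlockDriverState *bs;
        bool in_progress;          /* intermediate state */
    } Foo;

    static void coroutine_fn foo_finish(Foo *s)
    {
        s->in_progress = true;
        bdrv_unref(s->bs);         /* if this becomes a yield point, another
                                    * coroutine can run here and observe
                                    * in_progress == true unexpectedly */
        s->in_progress = false;
    }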
I think attempting to invoke qemu_co_queue_run_restart() directly
instead of scheduling a BH and yielding does not improve the situation.
It's also a layering violation since qemu_co_queue_run_restart() is just
meant for the core coroutine code and isn't a public interface.
Anyway, let's treat bdrv_drain() as legacy code that may check
if (qemu_in_coroutine()), but please make bdrv_co_drain() public so
block/mirror.c can at least call it directly.
Stefan
* Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
2016-04-05 9:39 ` Stefan Hajnoczi
@ 2016-04-05 11:15 ` Fam Zheng
2016-04-05 12:39 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2016-04-05 11:15 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Kevin Wolf, lvivier, qemu-devel, qemu-block, pbonzini
On Tue, 04/05 10:39, Stefan Hajnoczi wrote:
> > > block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain()
> > > should assert(!qemu_in_coroutine()).
> > >
> > > The reason why existing bdrv_read() and friends detect coroutine context
> > > at runtime is because it is meant for legacy code that runs in both
> > > coroutine and non-coroutine contexts.
> >
> > blk_unref/bdrv_unref are called in coroutine (.bdrv_create implementations),
> > and this doesn't just work with the assertion. Should I clean up this "legacy"
> > code first, i.e. move bdrv_unref calls to BHs in the callers and
> > assert(!qemu_in_coroutine()) there too? I didn't think this because it
> > complicates the code somehow.
>
> This is a messy problem.
>
> In general I don't like introducing yields into non-coroutine_fn
> functions because it can lead to bugs when the caller didn't expect a
> yield point.
>
> For example, I myself wouldn't have expected bdrv_unref() to be a yield
> point. So maybe coroutine code I've written would be vulnerable to
> interference (I won't call it a race condition) from another coroutine
> across the bdrv_unref() call. This could mean that another coroutine
> now sees intermediate state that would never be visible without the new
> yield point.
>
> I think attempting to invoke qemu_co_queue_run_restart() directly
> instead of scheduling a BH and yielding does not improve the situation.
> It's also a layering violation since qemu_co_queue_run_restart() is just
> meant for the core coroutine code and isn't a public interface.
>
> Anyway, let's consider bdrv_drain() legacy code that can call if
> (qemu_in_coroutine()) but please make bdrv_co_drain() public so
> block/mirror.c can at least call it directly.
OK, will do.
Fam
* Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
2016-04-05 11:15 ` Fam Zheng
@ 2016-04-05 12:39 ` Paolo Bonzini
0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2016-04-05 12:39 UTC (permalink / raw)
To: Fam Zheng, Stefan Hajnoczi; +Cc: Kevin Wolf, lvivier, qemu-devel, qemu-block
On 05/04/2016 13:15, Fam Zheng wrote:
> > Anyway, let's consider bdrv_drain() legacy code that can call if
> > (qemu_in_coroutine()) but please make bdrv_co_drain() public so
> > block/mirror.c can at least call it directly.
> OK, will do.
Please create a bdrv_co_drained_begin() function too, even if it's not
used yet.
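One possible shape, assuming it simply mirrors bdrv_co_drain() but defers to
bdrv_drained_begin() in the BH callback (a sketch, not an agreed design):

    static void bdrv_co_drained_begin_bh_cb(void *opaque)
    {
        BdrvCoDrainData *data = opaque;

        qemu_bh_delete(data->bh);
        bdrv_drained_begin(data->bs);
        data->done = true;
        qemu_coroutine_enter(data->co, NULL);
    }

    void coroutine_fn bdrv_co_drained_begin(BlockDriverState *bs)
    {
        BdrvCoDrainData data = {
            .co = qemu_coroutine_self(),
            .bs = bs,
            .done = false,
        };

        assert(qemu_in_coroutine());
        data.bh = aio_bh_new(bdrv_get_aio_context(bs),
                             bdrv_co_drained_begin_bh_cb, &data);
        qemu_bh_schedule(data.bh);

        qemu_coroutine_yield();
        assert(data.done);
    }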
Paolo