* [Qemu-devel] [PATCH] mirror: Fix coroutine reentrance
@ 2015-08-13  8:41 Kevin Wolf
From: Kevin Wolf @ 2015-08-13  8:41 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, famz, jcody, qemu-stable, qemu-devel, stefanha, pbonzini

This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
write on target if sectors not allocated"), which was reported to cause
aborts with the message "Co-routine re-entered recursively".

The cause for this bug is the following code in mirror_iteration_done():

    if (s->common.busy) {
        qemu_coroutine_enter(s->common.co, NULL);
    }

This has always been ugly because - unlike most places that reenter - it
doesn't pair with one specific yield, but reenters in a rather
uncontrolled way.  What we really mean here is "reenter the coroutine if
it's in one of the four explicit yields in mirror.c".

This used to be equivalent to s->common.busy because neither
mirror_run() nor mirror_iteration() calls any function that could yield.
However, since commit dcfb3beb this no longer holds:
bdrv_get_block_status_above() can yield.

What happens is that bdrv_get_block_status_above() wants to take a lock
that is already held, so it adds itself to the queue of waiting
coroutines and yields. Instead of being woken up by the unlock function,
however, it is woken up by mirror_iteration_done(), which is obviously
wrong.

In most cases the code happens to cope fairly well with such spurious
wakeups, but in this specific case the unlock must already have
scheduled the coroutine for wakeup when mirror_iteration_done()
reentered it. The coroutine then processed the scheduled restart and
tried to reenter itself recursively.
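
The following toy model illustrates that sequence of events. It is not
QEMU code - a "coroutine" is reduced to two flags, and the names
Coroutine, coroutine_enter etc. are made up for the example - but the
assertion fires in the same place where QEMU aborts with "Co-routine
re-entered recursively":

    #include <assert.h>
    #include <stdbool.h>

    typedef struct Coroutine {
        bool entered;    /* true while the coroutine is running */
        bool scheduled;  /* true when a wakeup has been queued  */
    } Coroutine;

    static void coroutine_enter(Coroutine *co)
    {
        /* QEMU aborts here: "Co-routine re-entered recursively" */
        assert(!co->entered);
        co->entered = true;
    }

    int main(void)
    {
        Coroutine co = { 0 };

        /* The job coroutine has yielded inside
         * bdrv_get_block_status_above(), queued on the lock. */

        co.scheduled = true;    /* 1. unlock schedules a wakeup     */
        coroutine_enter(&co);   /* 2. mirror_iteration_done() also
                                 *    enters the coroutine directly */

        /* 3. The running coroutine processes the pending restart
         *    and tries to enter itself: the assertion fires. */
        if (co.scheduled) {
            co.scheduled = false;
            coroutine_enter(&co);
        }
        return 0;
    }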

This patch fixes the problem by pairing the reenter in
mirror_iteration_done() with specific yields instead of abusing
s->common.busy.
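
In the same toy model, the fix corresponds to a flag that is set only
around the job's own yields, so the completion path leaves the
coroutine alone when it yielded for some other reason. This is a
minimal sketch of the pattern, not the real implementation (Job,
job_enter and iteration_done are invented stand-ins):

    #include <assert.h>
    #include <stdbool.h>

    typedef struct Job {
        bool entered;
        bool waiting_for_io;  /* set only around the job's own yields */
    } Job;

    static void job_enter(Job *j)
    {
        assert(!j->entered);  /* "re-entered recursively" otherwise */
        j->entered = true;
    }

    /* Completion side, analogous to the patched
     * mirror_iteration_done(): only wake the job when it is parked
     * in one of its own yields. */
    static void iteration_done(Job *j)
    {
        if (j->waiting_for_io) {
            job_enter(j);
        }
    }

    int main(void)
    {
        Job j = { .entered = true };

        /* Yield inside bdrv_get_block_status_above() WITHOUT setting
         * waiting_for_io - that yield belongs to the lock, and only
         * the unlock path may wake the coroutine. */
        j.entered = false;
        iteration_done(&j);     /* correctly does nothing */
        assert(!j.entered);

        /* A yield in mirror.c proper brackets itself with the flag: */
        j.waiting_for_io = true;
        j.entered = false;      /* qemu_coroutine_yield()  */
        iteration_done(&j);     /* wakes the job this time */
        assert(j.entered);
        j.waiting_for_io = false;
        return 0;
    }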

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/mirror.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index fc4d8f5..b2fb4b9 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -60,6 +60,7 @@ typedef struct MirrorBlockJob {
     int sectors_in_flight;
     int ret;
     bool unmap;
+    bool waiting_for_io;
 } MirrorBlockJob;
 
 typedef struct MirrorOp {
@@ -114,11 +115,7 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
     qemu_iovec_destroy(&op->qiov);
     g_slice_free(MirrorOp, op);
 
-    /* Enter coroutine when it is not sleeping.  The coroutine sleeps to
-     * rate-limit itself.  The coroutine will eventually resume since there is
-     * a sleep timeout so don't wake it early.
-     */
-    if (s->common.busy) {
+    if (s->waiting_for_io) {
         qemu_coroutine_enter(s->common.co, NULL);
     }
 }
@@ -203,7 +200,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
     /* Wait for I/O to this cluster (from a previous iteration) to be done.  */
     while (test_bit(next_chunk, s->in_flight_bitmap)) {
         trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
+        s->waiting_for_io = true;
         qemu_coroutine_yield();
+        s->waiting_for_io = false;
     }
 
     do {
@@ -239,7 +238,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
          */
         while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
             trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
+            s->waiting_for_io = true;
             qemu_coroutine_yield();
+            s->waiting_for_io = false;
         }
         if (s->buf_free_count < nb_chunks + added_chunks) {
             trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
@@ -333,7 +334,9 @@ static void mirror_free_init(MirrorBlockJob *s)
 static void mirror_drain(MirrorBlockJob *s)
 {
     while (s->in_flight > 0) {
+        s->waiting_for_io = true;
         qemu_coroutine_yield();
+        s->waiting_for_io = false;
     }
 }
 
@@ -506,7 +509,9 @@ static void coroutine_fn mirror_run(void *opaque)
             if (s->in_flight == MAX_IN_FLIGHT || s->buf_free_count == 0 ||
                 (cnt == 0 && s->in_flight > 0)) {
                 trace_mirror_yield(s, s->in_flight, s->buf_free_count, cnt);
+                s->waiting_for_io = true;
                 qemu_coroutine_yield();
+                s->waiting_for_io = false;
                 continue;
             } else if (cnt != 0) {
                 delay_ns = mirror_iteration(s);
-- 
1.8.3.1


* Re: [Qemu-devel] [PATCH] mirror: Fix coroutine reentrance
From: Paolo Bonzini @ 2015-08-13  9:26 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: jcody, famz, qemu-stable, stefanha, qemu-devel



On 13/08/2015 10:41, Kevin Wolf wrote:
> This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
> write on target if sectors not allocated"), which was reported to cause
> aborts with the message "Co-routine re-entered recursively".
> 
> [...]

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>


* Re: [Qemu-devel] [PATCH] mirror: Fix coroutine reentrance
From: Stefan Hajnoczi @ 2015-08-14 10:27 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: famz, qemu-block, jcody, qemu-stable, qemu-devel, pbonzini

On Thu, Aug 13, 2015 at 10:41:50AM +0200, Kevin Wolf wrote:
> This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
> write on target if sectors not allocated"), which was reported to cause
> aborts with the message "Co-routine re-entered recursively".
> 
> [...]

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>


* Re: [Qemu-devel] [PATCH] mirror: Fix coroutine reentrance
From: Jeff Cody @ 2015-08-14 13:50 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: famz, qemu-block, qemu-devel, qemu-stable, stefanha, pbonzini

On Thu, Aug 13, 2015 at 10:41:50AM +0200, Kevin Wolf wrote:
> This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
> write on target if sectors not allocated"), which was reported to cause
> aborts with the message "Co-routine re-entered recursively".
> 
> [...]

Reviewed-by: Jeff Cody <jcody@redhat.com>


* Re: [Qemu-devel] [PATCH] mirror: Fix coroutine reentrance
From: Jeff Cody @ 2015-08-14 13:53 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: famz, qemu-block, qemu-devel, qemu-stable, stefanha, pbonzini

On Thu, Aug 13, 2015 at 10:41:50AM +0200, Kevin Wolf wrote:
> This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
> write on target if sectors not allocated"), which was reported to cause
> aborts with the message "Co-routine re-entered recursively".
> 
> [...]

Thanks, applied to my block branch:

https://github.com/codyprime/qemu-kvm-jtc/tree/block

-Jeff
