qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* Re: [Qemu-devel] [Qemu-stable] [PATCH] stream: fix the deadlock bug when stream finish
       [not found] <53F80DC4.7030102@gmail.com>
@ 2014-08-27 15:53 ` Michael Roth
  2014-08-27 16:28   ` Paolo Bonzini
  0 siblings, 1 reply; 3+ messages in thread
From: Michael Roth @ 2014-08-27 15:53 UTC (permalink / raw)
  To: Liu Yu, qemu-stable; +Cc: qemu-devel

Quoting Liu Yu (2014-08-22 22:43:00)
> From: Liu Yu <allanyuliu@tencent.com>
> 
> The patch against branch stable-2.0

Hi Liu,

2.0.2 was the last planned released for stable-2.0. Are you still seeing this
issue with 2.1.0 or master? 2.1.1 is release is planned for September 9th so we
can work on getting this included there if it is.

Also, please Cc: qemu-devel@nongnu.org even for stable patches.

> 
> In case VM does IO while we run a stream job.
> When stream finishes, the stream coroutine drains all IOs before
> close the unused image, in bdrv_drain_all() it may find
> a pending request which is submitted by guest IO coroutine.
> In order to wait the pending req finish, the subsequent aio_poll()
> call poll() to wait the req. however, if the req is already done by
> threadpool and is waiting for the callback, there is no chance to switch
> back to guest IO coroutine to call the callback and so that the stream
> coroutine waits in poll() all the time.
> 
> The patch detects the deadlock case above and switch back to iothread
> coroutine to handle the callback, and work on the stream coroutine
> after the pending req get finished.
> 
> Signed-off-by: Liu Yu <allanyuliu@tencent.com>
> ---
> the issue can be reproduced by
> 1. guest does fio test
> 2. while host runs virsh blockpull repeatedly
> 
> 
>  block.c |   27 ++++++++++++++++++++++++++-
>  1 files changed, 26 insertions(+), 1 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 990a754..f8c1a8d 100644
> --- a/block.c
> +++ b/block.c
> @@ -1778,6 +1778,29 @@ static bool bdrv_requests_pending_all(void)
>      return false;
>  }
> 
> +static bool bdrv_request_coroutine_wait(void)
> +{
> +    BlockDriverState *bs;
> +    Coroutine *co;
> +
> +    if (!qemu_in_coroutine())
> +        return false;
> +
> +    co = qemu_coroutine_self();
> +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> +        if (!QLIST_EMPTY(&bs->tracked_requests)) {
> +            BdrvTrackedRequest *req = QLIST_FIRST(&bs->tracked_requests);
> +
> +            if(req->co == co)
> +                continue;
> +
> +            qemu_co_queue_wait(&req->wait_queue);
> +            return true;
> +        }
> +    }
> +    return false;
> +}
> +
>  /*
>   * Wait for pending requests to complete across all BlockDriverStates
>   *
> @@ -1800,8 +1823,10 @@ void bdrv_drain_all(void)
>          QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
>              bdrv_start_throttled_reqs(bs);
>          }
> -
> +recheck:
>          busy = bdrv_requests_pending_all();
> +        if (busy && bdrv_request_coroutine_wait())
> +            goto recheck;
>          busy |= aio_poll(qemu_get_aio_context(), busy);
>      }
>  }
> -- 
> 1.7.1

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [Qemu-stable] [PATCH] stream: fix the deadlock bug when stream finish
  2014-08-27 15:53 ` [Qemu-devel] [Qemu-stable] [PATCH] stream: fix the deadlock bug when stream finish Michael Roth
@ 2014-08-27 16:28   ` Paolo Bonzini
  2014-08-27 17:30     ` Michael Roth
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Bonzini @ 2014-08-27 16:28 UTC (permalink / raw)
  To: Michael Roth, Liu Yu, qemu-stable; +Cc: qemu-devel

Il 27/08/2014 17:53, Michael Roth ha scritto:
>> In case VM does IO while we run a stream job.
>> When stream finishes, the stream coroutine drains all IOs before
>> close the unused image, in bdrv_drain_all() it may find
>> a pending request which is submitted by guest IO coroutine.
>> In order to wait the pending req finish, the subsequent aio_poll()
>> call poll() to wait the req. however, if the req is already done by
>> threadpool and is waiting for the callback, there is no chance to switch
>> back to guest IO coroutine to call the callback and so that the stream
>> coroutine waits in poll() all the time.

This is the same bug fixed by this patch;

commit 3c80ca158c96ff902a30883a8933e755988948b1
Author: Stefan Hajnoczi <stefanha@redhat.com>
Date:   Tue Jul 15 16:44:26 2014 +0200

    thread-pool: avoid deadlock in nested aio_poll() calls

    The thread pool has a race condition if two elements complete before
    thread_pool_completion_bh() runs:
    
      If element A's callback waits for element B using aio_poll() it will
      deadlock since pool->completion_bh is not marked scheduled when the
      nested aio_poll() runs.
    
    Fix this by marking the BH scheduled while thread_pool_completion_bh()
    is executing.  This way any nested aio_poll() loops will enter
    thread_pool_completion_bh() and complete the remaining elements.
    
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Paolo

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [Qemu-stable] [PATCH] stream: fix the deadlock bug when stream finish
  2014-08-27 16:28   ` Paolo Bonzini
@ 2014-08-27 17:30     ` Michael Roth
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Roth @ 2014-08-27 17:30 UTC (permalink / raw)
  To: Paolo Bonzini, Liu Yu, qemu-stable; +Cc: qemu-devel

Quoting Paolo Bonzini (2014-08-27 11:28:23)
> Il 27/08/2014 17:53, Michael Roth ha scritto:
> >> In case VM does IO while we run a stream job.
> >> When stream finishes, the stream coroutine drains all IOs before
> >> close the unused image, in bdrv_drain_all() it may find
> >> a pending request which is submitted by guest IO coroutine.
> >> In order to wait the pending req finish, the subsequent aio_poll()
> >> call poll() to wait the req. however, if the req is already done by
> >> threadpool and is waiting for the callback, there is no chance to switch
> >> back to guest IO coroutine to call the callback and so that the stream
> >> coroutine waits in poll() all the time.
> 
> This is the same bug fixed by this patch;
> 
> commit 3c80ca158c96ff902a30883a8933e755988948b1
> Author: Stefan Hajnoczi <stefanha@redhat.com>
> Date:   Tue Jul 15 16:44:26 2014 +0200
> 
>     thread-pool: avoid deadlock in nested aio_poll() calls
> 
>     The thread pool has a race condition if two elements complete before
>     thread_pool_completion_bh() runs:
>     
>       If element A's callback waits for element B using aio_poll() it will
>       deadlock since pool->completion_bh is not marked scheduled when the
>       nested aio_poll() runs.
>     
>     Fix this by marking the BH scheduled while thread_pool_completion_bh()
>     is executing.  This way any nested aio_poll() loops will enter
>     thread_pool_completion_bh() and complete the remaining elements.
>     
>     Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Thanks. Looks like this missed v2.1.0 so I've gone ahead and queued it
for 2.1.1 (along with 5f8127a5811429da85cc4bc273f166aed129fc31).

> 
> Paolo

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-08-27 17:30 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <53F80DC4.7030102@gmail.com>
2014-08-27 15:53 ` [Qemu-devel] [Qemu-stable] [PATCH] stream: fix the deadlock bug when stream finish Michael Roth
2014-08-27 16:28   ` Paolo Bonzini
2014-08-27 17:30     ` Michael Roth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).