qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Anthony Liguori <aliguori@us.ibm.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Stefan Hajnoczi <stefanha@gmail.com>,
	Ping Fan Liu <pingfank@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org, Avi Kivity <avi@redhat.com>
Subject: Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts""
Date: Tue, 09 Oct 2012 13:26:05 -0500	[thread overview]
Message-ID: <87ehl7lcxu.fsf@codemonkey.ws> (raw)
In-Reply-To: <5074502F.5030706@redhat.com>

Paolo Bonzini <pbonzini@redhat.com> writes:

> Il 09/10/2012 17:37, Anthony Liguori ha scritto:
>>>> >> In the very short term, I can imagine an aio fastpath that was only
>>>> >> implemented in terms of the device API.  We could have a slow path that
>>>> >> acquired the BQL.
>>> >
>>> > Not sure I follow.
>> 
>> As long as the ioeventfd thread can acquire qemu_mutex in order to call
>> bdrv_* functions.  The new device-only API could do this under the
>> covers for everything but the linux-aio fast path initially.
>
> Ok, so it's about the locking.  I'm not even sure we need locking if we
> have cooperative multitasking.  For example if bdrv_aio_readv/writev
> is called from a VCPU thread, it can just schedule a bottom half for
> itself in the appropriate AioContext.  Similarly for block jobs.


Okay, let's separate out the two issues here though.  One is whether we
need a device specific block API.  The second is whether we should short
cut to a fast path in the short term and go after a fully unlocked bdrv_
layer in the long(shortish?) term.

So let's talk about your proposal...

> The only part where I'm not sure how it would work is bdrv_read/write,
> because of the strange "qemu_aio_wait() calls select with a lock taken".
> Maybe we can just forbid synchronous I/O if you set a non-default
> AioContext.

Not sure how practical that is.  The is an awful lot of sync I/O still left.

> This would be entirely hidden in the block layer.  For example the
> following does it for bdrv_aio_readv/writev:
>
> diff --git a/block.c b/block.c
> index e95f613..7165e82 100644
> --- a/block.c
> +++ b/block.c
> @@ -3712,15 +3712,6 @@ static AIOPool bdrv_em_co_aio_pool = {
>      .cancel             = bdrv_aio_co_cancel_em,
>  };
>  
> -static void bdrv_co_em_bh(void *opaque)
> -{
> -    BlockDriverAIOCBCoroutine *acb = opaque;
> -
> -    acb->common.cb(acb->common.opaque, acb->req.error);
> -    qemu_bh_delete(acb->bh);
> -    qemu_aio_release(acb);
> -}
> -
>  /* Invoke bdrv_co_do_readv/bdrv_co_do_writev */
>  static void coroutine_fn bdrv_co_do_rw(void *opaque)
>  {
> @@ -3735,8 +3726,17 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque)
>              acb->req.nb_sectors, acb->req.qiov, 0);
>      }
>  
> -    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
> -    qemu_bh_schedule(acb->bh);
> +    acb->common.cb(acb->common.opaque, acb->req.error);
> +    qemu_aio_release(acb);
> +}
> +
> +static void bdrv_co_em_bh(void *opaque)
> +{
> +    BlockDriverAIOCBCoroutine *acb = opaque;
> +
> +    qemu_bh_delete(acb->bh);
> +    co = qemu_coroutine_create(bdrv_co_do_rw);
> +    qemu_coroutine_enter(co, acb);
>  }
>  
>  static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
> @@ -3756,8 +3756,8 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
>      acb->req.qiov = qiov;
>      acb->is_write = is_write;
>  
> -    co = qemu_coroutine_create(bdrv_co_do_rw);
> -    qemu_coroutine_enter(co, acb);
> +    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
> +    qemu_bh_schedule(acb->bh);
>  
>      return &acb->common;
>  }
>
>
> Then we can add a bdrv_aio_readv/writev_unlocked API to the protocols, which
> would run outside the bottom half and provide the desired fast path.

This works for some of the block layer I think.  How does this interact
with thread pools for AIO?

But this wouldn't work well with things like NBD or curl, right?  What's
the plan there?

Regards,

Anthony Liguori

>
> Paolo
>
>> That means that we can convert block devices to use the device-only API
>> across the board (provided we make BQL recursive).
>> 
>> It also means we get at least some of the benefits of data-plane in the
>> short term.

  reply	other threads:[~2012-10-09 18:26 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-25 12:55 [Qemu-devel] [RFC PATCH 00/17] Support for multiple "AIO contexts" Paolo Bonzini
2012-09-25 12:55 ` [Qemu-devel] [PATCH 01/17] build: do not rely on indirect inclusion of qemu-config.h Paolo Bonzini
2012-09-25 12:55 ` [Qemu-devel] [PATCH 02/17] event_notifier: enable it to use pipes Paolo Bonzini
2012-10-08  7:03   ` Stefan Hajnoczi
2012-09-25 12:55 ` [Qemu-devel] [PATCH 03/17] event_notifier: add Win32 implementation Paolo Bonzini
2012-09-25 12:55 ` [Qemu-devel] [PATCH 04/17] aio: change qemu_aio_set_fd_handler to return void Paolo Bonzini
2012-09-25 21:47   ` Anthony Liguori
2012-09-25 12:55 ` [Qemu-devel] [PATCH 05/17] aio: provide platform-independent API Paolo Bonzini
2012-09-25 21:48   ` Anthony Liguori
2012-09-25 12:55 ` [Qemu-devel] [PATCH 06/17] aio: introduce AioContext, move bottom halves there Paolo Bonzini
2012-09-25 21:51   ` Anthony Liguori
2012-09-26  6:30     ` Paolo Bonzini
2012-09-25 12:55 ` [Qemu-devel] [PATCH 07/17] aio: add I/O handlers to the AioContext interface Paolo Bonzini
2012-09-25 12:55 ` [Qemu-devel] [PATCH 08/17] aio: add non-blocking variant of aio_wait Paolo Bonzini
2012-09-25 21:56   ` Anthony Liguori
2012-09-25 12:55 ` [Qemu-devel] [PATCH 09/17] aio: prepare for introducing GSource-based dispatch Paolo Bonzini
2012-09-25 22:01   ` Anthony Liguori
2012-09-26  6:36     ` Paolo Bonzini
2012-09-26  6:48     ` Paolo Bonzini
2012-09-29 11:28   ` Blue Swirl
2012-10-01  6:40     ` Paolo Bonzini
2012-09-25 12:55 ` [Qemu-devel] [PATCH 10/17] aio: add Win32 implementation Paolo Bonzini
2012-09-25 12:55 ` [Qemu-devel] [PATCH 11/17] aio: make AioContexts GSources Paolo Bonzini
2012-09-25 22:06   ` Anthony Liguori
2012-09-26  6:40     ` Paolo Bonzini
2012-09-25 12:55 ` [Qemu-devel] [PATCH 12/17] aio: add aio_notify Paolo Bonzini
2012-09-25 22:07   ` Anthony Liguori
2012-09-25 12:55 ` [Qemu-devel] [PATCH 13/17] aio: call aio_notify after setting I/O handlers Paolo Bonzini
2012-09-25 22:07   ` Anthony Liguori
2012-09-25 12:56 ` [Qemu-devel] [PATCH 14/17] main-loop: use GSource to poll AIO file descriptors Paolo Bonzini
2012-09-25 22:09   ` Anthony Liguori
2012-09-26  6:38     ` Paolo Bonzini
2012-09-25 12:56 ` [Qemu-devel] [PATCH 15/17] main-loop: use aio_notify for qemu_notify_event Paolo Bonzini
2012-09-25 22:10   ` Anthony Liguori
2012-09-25 12:56 ` [Qemu-devel] [PATCH 16/17] aio: clean up now-unused functions Paolo Bonzini
2012-09-25 22:11   ` Anthony Liguori
2012-09-25 12:56 ` [Qemu-devel] [PATCH 17/17] linux-aio: use event notifiers Paolo Bonzini
2012-09-26 12:28 ` [Qemu-devel] [RFC PATCH 00/17] Support for multiple "AIO contexts" Kevin Wolf
2012-09-26 13:32   ` Paolo Bonzini
2012-09-26 14:31     ` Kevin Wolf
2012-09-26 15:48       ` Paolo Bonzini
2012-09-27  7:11         ` Kevin Wolf
2012-09-27  7:43           ` Paolo Bonzini
2012-10-08 11:39 ` Stefan Hajnoczi
2012-10-08 13:00   ` Paolo Bonzini
2012-10-09  9:08     ` [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"" Stefan Hajnoczi
2012-10-09  9:26       ` Avi Kivity
2012-10-09 10:36         ` Paolo Bonzini
2012-10-09 10:52           ` Avi Kivity
2012-10-09 11:08             ` Paolo Bonzini
2012-10-09 11:55               ` Avi Kivity
2012-10-09 12:01                 ` Paolo Bonzini
2012-10-09 12:18                   ` Jan Kiszka
2012-10-09 12:28                     ` Avi Kivity
2012-10-09 12:22                   ` Avi Kivity
2012-10-09 13:11                     ` Paolo Bonzini
2012-10-09 13:21                       ` Avi Kivity
2012-10-09 13:50                         ` Paolo Bonzini
2012-10-09 14:24                           ` Avi Kivity
2012-10-09 14:35                             ` Paolo Bonzini
2012-10-09 14:41                               ` Avi Kivity
2012-10-09 14:05                   ` Stefan Hajnoczi
2012-10-09 15:02       ` Anthony Liguori
2012-10-09 15:06         ` Paolo Bonzini
2012-10-09 15:37           ` Anthony Liguori
2012-10-09 16:26             ` Paolo Bonzini
2012-10-09 18:26               ` Anthony Liguori [this message]
2012-10-10  7:11                 ` Paolo Bonzini
2012-10-10 12:25                   ` Anthony Liguori
2012-10-10 13:31                     ` Paolo Bonzini
2012-10-10 14:44                       ` Anthony Liguori
2012-10-11 12:28         ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ehl7lcxu.fsf@codemonkey.ws \
    --to=aliguori@us.ibm.com \
    --cc=avi@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=pingfank@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).