qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: sheepdog@lists.wpkg.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] sheepdog: use coroutines
Date: Wed, 24 Aug 2011 14:56:02 +0200	[thread overview]
Message-ID: <4E54F4E2.9080703@redhat.com> (raw)
In-Reply-To: <877h64gid6.wl%morita.kazutaka@lab.ntt.co.jp>

Am 23.08.2011 19:14, schrieb MORITA Kazutaka:
> At Tue, 23 Aug 2011 14:29:50 +0200,
> Kevin Wolf wrote:
>>
>> Am 12.08.2011 14:33, schrieb MORITA Kazutaka:
>>> This makes the sheepdog block driver support bdrv_co_readv/writev
>>> instead of bdrv_aio_readv/writev.
>>>
>>> With this patch, Sheepdog network I/O becomes fully asynchronous.  The
>>> block driver yields back when send/recv returns EAGAIN, and is resumed
>>> when the sheepdog network connection is ready for the operation.
>>>
>>> Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
>>> ---
>>>  block/sheepdog.c |  150 +++++++++++++++++++++++++++++++++--------------------
>>>  1 files changed, 93 insertions(+), 57 deletions(-)
>>>
>>> diff --git a/block/sheepdog.c b/block/sheepdog.c
>>> index e150ac0..e283c34 100644
>>> --- a/block/sheepdog.c
>>> +++ b/block/sheepdog.c
>>> @@ -274,7 +274,7 @@ struct SheepdogAIOCB {
>>>      int ret;
>>>      enum AIOCBState aiocb_type;
>>>  
>>> -    QEMUBH *bh;
>>> +    Coroutine *coroutine;
>>>      void (*aio_done_func)(SheepdogAIOCB *);
>>>  
>>>      int canceled;
>>> @@ -295,6 +295,10 @@ typedef struct BDRVSheepdogState {
>>>      char *port;
>>>      int fd;
>>>  
>>> +    CoMutex lock;
>>> +    Coroutine *co_send;
>>> +    Coroutine *co_recv;
>>> +
>>>      uint32_t aioreq_seq_num;
>>>      QLIST_HEAD(outstanding_aio_head, AIOReq) outstanding_aio_head;
>>>  } BDRVSheepdogState;
>>> @@ -346,19 +350,16 @@ static const char * sd_strerror(int err)
>>>  /*
>>>   * Sheepdog I/O handling:
>>>   *
>>> - * 1. In the sd_aio_readv/writev, read/write requests are added to the
>>> - *    QEMU Bottom Halves.
>>> - *
>>> - * 2. In sd_readv_writev_bh_cb, the callbacks of BHs, we send the I/O
>>> - *    requests to the server and link the requests to the
>>> - *    outstanding_list in the BDRVSheepdogState.  we exits the
>>> - *    function without waiting for receiving the response.
>>> + * 1. In sd_co_rw_vector, we send the I/O requests to the server and
>>> + *    link the requests to the outstanding_list in the
>>> + *    BDRVSheepdogState.  The function exits without waiting for
>>> + *    receiving the response.
>>>   *
>>> - * 3. We receive the response in aio_read_response, the fd handler to
>>> + * 2. We receive the response in aio_read_response, the fd handler to
>>>   *    the sheepdog connection.  If metadata update is needed, we send
>>>   *    the write request to the vdi object in sd_write_done, the write
>>> - *    completion function.  The AIOCB callback is not called until all
>>> - *    the requests belonging to the AIOCB are finished.
>>> + *    completion function.  We switch back to sd_co_readv/writev after
>>> + *    all the requests belonging to the AIOCB are finished.
>>>   */
>>>  
>>>  static inline AIOReq *alloc_aio_req(BDRVSheepdogState *s, SheepdogAIOCB *acb,
>>> @@ -398,7 +399,7 @@ static inline int free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
>>>  static void sd_finish_aiocb(SheepdogAIOCB *acb)
>>>  {
>>>      if (!acb->canceled) {
>>> -        acb->common.cb(acb->common.opaque, acb->ret);
>>> +        qemu_coroutine_enter(acb->coroutine, NULL);
>>>      }
>>>      qemu_aio_release(acb);
>>>  }
>>> @@ -411,7 +412,8 @@ static void sd_aio_cancel(BlockDriverAIOCB *blockacb)
>>>       * Sheepdog cannot cancel the requests which are already sent to
>>>       * the servers, so we just complete the request with -EIO here.
>>>       */
>>> -    acb->common.cb(acb->common.opaque, -EIO);
>>> +    acb->ret = -EIO;
>>> +    qemu_coroutine_enter(acb->coroutine, NULL);
>>>      acb->canceled = 1;
>>>  }
>>>  
>>> @@ -435,24 +437,12 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
>>>  
>>>      acb->aio_done_func = NULL;
>>>      acb->canceled = 0;
>>> -    acb->bh = NULL;
>>> +    acb->coroutine = qemu_coroutine_self();
>>>      acb->ret = 0;
>>>      QLIST_INIT(&acb->aioreq_head);
>>>      return acb;
>>>  }
>>>  
>>> -static int sd_schedule_bh(QEMUBHFunc *cb, SheepdogAIOCB *acb)
>>> -{
>>> -    if (acb->bh) {
>>> -        error_report("bug: %d %d", acb->aiocb_type, acb->aiocb_type);
>>> -        return -EIO;
>>> -    }
>>> -
>>> -    acb->bh = qemu_bh_new(cb, acb);
>>> -    qemu_bh_schedule(acb->bh);
>>> -    return 0;
>>> -}
>>> -
>>>  #ifdef _WIN32
>>>  
>>>  struct msghdr {
>>> @@ -635,7 +625,13 @@ static int do_readv_writev(int sockfd, struct iovec *iov, int len,
>>>  again:
>>>      ret = do_send_recv(sockfd, iov, len, iov_offset, write);
>>>      if (ret < 0) {
>>> -        if (errno == EINTR || errno == EAGAIN) {
>>> +        if (errno == EINTR) {
>>> +            goto again;
>>> +        }
>>> +        if (errno == EAGAIN) {
>>> +            if (qemu_in_coroutine()) {
>>> +                qemu_coroutine_yield();
>>> +            }
>>
>> Who reenters the coroutine if we yield here?
> 
> The fd handlers, co_write_request() and co_read_response(), will call
> qemu_coroutine_enter() for the coroutine.  It will be restarted after
> the sheepdog network connection becomes ready.

Makes sense. Applied to the block branch,

>> And of course for a coroutine based block driver I would like to see
>> much less code in callback functions. But it's a good start anyway.
> 
> Yes, actually they can be much simpler.  This patch adds asynchrous
> I/O support with a minimal change based on a coroutine, and I think
> code simplification would be the next step.

Ok, if you're looking into this, perfect. :-)

Kevin

  reply	other threads:[~2011-08-24 12:53 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-08-12 12:33 [Qemu-devel] [PATCH] sheepdog: use coroutines MORITA Kazutaka
2011-08-23 12:29 ` Kevin Wolf
2011-08-23 17:14   ` MORITA Kazutaka
2011-08-24 12:56     ` Kevin Wolf [this message]
2011-12-23 13:38 ` Christoph Hellwig
2011-12-29 12:06   ` [Qemu-devel] coroutine bug?, was " Christoph Hellwig
2011-12-30 10:35     ` Stefan Hajnoczi
2012-01-02 15:39       ` Christoph Hellwig
2012-01-02 15:40         ` Christoph Hellwig
2012-01-02 22:38         ` Stefan Hajnoczi
2012-01-03  8:13           ` Stefan Hajnoczi
2012-01-06 11:16             ` MORITA Kazutaka
2012-01-03  8:16         ` Stefan Hajnoczi
2012-01-04 15:58           ` Christoph Hellwig
2012-01-04 17:03             ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4E54F4E2.9080703@redhat.com \
    --to=kwolf@redhat.com \
    --cc=morita.kazutaka@lab.ntt.co.jp \
    --cc=qemu-devel@nongnu.org \
    --cc=sheepdog@lists.wpkg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).