From: Kevin Wolf <kwolf@redhat.com>
To: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: sheepdog@lists.wpkg.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] sheepdog: use coroutines
Date: Wed, 24 Aug 2011 14:56:02 +0200 [thread overview]
Message-ID: <4E54F4E2.9080703@redhat.com> (raw)
In-Reply-To: <877h64gid6.wl%morita.kazutaka@lab.ntt.co.jp>
Am 23.08.2011 19:14, schrieb MORITA Kazutaka:
> At Tue, 23 Aug 2011 14:29:50 +0200,
> Kevin Wolf wrote:
>>
>> Am 12.08.2011 14:33, schrieb MORITA Kazutaka:
>>> This makes the sheepdog block driver support bdrv_co_readv/writev
>>> instead of bdrv_aio_readv/writev.
>>>
>>> With this patch, Sheepdog network I/O becomes fully asynchronous. The
>>> block driver yields back when send/recv returns EAGAIN, and is resumed
>>> when the sheepdog network connection is ready for the operation.
>>>
>>> Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
>>> ---
>>> block/sheepdog.c | 150 +++++++++++++++++++++++++++++++++--------------------
>>> 1 files changed, 93 insertions(+), 57 deletions(-)
>>>
>>> diff --git a/block/sheepdog.c b/block/sheepdog.c
>>> index e150ac0..e283c34 100644
>>> --- a/block/sheepdog.c
>>> +++ b/block/sheepdog.c
>>> @@ -274,7 +274,7 @@ struct SheepdogAIOCB {
>>> int ret;
>>> enum AIOCBState aiocb_type;
>>>
>>> - QEMUBH *bh;
>>> + Coroutine *coroutine;
>>> void (*aio_done_func)(SheepdogAIOCB *);
>>>
>>> int canceled;
>>> @@ -295,6 +295,10 @@ typedef struct BDRVSheepdogState {
>>> char *port;
>>> int fd;
>>>
>>> + CoMutex lock;
>>> + Coroutine *co_send;
>>> + Coroutine *co_recv;
>>> +
>>> uint32_t aioreq_seq_num;
>>> QLIST_HEAD(outstanding_aio_head, AIOReq) outstanding_aio_head;
>>> } BDRVSheepdogState;
>>> @@ -346,19 +350,16 @@ static const char * sd_strerror(int err)
>>> /*
>>> * Sheepdog I/O handling:
>>> *
>>> - * 1. In the sd_aio_readv/writev, read/write requests are added to the
>>> - * QEMU Bottom Halves.
>>> - *
>>> - * 2. In sd_readv_writev_bh_cb, the callbacks of BHs, we send the I/O
>>> - * requests to the server and link the requests to the
>>> - * outstanding_list in the BDRVSheepdogState. we exits the
>>> - * function without waiting for receiving the response.
>>> + * 1. In sd_co_rw_vector, we send the I/O requests to the server and
>>> + * link the requests to the outstanding_list in the
>>> + * BDRVSheepdogState. The function exits without waiting for
>>> + * receiving the response.
>>> *
>>> - * 3. We receive the response in aio_read_response, the fd handler to
>>> + * 2. We receive the response in aio_read_response, the fd handler to
>>> * the sheepdog connection. If metadata update is needed, we send
>>> * the write request to the vdi object in sd_write_done, the write
>>> - * completion function. The AIOCB callback is not called until all
>>> - * the requests belonging to the AIOCB are finished.
>>> + * completion function. We switch back to sd_co_readv/writev after
>>> + * all the requests belonging to the AIOCB are finished.
>>> */
>>>
>>> static inline AIOReq *alloc_aio_req(BDRVSheepdogState *s, SheepdogAIOCB *acb,
>>> @@ -398,7 +399,7 @@ static inline int free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
>>> static void sd_finish_aiocb(SheepdogAIOCB *acb)
>>> {
>>> if (!acb->canceled) {
>>> - acb->common.cb(acb->common.opaque, acb->ret);
>>> + qemu_coroutine_enter(acb->coroutine, NULL);
>>> }
>>> qemu_aio_release(acb);
>>> }
>>> @@ -411,7 +412,8 @@ static void sd_aio_cancel(BlockDriverAIOCB *blockacb)
>>> * Sheepdog cannot cancel the requests which are already sent to
>>> * the servers, so we just complete the request with -EIO here.
>>> */
>>> - acb->common.cb(acb->common.opaque, -EIO);
>>> + acb->ret = -EIO;
>>> + qemu_coroutine_enter(acb->coroutine, NULL);
>>> acb->canceled = 1;
>>> }
>>>
>>> @@ -435,24 +437,12 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
>>>
>>> acb->aio_done_func = NULL;
>>> acb->canceled = 0;
>>> - acb->bh = NULL;
>>> + acb->coroutine = qemu_coroutine_self();
>>> acb->ret = 0;
>>> QLIST_INIT(&acb->aioreq_head);
>>> return acb;
>>> }
>>>
>>> -static int sd_schedule_bh(QEMUBHFunc *cb, SheepdogAIOCB *acb)
>>> -{
>>> - if (acb->bh) {
>>> - error_report("bug: %d %d", acb->aiocb_type, acb->aiocb_type);
>>> - return -EIO;
>>> - }
>>> -
>>> - acb->bh = qemu_bh_new(cb, acb);
>>> - qemu_bh_schedule(acb->bh);
>>> - return 0;
>>> -}
>>> -
>>> #ifdef _WIN32
>>>
>>> struct msghdr {
>>> @@ -635,7 +625,13 @@ static int do_readv_writev(int sockfd, struct iovec *iov, int len,
>>> again:
>>> ret = do_send_recv(sockfd, iov, len, iov_offset, write);
>>> if (ret < 0) {
>>> - if (errno == EINTR || errno == EAGAIN) {
>>> + if (errno == EINTR) {
>>> + goto again;
>>> + }
>>> + if (errno == EAGAIN) {
>>> + if (qemu_in_coroutine()) {
>>> + qemu_coroutine_yield();
>>> + }
>>
>> Who reenters the coroutine if we yield here?
>
> The fd handlers, co_write_request() and co_read_response(), will call
> qemu_coroutine_enter() for the coroutine. It will be restarted after
> the sheepdog network connection becomes ready.
Makes sense. Applied to the block branch,
>> And of course for a coroutine based block driver I would like to see
>> much less code in callback functions. But it's a good start anyway.
>
> Yes, actually they can be much simpler. This patch adds asynchrous
> I/O support with a minimal change based on a coroutine, and I think
> code simplification would be the next step.
Ok, if you're looking into this, perfect. :-)
Kevin
next prev parent reply other threads:[~2011-08-24 12:53 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-08-12 12:33 [Qemu-devel] [PATCH] sheepdog: use coroutines MORITA Kazutaka
2011-08-23 12:29 ` Kevin Wolf
2011-08-23 17:14 ` MORITA Kazutaka
2011-08-24 12:56 ` Kevin Wolf [this message]
2011-12-23 13:38 ` Christoph Hellwig
2011-12-29 12:06 ` [Qemu-devel] coroutine bug?, was " Christoph Hellwig
2011-12-30 10:35 ` Stefan Hajnoczi
2012-01-02 15:39 ` Christoph Hellwig
2012-01-02 15:40 ` Christoph Hellwig
2012-01-02 22:38 ` Stefan Hajnoczi
2012-01-03 8:13 ` Stefan Hajnoczi
2012-01-06 11:16 ` MORITA Kazutaka
2012-01-03 8:16 ` Stefan Hajnoczi
2012-01-04 15:58 ` Christoph Hellwig
2012-01-04 17:03 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E54F4E2.9080703@redhat.com \
--to=kwolf@redhat.com \
--cc=morita.kazutaka@lab.ntt.co.jp \
--cc=qemu-devel@nongnu.org \
--cc=sheepdog@lists.wpkg.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.