From: Paolo Bonzini <pbonzini@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 09/12] mirror: switch mirror_iteration to AIO
Date: Mon, 21 Jan 2013 13:09:00 +0100 [thread overview]
Message-ID: <50FD2FDC.5080906@redhat.com> (raw)
In-Reply-To: <50FD28D7.4080004@redhat.com>
Il 21/01/2013 12:39, Kevin Wolf ha scritto:
> Am 16.01.2013 18:31, schrieb Paolo Bonzini:
>> There is really no change in the behavior of the job here, since
>> there is still a maximum of one in-flight I/O operation between
>> the source and the target. However, this patch already introduces
>> the AIO callbacks (which are unmodified in the next patch)
>> and some of the logic to count in-flight operations and only
>> complete the job when there is none.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>> block/mirror.c | 155 ++++++++++++++++++++++++++++++++++++++++++--------------
>> trace-events | 2 +
>> 2 files changed, 119 insertions(+), 38 deletions(-)
>>
>> diff --git a/block/mirror.c b/block/mirror.c
>> index ab41340..75c550a 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -33,8 +33,19 @@ typedef struct MirrorBlockJob {
>> unsigned long *cow_bitmap;
>> HBitmapIter hbi;
>> uint8_t *buf;
>> +
>> + int in_flight;
>> + int ret;
>> } MirrorBlockJob;
>>
>> +typedef struct MirrorOp {
>> + MirrorBlockJob *s;
>> + QEMUIOVector qiov;
>> + struct iovec iov;
>> + int64_t sector_num;
>> + int nb_sectors;
>> +} MirrorOp;
>> +
>> static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
>> int error)
>> {
>> @@ -48,15 +59,60 @@ static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
>> }
>> }
>>
>> -static int coroutine_fn mirror_iteration(MirrorBlockJob *s,
>> - BlockErrorAction *p_action)
>> +static void mirror_iteration_done(MirrorOp *op)
>> +{
>> + MirrorBlockJob *s = op->s;
>> +
>> + s->in_flight--;
>> + trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors);
>> + g_slice_free(MirrorOp, op);
>> + qemu_coroutine_enter(s->common.co, NULL);
>
> This doesn't check if the job coroutine is actually in a state where
> it's valid to reenter.
>
> Technically it might even be okay because reentering during a sleep is
> allowed and as good as reentering during the new yield, and bdrv_flush()
> is only called if s->in_flight == 0. Most other calls _should_ be okay
> as well, but I'm not so sure about bdrv_drain_all(), especially once
> .bdrv_drain exists.
bdrv_drain_all is also called only if s->in_flight == 0 too, but I see
your point. It is indeed quite subtle, but it's okay.
> As you can see, this is becoming very subtle, so I would prefer adding
> some explicit bool s->may_reenter or something like that.
The right boolean to test is already there, it's job->busy. I can add a
new API block_job_yield/block_job_enter (where block_job_yield
resets/sets busy across the yield, and block_job_enter only enters if
!job->busy), but that would be a separate series IMO.
>> +}
>
>> @@ -177,28 +233,43 @@ static void coroutine_fn mirror_run(void *opaque)
>> }
>>
>> bdrv_dirty_iter_init(bs, &s->hbi);
>> + last_pause_ns = qemu_get_clock_ns(rt_clock);
>> for (;;) {
>> uint64_t delay_ns;
>> int64_t cnt;
>> bool should_complete;
>>
>> + if (s->ret < 0) {
>> + ret = s->ret;
>> + break;
>> + }
>> +
>> cnt = bdrv_get_dirty_count(bs);
>> - if (cnt != 0) {
>> - BlockErrorAction action = BDRV_ACTION_REPORT;
>> - ret = mirror_iteration(s, &action);
>> - if (ret < 0 && action == BDRV_ACTION_REPORT) {
>> - goto immediate_exit;
>> +
>> + /* Note that even when no rate limit is applied we need to yield
>> + * periodically with no pending I/O so that qemu_aio_flush() returns.
>> + * We do so every SLICE_TIME milliseconds, or when there is an error,
>
> s/milli/nano/
>
>> + * or when the source is clean, whichever comes first.
>> + */
>> + if (qemu_get_clock_ns(rt_clock) - last_pause_ns < SLICE_TIME &&
>> + s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
>> + if (s->in_flight > 0) {
>> + trace_mirror_yield(s, s->in_flight, cnt);
>> + qemu_coroutine_yield();
>> + continue;
>> + } else if (cnt != 0) {
>> + mirror_iteration(s);
>> + continue;
>> }
>> - cnt = bdrv_get_dirty_count(bs);
>> }
>>
>> should_complete = false;
>> - if (cnt == 0) {
>> + if (s->in_flight == 0 && cnt == 0) {
>> trace_mirror_before_flush(s);
>> ret = bdrv_flush(s->target);
>> if (ret < 0) {
>> if (mirror_error_action(s, false, -ret) == BDRV_ACTION_REPORT) {
>> - goto immediate_exit;
>> + break;
>
> Is this an unrelated change?
Seems like a rebase hiccup. Doesn't have any semantic change, I'll drop it.
>> }
>> } else {
>> /* We're out of the streaming phase. From now on, if the job
>> @@ -244,15 +315,12 @@ static void coroutine_fn mirror_run(void *opaque)
>> delay_ns = 0;
>> }
>>
>> - /* Note that even when no rate limit is applied we need to yield
>> - * with no pending I/O here so that bdrv_drain_all() returns.
>> - */
>> block_job_sleep_ns(&s->common, rt_clock, delay_ns);
>> if (block_job_is_cancelled(&s->common)) {
>> break;
>> }
>> } else if (!should_complete) {
>> - delay_ns = (cnt == 0 ? SLICE_TIME : 0);
>> + delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
>> block_job_sleep_ns(&s->common, rt_clock, delay_ns);
>> } else if (cnt == 0) {
>
> Why don't we need to check s->in_flight == 0 here?
As above, should_complete is only set to true within an if(in_flight == 0).
Paolo
>> /* The two disks are in sync. Exit and report successful
>> @@ -262,9 +330,20 @@ static void coroutine_fn mirror_run(void *opaque)
>> s->common.cancelled = false;
>> break;
>> }
>> + last_pause_ns = qemu_get_clock_ns(rt_clock);
>> }
>>
>> immediate_exit:
>> + if (s->in_flight > 0) {
>> + /* We get here only if something went wrong. Either the job failed,
>> + * or it was cancelled prematurely so that we do not guarantee that
>> + * the target is a copy of the source.
>> + */
>> + assert(ret < 0 || (!s->synced && block_job_is_cancelled(&s->common)));
>> + mirror_drain(s);
>> + }
>> +
>> + assert(s->in_flight == 0);
>> qemu_vfree(s->buf);
>> g_free(s->cow_bitmap);
>> bdrv_set_dirty_tracking(bs, 0);
>
> Kevin
>
>
next prev parent reply other threads:[~2013-01-21 12:09 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-16 17:31 [Qemu-devel] [PATCH v2 00/12] Drive mirroring performance improvements Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 01/12] host-utils: add ffsl Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 02/12] add hierarchical bitmap data type and test cases Paolo Bonzini
2013-01-16 22:50 ` Eric Blake
2013-01-18 13:21 ` Kevin Wolf
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 03/12] block: implement dirty bitmap using HBitmap Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 04/12] block: make round_to_clusters public Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 05/12] mirror: perform COW if the cluster size is bigger than the granularity Paolo Bonzini
2013-01-18 15:13 ` Kevin Wolf
2013-01-18 16:22 ` Paolo Bonzini
2013-01-18 17:05 ` Kevin Wolf
2013-01-18 17:33 ` Paolo Bonzini
2013-01-21 10:17 ` Kevin Wolf
2013-01-21 11:15 ` Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 06/12] block: return count of dirty sectors, not chunks Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 07/12] block: allow customizing the granularity of the dirty bitmap Paolo Bonzini
2013-01-16 23:39 ` Eric Blake
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 08/12] mirror: allow customizing the granularity Paolo Bonzini
2013-01-16 23:44 ` Eric Blake
2013-01-21 11:00 ` Kevin Wolf
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 09/12] mirror: switch mirror_iteration to AIO Paolo Bonzini
2013-01-21 11:39 ` Kevin Wolf
2013-01-21 12:09 ` Paolo Bonzini [this message]
2013-01-21 12:15 ` Kevin Wolf
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 10/12] mirror: add buf-size argument to drive-mirror Paolo Bonzini
2013-01-16 23:46 ` Eric Blake
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 11/12] mirror: support more than one in-flight AIO operation Paolo Bonzini
2013-01-21 12:35 ` Kevin Wolf
2013-01-21 12:55 ` Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 12/12] mirror: support arbitrarily-sized iterations Paolo Bonzini
2013-01-16 23:48 ` [Qemu-devel] [PATCH v2 00/12] Drive mirroring performance improvements Eric Blake
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=50FD2FDC.5080906@redhat.com \
--to=pbonzini@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.