All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: kwolf@redhat.com, jcody@redhat.com, Fam Zheng <famz@redhat.com>,
	qemu-devel@nongnu.org, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 1/2] block: allow live commit of active image
Date: Tue, 30 Jul 2013 15:17:50 +0200	[thread overview]
Message-ID: <51F7BCFE.40808@redhat.com> (raw)
In-Reply-To: <20130730125300.GB7239@stefanha-thinkpad.redhat.com>

Il 30/07/2013 14:53, Stefan Hajnoczi ha scritto:
> On Tue, Jul 30, 2013 at 03:17:47PM +0800, Fam Zheng wrote:
>>      for (sector_num = 0; sector_num < end; sector_num += n) {
>> -        uint64_t delay_ns = 0;
>> -        bool copy;
>>  
>> -wait:
>> -        /* Note that even when no rate limit is applied we need to yield
>> -         * with no pending I/O here so that bdrv_drain_all() returns.
>> -         */
>> -        block_job_sleep_ns(&s->common, rt_clock, delay_ns);
>> -        if (block_job_is_cancelled(&s->common)) {
>> -            break;
>> -        }
>>          /* Copy if allocated above the base */
>>          ret = bdrv_co_is_allocated_above(top, base, sector_num,
>> -                                         COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
>> +                                         COMMIT_BUFFER_SECTORS,
>>                                           &n);
>> -        copy = (ret == 1);
>> -        trace_commit_one_iteration(s, sector_num, n, ret);
>> -        if (copy) {
>> -            if (s->common.speed) {
>> -                delay_ns = ratelimit_calculate_delay(&s->limit, n);
>> -                if (delay_ns > 0) {
>> -                    goto wait;
>> -                }
>> +        if (ret) {
>> +            bdrv_set_dirty(top, sector_num, n);
>> +        }
> 
> This could take a while on a big image.  You need sleep/cancel here like
> the other blockjob loops have.
> 
> I think error handling isn't sufficient here.  If
> bdrv_co_is_allocated_above() fails you need to exit (if n becomes 0 on
> error, then this is an infinite loop!).
> 
>> +        bdrv_dirty_iter_init(s->top, &hbi);
>> +        for (next_dirty = hbitmap_iter_next(&hbi);
>> +                next_dirty >= 0;
>> +                next_dirty = hbitmap_iter_next(&hbi)) {
>> +            sector_num = next_dirty;
>> +            if (block_job_is_cancelled(&s->common)) {
>> +                goto exit;
>> +            }
>> +            delay_ns = ratelimit_calculate_delay(&s->limit,
>> +                                                 COMMIT_BUFFER_SECTORS);
>> +            /* Note that even when no rate limit is applied we need to yield
>> +             * with no pending I/O here so that bdrv_drain_all() returns.
>> +             */
>> +            block_job_sleep_ns(&s->common, rt_clock, delay_ns);
>> +            trace_commit_one_iteration(s, sector_num,
>> +                                       COMMIT_BUFFER_SECTORS, ret);
>> +            ret = commit_populate(top, base, sector_num,
>> +                                  COMMIT_BUFFER_SECTORS, buf);
> 
> Can we be sure that a guest write during commit_populate()...
> 
>> +            if (ret < 0) {
>> +                if (s->on_error == BLOCKDEV_ON_ERROR_STOP ||
>> +                    s->on_error == BLOCKDEV_ON_ERROR_REPORT ||
>> +                    (s->on_error == BLOCKDEV_ON_ERROR_ENOSPC &&
>> +                         ret == -ENOSPC)) {
>> +                    goto exit;
>> +                } else {
>> +                    continue;
>> +                }
>>              }
>> +            /* Publish progress */
>> +            s->common.offset += COMMIT_BUFFER_BYTES;
>> +            bdrv_reset_dirty(top, sector_num, COMMIT_BUFFER_SECTORS);
> 
> ...sets the dirty but *after* us?  Otherwise there is a race condition
> where guest writes fail to be copied into the base image.
> 
> I think the answer is "no" since commit_populate() performs two separate
> blocking operations: bdrv_read(top) followed by bdrv_write(base).  Now
> imagine the guest does bdrv_aio_writev(top) after we complete
> bdrv_read(top) but before we do bdrv_reset_dirty().

Indeed, reset must always come _before_ reads (and set comes always
after writes).

Paolo

  reply	other threads:[~2013-07-30 13:18 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-30  7:17 [Qemu-devel] [PATCH v2 0/2] block: allow commit active as top Fam Zheng
2013-07-30  7:17 ` [Qemu-devel] [PATCH v2 1/2] block: allow live commit of active image Fam Zheng
2013-07-30 12:53   ` Stefan Hajnoczi
2013-07-30 13:17     ` Paolo Bonzini [this message]
2013-08-15  7:46     ` Fam Zheng
2013-07-30  7:17 ` [Qemu-devel] [PATCH v2 2/2] qemu-iotests: update test cases for commit active Fam Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51F7BCFE.40808@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=famz@redhat.com \
    --cc=jcody@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.