qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Anthony Liguori <aliguori@us.ibm.com>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 3/8] block: add image streaming block job
Date: Thu, 3 Nov 2011 14:34:24 -0200	[thread overview]
Message-ID: <20111103163424.GA21743@amt.cnet> (raw)
In-Reply-To: <CAJSP0QXFrJiq8OhHAZRKW63X4BapnLcQ-DwQjEHH6n8J3KD4Ag@mail.gmail.com>

On Wed, Nov 02, 2011 at 03:43:49PM +0000, Stefan Hajnoczi wrote:
> On Tue, Nov 1, 2011 at 6:06 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Thu, Oct 27, 2011 at 04:22:50PM +0100, Stefan Hajnoczi wrote:
> >> +static int stream_one_iteration(StreamBlockJob *s, int64_t sector_num,
> >> +                                void *buf, int max_sectors, int *n)
> >> +{
> >> +    BlockDriverState *bs = s->common.bs;
> >> +    int ret;
> >> +
> >> +    trace_stream_one_iteration(s, sector_num, max_sectors);
> >> +
> >> +    ret = bdrv_is_allocated(bs, sector_num, max_sectors, n);
> >> +    if (ret < 0) {
> >> +        return ret;
> >> +    }
> >
> > bdrv_is_allocated is still synchronous? If so, there should be at least
> > a plan to make it asynchronous.
> 
> Yes, that's a good discussion to have.  My thoughts are that
> bdrv_is_allocated() should be executed in coroutine context.  The
> semantics are a little tricky because of parallel requests:
> 
> 1. If a write request is in progress when we do bdrv_is_allocated() we
> might get back "unallocated" even though clusters are just being
> allocated.
> 2. If a TRIM request is in progress when we do bdrv_is_allocated() we
> might get back "allocated" even though clusters are just being
> deallocated.
> 
> In order to be reliable the caller needs to be aware of parallel
> requests.  I think it's correct to defer this problem to the caller.
> 
> In the case of image streaming we're not TRIM-safe, I haven't really
> thought about it yet.  But we are safe against parallel write requests
> because there is serialization to prevent copy-on-read requests from
> racing with write requests.
> 
> >> +    if (!ret) {
> >> +        ret = stream_populate(bs, sector_num, *n, buf);
> >> +    }
> >> +    return ret;
> >> +}
> >> +
> >> +static void coroutine_fn stream_run(void *opaque)
> >> +{
> >> +    StreamBlockJob *s = opaque;
> >> +    BlockDriverState *bs = s->common.bs;
> >> +    int64_t sector_num, end;
> >> +    int ret = 0;
> >> +    int n;
> >> +    void *buf;
> >> +
> >> +    buf = qemu_blockalign(bs, STREAM_BUFFER_SIZE);
> >> +    s->common.len = bdrv_getlength(bs);
> >> +    bdrv_get_geometry(bs, (uint64_t *)&end);
> >> +
> >> +    bdrv_set_zero_detection(bs, true);
> >> +    bdrv_set_copy_on_read(bs, true);
> >
> > Should distinguish between stream initiated and user initiated setting
> > of zero detection and cor (so that unsetting below does not clear
> > user settings).
> 
> For zero detection I agree.
> 
> For copy-on-read it is questionable since once streaming is complete
> it does not make sense to have copy-on-read enabled.
> 
> I will address this in the next revision and think more about the
> copy-on-read case.
> 
> >> +
> >> +    for (sector_num = 0; sector_num < end; sector_num += n) {
> >> +        if (block_job_is_cancelled(&s->common)) {
> >> +            break;
> >> +        }
> >
> > If cancellation is seen here in the last loop iteration,
> > bdrv_change_backing_file below should not be executed.
> 
> I documented this case in the QMP API.  I'm not sure if it's possible
> to guarantee that the operation isn't just completing as you cancel
> it.  Any blocking point between completion of the last iteration and
> completing the operation is vulnerable to missing the cancel.  It's
> easier to explicitly say the operation might just have completed when
> you canceled, rather than trying to protect the completion path.  Do
> you think it's a problem to have these loose semantics that I
> described?

No, that is ok. I'm referring to bdrv_change_backing_file() being
executed without the entire image being streamed.

"if (sector_num == end && ret == 0)" includes both all sectors being 
streamed and all sectors except the last iteration being streamed (due
to job cancelled break).

> >> +
> >> +        /* TODO rate-limit */
> >> +        /* Note that even when no rate limit is applied we need to yield with
> >> +         * no pending I/O here so that qemu_aio_flush() is able to return.
> >> +         */
> >> +        co_sleep_ns(rt_clock, 0);
> >
> > How do you plan to implement rate limit?
> 
> It was implemented in the QED-specific image streaming series:
> 
> http://repo.or.cz/w/qemu/stefanha.git/commitdiff/22f2c09d2fcfe5e49ac4604fd23e4744f549a476
> 
> That implementation works fine and is small but I'd like to reuse the
> migration speed limit, if possible.  That way we don't have 3
> different rate-limiting implementations in QEMU :).

One possibility would be to create a "virtual" block device for
streaming, sitting on top of the real block device. Then enforce block
I/O limits on the virtual block device, the guest would remain accessing
the real block device.

  parent reply	other threads:[~2011-11-03 16:41 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-27 15:22 [Qemu-devel] [PATCH 0/8] block: generic image streaming Stefan Hajnoczi
2011-10-27 15:22 ` [Qemu-devel] [PATCH 1/8] coroutine: add co_sleep_ns() coroutine sleep function Stefan Hajnoczi
2011-10-27 15:22 ` [Qemu-devel] [PATCH 2/8] block: add BlockJob interface for long-running operations Stefan Hajnoczi
2011-10-27 15:22 ` [Qemu-devel] [PATCH 3/8] block: add image streaming block job Stefan Hajnoczi
2011-11-01 18:06   ` Marcelo Tosatti
2011-11-02 15:43     ` Stefan Hajnoczi
2011-11-02 16:43       ` Kevin Wolf
2011-11-03 16:34       ` Marcelo Tosatti [this message]
2011-11-04  8:03         ` Stefan Hajnoczi
2011-10-27 15:22 ` [Qemu-devel] [PATCH 4/8] qmp: add block_stream command Stefan Hajnoczi
2011-10-27 15:22 ` [Qemu-devel] [PATCH 5/8] qmp: add block_job_set_speed command Stefan Hajnoczi
2011-10-27 15:22 ` [Qemu-devel] [PATCH 6/8] qmp: add block_job_cancel command Stefan Hajnoczi
2011-10-27 15:22 ` [Qemu-devel] [PATCH 7/8] qmp: add query-block-jobs Stefan Hajnoczi
2011-10-27 15:22 ` [Qemu-devel] [PATCH 8/8] test: add image streaming test cases Stefan Hajnoczi
2011-10-27 18:58 ` [Qemu-devel] [PATCH 0/8] block: generic image streaming Luiz Capitulino
2011-11-01 16:46 ` Marcelo Tosatti
2011-11-02 11:06   ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111103163424.GA21743@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=aliguori@us.ibm.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    --cc=stefanha@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).