From: Stefan Hajnoczi <stefanha@gmail.com>
To: Zhi Yong Wu <zwu.kernel@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	Zhi Yong Wu <wuzhy@cn.ibm.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC] Generic image streaming
Date: Mon, 26 Sep 2011 08:55:56 +0100
Message-ID: <20110926075556.GB6455@stefanha-thinkpad.localdomain>
In-Reply-To: <CAEH94LjCd8trB8HsXEyBx_JTcw=RR6UaMW-KAGW8vi7Ya3cxsQ@mail.gmail.com>

On Mon, Sep 26, 2011 at 01:32:34PM +0800, Zhi Yong Wu wrote:
> On Fri, Sep 23, 2011 at 11:57 PM, Stefan Hajnoczi
> <stefanha@linux.vnet.ibm.com> wrote:
> > Here is my generic image streaming branch, which aims to provide a way
> > to copy the contents of a backing file into an image file of a running
> > guest without requiring specific support in the various block drivers
> > (e.g.  qcow2, qed, vmdk):
> >
> > http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/image-streaming-api
> >
> > The tree does not provide full image streaming yet but I'd like to
> > discuss the approach taken in the code.  Here are the main points:
> >
> > The image streaming API is available through HMP and QMP commands.  When
> > streaming is started on a block device a coroutine is created to do the
> > background I/O work.  The coroutine can be cancelled.
> >
> > While the coroutine copies data from the backing file into the image
> > file, the guest may be performing I/O to the image file.  Guest reads do
> > not conflict with streaming but guest writes require special handling.
> > If the guest writes to a region of the image file that we are currently
> > copying, then there is the potential to clobber the guest write with old
> > data from the backing file.
> >
> > Previously I solved this in a QED-specific way by taking advantage of
> > the serialization of allocating write requests.  In order to do this
> > generically we need to track in-flight requests and have the ability to
> > queue I/O.  Guest writes that affect an in-flight streaming copy
> > operation must wait for that operation to complete before being issued.
> > Streaming copy operations must skip overlapping regions of guest writes.
> >
> > One big difference to the QED image streaming implementation is that
> > this generic implementation is not based on copy-on-read operations.
> > Instead we do a sequence of bdrv_is_allocated() to find regions for
> > streaming, followed by bdrv_co_read() and bdrv_co_write() in order to
> > populate the image file.
> >
> > It turns out that generic copy-on-read is not an attractive operation
> > because it requires using bounce buffers for every request.  Kevin
> bounce buffers == buffer ring?

A bounce buffer is a temporary buffer that is used because the actual
data buffer is not addressable or cannot be directly accessed for some
other reason.  In this case it is needed because the guest must see
plain read semantics: a guest that writes to its own read data buffer
must never cause those bytes to be written to disk.
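
To make the idea concrete, here is a minimal sketch of a copy-on-read
path that goes through a bounce buffer.  It is not QEMU code: struct
disk, SECTOR_SIZE and cor_read_bounce() are hypothetical names invented
for illustration, and the "disks" are single in-memory sectors.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { SECTOR_SIZE = 512 };

/* Hypothetical one-sector stand-ins for the backing file and image file. */
struct disk {
    unsigned char data[SECTOR_SIZE];
};

/*
 * Copy-on-read through a bounce buffer.  The host write is issued from a
 * private copy, so nothing the guest does to guest_buf while the request
 * is in flight can reach the image file.  Returns 0 on success, -1 if the
 * bounce buffer cannot be allocated.
 */
static int cor_read_bounce(const struct disk *backing, struct disk *image,
                           unsigned char *guest_buf)
{
    unsigned char *bounce = malloc(SECTOR_SIZE);

    if (!bounce) {
        return -1;
    }
    memcpy(bounce, backing->data, SECTOR_SIZE); /* host read into private copy */
    memcpy(image->data, bounce, SECTOR_SIZE);   /* host write from private copy */
    memcpy(guest_buf, bounce, SECTOR_SIZE);     /* only now fill guest memory */
    free(bounce);
    return 0;
}
```

The price is one extra allocation and one extra copy per request, which
is the cost being weighed above.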

> > pointed out the case where a guest performs a read and pokes the data
> > buffer before the read completes, copy-on-read would write out the
> > modified memory into the image file unless we use a bounce buffer.
> Can you elaborate on this?

1. Guest issues a read request.
2. QEMU issues host read request as first step in copy-on-read.
3. Host read request completes...
4. Guest overwrites its data buffer before QEMU acknowledges request
   completion.
5. ...QEMU issues host write request.
6. Host completes write request and QEMU acknowledges guest read
   completion.

What happened is that we populated the image file with data from guest
memory that does not match what is in the backing file.  The guest only
issued a read request, and a read should never change what ends up in
the image file.

Although legitimate guests do not do this, a buggy guest could corrupt
its disk in this way!
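
The six steps above can be replayed in a small simulation.  This is only
a sketch under simplifying assumptions: the disks are single in-memory
sectors, the race is modelled by a callback rather than real
concurrency, and every name here (struct disk, cor_read_unsafe(),
scribble(), image_matches_backing()) is made up for illustration and is
not part of QEMU.

```c
#include <assert.h>
#include <string.h>

enum { SECTOR_SIZE = 512 };

/* Hypothetical one-sector stand-ins for the backing file and image file. */
struct disk {
    unsigned char data[SECTOR_SIZE];
};

/* Models the guest touching its data buffer mid-request (step 4). */
typedef void (*guest_poke_fn)(unsigned char *guest_buf);

/*
 * Naive copy-on-read without a bounce buffer: the host read lands
 * directly in guest memory and the host write is issued from that same
 * memory.
 */
static void cor_read_unsafe(const struct disk *backing, struct disk *image,
                            unsigned char *guest_buf, guest_poke_fn poke)
{
    memcpy(guest_buf, backing->data, SECTOR_SIZE); /* steps 2-3: host read */
    if (poke) {
        poke(guest_buf);                           /* step 4: guest overwrites */
    }
    memcpy(image->data, guest_buf, SECTOR_SIZE);   /* steps 5-6: host write */
}

/* A "buggy guest" that scribbles over its buffer before completion. */
static void scribble(unsigned char *guest_buf)
{
    memset(guest_buf, 0xFF, SECTOR_SIZE);
}

/* Returns 1 if the image file ended up matching the backing file. */
static int image_matches_backing(const struct disk *backing,
                                 const struct disk *image)
{
    return memcmp(backing->data, image->data, SECTOR_SIZE) == 0;
}
```

Calling cor_read_unsafe() with scribble as the poke callback leaves
image_matches_backing() returning 0; passing NULL instead (a guest that
behaves) leaves it returning 1.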

Stefan
