In-Reply-To: <20110926075556.GB6455@stefanha-thinkpad.localdomain>
References: <20110923155726.GA23088@stefanha-thinkpad.localdomain> <20110926075556.GB6455@stefanha-thinkpad.localdomain>
Date: Mon, 26 Sep 2011 17:11:00 +0800
From: Zhi Yong Wu
Subject: Re: [Qemu-devel] [RFC] Generic image streaming
To: Stefan Hajnoczi
Cc: Kevin Wolf, Marcelo Tosatti, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel@nongnu.org

On Mon, Sep 26, 2011 at 3:55 PM, Stefan Hajnoczi wrote:
> On Mon, Sep 26, 2011 at 01:32:34PM +0800, Zhi Yong Wu wrote:
>> On Fri, Sep 23, 2011 at 11:57 PM, Stefan Hajnoczi
>> wrote:
>> > Here is my generic image streaming branch, which aims to provide a way
>> > to copy the contents of a backing file into an image file of a running
>> > guest without requiring specific support in the various block drivers
>> > (e.g. qcow2, qed, vmdk):
>> >
>> > http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/image-streaming-api
>> >
>> > The tree does not provide full image streaming yet but I'd like to
>> > discuss the approach taken in the code.
>> > Here are the main points:
>> >
>> > The image streaming API is available through HMP and QMP commands. When
>> > streaming is started on a block device a coroutine is created to do the
>> > background I/O work. The coroutine can be cancelled.
>> >
>> > While the coroutine copies data from the backing file into the image
>> > file, the guest may be performing I/O to the image file. Guest reads do
>> > not conflict with streaming but guest writes require special handling.
>> > If the guest writes to a region of the image file that we are currently
>> > copying, then there is the potential to clobber the guest write with old
>> > data from the backing file.
>> >
>> > Previously I solved this in a QED-specific way by taking advantage of
>> > the serialization of allocating write requests. In order to do this
>> > generically we need to track in-flight requests and have the ability to
>> > queue I/O. Guest writes that affect an in-flight streaming copy
>> > operation must wait for that operation to complete before being issued.
>> > Streaming copy operations must skip overlapping regions of guest writes.
>> >
>> > One big difference to the QED image streaming implementation is that
>> > this generic implementation is not based on copy-on-read operations.
>> > Instead we do a sequence of bdrv_is_allocated() to find regions for
>> > streaming, followed by bdrv_co_read() and bdrv_co_write() in order to
>> > populate the image file.
>> >
>> > It turns out that generic copy-on-read is not an attractive operation
>> > because it requires using bounce buffers for every request. Kevin
>> bounce buffers == buffer ring?
>
> A bounce buffer is a temporary buffer that is used because the actual
> data buffer is not addressable or cannot be directly accessed for some
> other reason. In this case it's because the guest should see read
> semantics and not find that writes to its read data buffer result in
> writes to disk.
>
>> > pointed out the case where a guest performs a read and pokes the data
>> > buffer before the read completes, copy-on-read would write out the
>> > modified memory into the image file unless we use a bounce buffer.
Sorry, to be honest, I don't know which scenario would cause guest-modified
memory to be written out into the image file.
>> Can you elaborate this?
>
> 1. Guest issues a read request.
> 2. QEMU issues host read request as first step in copy-on-read.
> 3. Host read request completes...
> 4. Guest overwrites its data buffer before QEMU acknowledges request
>    completion.
> 5. ...QEMU issues host write request.
> 6. Host completes write request and QEMU acknowledges guest read
>    completion.
Good, thanks.
>
> What happened is that we populated the image file with data from guest
> memory that does not match what is in the backing file. The guest
How can we detect that the two sets of data don't match?
> issued a read request, this should never result in writing to the image
> file.
>
> Although legitimate guests do not do this, a buggy guest could corrupt
> its disk in this way!
>
> Stefan
>

-- 
Regards,

Zhi Yong Wu
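
For readers following the thread, the six-step sequence quoted above can be
reproduced in a small simulation. This is only a minimal sketch in Python, not
QEMU code: the function copy_on_read(), its parameters, and the single-cluster
model are all hypothetical names made up for illustration. It assumes the
guest buffer, backing file, and image file can each be modelled as a few
bytes, and that the "poke" happens between the host read and the host write.

```python
def copy_on_read(backing, image, guest_buf, use_bounce, poke=None):
    """Simulate one copy-on-read request for a single cluster.

    Hypothetical model for illustration only; none of these names
    come from QEMU.
    """
    # Steps 2-3: the host read from the backing file completes.
    if use_bounce:
        # Data lands in a private buffer the guest cannot touch.
        src = bytearray(backing)
    else:
        # Data lands directly in guest memory.
        guest_buf[:] = backing
        src = guest_buf

    # Step 4: a buggy guest overwrites its buffer before completion.
    if poke is not None:
        poke(guest_buf)

    # Step 5: the host write populates the image file from `src`.
    image[:] = src

    # Step 6: the request completes; with a bounce buffer the guest
    # only now receives the data that was read.
    if use_bounce:
        guest_buf[:] = src
    return image


def buggy_poke(buf):
    # The guest scribbles over the first two bytes of its read buffer.
    buf[0:2] = b"ZZ"


if __name__ == "__main__":
    backing = b"AAAA"
    for use_bounce in (False, True):
        image, guest = bytearray(4), bytearray(4)
        copy_on_read(backing, image, guest, use_bounce, poke=buggy_poke)
        print("bounce" if use_bounce else "direct",
              "image matches backing:", bytes(image) == backing)
```

In this model the image file diverges from the backing file only when the
write is issued straight from guest memory, which is the case the bounce
buffer is meant to rule out.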