From: Marcelo Tosatti
Subject: Re: KVM call agenda for June 28
Date: Thu, 30 Jun 2011 11:36:20 -0300
Message-ID: <20110630143620.GA4366@amt.cnet>
References: <20110628194106.GA17443@amt.cnet> <4E0ADAE0.6040204@redhat.com> <20110629154134.GA6631@amt.cnet>
To: Stefan Hajnoczi
Cc: Kevin Wolf, quintela@redhat.com, KVM devel mailing list, qemu-devel@nongnu.org, Chris Wright, Dor Laor, Avi Kivity

On Thu, Jun 30, 2011 at 01:54:09PM +0100, Stefan Hajnoczi wrote:
> On Wed, Jun 29, 2011 at 4:41 PM, Marcelo Tosatti wrote:
> > On Wed, Jun 29, 2011 at 11:08:23AM +0100, Stefan Hajnoczi wrote:
> >> This can be used to merge data from an intermediate image without
> >> merging the base image. When streaming completes the backing file
> >> will be set to the base image. The backing file relationship would
> >> typically look like this:
> >>
> >> 1. Before block_stream -a -b base.img ide0-hd completion:
> >>
> >> base.img <- sn1 <- ... <- ide0-hd.qed
> >>
> >> 2. After streaming completes:
> >>
> >> base.img <- ide0-hd.qed
> >>
> >> This describes the image streaming use cases that I, Adam, and Anthony
> >> propose to support. In the course of the discussion we've sometimes
> >> been distracted with the internals of what a unified live block
> >> copy/image streaming implementation should do. I wanted to post this
> >> summary of image streaming to refocus us on the use case and the APIs
> >> that users will see.
> >>
> >> Stefan
> >
> > OK, with an external COW file for formats that do not support it the
> > interface can be similar. Also there is no need to mirror writes,
> > no switch operation, always use destination image.
>
> Marcelo, does this mean you are happy with how management deals with
> power failure/crash during streaming?

Yep.

> Are we settled on the approach where the destination file always has
> the source file as its backing file?

Yep.

> Here are the components that I can identify:
>
> 1. blkmirror - used by live block copy to keep source and destination
> in sync. Already implemented as a block driver by Marcelo.

No need for it anymore: you now switch to the destination before the
operation starts, and always use the destination from there on.

> 2. External COW overlay - can be used to add backing file (COW)
> support on top of any image, including raw. Currently unimplemented,
> needs to be a block driver. Kevin, do you want to write this?
>
> 3. Unified background copy - image format-independent mechanism for
> copy contents of a backing file chain into the image file (with
> exception of backing files chained below base). Needs to play nice
> with blkmirror. Stefan can write this.

Note that the background copy itself simply reads from 0...END. The bulk
of the work is in the block driver.

> 4. Live block copy API and high-level control - the main code that
> adds the live block copy feature. Existing patches by Marcelo, can be
> restructured to use common core by Marcelo.

This can use your proposed block_stream interface, with a "block_switch"
command on top, so:

1) management creates copy.img with backing file current.img, allows
   access
2) management issues "block_switch dev copy.img"
3) management issues "block_stream dev base"

> 5. Image streaming API and high-level control - the main code that
> adds the image streaming feature. Existing patches by Stefan, Adam,
> Anthony, can be restructured to use common core by Stefan.
>
> I previously posted a proposed API for the unified background copy
> mechanism. I'm thinking that background copy is not the best name
> since it is limited to copying the backing file into the image file.
>
> /**
>  * Start a background copy operation
>  *
>  * Unallocated clusters in the image will be populated with data
>  * from its backing file. This operation runs in the background and a
>  * completion function is invoked when it is finished.
>  */
> BackgroundCopy *background_copy_start(
>     BlockDriverState *bs,
>
>     /**
>      * Note: Kevin suggests we migrate this into BlockDriverState
>      * in order to enable copy-on-read.
>      *
>      * Base image that both source and destination have as a
>      * backing file ancestor. Data will not be copied from base
>      * since both source and destination will have access to base
>      * image. This may be NULL to copy all data.
>      */
>     BlockDriverState *base,
>
>     BlockDriverCompletionFunc *cb, void *opaque);
>
> /**
>  * Cancel a background copy operation
>  *
>  * This function marks the background copy operation for cancellation and the
>  * completion function is invoked once the operation has been cancelled.
>  */
> void background_copy_cancel(BackgroundCopy *bgc,
>                             BlockDriverCompletionFunc *cb, void *opaque);
>
> /**
>  * Get progress of a running background copy operation
>  */
> void background_copy_get_status(BackgroundCopy *bgc,
>                                 BackgroundCopyStatus *status);
>
> Stefan

I thought of implementing the "block_stream" command by reopening the
device with

blkstream:imagename.img

Then:

AIO_READ:
- for each cluster in request:
    - if allocated-or-in-final-base, read.
    - check write queue, if present wait on it, if not, add "copy" entry
      to write queue.
    - issue cluster sized read from source.
    - on completion:
        - copy data to original read buffer, complete it.
        - if not cancelled, write cluster to destination.

AIO_WRITE:
- for each cluster in request:
    - check write queue, cancel/wait for "copy" entry.
    - add "guest" entry to write queue.
    - issue write to destination.
    - on completion:
        - remove write queue entry.

Once the 0...END background read completes, write the final base file
for the image.

So the block_stream/block_stream_cancel/block_stream_status commands, the
background read, and the rebase -u update can be kept separate from the
block driver.
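
To make the per-cluster rules above concrete, here is a rough synchronous
sketch in C. It is only an illustration under stated assumptions, not QEMU
code: struct toy_image, the toy_* functions and the in-memory arrays standing
in for image files are all hypothetical names. Real AIO is asynchronous, so
"waiting" on a queue entry is not modelled here; the sketch only shows that a
guest write cancels a pending "copy" entry and that clusters in the final
base are never copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NCLUSTERS    4   /* toy image size, in clusters */
#define CLUSTER_SIZE 4   /* toy cluster size, in bytes */

/* One write-queue slot per cluster: who owns the pending write, if anyone. */
enum wq_entry { WQ_NONE, WQ_COPY, WQ_GUEST };

/* Hypothetical in-memory stand-in for a destination image and its backing chain. */
struct toy_image {
    unsigned char src[NCLUSTERS][CLUSTER_SIZE];  /* data seen through the backing chain */
    unsigned char dest[NCLUSTERS][CLUSTER_SIZE]; /* destination image */
    bool allocated[NCLUSTERS];  /* cluster already present in destination */
    bool in_base[NCLUSTERS];    /* cluster comes from the final base: never copied */
    enum wq_entry wq[NCLUSTERS];
    bool backing_is_base;       /* set once the 0...END read has finished */
};

void toy_init(struct toy_image *img)
{
    memset(img, 0, sizeof(*img));
    for (int c = 0; c < NCLUSTERS; c++) {
        memset(img->src[c], 0xB0 + c, CLUSTER_SIZE); /* recognisable pattern */
    }
}

/*
 * AIO_READ for one cluster. Fills buf; returns true if a "copy" entry was
 * queued, in which case the caller must call toy_copy_finish() later.
 */
bool toy_read_begin(struct toy_image *img, int c, unsigned char *buf)
{
    assert(c >= 0 && c < NCLUSTERS);
    if (img->allocated[c]) {
        memcpy(buf, img->dest[c], CLUSTER_SIZE);
        return false;
    }
    memcpy(buf, img->src[c], CLUSTER_SIZE);  /* cluster sized read from source */
    if (img->in_base[c] || img->wq[c] != WQ_NONE) {
        /* In final base, or an entry is already queued (a real driver would
         * wait on it): do not queue another copy. */
        return false;
    }
    img->wq[c] = WQ_COPY;
    return true;
}

/* Completion of the source read: write to destination unless cancelled. */
void toy_copy_finish(struct toy_image *img, int c, const unsigned char *buf)
{
    assert(c >= 0 && c < NCLUSTERS);
    if (img->wq[c] != WQ_COPY) {
        return;  /* a guest write cancelled this copy */
    }
    memcpy(img->dest[c], buf, CLUSTER_SIZE);
    img->allocated[c] = true;
    img->wq[c] = WQ_NONE;
}

/* AIO_WRITE for one cluster: guest data always wins over a pending copy. */
void toy_write(struct toy_image *img, int c, const unsigned char *data)
{
    assert(c >= 0 && c < NCLUSTERS);
    img->wq[c] = WQ_GUEST;  /* replaces (cancels) any "copy" entry */
    memcpy(img->dest[c], data, CLUSTER_SIZE);
    img->allocated[c] = true;
    img->wq[c] = WQ_NONE;   /* on completion: remove write queue entry */
}

/* Background read from 0...END, then note that only the base remains. */
void toy_stream_all(struct toy_image *img)
{
    unsigned char buf[CLUSTER_SIZE];
    for (int c = 0; c < NCLUSTERS; c++) {
        if (toy_read_begin(img, c, buf)) {
            toy_copy_finish(img, c, buf);
        }
    }
    img->backing_is_base = true;  /* stand-in for the "rebase -u" update */
}
```

In this model the background read is just toy_stream_all() driving the same
read path the guest uses, which is the reason the commands and the rebase
step can live outside the block driver.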