From: Marcelo Tosatti
Subject: Re: [Qemu-devel] KVM call agenda for June 28
Date: Tue, 5 Jul 2011 15:18:19 -0300
Message-ID: <20110705181819.GA27175@amt.cnet>
References: <20110630143620.GA4366@amt.cnet> <4E0C8D90.8050305@redhat.com>
 <20110630183829.GA8752@amt.cnet> <4E12C4F5.9000100@redhat.com>
 <20110705125858.GA21254@amt.cnet> <4E1313FA.1060905@redhat.com>
 <20110705143230.GA22955@amt.cnet>
To: Stefan Hajnoczi
Cc: Dor Laor, Kevin Wolf, Chris Wright, KVM devel mailing list,
 quintela@redhat.com, jes sorensen, qemu-devel@nongnu.org, Avi Kivity

On Tue, Jul 05, 2011 at 04:37:08PM +0100, Stefan Hajnoczi wrote:
> On Tue, Jul 5, 2011 at 3:32 PM, Marcelo Tosatti wrote:
> > On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
> >> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
> >> >On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
> >> >>On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor wrote:
> >> >>>I tried to re-arrange all of the requirements and use cases using this wiki
> >> >>>page: http://wiki.qemu.org/Features/LiveBlockMigration
> >> >>>
> >> >>>It would be the best to agree upon the most interesting use cases (while we
> >> >>>make sure we cover future ones) and agree to them.
> >> >>>The next step is to set the interface for all the various verbs since the
> >> >>>implementation seems to be converging.
> >> >>
> >> >>Live block copy was supposed to support snapshot merge. I think the
> >> >>current favored approach is to make the source image a backing file to
> >> >>the destination image and essentially do image streaming.
> >> >>
> >> >>Using this mechanism for snapshot merge is tricky. The COW file
> >> >>already uses the read-only snapshot base image. So now we cannot
> >> >>trivially copy the COW file contents back into the snapshot base image
> >> >>using live block copy.
> >> >
> >> >It never did. Live copy creates a new image where both snapshot and
> >> >"current" are copied to.
> >> >
> >> >This is similar with image streaming.
> >>
> >> Not sure I realize what's bad to do in-place merge:
> >>
> >> Let's suppose we have this COW chain:
> >>
> >>   base <-- s1 <-- s2
> >>
> >> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
> >>
> >>   base <-- s1 <-- s2 <-- s3
> >>
> >> Now we've done with s2 (post backup) and like to merge s3 into s2.
> >>
> >> With your approach we use live copy of s3 into newSnap:
> >>
> >>   base <-- s1 <-- s2 <-- s3
> >>   base <-- s1 <-- newSnap
> >>
> >> When it is over s2 and s3 can be erased.
> >> The down side is the IOs for copying s2 data and the temporary
> >> storage. I guess temp storage is cheap but excessive IO are
> >> expensive.
> >>
> >> My approach was to collapse s3 into s2 and erase s3 eventually:
> >>
> >> before: base <-- s1 <-- s2 <-- s3
> >> after:  base <-- s1 <-- s2
> >>
> >> If we use live block copy using mirror driver it should be safe as
> >> long as we keep the ordering of new writes into s3 during the
> >> execution.
> >> Even a failure in the middle won't cause harm since the
> >> management will keep using s3 until it gets success event.
> >
> > Well, it is more complicated than simply streaming into a new
> > image. I'm not entirely sure it is necessary. The common case is:
> >
> > base -> sn-1 -> sn-2 -> ... -> sn-n
> >
> > When n reaches a limit, you do:
> >
> > base -> merge-1
> >
> > You're potentially copying similar amount of data when merging back into
> > a single image (and you can't easily merge multiple snapshots).
> >
> > If the amount of data that's not in 'base' is large, you create
> > leave a new external file around:
> >
> > base -> merge-1 -> sn-1 -> sn-2 ... -> sn-n
> > to
> > base -> merge-1 -> merge-2
> >
> >> >
> >> >>It seems like snapshot merge will require dedicated code that reads
> >> >>the allocated clusters from the COW file and writes them back into the
> >> >>base image.
> >> >>
> >> >>A very inefficient alternative would be to create a third image, the
> >> >>"merge" image file, which has the COW file as its backing file:
> >> >>snapshot (base) -> cow -> merge
> >
> > Remember there is a 'base' before snapshot, you don't copy the entire
> > image.
>
> One use case I have in mind is the Live Backup approach that Jagane
> has been developing. Here the backup solution only creates a snapshot
> for the period of time needed to read out the dirty blocks. Then the
> snapshot is deleted again and probably contains very little new data
> relative to the base image. The backup solution does this operation
> every day.
>
> This is the pathological case for any approach that copies the entire
> base into a new file. We could have avoided a lot of I/O by doing an
> in-place update.
>
> I want to make sure this works well.

This use case does not fit the streaming scheme that has come up. It's a
completely different operation. IMO it should be implemented separately.

> Stefan
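
The I/O trade-off discussed in this thread (collapsing s3 into s2 in place
versus copying s2 and s3 into a new image on top of s1) can be illustrated
with a toy model. This is only a sketch, not QEMU code: an image is assumed
to be a plain mapping from cluster index to data, and the names
merge_in_place and copy_to_new_image are hypothetical helpers invented for
the illustration. The cluster counts are made-up numbers, chosen to mimic a
short-lived backup snapshot that holds very little new data.

```python
from typing import Dict, List, Tuple

# Toy model (assumption): an image maps cluster index -> data.
# A missing key means the cluster is unallocated in that layer.
Image = Dict[int, bytes]


def merge_in_place(top: Image, below: Image) -> int:
    """Collapse `top` into the image directly below it.

    Only clusters allocated in `top` are written. Returns the write count.
    """
    for idx, data in top.items():
        below[idx] = data
    return len(top)


def copy_to_new_image(layers: List[Image]) -> Tuple[Image, int]:
    """Flatten `layers` (oldest first) into a fresh image.

    Every cluster allocated in any layer is written once; newer layers win.
    Returns the new image and the write count.
    """
    new_image: Image = {}
    for layer in layers:
        new_image.update(layer)
    return new_image, len(new_image)


# Hypothetical sizes: s2 holds 1000 clusters, s3 only 10 (5 of them overlap s2).
s2: Image = {i: b"s2" for i in range(1000)}
s3: Image = {i: b"s3" for i in range(995, 1005)}

# Copy approach: base <-- s1 <-- newSnap, built from s2 and s3.
new_snap, copy_writes = copy_to_new_image([s2, s3])

# In-place approach: base <-- s1 <-- s2, with s3 folded into s2.
in_place_writes = merge_in_place(s3, s2)

print("copy to new image:", copy_writes, "cluster writes")
print("in-place merge:   ", in_place_writes, "cluster writes")
```

Both approaches end with the same guest-visible data in this model; the
difference is the number of clusters written, which scales with the size of
s2 plus s3 in the copy case and with the size of s3 alone in the in-place
case. The model ignores ordering of concurrent guest writes and failure
handling, which are the hard parts raised above.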