From: Stefan Hajnoczi <stefanha@gmail.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Chris Wright <chrisw@redhat.com>,
	KVM devel mailing list <kvm@vger.kernel.org>,
	quintela@redhat.com, jes sorensen <jes.sorensen@redhat.com>,
	Dor Laor <dlaor@redhat.com>,
	qemu-devel@nongnu.org, Avi Kivity <avi@redhat.com>
Subject: Re: [Qemu-devel] KVM call agenda for June 28
Date: Thu, 7 Jul 2011 16:25:37 +0100
Message-ID: <CAJSP0QU9M6xjHNnCJZaLtTRUgEd4BBaJm-n4tcDkAn8c-hw1bg@mail.gmail.com>
In-Reply-To: <20110705181819.GA27175@amt.cnet>

On Tue, Jul 5, 2011 at 7:18 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Jul 05, 2011 at 04:37:08PM +0100, Stefan Hajnoczi wrote:
>> On Tue, Jul 5, 2011 at 3:32 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
>> >> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
>> >> >On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
>> >> >>On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>  wrote:
>> >> >>>I tried to re-arrange all of the requirements and use cases using this wiki
>> >> >>>page: http://wiki.qemu.org/Features/LiveBlockMigration
>> >> >>>
>> >> >>>It would be the best to agree upon the most interesting use cases (while we
>> >> >>>make sure we cover future ones) and agree to them.
>> >> >>>The next step is to set the interface for all the various verbs since the
>> >> >>>implementation seems to be converging.
>> >> >>
>> >> >>Live block copy was supposed to support snapshot merge.  I think the
>> >> >>current favored approach is to make the source image a backing file to
>> >> >>the destination image and essentially do image streaming.
>> >> >>
>> >> >>Using this mechanism for snapshot merge is tricky.  The COW file
>> >> >>already uses the read-only snapshot base image.  So now we cannot
>> >> >>trivially copy the COW file contents back into the snapshot base image
>> >> >>using live block copy.
>> >> >
>> >> >It never did. Live copy creates a new image where both snapshot and
>> >> >"current" are copied to.
>> >> >
>> >> >This is similar to image streaming.
>> >>
>> >> Not sure I see what's bad about doing an in-place merge:
>> >>
>> >> Let's suppose we have this COW chain:
>> >>
>> >>   base <-- s1 <-- s2
>> >>
>> >> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
>> >>
>> >>   base <-- s1 <-- s2 <-- s3
>> >>
>> >> Now we're done with s2 (post backup) and would like to merge s3 into s2.
>> >>
>> >> With your approach we use live copy of s3 into newSnap:
>> >>
>> >>   base <-- s1 <-- s2 <-- s3
>> >>   base <-- s1 <-- newSnap
>> >>
>> >> When it is over s2 and s3 can be erased.
>> >> The down side is the IOs for copying s2 data and the temporary
>> >> storage. I guess temp storage is cheap but excessive IO is
>> >> expensive.
>> >>
>> >> My approach was to collapse s3 into s2 and erase s3 eventually:
>> >>
>> >> before: base <-- s1 <-- s2 <-- s3
>> >> after:  base <-- s1 <-- s2
>> >>
>> >> If we use live block copy using mirror driver it should be safe as
>> >> long as we keep the ordering of new writes into s3 during the
>> >> execution.
>> >> Even a failure in the middle won't cause harm since the
>> >> management will keep using s3 until it gets success event.
>> >
>> > Well, it is more complicated than simply streaming into a new
>> > image. I'm not entirely sure it is necessary. The common case is:
>> >
>> > base -> sn-1 -> sn-2 -> ... -> sn-n
>> >
>> > When n reaches a limit, you do:
>> >
>> > base -> merge-1
>> >
>> > You're potentially copying a similar amount of data when merging back into
>> > a single image (and you can't easily merge multiple snapshots).
>> >
>> > If the amount of data that's not in 'base' is large, you
>> > leave a new external file around:
>> >
>> > base -> merge-1 -> sn-1 -> sn-2 ... -> sn-n
>> > to
>> > base -> merge-1 -> merge-2
>> >
>> >> >
>> >> >>It seems like snapshot merge will require dedicated code that reads
>> >> >>the allocated clusters from the COW file and writes them back into the
>> >> >>base image.
>> >> >>
>> >> >>A very inefficient alternative would be to create a third image, the
>> >> >>"merge" image file, which has the COW file as its backing file:
>> >> >>snapshot (base) ->  cow ->  merge
>> >
>> > Remember there is a 'base' before snapshot, you don't copy the entire
>> > image.
>>
>> One use case I have in mind is the Live Backup approach that Jagane
>> has been developing.  Here the backup solution only creates a snapshot
>> for the period of time needed to read out the dirty blocks.  Then the
>> snapshot is deleted again and probably contains very little new data
>> relative to the base image.  The backup solution does this operation
>> every day.
>>
>> This is the pathological case for any approach that copies the entire
>> base into a new file.  We could have avoided a lot of I/O by doing an
>> in-place update.
>>
>> I want to make sure this works well.
>
> This use case does not fit the streaming scheme that has come up. It's a
> completely different operation.
>
> IMO it should be implemented separately.

Okay, not everything can fit into this one grand unified block
copy/image streaming mechanism :).

Stefan
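
For illustration only, here is a toy model of the I/O trade-off discussed
in this thread. It is not QEMU code. It assumes an image can be
represented as a mapping from cluster index to data, with only allocated
clusters present, and the names Image, merge_in_place and
copy_to_new_image are hypothetical. It compares merging s3 down into s2
against copying s2 and s3 into a fresh newSnap, for the daily-backup case
where s3 holds very little new data.

```python
from typing import Dict, List, Tuple

# Toy image: cluster index -> data. Only allocated clusters are present.
Image = Dict[int, bytes]


def merge_in_place(backing: Image, top: Image) -> int:
    """Copy every allocated cluster of `top` down into `backing`.

    Returns the number of clusters written.
    """
    for index, data in top.items():
        backing[index] = data
    return len(top)


def copy_to_new_image(layers: List[Image]) -> Tuple[Image, int]:
    """Flatten `layers` (ordered oldest to newest) into a fresh image.

    Returns the new image and the number of clusters written.
    """
    merged: Image = {}
    for layer in layers:
        merged.update(layer)  # newer layers override older ones
    return merged, len(merged)


# Daily-backup scenario: s2 holds a lot of data, s3 only a few new writes.
s2: Image = {i: b"old" for i in range(1000)}
s3: Image = {5: b"new", 7: b"new", 2000: b"new"}

# Copy approach: s2 and s3 are both written into newSnap.
new_snap, copy_writes = copy_to_new_image([s2, s3])

# In-place approach: only the clusters allocated in s3 are written into s2.
in_place_writes = merge_in_place(s2, s3)

print(copy_writes, in_place_writes)
```

Both approaches end with the same guest-visible contents, but in this
model the in-place merge writes only the clusters allocated in s3, while
the copy rewrites everything allocated in s2 as well. The model ignores
ordering of concurrent guest writes and failure handling, which are the
hard parts raised above.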
