kvm.vger.kernel.org archive mirror
From: Jagane Sundar <jagane@sundar.org>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Jes Sorensen <Jes.Sorensen@redhat.com>,
	Anthony Liguori <aliguori@us.ibm.com>,
	Badari Pulavarty <pbadari@us.ibm.com>
Subject: Re: A Live Backup feature for KVM
Date: Mon, 25 Apr 2011 01:16:19 -0700
Message-ID: <4DB52DD3.4000600@sundar.org>
In-Reply-To: <BANLkTi=3rZZKoupZECV_4o5-QA6R64Fo_g@mail.gmail.com>

Hello Stefan,

It's good to know that live snapshots and online backup are useful
functions.

I read through the two snapshot proposals that you pointed me at.

The direction that I chose to go is slightly different. In both of the
proposals you pointed me at, the original virtual disk is made
read-only and the VM writes to a different COW file. After backup
of the original virtual disk file is complete, the COW file is merged
with the original vdisk file.

Instead, I create an Original-Blocks-COW-file that stores the original
contents of any block the VM overwrites while a backup is in progress.
Livebackup copies each such block out of the original virtual disk file
before the VM's write to that file is scheduled. The advantage is that
no merge is necessary at the end of the backup; we can simply delete
the Original-Blocks-COW-file.
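
To illustrate the copy-before-write idea, here is a minimal Python
sketch. It is a toy model and not the code in livebackup.c: the class
and method names are hypothetical, the "disk" is an in-memory list of
blocks, and the Original-Blocks-COW-file is modelled as a dict.

```python
class CopyBeforeWriteDisk:
    """Toy model of a virtual disk with an Original-Blocks-COW store.

    All names here are hypothetical; this only illustrates the idea.
    """

    def __init__(self, blocks):
        self.blocks = list(blocks)  # stands in for the original virtual disk file
        self.cow = None             # stands in for the Original-Blocks-COW-file

    def start_backup(self):
        # An empty COW store means "no block has been overwritten yet".
        self.cow = {}

    def vm_write(self, index, data):
        # Save the original block once, before the VM's write lands.
        if self.cow is not None and index not in self.cow:
            self.cow[index] = self.blocks[index]
        self.blocks[index] = data

    def backup_read(self, index):
        # The backup sees block contents as of the start of the backup.
        if self.cow is not None and index in self.cow:
            return self.cow[index]
        return self.blocks[index]

    def end_backup(self):
        # No merge step: the disk already holds the VM's latest data,
        # so the saved originals can simply be dropped.
        self.cow = None
```

The VM always reads and writes the original disk directly; only the
backup reader consults the saved originals.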

I have some reasons to believe that the Original-Blocks-COW-file
design that I am putting forth might work better. I have listed them
below. (It's past midnight here, so pardon me if it sounds garbled -- I
will try to clarify more in a writeup on wiki.qemu.org).
Let me know what your thoughts are.

I feel that the livebackup mechanism will impact the running VM
less. For example, if something goes wrong with the backup process,
then we can simply delete the Original-Blocks-COW-file and force
the backup client to do a full backup the next time around. The
running VM or its virtual disks are not impacted at all.

Adjunct functionality such as block migration and live migration
might be easier to support with the Original-Blocks-COW-file approach,
since the original virtual disk file remains the only virtual disk
file for the VM. If a live migration needs to happen while a
backup is in progress, we can just delete the Original-Blocks-COW-file
and be on our way.

Livebackup includes a rudimentary network protocol to transfer
the modified blocks to a livebackup_client. It supports incremental
backups. Also, livebackup treats a backup as containing all the virtual
disks of a VM. Hence a snapshot, in livebackup terms, refers to a
snapshot of all the virtual disks.

The approximate sequence of operation is as follows:
1. VM boots up. When bdrv_open_common opens any file backed
     virtual disk, it checks for a file called <base_file>.livebackupconf.
     If such a file exists, then the virtual disk is part of the backup set,
     and a chunk of memory is allocated to keep track of dirty blocks.
2. qemu starts a livebackup thread that listens on a specified port
     (e.g. port 7900) for connections from the livebackup client.
3. The livebackup_client connects to qemu at port 7900.
4. livebackup_client sends a 'do snapshot' command.
5. qemu waits 30 seconds for outstanding asynchronous I/O to complete.
6. When there are no more outstanding async I/O requests, qemu
     copies the dirty_bitmap to its snapshot structure and starts a new
     dirty bitmap.
7. livebackup_client starts iterating through the list of dirty blocks
     and saves these blocks to the backup image.
8. When all blocks have been backed up, the livebackup_client sends a
     'destroy snapshot' command; the server simply deletes the
     Original-Blocks-COW-files for each of the virtual disks and frees the
     calloc'd memory holding the dirty blocks list.
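
The dirty-block bookkeeping in steps 1, 6, 7 and 8 can be sketched
roughly as follows. This is an illustrative Python model only, with
hypothetical names; it uses one byte per block rather than a packed
bitmap, and it leaves out the network protocol and the wait for
outstanding async I/O.

```python
class DirtyTracker:
    """Toy model of per-disk dirty-block tracking (hypothetical names)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.dirty = bytearray(num_blocks)  # step 1: memory to track dirty blocks
        self.snapshot = None                # bitmap frozen by 'do snapshot'

    def mark_write(self, index):
        # Called for every VM write to this disk.
        self.dirty[index] = 1

    def do_snapshot(self):
        # Step 6: hand the current bitmap to the snapshot and start a new one,
        # so writes during the backup count towards the next incremental.
        self.snapshot = self.dirty
        self.dirty = bytearray(self.num_blocks)

    def snapshot_dirty_blocks(self):
        # Step 7: the blocks the client needs to fetch for this backup.
        if self.snapshot is None:
            return []
        return [i for i, flag in enumerate(self.snapshot) if flag]

    def destroy_snapshot(self):
        # Step 8: release the snapshot's bitmap.
        self.snapshot = None
```

For example, if blocks 2 and 5 are written before 'do snapshot' and
block 7 afterwards, the snapshot lists blocks 2 and 5, and block 7 is
carried over to the next incremental backup.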

Thanks for the pointers to virtagent and fsfreeze. fsfreeze looks exactly
like what is necessary to quiesce file system activity.

I have pushed my code to the following git tree.
git://github.com/jagane/qemu-kvm-livebackup.git

It started as a clone of the linux kvm tree at:

git clone git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git

If you want to look at the code, see livebackup.[ch] and livebackup_client.c.

This is very much a work in progress, and I expect to do a lot of
testing/debugging over the next few weeks. I will also create a
detailed proposal on wiki.qemu.org, with much more information.

Thanks,
Jagane

On 4/24/2011 1:32 AM, Stefan Hajnoczi wrote:
> On Sun, Apr 24, 2011 at 12:17 AM, Jagane Sundar<jagane@sundar.org>  wrote:
>> I would like to get your input on a KVM feature that I am
>> currently developing.
>>
>> What it does is this - it can perform full and incremental
>> disk backups of running KVM VMs, where a backup is defined
>> as a snapshot of the disk state of all virtual disks
>> configured for the VM.
> Great, there is definitely demand for live snapshots and online
> backup.  Some efforts are already underway to implement this.
>
> Jes has worked on a live snapshot feature for online backups.  The
> snapshot_blkdev QEMU monitor command is available in qemu.git and
> works like this:
> qemu>  snapshot_blkdev virtio-disk0 /tmp/new-img.qcow2
>
> It will create a new image file backed by the current image file.  It
> then switches the VM disk to the new image file.  All writes will go
> to the new image file.  The backup software on the host can now read
> from the original image file since it will not be modified.
>
> There is no support yet for live merging the new image file back into
> the original image file (live commit).
>
> Here are some of the workflows and requirements:
>
> http://wiki.qemu.org/Features/Snapshots
> http://wiki.qemu.org/Features/Snapshots2
> http://wiki.qemu.org/Features/Block/Merge
>
> It is possible to find the dirty blocks by enumerating allocated
> clusters in the new image file - these are the clusters that have been
> written to since the snapshot.
>
>> My proposal will also eventually need the capability to run an
>> agent in the guest for sync'ing the filesystem, flushing
>> database caches, etc. I am also unsure whether just sync'ing
>> an ext3 or ext4 FS and then snapshotting is adequate for backup
>> purposes.
> virtagent is being developed by Mike Roth as a guest agent for QEMU.
> One of the use cases for virtagent is backup/snapshots and Jes has
> submitted patches to add file system freeze.  You can find both
> virtagent and fsfreeze on the qemu mailing list.
>
>> Please let me know if you find this feature interesting. I am
>> looking forward to feedback on any and all aspects of this
>> design. I would like to work with the KVM community to
>> contribute this feature to the KVM code base.
> Do you have a link to a git repo with your code?
>
> Stefan


