From: Jagane Sundar <jagane@sundar.org>
To: kvm@vger.kernel.org
Subject: A Live Backup feature for KVM
Date: Sat, 23 Apr 2011 16:17:59 -0700
Message-ID: <4DB35E27.4030404@sundar.org>
Hello All,
I would like to get your input on a KVM feature that I am
currently developing.
It performs full and incremental disk backups of running
KVM VMs, where a backup is defined as a snapshot of the
disk state of all virtual disks configured for the VM.
This backup mechanism is built by modifying the qemu-kvm
userland process, and works as follows:
- If a VM is configured for backup, qemu-kvm maintains a
  list of the blocks dirtied since the last backup. Note
  that this is different from the dirty blocks list
  currently maintained for block migration purposes in
  that it is persistent across VM reboots.
- qemu-kvm creates a thread and listens for backup clients.
- A backup client connects to qemu-kvm and initiates an
  incremental backup.
    * A snapshot of each virtual disk is created by
      qemu-kvm. This is as simple as saving the dirty
      blocks map in the snapshot structure.
    * The dirty blocks are then transferred to the
      backup client.
    * While this transfer is in progress, if the VM writes
      to any block, the livebackup code intercepts the
      write, saves the old block in a qcow2 file, and
      then allows the write to proceed.
    * When all dirty blocks in the incremental backup have
      been transferred, the snapshot is destroyed.
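
To make the sequence above concrete, here is a minimal Python
sketch of the idea. It is only an in-memory model and not the
qemu-kvm code: the class LiveBackupDisk, the function
transfer_snapshot, and the tiny BLOCK_SIZE are all hypothetical
names chosen for illustration, and a plain dict stands in for
the qcow2 file that holds the preserved old blocks.

```python
BLOCK_SIZE = 4  # hypothetical, tiny so the example is easy to follow


class LiveBackupDisk:
    """Toy in-memory model of one virtual disk with live-backup tracking."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.dirty = set()          # blocks written since the last backup
        self.snapshot_dirty = None  # frozen dirty map while a backup runs
        self.cow = {}               # old contents saved during a backup

    def write(self, index, data):
        """Guest write path: intercept, preserve old data, then write."""
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        # During a backup, save the old block the first time it is
        # overwritten (a dict stands in for the qcow2 file here).
        if self.snapshot_dirty is not None and index not in self.cow:
            self.cow[index] = self.blocks[index]
        self.blocks[index] = data
        self.dirty.add(index)

    def create_snapshot(self):
        """Freeze the current dirty map; start a fresh one for new writes."""
        if self.snapshot_dirty is not None:
            raise RuntimeError("a backup is already in progress")
        self.snapshot_dirty = self.dirty
        self.dirty = set()
        self.cow = {}

    def read_snapshot_block(self, index):
        """Return the block as it was when the snapshot was created."""
        return self.cow.get(index, self.blocks[index])

    def destroy_snapshot(self):
        self.snapshot_dirty = None
        self.cow = {}


def transfer_snapshot(disk):
    """Collect every block in the frozen dirty map, then drop the snapshot."""
    backup = {i: disk.read_snapshot_block(i) for i in sorted(disk.snapshot_dirty)}
    disk.destroy_snapshot()
    return backup


if __name__ == "__main__":
    disk = LiveBackupDisk(4)
    disk.write(1, b"AAAA")
    disk.write(3, b"BBBB")
    disk.create_snapshot()
    disk.write(1, b"CCCC")  # guest keeps writing while the backup runs
    print(transfer_snapshot(disk))
    print(sorted(disk.dirty))
```

In this model the write made after create_snapshot does not
affect what the backup client receives, and it is recorded in
the new dirty set so that the next incremental backup picks
it up.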
I have considered other technologies that could solve the
same problem, such as LVM snapshots. It is possible to
create a new LVM logical volume for each virtual disk in
the VM. When a VM needs to be backed up, each of these
logical volumes is snapshotted. At this point things get
messy: I don't know of a good way to identify the blocks
that were modified since the last backup. Also, once these
blocks are identified, we need a mechanism to transfer
them over a TCP connection to the backup server. Perhaps
the 'dirty blocks' map could be exported to userland, with
a daemon transferring the blocks. Or maybe a kernel thread
capable of listening on TCP sockets could transfer the
blocks to the backup client (I don't know if this is
possible).
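
Whichever component ends up doing the transfer, the dirty
blocks have to be framed somehow on the TCP stream. The
Python sketch below shows one possible framing, purely as an
assumption: each record is a 12-byte header (block index and
payload length) followed by the block data, and a header with
length zero ends the stream. The functions send_blocks,
recv_exact and recv_blocks and the header layout are
hypothetical; no such wire format is defined in this proposal.

```python
import socket
import struct

# Hypothetical record header: block index (u64) and payload length (u32),
# both in network byte order.
HEADER = struct.Struct("!QI")


def send_blocks(sock, blocks):
    """Send {index: data} as framed records, then a zero-length terminator."""
    for index, data in blocks.items():
        sock.sendall(HEADER.pack(index, len(data)) + data)
    sock.sendall(HEADER.pack(0, 0))


def recv_exact(sock, n):
    """Read exactly n bytes; recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-record")
        buf.extend(chunk)
    return bytes(buf)


def recv_blocks(sock):
    """Receive framed records until the zero-length terminator arrives."""
    blocks = {}
    while True:
        index, length = HEADER.unpack(recv_exact(sock, HEADER.size))
        if length == 0:
            return blocks
        blocks[index] = recv_exact(sock, length)


if __name__ == "__main__":
    # A connected socket pair stands in for the server and the backup client.
    server, client = socket.socketpair()
    send_blocks(server, {1: b"AAAA", 3: b"BBBB"})
    print(recv_blocks(client))
    server.close()
    client.close()
```

Using a zero-length record as the terminator assumes that real
blocks are never empty, which holds for fixed-size disk blocks.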
In any case, my first attempt is to implement this in the
qemu-kvm userland binary.
The benefit of this technology to the end user is as follows.
Today, IaaS cloud platforms such as EC2 offer two types of
virtual disks for VM instances:
1. Ephemeral virtual disks, which are lost if there is a
   hardware failure.
2. EBS storage volumes, which are costly.
I think that an efficient disk backup mechanism will enable
a third type of virtual disk: one that is backed up, perhaps
every hour or so. A cloud operator using KVM virtual
machines can then offer three types of VMs:
1. An ephemeral VM that is lost if a hardware failure happens.
2. A backed-up VM that can be restored from the last hourly
   backup.
3. A highly available VM running off a NAS, SAN, or other
   shared storage.
VMware has extensive support for backing up running virtual
machines in its products. It is called VMware Consolidated
Backup. A lot of it seems to be targeted at Windows VMs,
with hooks provided into Microsoft's Volume Shadow Copy
Service (VSS) running in the guest.
My proposal will also eventually need the capability to run an
agent in the guest for sync'ing the filesystem, flushing
database caches, etc. I am also unsure whether just sync'ing
an ext3 or ext4 FS and then snapshotting is adequate for backup
purposes.
I want to target this feature squarely at the cloud use model,
with automated backups scheduled for instances created using
an EC2 or OpenStack API.
Please let me know if you find this feature interesting. I am
looking forward to feedback on any and all aspects of this
design. I would like to work with the KVM community to
contribute this feature to the KVM code base.
Thanks,
Jagane Sundar