All of lore.kernel.org
 help / color / mirror / Atom feed
From: Avi Kivity <avi@redhat.com>
To: Richard Laager <rlaager@wiktel.com>
Cc: kvm@vger.kernel.org, qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [RFC] Consistent Snapshots Idea
Date: Mon, 21 Nov 2011 14:31:31 +0200	[thread overview]
Message-ID: <4ECA44A3.2040200@redhat.com> (raw)
In-Reply-To: <1321876869.761.59.camel@watermelon.coderich.net>

On 11/21/2011 02:01 PM, Richard Laager wrote:
> I'm not an expert on the architecture of KVM, so perhaps this is a QEMU
> question. If so, please let me know and I'll ask on a different list.

It is a qemu question, yes (though fork()ing a guest also relates to kvm).

> Background:
>
> Assuming the block layer can make instantaneous snapshots of a guest's
> disk (e.g. lvcreate -s), one can get "crash consistent" (i.e. as if the
> guest crashed) snapshots. To get a "fully consistent" snapshot, you need
> to shutdown the guest. For production VMs, this is obviously not ideal.
>
> Idea:
>
> What if KVM/QEMU was to fork() the guest and shutdown one copy?
>
> KVM/QEMU would momentarily halt the execution of the guest and take a
> writable, instantaneous snapshot of each block device. Then it would
> fork(). The parent would resume execution as normal. The child would
> redirect disk writes to the snapshot(s). The RAM should have
> copy-on-write behavior as with any other fork()ed process. Other
> resources like the network, display, sound, serial, etc. would simply be
> disconnected/bit-bucketed. Finally, the child would resume guest
> execution and send the guest an ACPI power button press event. This
> would cause the guest OS to perform an orderly shutdown.
>
> I believe this would provide consistent snapshots in the vast majority
> of real-world scenarios in a guest OS and application-independent way.

Interesting idea.  Will the guest actually shut down nicely without a
network?  Things like NFS mounts will break.

> Implementation Nits:
>
>       * A timeout on the child process would likely be a good idea.
>       * It'd probably be best to disconnect the network (i.e. tell the
>         guest the cable is unplugged) to avoid long timeouts. Likewise
>         for the hardware flow-control lines on the serial port.

This is actually critical, otherwise the guest will shutdown(2) all
sockets and confuse the clients.

>       * For correctness, fdatasync()ing or similar might be necessary
>         after halting execution and before creating the snapshots.

Microsoft guests have an API to quiesce storage prior to a snapshot, and
I think there is work to bring this to Linux guests.  So it should be
possible to get consistent snapshots even without this, but it takes
more integration.

-- 
error compiling committee.c: too many arguments to function


WARNING: multiple messages have this Message-ID (diff)
From: Avi Kivity <avi@redhat.com>
To: Richard Laager <rlaager@wiktel.com>
Cc: qemu-devel <qemu-devel@nongnu.org>, kvm@vger.kernel.org
Subject: Re: [Qemu-devel] [RFC] Consistent Snapshots Idea
Date: Mon, 21 Nov 2011 14:31:31 +0200	[thread overview]
Message-ID: <4ECA44A3.2040200@redhat.com> (raw)
In-Reply-To: <1321876869.761.59.camel@watermelon.coderich.net>

On 11/21/2011 02:01 PM, Richard Laager wrote:
> I'm not an expert on the architecture of KVM, so perhaps this is a QEMU
> question. If so, please let me know and I'll ask on a different list.

It is a qemu question, yes (though fork()ing a guest also relates to kvm).

> Background:
>
> Assuming the block layer can make instantaneous snapshots of a guest's
> disk (e.g. lvcreate -s), one can get "crash consistent" (i.e. as if the
> guest crashed) snapshots. To get a "fully consistent" snapshot, you need
> to shutdown the guest. For production VMs, this is obviously not ideal.
>
> Idea:
>
> What if KVM/QEMU was to fork() the guest and shutdown one copy?
>
> KVM/QEMU would momentarily halt the execution of the guest and take a
> writable, instantaneous snapshot of each block device. Then it would
> fork(). The parent would resume execution as normal. The child would
> redirect disk writes to the snapshot(s). The RAM should have
> copy-on-write behavior as with any other fork()ed process. Other
> resources like the network, display, sound, serial, etc. would simply be
> disconnected/bit-bucketed. Finally, the child would resume guest
> execution and send the guest an ACPI power button press event. This
> would cause the guest OS to perform an orderly shutdown.
>
> I believe this would provide consistent snapshots in the vast majority
> of real-world scenarios in a guest OS and application-independent way.

Interesting idea.  Will the guest actually shut down nicely without a
network?  Things like NFS mounts will break.

> Implementation Nits:
>
>       * A timeout on the child process would likely be a good idea.
>       * It'd probably be best to disconnect the network (i.e. tell the
>         guest the cable is unplugged) to avoid long timeouts. Likewise
>         for the hardware flow-control lines on the serial port.

This is actually critical, otherwise the guest will shutdown(2) all
sockets and confuse the clients.

>       * For correctness, fdatasync()ing or similar might be necessary
>         after halting execution and before creating the snapshots.

Microsoft guests have an API to quiesce storage prior to a snapshot, and
I think there is work to bring this to Linux guests.  So it should be
possible to get consistent snapshots even without this, but it takes
more integration.

-- 
error compiling committee.c: too many arguments to function

  reply	other threads:[~2011-11-21 12:31 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-21 12:01 [RFC] Consistent Snapshots Idea Richard Laager
2011-11-21 12:31 ` Avi Kivity [this message]
2011-11-21 12:31   ` [Qemu-devel] " Avi Kivity
2011-11-21 14:27   ` shu ming

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ECA44A3.2040200@redhat.com \
    --to=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rlaager@wiktel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.