From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [RFC] Consistent Snapshots Idea
Date: Mon, 21 Nov 2011 14:31:31 +0200
Message-ID: <4ECA44A3.2040200@redhat.com>
References: <1321876869.761.59.camel@watermelon.coderich.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org, qemu-devel
To: Richard Laager
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:53825 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750935Ab1KUMbf (ORCPT ); Mon, 21 Nov 2011 07:31:35 -0500
In-Reply-To: <1321876869.761.59.camel@watermelon.coderich.net>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 11/21/2011 02:01 PM, Richard Laager wrote:
> I'm not an expert on the architecture of KVM, so perhaps this is a QEMU
> question. If so, please let me know and I'll ask on a different list.

It is a qemu question, yes (though fork()ing a guest also relates to kvm).

> Background:
>
> Assuming the block layer can make instantaneous snapshots of a guest's
> disk (e.g. lvcreate -s), one can get "crash consistent" (i.e. as if the
> guest crashed) snapshots. To get a "fully consistent" snapshot, you need
> to shutdown the guest. For production VMs, this is obviously not ideal.
>
> Idea:
>
> What if KVM/QEMU was to fork() the guest and shutdown one copy?
>
> KVM/QEMU would momentarily halt the execution of the guest and take a
> writable, instantaneous snapshot of each block device. Then it would
> fork(). The parent would resume execution as normal. The child would
> redirect disk writes to the snapshot(s). The RAM should have
> copy-on-write behavior as with any other fork()ed process. Other
> resources like the network, display, sound, serial, etc. would simply be
> disconnected/bit-bucketed. Finally, the child would resume guest
> execution and send the guest an ACPI power button press event. This
> would cause the guest OS to perform an orderly shutdown.
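
The copy-on-write part of the proposal above can be illustrated with a
toy sketch. This is not QEMU or KVM code: the function name
fork_and_shut_down_copy, the guest_state dictionary, and the
"acpi-power-button" event string are all hypothetical stand-ins for the
guest's RAM and the power button event. It only shows that after fork()
the child can "shut down" its private copy of the state while the
parent's copy is left untouched. It assumes a POSIX system where
os.fork() is available.

```python
import json
import os


def fork_and_shut_down_copy(guest_state):
    """Fork; the child 'shuts down' its private copy and reports it back.

    Returns (child_view, child_exit_code). guest_state in the parent is
    not modified, because fork() gives the child its own copy-on-write
    view of memory.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: mutate only our private copy of the state.
        exit_code = 1
        try:
            os.close(read_fd)
            guest_state["running"] = False
            guest_state["events"].append("acpi-power-button")
            os.write(write_fd, json.dumps(guest_state).encode())
            os.close(write_fd)
            exit_code = 0
        finally:
            # Leave immediately so the forked copy never runs the
            # parent's cleanup handlers.
            os._exit(exit_code)

    # Parent: keep running and collect the child's view of the state.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        child_view = json.loads(reader.read().decode())
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return child_view, exit_code


if __name__ == "__main__":
    state = {"running": True, "events": []}
    child_view, code = fork_and_shut_down_copy(state)
    print("parent still sees:", state)
    print("child saw:", child_view, "exit code:", code)
```

A real implementation would also have to deal with everything fork()
does not duplicate or isolate (threads, open file descriptors, the
kernel-side KVM state), which this sketch ignores entirely.
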
>
> I believe this would provide consistent snapshots in the vast majority
> of real-world scenarios in a guest OS and application-independent way.

Interesting idea. Will the guest actually shut down nicely without a
network? Things like NFS mounts will break.

> Implementation Nits:
>
>       * A timeout on the child process would likely be a good idea.
>       * It'd probably be best to disconnect the network (i.e. tell the
>         guest the cable is unplugged) to avoid long timeouts. Likewise
>         for the hardware flow-control lines on the serial port.

This is actually critical, otherwise the guest will shutdown(2) all
sockets and confuse the clients.

>       * For correctness, fdatasync()ing or similar might be necessary
>         after halting execution and before creating the snapshots.

Microsoft guests have an API to quiesce storage prior to a snapshot, and
I think there is work to bring this to Linux guests. So it should be
possible to get consistent snapshots even without this, but it takes
more integration.

-- 
error compiling committee.c: too many arguments to function