Message-ID: <4ECA44A3.2040200@redhat.com>
Date: Mon, 21 Nov 2011 14:31:31 +0200
From: Avi Kivity
References: <1321876869.761.59.camel@watermelon.coderich.net>
In-Reply-To: <1321876869.761.59.camel@watermelon.coderich.net>
Subject: Re: [Qemu-devel] [RFC] Consistent Snapshots Idea
To: Richard Laager
Cc: qemu-devel, kvm@vger.kernel.org

On 11/21/2011 02:01 PM, Richard Laager wrote:
> I'm not an expert on the architecture of KVM, so perhaps this is a QEMU
> question. If so, please let me know and I'll ask on a different list.

It is a qemu question, yes (though fork()ing a guest also relates to kvm).

> Background:
>
> Assuming the block layer can make instantaneous snapshots of a guest's
> disk (e.g. lvcreate -s), one can get "crash consistent" (i.e. as if the
> guest crashed) snapshots. To get a "fully consistent" snapshot, you need
> to shutdown the guest. For production VMs, this is obviously not ideal.
>
> Idea:
>
> What if KVM/QEMU was to fork() the guest and shutdown one copy?
>
> KVM/QEMU would momentarily halt the execution of the guest and take a
> writable, instantaneous snapshot of each block device. Then it would
> fork(). The parent would resume execution as normal. The child would
> redirect disk writes to the snapshot(s). The RAM should have
> copy-on-write behavior as with any other fork()ed process. Other
> resources like the network, display, sound, serial, etc. would simply be
> disconnected/bit-bucketed. Finally, the child would resume guest
> execution and send the guest an ACPI power button press event. This
> would cause the guest OS to perform an orderly shutdown.
>
> I believe this would provide consistent snapshots in the vast majority
> of real-world scenarios in a guest OS and application-independent way.

Interesting idea. Will the guest actually shut down nicely without a
network? Things like NFS mounts will break.

> Implementation Nits:
>
>       * A timeout on the child process would likely be a good idea.
>       * It'd probably be best to disconnect the network (i.e. tell the
>         guest the cable is unplugged) to avoid long timeouts. Likewise
>         for the hardware flow-control lines on the serial port.

This is actually critical, otherwise the guest will shutdown(2) all
sockets and confuse the clients.

>       * For correctness, fdatasync()ing or similar might be necessary
>         after halting execution and before creating the snapshots.

Microsoft guests have an API to quiesce storage prior to a snapshot, and
I think there is work to bring this to Linux guests. So it should be
possible to get consistent snapshots even without this, but it takes
more integration.

-- 
error compiling committee.c: too many arguments to function
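
For illustration only, the fork-and-shut-down flow discussed above can be sketched at the plain process level. This is a toy under loud assumptions, not QEMU or KVM code: a Python dict stands in for guest RAM, the "orderly shutdown" is an ordinary function, and every name (run_child_shutdown, fake_guest_shutdown, hung_guest_shutdown) is hypothetical. It shows only two points from the thread: the child works on a copy-on-write copy while the parent's state is left untouched, and the parent enforces a timeout on the child. Block-device snapshots, network unplugging and the ACPI power button event are not modelled.

```python
import os
import signal
import time


def run_child_shutdown(guest_ram, shutdown_fn, timeout_s=5.0):
    """Fork, let the child run shutdown_fn on its private copy of guest_ram.

    Returns (exited_cleanly, report). The report is a short string the child
    sends back over a pipe; keep it small, since the parent only reads the
    pipe after the child has exited or been killed.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: memory is copy-on-write, so changes stay in this process.
        os.close(read_fd)
        try:
            report = shutdown_fn(guest_ram)
            os.write(write_fd, report.encode())
            os._exit(0)
        except BaseException:
            os._exit(1)

    # Parent: "resume execution as normal", but keep an eye on the child.
    os.close(write_fd)
    deadline = time.monotonic() + timeout_s
    status = None
    while time.monotonic() < deadline:
        done_pid, wait_status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:
            status = wait_status
            break
        time.sleep(0.01)
    if status is None:
        # Timeout: the shutdown hung, so discard the child.
        os.kill(pid, signal.SIGKILL)
        _, status = os.waitpid(pid, 0)

    with os.fdopen(read_fd, "rb") as pipe:
        report = pipe.read().decode()
    clean = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    return clean, report


def fake_guest_shutdown(ram):
    # Stand-in for an orderly guest shutdown: flush, then power off.
    ram["dirty_pages"] = 0
    ram["power"] = "off"
    return f"power={ram['power']} dirty={ram['dirty_pages']}"


def hung_guest_shutdown(ram):
    # Stand-in for a guest stuck on, say, an unreachable NFS mount.
    time.sleep(60)
    return "never reached"


ram = {"power": "on", "dirty_pages": 42}
clean, report = run_child_shutdown(ram, fake_guest_shutdown)
hung_clean, hung_report = run_child_shutdown(ram, hung_guest_shutdown, timeout_s=0.2)
```

After both calls the parent's ram dict is unchanged, which is the property the proposal relies on for guest memory. The second call shows why the timeout matters: a child that never finishes shutting down is killed and reported as not clean, so no snapshot from it should be trusted.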