From: Stefan Hajnoczi
Date: Wed, 4 Jan 2012 22:40:22 +0000
Subject: Re: [Qemu-devel] [patch 3/4] block stream: add support for partial streaming
To: Eric Blake
Cc: kwolf@redhat.com, Paolo Bonzini, Marcelo Tosatti,
    stefanha@linux.vnet.ibm.com, qemu-devel@nongnu.org
In-Reply-To: <4F049462.50406@redhat.com>
References: <20120104140854.631720304@redhat.com>
    <20120104140945.618799948@redhat.com> <4F0477FE.3000801@redhat.com>
    <20120104174724.GA21596@amt.cnet> <4F049462.50406@redhat.com>

On Wed, Jan 4, 2012 at 6:03 PM, Eric Blake wrote:
> On 01/04/2012 10:47 AM, Marcelo Tosatti wrote:
>>>> +/*
>>>> + * Given an image chain: [BASE] -> [INTER1] -> [INTER2] -> [TOP]
>>>> + *
>>>
>>> How hard would it be to go one step further, and provide a monitor
>>> command where qemu could dump the state of BASE, INTER1, or INTER2
>>> without removing it from the image chain?
>>> Libvirt would really like to
>>> be able to have a command where the user can request to inspect to see
>>> the contents of (a portion of) the disk at the time the snapshot was
>>> created, all while qemu continues to run and the TOP file continues to
>>> be adding deltas to that portion of the disk.
>>
>> What exactly do you mean "dump the state of"? You want access to
>> the contents of INTER2, INTER1, BASE, via libguestfs?
>
> I want access via the qemu monitor (which can then be used by libvirt,
> libguestfs, and others, to do whatever further management operations on
> that snapshot as desired).
>
>>
>>> For that matter, I'm still missing out on the ability to extract the
>>> contents of a qcow2 internal snapshot from an image that is in use by
>>> qemu - we have the ability to delete internal snapshots but not to probe
>>> their contents.
>>
>> Same question (although i am not familiar with internal snapshots).
>
> With external snapshots, I know that once the external snapshot TOP is
> created, then qemu is treating INTER2 as read-only; therefore, I can
> then use qemu-img in parallel on INTER2 to probe the contents of the
> snapshot; therefore, in libvirt, it would be possible for me to create a
> raw image corresponding to the qcow2 contents of INTER2, or to create a
> cloned qcow2 image corresponding to the raw contents of BASE, all while
> TOP continues to be modified.
>
> But with internal snapshots, both the snapshot and the current disk
> state reside in the same qcow2 file, which is under current use by qemu,
> and therefore, qemu-img cannot be safely used on that file.  The only
> way I know of to extract the contents of that internal snapshot is via
> qemu itself, but qemu does not currently expose that.
> I envision
> something similar to the memsave and pmemsave monitor commands, which
> copy a (portion) of the guest's memory into a file (although copying
> into an already-open fd passed via SCM_RIGHTS would be nicer than
> requiring a file name, as is the current case with memsave).
>
> And once we get qemu to expose the contents of an internal snapshot,
> that same monitor command seems like it would be useful for exposing the
> contents of an external snapshot such as INTER2 or BASE, rather than
> having to use qemu-img in parallel on the external file.

The qcow2 implementation never accesses snapshots directly.  Instead
there's the concept of the current L1 table, which means there is a
single global state of the disk.  Snapshots are immutable and are never
accessed directly, only copied into the current L1 table.  The single
global state makes it a little tricky to access a snapshot while the VM
is running.

That said, the file format itself doesn't prevent an implementation
from supporting read-only access to snapshots.  In theory we can extend
the qcow2 implementation to support this behavior.

What you want sounds almost like an NBD server that can be
launched/stopped while qemu is already running a VM.  This could be a
QEMU monitor command like:

  nbd-start tcp::1234 virtio-disk0 --snapshot 20120104

It would be possible to stop the server using the same tuple.  Note
that the server needs to provide read-only access; allowing writes
probably has little use and people will hose their data.

Paolo: I haven't looked at the new and improved NBD server yet.  Does
this sound doable?

Kevin: I think we need something like qcow2_snapshot_load_tmp(), but
one that returns a full new BlockDriverState.  The hard part is that
duping a read-only snapshot qcow2 state leads to sharing and lifecycle
problems: if we want to close the original BlockDriverState, will the
read-only snapshot state prevent this?

Stefan
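
P.S. To make the sharing and lifecycle question concrete, here is a toy
Python model. It is not QEMU code and does not use any QEMU API: the
names ToyImage, SnapshotView and open_snapshot_view are hypothetical,
and the reference-counting policy (closing the original is deferred
until the last read-only view goes away) is only one possible answer,
assumed here for illustration.

```python
class ToyImage:
    """Toy stand-in for an open qcow2 image (illustration only, not QEMU)."""

    def __init__(self, clusters):
        # The "current L1 table": the single mutable view of the disk.
        self.active_l1 = list(clusters)
        # Snapshot name -> immutable copy of the table at snapshot time.
        self.snapshots = {}
        # One reference is held by the running VM itself.
        self.refcount = 1
        self.closed = False

    def create_snapshot(self, name):
        # Snapshots are immutable, so store a tuple copy.
        self.snapshots[name] = tuple(self.active_l1)

    def write(self, index, data):
        # Guest writes only ever touch the current table.
        self.active_l1[index] = data

    def open_snapshot_view(self, name):
        # Hypothetical read-only handle; it pins the image while open.
        self.refcount += 1
        return SnapshotView(self, name)

    def close(self):
        # Drop the VM's reference; teardown waits for open views.
        self._unref()

    def _unref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.closed = True


class SnapshotView:
    """Read-only view of one snapshot, sharing state with its image."""

    def __init__(self, image, name):
        self._image = image
        self._table = image.snapshots[name]
        self._open = True

    def read(self, index):
        if not self._open:
            raise ValueError("view is closed")
        return self._table[index]

    def close(self):
        if self._open:
            self._open = False
            self._image._unref()


img = ToyImage(["a", "b"])
img.create_snapshot("20120104")
view = img.open_snapshot_view("20120104")
img.write(0, "A")                      # the VM keeps writing
print(view.read(0), img.active_l1[0])  # prints "a A"
img.close()                            # original closed while view is open
print(img.closed)                      # prints "False": the view pins it
view.close()
print(img.closed)                      # prints "True"
```

In this model the answer to "will the read-only snapshot state prevent
this?" is "yes, until the view is closed". The alternative would be to
give the view its own private copy of whatever state it needs, which
avoids the pinning but costs duplication.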