From: "Denis V. Lunev" <den@openvz.org>
Date: Tue, 12 Jan 2016 19:35:42 +0300
Message-ID: <56952B5E.9080002@openvz.org>
In-Reply-To: <5695201C.2050504@openvz.org>
References: <1452578622-4492-1-git-send-email-den@openvz.org> <20160112141607.GD4841@noname.redhat.com> <569514E7.8090101@redhat.com> <20160112152051.GG4841@noname.redhat.com> <5695201C.2050504@openvz.org>
Subject: Re: [Qemu-devel] [PATCH 1/1] blk: do not select PFLASH device for internal snapshot
To: Kevin Wolf, Paolo Bonzini
Cc: Laszlo Ersek, qemu-devel@nongnu.org, qemu-block@nongnu.org

On 01/12/2016 06:47 PM, Denis V. Lunev wrote:
> On 01/12/2016 06:20 PM, Kevin Wolf wrote:
>> On 12.01.2016 at 15:59, Paolo Bonzini wrote:
>>>
>>> On 12/01/2016 15:16, Kevin Wolf wrote:
>>>>> Thus we should avoid selection of "pflash" drives for VM state
>>>>> saving.
>>>>>
>>>>> For now "pflash" is a read-write raw image, as configured by
>>>>> libvirt. Thus there are no such images in the field and we can
>>>>> safely disable the ability to save state to those images inside
>>>>> QEMU.
>>>> This is obviously broken. If you write to the pflash, then it needs to
>>>> be snapshotted in order to keep a consistent state.
>>>>
>>>> If you want to avoid snapshotting the image, make it read-only and it
>>>> will be skipped even today.
>>> Sort of. The point of having flash is to _not_ make it read-only, so
>>> that is not a solution.
>>>
>>> Flash is already being snapshotted as part of saving RAM state. In
>>> fact, for this reason the device (at least the one used with OVMF; I
>>> haven't checked other pflash devices) can simply save it back to disk
>>> on the migration destination, without the need to use "migrate -b" or
>>> shared storage.
>>> [...]
>>> I don't like very much using IF_PFLASH this way, which is why I hadn't
>>> replied to the patch so far---I hadn't made up my mind about *what* to
>>> suggest instead, or whether to just accept it. However, it does work.
>>>
>>> Perhaps a separate "I know what I am doing" skip-snapshot option? Or
>>> a device callback saying "not snapshotting this is fine"?
>> Boy, is this ugly...
>>
>> What do you do with disk-only snapshots? The recovery only works as long
>> as you have VM state.
>>
>> Kevin
> actually I am in a bit of trouble :(
>
> I understand that this is ugly, but I would like to make
> 'virsh snapshot' work for OVMF VMs. This is necessary for us to
> make a release.
>
> Currently the libvirt guys generate XML in the following way:
>
>   <os>
>     <type>hvm</type>
>     <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE_new.fd</loader>
>     <nvram>/var/lib/libvirt/qemu/nvram/f20efi_VARS.fd</nvram>
>   </os>
>
> This results in:
>
>   qemu -drive file=/usr/share/OVMF/OVMF_CODE_new.fd,if=pflash,format=raw,unit=0,readonly=on \
>        -drive file=/var/lib/libvirt/qemu/nvram/f20efi_VARS.fd,if=pflash,format=raw,unit=1
>
> This obviously cannot pass the check in bdrv_all_can_snapshot(),
> as 'pflash' is RW and raw, i.e. it cannot be snapshotted.
>
> They have discussed switching to the following command line:
>
>   qemu -drive file=/usr/share/OVMF/OVMF_CODE_new.fd,if=pflash,format=raw,unit=0,readonly=on \
>        -drive file=/var/lib/libvirt/qemu/nvram/f20efi_VARS.fd.qcow2,if=pflash,format=qcow2,unit=1
>
> and say that in this case the VM state could fall into the pflash
> drive, which is not supposed to be big, as the location of the file
> is different. This means that I am doomed here.
>
> Either we should force the libvirt people to drop their opinion
> that pflash should stay small (which I am unable to do), or I
> should invent a way to ban VM state saving into pflash.
>
> OK. There are 2 options:
>
> 1) Ban pflash, as it was done.
> 2) Add a 'novmstate' flag to -drive (invented just now).

For option 2, something like this:

diff --git a/block.c b/block.c
index 3e1877d..8900589 100644
--- a/block.c
+++ b/block.c
@@ -881,6 +881,11 @@ static QemuOptsList bdrv_runtime_opts = {
             .help = "Block driver to use for the node",
         },
         {
+            .name = "novmstate",
+            .type = QEMU_OPT_BOOL,
+            .help = "Skip this drive when selecting one to save VM state",
+        },
+        {
             .name = BDRV_OPT_CACHE_WB,
             .type = QEMU_OPT_BOOL,
             .help = "Enable writeback mode",
@@ -957,6 +962,7 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
     bs->request_alignment = 512;
     bs->zero_beyond_eof = true;
     bs->read_only = !(bs->open_flags & BDRV_O_RDWR);
+    bs->disable_vmstate_save = qemu_opt_get_bool(opts, "novmstate", false);
 
     if (use_bdrv_whitelist && !bdrv_is_whitelisted(drv, bs->read_only)) {
         error_setg(errp,
diff --git a/block/snapshot.c b/block/snapshot.c
index 2d86b88..33cdd86 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -483,6 +483,10 @@ BlockDriverState *bdrv_all_find_vmstate_bs(void)
     while (not_found && (bs = bdrv_next(bs))) {
         AioContext *ctx = bdrv_get_aio_context(bs);
 
+        if (bs->disable_vmstate_save) {
+            continue;
+        }
+
         aio_context_acquire(ctx);
         not_found = !bdrv_can_snapshot(bs);
         aio_context_release(ctx);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 256609d..855a209 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -438,6 +438,9 @@ struct BlockDriverState {
     /* do we need to tell the quest if we have a volatile write cache? */
     int enable_write_cache;
 
+    /* skip this BDS when searching for one to save VM state */
+    bool disable_vmstate_save;
+
     /* the following member gives a name to every node on the bs graph. */
     char node_name[32];
     /* element of the list of named nodes building the graph */
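
With such a patch, libvirt could then mark the varstore drive
explicitly. An untested sketch of what that could look like (the
'novmstate' name is the one from the diff above; to accept this exact
spelling, the option would presumably also have to be plumbed through
the -drive parser, not just bdrv_runtime_opts):

  qemu -drive file=/usr/share/OVMF/OVMF_CODE_new.fd,if=pflash,format=raw,unit=0,readonly=on \
       -drive file=/var/lib/libvirt/qemu/nvram/f20efi_VARS.fd.qcow2,if=pflash,format=qcow2,unit=1,novmstate=on

bdrv_all_find_vmstate_bs() would then skip the varstore and pick the
first snapshottable disk for the VM state instead.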