From: "Denis V. Lunev" <den@openvz.org>
Date: Tue, 12 Jan 2016 19:35:42 +0300
Message-ID: <56952B5E.9080002@openvz.org>
In-Reply-To: <5695201C.2050504@openvz.org>
References: <1452578622-4492-1-git-send-email-den@openvz.org> <20160112141607.GD4841@noname.redhat.com> <569514E7.8090101@redhat.com> <20160112152051.GG4841@noname.redhat.com> <5695201C.2050504@openvz.org>
Subject: Re: [Qemu-devel] [PATCH 1/1] blk: do not select PFLASH device for internal snapshot
To: Kevin Wolf, Paolo Bonzini
Cc: Laszlo Ersek, qemu-devel@nongnu.org, qemu-block@nongnu.org

On 01/12/2016 06:47 PM, Denis V. Lunev wrote:
> On 01/12/2016 06:20 PM, Kevin Wolf wrote:
>> On 12.01.2016 at 15:59, Paolo Bonzini wrote:
>>>
>>> On 12/01/2016 15:16, Kevin Wolf wrote:
>>>>> Thus we should avoid selection of "pflash" drives for VM state
>>>>> saving.
>>>>>
>>>>> For now "pflash" is a read-write raw image, as configured by
>>>>> libvirt. Thus there are no such images in the field and we can
>>>>> safely disable the ability to save state to those images inside
>>>>> QEMU.
>>>> This is obviously broken. If you write to the pflash, then it needs to
>>>> be snapshotted in order to keep a consistent state.
>>>>
>>>> If you want to avoid snapshotting the image, make it read-only and it
>>>> will be skipped even today.
>>> Sort of. The point of having flash is to _not_ make it read-only, so
>>> that is not a solution.
>>>
>>> Flash is already being snapshotted as part of saving RAM state. In
>>> fact, for this reason the device (at least the one used with OVMF; I
>>> haven't checked other pflash devices) can simply save it back to disk
>>> on the migration destination, without the need to use "migrate -b" or
>>> shared storage.
>>> [...]
>>> I don't like very much using IF_PFLASH this way, which is why I hadn't
>>> replied to the patch so far---I hadn't made up my mind about *what* to
>>> suggest instead, or whether to just accept it. However, it does work.
>>>
>>> Perhaps a separate "I know what I am doing" skip-snapshot option? Or
>>> a device callback saying "not snapshotting this is fine"?
>> Boy, is this ugly...
>>
>> What do you do with disk-only snapshots? The recovery only works as long
>> as you have VM state.
>>
>> Kevin
> actually I am in a bit of trouble :(
>
> I understand that this is ugly, but I would like to make
> 'virsh snapshot' work for OVMF VMs. This is necessary for us to
> make a release.
>
> Currently the libvirt guys generate XML in the following way:
>
>   <os>
>     <type>hvm</type>
>     <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE_new.fd</loader>
>     <nvram>/var/lib/libvirt/qemu/nvram/f20efi_VARS.fd</nvram>
>   </os>
>
> This results in:
>
>   qemu -drive file=/usr/share/OVMF/OVMF_CODE_new.fd,if=pflash,format=raw,unit=0,readonly=on \
>        -drive file=/var/lib/libvirt/qemu/nvram/f20efi_VARS.fd,if=pflash,format=raw,unit=1
>
> This obviously cannot pass the check in bdrv_all_can_snapshot(),
> as 'pflash' is RW and raw, i.e. it cannot be snapshotted.
>
> They have discussed switching to the following command line:
>
>   qemu -drive file=/usr/share/OVMF/OVMF_CODE_new.fd,if=pflash,format=raw,unit=0,readonly=on \
>        -drive file=/var/lib/libvirt/qemu/nvram/f20efi_VARS.fd.qcow2,if=pflash,format=qcow2,unit=1
>
> and say that in this case the VM state could fall into the pflash
> drive, which is not supposed to be big, as the location of the file
> is different. This means that I am doomed here.
>
> Either we should force the libvirt people to drop their opinion
> that pflash should stay small (which I am unable to do), or I
> should invent a way to ban VM state saving into pflash.
>
> OK. There are 2 options:
>
> 1) Ban pflash, as it was done.
> 2) Add a 'novmstate' flag to -drive (invented just now).

For option 2, something like this:

diff --git a/block.c b/block.c
index 3e1877d..8900589 100644
--- a/block.c
+++ b/block.c
@@ -881,6 +881,11 @@ static QemuOptsList bdrv_runtime_opts = {
             .help = "Block driver to use for the node",
         },
         {
+            .name = "novmstate",
+            .type = QEMU_OPT_BOOL,
+            .help = "Skip this drive when selecting one to save VM state",
+        },
+        {
             .name = BDRV_OPT_CACHE_WB,
             .type = QEMU_OPT_BOOL,
             .help = "Enable writeback mode",
@@ -957,6 +962,7 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
     bs->request_alignment = 512;
     bs->zero_beyond_eof = true;
     bs->read_only = !(bs->open_flags & BDRV_O_RDWR);
+    bs->disable_vmstate_save = qemu_opt_get_bool(opts, "novmstate", false);
 
     if (use_bdrv_whitelist && !bdrv_is_whitelisted(drv, bs->read_only)) {
         error_setg(errp,
diff --git a/block/snapshot.c b/block/snapshot.c
index 2d86b88..33cdd86 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -483,6 +483,10 @@ BlockDriverState *bdrv_all_find_vmstate_bs(void)
     while (not_found && (bs = bdrv_next(bs))) {
         AioContext *ctx = bdrv_get_aio_context(bs);
 
+        if (bs->disable_vmstate_save) {
+            continue;
+        }
+
         aio_context_acquire(ctx);
         not_found = !bdrv_can_snapshot(bs);
         aio_context_release(ctx);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 256609d..855a209 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -438,6 +438,9 @@ struct BlockDriverState {
     /* do we need to tell the quest if we have a volatile write cache? */
     int enable_write_cache;
 
+    /* skip this BDS when searching for one to save VM state */
+    bool disable_vmstate_save;
+
     /* the following member gives a name to every node on the bs graph. */
     char node_name[32];
     /* element of the list of named nodes building the graph */
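
With such a patch, libvirt could then mark the varstore drive
explicitly. An untested sketch of what that could look like (the
'novmstate' name is the one from the diff above; to accept this exact
spelling, the option would presumably also have to be plumbed through
the -drive parser, not just bdrv_runtime_opts):

  qemu -drive file=/usr/share/OVMF/OVMF_CODE_new.fd,if=pflash,format=raw,unit=0,readonly=on \
       -drive file=/var/lib/libvirt/qemu/nvram/f20efi_VARS.fd.qcow2,if=pflash,format=qcow2,unit=1,novmstate=on

bdrv_all_find_vmstate_bs() would then skip the varstore and pick the
first snapshottable disk for the VM state instead.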