From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53882)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <den@openvz.org>) id 1aJ0dm-00030o-BV
	for qemu-devel@nongnu.org; Tue, 12 Jan 2016 10:13:47 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <den@openvz.org>) id 1aJ0di-00045c-5g
	for qemu-devel@nongnu.org; Tue, 12 Jan 2016 10:13:46 -0500
References: <1452578622-4492-1-git-send-email-den@openvz.org>
	<20160112141607.GD4841@noname.redhat.com> <569514E7.8090101@redhat.com>
From: "Denis V. Lunev" <den@openvz.org>
Message-ID: <56951813.5000402@openvz.org>
Date: Tue, 12 Jan 2016 18:13:23 +0300
MIME-Version: 1.0
In-Reply-To: <569514E7.8090101@redhat.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/1] blk: do not select PFLASH device for
 internal snapshot
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Paolo Bonzini <pbonzini@redhat.com>, Kevin Wolf <kwolf@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>, qemu-devel@nongnu.org, qemu-block@nongnu.org

On 01/12/2016 05:59 PM, Paolo Bonzini wrote:
>
> On 12/01/2016 15:16, Kevin Wolf wrote:
>>> Thus we should avoid selection of "pflash" drives for VM state saving.
>>>
>>> For now "pflash" is read-write raw image as it configured by libvirt.
>>> Thus there are no such images in the field and we could safely disable
>>> ability to save state to those images inside QEMU.
>> This is obviously broken. If you write to the pflash, then it needs to
>> be snapshotted in order to keep a consistent state.
>>
>> If you want to avoid snapshotting the image, make it read-only and it
>> will be skipped even today.
> Sort of.  The point of having flash is to _not_ make it read-only, so
> that is not a solution.
>
> Flash is already being snapshotted as part of saving RAM state.  In
> fact, for this reason the device (at least the one used with OVMF; I
> haven't checked other pflash devices) can simply save it back to disk
> on the migration destination, without the need to use "migrate -b" or
> shared storage.
>
> See commit 4c0cfc72b31a79f737a64ebbe0411e4b83e25771:
>
>      Author: Laszlo Ersek <lersek@redhat.com>
>      Date:   Sat Aug 23 12:19:07 2014 +0200
>
>      pflash_cfi01: write flash contents to bdrv on incoming migration
>      
>      A drive that backs a pflash device is special:
>      - it is very small,
>      - its entire contents are kept in a RAMBlock at all times, covering the
>        guest-phys address range that provides the guest's view of the emulated
>        flash chip.
>      
>      The pflash device model keeps the drive (the host-side file) and the
>      guest-visible flash contents in sync. When migrating the guest, the
>      guest-visible flash contents (the RAMBlock) is migrated by default, but on
>      the target host, the drive (the host-side file) remains in full sync with
>      the RAMBlock only if:
>      - the source and target hosts share the storage underlying the pflash
>        drive,
>      - or the migration requests full or incremental block migration too, which
>        then covers all drives.
>      
>      Due to the special nature of pflash drives, the following scenario makes
>      sense as well:
>      - no full nor incremental block migration, covering all drives, alongside
>        the base migration (justified eg. by shared storage for "normal" (big)
>        drives),
>      - non-shared storage for pflash drives.
>      
>      In this case, currently only those portions of the flash drive are updated
>      on the target disk that the guest reprograms while running on the target
>      host.
>      
>      In order to restore accord, dump the entire flash contents to the bdrv in
>      a post_load() callback.
>      
>      - The read-only check follows the other call-sites of pflash_update();
>      - both "pfl->ro" and pflash_update() reflect / consider the case when
>        "pfl->bs" is NULL;
>      - the total size of the flash device is calculated as in
>        pflash_cfi01_realize().
>      
>      When using shared storage, or requesting full or incremental block
>      migration along with the normal migration, the patch should incur a
>      harmless rewrite from the target side.
>      
>      It is assumed that, on the target host, RAM is loaded ahead of the call to
>      pflash_post_load().
>
> I don't like very much using IF_PFLASH this way, which is why I hadn't
> replied to the patch so far---I hadn't made up my mind about *what* to
> suggest instead, or whether to just accept it.  However, it does work.
>
> Perhaps a separate "I know what I am doing" skip-snapshot option?  Or
> a device callback saying "not snapshotting this is fine"?
>
> Paolo
Paolo,

it looks like I wrote a bad description :(

The idea of this patch was trivial. First of all, I would like to keep
this image internally snapshotted. That is why the ultimate goal
was to switch from raw to qcow2, so that the changes are kept inside
the image.

In that case, though, this drive could be selected to hold the VM
state, which can be big. The function being changed is the one that
selects the image for VM state saving.

Here I would like to prevent IF_PFLASH drives from being selected, to
keep the pflash image small as required by the libvirt guys.
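
To illustrate the rule I mean, here is a minimal standalone sketch. It is
not the QEMU code: struct Drive, the *_MODEL enum values and
find_vmstate_drive() are hypothetical names made up for this example, and
the real function may check more conditions. The pflash drive can still
be snapshotted; it is only skipped when choosing where the VM state goes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a drive's interface type (not QEMU's enum). */
enum IfaceModel { IF_IDE_MODEL, IF_VIRTIO_MODEL, IF_PFLASH_MODEL };

/* Hypothetical, simplified description of a drive. */
struct Drive {
    const char *name;
    enum IfaceModel iface;
    bool read_only;
    bool can_snapshot;   /* e.g. a qcow2 image rather than a raw one */
};

/*
 * Return the first drive suitable for holding the VM state, or NULL.
 * Read-only drives and drives without snapshot support are skipped,
 * and so are pflash drives, so that the (possibly large) VM state
 * never ends up in the small pflash image.
 */
const struct Drive *find_vmstate_drive(const struct Drive *drives, size_t n)
{
    assert(drives != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct Drive *d = &drives[i];

        if (d->read_only || !d->can_snapshot) {
            continue;   /* cannot store VM state here at all */
        }
        if (d->iface == IF_PFLASH_MODEL) {
            continue;   /* snapshottable, but must stay small */
        }
        return d;
    }
    return NULL;
}
```

With a writable qcow2 pflash drive listed first and a normal disk after
it, the sketch picks the normal disk; with only the pflash drive present
it returns NULL.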

Den