From: "Denis V. Lunev"
To: Kevin Wolf
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, mreitz@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 00/10] qcow2: Implement image locking
Date: Mon, 11 Jan 2016 19:38:31 +0300
Message-ID: <5693DA87.6060403@parallels.com>
In-Reply-To: <20160111163351.GI9454@noname.redhat.com>
References: <1450802786-20893-1-git-send-email-kwolf@redhat.com> <567B860F.5080902@parallels.com> <20160111163351.GI9454@noname.redhat.com>

On 01/11/2016 07:33 PM, Kevin Wolf wrote:
> On 24.12.2015 at 06:43, Denis V. Lunev wrote:
>> On 12/22/2015 07:46 PM, Kevin Wolf wrote:
>>> Enough innocent images have died because users called 'qemu-img snapshot'
>>> while the VM was still running. Educating the users doesn't seem to be a
>>> working strategy, so this series adds locking to qcow2 that refuses to
>>> access the image read-write from two processes.
>>>
>>> Eric, this will require a libvirt update to deal with qemu crashes which
>>> leave locked images behind. The simplest thinkable way would be to
>>> unconditionally override the lock in libvirt whenever the option is
>>> present.
>>> In that case, libvirt VMs would be protected against concurrent
>>> non-libvirt accesses, but not the other way round. If you want more than
>>> that, libvirt would have to check somehow whether it was its own VM that
>>> used the image and left the lock behind. I imagine that can't be too
>>> hard either.
>>>
>>> Also note that this kind of depends on Max's bdrv_close_all() series,
>>> but only in order to pass test case 142. This is not a bug in this
>>> series, but a preexisting one (bs->file can be closed before bs), and it
>>> becomes apparent when qemu fails to unlock an image due to this bug.
>>> Max's series fixes this.
>>>
>> This approach has a hole with qcow2_invalidate_cache(). The lock is
>> released (and by design cannot be kept) between the qcow2_close() /
>> qcow2_open() sequence, if I understand this correctly.
> qcow2_invalidate_cache() is only called with BDRV_O_INCOMING set, i.e.
> the instance isn't the current user and doesn't release the lock. It
> requires, however, that a previous user has released the lock; otherwise
> the qcow2_open() would fail.
>
> But even if qcow2_invalidate_cache() were called for an image that is a
> user, the behaviour wouldn't be wrong. In the period while the flag is
> cleared, there is no write access to the image file. If someone opened the
> image from another process at this point, they would get the image, but
> again qcow2_open() would fail and we wouldn't corrupt the image.
>
> And finally, the goal of this series is to protect the users against
> stupid mistakes, not against malicious acts. Even if there were some
> windows in which the image wasn't protected (though I don't think they
> exist), having the common case safe would already be a huge improvement
> over the current state.
>
> Kevin

But how will we recover after a node power loss (which will happen sooner
or later)? In that case we are doomed to make a blind guess about whether
we are allowed to clear the flag or not.

Den
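For readers following the thread, the lock semantics being debated can be modeled roughly like this. This is a minimal Python sketch of a check-and-set header flag, not the actual qcow2/QEMU implementation; the names (`Image`, `image_open_rw`, `force`) are illustrative only. It shows both the normal refusal of a second read-write opener and the stale-flag situation after a crash or power loss that Den raises:

```python
class ImageLockedError(Exception):
    pass


class Image:
    """Toy stand-in for a qcow2 image: one header flag marks it in use."""
    def __init__(self):
        self.locked = False  # would be persisted in the image header on disk


def image_open_rw(img, force=False):
    # Refuse a second read-write opener unless the caller explicitly
    # overrides the lock (what libvirt might do after a known qemu crash).
    if img.locked and not force:
        raise ImageLockedError("image is locked by another process")
    img.locked = True


def image_close(img):
    img.locked = False  # a crash or power loss skips this step


# Normal case: a second read-write open is refused.
img = Image()
image_open_rw(img)
try:
    image_open_rw(img)
    refused = False
except ImageLockedError:
    refused = True

# Crash case: the flag stays set on disk; after reboot nobody can tell
# whether a live user still holds it -- this is the power-loss concern.
crashed = Image()
image_open_rw(crashed)               # ...then the node loses power
stale = crashed.locked               # True, but no process owns the lock
image_open_rw(crashed, force=True)   # a blind override is the only recovery
```

The sketch makes the trade-off concrete: the flag protects against the common mistake (two live writers), but after an unclean shutdown the override has to be a policy decision by the management layer, since the flag alone cannot distinguish a stale lock from a live one.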