qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, famz@redhat.com,
	stefanha@redhat.com, mreitz@redhat.com, jcody@redhat.com,
	eblake@redhat.com, jsnow@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH 2/2] block/backup: fix fleecing scheme: use serialized writes
Date: Wed, 4 Jul 2018 16:20:48 +0300	[thread overview]
Message-ID: <15ce8930-b00e-3d1e-ebd7-fc6bcc68bfce@virtuozzo.com> (raw)
In-Reply-To: <20180704103918.GC4334@localhost.localdomain>

04.07.2018 13:39, Kevin Wolf wrote:
> Am 03.07.2018 um 20:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Fleecing scheme works as follows: we want a kind of temporary snapshot
>> of active drive A. We create temporary image B, with B->backing = A.
>> Then we start backup(sync=none) from A to B. From this point, B reads
>> as point-in-time snapshot of A (A continues to be active drive,
>> accepting guest IO).
>>
>> This scheme needs some additional synchronization between reads from B
>> and backup COW operations, otherwise, the following situation is
>> theoretically possible:
>>
>> (assume B is qcow2, client is NBD client, reading from B)
>>
>> 1. client starts reading and take qcow2 mutex in qcow2_co_preadv, and
>>     goes up to l2 table loading (assume cache miss)
>>
>> 2) guest write => backup COW => qcow2 write =>
>>     try to take qcow2 mutex => waiting
>>
>> 3. l2 table loaded, we see that cluster is UNALLOCATED, go to
>>     "case QCOW2_CLUSTER_UNALLOCATED" and unlock mutex before
>>     bdrv_co_preadv(bs->backing, ...)
>>
>> 4) aha, mutex unlocked, backup COW continues, and we finally finish
>>     guest write and change cluster in our active disk A
>>
>> 5. actually, do bdrv_co_preadv(bs->backing, ...) and read
>>     _new updated_ data.
>>
>> To avoid this, let's make all COW writes serializing, to not intersect
>> with reads from B.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> I think this should work, even though we can still improve it a bit.
>
> First, I don't think we need to disable copy offloading as long as we
> add BDRV_REQ_SERIALISING support to bdrv_co_copy_range_internal(). The
> thing that's a bit strange there is that there is only a single flags
> parameter for both source and target. We may need to split that so
> that we can pass BDRV_REQ_NO_SERIALISING for the source and
> BDRV_REQ_SERIALISING for the target.

here is interesting point: why copy_range skips wait_for_serializing for 
write if k? I think it should influence only on read, isn't it? is it a bug?

another question, I don't see, where is request alignment handled in 
copy_range path, like in bdrv_co_pwritev/readv ?

>
>> +    /* Detect image fleecing */
>> +    fleecing = sync_mode == MIRROR_SYNC_MODE_NONE && target->backing->bs == bs;
> The other thing is that I'm not convinced of the condition here. This is
> the narrowest thinkable condition to recognise the exact fleecing setup
> we have in mind right now. However, it is not the condition to flag all
> setups that are affected by the same problem.
>
> I don't think sync_mode actually makes a meaningful difference here. If
> someone wants to create a full backup and already access it while the
> job is still in progress, they will still want it to be consistent.
>
> It also doesn't make a difference whether the fleecing overlay has the
> source directly attached as a backing node or whether there is a filter
> node between them.
>
> So while I'm not sure whether it really covers all interesting cases,
> this condition would already be a lot better:
>
>      fleecing = bdrv_chain_contains(target, bs);
>
> Maybe rename the variable to serialise_target_writes because it's no
> longer restricted to the exact fleecing case.
>
>> +    if (fleecing) {
>> +        if (compress) {
>> +            error_setg(errp, "Image fleecing doesn't support compressed mode.");
>> +            return NULL;
>> +        }
>> +    }
> Why not fleecing && compress instead of a nested if? And why is
> fleecing even at odds with compression?
>
> Kevin


-- 
Best regards,
Vladimir

      parent reply	other threads:[~2018-07-04 13:21 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-03 18:07 [Qemu-devel] [PATCH 0/2] fix image fleecing Vladimir Sementsov-Ogievskiy
2018-07-03 18:07 ` [Qemu-devel] [PATCH 1/2] block: add BDRV_REQ_SERIALISING flag Vladimir Sementsov-Ogievskiy
2018-07-04 14:44   ` Vladimir Sementsov-Ogievskiy
2018-07-04 15:08     ` Kevin Wolf
2018-07-04 16:11       ` Vladimir Sementsov-Ogievskiy
2018-07-04 16:20         ` Kevin Wolf
2018-07-04 16:36           ` Vladimir Sementsov-Ogievskiy
2018-07-04 17:06             ` Vladimir Sementsov-Ogievskiy
2018-07-05  8:34               ` Kevin Wolf
2018-07-03 18:07 ` [Qemu-devel] [PATCH 2/2] block/backup: fix fleecing scheme: use serialized writes Vladimir Sementsov-Ogievskiy
2018-07-04 10:39   ` Kevin Wolf
2018-07-04 12:31     ` Vladimir Sementsov-Ogievskiy
2018-07-04 13:20     ` Vladimir Sementsov-Ogievskiy [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=15ce8930-b00e-3d1e-ebd7-fc6bcc68bfce@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=den@openvz.org \
    --cc=eblake@redhat.com \
    --cc=famz@redhat.com \
    --cc=jcody@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).