Re: [Qemu-devel] [PATCH v5 11/11] block/backup: use backup-top instead of write notifiers

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "fam@euphon.net" <fam@euphon.net>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	"jcody@redhat.com" <jcody@redhat.com>,
	"kwolf@redhat.com" <kwolf@redhat.com>,
	Denis Lunev <den@virtuozzo.com>,
	"eblake@redhat.com" <eblake@redhat.com>,
	"jsnow@redhat.com" <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v5 11/11] block/backup: use backup-top instead of write notifiers
Date: Mon, 28 Jan 2019 17:53:58 +0100	[thread overview]
Message-ID: <0298f3ec-2267-c624-91bb-6136df3b786a@redhat.com> (raw)
In-Reply-To: <cd1c98bd-1f40-5aa9-01ec-5109ee36e1e8@virtuozzo.com>

[-- Attachment #1: Type: text/plain, Size: 5963 bytes --]

On 28.01.19 17:44, Vladimir Sementsov-Ogievskiy wrote:
> 28.01.2019 18:59, Max Reitz wrote:
>> On 28.01.19 12:29, Vladimir Sementsov-Ogievskiy wrote:
>>> 18.01.2019 17:56, Max Reitz wrote:
>>>> On 29.12.18 13:20, Vladimir Sementsov-Ogievskiy wrote:

[...]

>>>>> @@ -505,8 +474,20 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
>>>>>                if (alloced < 0) {
>>>>>                    ret = alloced;
>>>>>                } else {
>>>>> +                if (!hbitmap_get(s->copy_bitmap, offset)) {
>>>>> +                    trace_backup_do_cow_skip(job, offset);
>>>>> +                    continue; /* already copied */
>>>>> +                }
>>>>> +                if (!lock) {
>>>>> +                    lock = bdrv_co_try_lock(s->source, offset, s->cluster_size);
>>>>> +                    /*
>>>>> +                     * Dirty bit is set, which means that there are no in-flight
>>>>> +                     * write requests on this area. We must succeed.
>>>>> +                     */
>>>>> +                    assert(lock);
>>>>
>>>> What if I have a different parent node for the source that issues
>>>> concurrent writes?  This used to work fine because the before_write
>>>> notifier would still work.  After this patch, that would be broken
>>>> because those writes would not cause a CbW.
>>>
>>> But haw could you have this different parent node? After appending filter,
>>> there should not be such nodes.
>>
>> Unless you append them afterwards:
>>
>>> And I think, during backup it should be
>>> forbidden to append new parents to source, ignoring filter, as it definitely
>>> breaks what filter does.
>>
>> Agreed, but then this needs to be implemented.
>>
>>> And it applies to other block-job with their filters.
>>> If we appended a filter, we don't want someone other to write omit our filter.
>>>
>>>>
>>>> That's not so bad because we just have to make sure that all writes go
>>>> through the backup-top node.  That would make this assertion valid
>>>> again, too.  But that means we cannot share PERM_WRITE; see [1].
>>>
>>> But we don't share PERM_WRITE on source in backup_top, only on target.
>>
>> Are you sure?  The job itself shares it, and the filter shares it, too,
>> as far as I can see.  It uses bdrv_filter_default_perms(), and that does
>> seem to share PERM_WRITE.
> 
> And in bdrv_Filter_default_perms it does "*nshared = *nshared | BLK_PERM_WRITE"
> only for child_file, it is target. Source is child_backing.

Hm?  bdrv_filter_default_perms() does this, unconditionally:

>     *nshared = (shared & DEFAULT_PERM_PASSTHROUGH) |
>                (c->shared_perm & DEFAULT_PERM_UNCHANGED);

The backup_top filter does what you describe, but it just leaves
*nshared untouched after bdrv_filter_default_perms() has done the above.

[...]

>>>>> @@ -655,25 +656,31 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>>>>    
>>>>>        copy_bitmap = hbitmap_alloc(len, ctz32(cluster_size));
>>>>>    
>>>>> -    /* job->len is fixed, so we can't allow resize */
>>>>> -    job = block_job_create(job_id, &backup_job_driver, txn, bs,
>>>>> -                           BLK_PERM_CONSISTENT_READ,
>>>>> -                           BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
>>>>> -                           BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
>>>>> -                           speed, creation_flags, cb, opaque, errp);
>>>>> -    if (!job) {
>>>>> +    /*
>>>>> +     * bdrv_get_device_name will not help to find device name starting from
>>>>> +     * @bs after backup-top append,
>>>>
>>>> Why not?  Since backup-top is appended, shouldn't all parents of @bs be
>>>> parents of @backup_top then?  (Making bdrv_get_parent_name() return the
>>>> same result)
>>>
>>> bdrv_get_device_name goes finally through role->get_name, and only root role has
>>> this handler. After append we'll have backing role instead of root.
>>
>> Ah, I see, I asked the wrong question.
>>
>> Why is block_job_create() called on bs and not on backup_top?  mirror
>> calls it on mirror_top_bs.
> 
> Good question. I don't exactly remember, may be there are were more troubles with
> permissions or somthing. So, I've to try it again..
> 
> What is more beneficial?
> 
> My current approach, is that job and filter are two sibling users of source node,
> they do copying, they are synchronized. And in this way, it is better to read from
> source directly, to not create extra intersection between job and filter..
> 
> On the other hand, if we read through the filter, we possible should do the whole
> copy operation through the filter..
> 
> What is the difference between guest read and backup-job read, in filter POV? I think:
> 
> For guest read, filter MUST read (as we must handle guest request), and than, if
> we don't have too much in-flight requests, ram-cache is not full, etc, we can handle
> already read data in terms of backup, so, copy it somewhere. Or we can drop it, if
> we can't handle it at the moment..
> 
> For job read, we even MAY not read, if our queues are full, postponing job request.
> 
> So
> 
> Guest read: MUST read, MAY backup
> Job read: MAY read and backup
> 
> So, reading through filter has a possibility of common code path + native prioritization
> of copy operations. This of course will need more refactoring of backup, and may be done
> as a separate step, but definitely, I have to at least try to create job above the filter.

Well, as far as I see it, right now backup_top's read function is just a
passthrough.  I don't see a functional difference between reading from
backup_top and source, but the fact that you could save yourself the
trouble of figuring out the job ID manually.

As for the RAM cache, I thought it was just a target like any other and
backup_top wouldn't need to care at all...?

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

next prev parent reply	other threads:[~2019-01-28 16:54 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-29 12:20 [Qemu-devel] [PATCH v5 00/11] backup-top filter driver for backup Vladimir Sementsov-Ogievskiy
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 01/11] block/backup: simplify backup_incremental_init_copy_bitmap Vladimir Sementsov-Ogievskiy
2019-01-14 13:10   ` Max Reitz
2019-01-14 13:13     ` Max Reitz
2019-01-14 14:01     ` Vladimir Sementsov-Ogievskiy
2019-01-14 14:13       ` Max Reitz
2019-01-14 14:48         ` Vladimir Sementsov-Ogievskiy
2019-01-16 13:05           ` Max Reitz
2019-01-23  8:20             ` Vladimir Sementsov-Ogievskiy
2019-01-23 13:19               ` Max Reitz
2019-01-23 14:36               ` Eric Blake
2019-01-24 14:20                 ` Vladimir Sementsov-Ogievskiy
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 02/11] block/backup: move to copy_bitmap with granularity Vladimir Sementsov-Ogievskiy
2019-01-14 14:10   ` Max Reitz
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 03/11] block: improve should_update_child Vladimir Sementsov-Ogievskiy
2019-01-14 14:32   ` Max Reitz
2019-01-14 16:13     ` Vladimir Sementsov-Ogievskiy
2019-01-16 13:17       ` Max Reitz
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 04/11] iotests: handle -f argument correctly for qemu_io_silent Vladimir Sementsov-Ogievskiy
2019-01-14 14:36   ` Max Reitz
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 05/11] iotests: allow resume_drive by node name Vladimir Sementsov-Ogievskiy
2019-01-14 14:46   ` Max Reitz
2019-01-14 16:06     ` Vladimir Sementsov-Ogievskiy
2019-01-16 13:11       ` Max Reitz
2019-01-23 13:22         ` Vladimir Sementsov-Ogievskiy
2019-01-23 13:31           ` Vladimir Sementsov-Ogievskiy
2019-01-23 13:33             ` Max Reitz
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 06/11] iotests: prepare 055 to graph changes during backup job Vladimir Sementsov-Ogievskiy
2019-01-16 13:48   ` Max Reitz
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 07/11] block: introduce backup-top filter driver Vladimir Sementsov-Ogievskiy
2019-01-16 16:02   ` Max Reitz
2019-01-17 12:13     ` Vladimir Sementsov-Ogievskiy
2019-01-18 12:05       ` Max Reitz
2019-01-23 13:47         ` Vladimir Sementsov-Ogievskiy
2019-04-13 16:08     ` Vladimir Sementsov-Ogievskiy
2019-04-13 16:08       ` Vladimir Sementsov-Ogievskiy
2019-04-13 17:03       ` Vladimir Sementsov-Ogievskiy
2019-04-13 17:03         ` Vladimir Sementsov-Ogievskiy
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 08/11] block/io: refactor wait_serialising_requests Vladimir Sementsov-Ogievskiy
2019-01-16 16:18   ` Max Reitz
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 09/11] block: add lock/unlock range functions Vladimir Sementsov-Ogievskiy
2019-01-16 16:36   ` Max Reitz
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 10/11] block/backup: tiny refactor backup_job_create Vladimir Sementsov-Ogievskiy
2019-01-18 13:00   ` Max Reitz
2018-12-29 12:20 ` [Qemu-devel] [PATCH v5 11/11] block/backup: use backup-top instead of write notifiers Vladimir Sementsov-Ogievskiy
2019-01-18 14:56   ` Max Reitz
2019-01-28 11:29     ` Vladimir Sementsov-Ogievskiy
2019-01-28 15:59       ` Max Reitz
2019-01-28 16:44         ` Vladimir Sementsov-Ogievskiy
2019-01-28 16:53           ` Max Reitz [this message]
2019-01-28 17:14             ` Vladimir Sementsov-Ogievskiy
2019-01-28 17:40           ` Kevin Wolf
2019-01-28 19:00             ` Vladimir Sementsov-Ogievskiy
2019-01-23 15:26 ` [Qemu-devel] [PATCH v5 00/11] backup-top filter driver for backup no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0298f3ec-2267-c624-91bb-6136df3b786a@redhat.com \
    --to=mreitz@redhat.com \
    --cc=den@virtuozzo.com \
    --cc=eblake@redhat.com \
    --cc=fam@euphon.net \
    --cc=jcody@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).