Re: [Qemu-devel] [PATCH 3/3] block/backup: refactor write_flags

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Max Reitz <mreitz@redhat.com>, John Snow <jsnow@redhat.com>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "kwolf@redhat.com" <kwolf@redhat.com>,
	Denis Lunev <den@virtuozzo.com>,
	"armbru@redhat.com" <armbru@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH 3/3] block/backup: refactor write_flags
Date: Thu, 1 Aug 2019 12:40:32 +0000
Message-ID: <fc222b5c-0132-c1a3-7fe0-4c899de3e56c@virtuozzo.com>
In-Reply-To: <2225a89f-fe7e-e1be-c640-7281e8a20305@redhat.com>

01.08.2019 15:21, Max Reitz wrote:
> On 01.08.19 14:02, Vladimir Sementsov-Ogievskiy wrote:
>> 01.08.2019 14:37, Max Reitz wrote:
>>> On 01.08.19 13:32, Vladimir Sementsov-Ogievskiy wrote:
>>>> 01.08.2019 14:28, Max Reitz wrote:
>>>>> On 31.07.19 18:01, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> 30.07.2019 21:28, John Snow wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/30/19 12:32 PM, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>> write flags are constant, let's store it in BackupBlockJob instead of
>>>>>>>> recalculating. It also makes two boolean fields to be unused, so,
>>>>>>>> drop them.
>>>>>>>>
>>>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>>>>> ---
>>>>>>>>      block/backup.c | 24 ++++++++++++------------
>>>>>>>>      1 file changed, 12 insertions(+), 12 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/block/backup.c b/block/backup.c
>>>>>>>> index c5f941101a..4651649e9d 100644
>>>>>>>> --- a/block/backup.c
>>>>>>>> +++ b/block/backup.c
>>>>>>>> @@ -47,7 +47,6 @@ typedef struct BackupBlockJob {
>>>>>>>>          uint64_t len;
>>>>>>>>          uint64_t bytes_read;
>>>>>>>>          int64_t cluster_size;
>>>>>>>> -    bool compress;
>>>>>>>>          NotifierWithReturn before_write;
>>>>>>>>          QLIST_HEAD(, CowRequest) inflight_reqs;
>>>>>>>>      
>>>>>>>> @@ -55,7 +54,7 @@ typedef struct BackupBlockJob {
>>>>>>>>          bool use_copy_range;
>>>>>>>>          int64_t copy_range_size;
>>>>>>>>      
>>>>>>>> -    bool serialize_target_writes;
>>>>>>>> +    BdrvRequestFlags write_flags;
>>>>>>>>      } BackupBlockJob;
>>>>>>>>      
>>>>>>>>      static const BlockJobDriver backup_job_driver;
>>>>>>>> @@ -110,10 +109,6 @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>>>>>>          BlockBackend *blk = job->common.blk;
>>>>>>>>          int nbytes;
>>>>>>>>          int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>>>>>> -    int write_flags =
>>>>>>>> -            (job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0) |
>>>>>>>> -            (job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
>>>>>>>> -
>>>>>>>>      
>>>>>>>>          assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>>>>>>>          hbitmap_reset(job->copy_bitmap, start, job->cluster_size);
>>>>>>>> @@ -132,7 +127,7 @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>>>>>>          }
>>>>>>>>      
>>>>>>>>          ret = blk_co_pwrite(job->target, start, nbytes, *bounce_buffer,
>>>>>>>> -                        write_flags);
>>>>>>>> +                        job->write_flags);
>>>>>>>>          if (ret < 0) {
>>>>>>>>              trace_backup_do_cow_write_fail(job, start, ret);
>>>>>>>>              if (error_is_read) {
>>>>>>>> @@ -160,7 +155,6 @@ static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
>>>>>>>>          BlockBackend *blk = job->common.blk;
>>>>>>>>          int nbytes;
>>>>>>>>          int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>>>>>> -    int write_flags = job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0;
>>>>>>>>      
>>>>>>>>          assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
>>>>>>>>          assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>>>>>>> @@ -168,7 +162,7 @@ static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
>>>>>>>>          nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
>>>>>>>>          hbitmap_reset(job->copy_bitmap, start, job->cluster_size * nr_clusters);
>>>>>>>>          ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
>>>>>>>> -                            read_flags, write_flags);
>>>>>>>> +                            read_flags, job->write_flags);
>>>>>>>>          if (ret < 0) {
>>>>>>>>              trace_backup_do_cow_copy_range_fail(job, start, ret);
>>>>>>>>              hbitmap_set(job->copy_bitmap, start, job->cluster_size * nr_clusters);
>>>>>>>> @@ -638,10 +632,16 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>>>>>>>          job->sync_mode = sync_mode;
>>>>>>>>          job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
>>>>>>>>                             sync_bitmap : NULL;
>>>>>>>> -    job->compress = compress;
>>>>>>>>      
>>>>>>>> -    /* Detect image-fleecing (and similar) schemes */
>>>>>>>> -    job->serialize_target_writes = bdrv_chain_contains(target, bs);
>>>>>>>> +    /*
>>>>>>>> +     * Set write flags:
>>>>>>>> +     *  1. Detect image-fleecing (and similar) schemes
>>>>>>>> +     *  2. Handle compression
>>>>>>>> +     */
>>>>>>>> +    job->write_flags =
>>>>>>>> +            (bdrv_chain_contains(target, bs) ? BDRV_REQ_SERIALISING : 0) |
>>>>>>>> +            (compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
>>>>>>>> +
>>>>>>>>          job->cluster_size = cluster_size;
>>>>>>>>          job->copy_bitmap = copy_bitmap;
>>>>>>>>          copy_bitmap = NULL;
>>>>>>>>
>>>>>>>
>>>>>>> What happens if you did pass BDRV_REQ_WRITE_COMPRESSED to
>>>>>>> blk_co_copy_range? Is that rejected somewhere in the stack?
>>>>>>
>>>>>>
>>>>>> Hmm, I'm afraid that it will be silently ignored.
>>>>>>
>>>>>> And I have one question related to copy offload too.
>>>>>>
>>>>>> Do we really need to handle max_transfer in backup code for copy offload?
>>>>>> Is max_transfer related to it really?
>>>>>>
>>>>>> Anyway, bl.max_transfer should be handled in generic copy_range code in block/io.c
>>>>>> (if it should at all), I'm going to fix it, but may be, I can just drop this limitation
>>>>>> from backup?
>>>>>
>>>>> On a quick glance, it doesn’t look like a limitation to me but actually
>>>>> like the opposite.  backup_cow_with_bounce_buffer() only copies up to
>>>>> cluster_size, whereas backup_cow_with_offload() will copy up to the
>>>>> maximum transfer size permitted by both source and target for copy
>>>>> offloading.
>>>>>
>>>>
>>>> I mean, why not just copy the whole chunk that comes in the notifier, without caring about
>>>> max_transfer? The backup loop copies cluster by cluster anyway, so only the notifier may copy
>>>> a larger chunk.
>>>
>>> One thing that comes to mind is the hbitmap_get() check in
>>> backup_do_cow().  You don’t want to copy everything just because the
>>> first cluster needs to be copied.
>>>
>>
>> Hmm, but seems that we do exactly this, and this is wrong. But this is separate thing..
> 
> You’re right.  It’s totally broken.  Nice.
> 
> The following gets me a nice content mismatch:
> 
> $ ./qemu-img create -f qcow2 src.qcow2 2M
> $ ./qemu-io -c 'write -P 42 0 2M' src.qcow2
> $ cp src.qcow2 ref.qcow2
> 
> {"execute":"qmp_capabilities"}
> {"return": {}}
> {"execute":"blockdev-add","arguments":
>   {"node-name":"src","driver":"qcow2",
>    "file":{"driver":"file","filename":"src.qcow2"}}}
> {"return": {}}
> {"execute":"drive-backup","arguments":
>   {"device":"src","job-id":"backup","target":"tgt.qcow2",
>    "format":"qcow2","sync":"full","speed":1048576}}
> {"timestamp": {"seconds": 1564661742, "microseconds": 268384},
>   "event": "JOB_STATUS_CHANGE",
>    "data": {"status": "created", "id": "backup"}}
> {"timestamp": {"seconds": 1564661742, "microseconds": 268436},
>   "event": "JOB_STATUS_CHANGE",
>    "data": {"status": "running", "id": "backup"}}
> {"return": {}}
> {"execute":"human-monitor-command",
>   "arguments":{"command-line":
>                "qemu-io src \"write -P 23 1114112 65536\""}}
> {"return": ""}
> {"execute":"human-monitor-command",
>   "arguments":{"command-line":
>                "qemu-io src \"write -P 66 1048576 1M\""}}
> {"return": ""}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278362},
>   "event": "JOB_STATUS_CHANGE",
>   "data": {"status": "waiting", "id": "backup"}}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278534},
>   "event": "JOB_STATUS_CHANGE",
>   "data": {"status": "pending", "id": "backup"}}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278778},
>   "event": "BLOCK_JOB_COMPLETED",
>   "data": {"device": "backup", "len": 2097152, "offset": 2162688,
>            "speed": 1048576, "type": "backup"}}
> {"execute":"quit"}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278884},
>   "event": "JOB_STATUS_CHANGE",
>   "data": {"status": "concluded", "id": "backup"}}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278946},
>   "event": "JOB_STATUS_CHANGE",
>   "data": {"status": "null", "id": "backup"}}
> {"return": {}}
> 
> $ ./qemu-img compare src.qcow2
> Content mismatch at offset 1114112!
> 
> 
> Aww maaan.  Setting copy_range to false “fixes” it.  I guess this’ll
> need to be fixed for 4.1. :-/

Oops, yes. When I wrote "this is a separate thing..", I thought that copying more than
needed was not too bad, but actually we must not copy clusters that are not dirty in
copy_bitmap, as they may already have been rewritten by the guest.
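
To make the rule concrete, here is a minimal sketch of clamping a copy to the
contiguous run of dirty clusters. It is not the backup code: DirtyMap and
dirty_run_bytes() are hypothetical names, and a plain bool array stands in for
copy_bitmap instead of the HBitmap API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the job's copy_bitmap: one bool per cluster,
 * true meaning "still needs to be copied". This is not QEMU's HBitmap API.
 */
typedef struct DirtyMap {
    const bool *dirty;
    size_t nb_clusters;
    int64_t cluster_size;
} DirtyMap;

/*
 * Return how many bytes, starting at the cluster-aligned offset 'start' and
 * limited to 'max_bytes', belong to one contiguous run of dirty clusters.
 * Returns 0 if the first cluster is already clean.
 */
static int64_t dirty_run_bytes(const DirtyMap *map, int64_t start,
                               int64_t max_bytes)
{
    assert(map->cluster_size > 0);
    assert(start >= 0 && start % map->cluster_size == 0);

    size_t first = (size_t)(start / map->cluster_size);
    int64_t bytes = 0;

    for (size_t i = first; i < map->nb_clusters && bytes < max_bytes; i++) {
        if (!map->dirty[i]) {
            break; /* stop before a cluster that is no longer dirty */
        }
        bytes += map->cluster_size;
    }

    return bytes < max_bytes ? bytes : max_bytes;
}
```

With 64k clusters and the state from the reproducer above (cluster 17 at
offset 1114112 already copied, cluster 16 still dirty), a 1M request at offset
1048576 would be clamped to 65536 bytes, so the already-copied cluster is not
overwritten with newer guest data.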

> 
>> About copy_range, I'm just not sure that max_transfer is a true restriction for copy_range.
>> For example, for file-posix, max_transfer comes from some specific ioctl or from sysfs.. Is
>> it appropriate as a limitation for copy_file_range?
> 
> Who knows, but it’s probably the best approximation we have.

Ok. Then I'll try to move the handling of max_transfer for copy_range into block/io, as is done for
write.
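
Roughly, the generic code would split one large request into pieces no larger
than max_transfer. The sketch below only illustrates that loop and is not the
block/io.c implementation: copy_range_chunked(), CopyChunkFn, ChunkStats and
record_chunk() are hypothetical names, and treating max_transfer == 0 as "no
limit" is an assumption of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback type standing in for the driver-level copy call. */
typedef int (*CopyChunkFn)(int64_t src_offset, int64_t dst_offset,
                           int64_t bytes, void *opaque);

/*
 * Copy 'bytes' bytes in pieces no larger than 'max_transfer'.
 * Assumption of this sketch: max_transfer == 0 means "no limit".
 * Returns 0 on success or the first negative error from copy_chunk.
 */
static int copy_range_chunked(int64_t src_offset, int64_t dst_offset,
                              int64_t bytes, int64_t max_transfer,
                              CopyChunkFn copy_chunk, void *opaque)
{
    assert(bytes >= 0 && max_transfer >= 0);

    while (bytes > 0) {
        int64_t chunk = bytes;
        if (max_transfer > 0 && chunk > max_transfer) {
            chunk = max_transfer;
        }

        int ret = copy_chunk(src_offset, dst_offset, chunk, opaque);
        if (ret < 0) {
            return ret; /* stop at the first failing piece */
        }

        src_offset += chunk;
        dst_offset += chunk;
        bytes -= chunk;
    }

    return 0;
}

/* Example callback: only records what it was asked to copy. */
typedef struct ChunkStats {
    int calls;
    int64_t total;
    int64_t largest;
} ChunkStats;

static int record_chunk(int64_t src_offset, int64_t dst_offset,
                        int64_t bytes, void *opaque)
{
    ChunkStats *stats = opaque;

    (void)src_offset;
    (void)dst_offset;
    stats->calls++;
    stats->total += bytes;
    if (bytes > stats->largest) {
        stats->largest = bytes;
    }
    return 0;
}
```

With this in the generic layer, callers such as backup would not need to know
the limit at all; they would pass the full length and let the loop split it.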

> 
>> Also, Max, could you please take a look at "[PATCH v3] blockjob: drain all job nodes in block_job_drain"
>> thread? Maybe, what John questions is obvious for you.
> 
> Perhaps after fixing backup. :-/
> 
> Max
> 


-- 
Best regards,
Vladimir

