From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Max Reitz <mreitz@redhat.com>, John Snow <jsnow@redhat.com>,
"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "kwolf@redhat.com" <kwolf@redhat.com>,
Denis Lunev <den@virtuozzo.com>,
"armbru@redhat.com" <armbru@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH 3/3] block/backup: refactor write_flags
Date: Thu, 1 Aug 2019 12:40:32 +0000 [thread overview]
Message-ID: <fc222b5c-0132-c1a3-7fe0-4c899de3e56c@virtuozzo.com> (raw)
In-Reply-To: <2225a89f-fe7e-e1be-c640-7281e8a20305@redhat.com>
01.08.2019 15:21, Max Reitz wrote:
> On 01.08.19 14:02, Vladimir Sementsov-Ogievskiy wrote:
>> 01.08.2019 14:37, Max Reitz wrote:
>>> On 01.08.19 13:32, Vladimir Sementsov-Ogievskiy wrote:
>>>> 01.08.2019 14:28, Max Reitz wrote:
>>>>> On 31.07.19 18:01, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> 30.07.2019 21:28, John Snow wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/30/19 12:32 PM, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>> write flags are constant, let's store it in BackupBlockJob instead of
>>>>>>>> recalculating. It also makes two boolean fields to be unused, so,
>>>>>>>> drop them.
>>>>>>>>
>>>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>>>>> ---
>>>>>>>> block/backup.c | 24 ++++++++++++------------
>>>>>>>> 1 file changed, 12 insertions(+), 12 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/block/backup.c b/block/backup.c
>>>>>>>> index c5f941101a..4651649e9d 100644
>>>>>>>> --- a/block/backup.c
>>>>>>>> +++ b/block/backup.c
>>>>>>>> @@ -47,7 +47,6 @@ typedef struct BackupBlockJob {
>>>>>>>> uint64_t len;
>>>>>>>> uint64_t bytes_read;
>>>>>>>> int64_t cluster_size;
>>>>>>>> - bool compress;
>>>>>>>> NotifierWithReturn before_write;
>>>>>>>> QLIST_HEAD(, CowRequest) inflight_reqs;
>>>>>>>>
>>>>>>>> @@ -55,7 +54,7 @@ typedef struct BackupBlockJob {
>>>>>>>> bool use_copy_range;
>>>>>>>> int64_t copy_range_size;
>>>>>>>>
>>>>>>>> - bool serialize_target_writes;
>>>>>>>> + BdrvRequestFlags write_flags;
>>>>>>>> } BackupBlockJob;
>>>>>>>>
>>>>>>>> static const BlockJobDriver backup_job_driver;
>>>>>>>> @@ -110,10 +109,6 @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>>>>>> BlockBackend *blk = job->common.blk;
>>>>>>>> int nbytes;
>>>>>>>> int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>>>>>> - int write_flags =
>>>>>>>> - (job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0) |
>>>>>>>> - (job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
>>>>>>>> -
>>>>>>>>
>>>>>>>> assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>>>>>>> hbitmap_reset(job->copy_bitmap, start, job->cluster_size);
>>>>>>>> @@ -132,7 +127,7 @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>>>>>> }
>>>>>>>>
>>>>>>>> ret = blk_co_pwrite(job->target, start, nbytes, *bounce_buffer,
>>>>>>>> - write_flags);
>>>>>>>> + job->write_flags);
>>>>>>>> if (ret < 0) {
>>>>>>>> trace_backup_do_cow_write_fail(job, start, ret);
>>>>>>>> if (error_is_read) {
>>>>>>>> @@ -160,7 +155,6 @@ static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
>>>>>>>> BlockBackend *blk = job->common.blk;
>>>>>>>> int nbytes;
>>>>>>>> int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>>>>>> - int write_flags = job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0;
>>>>>>>>
>>>>>>>> assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
>>>>>>>> assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>>>>>>> @@ -168,7 +162,7 @@ static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
>>>>>>>> nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
>>>>>>>> hbitmap_reset(job->copy_bitmap, start, job->cluster_size * nr_clusters);
>>>>>>>> ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
>>>>>>>> - read_flags, write_flags);
>>>>>>>> + read_flags, job->write_flags);
>>>>>>>> if (ret < 0) {
>>>>>>>> trace_backup_do_cow_copy_range_fail(job, start, ret);
>>>>>>>> hbitmap_set(job->copy_bitmap, start, job->cluster_size * nr_clusters);
>>>>>>>> @@ -638,10 +632,16 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>>>>>>> job->sync_mode = sync_mode;
>>>>>>>> job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
>>>>>>>> sync_bitmap : NULL;
>>>>>>>> - job->compress = compress;
>>>>>>>>
>>>>>>>> - /* Detect image-fleecing (and similar) schemes */
>>>>>>>> - job->serialize_target_writes = bdrv_chain_contains(target, bs);
>>>>>>>> + /*
>>>>>>>> + * Set write flags:
>>>>>>>> + * 1. Detect image-fleecing (and similar) schemes
>>>>>>>> + * 2. Handle compression
>>>>>>>> + */
>>>>>>>> + job->write_flags =
>>>>>>>> + (bdrv_chain_contains(target, bs) ? BDRV_REQ_SERIALISING : 0) |
>>>>>>>> + (compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
>>>>>>>> +
>>>>>>>> job->cluster_size = cluster_size;
>>>>>>>> job->copy_bitmap = copy_bitmap;
>>>>>>>> copy_bitmap = NULL;
>>>>>>>>
>>>>>>>
>>>>>>> What happens if you did pass BDRV_REQ_WRITE_COMPRESSED to
>>>>>>> blk_co_copy_range? Is that rejected somewhere in the stack?
>>>>>>
>>>>>>
>>>>>> Hmm, I'm afraid that it will be silently ignored.
>>>>>>
>>>>>> And I have one question related to copy offload too.
>>>>>>
>>>>>> Do we really need to handle max_transfer in backup code for copy offload?
>>>>>> Is max_transfer related to it really?
>>>>>>
>>>>>> Anyway, bl.max_transfer should be handled in generic copy_range code in block/io.c
>>>>>> (if it should at all), I'm going to fix it, but may be, I can just drop this limitation
>>>>>> from backup?
>>>>>
>>>>> On a quick glance, it doesn’t look like a limitation to me but actually
>>>>> like the opposite. backup_cow_with_bounce_buffer() only copies up to
>>>>> cluster_size, whereas backup_cow_with_offload() will copy up to the
>>>>> maximum transfer size permitted by both source and target for copy
>>>>> offloading.
>>>>>
>>>>
>>>> I mean, why not to just copy the whole chunk comes in notifier and don't care about
>>>> max_transfer? Backup loop copies cluster by cluster anyway, so only notifier may copy
>>>> larger chunk.
>>>
>>> One thing that comes to mind is the hbitmap_get() check in
>>> backup_do_cow(). You don’t want to copy everything just because the
>>> first cluster needs to be copied.
>>>
>>
>> Hmm, but seems that we do exactly this, and this is wrong. But this is separate thing..
>
> You’re right. It’s totally broken. Nice.
>
> The following gets me a nice content mismatch:
>
> $ ./qemu-img create -f qcow2 src.qcow2 2M
> $ ./qemu-io -c 'write -P 42 0 2M' src.qcow2
> $ cp src.qcow2 ref.qcow2
>
> {"execute":"qmp_capabilities"}
> {"return": {}}
> {"execute":"blockdev-add","arguments":
> {"node-name":"src","driver":"qcow2",
> "file":{"driver":"file","filename":"src.qcow2"}}}
> {"return": {}}
> {"execute":"drive-backup","arguments":
> {"device":"src","job-id":"backup","target":"tgt.qcow2",
> "format":"qcow2","sync":"full","speed":1048576}}
> {"timestamp": {"seconds": 1564661742, "microseconds": 268384},
> "event": "JOB_STATUS_CHANGE",
> "data": {"status": "created", "id": "backup"}}
> {"timestamp": {"seconds": 1564661742, "microseconds": 268436},
> "event": "JOB_STATUS_CHANGE",
> "data": {"status": "running", "id": "backup"}}
> {"return": {}}
> {"execute":"human-monitor-command",
> "arguments":{"command-line":
> "qemu-io src \"write -P 23 1114112 65536\""}}
> {"return": ""}
> {"execute":"human-monitor-command",
> "arguments":{"command-line":
> "qemu-io src \"write -P 66 1048576 1M\""}}
> {"return": ""}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278362}.
> "event": "JOB_STATUS_CHANGE",
> "data": {"status": "waiting", "id": "backup"}}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278534},
> "event": "JOB_STATUS_CHANGE",
> "data": {"status": "pending", "id": "backup"}}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278778},
> "event": "BLOCK_JOB_COMPLETED",
> "data": {"device": "backup", "len": 2097152, "offset": 2162688,
> "speed": 1048576, "type": "backup"}}
> {"execute":"quit"}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278884},
> "event": "JOB_STATUS_CHANGE",
> "data": {"status": "concluded", "id": "backup"}}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278946},
> "event": "JOB_STATUS_CHANGE",
> "data": {"status": "null", "id": "backup"}}
> {"return": {}}
>
> $ ./qemu-img compare src.qcow2
> Content mismatch at offset 1114112!
>
>
> Aww maaan. Setting copy_range to false “fixes” it. I guess this’ll
> need to be fixed for 4.1. :-/
Oops yes. Writing "this is a separate thing..", I thought that it's not very bad to
copy more than needed, but actually we must not copy clusters not-dirty in copy_bitmap,
as they may already be rewritten by the guest.
>
>> About copy_range, I just don't sure that max_transfer is a true restriction for copy_range.
>> For example, for file_posix max_transfer comes from some specific ioctl or from sysfs.. Is
>> it appropriate as limitation for copy_file_range?
>
> Who knows, but it’s probably the best approximation we have.
Ok. Than I'll try to move handling of max_transfer for copy_range into block/io, like it's done for
write.
>
>> Also, Max, could you please take a look at "[PATCH v3] blockjob: drain all job nodes in block_job_drain"
>> thread? Maybe, what John questions is obvious for you.
>
> Perhaps after fixing backup. :-/
>
> Max
>
--
Best regards,
Vladimir
next prev parent reply other threads:[~2019-08-01 12:41 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-30 16:32 [Qemu-devel] [PATCH 0/3] backup fixes for 4.1? Vladimir Sementsov-Ogievskiy
2019-07-30 16:32 ` [Qemu-devel] [PATCH 1/3] block/backup: deal with zero detection Vladimir Sementsov-Ogievskiy
2019-07-30 18:40 ` John Snow
2019-07-31 10:01 ` Vladimir Sementsov-Ogievskiy
2019-07-31 13:45 ` John Snow
2019-08-01 11:18 ` Max Reitz
2019-08-01 11:18 ` Max Reitz
2019-07-30 16:32 ` [Qemu-devel] [PATCH 2/3] block/backup: disable copy_range for compressed backup Vladimir Sementsov-Ogievskiy
2019-07-30 18:22 ` John Snow
2019-07-31 13:51 ` Vladimir Sementsov-Ogievskiy
2019-08-01 11:20 ` Max Reitz
2019-08-05 16:05 ` Max Reitz
2019-07-30 16:32 ` [Qemu-devel] [PATCH 3/3] block/backup: refactor write_flags Vladimir Sementsov-Ogievskiy
2019-07-30 18:28 ` John Snow
2019-07-31 16:01 ` Vladimir Sementsov-Ogievskiy
2019-08-01 11:28 ` Max Reitz
2019-08-01 11:32 ` Vladimir Sementsov-Ogievskiy
2019-08-01 11:37 ` Max Reitz
2019-08-01 12:02 ` Vladimir Sementsov-Ogievskiy
2019-08-01 12:21 ` Max Reitz
2019-08-01 12:40 ` Vladimir Sementsov-Ogievskiy [this message]
2019-08-01 11:28 ` Max Reitz
2019-07-30 18:41 ` [Qemu-devel] [PATCH 0/3] backup fixes for 4.1? John Snow
2019-07-31 10:29 ` Vladimir Sementsov-Ogievskiy
2019-07-31 13:46 ` John Snow
2019-08-07 23:52 ` John Snow
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=fc222b5c-0132-c1a3-7fe0-4c899de3e56c@virtuozzo.com \
--to=vsementsov@virtuozzo.com \
--cc=armbru@redhat.com \
--cc=den@virtuozzo.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).