From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "fam@euphon.net" <fam@euphon.net>,
"kwolf@redhat.com" <kwolf@redhat.com>,
Denis Lunev <den@virtuozzo.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"armbru@redhat.com" <armbru@redhat.com>,
"stefanha@redhat.com" <stefanha@redhat.com>,
"jsnow@redhat.com" <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once
Date: Mon, 12 Aug 2019 18:11:34 +0200 [thread overview]
Message-ID: <ecff86a8-57d1-4c8a-d4b4-95524471d249@redhat.com> (raw)
In-Reply-To: <89c87c83-276a-7663-a239-57dbd9f91a30@virtuozzo.com>
[-- Attachment #1.1: Type: text/plain, Size: 7407 bytes --]
On 12.08.19 17:47, Vladimir Sementsov-Ogievskiy wrote:
> 12.08.2019 18:10, Max Reitz wrote:
>> On 10.08.19 21:31, Vladimir Sementsov-Ogievskiy wrote:
>>> backup_cow_with_offload can transfer more than one cluster. Let
>>> backup_cow_with_bounce_buffer behave similarly. It reduces the number
>>> of IO requests, since there is no need to copy cluster by cluster.
>>>
>>> Logic around bounce_buffer allocation changed: we can't just allocate
>>> one-cluster-sized buffer to share for all iterations. We can't also
>>> allocate buffer of full-request length it may be too large, so
>>> BACKUP_MAX_BOUNCE_BUFFER is introduced. And finally, allocation logic
>>> is to allocate a buffer sufficient to handle all remaining iterations
>>> at the point where we need the buffer for the first time.
>>>
>>> Bonus: get rid of pointer-to-pointer.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>> block/backup.c | 65 +++++++++++++++++++++++++++++++-------------------
>>> 1 file changed, 41 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/block/backup.c b/block/backup.c
>>> index d482d93458..65f7212c85 100644
>>> --- a/block/backup.c
>>> +++ b/block/backup.c
>>> @@ -27,6 +27,7 @@
>>> #include "qemu/error-report.h"
>>>
>>> #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
>>> +#define BACKUP_MAX_BOUNCE_BUFFER (64 * 1024 * 1024)
>>>
>>> typedef struct CowRequest {
>>> int64_t start_byte;
>>> @@ -98,44 +99,55 @@ static void cow_request_end(CowRequest *req)
>>> qemu_co_queue_restart_all(&req->wait_queue);
>>> }
>>>
>>> -/* Copy range to target with a bounce buffer and return the bytes copied. If
>>> - * error occurred, return a negative error number */
>>> +/*
>>> + * Copy range to target with a bounce buffer and return the bytes copied. If
>>> + * error occurred, return a negative error number
>>> + *
>>> + * @bounce_buffer is assumed to enough to store
>>
>> s/to/to be/
>>
>>> + * MIN(BACKUP_MAX_BOUNCE_BUFFER, @end - @start) bytes
>>> + */
>>> static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>> int64_t start,
>>> int64_t end,
>>> bool is_write_notifier,
>>> bool *error_is_read,
>>> - void **bounce_buffer)
>>> + void *bounce_buffer)
>>> {
>>> int ret;
>>> BlockBackend *blk = job->common.blk;
>>> - int nbytes;
>>> + int nbytes, remaining_bytes;
>>> int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>
>>> assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>> - bdrv_reset_dirty_bitmap(job->copy_bitmap, start, job->cluster_size);
>>> - nbytes = MIN(job->cluster_size, job->len - start);
>>> - if (!*bounce_buffer) {
>>> - *bounce_buffer = blk_blockalign(blk, job->cluster_size);
>>> - }
>>> + bdrv_reset_dirty_bitmap(job->copy_bitmap, start, end - start);
>>> + nbytes = MIN(end - start, job->len - start);
>>>
>>> - ret = blk_co_pread(blk, start, nbytes, *bounce_buffer, read_flags);
>>> - if (ret < 0) {
>>> - trace_backup_do_cow_read_fail(job, start, ret);
>>> - if (error_is_read) {
>>> - *error_is_read = true;
>>> +
>>> + remaining_bytes = nbytes;
>>> + while (remaining_bytes) {
>>> + int chunk = MIN(BACKUP_MAX_BOUNCE_BUFFER, remaining_bytes);
>>> +
>>> + ret = blk_co_pread(blk, start, chunk, bounce_buffer, read_flags);
>>> + if (ret < 0) {
>>> + trace_backup_do_cow_read_fail(job, start, ret);
>>> + if (error_is_read) {
>>> + *error_is_read = true;
>>> + }
>>> + goto fail;
>>> }
>>> - goto fail;
>>> - }
>>>
>>> - ret = blk_co_pwrite(job->target, start, nbytes, *bounce_buffer,
>>> - job->write_flags);
>>> - if (ret < 0) {
>>> - trace_backup_do_cow_write_fail(job, start, ret);
>>> - if (error_is_read) {
>>> - *error_is_read = false;
>>> + ret = blk_co_pwrite(job->target, start, chunk, bounce_buffer,
>>> + job->write_flags);
>>> + if (ret < 0) {
>>> + trace_backup_do_cow_write_fail(job, start, ret);
>>> + if (error_is_read) {
>>> + *error_is_read = false;
>>> + }
>>> + goto fail;
>>> }
>>> - goto fail;
>>> +
>>> + start += chunk;
>>> + remaining_bytes -= chunk;
>>> }
>>>
>>> return nbytes;
>>> @@ -301,9 +313,14 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
>>> }
>>> }
>>> if (!job->use_copy_range) {
>>> + if (!bounce_buffer) {
>>> + size_t len = MIN(BACKUP_MAX_BOUNCE_BUFFER,
>>> + MAX(dirty_end - start, end - dirty_end));
>>> + bounce_buffer = blk_try_blockalign(job->common.blk, len);
>>> + }
>>
>> If you use _try_, you should probably also check whether it succeeded.
>
> Oops, you are right, of course.
>
>>
>> Anyway, I wonder whether it’d be better to just allocate this buffer
>> once per job (the first time we get here, probably) to be of size
>> BACKUP_MAX_BOUNCE_BUFFER and put it into BackupBlockJob. (And maybe add
>> a buf-size parameter similar to what the mirror jobs have.)
>>
>
> Once per job will not work, as we may have several guest writes in parallel and therefore
> several parallel copy-before-write operations.
Hm. I’m not quite happy with that because if the guest just issues many
large discards in parallel, this means that qemu will allocate a large
amount of memory.
It would be nice if there was a simple way to keep track of the total
memory usage and let requests yield if they would exceed it.
> Or if you mean writing an allocator based
> on once-allocated buffer like in mirror, I really dislike this idea, as we already have
> allocator: memalign/malloc/free and it works well, no reason to invent a new one and
> hardcode it into block layer (look at my answer to Eric on v2 of this patch for more info).
Well, at least it’d be something we can delay until blockdev-copy
arrives(TM).
Max
> Or, if you mean only backup_loop generated copy-requests, yes we may keep only one buffer for them,
> but:
> 1. it is not how it works now, so my patch is not a degradation in this case
> 2. I'm going to parallelize backup loop too, like my series "qcow2: async handling of fragmented io",
> which will need several allocated buffers anyway.
>
>>
>>> ret = backup_cow_with_bounce_buffer(job, start, dirty_end,
>>> is_write_notifier,
>>> - error_is_read, &bounce_buffer);
>>> + error_is_read, bounce_buffer);
>>> }
>>> if (ret < 0) {
>>> break;
>>>
>>
>>
>
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2019-08-12 16:12 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-10 19:31 [Qemu-devel] [PATCH v3 0/7] backup improvements Vladimir Sementsov-Ogievskiy
2019-08-10 19:31 ` [Qemu-devel] [PATCH v3 1/7] block/backup: deal with zero detection Vladimir Sementsov-Ogievskiy
2019-08-10 19:31 ` [Qemu-devel] [PATCH v3 2/7] block/backup: refactor write_flags Vladimir Sementsov-Ogievskiy
2019-08-10 19:31 ` [Qemu-devel] [PATCH v3 3/7] block/io: handle alignment and max_transfer for copy_range Vladimir Sementsov-Ogievskiy
2019-08-12 14:48 ` Max Reitz
2019-08-20 15:18 ` Vladimir Sementsov-Ogievskiy
2019-08-10 19:31 ` [Qemu-devel] [PATCH v3 4/7] block/backup: drop handling of " Vladimir Sementsov-Ogievskiy
2019-08-10 19:31 ` [Qemu-devel] [PATCH v3 5/7] block/backup: fix backup_cow_with_offload for last cluster Vladimir Sementsov-Ogievskiy
2019-08-10 19:31 ` [Qemu-devel] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once Vladimir Sementsov-Ogievskiy
2019-08-12 15:10 ` Max Reitz
2019-08-12 15:47 ` Vladimir Sementsov-Ogievskiy
2019-08-12 16:11 ` Max Reitz [this message]
2019-08-12 16:37 ` Vladimir Sementsov-Ogievskiy
2019-08-13 14:14 ` Vladimir Sementsov-Ogievskiy
2019-08-13 14:23 ` Max Reitz
2019-08-13 14:39 ` Vladimir Sementsov-Ogievskiy
2019-08-13 14:57 ` Max Reitz
2019-08-13 15:00 ` Vladimir Sementsov-Ogievskiy
2019-08-13 15:02 ` Max Reitz
2019-08-13 15:32 ` Vladimir Sementsov-Ogievskiy
2019-08-13 16:30 ` Max Reitz
2019-08-13 16:45 ` Vladimir Sementsov-Ogievskiy
2019-08-13 16:51 ` Max Reitz
2019-08-10 19:31 ` [Qemu-devel] [PATCH v3 7/7] block/backup: merge duplicated logic into backup_do_cow Vladimir Sementsov-Ogievskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ecff86a8-57d1-4c8a-d4b4-95524471d249@redhat.com \
--to=mreitz@redhat.com \
--cc=armbru@redhat.com \
--cc=den@virtuozzo.com \
--cc=fam@euphon.net \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).