From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
qemu-block@nongnu.org
Cc: kwolf@redhat.com, jsnow@redhat.com, qemu-devel@nongnu.org,
ehabkost@redhat.com, crosa@redhat.com
Subject: Re: [PATCH v3 4/6] util: implement seqcache
Date: Fri, 12 Mar 2021 16:13:36 +0100
Message-ID: <f53fc06c-38df-f9fe-e927-b4f1b9bd5263@redhat.com>
In-Reply-To: <f0acd8b3-4f43-1a37-b08c-27f710fb3a60@virtuozzo.com>
On 12.03.21 15:37, Vladimir Sementsov-Ogievskiy wrote:
> 12.03.2021 16:41, Max Reitz wrote:
>> On 05.03.21 18:35, Vladimir Sementsov-Ogievskiy wrote:
>>> Implement cache for small sequential unaligned writes, so that they may
>>> be cached until we get a complete cluster and then write it.
>>>
>>> The cache is intended to be used for backup to qcow2 compressed target
>>> opened in O_DIRECT mode, but can be reused for any similar (even not
>>> block-layer related) task.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>> include/qemu/seqcache.h | 42 +++++
>>> util/seqcache.c | 361 ++++++++++++++++++++++++++++++++++++++++
>>> MAINTAINERS | 6 +
>>> util/meson.build | 1 +
>>> 4 files changed, 410 insertions(+)
>>> create mode 100644 include/qemu/seqcache.h
>>> create mode 100644 util/seqcache.c
>>
>> Looks quite good to me, thanks. Nice explanations, too. :)
>>
>> The only design question I have is whether there’s a reason you’re
>> using a list again instead of a hash table. I suppose we do need the
>> list anyway because of the next_flush iterator, so using a hash table
>> would only complicate the implementation, but still.
>
> Yes, it seems correct for the flush iterator to go in the same order as
> the writes come, so we need a list. We can add a hash table, but it will
> only help on reads. But for the compressed cache in qcow2 we try to flush
> often enough, so there should not be many clusters in the cache. So I
> think the addition of a hash table may be done later if needed.
Sure. The problem I see is that we’ll probably never reach the point of
it really being needed. O:)
So I think it’s a question of now or never.
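
To make the trade-off concrete, here is a minimal sketch of what "list plus hash table" could look like. It is not the seqcache code from the patch: the Cache and Cluster layouts, NB_BUCKETS, and the cache_add() / cache_find() helpers are all hypothetical, and keying the table by cluster index (offset / cluster_size) is an assumption made for illustration. The list keeps write order for the flush iterator; the hash chains are used only for lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_BUCKETS 16 /* arbitrary size, for illustration only */

/* Hypothetical cluster record; the real seqcache Cluster differs. */
typedef struct Cluster {
    int64_t offset;                 /* start of the cached data */
    struct Cluster *next_in_list;   /* write order, for the flush iterator */
    struct Cluster *next_in_bucket; /* hash chain, for lookups only */
} Cluster;

typedef struct Cache {
    int64_t cluster_size;
    Cluster *head, *tail;           /* list in write order */
    Cluster *buckets[NB_BUCKETS];   /* index keyed by cluster number */
} Cache;

static size_t bucket_of(const Cache *c, int64_t offset)
{
    return (size_t)(offset / c->cluster_size) % NB_BUCKETS;
}

/* Append to the ordered list and register in the hash index. */
static void cache_add(Cache *c, Cluster *cl)
{
    size_t b = bucket_of(c, cl->offset);

    assert(cl->offset >= 0);

    cl->next_in_list = NULL;
    if (c->tail) {
        c->tail->next_in_list = cl;
    } else {
        c->head = cl;
    }
    c->tail = cl;

    cl->next_in_bucket = c->buckets[b];
    c->buckets[b] = cl;
}

/* Find the cached cluster containing @offset without walking the list. */
static Cluster *cache_find(const Cache *c, int64_t offset)
{
    Cluster *cl;

    for (cl = c->buckets[bucket_of(c, offset)]; cl; cl = cl->next_in_bucket) {
        if (cl->offset / c->cluster_size == offset / c->cluster_size) {
            return cl;
        }
    }
    return NULL;
}
```

The cost is a second link per cluster and one more structure to keep consistent on removal, which is the extra complexity mentioned above; the gain only shows up on reads when many clusters are cached.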
[...]
>>> + */
>>> +bool seqcache_get_next_flush(SeqCache *s, int64_t *offset, int64_t *bytes,
>>> +                             uint8_t **buf, bool *unfinished)
>>
>> Could be “uint8_t *const *buf”, I suppose. Don’t know how much the
>> callers would hate that, though.
>
> Will do. And actually I wrote quite a big explanation but missed the fact
> that the caller doesn't get ownership of buf; it should be mentioned.
Great, thanks.
>>> +{
>>> +    Cluster *req = s->next_flush;
>>> +
>>> +    if (s->next_flush) {
>>> +        *unfinished = false;
>>> +        req = s->next_flush;
>>> +        s->next_flush = QSIMPLEQ_NEXT(req, entry);
>>> +        if (s->next_flush == s->cur_write) {
>>> +            s->next_flush = NULL;
>>> +        }
>>> +    } else if (s->cur_write && *unfinished) {
>>> +        req = s->cur_write;
>>
>> I was wondering whether flushing an unfinished cluster wouldn’t kind
>> of finalize it, but I suppose the problem with that would be that you
>> can’t add data to a finished cluster, which wouldn’t be that great if
>> you’re just flushing the cache without wanting to drop it all.
>>
>> (The problem I see is that flushing it later will mean all the data
>> that already has been written here will have to be rewritten. Not
>> that bad, I suppose.)
>
> Yes, that's all correct. Also there is an additional strong reason: qcow2
> depends on the fact that clusters become "finished" by defined rules:
> only when a cluster is really finished up to the end or when qcow2 starts
> writing another cluster.
>
> For "finished" clusters with an unaligned end we can safely align this end
> up to some good alignment, writing a bit more data than needed. It's safe
> because the tail of the cluster is never used. And we'll perform better
> with an aligned write, avoiding RMW.
>
> But when flushing an "unfinished" cluster, we should write exactly what we
> have in the cache, as a parallel write to the same cluster may happen,
> which will continue the sequential process.
OK, thanks for the explanation.
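
The rule described above can be sketched as a small length calculation. This is only an illustration under stated assumptions, not code from the patch: flush_write_len() and round_up() are hypothetical names, align stands for whatever write alignment the target prefers, and the buffer is assumed to have room for the padded tail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round @n up to the next multiple of @align (align must be positive). */
static int64_t round_up(int64_t n, int64_t align)
{
    assert(align > 0);
    return (n + align - 1) / align * align;
}

/*
 * Hypothetical helper: how many bytes to submit when flushing a cluster
 * that currently holds @bytes bytes of cached data.
 */
static int64_t flush_write_len(int64_t bytes, bool unfinished, int64_t align)
{
    if (unfinished) {
        /*
         * A parallel write may still continue this cluster, so write
         * exactly what is cached and nothing beyond it.
         */
        return bytes;
    }

    /*
     * Finished cluster: its tail is never used, so padding the write up
     * to an aligned length is harmless and avoids read-modify-write.
     */
    return round_up(bytes, align);
}
```

For example, with a 512-byte alignment a finished cluster holding 1000 bytes would be written as 1024 bytes, while an unfinished one would be written as exactly 1000 bytes.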
Max