From: Stefano Garzarella <sgarzare@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Peter Lieven <pl@kamp.de>,
dillaman@redhat.com, qemu-devel <qemu-devel@nongnu.org>,
qemu-block <qemu-block@nongnu.org>
Subject: Re: QEMU RBD is slow with QCOW2 images
Date: Thu, 4 Mar 2021 12:12:51 +0100
Message-ID: <20210304111251.2ernxss627lllwqa@steredhat>
In-Reply-To: <YEC1nQPYf4e5o8/j@redhat.com>
On Thu, Mar 04, 2021 at 10:25:33AM +0000, Daniel P. Berrangé wrote:
>On Thu, Mar 04, 2021 at 09:55:40AM +0100, Stefano Garzarella wrote:
>> On Wed, Mar 03, 2021 at 01:47:06PM -0500, Jason Dillaman wrote:
>> > On Wed, Mar 3, 2021 at 12:41 PM Stefano Garzarella <sgarzare@redhat.com> wrote:
>> > >
>> > > Hi Jason,
>> > > as reported in this BZ [1], when qemu-img creates a QCOW2 image on RBD
>> > > writing data is very slow compared to a raw file.
>> > >
>> > > Comparing raw vs QCOW2 image creation with RBD, I found that we use a
>> > > different object size: for the raw file I see '4 MiB objects', for QCOW2
>> > > I see '64 KiB objects', as reported in comment 14 [2].
>> > > This is probably the main cause of the slowness; indeed, forcing a 4 MiB
>> > > object size in the code for QCOW2 as well increased the speed a lot.
>> > >
>> > > Looking more closely, I discovered that for raw files we call
>> > > rbd_create() with obj_order = 0 (if the 'cluster_size' option is not
>> > > defined), so the default object size is used.
>> > > For QCOW2, instead, we use obj_order = 16, since the default
>> > > 'cluster_size' defined for QCOW2 is 64 KiB.
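
To make the numbers above concrete: an RBD object order is the base-2
logarithm of the object size, so 64 KiB corresponds to order 16 and 4 MiB
to order 22, while order 0 asks librbd for its default. The following is
only a minimal sketch of that mapping, not the code in block/rbd.c;
size_to_obj_order() is a hypothetical helper, and the power-of-two and
4096-byte minimum checks are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU or librbd function).
 * Maps a requested object size in bytes to an RBD object order.
 *   0  -> size not set, let librbd pick its default
 *  -1  -> invalid size (assumed rules: power of two, at least 4096)
 */
static int size_to_obj_order(uint64_t objsize)
{
    int order = 0;

    if (objsize == 0) {
        return 0;               /* option not given: use the default */
    }
    if (objsize & (objsize - 1)) {
        return -1;              /* not a power of two */
    }
    if (objsize < 4096) {
        return -1;              /* assumed lower bound */
    }
    /* Count how many times the size can be halved: that is log2(size). */
    while ((objsize >>= 1) != 0) {
        order++;
    }
    return order;
}
```

With this sketch, size_to_obj_order(64 * 1024) gives 16 and
size_to_obj_order(4 * 1024 * 1024) gives 22, matching the '64 KiB objects'
and '4 MiB objects' seen for QCOW2 and raw respectively.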
>> > >
>> > > Using '-o cluster_size=2M' with qemu-img changed only the qcow2 cluster
>> > > size, since in qcow2_co_create_opts() we remove 'cluster_size' from
>> > > QemuOpts by calling qemu_opts_to_qdict_filtered().
>> > > For some reason that I have yet to understand, after this removal the
>> > > default value of 'cluster_size' for qcow2 (64 KiB) still remains in
>> > > QemuOpts, and it is then used in qemu_rbd_co_create_opts().
>> > >
>> > > At this point my questions are:
>> > > Does it make sense to use the qcow2 cluster_size as the object size
>> > > in RBD?
>> >
>> > No, not really. But it also doesn't really make any sense to put a
>> > QCOW2 image within an RBD image. To clarify from the BZ, OpenStack
>> > does not put QCOW2 images on RBD, it converts QCOW2 images into raw
>> > images to store in RBD.
>>
>> Yes, that was my doubt, thanks for the confirmation.
>>
>> Daniel (+CC) also confirmed the same thing to me, but for completeness he
>> added that there is a case where OpenStack could use qcow2 on RBD. In that
>> case it uses in-kernel RBD, so the QEMU RBD driver is not involved.
>>
>> >
>> > > If we want to keep the 2 options separated, how can it be done? Should
>> > > we rename the option in block/rbd.c?
>> >
>> > You can already pass overrides to the RBD block driver by just
>> > appending them after the
>> > "rbd:<filename>[:option1=value1[:option2=value2]]" portion, perhaps
>> > that could be re-used.
>>
>> I see, we should extend qemu_rbd_parse_filename() to support it.
>
>We shouldn't really be extending the legacy filename syntax.
>If we need extra options we want them in the QAPI schema for
>blockdev.
Got it.
I'm still a bit confused about how QemuOpts are handled between format
and protocol drivers.
It seems that in this case the protocol driver tries to read information
that comes from the format driver (BLOCK_OPT_CLUSTER_SIZE).
Since the format driver removes this information from the QemuOpts passed
to the protocol driver, the protocol driver ends up with the format's
default value, even if a different value was specified.
Is it correct for a protocol driver to access BLOCK_OPT_CLUSTER_SIZE?
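
To illustrate what I mean, here is a toy model of the behaviour I am
seeing. It is not QEMU code: struct toy_opt and the toy_*() functions are
hypothetical, and the mechanism it shows (a lookup that falls back to the
default declared with the option once the explicit value has been removed)
is only my assumption about what might be happening.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for one option: a declared default plus an optional value. */
struct toy_opt {
    const char *name;
    uint64_t def_value;   /* default declared together with the option */
    int has_value;        /* 1 if the user passed an explicit value */
    uint64_t value;       /* explicit value, valid only if has_value */
};

/* Look up a size option; fall back to its declared default when unset. */
static uint64_t toy_get_size(const struct toy_opt *opts, size_t n,
                             const char *name, uint64_t fallback)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(opts[i].name, name) == 0) {
            return opts[i].has_value ? opts[i].value : opts[i].def_value;
        }
    }
    return fallback;      /* option not known at all */
}

/* "Format" step: take the explicit value and remove it from the options. */
static uint64_t toy_format_take(struct toy_opt *opts, size_t n,
                                const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(opts[i].name, name) == 0) {
            uint64_t v = opts[i].has_value ? opts[i].value
                                           : opts[i].def_value;
            opts[i].has_value = 0;   /* explicit value is gone now */
            return v;
        }
    }
    return 0;
}

/* Whole flow: returns the cluster_size the "protocol" step ends up with. */
static uint64_t toy_protocol_sees(uint64_t user_cluster_size)
{
    struct toy_opt opts[] = {
        { "cluster_size", 65536, 1, user_cluster_size },
    };

    /* The format consumes the user's value (e.g. 2 MiB) for itself... */
    (void)toy_format_take(opts, 1, "cluster_size");

    /* ...and the protocol, looking at the same options, gets the default. */
    return toy_get_size(opts, 1, "cluster_size", 0);
}
```

In this model toy_protocol_sees(2 * 1024 * 1024) returns 65536: the
"format" got the 2 MiB the user asked for, while the "protocol" only sees
the 64 KiB default.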
>
>> Maybe if we don't want to support this configuration, we should print some
>> warning messages.
>
>Note these are separate layers in QEMU block layer. qcow2 is a format
>driver, while RBD is a protocol driver. QEMU lets any format driver be
>run on top of any protocol driver in general. In practice there are
>certain combinations that are more sane and commonly used than others,
>but QEMU doesn't document this or assign support level to pairing
>right now.
Thanks for the clarification,
Stefano
Thread overview: 14+ messages
2021-03-03 17:40 QEMU RBD is slow with QCOW2 images Stefano Garzarella
2021-03-03 18:47 ` Jason Dillaman
2021-03-03 21:26 ` Peter Lieven
2021-03-04 8:58 ` Stefano Garzarella
2021-03-04 8:55 ` Stefano Garzarella
2021-03-04 10:25 ` Daniel P. Berrangé
2021-03-04 11:12 ` Stefano Garzarella [this message]
2021-03-04 11:15 ` Daniel P. Berrangé
2021-03-04 12:05 ` Kevin Wolf
2021-03-04 14:08 ` Stefano Garzarella
2021-03-04 14:59 ` Kevin Wolf
2021-03-04 17:32 ` Stefano Garzarella
2021-03-05 9:16 ` Kevin Wolf
2021-03-05 9:44 ` Stefano Garzarella