From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Stefano Garzarella <sgarzare@redhat.com>
Cc: Peter Lieven <pl@kamp.de>,
dillaman@redhat.com, qemu-devel <qemu-devel@nongnu.org>,
qemu-block <qemu-block@nongnu.org>
Subject: Re: QEMU RBD is slow with QCOW2 images
Date: Thu, 4 Mar 2021 11:15:11 +0000
Message-ID: <YEDBP86Y7OxiApwX@redhat.com>
In-Reply-To: <20210304111251.2ernxss627lllwqa@steredhat>

On Thu, Mar 04, 2021 at 12:12:51PM +0100, Stefano Garzarella wrote:
> On Thu, Mar 04, 2021 at 10:25:33AM +0000, Daniel P. Berrangé wrote:
> > On Thu, Mar 04, 2021 at 09:55:40AM +0100, Stefano Garzarella wrote:
> > > On Wed, Mar 03, 2021 at 01:47:06PM -0500, Jason Dillaman wrote:
> > > > On Wed, Mar 3, 2021 at 12:41 PM Stefano Garzarella <sgarzare@redhat.com> wrote:
> > > > >
> > > > > Hi Jason,
> > > > > as reported in this BZ [1], when qemu-img creates a QCOW2 image on RBD
> > > > > writing data is very slow compared to a raw file.
> > > > >
> > > > > Comparing raw vs QCOW2 image creation with RBD I found that we use a
> > > > > different object size, for the raw file I see '4 MiB objects', for QCOW2
> > > > > I see '64 KiB objects' as reported on comment 14 [2].
> > > > > This should be the main issue of slowness, indeed forcing in the code 4
> > > > > MiB object size also for QCOW2 increased the speed a lot.
> > > > >
> > > > > Looking more closely, I discovered that for raw files we call
> > > > > rbd_create() with obj_order = 0 (if the 'cluster_size' option is not
> > > > > defined), so the default object size is used.
> > > > > Instead for QCOW2, we use obj_order = 16, since the default
> > > > > 'cluster_size' defined for QCOW2, is 64 KiB.
> > > > >
> > > > > Using '-o cluster_size=2M' with qemu-img changed only the qcow2 cluster
> > > > > size, since in qcow2_co_create_opts() we remove the 'cluster_size' from
> > > > > QemuOpts calling qemu_opts_to_qdict_filtered().
> > > > > For some reason that I have yet to understand, after this deletion
> > > > > the default value of 'cluster_size' for qcow2 (64 KiB) still remains
> > > > > in QemuOpts, and it is used in qemu_rbd_co_create_opts().
> > > > >
> > > > > At this point my doubts are:
> > > > > Does it make sense to use the same cluster_size as qcow2 as object_size
> > > > > in RBD?
> > > >
> > > > No, not really. But it also doesn't really make any sense to put a
> > > > QCOW2 image within an RBD image. To clarify from the BZ, OpenStack
> > > > does not put QCOW2 images on RBD, it converts QCOW2 images into raw
> > > > images to store in RBD.
> > >
> > > Yes, that was my doubt, thanks for the confirmation.
> > >
> > > Daniel (+CC) also confirmed the same thing to me, but for completeness he
> > > added that there is a case where OpenStack could use qcow2 on RBD, but in
> > > that case it uses in-kernel RBD, so the QEMU RBD driver is not involved.
> > >
> > > >
> > > > > If we want to keep the 2 options separated, how can it be done? Should
> > > > > we rename the option in block/rbd.c?
> > > >
> > > > You can already pass overrides to the RBD block driver by just
> > > > appending them after the
> > > > "rbd:<filename>[:option1=value1[:option2=value2]]" portion, perhaps
> > > > that could be re-used.
> > >
> > > I see, we should extend qemu_rbd_parse_filename() to support it.
> >
> > We shouldn't really be extending the legacy filename syntax.
> > If we need extra options we want them in the QAPI schema for
> > blockdev.
>
> Got it.
>
> I'm still a bit confused about how QemuOpts are handled between format and
> protocol drivers.
>
> It seems that in this case the protocol tries to access some information
> from the format (BLOCK_OPT_CLUSTER_SIZE).
>
> Since the format removes this information from the QemuOpts passed to the
> protocol, this takes the default value of the format, even if a different
> value is specified.
>
> Is it correct for a protocol to access BLOCK_OPT_CLUSTER_SIZE?

In a -blockdev world, the caller would be expected to set the values
explicitly at all layers that need it.

You're talking about a scenario that is non-blockdev though, and
I'm not sure what the right answer is here. Will need Kevin/Max
to answer that one.
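
To make "explicitly at all layers" a bit more concrete, here is a
minimal Python sketch, not a definitive recipe. It assumes the RBD
object order is simply log2 of the object size, which matches the
64 KiB -> obj_order = 16 figure quoted above. The helper names
obj_order_for() and layered_cluster_sizes() are hypothetical, and the
returned dicts are incomplete fragments that only mimic the shape of
per-layer creation options; the real field names should be checked
against the QAPI schema.

```python
KiB = 1024
MiB = 1024 * KiB


def obj_order_for(object_size: int) -> int:
    """Return log2(object_size), assuming order == log2(size).

    Hypothetical helper: only accepts power-of-two sizes.
    """
    if object_size <= 0 or object_size & (object_size - 1):
        raise ValueError("object size must be a power of two")
    return object_size.bit_length() - 1


def layered_cluster_sizes(rbd_object_size: int, qcow2_cluster_size: int) -> dict:
    """Keep the protocol and format values separate and explicit.

    Illustrative fragments only: neither layer inherits a default
    from the other, the caller states both values.
    """
    return {
        "protocol": {"driver": "rbd", "cluster-size": rbd_object_size},
        "format": {"driver": "qcow2", "cluster-size": qcow2_cluster_size},
    }


if __name__ == "__main__":
    # The two object sizes discussed in the thread.
    print(obj_order_for(64 * KiB))
    print(obj_order_for(4 * MiB))
    print(layered_cluster_sizes(4 * MiB, 64 * KiB))
```

The point of the sketch is only that the two sizes are independent
inputs: picking a 64 KiB qcow2 cluster does not have to imply 64 KiB
RBD objects.
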
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|