Re: [Qemu-devel] [RFC] Re-evaluating subcluster allocation for qcow2 images

From: Kevin Wolf <kwolf@redhat.com>
To: Alberto Garcia <berto@igalia.com>
Cc: Anton Nefedov <anton.nefedov@virtuozzo.com>,
	Denis Lunev <den@virtuozzo.com>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [RFC] Re-evaluating subcluster allocation for qcow2 images
Date: Thu, 11 Jul 2019 16:32:34 +0200
Message-ID: <20190711143234.GB6594@linux.fritz.box>
In-Reply-To: <w51zhlkirzr.fsf@maestria.local.igalia.com>

On 11.07.2019 at 16:08, Alberto Garcia wrote:
> Some questions that are still open:
> 
> - It is possible to configure very easily the number of subclusters per
>   cluster. It is now hardcoded to 32 in qcow2_do_open() but any power of
>   2 would work (just change the number there if you want to test
>   it). Would an option for this be worth adding?

I think for testing we can just change the constant. Once the feature is
merged and used in production, I don't think there is any reason to
leave bits unused.
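
To make the bit budget concrete, here is a minimal Python sketch. It
assumes that the extra 64 bits of a 128 bit L2 entry hold the subcluster
bitmap, with one allocation bit and one "all zeroes" bit per subcluster.
The function name is made up for illustration and is not taken from the
prototype.

```python
BITMAP_BITS = 64  # assumption: the second 64 bit half of a 128 bit L2 entry


def unused_bitmap_bits(subclusters, bits_per_subcluster=2):
    """Return how many bitmap bits stay unused for a given subcluster count.

    bits_per_subcluster=2 assumes one allocation bit plus one "all zeroes"
    bit per subcluster; pass 1 to model dropping the "all zeroes" bits.
    """
    used = subclusters * bits_per_subcluster
    if used > BITMAP_BITS:
        raise ValueError("bitmap does not fit in the L2 entry")
    return BITMAP_BITS - used


for n in (8, 16, 32):
    print(n, "subclusters ->", unused_bitmap_bits(n), "bits unused")
```

Under these assumptions 32 subclusters fill the bitmap exactly, while a
smaller count leaves part of every L2 entry unused.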

> - We could also allow the user to choose 64 subclusters per cluster and
>   disable the "all zeroes" bits in that case. It is quite simple in
>   terms of lines of code but it would make the qcow2 spec a bit more
>   complicated.
> 
> - We would now have "all zeroes" bits at the cluster and subcluster
>   levels, so there's an ambiguity here that we need to solve. In
>   particular, what happens if we have a QCOW2_CLUSTER_ZERO_ALLOC cluster
>   but some bits from the bitmap are set? Do we ignore them completely?

The (super)cluster zero bit should probably always be clear if
subclusters are used. If it's set, we have a corrupted image.
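
As a rough sketch of that rule (not actual QEMU code): the check below
treats a set cluster-level zero bit as corruption whenever subclusters
are enabled. QCOW_OFLAG_ZERO is bit 0 of a standard L2 entry; the
function name and the subclusters_enabled parameter are hypothetical.

```python
QCOW_OFLAG_ZERO = 1 << 0  # cluster-level "all zeroes" flag in a standard L2 entry


def l2_entry_is_consistent(l2_entry, subclusters_enabled):
    """Return False if the cluster-level zero bit is set on a subcluster image.

    l2_entry is the first 64 bit word of the entry. How the subcluster
    bitmap itself is validated is outside the scope of this sketch.
    """
    if subclusters_enabled and (l2_entry & QCOW_OFLAG_ZERO):
        return False
    return True


print(l2_entry_is_consistent(0x50000, subclusters_enabled=True))
print(l2_entry_is_consistent(0x50001, subclusters_enabled=True))
```

With such a rule the ambiguity goes away: the per-subcluster bits are the
only source of "reads as zeroes" information in a subcluster image.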

> I also ran some I/O tests using a similar scenario like last time (SSD
> drive, 40GB backing image). Here are the results, you can see the
> difference between the previous prototype (8 subclusters per cluster)
> and the new one (32):

Is the 8 subclusters test run with the old version (64 bit L2 entries)
or the new version (128 bit L2 entries) with bits left unused?

> |--------------+----------------+---------------+-----------------|
> | Cluster size | 32 subclusters | 8 subclusters | subclusters=off |
> |--------------+----------------+---------------+-----------------|
> |         4 KB |        80 IOPS |      101 IOPS |         92 IOPS |
> |         8 KB |       108 IOPS |      299 IOPS |        417 IOPS |
> |        16 KB |      3440 IOPS |     7555 IOPS |       3347 IOPS |
> |        32 KB |     10718 IOPS |    13038 IOPS |       2435 IOPS |
> |        64 KB |     12569 IOPS |    10613 IOPS |       1622 IOPS |
> |       128 KB |     11444 IOPS |     4907 IOPS |        866 IOPS |
> |       256 KB |      9335 IOPS |     2618 IOPS |        561 IOPS |
> |       512 KB |       185 IOPS |     1678 IOPS |        353 IOPS |
> |      1024 KB |      2477 IOPS |      863 IOPS |        212 IOPS |
> |      2048 KB |      1536 IOPS |      571 IOPS |        123 IOPS |
> |--------------+----------------+---------------+-----------------|
> 
> I'm surprised about the 256 KB cluster / 32 subclusters case (I would
> expect ~3300 IOPS), but I ran it a few times and the results are always
> the same. I still haven't investigated why that happens. The rest of the
> results seem more or less normal.

Shouldn't 256k/8k perform similarly to 64k/8k, or maybe a bit better?
Why did you expect ~3300 IOPS?

I found other results more surprising. In particular:

* Why does 64k/2k perform better than 128k/4k when the block size for
  your requests is 4k?

* Why is the maximum for 8 subclusters higher than for 32 subclusters?
  I guess this does make some sense if the 8 subclusters case actually
  used 64 bit L2 entries. If you did use 128 bit entries for both 32 and
  8 subclusters, I don't see why 8 subclusters should perform better in
  any case.

* What causes the minimum at 512k with 32 subclusters? The other two
  setups have a maximum and performance decreases monotonically to both
  sides. This one has a minimum at 512k and larger cluster sizes improve
  performance again.

  In fact, 512k performs really badly, even compared to subclusters=off.
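
For reference, the "cluster size/subcluster size" notation and the
observations above can be checked against the quoted table with a short
Python sketch. The numbers are copied from the table; the helper names
are made up for illustration.

```python
# IOPS figures copied from the quoted table, keyed by cluster size in KB.
# Each tuple is (32 subclusters, 8 subclusters, subclusters=off).
RESULTS = {
    4: (80, 101, 92),
    8: (108, 299, 417),
    16: (3440, 7555, 3347),
    32: (10718, 13038, 2435),
    64: (12569, 10613, 1622),
    128: (11444, 4907, 866),
    256: (9335, 2618, 561),
    512: (185, 1678, 353),
    1024: (2477, 863, 212),
    2048: (1536, 571, 123),
}


def subcluster_kb(cluster_kb, subclusters):
    """Subcluster size in KB, e.g. 64 KB with 32 subclusters is "64k/2k"."""
    return cluster_kb / subclusters


def peak(column):
    """Return (cluster_kb, iops) of the best result in the given column."""
    best = max(RESULTS, key=lambda kb: RESULTS[kb][column])
    return best, RESULTS[best][column]


def local_minima(column):
    """Cluster sizes whose IOPS are lower than both neighbouring sizes."""
    sizes = sorted(RESULTS)
    return [
        kb
        for prev, kb, nxt in zip(sizes, sizes[1:], sizes[2:])
        if RESULTS[kb][column] < RESULTS[prev][column]
        and RESULTS[kb][column] < RESULTS[nxt][column]
    ]


print(subcluster_kb(64, 32), subcluster_kb(128, 32))
print(peak(0), peak(1))
print(local_minima(0), local_minima(1), local_minima(2))
```

With these figures the 32 subclusters column peaks at 64 KB clusters
(12569 IOPS), the 8 subclusters column peaks at 32 KB (13038 IOPS), and
512 KB with 32 subclusters is the only interior minimum in any column.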

Kevin
