From: Kevin Wolf <kwolf@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Alberto Garcia <berto@igalia.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation
Date: Fri, 7 Apr 2017 15:01:29 +0200
Message-ID: <20170407130129.GE4716@noname.redhat.com>
In-Reply-To: <20170407122021.GP13602@stefanha-x1.localdomain>

On 07.04.2017 at 14:20, Stefan Hajnoczi wrote:
> On Thu, Apr 06, 2017 at 06:01:48PM +0300, Alberto Garcia wrote:
> > Here are the results (subcluster size in brackets):
> > 
> > |-----------------+----------------+-----------------+-------------------|
> > |  cluster size   | subclusters=on | subclusters=off | Max L2 cache size |
> > |-----------------+----------------+-----------------+-------------------|
> > |   2 MB (256 KB) |   440 IOPS     |  100 IOPS       | 160 KB (*)        |
> > | 512 KB  (64 KB) |  1000 IOPS     |  300 IOPS       | 640 KB            |
> > |  64 KB   (8 KB) |  3000 IOPS     | 1000 IOPS       |   5 MB            |
> > |  32 KB   (4 KB) | 12000 IOPS     | 1300 IOPS       |  10 MB            |
> > |   4 KB  (512 B) |   100 IOPS     |  100 IOPS       |  80 MB            |
> > |-----------------+----------------+-----------------+-------------------|
> > 
> >                 (*) The L2 cache must be a multiple of the cluster
> >                     size, so in this case it must be 2MB. On the table
> >                     I chose to show how much of those 2MB are actually
> >                     used so you can compare it with the other cases.
> > 
> > Some comments about the results:
> > 
> > - For the 64KB, 512KB and 2MB cases, having subclusters increases
> >   write performance roughly by three. This happens because for each
> >   cluster allocation there's less data to copy from the backing
> >   image. For the same reason, the smaller the cluster, the better the
> >   performance. As expected, 64KB clusters with no subclusters perform
> >   roughly the same as 512KB clusters with 64KB subclusters.
> > 
> > - The 32KB case is the most interesting one. Without subclusters it's
> >   not very different from the 64KB case, but having a subcluster with
> >   the same size of the I/O block eliminates the need for COW entirely
> >   and the performance skyrockets (10 times faster!).
> > 
> > - 4KB is however very slow. I attribute this to the fact that the
> >   cluster size is so small that a new cluster needs to be allocated
> >   for every single write and its refcount updated accordingly. The L2
> >   and refcount tables are also so small that they are too inefficient
> >   and need to grow all the time.
> > 
> > Here are the results when writing to an empty 40GB qcow2 image with no
> > backing file. The numbers are of course different but as you can see
> > the patterns are similar:
> > 
> > |-----------------+----------------+-----------------+-------------------|
> > |  cluster size   | subclusters=on | subclusters=off | Max L2 cache size |
> > |-----------------+----------------+-----------------+-------------------|
> > |   2 MB (256 KB) |  1200 IOPS     |  255 IOPS       | 160 KB            |
> > | 512 KB  (64 KB) |  3000 IOPS     |  700 IOPS       | 640 KB            |
> > |  64 KB   (8 KB) |  7200 IOPS     | 3300 IOPS       |   5 MB            |
> > |  32 KB   (4 KB) | 12300 IOPS     | 4200 IOPS       |  10 MB            |
> > |   4 KB  (512 B) |   100 IOPS     |  100 IOPS       |  80 MB            |
> > |-----------------+----------------+-----------------+-------------------|
> 
> I don't understand why subclusters=on performs so much better when
> there's no backing file.  Is qcow2 zeroing out the 64 KB cluster with
> subclusters=off?
> 
> It ought to just write the 4 KB data when a new cluster is touched.
> Therefore the performance should be very similar to subclusters=on.

No, it can't do that. Nothing guarantees that the cluster contains only
zeros unless we write them ourselves. The cluster could have been used
before and then freed on the qcow2 level, or the image could be sitting
on a block device rather than a file.
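
To put numbers on this, here is a rough sketch (illustrative Python, not
QEMU code; the function name `cow_regions` and the chosen offsets are
assumptions made up for this example) of the byte ranges that still have
to be filled when a 4 KB guest write lands in a newly allocated 64 KB
cluster:

```python
def cow_regions(write_offset, write_len, cluster_size):
    """Return the (head, tail) byte ranges of a cluster not covered by a write.

    Each range is (start, end) with end exclusive, in absolute image offsets.
    Hypothetical helper for illustration; assumes the write stays inside
    a single cluster.
    """
    cluster_start = write_offset - (write_offset % cluster_size)
    cluster_end = cluster_start + cluster_size
    write_end = write_offset + write_len
    if write_end > cluster_end:
        raise ValueError("write crosses a cluster boundary")
    head = (cluster_start, write_offset)   # must be filled before the guest data
    tail = (write_end, cluster_end)        # must be filled after the guest data
    return head, tail


# A 4 KB write at offset 68 KB into an image with 64 KB clusters.
head, tail = cow_regions(write_offset=68 * 1024,
                         write_len=4 * 1024,
                         cluster_size=64 * 1024)
fill_bytes = (head[1] - head[0]) + (tail[1] - tail[0])
print(head, tail, fill_bytes)
```

In this example 60 KB out of the 64 KB cluster is padding that has to be
written explicitly, because the old contents of the cluster are unknown.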

One optimisation that would be possible even without subclusters is
making only a single I/O request to write the whole cluster instead of
three of them (COW head, guest write, COW tail). Without a backing file,
this improved performance almost to the level of rewrites, but it
couldn't solve the problem when a backing file was used (which is the
main use case for qcow2), so I never got to submitting a patch for it.
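
The idea can be sketched roughly as follows (illustrative Python, not the
unsubmitted patch and not QEMU code; `build_cluster_write` and its
parameters are hypothetical names). Instead of issuing head, guest data
and tail as three writes, one cluster-sized buffer is assembled and
written in a single request:

```python
def build_cluster_write(cluster_size, offset_in_cluster, guest_data, backing=None):
    """Assemble one cluster-sized buffer: COW head + guest data + COW tail.

    Hypothetical helper for illustration. `backing` is the old content of
    the cluster as read from a backing file, or None when there is no
    backing file, in which case head and tail are zero-filled.
    """
    end = offset_in_cluster + len(guest_data)
    if offset_in_cluster < 0 or end > cluster_size:
        raise ValueError("guest write does not fit inside the cluster")
    if backing is None:
        buf = bytearray(cluster_size)      # zero-initialised head and tail
    else:
        if len(backing) != cluster_size:
            raise ValueError("backing data must cover the whole cluster")
        buf = bytearray(backing)           # head and tail come from the backing file
    buf[offset_in_cluster:end] = guest_data
    return bytes(buf)                      # submitted as a single write request


# 4 KB guest write at offset 4 KB inside a 64 KB cluster, no backing file.
payload = build_cluster_write(64 * 1024, 4 * 1024, b"\xab" * 4096)
print(len(payload))
```

With a backing file, the `backing` bytes still have to be read before the
buffer can be built, so merging the writes alone does not remove the copy
cost in that case.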

Kevin
