qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH 0/7] qcow2: async handling of fragmented io
Date: Mon, 20 Aug 2018 18:39:26 +0200	[thread overview]
Message-ID: <60e47db0-873a-56e0-4c28-faa44896526f@redhat.com> (raw)
In-Reply-To: <b4feb9e3-2bad-4a69-12ac-83233791a3dc@virtuozzo.com>

[-- Attachment #1: Type: text/plain, Size: 5529 bytes --]

On 2018-08-20 18:33, Vladimir Sementsov-Ogievskiy wrote:
> 17.08.2018 22:34, Max Reitz wrote:
>> On 2018-08-16 15:58, Vladimir Sementsov-Ogievskiy wrote:
>>> 16.08.2018 03:51, Max Reitz wrote:
>>>> On 2018-08-07 19:43, Vladimir Sementsov-Ogievskiy wrote:
>>>>> Hi all!
>>>>>
>>>>> Here is an asynchronous scheme for handling fragmented qcow2
>>>>> reads and writes. Both qcow2 read and write functions loops through
>>>>> sequential portions of data. The series aim it to parallelize these
>>>>> loops iterations.
>>>>>
>>>>> It improves performance for fragmented qcow2 images, I've tested it
>>>>> as follows:
>>>>>
>>>>> I have four 4G qcow2 images (with default 64k block size) on my ssd
>>>>> disk:
>>>>> t-seq.qcow2 - sequentially written qcow2 image
>>>>> t-reverse.qcow2 - filled by writing 64k portions from end to the start
>>>>> t-rand.qcow2 - filled by writing 64k portions (aligned) in random
>>>>> order
>>>>> t-part-rand.qcow2 - filled by shuffling order of 64k writes in 1m
>>>>> clusters
>>>>> (see source code of image generation in the end for details)
>>>>>
>>>>> and the test (sequential io by 1mb chunks):
>>>>>
>>>>> test write:
>>>>>      for t in /ssd/t-*; \
>>>>>          do sync; echo 1 > /proc/sys/vm/drop_caches; echo ===  $t 
>>>>> ===; \
>>>>>          ./qemu-img bench -c 4096 -d 1 -f qcow2 -n -s 1m -t none -w
>>>>> $t; \
>>>>>      done
>>>>>
>>>>> test read (same, just drop -w parameter):
>>>>>      for t in /ssd/t-*; \
>>>>>          do sync; echo 1 > /proc/sys/vm/drop_caches; echo ===  $t 
>>>>> ===; \
>>>>>          ./qemu-img bench -c 4096 -d 1 -f qcow2 -n -s 1m -t none $t; \
>>>>>      done
>>>>>
>>>>> short info about parameters:
>>>>>    -w - do writes (otherwise do reads)
>>>>>    -c - count of blocks
>>>>>    -s - block size
>>>>>    -t none - disable cache
>>>>>    -n - native aio
>>>>>    -d 1 - don't use parallel requests provided by qemu-img bench
>>>>> itself
>>>> Hm, actually, why not?  And how does a guest behave?
>>>>
>>>> If parallel requests on an SSD perform better, wouldn't a guest issue
>>>> parallel requests to the virtual device and thus to qcow2 anyway?
>>> Guest knows nothing about qcow2 fragmentation, so this kind of
>>> "asynchronization" could be done only at qcow2 level.
>> Hm, yes.  I'm sorry, but without having looked closer at the series
>> (which is why I'm sorry in advance), I would suspect that the
>> performance improvement comes from us being able to send parallel
>> requests to an SSD.
>>
>> So if you send large requests to an SSD, you may either send them in
>> parallel or sequentially, it doesn't matter.  But for small requests,
>> it's better to send them in parallel so the SSD always has requests in
>> its queue.
>>
>> I would think this is where the performance improvement comes from.  But
>> I would also think that a guest OS knows this and it would also send
>> many requests in parallel so the virtual block device never runs out of
>> requests.
>>
>>> However, if guest do async io, send a lot of parallel requests, it
>>> behave like qemu-img without -d 1 option, and in this case,
>>> parallel loop iterations in qcow2 doesn't have such great sense.
>>> However, I think that async parallel requests are better in
>>> general than sequential, because if device have some unused opportunity
>>> of parallelization, it will be utilized.
>> I agree that it probably doesn't make things worse performance-wise, but
>> it's always added complexity (see the diffstat), which is why I'm just
>> routinely asking how useful it is in practice. :-)
>>
>> Anyway, I suspect there are indeed cases where a guest doesn't send many
>> requests in parallel but it makes sense for the qcow2 driver to
>> parallelize it.  That would be mainly when the guest reads seemingly
>> sequential data that is then fragmented in the qcow2 file.  So basically
>> what your benchmark is testing. :-)
>>
>> Then, the guest could assume that there is no sense in parallelizing it
>> because the latency from the device is large enough, whereas in qemu
>> itself we always run dry and wait for different parts of the single
>> large request to finish.  So, yes, in that case, parallelization that's
>> internal to qcow2 would make sense.
>>
>> Now another question is, does this negatively impact devices where
>> seeking is slow, i.e. HDDs?  Unfortunately I'm not home right now, so I
>> don't have access to an HDD to test myself...
> 
> 
> hdd:
> 
> +-----------+-----------+----------+-----------+----------+
> |   file    | wr before | wr after | rd before | rd after |
> +-----------+-----------+----------+-----------+----------+
> | seq       |    39.821 |   40.513 |    38.600 |   38.916 |
> | reverse   |    60.320 |   57.902 |    98.223 |  111.717 |
> | rand      |   614.826 |  580.452 |   672.600 |  465.120 |
> | part-rand |    52.311 |   52.450 |    37.663 |   37.989 |
> +-----------+-----------+----------+-----------+----------+
> 
> hmm. 10% degradation on "reverse" case, strange magic.. However reverse
> is near to impossible.

I tend to agree.  It's faster for random, and that's what matters more.

(Distinguishing between the cases in qcow2 seems like not so good of an
idea, and making it user-configurable is probably pointless because
noone will change the default.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

      reply	other threads:[~2018-08-20 16:39 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-07 17:43 [Qemu-devel] [PATCH 0/7] qcow2: async handling of fragmented io Vladimir Sementsov-Ogievskiy
2018-08-07 17:43 ` [Qemu-devel] [PATCH 1/7] qcow2: move qemu_co_mutex_lock below decryption procedure Vladimir Sementsov-Ogievskiy
     [not found]   ` <8e1cc18c-307f-99b1-5892-713ebd17a15f@redhat.com>
     [not found]     ` <43277786-b6b9-e18c-b0ca-064ff7c9c0c9@redhat.com>
2018-10-01 15:56       ` Vladimir Sementsov-Ogievskiy
2018-08-07 17:43 ` [Qemu-devel] [PATCH 2/7] qcow2: bdrv_co_pwritev: move encryption code out of lock Vladimir Sementsov-Ogievskiy
2018-08-07 17:43 ` [Qemu-devel] [PATCH 3/7] qcow2: split out reading normal clusters from qcow2_co_preadv Vladimir Sementsov-Ogievskiy
     [not found]   ` <6e19aaeb-8acc-beb9-5ece-9ae6101637a9@redhat.com>
2018-10-01 15:14     ` Vladimir Sementsov-Ogievskiy
2018-10-01 15:39       ` Max Reitz
2018-10-01 16:00         ` Vladimir Sementsov-Ogievskiy
2018-11-01 12:17     ` Vladimir Sementsov-Ogievskiy
2018-11-07 13:51       ` Max Reitz
2018-11-07 18:16       ` Kevin Wolf
2018-11-08 10:02         ` Vladimir Sementsov-Ogievskiy
2018-11-08 10:33           ` Kevin Wolf
2018-11-08 12:36             ` Vladimir Sementsov-Ogievskiy
2018-08-07 17:43 ` [Qemu-devel] [PATCH 4/7] qcow2: async scheme for qcow2_co_preadv Vladimir Sementsov-Ogievskiy
     [not found]   ` <08a610aa-9c78-1c83-5e48-b93080aac87b@redhat.com>
2018-10-01 15:33     ` Vladimir Sementsov-Ogievskiy
2018-10-01 15:49       ` Max Reitz
2018-10-01 16:17         ` Vladimir Sementsov-Ogievskiy
2018-08-07 17:43 ` [Qemu-devel] [PATCH 5/7] qcow2: refactor qcow2_co_pwritev: split out qcow2_co_do_pwritev Vladimir Sementsov-Ogievskiy
     [not found]   ` <5c871ce7-2cab-f897-0b06-cbc05b9ffe97@redhat.com>
2018-10-01 15:43     ` Vladimir Sementsov-Ogievskiy
2018-10-01 15:50       ` Max Reitz
2018-08-07 17:43 ` [Qemu-devel] [PATCH 6/7] qcow2: refactor qcow2_co_pwritev locals scope Vladimir Sementsov-Ogievskiy
2018-08-07 17:43 ` [Qemu-devel] [PATCH 7/7] qcow2: async scheme for qcow2_co_pwritev Vladimir Sementsov-Ogievskiy
     [not found]   ` <1c8299bf-0b31-82a7-c7c4-5069581f2d94@redhat.com>
2018-10-01 15:46     ` Vladimir Sementsov-Ogievskiy
2018-08-16  0:51 ` [Qemu-devel] [PATCH 0/7] qcow2: async handling of fragmented io Max Reitz
2018-08-16 13:58   ` Vladimir Sementsov-Ogievskiy
2018-08-17 19:34     ` Max Reitz
2018-08-17 19:43       ` Denis V. Lunev
2018-08-20 16:33       ` Vladimir Sementsov-Ogievskiy
2018-08-20 16:39         ` Max Reitz [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=60e47db0-873a-56e0-4c28-faa44896526f@redhat.com \
    --to=mreitz@redhat.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).