* [Qemu-devel] [PATCH] qcow2: improve I/O performance with cache=off
@ 2008-06-20 14:38 Laurent Vivier
2008-06-23 2:50 ` Avi Kivity
2008-06-24 15:40 ` Kevin Wolf
0 siblings, 2 replies; 6+ messages in thread
From: Laurent Vivier @ 2008-06-20 14:38 UTC (permalink / raw)
To: qemu-devel@nongnu.org
[-- Attachment #1: Type: text/plain, Size: 670 bytes --]
Hi,
this patch improves qcow2 I/O performance when used with cache=off.
It modifies qcow_aio_[read|write]_cb() to read or write as many
contiguous clusters as possible per bdrv_aio_[read|write]() call.
I ran some tests with dbench:
                              WITHOUT PATCH    WITH PATCH
ide, cache=off,snapshot=off   20.8494 MB/sec   24.0711 MB/sec
ide, cache=off,snapshot=on    20.9349 MB/sec   24.5031 MB/sec
scsi,cache=off,snapshot=off   21.0264 MB/sec   24.6119 MB/sec
scsi,cache=off,snapshot=on    21.4184 MB/sec   24.6739 MB/sec
The gain is approximately 15%.
Regards,
Laurent
--
------------- Laurent.Vivier@bull.net ---------------
"The best way to predict the future is to invent it."
- Alan Kay
[-- Attachment #2: qcow2.patch --]
[-- Type: text/x-patch, Size: 3961 bytes --]
Index: qemu/block-qcow2.c
===================================================================
--- qemu.orig/block-qcow2.c 2008-06-19 14:38:59.000000000 +0200
+++ qemu/block-qcow2.c 2008-06-20 14:23:35.000000000 +0200
@@ -808,6 +808,8 @@ static void qcow_aio_read_cb(void *opaqu
BlockDriverState *bs = acb->common.bs;
BDRVQcowState *s = bs->opaque;
int index_in_cluster, n1;
+ uint64_t next;
+ int n;
acb->hd_aiocb = NULL;
if (ret < 0) {
@@ -846,11 +848,22 @@ static void qcow_aio_read_cb(void *opaqu
acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9,
0, 0, 0, 0);
index_in_cluster = acb->sector_num & (s->cluster_sectors - 1);
- acb->n = s->cluster_sectors - index_in_cluster;
- if (acb->n > acb->nb_sectors)
- acb->n = acb->nb_sectors;
if (!acb->cluster_offset) {
+ /* seek how many clusters we must read from the base image */
+ n = s->cluster_sectors;
+ while (n < acb->nb_sectors + index_in_cluster) {
+ next = get_cluster_offset(bs, (acb->sector_num + n) << 9,
+ 0, 0, 0, 0);
+ if (next)
+ break;
+ n += s->cluster_sectors;
+ }
+ n -= index_in_cluster;
+ if (n > acb->nb_sectors)
+ n = acb->nb_sectors;
+ acb->n = n;
+
if (bs->backing_hd) {
/* read from the base image */
n1 = backing_read1(bs->backing_hd, acb->sector_num,
@@ -869,6 +882,9 @@ static void qcow_aio_read_cb(void *opaqu
goto redo;
}
} else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) {
+ acb->n = s->cluster_sectors - index_in_cluster;
+ if (acb->n > acb->nb_sectors)
+ acb->n = acb->nb_sectors;
/* add AIO support for compressed blocks ? */
if (decompress_cluster(s, acb->cluster_offset) < 0)
goto fail;
@@ -880,6 +896,22 @@ static void qcow_aio_read_cb(void *opaqu
ret = -EIO;
goto fail;
}
+
+ /* seek how many clusters we can read */
+
+ n = s->cluster_sectors;
+ while (n < acb->nb_sectors + index_in_cluster) {
+ next = get_cluster_offset(bs, (acb->sector_num + n) << 9,
+ 0, 0, 0, 0);
+ if (next != acb->cluster_offset + (n << 9))
+ break;
+ n += s->cluster_sectors;
+ }
+ n -= index_in_cluster;
+ if (n > acb->nb_sectors)
+ n = acb->nb_sectors;
+ acb->n = n;
+
acb->hd_aiocb = bdrv_aio_read(s->hd,
(acb->cluster_offset >> 9) + index_in_cluster,
acb->buf, acb->n, qcow_aio_read_cb, acb);
@@ -928,6 +960,9 @@ static void qcow_aio_write_cb(void *opaq
int index_in_cluster;
uint64_t cluster_offset;
const uint8_t *src_buf;
+ uint64_t next;
+ int n;
+ int alloc;
acb->hd_aiocb = NULL;
@@ -972,6 +1007,25 @@ static void qcow_aio_write_cb(void *opaq
acb->n, 1, &s->aes_encrypt_key);
src_buf = acb->cluster_data;
} else {
+
+ /* seek how many clusters we can write */
+
+ n = s->cluster_sectors;
+ while(n < acb->nb_sectors + index_in_cluster) {
+ alloc = s->cluster_sectors;
+ if (n + alloc > acb->nb_sectors + index_in_cluster)
+ alloc = acb->nb_sectors + index_in_cluster - n;
+ next = get_cluster_offset(bs, (acb->sector_num + n) << 9,
+ 1, 0, 0, alloc);
+ if (next != cluster_offset + (n << 9))
+ break;
+ n += alloc;
+ }
+ n -= index_in_cluster;
+ if (n > acb->nb_sectors)
+ n = acb->nb_sectors;
+ acb->n = n;
+
src_buf = acb->buf;
}
acb->hd_aiocb = bdrv_aio_write(s->hd,
* Re: [Qemu-devel] [PATCH] qcow2: improve I/O performance with cache=off
2008-06-20 14:38 [Qemu-devel] [PATCH] qcow2: improve I/O performance with cache=off Laurent Vivier
@ 2008-06-23 2:50 ` Avi Kivity
2008-06-24 15:40 ` Kevin Wolf
1 sibling, 0 replies; 6+ messages in thread
From: Avi Kivity @ 2008-06-23 2:50 UTC (permalink / raw)
To: qemu-devel
Laurent Vivier wrote:
> Hi,
>
> this patch improves qcow2 I/O performance when used with cache=off.
>
> It modifies qcow_aio_[read|write]_cb() to read or write as many
> contiguous clusters as possible per bdrv_aio_[read|write]() call.
>
>
Write allocation is still slow, though. To fix this, I think the
only way is to modify get_cluster_offset() to accept and return an extent.
> I ran some tests with dbench:
>
>                               WITHOUT PATCH    WITH PATCH
>
> ide, cache=off,snapshot=off   20.8494 MB/sec   24.0711 MB/sec
> ide, cache=off,snapshot=on    20.9349 MB/sec   24.5031 MB/sec
> scsi,cache=off,snapshot=off   21.0264 MB/sec   24.6119 MB/sec
> scsi,cache=off,snapshot=on    21.4184 MB/sec   24.6739 MB/sec
>
> The gain is approximately 15%.
>
I don't think dbench is a good test for this as it involves the guest
pagecache. 'dd' or 'fio' would be much more indicative.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [Qemu-devel] [PATCH] qcow2: improve I/O performance with cache=off
2008-06-20 14:38 [Qemu-devel] [PATCH] qcow2: improve I/O performance with cache=off Laurent Vivier
2008-06-23 2:50 ` Avi Kivity
@ 2008-06-24 15:40 ` Kevin Wolf
2008-06-24 16:40 ` Laurent Vivier
1 sibling, 1 reply; 6+ messages in thread
From: Kevin Wolf @ 2008-06-24 15:40 UTC (permalink / raw)
To: Laurent Vivier; +Cc: qemu-devel
Hi Laurent,
Laurent Vivier wrote:
> this patch improves qcow2 I/O performance when used with cache=off.
Why do you think this patch helps only for cache=off? I have applied
your patch to Xen ioemu (which has no cache=off / O_DIRECT yet) and
I certainly do see a performance gain for large block sizes (using dd).
With small block sizes like 512 bytes or 1k I lose a bit of performance,
though.
bonnie++ shows slightly better numbers with this patch, too. In the
case of block reads the improvement is huge and I even got double
throughput.
I also had a look at your code and it seems fine to me. (Except that
the aio callback handlers become even longer, but that is a different
problem...)
Kevin
* Re: [Qemu-devel] [PATCH] qcow2: improve I/O performance with cache=off
2008-06-24 15:40 ` Kevin Wolf
@ 2008-06-24 16:40 ` Laurent Vivier
2008-06-25 8:43 ` Kevin Wolf
0 siblings, 1 reply; 6+ messages in thread
From: Laurent Vivier @ 2008-06-24 16:40 UTC (permalink / raw)
To: Kevin Wolf; +Cc: qemu-devel
On Tuesday, 24 June 2008 at 17:40 +0200, Kevin Wolf wrote:
> Hi Laurent,
Hi Kevin,
> Laurent Vivier wrote:
> > this patch improves qcow2 I/O performance when used with cache=off.
>
> Why do you think this patch helps only for cache=off? I have applied
> your patch to Xen ioemu (which has no cache=off / O_DIRECT yet) and
> I certainly do see a performance gain for large block sizes (using dd).
> With small block sizes like 512 bytes or 1k I lose a bit of performance,
> though.
In fact, I ran some tests with dbench and the cache=on results were not
as good as the cache=off ones, which is why I only mentioned cache=off.
But as Avi said, dbench is not a good benchmark for this...
                              WITHOUT          WITH
ide, cache=off,snapshot=off   20.8494 MB/sec   24.0711 MB/sec
ide, cache=off,snapshot=on    20.9349 MB/sec   24.5031 MB/sec
ide, cache=on, snapshot=off   23.6612 MB/sec   24.7186 MB/sec
ide, cache=on, snapshot=on    24.1836 MB/sec   24.7678 MB/sec
scsi,cache=off,snapshot=off   21.0264 MB/sec   24.6119 MB/sec
scsi,cache=off,snapshot=on    21.4184 MB/sec   24.6739 MB/sec
scsi,cache=on, snapshot=off   25.1483 MB/sec   24.8600 MB/sec
scsi,cache=on, snapshot=on    25.2000 MB/sec   25.2758 MB/sec
> bonnie++ shows slightly better numbers with this patch, too. In the
> case of block reads the improvement is huge and I even got double
> throughput.
> I also had a look at your code and it seems fine to me. (Except that
> the aio callback handlers become even longer, but that is a different
> problem...)
I'll modify this patch according to Avi's comments and repost it.
Thank you for your comments.
Regards,
Laurent
--
------------- Laurent.Vivier@bull.net ---------------
"The best way to predict the future is to invent it."
- Alan Kay
* Re: [Qemu-devel] [PATCH] qcow2: improve I/O performance with cache=off
2008-06-24 16:40 ` Laurent Vivier
@ 2008-06-25 8:43 ` Kevin Wolf
2008-06-25 8:59 ` Laurent Vivier
0 siblings, 1 reply; 6+ messages in thread
From: Kevin Wolf @ 2008-06-25 8:43 UTC (permalink / raw)
To: Laurent Vivier; +Cc: qemu-devel
Hi Laurent,
Laurent Vivier wrote:
> I'll modify this patch according to Avi's comments and repost it.
>
> Thank you for your comments.
It's certainly a good idea to add the change suggested by Avi. But don't
you think you'd better make a second patch out of it, on top of this
one? I feel that the patch to get_cluster_offset might not be too small
for itself.
Oh, and while you're at it... ;-) I think this function could use at
least some more comments, and splitting out some parts as new functions
could also help with its readability.
Kevin
* Re: [Qemu-devel] [PATCH] qcow2: improve I/O performance with cache=off
2008-06-25 8:43 ` Kevin Wolf
@ 2008-06-25 8:59 ` Laurent Vivier
0 siblings, 0 replies; 6+ messages in thread
From: Laurent Vivier @ 2008-06-25 8:59 UTC (permalink / raw)
To: Kevin Wolf; +Cc: qemu-devel
On Wednesday, 25 June 2008 at 10:43 +0200, Kevin Wolf wrote:
> Hi Laurent,
>
> Laurent Vivier wrote:
> > I'll modify this patch according to Avi's comments and repost it.
> >
> > Thank you for your comments.
>
> It's certainly a good idea to add the change suggested by Avi. But don't
> you think you'd better make a second patch out of it, on top of this
> one? I suspect that the change to get_cluster_offset might be sizable
> on its own.
Yes, you're right. So I'll resend this patch as part of a series.
> Oh, and while you're at it... ;-) I think this function could use at
> least some more comments, and splitting out some parts as new functions
> could also help with its readability.
I'll think about that. But adding more functions can also hurt
readability.
Thank you,
Laurent
--
------------- Laurent.Vivier@bull.net ---------------
"The best way to predict the future is to invent it."
- Alan Kay