From: Ryan Harper <ryanh@us.ibm.com>
To: Laurent Vivier <laurent@lvivier.info>
Cc: Chris Wright <chrisw@redhat.com>,
Mark McLoughlin <markmc@redhat.com>,
kvm-devel <kvm-devel@lists.sourceforge.net>,
Laurent Vivier <Laurent.Vivier@bull.net>,
qemu-devel@nongnu.org, Ryan Harper <ryanh@us.ibm.com>
Subject: Re: [Qemu-devel] Re: [RFC] Disk integrity in QEMU
Date: Mon, 13 Oct 2008 14:43:28 -0500
Message-ID: <20081013194328.GJ21410@us.ibm.com>
In-Reply-To: <6A99DBA5-D422-447D-BF9D-019FB394E6C6@lvivier.info>
* Laurent Vivier <laurent@lvivier.info> [2008-10-13 13:52]:
>
> On 13 Oct 2008, at 19:06, Ryan Harper wrote:
>
> >* Anthony Liguori <anthony@codemonkey.ws> [2008-10-09 12:00]:
> >>Read performance should be unaffected by using O_DSYNC. O_DIRECT will
> >>significantly reduce read performance. I think we should use O_DSYNC by
> >>default and I have sent out a patch that contains that. We will follow
> >>up with benchmarks to demonstrate this.
> >
>
> Hi Ryan,
>
> as "cache=on" implies a factor (memory) shared by the whole system,
> you must take into account the size of the host memory and run some
> applications (several guests ?) to pollute the host cache, for
> instance you can run 4 guests and run the bench in each of them
> concurrently, and you could reasonably limit the size of the host
> memory to 5 x the size of the guest memory.
> (for instance 4 guests with 128 MB on a host with 768 MB).
I'm not following you here. The only assumption I see is that we have 1g
of host memory free for caching the write.
>
> as O_DSYNC implies journal commit, you should run a bench on the ext3
> host file system concurrently to the bench on a guest to see the
> impact of the commit on each bench.
I understand the goal here, but what sort of host ext3 journaling load
is appropriate? Additionally, when we're exporting block devices, I
don't believe the ext3 journal is an issue.
>
> >
> >baremetal baseline (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >type, block size, iface    | MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >write, 16k, lvm, direct=1  | 127.7 |    12 |        11.66 |       9.48 |
> >write, 64k, lvm, direct=1  | 178.4 |     5 |        13.65 |      27.15 |
> >write, 1M, lvm, direct=1   | 186.0 |     3 |       163.75 |     416.91 |
> >---------------------------+-------+-------+--------------+------------+
> >read , 16k, lvm, direct=1  | 170.4 |    15 |        10.86 |       7.10 |
> >read , 64k, lvm, direct=1  | 199.2 |     5 |        12.52 |      24.31 |
> >read , 1M, lvm, direct=1   | 202.0 |     3 |       133.74 |     382.67 |
> >---------------------------+-------+-------+--------------+------------+
> >
>
> Could you recall which benchmark you use ?
yeah:
fio --name=guestrun --filename=/dev/vda --rw=write --bs=${SIZE}
--ioengine=libaio --direct=1 --norandommap --numjobs=1 --group_reporting
--thread --size=1g --write_lat_log --write_bw_log --iodepth=74
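
For anyone repeating the runs: the command above takes its block size from ${SIZE}, and the tables cover 16k, 64k and 1M. Below is a minimal Python sketch that builds the same argument list for each size. The helper name fio_argv and the rw/filename parameters are hypothetical conveniences, not part of the original setup. The script only prints the commands, because running them against /dev/vda overwrites the device.

```python
import shlex

# Block sizes that appear in the result tables.
BLOCK_SIZES = ["16k", "64k", "1M"]


def fio_argv(block_size, rw="write", filename="/dev/vda"):
    """Return the fio argument list from the message for one block size.

    Only --bs, --rw and --filename vary; every other option is copied
    from the command line quoted in the message.
    """
    return [
        "fio",
        "--name=guestrun",
        f"--filename={filename}",
        f"--rw={rw}",
        f"--bs={block_size}",
        "--ioengine=libaio",
        "--direct=1",
        "--norandommap",
        "--numjobs=1",
        "--group_reporting",
        "--thread",
        "--size=1g",
        "--write_lat_log",
        "--write_bw_log",
        "--iodepth=74",
    ]


if __name__ == "__main__":
    for bs in BLOCK_SIZES:
        # Print instead of executing: a write run destroys data on the target.
        print(shlex.join(fio_argv(bs)))
```

Passing rw="read" is an assumption about how the read rows were produced; the message only shows the write invocation.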
>
> >kvm write (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >block size,iface,cache,sync| MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >16k,virtio,off,none        | 135.0 |    94 |          9.1 |       8.71 |
> >16k,virtio,on ,none        | 184.0 |   100 |        63.69 |      63.48 |
> >16k,virtio,on ,O_DSYNC     | 150.0 |    35 |         6.63 |       8.31 |
> >---------------------------+-------+-------+--------------+------------+
> >64k,virtio,off,none        | 169.0 |    51 |        17.10 |      28.00 |
> >64k,virtio,on ,none        | 189.0 |    60 |        69.42 |      24.92 |
> >64k,virtio,on ,O_DSYNC     | 171.0 |    48 |        18.83 |      27.72 |
> >---------------------------+-------+-------+--------------+------------+
> >1M ,virtio,off,none        | 142.0 |    30 |      7176.00 |     523.00 |
> >1M ,virtio,on ,none        | 190.0 |    45 |      5332.63 |     392.35 |
> >1M ,virtio,on ,O_DSYNC     | 164.0 |    39 |      6444.48 |     471.20 |
> >---------------------------+-------+-------+--------------+------------+
>
> According to the semantics, I don't understand how O_DSYNC can be
> better than cache=off in this case...
I don't have a good answer either, but O_DIRECT and O_DSYNC are
different paths through the kernel. This deserves a better reply, but
I don't have one off the top of my head.
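
To make the three table rows concrete, here is a rough Python sketch of how each (cache, sync) combination could translate into open(2) flags on a Linux host. It rests only on what this thread states (cache=off is O_DIRECT, and the third row adds O_DSYNC). The function name open_flags is hypothetical, and this is not QEMU's actual code.

```python
import os


def open_flags(cache, sync):
    """Map a (cache, sync) table row to open(2) flags. Illustrative only.

    cache="off"                 -> O_DIRECT: bypass the host page cache
    cache="on", sync="none"     -> plain buffered I/O through the host cache
    cache="on", sync="O_DSYNC"  -> buffered, but each write returns only
                                   after the data has reached the device
    """
    flags = os.O_RDWR
    if cache == "off":
        # O_DIRECT is not defined on every platform; fall back to 0 there.
        flags |= getattr(os, "O_DIRECT", 0)
    elif sync == "O_DSYNC":
        flags |= getattr(os, "O_DSYNC", 0)
    return flags


if __name__ == "__main__":
    for cache, sync in [("off", "none"), ("on", "none"), ("on", "O_DSYNC")]:
        print(f"{cache:3} {sync:8} -> {open_flags(cache, sync):#o}")
```

The point of the sketch is only that O_DIRECT and O_DSYNC are independent flags: the first skips the page cache in both directions, the second leaves reads cached and changes only when a write is reported complete.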
>
> >
> >kvm read (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >block size,iface,cache,sync| MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >16k,virtio,off,none        | 175.0 |    40 |        22.42 |       6.71 |
> >16k,virtio,on ,none        | 211.0 |   147 |        59.49 |       5.54 |
> >16k,virtio,on ,O_DSYNC     | 212.0 |   145 |        60.45 |       5.47 |
> >---------------------------+-------+-------+--------------+------------+
> >64k,virtio,off,none        | 190.0 |    64 |        16.31 |      24.92 |
> >64k,virtio,on ,none        | 546.0 |   161 |       111.06 |       8.54 |
> >64k,virtio,on ,O_DSYNC     | 520.0 |   151 |       116.66 |       8.97 |
> >---------------------------+-------+-------+--------------+------------+
> >1M ,virtio,off,none        | 182.0 |    32 |      5573.44 |     407.21 |
> >1M ,virtio,on ,none        | 750.0 |   127 |      1344.65 |      96.42 |
> >1M ,virtio,on ,O_DSYNC     | 768.0 |   123 |      1289.05 |      94.25 |
> >---------------------------+-------+-------+--------------+------------+
>
> OK, but in this case the size of the cache for "cache=off" is the size
> of the guest cache whereas in the other cases the size of the cache is
> the size of the guest cache + the size of the host cache, this is not
> fair...
It isn't supposed to be fair. cache=off is O_DIRECT, so we're reading from
the device. We *want* to be able to lean on the host cache to read the
data: pay once and benefit in other guests if possible.
>
> >
> >--------------------------------------------------------------------------
> >exporting file in ext3 filesystem as block device (1g)
> >--------------------------------------------------------------------------
> >
> >kvm write (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >block size,iface,cache,sync| MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >16k,virtio,off,none        |  12.1 |    15 |          9.1 |       8.71 |
> >16k,virtio,on ,none        | 192.0 |    52 |        62.52 |       6.17 |
> >16k,virtio,on ,O_DSYNC     | 142.0 |    59 |        18.81 |       8.29 |
> >---------------------------+-------+-------+--------------+------------+
> >64k,virtio,off,none        |  15.5 |     8 |        21.10 |     311.00 |
> >64k,virtio,on ,none        | 454.0 |   130 |       113.25 |      10.65 |
> >64k,virtio,on ,O_DSYNC     | 154.0 |    48 |        20.25 |      30.75 |
> >---------------------------+-------+-------+--------------+------------+
> >1M ,virtio,off,none        |  24.7 |     5 |     41736.22 |    3020.08 |
> >1M ,virtio,on ,none        | 485.0 |   100 |      2052.09 |     149.81 |
> >1M ,virtio,on ,O_DSYNC     | 161.0 |    42 |      6268.84 |     453.84 |
> >---------------------------+-------+-------+--------------+------------+
>
> What file type do you use (qcow2, raw ?).
Raw.
--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@us.ibm.com