qemu-devel.nongnu.org archive mirror
From: Ryan Harper <ryanh@us.ibm.com>
To: Laurent Vivier <laurent@lvivier.info>
Cc: Chris Wright <chrisw@redhat.com>,
	Mark McLoughlin <markmc@redhat.com>,
	kvm-devel <kvm-devel@lists.sourceforge.net>,
	Laurent Vivier <Laurent.Vivier@bull.net>,
	qemu-devel@nongnu.org, Ryan Harper <ryanh@us.ibm.com>
Subject: Re: [Qemu-devel] Re: [RFC] Disk integrity in QEMU
Date: Mon, 13 Oct 2008 14:43:28 -0500
Message-ID: <20081013194328.GJ21410@us.ibm.com>
In-Reply-To: <6A99DBA5-D422-447D-BF9D-019FB394E6C6@lvivier.info>

* Laurent Vivier <laurent@lvivier.info> [2008-10-13 13:52]:
> 
> On 13 Oct 08, at 19:06, Ryan Harper wrote:
> 
> >* Anthony Liguori <anthony@codemonkey.ws> [2008-10-09 12:00]:
> >Read performance should be unaffected by using O_DSYNC.  O_DIRECT will
> >significantly reduce read performance.  I think we should use O_DSYNC by
> >default and I have sent out a patch that contains that.  We will follow
> >up with benchmarks to demonstrate this.
> >
> 
> Hi Ryan,
> 
> since "cache=on" involves a resource (host memory) that is shared by
> the whole system, you must take the size of the host memory into
> account and run some applications (several guests?) to pollute the
> host cache. For instance, you could run 4 guests with the benchmark
> running concurrently in each of them, and reasonably limit the size
> of the host memory to 5 x the size of the guest memory
> (for instance, 4 guests with 128 MB each on a host with 768 MB).

I'm not following you here; the only assumption I see is that we have 1 GB
of host memory free for caching the write.


> 
> since O_DSYNC implies journal commits, you should run a benchmark on
> the host ext3 filesystem concurrently with the benchmark in the guest,
> to see the impact of the commits on each benchmark.

I understand the goal here, but what sort of host ext3 journaling load
is appropriate?  Additionally, when we're exporting block devices, I
don't believe the ext3 journal is an issue.
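If we did want an fsync-heavy load on the host ext3 filesystem, something
as crude as the loop below would keep the journal busy with commits. This
is only a sketch: the 'journal-load' filename is made up, and it assumes
GNU coreutils and a working directory on the ext3 filesystem under test.

```shell
# Repeatedly overwrite a small file with conv=fsync so every pass forces
# data (and the associated journal commit) out to the device.
for i in $(seq 1 50); do
    dd if=/dev/zero of=journal-load bs=4k count=64 conv=fsync status=none
done
rm -f journal-load
```

Running this on the host while the guest benchmark executes would show how
much the competing journal commits cost each side.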

> 
> >
> >baremetal baseline (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >type, block size, iface    | MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >write, 16k, lvm, direct=1  | 127.7 |  12   |   11.66      |    9.48    |
> >write, 64k, lvm, direct=1  | 178.4 |   5   |   13.65      |   27.15    |
> >write, 1M,  lvm, direct=1  | 186.0 |   3   |  163.75      |  416.91    |
> >---------------------------+-------+-------+--------------+------------+
> >read , 16k, lvm, direct=1  | 170.4 |  15   |   10.86      |    7.10    |
> >read , 64k, lvm, direct=1  | 199.2 |   5   |   12.52      |   24.31    |
> >read , 1M,  lvm, direct=1  | 202.0 |   3   |  133.74      |  382.67    |
> >---------------------------+-------+-------+--------------+------------+
> >
> 
> Could you recall which benchmark you use ?

yeah:

fio --name=guestrun --filename=/dev/vda --rw=write --bs=${SIZE} \
    --ioengine=libaio --direct=1 --norandommap --numjobs=1 \
    --group_reporting --thread --size=1g --write_lat_log \
    --write_bw_log --iodepth=74
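The read rows would then come from the same invocation with the workload
type swapped, i.e. presumably something like:

```shell
fio --name=guestrun --filename=/dev/vda --rw=read --bs=${SIZE} \
    --ioengine=libaio --direct=1 --norandommap --numjobs=1 \
    --group_reporting --thread --size=1g --write_lat_log \
    --write_bw_log --iodepth=74
```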

> 
> >kvm write (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >block size,iface,cache,sync| MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >16k,virtio,off,none        | 135.0 |  94   |    9.1       |    8.71    |
> >16k,virtio,on ,none        | 184.0 | 100   |   63.69      |   63.48    |
> >16k,virtio,on ,O_DSYNC     | 150.0 |  35   |    6.63      |    8.31    |
> >---------------------------+-------+-------+--------------+------------+
> >64k,virtio,off,none        | 169.0 |  51   |   17.10      |   28.00    |
> >64k,virtio,on ,none        | 189.0 |  60   |   69.42      |   24.92    |
> >64k,virtio,on ,O_DSYNC     | 171.0 |  48   |   18.83      |   27.72    |
> >---------------------------+-------+-------+--------------+------------+
> >1M ,virtio,off,none        | 142.0 |  30   |  7176.00     |  523.00    |
> >1M ,virtio,on ,none        | 190.0 |  45   |  5332.63     |  392.35    |
> >1M ,virtio,on ,O_DSYNC     | 164.0 |  39   |  6444.48     |  471.20    |
> >---------------------------+-------+-------+--------------+------------+
> 
> Given the semantics, I don't understand how O_DSYNC can be better
> than cache=off in this case...

I don't have a good answer either; O_DIRECT and O_DSYNC are different
paths through the kernel.  This deserves a better reply, but I don't
have one off the top of my head.
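The two paths are easy to poke at from userspace with dd, which exposes
both flags. This is only an illustration (assuming Linux and GNU
coreutils; 'scratch.bin' is a made-up filename), not QEMU's code:

```shell
# oflag=dsync writes go through the host page cache, but each write(2)
# is committed to the device before it returns -- the cache=on + O_DSYNC
# case.  oflag=direct bypasses the page cache entirely -- the cache=off
# case -- and requires block-aligned transfer sizes; it can also fail
# with EINVAL on filesystems that don't support O_DIRECT.
dd if=/dev/zero of=scratch.bin bs=64k count=16 oflag=dsync status=none
dd if=/dev/zero of=scratch.bin bs=64k count=16 oflag=direct status=none ||
    echo "O_DIRECT unsupported on this filesystem"
rm -f scratch.bin
```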

> 
> >
> >kvm read (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >block size,iface,cache,sync| MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >16k,virtio,off,none        | 175.0 |  40   |   22.42      |    6.71    |
> >16k,virtio,on ,none        | 211.0 | 147   |   59.49      |    5.54    |
> >16k,virtio,on ,O_DSYNC     | 212.0 | 145   |   60.45      |    5.47    |
> >---------------------------+-------+-------+--------------+------------+
> >64k,virtio,off,none        | 190.0 |  64   |   16.31      |   24.92    |
> >64k,virtio,on ,none        | 546.0 | 161   |  111.06      |    8.54    |
> >64k,virtio,on ,O_DSYNC     | 520.0 | 151   |  116.66      |    8.97    |
> >---------------------------+-------+-------+--------------+------------+
> >1M ,virtio,off,none        | 182.0 |  32   | 5573.44      |  407.21    |
> >1M ,virtio,on ,none        | 750.0 | 127   | 1344.65      |   96.42    |
> >1M ,virtio,on ,O_DSYNC     | 768.0 | 123   | 1289.05      |   94.25    |
> >---------------------------+-------+-------+--------------+------------+
> 
> OK, but in this case the size of the cache for "cache=off" is the size
> of the guest cache, whereas in the other cases it is the size of the
> guest cache plus the size of the host cache; this is not fair...

it isn't supposed to be fair; cache=off is O_DIRECT, so we're reading
straight from the device.  With cache=on we *want* to be able to lean on
the host cache to read the data: pay the cost once, and benefit in other
guests if possible.
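That asymmetry is easy to see with dd (a sketch assuming Linux and GNU
coreutils; 'img.raw' is a made-up name): a repeated cached read is served
from the host page cache, while iflag=direct, like cache=off, pays the
device cost on every pass.

```shell
dd if=/dev/zero of=img.raw bs=1M count=64 status=none
# First buffered read; to make it truly cold you would first drop caches
# as root (echo 3 > /proc/sys/vm/drop_caches).
dd if=img.raw of=/dev/null bs=1M status=none
# Repeat: this one is served from the host page cache.
dd if=img.raw of=/dev/null bs=1M status=none
# O_DIRECT read, as with cache=off: bypasses the page cache every time.
dd if=img.raw of=/dev/null bs=1M iflag=direct status=none ||
    echo "O_DIRECT unsupported on this filesystem"
rm -f img.raw
```

Timing the three reads (e.g. with `time`) shows the cached repeat finishing
far faster than either the cold or the O_DIRECT pass.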

> 
> >
> >--------------------------------------------------------------------------
> >exporting file in ext3 filesystem as block device (1g)
> >--------------------------------------------------------------------------
> >
> >kvm write (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >block size,iface,cache,sync| MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >16k,virtio,off,none        |  12.1 |  15   |    9.1       |    8.71    |
> >16k,virtio,on ,none        | 192.0 |  52   |   62.52      |    6.17    |
> >16k,virtio,on ,O_DSYNC     | 142.0 |  59   |   18.81      |    8.29    |
> >---------------------------+-------+-------+--------------+------------+
> >64k,virtio,off,none        |  15.5 |   8   |   21.10      |  311.00    |
> >64k,virtio,on ,none        | 454.0 | 130   |  113.25      |   10.65    |
> >64k,virtio,on ,O_DSYNC     | 154.0 |  48   |   20.25      |   30.75    |
> >---------------------------+-------+-------+--------------+------------+
> >1M ,virtio,off,none        |  24.7 |   5   | 41736.22     | 3020.08    |
> >1M ,virtio,on ,none        | 485.0 | 100   |  2052.09     |  149.81    |
> >1M ,virtio,on ,O_DSYNC     | 161.0 |  42   |  6268.84     |  453.84    |
> >---------------------------+-------+-------+--------------+------------+
> 
> What file type do you use (qcow2, raw?).

Raw.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com
