From: Anthony Liguori <anthony@codemonkey.ws>
To: qemu-devel@nongnu.org
Cc: Chris Wright <chrisw@redhat.com>,
	Mark McLoughlin <markmc@redhat.com>,
	Ryan Harper <ryanh@us.ibm.com>,
	kvm-devel <kvm-devel@lists.sourceforge.net>,
	Laurent Vivier <Laurent.Vivier@bull.net>
Subject: Re: [Qemu-devel] [RFC] Disk integrity in QEMU
Date: Fri, 10 Oct 2008 07:34:31 -0500
Message-ID: <48EF4BD7.3040000@codemonkey.ws>
In-Reply-To: <48EF1D55.7060307@redhat.com>

Avi Kivity wrote:
> Anthony Liguori wrote:
>
> [O_DSYNC, O_DIRECT, and 0]
>
>>
>> Thoughts?
>
> There are (at least) three usage models for qemu:
>
> - OS development tool
> - casual or client-side virtualization
> - server partitioning
>
> The last two uses are almost always in conjunction with a hypervisor.
>
> When using qemu as an OS development tool, data integrity is not very 
> important.  On the other hand, performance and caching are, especially 
> as the guest is likely to be restarted multiple times so the guest 
> page cache is of limited value.  For this use model the current 
> default (write back cache) is fine.
>
> The 'casual virtualization' use is when the user has a full native 
> desktop, and is also running another operating system.  In this case, 
> the host page cache is likely to be larger than the guest page cache.  
> Data integrity is important, so write-back is out of the picture.  I 
> guess for this use case O_DSYNC is preferred though O_DIRECT might not 
> be significantly slower for long-running guests.  This is because 
> reads are unlikely to be cached and writes will not benefit much from 
> the host pagecache.
>
> For server partitioning, data integrity and performance are critical.  
> The host page cache is significantly smaller than the guest page 
> cache; if you have spare memory, give it to your guests.

I don't think this wisdom is bullet-proof.  In the case of server 
partitioning, if you're designing for the future then you can assume 
some form of host data deduplication either through qcow 
deduplication, a proper content addressable storage mechanism, or file 
system level deduplication.  It's becoming more common to see large 
amounts of homogeneous consolidation either because of cloud computing, 
virtual appliances, or just because most x86 virtualization involves 
Windows consolidation and there aren't that many versions of Windows.

In this case, there is an awful lot of opportunity for increasing 
overall system throughput by caching common data access across virtual 
machines.
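
To make the point concrete, here is a toy sketch of a read cache keyed by
block content rather than by image offset.  It is written in Python for
brevity; BlockStore, the guest lists, and the block contents are all made
up for illustration and do not correspond to anything in QEMU or the
kernel.

```python
import hashlib


def digest(block: bytes) -> str:
    """Content address of a block (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(block).hexdigest()


class BlockStore:
    """Toy content-addressable store with a read cache keyed by digest."""

    def __init__(self):
        self.backing = {}  # digest -> block data ("on disk")
        self.cache = {}    # digest -> block data ("host page cache")
        self.hits = 0
        self.misses = 0

    def put(self, block: bytes) -> str:
        key = digest(block)
        self.backing[key] = block  # identical blocks are stored once
        return key

    def get(self, key: str) -> bytes:
        if key in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache[key] = self.backing[key]
        return self.cache[key]


store = BlockStore()
common = [b"kernel", b"libc", b"shell"]  # blocks shared by both guests

# Each hypothetical guest image is just a list of block digests.
guest_a = [store.put(b) for b in common + [b"data-a"]]
guest_b = [store.put(b) for b in common + [b"data-b"]]

for key in guest_a + guest_b:
    store.get(key)

print(store.hits, store.misses)  # prints "3 5"
```

The second guest's reads of the shared blocks are served from the cache
populated by the first guest, which is the effect a host-side cache could
have across homogeneous VMs.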

> O_DIRECT is practically mandated here; the host page cache does 
> nothing except to impose an additional copy.
>
> Given the rather small difference between O_DSYNC and O_DIRECT, I 
> favor not adding O_DSYNC as it will add only marginal value.

The difference isn't small.  Our fio runs are defeating the host page 
cache on write so we're adjusting the working set size.  But the 
difference in read performance between dsync and direct is many times 
over when the data can be cached.
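
For reference, the three modes under discussion differ only in the flags
passed to open(2).  A minimal sketch, again in Python for brevity; the
function name and the mode strings are invented for illustration and are
not QEMU's interface.  O_DIRECT is Linux-specific, so the sketch falls
back to 0 on platforms where Python does not expose it.

```python
import os

# Assumption: fall back to 0 where the platform lacks the flag, so the
# sketch still runs; a real implementation would have to handle that case.
O_DSYNC = getattr(os, "O_DSYNC", 0)
O_DIRECT = getattr(os, "O_DIRECT", 0)


def open_flags(mode: str) -> int:
    """Map a (hypothetical) cache mode name to open(2) flags."""
    base = os.O_RDWR
    if mode == "writeback":
        # Plain open: reads and writes both go through the host page cache.
        return base
    if mode == "dsync":
        # Writes complete only once the data is on stable storage, but
        # reads can still be served from the host page cache.
        return base | O_DSYNC
    if mode == "direct":
        # Bypass the host page cache for both reads and writes.
        return base | O_DIRECT
    raise ValueError("unknown cache mode: %r" % mode)
```

With O_DSYNC a re-read of recently accessed data can hit the host page
cache; with O_DIRECT every read goes to the device, which is where the
read-side gap comes from.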

> Regarding choosing the default value, I think we should change the 
> default to be safe, that is O_DIRECT.  If that is regarded as too 
> radical, the default should be O_DSYNC with options to change it to 
> O_DIRECT or writeback.  Note that some disk formats, like qcow2, will 
> need updating if they are not to have abysmal performance.

I think qcow2 will be okay because the only issue is image expansion, 
and that is a relatively uncommon case that is amortized over the 
lifetime of the VM.  So far, while there is objection to using O_DIRECT by 
default, I haven't seen any objection to O_DSYNC by default so as long 
as no one objects in the next few days, I think that's what we'll end up 
doing.

Regards,

Anthony Liguori
