From: Mark Wagner <mwagner@redhat.com>
To: qemu-devel@nongnu.org
Cc: Chris Wright <chrisw@redhat.com>,
Mark McLoughlin <markmc@redhat.com>,
Ryan Harper <ryanh@us.ibm.com>,
kvm-devel <kvm-devel@lists.sourceforge.net>,
Laurent Vivier <Laurent.Vivier@bull.net>
Subject: Re: [Qemu-devel] [RFC] Disk integrity in QEMU
Date: Sun, 12 Oct 2008 22:09:29 -0400
Message-ID: <48F2ADD9.9000804@redhat.com>
In-Reply-To: <48F2A294.4020303@codemonkey.ws>
Anthony Liguori wrote:
> Mark Wagner wrote:
>> If you stopped and listened to yourself, you'd see that you are making
>> my point...
>>
>> AFAIK, QEMU is neither designed nor intended to be an Enterprise
>> Storage Array; I thought this group is designing a virtualization
>> layer. However, the persistent argument is that since Enterprise
>> Storage products will often acknowledge a write before the data is
>> actually on the disk, it's OK for QEMU to do the same.
>
> I think you're a little lost in this thread. We're going to have QEMU
> only acknowledge writes when they complete. I've already sent out a
> patch. Just waiting a couple days to let everyone give their input.
>
Actually, I'm just not being clear enough in trying to point out that I
don't think just setting a default value for "cache" goes far enough. My
argument has nothing to do with the default value. It has to do with what the
right thing to do is in specific situations regardless of the value of the
cache setting.
My point is that if a file is opened in the guest with O_DIRECT (or O_DSYNC),
then QEMU *must* honor that regardless of the current value of "cache".
So, if the system admin for the host decides to set cache=on and something
in the guest opens a file with O_DIRECT, I feel that it is a violation
of the system call for the host to cache the write in its local cache w/o
sending it immediately to the storage subsystem. It must get an ACK from
the storage subsystem before it can return to the guest in order to preserve
the guarantee.
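
To make that guarantee concrete, here is a minimal sketch in C of a
write that does not report success until the storage subsystem has
acknowledged the data. It is only an illustration: write_through() is a
hypothetical helper, not a QEMU function, and it relies on nothing more
than the POSIX O_DSYNC flag.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* exposes O_DSYNC under strict -std= modes */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of QEMU): write len bytes to path and
 * return 0 only after the kernel reports the data as written through.
 * O_DSYNC makes each write() block until the data has reached the
 * storage subsystem, instead of returning once it sits in the page cache.
 */
int write_through(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int fd = open(path, O_WRONLY | O_CREAT | O_DSYNC, 0600);

    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted before writing: retry */
            close(fd);
            return -1;
        }
        p += n;                  /* handle short writes */
        len -= (size_t)n;
    }
    return close(fd);
}
```

Whether the device below the host kernel has its own volatile write
cache is a separate question that this sketch does not address.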
So, if your proposed default value for the cache is in effect, then O_DSYNC
on the host should provide the write-through required by the guest's use of
O_DIRECT on the writes. However, if the default cache value is not used and
it's set to cache=on, and if the guest is using O_DIRECT or O_DSYNC, I feel
there are issues that need to be addressed.
-mark
>> If QEMU had a similar design to Enterprise Storage with redundancy,
>> battery backup, etc, I'd be fine with it, but it doesn't. QEMU is a
>> layer that I've also thought was supposed to be small, lightweight and
>> unobtrusive that is silently putting everyone's data at risk.
>>
>> The low-end iSCSI server from EqualLogic claims:
>> "it combines intelligence and automation with fault tolerance"
>> "Dual, redundant controllers with a total of 4 GB battery-backed
>> memory"
>>
>> AFAIK QEMU provides neither of these characteristics.
>
> So if this is your only concern, we're in violent agreement. You were
> previously arguing that we should use O_DIRECT in the host if we're not
> "lying" about write completions anymore. That's what I'm opposing
> because the details of whether we use O_DIRECT or not have absolutely
> nothing to do with data integrity as long as we're using O_DSYNC.
>
> Regards,
>
> Anthony Liguori
>
>>
>> -mark
>>
>>> The fact that the virtualization layer has a cache is really not that
>>> unusual.
>> Do other virtualization layers lie to the guest and indicate that the
>> data has successfully been ACK'd by the storage subsystem when the data
>> is actually still in the host cache?
>>
>>
>> -mark
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>>
>>>
>>
>>
>>
>
>
>