From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>,
    Rusty Russell <rusty@rustcorp.com.au>,
    kvm@vger.kernel.org
Subject: Re: [PATCH, RFC] virtio_blk: add cache flush command
Date: Mon, 11 May 2009 12:47:58 -0500
Message-ID: <4A0864CE.10505@codemonkey.ws>
In-Reply-To: <4A085721.2050005@redhat.com>

Avi Kivity wrote:
> Christoph Hellwig wrote:
>> On Mon, May 11, 2009 at 06:45:50PM +0300, Avi Kivity wrote:
>>
>>>> Right now it's fsync. By the time I'll submit the backend change it
>>>> will still be fsync, but at least called from the posix-aio-compat
>>>> thread pool.
>>>>
>>>>
>>> I think if we have cache=writeback we should ignore this.
>>>
>>
>> It's only needed for cache=writeback, because without that there is no
>> reason to flush a write cache.
>>
>
> Maybe we should add a fourth cache= mode then. But
> cache=writeback+fsync doesn't correspond to any real world drive; in
> the real world you're limited to power failures and a few megabytes of
> cache (typically less), cache=writeback+fsync can lose hundreds of
> megabytes due to power loss or software failure.
>
> Oh, and cache=writeback+fsync doesn't work on qcow2, unless we add
> fsync after metadata updates.

But how do we define the data integrity guarantees to the user of
cache=writeback+fsync? It seems to require rather detailed knowledge
of Linux's use of T_FLUSH operations.

Right now, it's fairly easy to understand. cache=none and
cache=writethrough guarantee that all write operations that the guest
thinks have completed are completed. cache=writeback provides no such
guarantee.

cache=writeback+fsync would guarantee only that operations that trigger
a T_FLUSH are present on disk. That currently includes fsyncs but does
not include O_DIRECT writes. I guess whether O_SYNC does a T_FLUSH also
has to be determined.

It seems too complicated to me. If we could provide a mode where
cache=writeback provided as strong a guarantee as cache=writethrough,
then that would be quite interesting.

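
To make the difference between these guarantees concrete, here is a toy Python model. It is only a sketch, not QEMU code: the ToyDisk class and its write/flush/power_loss methods are made up for illustration, "writeback+fsync" is the hypothetical mode discussed in this thread, and treating flush as a no-op for plain cache=writeback is an assumption of the model.

```python
class ToyDisk:
    """Toy model of a host cache in front of stable storage (hypothetical)."""

    MODES = ("none", "writethrough", "writeback", "writeback+fsync")

    def __init__(self, mode):
        if mode not in self.MODES:
            raise ValueError("unknown cache mode: %r" % (mode,))
        self.mode = mode
        self.cache = {}   # volatile: lost on power failure
        self.stable = {}  # survives power failure

    def write(self, sector, data):
        """Complete a guest write; returning means the guest sees completion."""
        self.cache[sector] = data
        if self.mode in ("none", "writethrough"):
            # Completed writes are already on stable storage.
            self.stable[sector] = data

    def flush(self):
        """Model a guest-issued T_FLUSH request."""
        if self.mode == "writeback+fsync":
            # Everything written so far reaches stable storage.
            self.stable.update(self.cache)
        # Assumption: plain writeback does not honor the flush at all;
        # none/writethrough have nothing left to flush.

    def power_loss(self):
        """Drop the volatile cache and return what survived."""
        self.cache = {}
        return dict(self.stable)


if __name__ == "__main__":
    for mode in ToyDisk.MODES:
        disk = ToyDisk(mode)
        disk.write(0, b"before flush")
        disk.flush()
        disk.write(1, b"after flush")
        print(mode, sorted(disk.power_loss()))
```

In this model only the write that precedes the flush survives under writeback+fsync, which is the part a user would have to understand: the guarantee depends on when the guest happens to issue a T_FLUSH, not on when it sees a write complete.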
>> (Or maybe ext3 actually is stupid enough to flush the whole fs even for
>> that case
>
> Sigh.

I'm also worried about ext3 here.

Regards,

Anthony Liguori