From: Anthony Liguori <anthony@codemonkey.ws>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC] block-queue: Delay and batch metadata writes
Date: Mon, 20 Sep 2010 11:34:02 -0500
Message-ID: <4C978CFA.1000600@codemonkey.ws>
In-Reply-To: <4C9783E7.5080905@redhat.com>
On 09/20/2010 10:55 AM, Kevin Wolf wrote:
> On 20.09.2010 17:40, Anthony Liguori wrote:
>
>> On 09/20/2010 10:08 AM, Kevin Wolf wrote:
>>
>>>> If you're comfortable with a writeback cache for metadata, then you
>>>> should also be comfortable with a writeback cache for data in which
>>>> case, cache=writeback is the answer.
>>>>
>>>>
>>> Well, there is a difference: We don't pollute the host page cache with
>>> guest data and we don't get a virtual "disk cache" as big as the host
>>> RAM, but only a very limited queue of metadata.
>>>
>>> Basically, in qemu we have three different types of caching:
>>>
>>> 1. O_DSYNC, everything is always synced without any explicit request.
>>> This is cache=writethrough.
>>>
>>>
>> I actually think O_DSYNC is the wrong implementation of
>> cache=writethrough. cache=writethrough should behave just like
>> cache=none except that data goes through the page cache.
>>
> Then you have cache=writeback, basically.
>
No. Write through means "write requests are not completed until the
data has been acknowledged by the next layer." Write back means "write
requests are completed irrespective of the data being acknowledged by
the next layer."

Write through ensures consistency with the next layer whereas write back
doesn't.

cache=none means that there is no cache. If there is no cache, then
you're guaranteed to be consistent with the next layer.

The only reason it's exposed as writeback at the emulation layer is that
*usually* disks have writeback caches. The fact that writethrough
currently is stronger than the next layer (in that it breaks through
any writeback cache the disk may have) is not a designed feature.
It's an accident.

Had ext3 enabled barriers by default, writethrough would not use O_DSYNC.

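To make the two definitions concrete, here is a minimal Python sketch of the completion policies. The classes NextLayer, WriteThroughCache and WriteBackCache are hypothetical toy models, not QEMU code; they only illustrate when a write is reported complete relative to acknowledgement by the next layer.

```python
class NextLayer:
    """Hypothetical stand-in for whatever sits below the cache."""

    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        # In this toy model, returning from write() is the acknowledgement.
        self.blocks[lba] = data


class WriteThroughCache:
    """Completes a write only after the next layer has acknowledged it."""

    def __init__(self, lower):
        self.lower = lower
        self.cache = {}

    def write(self, lba, data):
        self.cache[lba] = data
        self.lower.write(lba, data)  # wait for the acknowledgement
        return "completed"

    def flush(self):
        pass  # nothing is ever pending


class WriteBackCache:
    """Completes a write without waiting for the next layer."""

    def __init__(self, lower):
        self.lower = lower
        self.cache = {}
        self.dirty = set()

    def write(self, lba, data):
        self.cache[lba] = data
        self.dirty.add(lba)
        return "completed"  # the next layer has not seen the data yet

    def flush(self):
        # Only an explicit flush pushes pending data to the next layer.
        for lba in sorted(self.dirty):
            self.lower.write(lba, self.cache[lba])
        self.dirty.clear()
```

In this model the write-through cache is always consistent with the layer below it, while the write-back cache is consistent only immediately after flush().
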
>>> 2. Nothing is ever synced. This is cache=unsafe.
>>>
>>> 3. We present a writeback disk cache to the guest and the guest needs
>>> to explicitly flush to get its data safely on disk. This is
>>> cache=writeback and cache=none.
>>>
>>>
>> We shouldn't tie the virtual disk cache to which cache= option is used
>> in the host. cache=none means that all requests go directly to the
>> disk. cache=writeback means the host acts as a writeback cache.
>>
> No, that's not the meaning of cache=none if you take the disk cache into
> consideration.
You can't possibly take into account the disk cache because we don't
know anything about the disk cache in QEMU. Just assuming disks always
have writeback caches is wrong.
> It might be what you think should be the meaning of
> cache=none, but it's not what it means in any qemu release.
>
That's precisely what it has meant in every release since I first
introduced the cache options.
>>> We're still lacking modes for O_DSYNC | O_DIRECT and unsafe | O_DIRECT,
>>> but they are entirely possible, because it's two different dimensions.
>>> (And I think Christoph was planning to actually make it two independent
>>> options)
>>>
>> I don't really think O_DSYNC | O_DIRECT makes much sense.
>>
> Maybe, maybe not. It's just a missing entry in the matrix.
>
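The "matrix" referred to above can be written out as a small table. The following Python sketch is illustrative only: the key names ("dsync", "flush-on-request", "never") and the use of None for unnamed combinations are assumptions made for this example, and the mapping follows the descriptions quoted in this thread, not the QEMU source.

```python
# (sync policy, bypass host page cache?) -> cache= option, as described
# in the quoted text. None marks a combination with no option yet.
MODES = {
    ("dsync", False): "writethrough",         # O_DSYNC
    ("dsync", True): None,                    # O_DSYNC | O_DIRECT
    ("flush-on-request", False): "writeback",
    ("flush-on-request", True): "none",       # O_DIRECT
    ("never", False): "unsafe",
    ("never", True): None,                    # unsafe | O_DIRECT
}


def missing_modes():
    """Return the combinations that have no cache= option."""
    return sorted(key for key, name in MODES.items() if name is None)
```

Calling missing_modes() lists the two combinations the quoted text says are still lacking.
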
Regards,
Anthony Liguori