From: Anthony Liguori <anthony@codemonkey.ws>
To: Christoph Hellwig <hch@lst.de>
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel <qemu-devel@nongnu.org>
Subject: [Qemu-devel] Re: Caching modes
Date: Tue, 21 Sep 2010 16:27:57 -0500
Message-ID: <4C99235D.9050506@codemonkey.ws>
In-Reply-To: <20100921205740.GA1467@lst.de>
On 09/21/2010 03:57 PM, Christoph Hellwig wrote:
> On Tue, Sep 21, 2010 at 10:13:01AM -0500, Anthony Liguori wrote:
>
>> 1) make virtual WC guest controllable. If a guest enables WC, &=
>> ~O_DSYNC. If it disables WC, |= O_DSYNC. Obviously, we can let a user
>> specify the virtual WC mode but it has to be changeable during live
>> migration.
>>
> I have patches for that which are almost ready to submit.
>
>
>> 2) only let the user choose between using and not using the host page
>> cache. IOW, direct=on|off. cache=XXX is deprecated.
>>
> Also done by that patch series. That's exactly what I described two mail
> roundtrips ago.
>
Yes.
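To make 1) and 2) concrete, here is a minimal sketch of how the two knobs could map onto open(2) flags. It is written in Python for brevity and is not QEMU code; the function name host_open_flags and its parameters are hypothetical, made up for this illustration.

```python
import os

# O_DIRECT is Linux-specific; fall back to 0 where the os module lacks it.
O_DIRECT = getattr(os, "O_DIRECT", 0)


def host_open_flags(guest_wc_enabled: bool, direct: bool) -> int:
    """Hypothetical helper: compute open(2) flags for an image file."""
    flags = os.O_RDWR
    if direct:
        # direct=on: bypass the host page cache.
        flags |= O_DIRECT
    if guest_wc_enabled:
        # Guest enabled the virtual write cache: drop O_DSYNC and rely on
        # explicit flush requests from the guest.
        flags &= ~os.O_DSYNC
    else:
        # Guest disabled the write cache: every write must be stable
        # before it completes.
        flags |= os.O_DSYNC
    return flags
```

Note that on Linux, as far as I know, fcntl(F_SETFL) does not change O_DSYNC on an already open descriptor, so toggling WC at runtime would presumably mean reopening the image or following writes with fdatasync().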
>> My concern is ext4. With a preallocated file and cache=none as
>> implemented today, performance is good even when barrier=1. If we
>> enable O_DSYNC, performance will plummet. Ultimately, this is an ext4
>> problem, not a QEMU problem.
>>
> For Linux or Windows guests WCE=0 is not a particularly good default
> given that they can deal with write caches, and an enabled cache mirrors
> the situation with consumer SATA disks. For older Unix guests you'll
> need to be able to persistently disable the write cache.
>
> To make things more confusing the default ATA/SATA way to tune the
> volatile write cache setting is not persistent - e.g. if you disable it
> using hdparm it will come up enabled again.
>
Yes, potentially, we could save this in a config file (and really, I
mean libvirt could save it).
>> 2) User does not have enterprise storage, but has an image on ext4 with
>> barrier=1. User explicitly disables WC in guest because they don't know
>> what they're doing.
>>
>> For (2), again it's probably the user doing the wrong thing because if
>> they don't have enterprise storage, then they shouldn't care about a
>> virtual WC. Practically though, I've seen a lot of this with users.
>>
> This setting is just fine, especially if using O_DIRECT. The guest
> sends cache flush requests often enough to not make it a problem. If
> you do not use O_DIRECT in that scenario, the host will cache a lot more
> data in theory, but any filesystem aware of cache flushes will flush
> it frequently enough to not make it a problem. It is a real problem,
> however, when using ext3 in its default setting in the guest, which
> doesn't use barriers. But that's a bug in ext3 and nothing but
> petitioning its maintainer to fix it will help you there.
>
It's not just ext3, it's also ext4 with barrier=0, which is what certain
applications are being told to do in the face of poor performance.

So direct=on,wc=on + ext4 barrier=0 in the guest is less safe than ext4
barrier=0 on bare metal.

Very specifically, if we do cache=none as we do today, and within the
guest we have ext4 barrier=0 and run DB2, DB2's guarantees are weaker
than they are on bare metal because metadata is not getting flushed.

To resolve this, we need to do direct=on,wc=off + ext4 barrier=0 on the
host. This is safe and should perform reasonably well, but there's far
too much complexity for a user to get to this point.
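
The argument above boils down to a small truth table. The sketch below is a deliberately simplified model in Python, not a statement about what any particular kernel or filesystem actually does. It assumes that wc=off means the host opens the image with O_DSYNC, and that a guest running with barrier=0 never sends flush requests. The function name is hypothetical.

```python
def guest_commits_are_durable(wc_enabled: bool, guest_barriers: bool) -> bool:
    """Simplified model: is data on stable storage when the guest thinks so?"""
    if not wc_enabled:
        # wc=off (assumed O_DSYNC on the host): each completed write is
        # already stable, whether or not the guest ever sends a flush.
        return True
    # wc=on: data only becomes stable when the guest sends a flush, which
    # (by assumption) happens only if the guest uses barriers.
    return guest_barriers


# direct=on,wc=on + guest barrier=0: the case called less safe above.
print(guest_commits_are_durable(wc_enabled=True, guest_barriers=False))
# direct=on,wc=off + guest barrier=0: the case called safe above.
print(guest_commits_are_durable(wc_enabled=False, guest_barriers=False))
```

Under this model the user has to get both the host-side wc setting and the guest mount options right, which is the complexity I'm complaining about.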
Regards,
Anthony Liguori