From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] ide.c make write cacheing controllable by guest
Date: Mon, 25 Feb 2008 20:50:40 +0000
Message-ID: <20080225205040.GA18613@shareable.org>
In-Reply-To: <18371.1341.577787.909764@mariner.uk.xensource.com>
Ian Jackson wrote:
> The attached patch implements the ATA write cache feature. This
> enables a guest to control, in the standard way, whether disk writes
> are immediately committed to disk before the IDE command completes, or
> may be buffered in the host.
>
> In this patch, by default buffering is off, which provides better
> reliability but may have a performance impact. It would be
> straightforward to change the default, or perhaps offer a command-line
> option, if that would be preferred.
>
> This patch is derived from one which was originally submitted to the
> Xen tree by Rik van Riel <riel@redhat.com>.
This is a very sensible improvement, imho.
However, I notice that it tells the guest that data is committed to
hard storage when the host has merely called fsync().
On Linux (and other host OSes), fdatasync() and fsync() don't always
commit data to hard storage; sometimes they only commit it to the hard
drive's cache. (Seriously, just look at fs/ext3/fsync.c: only journal
writes cause the flush, and they aren't done if the inode itself
hasn't changed.)
It may be worth mentioning in the documentation that, for guests which
need strong durability guarantees, e.g. for critical database work or
for filesystem journalling safety following host power failure, it is
not enough to disable the IDE write cache in the guest, even with this
patch. The host's disk write cache must be disabled too.
Ideally, the host would provide a variation of fdatasync() which flushes
data to hard storage in the same way that kernel filesystem journal
writes can do, and Qemu would use that.
But, presently, I'm not aware of any way to do that short of the
administrator disabling the host's disk write cache.
(Darwin provides F_FULLFSYNC. On Linux, an extra flag to
sync_file_range() suggests itself. It would need changes to the block
device and elevator APIs, though, as it's a flush command rather than an
ordering tag, and is not always associated with a prior or subsequent
write, although there are some coalescing optimisations when it can be.)
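As an illustration of the point above, here is a minimal sketch of a "flush as far as the host allows" helper. It is not taken from the patch or from Qemu: the names flush_to_storage() and write_then_flush() are hypothetical. It assumes a POSIX host and uses the Darwin fcntl() command where the headers define it. Everywhere else it can only fall back to fdatasync() or fsync(), which may stop at the drive's write cache as described above.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: push the data of fd as far towards stable
 * storage as this host allows. Returns 0 on success, -1 on error
 * (with errno set by the failing call). */
int flush_to_storage(int fd)
{
#if defined(F_FULLFSYNC)
    /* Darwin: also asks the drive to flush its own write cache. */
    if (fcntl(fd, F_FULLFSYNC) == 0)
        return 0;
    /* Not every filesystem accepts F_FULLFSYNC; fall back to fsync(). */
    return fsync(fd);
#elif defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
    /* Other hosts: the data may still sit in the drive's cache. */
    return fdatasync(fd);
#else
    return fsync(fd);
#endif
}

/* Hypothetical helper: write the whole buffer, then flush, before
 * reporting completion to the caller. */
int write_then_flush(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the write */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return flush_to_storage(fd);
}
```

A device model would call something like write_then_flush() before completing a write command when the guest has disabled the write cache. On hosts without a full-flush primitive, the durability caveat still applies unless the host's disk write cache is disabled.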
-- Jamie