From: Anthony Liguori <anthony@codemonkey.ws>
To: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: [Qemu-devel] [RFC] Use O_DSYNC by default and update documentation to explain IO integrity in QEMU
Date: Thu, 09 Oct 2008 12:02:02 -0500
Message-ID: <48EE390A.6070601@codemonkey.ws>
[-- Attachment #1: Type: text/plain, Size: 76 bytes --]
We'll have some benchmarks later this afternoon.
Regards,
Anthony Liguori
[-- Attachment #2: o_sync.patch --]
[-- Type: text/x-patch, Size: 4057 bytes --]
diff --git a/block-raw-posix.c b/block-raw-posix.c
index 83a358c..e58f191 100644
--- a/block-raw-posix.c
+++ b/block-raw-posix.c
@@ -120,7 +120,7 @@ static int raw_open(BlockDriverState *bs, const char *filename, int flags)
s->lseek_err_cnt = 0;
- open_flags = O_BINARY;
+ open_flags = O_BINARY | O_DSYNC;
if ((flags & BDRV_O_ACCESS) == O_RDWR) {
open_flags |= O_RDWR;
} else {
@@ -996,7 +996,7 @@ static int hdev_open(BlockDriverState *bs, const char *filename, int flags)
IOObjectRelease( mediaIterator );
}
#endif
- open_flags = O_BINARY;
+ open_flags = O_BINARY | O_DSYNC;
if ((flags & BDRV_O_ACCESS) == O_RDWR) {
open_flags |= O_RDWR;
} else {
diff --git a/qemu-doc.texi b/qemu-doc.texi
index adf270b..2e859ff 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -267,13 +267,56 @@ These options have the same definition as they have in @option{-hdachs}.
@item snapshot=@var{snapshot}
@var{snapshot} is "on" or "off" and allows to enable snapshot for given drive (see @option{-snapshot}).
@item cache=@var{cache}
-@var{cache} is "on" or "off" and allows to disable host cache to access data.
+@var{cache} is "on" or "off" and controls whether the host page cache is used to access data.
@item format=@var{format}
Specify which disk @var{format} will be used rather than detecting
the format. Can be used to specifiy format=raw to avoid interpreting
an untrusted format header.
@end table
+By default, QEMU accesses all disk data through the host's page cache.
+This allows the host to perform read-ahead and to avoid issuing duplicate
+IO requests, which improves disk performance. You may notice that
+certain benchmarks in the guest perform better than they do in the host
+(for read) because of this. This is primarily because the benchmark is
+unaware of the extra level of caching that is occurring when running in
+a virtual environment.
+
+The cache=off option can be used to disable the use of the host's page
+cache. Disabling the use of the host's page cache will likely reduce
+performance since the host is unable to perform read-ahead and unable
+to avoid duplicating IO requests. At this time, QEMU will copy data
+internally so the cost of copying data into the host's page cache is
+unlikely to be statistically significant.
+
+The use of cache=off may make a benchmark appear to have results that
+are closer to the results in the host. This does not imply that data
+integrity is compromised when using cache=on; it is simply an artifact
+of the benchmark not being aware that it is running in a virtual machine.
+It also does not imply that cache=off should be used for general workloads.
+
+In the future, QEMU will be able to avoid copying data internally and
+under certain workloads, disabling the use of the host's page cache may
+increase performance provided that the guest is actively working to avoid
+bringing data into the CPU's cache. This can only be achieved when using
+things like sendfile() in the guest or other forms of direct-io. An example
+of a workload that may benefit from avoiding the host's page cache is a
+static web server that is serving entirely unique data and has a relatively
+large amount of memory relative to the host. This documentation will be
+updated when this change is made. For now, cache=off is mostly useful for
+development purposes and for benchmarks that are not virtualization aware.
+
+Write requests are only reported completed to the guest when they have
+been reported completed by the disk, regardless of whether the host's
+page cache is used for access, so the use of the host's page cache is
+orthogonal to data integrity.
+
+If the host's disk drive has write-back caching enabled and the disk does
+not have a battery-backed cache, then data loss can occur regardless of
+whether write-back caching is disabled in the guest.
+
+If in doubt, do not change the default value (which is cache=on).
+
Instead of @option{-cdrom} you can use:
@example
qemu -drive file=file,index=2,media=cdrom
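
For readers unfamiliar with the flag, here is a minimal standalone sketch of
what opening a file with O_DSYNC means for a writer. It is not taken from
the patch: write_dsync() is a hypothetical helper, and the O_SYNC fallback
for platforms without O_DSYNC is an assumption of this sketch. With O_DSYNC
set, POSIX specifies that each write() returns only after the data has
reached synchronized I/O data integrity completion, so no separate
fdatasync() call is needed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption of this sketch: where O_DSYNC is missing, fall back to the
 * stricter O_SYNC. The patch itself does not do this. */
#ifndef O_DSYNC
#define O_DSYNC O_SYNC
#endif

/* Hypothetical helper: write len bytes from buf to the start of path.
 * Because the descriptor is opened with O_DSYNC, write() does not return
 * until the host reports the data as written.
 * Returns 0 on success, -1 on failure. */
static int write_dsync(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int fd;

    assert(path != NULL && buf != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_DSYNC, 0644);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted before any data: retry */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;                 /* short write: advance and continue */
        len -= (size_t)n;
    }
    return close(fd);
}
```

A call such as write_dsync("disk.img", data, size) returning 0 only means the
host reported the write complete. As the documentation text in the patch
notes, a drive with a volatile write-back cache can still lose that data.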