From: Anthony Liguori <anthony@codemonkey.ws>
To: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: [Qemu-devel] [RFC] Use O_DSYNC by default and update documentation to explain IO integrity in QEMU
Date: Thu, 09 Oct 2008 12:02:02 -0500
Message-ID: <48EE390A.6070601@codemonkey.ws>
[-- Attachment #1: Type: text/plain, Size: 76 bytes --]
We'll have some benchmarks later this afternoon.
Regards,
Anthony Liguori
[-- Attachment #2: o_sync.patch --]
[-- Type: text/x-patch, Size: 4057 bytes --]
diff --git a/block-raw-posix.c b/block-raw-posix.c
index 83a358c..e58f191 100644
--- a/block-raw-posix.c
+++ b/block-raw-posix.c
@@ -120,7 +120,7 @@ static int raw_open(BlockDriverState *bs, const char *filename, int flags)
s->lseek_err_cnt = 0;
- open_flags = O_BINARY;
+ open_flags = O_BINARY | O_DSYNC;
if ((flags & BDRV_O_ACCESS) == O_RDWR) {
open_flags |= O_RDWR;
} else {
@@ -996,7 +996,7 @@ static int hdev_open(BlockDriverState *bs, const char *filename, int flags)
IOObjectRelease( mediaIterator );
}
#endif
- open_flags = O_BINARY;
+ open_flags = O_BINARY | O_DSYNC;
if ((flags & BDRV_O_ACCESS) == O_RDWR) {
open_flags |= O_RDWR;
} else {
diff --git a/qemu-doc.texi b/qemu-doc.texi
index adf270b..2e859ff 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -267,13 +267,56 @@ These options have the same definition as they have in @option{-hdachs}.
@item snapshot=@var{snapshot}
@var{snapshot} is "on" or "off" and allows to enable snapshot for given drive (see @option{-snapshot}).
@item cache=@var{cache}
-@var{cache} is "on" or "off" and allows to disable host cache to access data.
+@var{cache} is "on" or "off" and enables or disables the use of the host page cache.
@item format=@var{format}
Specify which disk @var{format} will be used rather than detecting
the format. Can be used to specifiy format=raw to avoid interpreting
an untrusted format header.
@end table
+By default, QEMU accesses all disk data through the host's page cache.
+This allows the host to perform read-ahead and to avoid duplicating IO
+requests unnecessarily, which increases disk performance. You may notice that
+certain benchmarks in the guest perform better than they do in the host
+(for read) because of this. This is primarily because the benchmark is
+unaware of the extra level of caching that is occurring when running in
+a virtual environment.
+
+The cache=off option can be used to disable the use of the host's page
+cache. Disabling the use of the host's page cache will likely reduce
+performance since the host is unable to perform read-ahead and unable
+to avoid duplicating IO requests. At this time, QEMU will copy data
+internally so the cost of copying data into the host's page cache is
+unlikely to be statistically significant.
+
+The use of cache=off may make a benchmark appear to have results that
+are closer to the results in the host. This does not imply that data
+integrity is not preserved when using cache=on; it is simply an artifact
+of the fact that the benchmark is not aware that it is in a virtual machine.
+It also does not imply that cache=off should be used for general workloads.
+
+In the future, QEMU will be able to avoid copying data internally and
+under certain workloads, disabling the use of the host's page cache may
+increase performance provided that the guest is actively working to avoid
+bringing data into the CPU's cache. This can only be achieved when using
+things like sendfile() in the guest or other forms of direct-io. An example
+of a workload that may benefit from avoiding the host's page cache is a
+static web server that is serving entirely unique data and has a large
+amount of memory relative to the host. This documentation will be
+updated when this change is made. For now, cache=off is mostly useful for
+development purposes and for benchmarks that are not virtualization aware.
+
+Write requests are only reported completed to the guest when they have
+been reported completed by the disk, regardless of whether the host's
+page cache is used for access, so the use of the host's page cache is
+orthogonal to data integrity.
+
+If the host's disk drive has write-back caching enabled and the disk does
+not have a battery-backed cache, then data loss can occur regardless of
+whether write-back caching is disabled in the guest.
+
+If in doubt, do not change the default value (which is cache=on).
+
Instead of @option{-cdrom} you can use:
@example
qemu -drive file=file,index=2,media=cdrom
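
For readers who have not used O_DSYNC directly, here is a minimal
standalone sketch of what the open_flags change above asks of the host:
with the flag set, write() should not return until the data has been
handed to the storage device. The helper name write_durably and the
O_SYNC fallback are assumptions made for this illustration and are not
part of the patch. As the documentation text notes, this says nothing
about a write-back cache inside the disk itself.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption for this sketch: fall back to O_SYNC where O_DSYNC is missing. */
#ifndef O_DSYNC
#define O_DSYNC O_SYNC
#endif

/*
 * Hypothetical helper (not from the patch): write len bytes from buf to
 * the start of path with synchronous data-integrity semantics.
 * Returns 0 on success and -1 on any error.
 */
static int write_durably(const char *path, const void *buf, size_t len)
{
    /* O_DSYNC makes each write() wait until the data reaches the device. */
    int fd = open(path, O_RDWR | O_CREAT | O_DSYNC, 0600);
    if (fd < 0)
        return -1;

    const char *p = buf;
    size_t done = 0;

    /* write() may be short, so loop until everything has been written. */
    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    return close(fd);
}
```

A caller would use it as write_durably("disk.img", data, size) and treat
a return of 0 as "the host reported the data written". The sketch does
not retry on EINTR and does not call fsync(), since the point is to show
the per-write behaviour that O_DSYNC is expected to provide.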