qemu-devel.nongnu.org archive mirror
From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Cc: Samuel Thibault <samuel.thibault@eu.citrix.com>
Subject: Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Date: Mon, 3 Dec 2007 18:40:16 +0000
Message-ID: <20071203184015.GB15798@shareable.org>
In-Reply-To: <47544613.3050309@codemonkey.ws>

Anthony Liguori wrote:
> >With the IDE emulation, when the emulated "disk write cache" flag is
> >on it may be reasonable to report a write as completed when the AIO is
> >dispatched, without waiting for the AIO to complete. 
> >
> >An IDE flush cache command would wait for all outstanding write AIOs
> >to complete, and then issue a flush cache (fdatasync) to the real
> >device before reporting it has completed.
> >
> >That's roughly equivalent to what an IDE disk with write caching does,
> >and it would provide exactly the guarantees for safe storage to the
> >real physical medium that a journalling filesystem or database in the
> >guest requires.
> 
> Except that in an enterprise environment, you typically have battery 
> backed disk cache.  It really doesn't matter though b/c in QEMU today, 
> submitting the request blocks until it's completed anyway (which is 
> nearly instant anyway since I/O is buffered).

Buffered I/O is less reliable in one sense: durability across a host crash.

With buffered I/O, if the host crashes, you may lose data that a
filesystem or database on the guest reported as committed to
applications.  That can result, on those rare occasions, in guest
journalled filesystem corruption (something that should be
impossible), and in database corruption or durability failure.

With direct I/O and write cache emulation (as described), when a guest
journalling filesystem or database reports data is committed, it has
much the same commitment/durability guarantee that the same
applications would have running on the host.  Namely, the data has
reached the disk, and the disk has reported it's committed.
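
To make the scheme above concrete, here is a minimal sketch in Python
(not QEMU's C, and not code from this patch series).  The class
WriteCacheDisk, its method names, and the thread pool standing in for
host AIO are all hypothetical.  It also opens the image without
O_DIRECT, because direct I/O needs aligned buffers that would clutter
the example; the fdatasync() in the flush is what provides the
durability point.

```python
import os
from concurrent.futures import ThreadPoolExecutor, wait


class WriteCacheDisk:
    """Hypothetical model of an emulated disk with its write cache on."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        # Assumption: a small thread pool stands in for host AIO.
        self.pool = ThreadPoolExecutor(max_workers=4)
        self.pending = []  # writes dispatched but not yet known complete

    def write(self, offset, data):
        # Report completion to the guest as soon as the write is
        # dispatched, without waiting for it to finish.
        self.pending.append(self.pool.submit(os.pwrite, self.fd, data, offset))
        return "completed"

    def flush_cache(self):
        # Wait for every outstanding write to finish ...
        done, _ = wait(self.pending)
        for future in done:
            future.result()  # re-raise any I/O error from the write
        self.pending.clear()
        # ... then ask the host to push the data to the real device.
        os.fdatasync(self.fd)
        return "completed"

    def close(self):
        self.pool.shutdown(wait=True)
        os.close(self.fd)
```

In this model a guest flush cache command maps to flush_cache(); only
after it returns may the guest treat its earlier writes as durable,
which is the ordering a journalling filesystem relies on.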

This may matter if you want to run those sorts of applications in a
guest, which clearly people often do, especially with KVM or Xen.

Anecdote: This is already a problem in some environments.  I have a
rented virtual machine; it's running UML.  The UML disk uses O_SYNC
writes (nowadays), because buffered host writes resulted in occasional
guest data loss and journalled filesystem corruption.  Unfortunately,
this costs performance, but that is better than occasional
corruption.  I imagine similar things happen with QEMU machines
occasionally.
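
For illustration only, this is roughly what the O_SYNC approach looks
like on a POSIX host, again as a Python sketch.  The helper names
open_image and guest_write are hypothetical and are not taken from UML
or QEMU.

```python
import os


def open_image(path, sync_writes):
    """Open a disk image file, optionally with O_SYNC (hypothetical helper)."""
    flags = os.O_RDWR | os.O_CREAT
    if sync_writes:
        # With O_SYNC each write returns only after the host reports
        # the data as written to the underlying storage.
        flags |= os.O_SYNC
    return os.open(path, flags, 0o600)


def guest_write(fd, offset, data):
    """Write all of data at offset, looping over short writes."""
    view = memoryview(data)
    written = 0
    while written < len(view):
        written += os.pwrite(fd, view[written:], offset + written)
    return written
```

Every guest_write() on an image opened with sync_writes=True pays the
full cost of reaching the disk, which is where the slowdown comes from.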

-- Jamie
