public inbox for kvm@vger.kernel.org
From: Christoph Hellwig <hch@infradead.org>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Christoph Hellwig <hch@infradead.org>,
	Michael Tokarev <mjt@tls.msk.ru>, KVM list <kvm@vger.kernel.org>,
	Kevin Wolf <kwolf@redhat.com>
Subject: Re: JFYI: ext4 bug triggerable by kvm
Date: Tue, 17 Aug 2010 10:28:08 -0400
Message-ID: <20100817142808.GA22412@infradead.org>
In-Reply-To: <4C6A9AB5.6050404@codemonkey.ws>

On Tue, Aug 17, 2010 at 09:20:37AM -0500, Anthony Liguori wrote:
> On 08/17/2010 08:07 AM, Christoph Hellwig wrote:
> >>The point is that we don't want to flush the disk write cache.  The
> >>intention of writethrough is not to make the disk cache writethrough
> >>but to treat the host's cache as writethrough.
> >
> >We need to make sure data is not in the disk write cache if we want to
> >provide data integrity.
> 
> When the guest explicitly flushes the emulated disk's write cache.
> Not on every single write completion.

That depends on the cache= mode.  For cache=none and cache=writeback
we present a write-back cache to the guest, and the guest does explicit
cache flushes.  For cache=writethrough we present a writethrough cache
to the guest, and we need to make sure data actually has hit the disk
before returning I/O completion to the guest.

> >   It has nothing to do with the qemu caching
> >mode - for cache=writeback or none it's committed as part of the fdatasync
> >call, and for cache=writethrough it's committed as part of the O_SYNC
> >write.  Note that both of these paths end up calling the filesystem's
> >->fsync method, which is what's required to make writes stable.  That's
> >exactly what is missing in sync_file_range, and that's why that API is
> >not useful at all for data integrity operations.
> 
> For normal writes from a guest, we don't need to follow the write
> with an fsync().  We should only need to issue an fsync() given an
> explicit flush from the guest.

Define normal writes.  For cache=none and cache=writeback we don't
have to, and instead issue explicit fsync()/fdatasync() calls when we
get a cache flush from the guest.  For cache=writethrough we guarantee
that data has made it to disk, and we implement this using
O_DSYNC/O_SYNC when opening the file.  That tells the operating system
not to return until data has hit the disk.  For Linux this is
internally implemented using a range-fsync/fdatasync after the actual
write.

> fsync() being slow is orthogonal to my point.  I don't see why we
> need to do an fsync() on *every* write.  It should only be necessary
> when a guest injects an actual barrier.

See above.


