From: Michael Tokarev <mjt@tls.msk.ru>
To: Christoph Hellwig <hch@infradead.org>
Cc: Anthony Liguori <anthony@codemonkey.ws>,
KVM list <kvm@vger.kernel.org>, Kevin Wolf <kwolf@redhat.com>
Subject: Re: JFYI: ext4 bug triggerable by kvm
Date: Tue, 17 Aug 2010 18:40:15 +0400
Message-ID: <4C6A9F4F.8040209@msgid.tls.msk.ru>
In-Reply-To: <20100817142808.GA22412@infradead.org>
17.08.2010 18:28, Christoph Hellwig wrote:
> On Tue, Aug 17, 2010 at 09:20:37AM -0500, Anthony Liguori wrote:
[]
>> For normal writes from a guest, we don't need to follow the write
>> with an fsync(). We should only need to issue an fsync() given an
>> explicit flush from the guest.
>
> Define normal writes. For cache=none and cache=writeback we don't
> have to, and instead issue explicit fsync()/fdatasync() calls
> when we get a cache flush from the guest. For cache=writethrough we
> guarantee data has made it to disk, and we implement this using
> O_DSYNC/O_SYNC when opening the file. That tells the operating system
> to not return until data has hit the disk. For Linux this is
> internally implemented using a range-fsync/fdatasync after the actual
> write.
And this is actually what I mentioned at the very beginning,
in an email I hoped would stay a single thread: I mentioned
that ext4 is very slow when used with O_SYNC (without O_DIRECT).
I still have had no opportunity to collect more info on this, and
yes, I've seen your (Christoph's) speed tests of a few filesystems
in the famous "BTRFS: Unbelievably slow with kvm/qemu" thread.
A few users reported _insanely_ slow write speeds for qcow2 files
with the default cache mode on ext4.
And this is what prompted all this discussion (which actually
has nothing to do with the $subject line ;): an attempt
to think about replacing O_SYNC/fsync() with something
"lighter"...
>> fsync() being slow is orthogonal to my point. I don't see why we
>> need to do an fsync() on *every* write. It should only be necessary
>> when a guest injects an actual barrier.
We don't explicitly sync on every write, but O_SYNC implies it.
And apparently that is what is happening behind the scenes in
the ext4 O_SYNC case.
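To make the distinction concrete, here is a minimal sketch of the two
strategies discussed above. It is not QEMU's code: the function names,
the flush_after parameter, and the sync counters are hypothetical, and
the fallbacks to O_SYNC and fsync() are an assumption for platforms
that lack O_DSYNC or fdatasync().

```python
import os
import tempfile

# Assumption: fall back to O_SYNC / fsync() where the data-only variants are missing.
SYNC_FLAG = getattr(os, "O_DSYNC", os.O_SYNC)
data_sync = getattr(os, "fdatasync", os.fsync)


def write_through(path, blocks):
    """Writethrough style: the file is opened with O_DSYNC, so every
    write() returns only once the data is reported stable."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | SYNC_FLAG, 0o600)
    try:
        for block in blocks:
            os.write(fd, block)  # each write carries an implied sync
    finally:
        os.close(fd)
    return len(blocks)  # number of implied syncs


def write_back(path, blocks, flush_after):
    """Writeback style: plain writes, with an explicit sync only where
    the (hypothetical) guest asked for a cache flush."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    syncs = 0
    try:
        for index, block in enumerate(blocks):
            os.write(fd, block)
            if index in flush_after:  # guest-issued flush after this write
                data_sync(fd)
                syncs += 1
    finally:
        os.close(fd)
    return syncs


if __name__ == "__main__":
    blocks = [b"a" * 512, b"b" * 512, b"c" * 512]
    with tempfile.TemporaryDirectory() as tmp:
        print(write_through(os.path.join(tmp, "wt.img"), blocks))
        print(write_back(os.path.join(tmp, "wb.img"), blocks, {2}))
```

Both functions leave the same bytes in the file. The difference is how
many times the filesystem is asked to make data stable: once per write
in the first case, once per guest flush in the second.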
But ok....
/mjt