qemu-devel.nongnu.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Michael Tokarev <mjt@tls.msk.ru>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jan Kara <jack@suse.cz>, qemu-devel <qemu-devel@nongnu.org>
Subject: [Qemu-devel] Re: slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3)
Date: Tue, 20 Jul 2010 17:59:24 +0200
Message-ID: <20100720155923.GB12998@quack.suse.cz>
In-Reply-To: <4C45B59D.8040207@msgid.tls.msk.ru>

On Tue 20-07-10 17:41:33, Michael Tokarev wrote:
> 20.07.2010 16:46, Jan Kara wrote:
> >   Hi,
> >
> >On Fri 02-07-10 16:46:28, Michael Tokarev wrote:
> >>
> >>I noticed that qcow2 images, esp. fresh ones (so that they
> >>receive lots of metadata updates) are very slow on my
> >>machine.  And on IRC (#kvm), Sheldon Hearn found that on
> >>ext3, it is fast again.
> >>
> >>So I tested different combinations for a bit, and observed
> >>the following:
> >>
> >>For a fresh qcow2 file, with default qemu cache settings,
> >>copying kernel source is about 10 times slower on ext4
> >>than on ext3.  A second copy (rewrite) is significantly
> >>faster in both cases (as expected), but still ~20% slower
> >>on ext4 than on ext3.
> >>
> >>Normal cache mode in qemu is writethrough, which translates
> >>to O_SYNC file open mode.
> >>
> >>With cache=none, which translates to O_DIRECT, metadata-
> >>intensive writes (fresh qcow) are about as slow as on
> >>ext4 with O_SYNC, and rewrite is expectedly faster, but
> >>now there's _no_ difference in speed between ext3 and ext4.
> >>
> >>I did a series of straces of the writer processes: the time
> >>spent in pwrite() syscalls is significantly larger for
> >>ext4 with O_SYNC than for ext3 with O_SYNC, the difference
> >>is about 50 times.
> >>
> >>Also, with slower I/O in the case of ext4, qemu-kvm starts more
> >>I/O threads, which, as it seems, slows the whole thing down even
> >>further - I changed max_threads from the default 64 to 16, and
> >>the speed improved slightly.  Here, the difference is again quite
> >>significant: on ext3 qemu spawns only 8 threads, while on
> >>ext4 all 64 I/O threads are spawned almost immediately.
> >>
> >>So I've two questions:
> >>
> >>  1.  Why is ext4 O_SYNC so slow compared with ext3 O_SYNC?
> >>    This is observed on 2.6.32 and 2.6.34 kernels; barriers
> >>    or data={writeback|ordered} made no difference.  I tested
> >>    the whole thing on a partition on a single drive, sheldonh
> >>    used ext[34]fs on top of lvm on a raid1 volume.
> >   Do I get it right, that you have ext3/4 which carries fs images used by
> >KVM? What you describe is strange. Up to this moment it sounded to me like
> >a difference in barrier settings on the host but you seem to have tried
> >that. Just stabbing in the dark - could you try nodelalloc mount option
> >of ext4?
> 
> Yes, exactly, a guest filesystem image stored on ext3 or
> ext4.  And yes, I suspected barriers too, but immediately
> ruled that out, since barrier or no barrier does not matter
> in this test.
> 
> I'll try nodelalloc, but I'm not sure when: right now I'm on
> vacation, typing from a hotel, and my home machine with all
> the guest images and the like is turned off and - for some
> reason - I can't wake it up over ethernet, it seemingly ignores
> WOL packets.  Too bad I don't have any guest image here on my
> notebook.
> 
> >>  2.  The number of threads spawned for I/O... this is a good
> >>    question, how to find an adequate cap.  Different hw has
> >>    different capabilities, and we may have more users doing
> >>    I/O at the same time...
> 
> >   Maybe you could measure your total throughput over some period,
> >try increasing the number of threads in the next period and if it
> >helps significantly, use the larger number, otherwise go back to the
> >smaller number?
> 
> Well, this is, again, a good question -- it's how qemu works right
> now, spawning up to 64 I/O threads for all I/O requests guests
> submit.  The slower the I/O, the more threads can be spawned.
> Working that part out is a separate, difficult job.
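
  For illustration only, the measure-and-compare idea quoted above could be
sketched roughly as below. This is not how qemu sizes its thread pool;
`measure` is a hypothetical callable that runs one measurement period with
the given thread cap and returns the throughput seen, and the start value,
step and 10% threshold are assumptions picked for the example.

```python
def tune_thread_cap(measure, start=8, step=2, max_threads=64, min_gain=0.10):
    """Grow the I/O thread cap while each increase pays off.

    measure(n) is a hypothetical callable: it runs one measurement
    period with at most n threads and returns the total throughput.
    """
    cap = start
    best = measure(cap)
    while cap < max_threads:
        trial = min(cap * step, max_threads)
        result = measure(trial)
        if result >= best * (1 + min_gain):
            # The larger cap helped significantly: keep it and try again.
            cap, best = trial, result
        else:
            # No significant gain: stay with the smaller number.
            break
    return cap
```

  A real tuner would have to repeat this periodically, since the load and
the number of guests doing I/O change over time.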
> 
> The main question here is why ext4 is so slow for O_[D]SYNC writes.
  Yes.
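
  To take qemu out of the picture, something like the following minimal
sketch could be run on both filesystems. It only times sequential pwrite()
calls on a freshly created file opened with O_SYNC; the function name, the
block size and the write count are assumptions for the example, and it does
not reproduce the qcow2 write pattern.

```python
import os
import time

def time_pwrites(path, count=64, block_size=4096, flags=os.O_SYNC):
    """Extend a fresh file block by block; return per-call latencies (seconds)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | flags, 0o600)
    buf = b"\0" * block_size
    latencies = []
    try:
        for i in range(count):
            start = time.perf_counter()
            # Each write lands past the current end of file, so the
            # filesystem has to allocate space and update metadata.
            os.pwrite(fd, buf, i * block_size)
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies
```

  Calling time_pwrites() with a path on the ext3 mount and then on the ext4
mount, and comparing the sums of the returned lists, should show whether
the difference is visible without qemu.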

> Besides, a quite similar topic was discussed meanwhile, in a different
> thread titled "BTRFS: Unbelievably slow with kvm/qemu" -- see e.g.
> http://marc.info/?t=127891236700003&r=1&w=2 .  In particular, this
> message http://marc.info/?l=linux-kernel&m=127913696420974 shows
> a comparison table for a few filesystems and qemu/kvm usage, but on
> raw files instead of qcow.
  Thanks for the pointer. But in the comparison Christoph did, ext4 came
out slightly faster than ext3 when barrier options were equivalent.
Which is what I would expect... So what is the difference?

> Different qemu/kvm guest fs image options are (partial list):
> 
>  raw disk image in a file on host.  Either pre-allocated or
>    (initially) sparse.  The pre-allocated case should - in
>    theory - work equally on all filesystems, while the sparse
>    case should differ per filesystem, depending on how each
>    filesystem allocates data.
> 
>  qcow[2] image in a file on host.  This one is never sparse,
>   but unlike raw it also contains some qemu-specific metadata,
>   like which blocks are allocated and in which place, sorta
>   like lvm.  Initially it is created empty (with only a header),
>   and when the guest performs writes, new blocks are allocated and
>   metadata gets updated.  This requires some more writes than
>   the guest performs, and quite a few syncs (with O_SYNC they're
>   automatic).
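
  The allocate-on-first-write behaviour described above can be pictured with
a toy model like the one below. It is not the qcow2 on-disk format (which
uses two-level lookup tables and reference counts); the class and its
counters are made up for the example and only show why a fresh image costs
more writes than a rewrite.

```python
class ToyImage:
    """Toy allocate-on-first-write image; NOT the real qcow2 layout."""

    def __init__(self):
        self.table = {}           # guest cluster index -> host cluster index
        self.next_host = 0        # next free cluster in the image file
        self.data_writes = 0      # writes the guest asked for
        self.metadata_writes = 0  # extra writes caused by allocation

    def write(self, guest_cluster):
        if guest_cluster not in self.table:
            # First touch: allocate a cluster and record the mapping,
            # which costs one additional (synchronous) metadata update.
            self.table[guest_cluster] = self.next_host
            self.next_host += 1
            self.metadata_writes += 1
        self.data_writes += 1
```

  In this model the first pass over a set of clusters pays one metadata
update per cluster, while a second pass over the same clusters pays none,
which matches the fresh-copy versus rewrite difference in the report.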

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


Thread overview: 4+ messages
2010-07-02 12:46 [Qemu-devel] slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3) Michael Tokarev
2010-07-20 13:46 ` [Qemu-devel] " Jan Kara
2010-07-20 14:41   ` Michael Tokarev
2010-07-20 15:59     ` Jan Kara [this message]
