From: Michael Tokarev <mjt@tls.msk.ru>
To: Jan Kara <jack@suse.cz>
Cc: qemu-devel <qemu-devel@nongnu.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3)
Date: Tue, 20 Jul 2010 17:41:33 +0300 [thread overview]
Message-ID: <4C45B59D.8040207@msgid.tls.msk.ru> (raw)
In-Reply-To: <20100720134646.GC3657@quack.suse.cz>
20.07.2010 16:46, Jan Kara wrote:
> Hi,
>
> On Fri 02-07-10 16:46:28, Michael Tokarev wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> I noticed that qcow2 images, esp. fresh ones (so that they
>> receive lots of metadata updates), are very slow on my
>> machine.  And on IRC (#kvm), Sheldon Hearn found that on
>> ext3 it is fast again.
>>
>> So I tested different combinations for a bit, and observed
>> the following:
>>
>> For a fresh qcow2 file, with default qemu cache settings,
>> copying a kernel source tree is about 10 times slower on ext4
>> than on ext3.  A second copy (rewrite) is significantly
>> faster in both cases (as expected), but still ~20% slower
>> on ext4 than on ext3.
>>
>> Normal cache mode in qemu is writethrough, which translates
>> to O_SYNC file open mode.
>>
>> With cache=none, which translates to O_DIRECT, metadata-
>> intensive writes (fresh qcow) are about as slow as on
>> ext4 with O_SYNC, and rewrite is expectedly faster, but
>> now there's _no_ difference in speed between ext3 and ext4.
>>
>> I did a series of straces of the writer processes; the time
>> spent in pwrite() syscalls is significantly larger for
>> ext4 with O_SYNC than for ext3 with O_SYNC, by a factor
>> of about 50.
>>
>> Also, with the slower I/O in the ext4 case, qemu-kvm starts more
>> I/O threads, which, it seems, slows the whole thing down even
>> further - I changed max_threads from the default 64 to 16, and
>> the speed improved slightly.  Here the difference is again
>> significant: on ext3 qemu spawns only 8 threads, while on
>> ext4 all 64 I/O threads are spawned almost immediately.
>>
>> So I've two questions:
>>
>> 1. Why is ext4 with O_SYNC so slow compared with ext3 with
>>    O_SYNC?  This is observed on 2.6.32 and 2.6.34 kernels;
>>    barriers or data={writeback|ordered} made no difference.
>>    I tested the whole thing on a partition on a single drive;
>>    sheldonh used ext[34] on top of lvm on a raid1 volume.
> Do I get it right that you have ext3/4 carrying fs images used by
> KVM?  What you describe is strange.  Up to this moment it sounded to me
> like a difference in barrier settings on the host, but you seem to have
> tried that.  Just stabbing in the dark - could you try the nodelalloc
> mount option of ext4?
Yes, exactly, a guest filesystem image stored on ext3 or
ext4. And yes, I suspected barriers too, but immediately
ruled that out, since barrier or no barrier does not matter
in this test.
I'll try nodelalloc, but I'm not sure when: right now I'm on
vacation, typing from a hotel, and my home machine with all
the guest images and the like is turned off and - for some
reason - I can't wake it up over ethernet; it seemingly ignores
WOL packets.  Too bad I don't have any guest image here on my
notebook.
>> 2. The number of threads spawned for I/O... this is a good
>>    question: how to find an adequate cap.  Different hardware has
>>    different capabilities, and we may have more users doing
>>    I/O at the same time...
> Maybe you could measure your total throughput over some period,
> try increasing the number of threads in the next period and, if it
> helps significantly, use the larger number; otherwise go back to the
> smaller number?
Well, this is, again, a good question -- it's how qemu works right
now, spawning up to 64 I/O threads for all the I/O requests guests
submit.  The slower the I/O, the more threads get spawned.
Working that part out is a separate, difficult job.
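Just to make the feedback idea concrete, here is a rough sketch of
what such a controller could look like (the function name, thresholds,
and bounds are all made up for illustration; nothing like this exists
in qemu today):

```c
/* Hypothetical sketch of a throughput-feedback thread cap, per Jan's
 * suggestion: raise the cap when throughput improved noticeably over
 * the last period, back off when it dropped.  Thresholds are made up. */
#include <assert.h>

static int cap = 8;            /* current I/O thread cap */
static double last_tput = 0.0; /* throughput seen in the previous period */

/* Called once per measurement period with the achieved bytes/sec;
 * returns the cap to use for the next period. */
int adjust_thread_cap(double tput)
{
    if (tput > last_tput * 1.10 && cap < 64)
        cap *= 2;              /* more threads helped: keep growing */
    else if (tput < last_tput * 0.90 && cap > 4)
        cap /= 2;              /* it hurt (or load fell): back off */
    last_tput = tput;
    return cap;
}
```

The hard part, as noted above, is not the arithmetic but choosing the
measurement period and distinguishing "more threads hurt" from "the
guest simply submitted less I/O".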
The main question here is why ext4 is so slow for O_[D]SYNC writes.
Besides, a quite similar topic was discussed in the meantime, in a
different thread titled "BTRFS: Unbelievably slow with kvm/qemu" -- see
e.g. http://marc.info/?t=127891236700003&r=1&w=2 .  In particular, this
message http://marc.info/?l=linux-kernel&m=127913696420974 shows
a comparison table for a few filesystems and qemu/kvm usage, but on
raw files instead of qcow2.
Different qemu/kvm guest fs image options are (partial list):

raw disk image in a file on the host, either pre-allocated or
(initially) sparse.  The pre-allocated case should - in
theory - work equally well on all filesystems, while the sparse
case will differ per filesystem, depending on how each
filesystem allocates data.

qcow[2] image in a file on the host.  This one is never sparse,
but unlike raw it also contains some qemu-specific metadata,
like which blocks are allocated and in which place, sort of
like lvm.  Initially it is created empty (with only a header),
and when the guest performs writes, new blocks are allocated and
the metadata gets updated.  This requires more writes than the
guest itself performs, and quite a few syncs (with O_SYNC
they're automatic).
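The write amplification described above can be shown with a toy model
(this is a deliberate simplification of my own, not qemu's actual
qcow2 code, and the names and cluster size are invented): the first
write to a cluster costs a data write plus an allocation-table update,
while a rewrite costs only the data write - which matches the
fresh-image vs. rewrite difference seen in the tests.

```c
/* Toy model of a qcow2-style format: a one-level allocation table
 * mapping guest clusters to host offsets.  NOT qemu's code; qcow2
 * really uses two-level L1/L2 tables plus refcount blocks. */
#include <assert.h>
#include <stdint.h>

#define CLUSTERS 64

struct toy_qcow {
    uint64_t l2[CLUSTERS]; /* guest cluster -> host offset (0 = unallocated) */
    uint64_t next_free;    /* next free host offset */
    int file_writes;       /* pwrite()s the image file would see */
};

/* One guest cluster write; returns how many host writes it cost. */
int toy_write_cluster(struct toy_qcow *q, int guest_cluster)
{
    int before = q->file_writes;
    if (q->l2[guest_cluster] == 0) {
        q->l2[guest_cluster] = q->next_free;
        q->next_free += 65536;
        q->file_writes++;  /* metadata: updated allocation-table entry */
    }
    q->file_writes++;      /* the guest data itself */
    return q->file_writes - before;
}
```

Under O_SYNC each of those host writes is synchronous, so a fresh
image pays the sync penalty at least twice per guest write; that is
where an expensive O_SYNC path hurts qcow2 far more than raw.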
Thanks!
/mjt
Thread overview: 4+ messages
2010-07-02 12:46 slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3) Michael Tokarev
2010-07-20 13:46 ` Jan Kara
2010-07-20 14:41 ` Michael Tokarev [this message]
2010-07-20 15:59 ` Jan Kara