* [Qemu-devel] slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3)
From: Michael Tokarev @ 2010-07-02 12:46 UTC
To: qemu-devel, linux-fsdevel

Hello.

I noticed that qcow2 images, especially fresh ones (which
receive lots of metadata updates), are very slow on my
machine. On IRC (#kvm), Sheldon Hearn found that on
ext3 they are fast again.

So I tested different combinations for a bit and observed
the following:

For a fresh qcow2 file with default qemu cache settings,
copying the kernel source is about 10 times slower on ext4
than on ext3. A second copy (rewrite) is significantly
faster in both cases, as expected, but still ~20% slower
on ext4 than on ext3.

The normal cache mode in qemu is writethrough, which translates
to the O_SYNC file open mode.

With cache=none, which translates to O_DIRECT, metadata-intensive
writes (fresh qcow) are about as slow as on ext4 with O_SYNC,
and the rewrite is faster, as expected, but now there is _no_
difference in speed between ext3 and ext4.

I did a series of straces of the writer processes: the time
spent in pwrite() syscalls is significantly larger for
ext4 with O_SYNC than for ext3 with O_SYNC, by a factor of
about 50.

Also, with the slower I/O on ext4, qemu-kvm starts more
I/O threads, which seems to slow the whole thing down even
further: I changed max_threads from the default 64 to 16, and
the speed improved slightly. Here the difference is again quite
significant: on ext3 qemu spawns only 8 threads, while on
ext4 all 64 I/O threads are spawned almost immediately.

So I have two questions:

 1. Why is ext4 O_SYNC so slow compared with ext3 O_SYNC?
    This is observed on 2.6.32 and 2.6.34 kernels; barriers
    and data={writeback|ordered} made no difference. I tested
    the whole thing on a partition on a single drive; sheldonh
    used ext[34]fs on top of lvm on a raid1 volume.

 2. The number of threads spawned for I/O... this is a good
    question: how to find an adequate cap. Different hw has
    different capabilities, and we may have more users doing
    I/O at the same time...

Comments?

Thanks!

/mjt
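The pwrite() timings in the message above came from strace. A rough way to reproduce the same kind of measurement outside qemu is sketched below. This is only an illustration under stated assumptions: the helper name `time_pwrites`, the block size, and the write count are hypothetical and are not taken from qemu or from the original test. It covers O_SYNC only, because O_DIRECT additionally requires aligned buffers.

```python
import os
import tempfile
import time


def time_pwrites(path, count=64, block_size=4096, extra_flags=os.O_SYNC):
    """Write `count` blocks with pwrite() and return per-call latencies (seconds).

    Hypothetical helper: with O_SYNC each pwrite() returns only after the
    data has been pushed to stable storage, so the latency includes the sync.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | extra_flags, 0o600)
    block = b"\0" * block_size
    latencies = []
    try:
        for i in range(count):
            start = time.perf_counter()
            written = os.pwrite(fd, block, i * block_size)
            latencies.append(time.perf_counter() - start)
            if written != block_size:
                raise OSError("short write")
    finally:
        os.close(fd)
    return latencies


if __name__ == "__main__":
    # The temporary directory lives on whatever filesystem hosts the default
    # temp location; pass a path on the filesystem under test instead.
    with tempfile.TemporaryDirectory() as tmp:
        lat = time_pwrites(os.path.join(tmp, "probe.img"))
        print(f"{len(lat)} writes, mean {sum(lat) / len(lat) * 1e3:.3f} ms")
```

Running the same call against a file on ext3 and a file on ext4 gives two latency lists that can be compared directly. Extending a fresh file on every write is closer to the "fresh qcow2" case than to the rewrite case.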
* [Qemu-devel] Re: slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3)
From: Jan Kara @ 2010-07-20 13:46 UTC
To: Michael Tokarev; +Cc: linux-fsdevel, qemu-devel

  Hi,

On Fri 02-07-10 16:46:28, Michael Tokarev wrote:
> [...]
> So I have two questions:
>
>  1. Why is ext4 O_SYNC so slow compared with ext3 O_SYNC?
>     This is observed on 2.6.32 and 2.6.34 kernels; barriers
>     and data={writeback|ordered} made no difference. I tested
>     the whole thing on a partition on a single drive; sheldonh
>     used ext[34]fs on top of lvm on a raid1 volume.
  Do I get it right that you have ext3/4 carrying the fs images used by
KVM? What you describe is strange. Up to this point it sounded to me like
a difference in barrier settings on the host, but you seem to have tried
that. Just stabbing in the dark: could you try the nodelalloc mount option
of ext4?

>  2. The number of threads spawned for I/O... this is a good
>     question: how to find an adequate cap. Different hw has
>     different capabilities, and we may have more users doing
>     I/O at the same time...
  Maybe you could measure your total throughput over some period,
try increasing the number of threads in the next period, and if it
helps significantly, use the larger number; otherwise go back to the
smaller number?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
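The feedback loop suggested above for the thread cap (measure throughput over a period, raise the thread count, keep the larger value only if it helps significantly) can be sketched as follows. This is a minimal illustration, not how qemu works: `tune_thread_cap`, the `measure_throughput` callback, the step size, and the 10% "significant" threshold are all assumptions made for the example. Only the starting value of 8 and the ceiling of 64 echo numbers mentioned in the thread.

```python
def tune_thread_cap(measure_throughput, start=8, step=8, max_cap=64, min_gain=0.10):
    """Raise the thread cap while each increase improves throughput enough.

    `measure_throughput(cap)` is a hypothetical callback that runs the
    workload for one period with `cap` threads and returns its throughput.
    """
    cap = start
    best = measure_throughput(cap)
    while cap + step <= max_cap:
        candidate = cap + step
        throughput = measure_throughput(candidate)
        if throughput > best * (1.0 + min_gain):
            # Significant improvement: keep the larger number.
            cap, best = candidate, throughput
        else:
            # No significant gain: go back to the smaller number.
            break
    return cap


def synthetic_throughput(threads):
    """Made-up model: scales up to 16 threads, then contention costs a little."""
    return min(threads, 16) * 10 - max(0, threads - 16)


if __name__ == "__main__":
    print(tune_thread_cap(synthetic_throughput))
```

With the made-up model the loop settles on 16 threads: going from 8 to 16 doubles throughput, while going from 16 to 24 slightly reduces it. A real implementation would also need to re-probe periodically, since the load and the number of guests change over time.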
* [Qemu-devel] Re: slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3)
From: Michael Tokarev @ 2010-07-20 14:41 UTC
To: Jan Kara; +Cc: linux-fsdevel, qemu-devel

20.07.2010 16:46, Jan Kara wrote:
> On Fri 02-07-10 16:46:28, Michael Tokarev wrote:
>> [...]
>>  1. Why is ext4 O_SYNC so slow compared with ext3 O_SYNC?
>>     This is observed on 2.6.32 and 2.6.34 kernels; barriers
>>     and data={writeback|ordered} made no difference. I tested
>>     the whole thing on a partition on a single drive; sheldonh
>>     used ext[34]fs on top of lvm on a raid1 volume.
>   Do I get it right that you have ext3/4 carrying the fs images used by
> KVM? What you describe is strange. Up to this point it sounded to me like
> a difference in barrier settings on the host, but you seem to have tried
> that. Just stabbing in the dark: could you try the nodelalloc mount option
> of ext4?

Yes, exactly: a guest filesystem image stored on ext3 or
ext4. And yes, I suspected barriers too, but immediately
ruled that out, since barrier or no barrier does not matter
in this test.

I'll try nodelalloc, but I'm not sure when: right now I'm on
vacation, typing from a hotel, and my home machine with all
the guest images and the like is turned off and, for some
reason, I can't wake it up over ethernet; it seemingly ignores
WOL packets. Too bad I don't have any guest image here on my
notebook.

>>  2. The number of threads spawned for I/O... this is a good
>>     question: how to find an adequate cap. Different hw has
>>     different capabilities, and we may have more users doing
>>     I/O at the same time...
>
>   Maybe you could measure your total throughput over some period,
> try increasing the number of threads in the next period, and if it
> helps significantly, use the larger number; otherwise go back to the
> smaller number?

Well, this is, again, a good question -- it's how qemu works right
now, spawning up to 64 I/O threads for all I/O requests the guests
submit. The slower the I/O, the more threads can be spawned.
Working that part out is a separate, difficult job.

The main question here is why ext4 is so slow for O_[D]SYNC writes.

Besides, a quite similar topic was discussed meanwhile, in a different
thread titled "BTRFS: Unbelievably slow with kvm/qemu" -- see e.g.
http://marc.info/?t=127891236700003&r=1&w=2 . In particular, this
message http://marc.info/?l=linux-kernel&m=127913696420974 shows
a comparison table for a few filesystems and qemu/kvm usage, but on
raw files instead of qcow.

Different qemu/kvm guest fs image options are (partial list):

 raw disk image in a file on host. Either pre-allocated or
  (initially) sparse. The pre-allocated case should, in
  theory, perform equally on all filesystems, while the sparse
  case should differ per filesystem, depending on how different
  filesystems allocate data.

 qcow[2] image in a file on host. This one is never sparse,
  but unlike raw it also contains some qemu-specific metadata,
  like which blocks are allocated and in which place, sorta
  like lvm. Initially it is created empty (with only a header),
  and when the guest performs writes, new blocks are allocated and
  the metadata gets updated. This requires some more writes than
  the guest performs, and quite a few syncs (with O_SYNC they're
  automatic).

Thanks!

/mjt
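The allocate-on-write behaviour described above for qcow[2] explains why a fresh image is more metadata-heavy than a rewrite. A toy model makes the difference countable. This is a deliberately simplified sketch: the class `ToyCowImage` is hypothetical, it charges exactly one metadata write per newly allocated cluster (an assumption), and it ignores the two-level L1/L2 tables and the refcount tables that the real qcow2 format uses.

```python
class ToyCowImage:
    """Toy allocate-on-write image: a mapping table plus appended data clusters."""

    def __init__(self):
        self.table = {}        # guest cluster index -> host cluster index
        self.next_host = 0     # next free host cluster (the file only grows)
        self.data_writes = 0   # writes that carry guest data
        self.meta_writes = 0   # extra writes that update the mapping table

    def write(self, guest_cluster):
        """Simulate one guest write and return the host cluster it lands in."""
        if guest_cluster not in self.table:
            # First touch: allocate a host cluster and record the mapping.
            self.table[guest_cluster] = self.next_host
            self.next_host += 1
            self.meta_writes += 1
        self.data_writes += 1
        return self.table[guest_cluster]


if __name__ == "__main__":
    img = ToyCowImage()
    for cluster in range(100):      # first copy: every cluster is new
        img.write(cluster)
    first = (img.data_writes, img.meta_writes)
    for cluster in range(100):      # rewrite: mappings already exist
        img.write(cluster)
    print("first pass:", first)
    print("after rewrite:", (img.data_writes, img.meta_writes))
```

In this model the first pass issues as many metadata writes as data writes, and the rewrite issues none. With O_SYNC every one of those writes is synchronous, which is consistent with the fresh-image case being the slow one.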
* [Qemu-devel] Re: slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3)
From: Jan Kara @ 2010-07-20 15:59 UTC
To: Michael Tokarev; +Cc: linux-fsdevel, Jan Kara, qemu-devel

On Tue 20-07-10 17:41:33, Michael Tokarev wrote:
> [...]
> Well, this is, again, a good question -- it's how qemu works right
> now, spawning up to 64 I/O threads for all I/O requests the guests
> submit. The slower the I/O, the more threads can be spawned.
> Working that part out is a separate, difficult job.
>
> The main question here is why ext4 is so slow for O_[D]SYNC writes.
  Yes.

> Besides, a quite similar topic was discussed meanwhile, in a different
> thread titled "BTRFS: Unbelievably slow with kvm/qemu" -- see e.g.
> http://marc.info/?t=127891236700003&r=1&w=2 . In particular, this
> message http://marc.info/?l=linux-kernel&m=127913696420974 shows
> a comparison table for a few filesystems and qemu/kvm usage, but on
> raw files instead of qcow.
  Thanks for the pointer. But in the comparison Christoph did, ext4 came
out slightly faster than ext3 when the barrier options were equivalent,
which is what I would expect... So what is the difference?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR