From: RW <kvm@tauceti.net>
To: kvm@vger.kernel.org
Subject: Re: I/O Performance Tips
Date: Thu, 09 Dec 2010 09:55:35 +0100
Message-ID: <4D009987.6050705@tauceti.net>
In-Reply-To: <1291882216.21240.124.camel@nick-desktop>

We don't use Ubuntu (we use Gentoo with KVM 0.12.5), but we had a
similar problem with kernels older than 2.6.32-r11 (this is
Gentoo-specific and means update -r11, not release candidate 11) and
with the first three releases of 2.6.34, especially when NFS was
involved. We're currently running 2.6.32-r11 and 2.6.32-r20 kernels
without these problems. We use the deadline I/O scheduler on the host
and in the guests. The VM images are stored on an LVM volume. Maybe a
kernel update will help (if one is available, of course). We also run
all VMs with "cache=none".
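
To compare scheduler settings between host and guests, something like
the following may help. It is only a minimal sketch: it assumes the
usual sysfs layout, where /sys/block/<device>/queue/scheduler lists the
available schedulers and marks the active one in square brackets. The
function names are made up for illustration.

```python
from pathlib import Path


def active_scheduler(scheduler_text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def report(sys_block="/sys/block"):
    """Print the active I/O scheduler of every block device that exposes one."""
    for path in sorted(Path(sys_block).glob("*/queue/scheduler")):
        device = path.parent.parent.name
        print(device, active_scheduler(path.read_text()))
```

Calling report() on the host and inside each guest shows whether all of
them really use the same scheduler.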

You should also send the qemu-kvm version and the options you're using
to the list. Maybe someone else can help further.
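
If it helps with collecting those options, here is a rough sketch that
pulls the "file" and "cache" settings out of the -drive arguments of a
running qemu-kvm process. It assumes the comma-separated key=value form
of -drive and ignores escaped commas (",,") in file names. The helper
names and the example paths are hypothetical.

```python
def read_cmdline(pid):
    """Return the argument list of a process from /proc/<pid>/cmdline."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        raw = f.read()
    # Arguments are separated by NUL bytes.
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]


def drive_cache_modes(argv):
    """Return (file, cache) for each -drive option; cache is None if not set."""
    results = []
    for flag, value in zip(argv, argv[1:]):
        if flag != "-drive":
            continue
        # Assumption: plain key=value pairs separated by single commas.
        opts = dict(item.split("=", 1) for item in value.split(",") if "=" in item)
        results.append((opts.get("file"), opts.get("cache")))
    return results


# Hypothetical command line, for illustration only.
example = [
    "qemu-kvm", "-m", "2048",
    "-drive", "file=/dev/vg0/guest1,if=virtio,cache=none",
    "-drive", "file=/dev/vg0/guest2,if=virtio",
]
print(drive_cache_modes(example))
```

A drive that reports None for cache has no explicit cache= setting and
runs with whatever default that qemu-kvm version uses.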

- Robert


On 12/09/10 09:10, Sebastian Nickel - Hetzner Online AG wrote:
> Hello,
> we have some issues with I/O in our KVM environment. We are using
> kernel version 2.6.32 (Ubuntu 10.04 LTS) to virtualise our hosts and we
> are using KSM, too. Recently we noticed that sometimes the guest systems
> (mainly OpenSuse guest systems) suddenly have a read-only filesystem.
> After some inspection we found out that the guest system generates some
> ata errors due to timeouts (mostly in "flush cache" situations). On the
> physical host there are always the same kernel messages when this
> happens:
>
> """
> [1508127.195469] INFO: task kjournald:497 blocked for more than 120 seconds.
> [1508127.212828] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [1508127.246841] kjournald     D 00000000ffffffff     0   497      2 0x00000000
> [1508127.246848]  ffff88062128dba0 0000000000000046 0000000000015bc0 0000000000015bc0
> [1508127.246855]  ffff880621089ab0 ffff88062128dfd8 0000000000015bc0 ffff8806210896f0
> [1508127.246862]  0000000000015bc0 ffff88062128dfd8 0000000000015bc0 ffff880621089ab0
> [1508127.246868] Call Trace:
> [1508127.246880]  [<ffffffff8116e500>] ? sync_buffer+0x0/0x50
> [1508127.246889]  [<ffffffff81557d87>] io_schedule+0x47/0x70
> [1508127.246893]  [<ffffffff8116e545>] sync_buffer+0x45/0x50
> [1508127.246897]  [<ffffffff8155825a>] __wait_on_bit_lock+0x5a/0xc0
> [1508127.246901]  [<ffffffff8116e500>] ? sync_buffer+0x0/0x50
> [1508127.246905]  [<ffffffff81558338>] out_of_line_wait_on_bit_lock+0x78/0x90
> [1508127.246911]  [<ffffffff810850d0>] ? wake_bit_function+0x0/0x40
> [1508127.246915]  [<ffffffff8116e6c6>] __lock_buffer+0x36/0x40
> [1508127.246920]  [<ffffffff81213d11>] journal_submit_data_buffers+0x311/0x320
> [1508127.246924]  [<ffffffff81213ff2>] journal_commit_transaction+0x2d2/0xe40
> [1508127.246931]  [<ffffffff810397a9>] ? default_spin_lock_flags+0x9/0x10
> [1508127.246935]  [<ffffffff81076c7c>] ? lock_timer_base+0x3c/0x70
> [1508127.246939]  [<ffffffff81077719>] ? try_to_del_timer_sync+0x79/0xd0
> [1508127.246943]  [<ffffffff81217f0d>] kjournald+0xed/0x250
> [1508127.246947]  [<ffffffff81085090>] ? autoremove_wake_function+0x0/0x40
> [1508127.246951]  [<ffffffff81217e20>] ? kjournald+0x0/0x250
> [1508127.246954]  [<ffffffff81084d16>] kthread+0x96/0xa0
> [1508127.246959]  [<ffffffff810141ea>] child_rip+0xa/0x20
> [1508127.246962]  [<ffffffff81084c80>] ? kthread+0x0/0xa0
> [1508127.246966]  [<ffffffff810141e0>] ? child_rip+0x0/0x20
> [1508127.246969] INFO: task flush-251:0:505 blocked for more than 120 seconds.
> [1508127.264076] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [1508127.298343] flush-251:0   D ffffffff810f4370     0   505      2 0x00000000
> [1508127.298349]  ffff880621fdba30 0000000000000046 0000000000015bc0 0000000000015bc0
> [1508127.298354]  ffff88062108b1a0 ffff880621fdbfd8 0000000000015bc0 ffff88062108ade0
> [1508127.298358]  0000000000015bc0 ffff880621fdbfd8 0000000000015bc0 ffff88062108b1a0
> [1508127.298362] Call Trace:
> [1508127.298370]  [<ffffffff810f4370>] ? sync_page+0x0/0x50
> [1508127.298375]  [<ffffffff81557d87>] io_schedule+0x47/0x70
> [1508127.298379]  [<ffffffff810f43ad>] sync_page+0x3d/0x50
> [1508127.298383]  [<ffffffff8155825a>] __wait_on_bit_lock+0x5a/0xc0
> [1508127.298391]  [<ffffffff810f4347>] __lock_page+0x67/0x70
> [1508127.298395]  [<ffffffff810850d0>] ? wake_bit_function+0x0/0x40
> [1508127.298402]  [<ffffffff810f4487>] ? unlock_page+0x27/0x30
> [1508127.298410]  [<ffffffff810fd9dd>] write_cache_pages+0x3bd/0x4d0
> [1508127.298417]  [<ffffffff810fc670>] ? __writepage+0x0/0x40
> [1508127.298425]  [<ffffffff810fdb14>] generic_writepages+0x24/0x30
> [1508127.298432]  [<ffffffff810fdb55>] do_writepages+0x35/0x40
> [1508127.298439]  [<ffffffff811668a6>] writeback_single_inode+0xf6/0x3d0
> [1508127.298449]  [<ffffffff812b81d6>] ? rb_erase+0xd6/0x160
> [1508127.298455]  [<ffffffff8116750e>] writeback_inodes_wb+0x40e/0x5e0
> [1508127.298462]  [<ffffffff811677ea>] wb_writeback+0x10a/0x1d0
> [1508127.298469]  [<ffffffff81077719>] ? try_to_del_timer_sync+0x79/0xd0
> [1508127.298477]  [<ffffffff8155803d>] ? schedule_timeout+0x19d/0x300
> [1508127.298485]  [<ffffffff81167b1c>] wb_do_writeback+0x18c/0x1a0
> [1508127.298493]  [<ffffffff81167b83>] bdi_writeback_task+0x53/0xe0
> [1508127.298503]  [<ffffffff8110f546>] bdi_start_fn+0x86/0x100
> [1508127.298510]  [<ffffffff8110f4c0>] ? bdi_start_fn+0x0/0x100
> [1508127.298518]  [<ffffffff81084d16>] kthread+0x96/0xa0
> [1508127.298526]  [<ffffffff810141ea>] child_rip+0xa/0x20
> [1508127.298533]  [<ffffffff81084c80>] ? kthread+0x0/0xa0
> [1508127.298541]  [<ffffffff810141e0>] ? child_rip+0x0/0x20
> ... some more messages like those before...
> """
> Sometimes the message "INFO: task kvm blocked for more than 120 seconds"
> appears, too. I thought the error happens during cache writeback,
> so I set "/proc/sys/vm/dirty_background_ratio" to 5 and
> "/proc/sys/vm/dirty_ratio" to 40. I expected this to write cached
> data to disk continuously in smaller chunks (more often, but less at
> a time). This did not help. The guest systems still end up with
> read-only filesystems. Does anybody have tips for regulating I/O on
> Linux systems or for stopping those read-only filesystems?
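
To get a feel for what those two ratios mean in bytes, a small
calculation can help. This is a simplified sketch: it treats both values
as plain percentages of "available" memory (roughly free plus
reclaimable pages), which is how the kernel documentation describes
them, and ignores everything else the kernel takes into account. The
function name and the 24 GiB figure are assumptions for illustration,
not values from this thread.

```python
def dirty_thresholds(available_bytes, background_ratio, dirty_ratio):
    """Approximate byte thresholds for the two vm.dirty_* percentages.

    background: amount of dirty data at which background writeback starts.
    hard: amount of dirty data at which writing processes must flush themselves.
    """
    background = available_bytes * background_ratio // 100
    hard = available_bytes * dirty_ratio // 100
    return background, hard


# Hypothetical host with 24 GiB of available memory and the ratios 5 and 40.
GIB = 1024 ** 3
background, hard = dirty_thresholds(24 * GIB, 5, 40)
print(f"background writeback starts near {background / GIB:.1f} GiB")
print(f"writers are throttled near {hard / GIB:.1f} GiB")
```

With these assumed numbers, several GiB of dirty data can build up
between the two thresholds before writers are throttled, so a single
flush can still be large.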
>
> I tried the "cfq" scheduler and set ionice for the kvm processes
> (best-effort class, all kvm processes in the same class). I thought of
> cgroups, but I could not find any I/O related properties to set.
>
> We are using logical volumes as virtual guest HDDs. The volume group is
> on top of a mdraid device.
>
> Thank you for your help...
>
>
> Best Regards
>
> Sebastian
>

Thread overview: 5+ messages
2010-12-09  8:10 I/O Performance Tips Sebastian Nickel - Hetzner Online AG
2010-12-09  8:55 ` RW [this message]
2010-12-09 10:30 ` Stefan Hajnoczi
2010-12-09 12:52   ` Sebastian Nickel - Hetzner Online AG
2010-12-09 14:51     ` Stefan Hajnoczi
