From mboxrd@z Thu Jan 1 00:00:00 1970
From: RW
Subject: Re: I/O Performance Tips
Date: Thu, 09 Dec 2010 09:55:35 +0100
Message-ID: <4D009987.6050705@tauceti.net>
References: <1291882216.21240.124.camel@nick-desktop>
Reply-To: kvm@tauceti.net
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
To: kvm@vger.kernel.org
Return-path:
Received: from tauceti.net ([62.245.250.166]:41061 "EHLO www.tauceti.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754856Ab0LIJFW (ORCPT ); Thu, 9 Dec 2010 04:05:22 -0500
Received: from [127.0.0.1] (unknown [10.0.0.8]) by www.tauceti.net (Postfix) with ESMTP id DE0D3859D40 for ; Thu, 9 Dec 2010 09:55:33 +0100 (CET)
In-Reply-To: <1291882216.21240.124.camel@nick-desktop>
Sender: kvm-owner@vger.kernel.org
List-ID:

We don't use Ubuntu (we use Gentoo/KVM 0.12.5), but we had a similar
problem with kernels <2.6.32-r11 (the suffix is Gentoo-specific and means
update -r11, not release candidate 11) and with the first three releases
of 2.6.34, especially when NFS was involved. We're currently running
2.6.32-r11 and 2.6.32-r20 kernels without these problems. We're using the
deadline scheduler in host and guests. The VM images are stored on an LVM
volume. Maybe a kernel update will help (if available, of course...).
We're also running all VMs with "cache=none".

You should also send the qemu-kvm options and the version you're using
to the list. Maybe someone else can help further.

- Robert

On 12/09/10 09:10, Sebastian Nickel - Hetzner Online AG wrote:
> Hello,
> we have got some issues with I/O in our kvm environment. We are using
> kernel version 2.6.32 (Ubuntu 10.04 LTS) to virtualise our hosts and we
> are using ksm, too. Recently we noticed that sometimes the guest systems
> (mainly OpenSuse guest systems) suddenly have a read-only filesystem.
> After some inspection we found out that the guest system generates some
> ata errors due to timeouts (mostly in "flush cache" situations). On the
> physical host there are always the same kernel messages when this
> happens:
>
> """
> [1508127.195469] INFO: task kjournald:497 blocked for more than 120 seconds.
> [1508127.212828] "echo 0> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [1508127.246841] kjournald D 00000000ffffffff 0 497 2 0x00000000
> [1508127.246848] ffff88062128dba0 0000000000000046 0000000000015bc0 0000000000015bc0
> [1508127.246855] ffff880621089ab0 ffff88062128dfd8 0000000000015bc0 ffff8806210896f0
> [1508127.246862] 0000000000015bc0 ffff88062128dfd8 0000000000015bc0 ffff880621089ab0
> [1508127.246868] Call Trace:
> [1508127.246880] [] ? sync_buffer+0x0/0x50
> [1508127.246889] [] io_schedule+0x47/0x70
> [1508127.246893] [] sync_buffer+0x45/0x50
> [1508127.246897] [] __wait_on_bit_lock+0x5a/0xc0
> [1508127.246901] [] ? sync_buffer+0x0/0x50
> [1508127.246905] [] out_of_line_wait_on_bit_lock+0x78/0x90
> [1508127.246911] [] ? wake_bit_function+0x0/0x40
> [1508127.246915] [] __lock_buffer+0x36/0x40
> [1508127.246920] [] journal_submit_data_buffers+0x311/0x320
> [1508127.246924] [] journal_commit_transaction+0x2d2/0xe40
> [1508127.246931] [] ? default_spin_lock_flags+0x9/0x10
> [1508127.246935] [] ? lock_timer_base+0x3c/0x70
> [1508127.246939] [] ? try_to_del_timer_sync+0x79/0xd0
> [1508127.246943] [] kjournald+0xed/0x250
> [1508127.246947] [] ? autoremove_wake_function+0x0/0x40
> [1508127.246951] [] ? kjournald+0x0/0x250
> [1508127.246954] [] kthread+0x96/0xa0
> [1508127.246959] [] child_rip+0xa/0x20
> [1508127.246962] [] ? kthread+0x0/0xa0
> [1508127.246966] [] ? child_rip+0x0/0x20
> [1508127.246969] INFO: task flush-251:0:505 blocked for more than 120 seconds.
> [1508127.264076] "echo 0> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [1508127.298343] flush-251:0 D ffffffff810f4370 0 505 2 0x00000000
> [1508127.298349] ffff880621fdba30 0000000000000046 0000000000015bc0 0000000000015bc0
> [1508127.298354] ffff88062108b1a0 ffff880621fdbfd8 0000000000015bc0 ffff88062108ade0
> [1508127.298358] 0000000000015bc0 ffff880621fdbfd8 0000000000015bc0 ffff88062108b1a0
> [1508127.298362] Call Trace:
> [1508127.298370] [] ? sync_page+0x0/0x50
> [1508127.298375] [] io_schedule+0x47/0x70
> [1508127.298379] [] sync_page+0x3d/0x50
> [1508127.298383] [] __wait_on_bit_lock+0x5a/0xc0
> [1508127.298391] [] __lock_page+0x67/0x70
> [1508127.298395] [] ? wake_bit_function+0x0/0x40
> [1508127.298402] [] ? unlock_page+0x27/0x30
> [1508127.298410] [] write_cache_pages+0x3bd/0x4d0
> [1508127.298417] [] ? __writepage+0x0/0x40
> [1508127.298425] [] generic_writepages+0x24/0x30
> [1508127.298432] [] do_writepages+0x35/0x40
> [1508127.298439] [] writeback_single_inode+0xf6/0x3d0
> [1508127.298449] [] ? rb_erase+0xd6/0x160
> [1508127.298455] [] writeback_inodes_wb+0x40e/0x5e0
> [1508127.298462] [] wb_writeback+0x10a/0x1d0
> [1508127.298469] [] ? try_to_del_timer_sync+0x79/0xd0
> [1508127.298477] [] ? schedule_timeout+0x19d/0x300
> [1508127.298485] [] wb_do_writeback+0x18c/0x1a0
> [1508127.298493] [] bdi_writeback_task+0x53/0xe0
> [1508127.298503] [] bdi_start_fn+0x86/0x100
> [1508127.298510] [] ? bdi_start_fn+0x0/0x100
> [1508127.298518] [] kthread+0x96/0xa0
> [1508127.298526] [] child_rip+0xa/0x20
> [1508127.298533] [] ? kthread+0x0/0xa0
> [1508127.298541] [] ? child_rip+0x0/0x20
> ... some more messages like those before...
> """
> Sometimes the message "INFO: task kvm blocked for more than 120 seconds"
> appears, too. I thought the error happens in cache writeback situations,
> so I started to adjust "/proc/sys/vm/dirty_background_ratio" to 5 and
> "/proc/sys/vm/dirty_ratio" to 40. I thought this would continuously write
> smaller parts of cached memory to HDD (more often, but smaller chunks).
> This did not help.
> There are still "readonly" filesystems in the guest
> systems. Does anybody have some tips to regulate I/O on Linux systems or
> to stop those "readonly" filesystems?
>
> I tried the "cfq" scheduler and did some ionice (best-effort class,
> all kvm processes in the same class). I thought of cgroups, but I could
> not find any I/O related properties to set.
>
> We are using logical volumes as virtual guest HDDs. The volume group is
> on top of an mdraid device.
>
> Thank you for your help...
>
>
> Best Regards
>
> Sebastian
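
When comparing a working and a failing host, it can help to record the
settings discussed in this thread (the active I/O scheduler per block
device and the two dirty-page ratios) in one place. The following is only
a minimal sketch and not something posted in the original thread: the
helper names active_scheduler, read_int and report are made up for
illustration, and it assumes the usual sysfs layout where
/sys/block/<dev>/queue/scheduler lists the available schedulers with the
active one in square brackets.

```python
import glob


def active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_int(path):
    """Read a single integer from a proc/sysfs file; return None on any failure."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def report():
    """Print the active scheduler per block device and the dirty-page ratios."""
    for path in sorted(glob.glob("/sys/block/*/queue/scheduler")):
        device = path.split("/")[3]  # path is /sys/block/<device>/queue/scheduler
        try:
            with open(path) as handle:
                print(device, active_scheduler(handle.read()))
        except OSError:
            print(device, "unreadable")
    for name in ("dirty_background_ratio", "dirty_ratio"):
        print(name, read_int("/proc/sys/vm/" + name))


if __name__ == "__main__":
    report()
```

Running it on the host and inside a guest makes it easy to attach the
same set of values to a report to the list, next to the qemu-kvm options
and version.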