From mboxrd@z Thu Jan  1 00:00:00 1970
From: Michael Tokarev <mjt@tls.msk.ru>
Subject: Re: writes to a virtio block device hungs
Date: Fri, 26 Sep 2008 12:28:46 +0400
Message-ID: <48DC9D3E.2020106@msgid.tls.msk.ru>
References: <48D76286.5090203@msgid.tls.msk.ru> <48D89563.5050502@msgid.tls.msk.ru> <20080925230251.GB22929@dmt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org
To: Marcelo Tosatti <mtosatti@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from hobbit.corpit.ru ([81.13.33.150]:22940 "EHLO hobbit.corpit.ru"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752732AbYIZI2s (ORCPT <rfc822;kvm@vger.kernel.org>);
	Fri, 26 Sep 2008 04:28:48 -0400
In-Reply-To: <20080925230251.GB22929@dmt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Marcelo Tosatti wrote:
> On Tue, Sep 23, 2008 at 11:06:11AM +0400, Michael Tokarev wrote:
>>> (both host and guests are linux machines), I placed
>>> one virtual machine into production use, and almost
>>> immediately come... issues.  Here's how it looks like
>>> from the guest:
>>>
>>> Sep 21 10:35:52 hobbit kernel: INFO: task cleanup:20535 blocked for more than 120 seconds.
>>> Sep 21 10:35:52 hobbit kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> Sep 21 10:35:52 hobbit kernel: cleanup       D 00000000     0 20535   1570
>>> Sep 21 10:35:52 hobbit kernel:        f73b39c0 00200086 00000000 00000000 c3a2ba48 00000000 f7022e00 00000000
>>> Sep 21 10:35:52 hobbit kernel:        dbc48ed4 f789c000 c0399080 c0157e48 0000000e 00000000 d05e1b80 d05e1ce4
>>> Sep 21 10:35:52 hobbit kernel:        00000002 00200286 c01322f7 d05e1ce4 c0131ef0 dbc48ec8 00200286 c0132486
>>> Sep 21 10:35:52 hobbit kernel: Call Trace:
>>> Sep 21 10:35:52 hobbit kernel:  [<c0157e48>] find_get_pages_tag+0x38/0x80
>>> Sep 21 10:35:52 hobbit kernel:  [<c01322f7>] lock_timer_base+0x27/0x60
>>> Sep 21 10:35:52 hobbit kernel:  [<c0131ef0>] process_timeout+0x0/0x10
>>> Sep 21 10:35:52 hobbit kernel:  [<c0132486>] __mod_timer+0x86/0xa0
>>> Sep 21 10:35:52 hobbit kernel:  [<c02c6408>] schedule_timeout+0x58/0xb0
>>> Sep 21 10:35:52 hobbit kernel:  [<c0131ef0>] process_timeout+0x0/0x10
>>> Sep 21 10:35:52 hobbit kernel:  [<f882db04>] journal_stop+0xa4/0x1b0 [jbd]
>>> Sep 21 10:35:52 hobbit kernel:  [<f882ece8>] journal_start+0x88/0xc0 [jbd]
>>> Sep 21 10:35:52 hobbit kernel:  [<f8860f20>] ext3_write_inode+0x0/0x40 [ext3]
>>> Sep 21 10:35:52 hobbit kernel:  [<f8860f20>] ext3_write_inode+0x0/0x40 [ext3]
>>> Sep 21 10:35:52 hobbit kernel:  [<c019d002>] __writeback_single_inode+0x282/0x390
>>> Sep 21 10:35:52 hobbit kernel:  [<c015f3c0>] generic_writepages+0x20/0x30
>>> Sep 21 10:35:52 hobbit kernel:  [<c015f419>] do_writepages+0x49/0x50
>>> Sep 21 10:35:52 hobbit kernel:  [<c0159151>] __filemap_fdatawrite_range+0x71/0x90
>>> Sep 21 10:35:52 hobbit kernel:  [<c019d131>] sync_inode+0x21/0x40
>>> Sep 21 10:35:52 hobbit kernel:  [<f885f88e>] ext3_sync_file+0x9e/0xc0 [ext3]
>>> Sep 21 10:35:52 hobbit kernel:  [<c01a065e>] do_fsync+0x6e/0xb0
>>> Sep 21 10:35:52 hobbit kernel:  [<c01a06c7>] __do_fsync+0x27/0x50
>>> Sep 21 10:35:52 hobbit kernel:  [<c01032f3>] sysenter_past_esp+0x78/0xb1
>>> Sep 21 10:35:52 hobbit kernel:  =======================
>>>
>>> It's almost always after fsync, but I guess it's due to the fact that
>>> cleanup (from Postfix) process is the one who does that most often.
>>>
>> I'm waiting for opportunity to install a new kernel with new kvm...
>> in a hope still.

Meanwhile I installed kvm-75, which did NOT change anything: the system
still hangs.  What really changed things was switching the guest to a single
processor (it had 2 before, on a 4-core Phenom host).

> Are you using ext3 in the host as the filesystem to back the guest
> image? If so, try writeback instead of ordered mode:

On the host there's an MD device (raid1) that holds the complete "raw" disk
image for the guest.  It was in my email:

 >> The device in question is a virtio block device (vda), which is on top
 >> of a raid1 device on the host (/dev/md_d5, partitioned).  [...]

I'm trying to set up a test system to debug the case further,
because it's impossible to do that on the production machine.
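
A minimal sketch of a load generator for such a test system, assuming
that concurrent small writes, each followed by fsync() (roughly what
Postfix's cleanup does), are what trigger the hang.  That is only a guess
from the trace above; the function names, file names, record size and
worker count are all made up for illustration and are not taken from
the production setup.

```python
import os
import threading
import time


def fsync_worker(directory, worker_id, iterations, worst):
    """Append small records to one file, calling fsync() after each write.

    Records the slowest fsync() seen by this worker in worst[worker_id].
    """
    # Hypothetical file name; one scratch file per worker.
    path = os.path.join(directory, "fsync-test-%d.dat" % worker_id)
    slowest = 0.0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for _ in range(iterations):
            os.write(fd, b"x" * 4096)      # assumed record size
            start = time.monotonic()
            os.fsync(fd)                   # the call the guest trace blocks in
            slowest = max(slowest, time.monotonic() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    worst[worker_id] = slowest


def run_fsync_load(directory, workers=4, iterations=1000):
    """Run several fsync-heavy writers at once.

    Returns a dict mapping worker id to its slowest fsync() in seconds.
    os.fsync() releases the GIL, so the threads really do wait on the
    disk concurrently.
    """
    worst = {}
    threads = [
        threading.Thread(target=fsync_worker,
                         args=(directory, i, iterations, worst))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return worst
```

The directory passed to run_fsync_load() should sit on the virtio disk
inside the guest.  If the hang reproduces, the call simply never returns,
and the guest should log the same "blocked for more than 120 seconds"
message; otherwise the returned values show how long the slowest fsync()
took for each writer.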

/mjt