public inbox for kvm@vger.kernel.org
From: Antoine Martin <antoine@nagafix.co.uk>
To: Michael Tokarev <mjt@tls.msk.ru>
Cc: kvm@vger.kernel.org
Subject: Re: repeatable hang with loop mount and heavy IO in guest (now in host - not KVM then..)
Date: Fri, 21 May 2010 16:38:41 +0700	[thread overview]
Message-ID: <4BF654A1.6080700@nagafix.co.uk> (raw)
In-Reply-To: <4B880731.5090500@nagafix.co.uk>

On 02/27/2010 12:38 AM, Antoine Martin wrote:
>>>   1   0   0  98   0   1|   0     0 |  66B  354B|   0     0 |  30    11
>>>   1   1   0  98   0   0|   0     0 |  66B  354B|   0     0 |  29    11
>>> From that point onwards, nothing will happen.
>>> The host has disk IO to spare... So what is it waiting for??
>> Moved to an AMD64 host. No effect.
>> Disabled swap before running the test. No effect.
>> Moved the guest to a fully up-to-date FC12 server 
>> (2.6.31.6-145.fc12.x86_64), no effect.
> I have narrowed it down to the guest's filesystem used for backing the 
> disk image which is loop mounted: although it was not completely full 
> (and had enough inodes), freeing some space on it prevents the system 
> from misbehaving.
>
> FYI: the disk image was clean and was fscked before each test. kvm had 
> been updated to 0.12.3
> The weird thing is that the same filesystem works fine (no system 
> hang) if used directly from the host, it is only misbehaving via kvm...
>
> So I am not dismissing the possibility that kvm may be at least partly 
> to blame, or that it is exposing a filesystem bug (race?) not normally 
> encountered.
> (I have backed up the full 32GB virtual disk in case someone suggests 
> further investigation)
Well, well. I've just hit the exact same bug on another *host* (not a 
guest), running stock Fedora 12.
So this isn't a kvm bug after all - it is definitely a loop+ext(4?) bug.
It looks like you need a pretty big loop-mounted partition to trigger it 
(bigger than the available RAM?).
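For what it's worth, a quick way to test that hunch on a given box is to 
compare the image size against MemTotal. A hedged sketch - the helper name 
and the tiny demo file are made up for illustration, not part of the 
original reproduction:

```shell
# Hypothetical helper: does a loop image exceed physical RAM?
# (Just a sanity check for the "bigger than available RAM?" guess.)
larger_than_ram() {
    img_kb=$(( $(stat -c %s "$1") / 1024 ))                 # image size in KB
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)   # RAM in KB
    [ "$img_kb" -gt "$mem_kb" ]
}

truncate -s 1M demo.dd    # tiny stand-in image, just for the demo
larger_than_ram demo.dd && echo "bigger than RAM" || echo "fits in RAM"
rm -f demo.dd
```

On the 8GB image from the test below, a box with 8GB of RAM sits right at 
the boundary, which may be why the hang is hit-or-miss on smaller images.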

This is what triggered it on a quad-core AMD system with 8GB of RAM, on a 
software RAID-1 partition:
mount -o loop 2GB.dd source
dd if=/dev/zero of=8GB.dd bs=1048576 count=8192
mkfs.ext4 -F 8GB.dd
mount -o loop 8GB.dd dest
rsync -rplogtD source/* dest/
umount source
umount dest
^ this is where it hangs. I then tried to issue a 'sync' from another 
terminal, which also hung.
It took more than 10 minutes for the system to settle; during that time 
one CPU was stuck in I/O wait.
dstat reported almost no IO at the time (<1MB/s).
I assume dstat reports page writeback like any other disk IO?
That RAID partition does ~60MB/s, so writing back 8GB should take under 
3 minutes, not 10. (And that is assuming it would have to write back the 
whole 8GB at umount time - which should not be the case.)
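A back-of-the-envelope check of that claim, assuming the worst case of 
flushing the entire 8GB image at the ~60MB/s the array sustains:

```shell
# Worst case: write back the whole 8GB image at umount time.
dirty_mb=$((8 * 1024))            # 8GB of dirty pages, in MB
rate_mb_s=60                      # measured RAID-1 throughput
secs=$((dirty_mb / rate_mb_s))
echo "expected writeback: ${secs}s (~$((secs / 60)) min)"
# prints: expected writeback: 136s (~2 min)
```

So even the most pessimistic estimate is well under the 10+ minutes 
observed, which points at the writeback path stalling rather than simply 
being slow.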

Cheers
Antoine

Here's the hung trace:
INFO: task umount:526 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount        D 0000000000000002     0   526  32488 0x00000000
  ffff880140f9fc88 0000000000000086 ffff880008e3c228 ffffffff810d5fd9
  ffff880140f9fc28 ffff880140f9fcd8 ffff880140f9ffd8 ffff880140f9ffd8
  ffff88021b5e03d8 000000000000f980 0000000000015740 ffff88021b5e03d8
Call Trace:
  [<ffffffff810d5fd9>] ? sync_page+0x0/0x4a
  [<ffffffff81046fbd>] ? __enqueue_entity+0x7b/0x7d
  [<ffffffff8113a047>] ? bdi_sched_wait+0x0/0x12
  [<ffffffff8113a055>] bdi_sched_wait+0xe/0x12
  [<ffffffff814549f0>] __wait_on_bit+0x48/0x7b
  [<ffffffff8102649f>] ? native_smp_send_reschedule+0x5c/0x5e
  [<ffffffff81454a91>] out_of_line_wait_on_bit+0x6e/0x79
  [<ffffffff8113a047>] ? bdi_sched_wait+0x0/0x12
  [<ffffffff810748dc>] ? wake_bit_function+0x0/0x33
  [<ffffffff8113ad0b>] wait_on_bit.clone.1+0x1e/0x20
  [<ffffffff8113ad71>] bdi_sync_writeback+0x64/0x6b
  [<ffffffff8113ad9a>] sync_inodes_sb+0x22/0xec
  [<ffffffff8113e547>] __sync_filesystem+0x4e/0x77
  [<ffffffff8113e71d>] sync_filesystem+0x4b/0x4f
  [<ffffffff8111d6d9>] generic_shutdown_super+0x27/0xc9
  [<ffffffff8111d7a2>] kill_block_super+0x27/0x3f
  [<ffffffff8111ded7>] deactivate_super+0x56/0x6b
  [<ffffffff81134262>] mntput_no_expire+0xb4/0xec
  [<ffffffff8113482a>] sys_umount+0x2d5/0x304
  [<ffffffff81458133>] ? do_page_fault+0x270/0x2a0
  [<ffffffff81011d32>] system_call_fastpath+0x16/0x1b
INFO: task sync:741 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sync          D 0000000000000000     0   741    552 0x00000000
  ffff88003c31fed8 0000000000000086 0000000000000000 ffff8801b7472ec0
  ffff88003c31fe38 0000000000000246 ffff88003c31ffd8 ffff88003c31ffd8
  ffff8801b7473298 000000000000f980 0000000000015740 ffff8801b7473298
Call Trace:
  [<ffffffff81455a2e>] __down_read+0x92/0xaa
  [<ffffffff814550e1>] down_read+0x31/0x35
  [<ffffffff8113e5f6>] sync_filesystems+0x86/0xf6
  [<ffffffff8113e6b6>] sys_sync+0x17/0x33
  [<ffffffff81011d32>] system_call_fastpath+0x16/0x1b



Thread overview: 13+ messages
2010-01-21 17:26 repeatable hang with loop mount and heavy IO in guest Antoine Martin
2010-01-21 20:08 ` RW
2010-01-21 21:08   ` Thomas Beinicke
2010-01-21 21:36     ` RW
2010-01-22  7:57 ` Michael Tokarev
2010-01-22 18:28   ` repeatable hang with loop mount and heavy IO in guest [SOLVED] Antoine Martin
2010-01-22 19:15     ` repeatable hang with loop mount and heavy IO in guest [NOT SOLVED] Antoine Martin
2010-01-24 11:23       ` Antoine Martin
2010-02-03 19:28       ` Antoine Martin
2010-02-26 17:38         ` repeatable hang with loop mount and heavy IO in guest Antoine Martin
2010-05-21  9:38           ` Antoine Martin [this message]
2010-05-22 18:10             ` repeatable hang with loop mount and heavy IO in guest (now in host - not KVM then..) Jim Paris
2010-05-22 19:33               ` Antoine Martin
