From: Sebastian Hetze <s.hetze@linux-ag.com>
To: Avi Kivity <avi@redhat.com>
Cc: Sebastian Hetze <s.hetze@linux-ag.com>, kvm@vger.kernel.org
Subject: Re: syscall rmdir hangs with autofs
Date: Mon, 19 Jul 2010 14:48:46 +0200
Message-ID: <20100719124847.0CDBBA005F@mail.linux-ag.de>
In-Reply-To: <4C444358.8010500@redhat.com>

On Mon, Jul 19, 2010 at 03:21:44PM +0300, Avi Kivity wrote:
> On 07/19/2010 02:40 PM, Sebastian Hetze wrote:
>> On Mon, Jul 19, 2010 at 02:12:59PM +0300, Avi Kivity wrote:
>>
>>> On 07/19/2010 11:39 AM, Sebastian Hetze wrote:
>>>
>>>> Hi *,
>>>>
>>>> we are encountering occasional problems with autofs running inside
>>>> a KVM guest.
>>>>
>>>> [1387441.969106] INFO: task automount:26560 blocked for more than 120 seconds.
>>>> [1387441.969110] "echo 0> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> [1387441.969112] automount D e8510198 0 26560 2702 0x00000000
>>>> [1387441.969117] db0a1ef4 00000082 80000000 e8510198 0004ed69 c8266000 f6e85a40 00000000
>>>> [1387441.969123] c08455e0 c08455e0 f41157f0 f4115a88 c55315e0 00000000 c0207c0a db0a1ef0
>>>> [1387441.969128] f4115a88 f7222bbc f7222bb8 ffffffff db0a1f20 c05976ae db0a1f14 f41157f0
>>>> [1387441.969133] Call Trace:
>>>> [1387441.969140] [<c0207c0a>] ? mntput_no_expire+0x1a/0xd0
>>>> [1387441.969146] [<c05976ae>] __mutex_lock_slowpath+0xbe/0x120
>>>> [1387441.969149] [<c05975d0>] mutex_lock+0x20/0x40
>>>> [1387441.969152] [<c01fbc82>] do_rmdir+0x52/0xe0
>>>> [1387441.969155] [<c059ae47>] ? do_page_fault+0x1d7/0x3a0
>>>> [1387441.969158] [<c01fbd70>] sys_rmdir+0x10/0x20
>>>> [1387441.969161] [<c01033cc>] syscall_call+0x7/0xb
>>>>
>>>> The block always occurs in sys_rmdir when automount tries to remove the
>>>> mountpoint right after umounting the filesystem. There is a successful lstat()
>>>> on the mountpoint directly preceding the rmdir call.
>>>>
>>>> It looks like we are triggering some sort of race condition here.
>>>>
>>>> We are currently using 2.6.31-20-generic-pae ubuntu kernel in the 6 CPU guest,
>>>> 2.6.34 vanilla and qemu-kvm-0.12.4 in the host. But the problem existed
>>>> long before with all different combinations of guest/host/qemu versions.
>>>> The virtual HD is if=ide,format=host_device,cache=none on a DRBD container
>>>> on top of an LVM device. FS is ext3.
>>>>
>>>> Unfortunately, the problem is not easily reproducible. It occurs every one
>>>> or two weeks. But since the hanging system call blocks the whole filesystem,
>>>> we have to reboot the guest to get it into a usable state again.
>>>>
>>>> Any ideas what's going wrong here?
>>>>
>>>>
>>>>
>>> Is there substantial I/O going on?
>>>
>>> If not, it may be an autofs bug unrelated to kvm.
>>>
>> the autofs expire event occurred at 01:15:01
>>
>> sar shows
>>
>> 00:00:01  CPU  %user  %nice  %system  %iowait  %steal  %idle
>> 00:35:02  all   0,31   1,90     3,88    29,78    0,00  64,12
>> 00:45:02  all   0,72   1,99     3,56    23,93    0,00  69,80
>> 00:55:01  all   1,35   1,49     4,13    23,76    0,00  69,27
>> 01:05:01  all   0,77   1,84     4,43    28,34    0,00  64,62
>> 01:15:01  all   0,29   1,46     3,41    44,07    0,00  50,77
>> 01:25:02  all   0,22   1,25     2,63    45,34    0,00  50,56
>> 01:35:02  all   0,34   1,33     2,87    46,74    0,00  48,72
>> 01:45:02  all   0,30   0,90     2,57    40,03    0,00  56,20
>> 01:55:02  all   0,26   0,43     2,29     9,79    0,00  87,23
>>
>> 00:00:01     tps    rtps   wtps   bread/s   bwrtn/s
>> 00:35:02  461,69  407,75  53,94  35196,06  32673,83
>> 00:45:02  298,29  238,30  59,99  38553,34  33062,97
>> 00:55:01  294,81  241,08  53,73  35469,66  25948,30
>> 01:05:01  338,62  279,97  58,66  36164,27  31109,18
>> 01:15:01  462,22  406,24  55,97  28428,26  25725,05
>> 01:25:02  366,88  331,82  35,07  24160,53  22284,83
>> 01:35:02  394,73  358,21  36,52  25770,79  23516,81
>> 01:45:02  409,83  379,66  30,17  17874,79  15608,74
>> 01:55:02  453,18  448,62   4,56   3754,82     79,47
>>
>> so, yes, there is substantial I/O going on.
>>
>
> Looks like a false alarm then. The rmdir is waiting for the mount to
> flush everything to disk, which is slow and takes a while.
>
> Does it return eventually?

No, it does not return within hours (>10). And the problem occurs only
once in a while, although the system is busy every day (and night)
and automount is mounting/expiring frequently.

I would expect the "/bin/umount dir" process (which is forked by
automount, if I read the code correctly) to return only after the flush
is complete. So I would expect rmdir to be called afterwards on a plain
empty directory.

BTW: the mount is just a bind mount, so no flush should be necessary
anyway.
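
For anyone who wants to time the same sequence outside automount, here is
a minimal sketch of the steps as I understand them: umount, lstat(), then
rmdir(). The helper name expire_mountpoint and its do_umount flag are made
up for illustration and are not taken from the autofs code; the
/bin/umount call only reflects my reading of what automount forks.

```python
import os
import subprocess
import tempfile
import time


def expire_mountpoint(path, do_umount=True):
    """Unmount path, lstat() it, then rmdir() it.

    Returns the number of seconds spent inside rmdir().
    Hypothetical helper, not part of autofs.
    """
    if do_umount:
        # Assumption: automount forks /bin/umount and waits for it to exit.
        # Needs root and an actual mount on `path`.
        subprocess.run(["/bin/umount", path], check=True)
    # The lstat() directly preceding rmdir succeeds in the hanging case.
    os.lstat(path)
    start = time.monotonic()
    # This is the call that blocks in do_rmdir when the problem occurs.
    os.rmdir(path)
    return time.monotonic() - start


if __name__ == "__main__":
    # Dry run on a plain empty directory: no mount involved, no root needed.
    scratch = tempfile.mkdtemp()
    elapsed = expire_mountpoint(scratch, do_umount=False)
    print(f"rmdir took {elapsed:.6f}s")
```

On a healthy system the rmdir() should return almost immediately; in the
hanging case it would never get past os.rmdir(), so this only helps to
bracket where the time goes, not to reproduce the race itself.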