Re: syscall rmdir hangs with autofs

From: Sebastian Hetze <s.hetze@linux-ag.com>
To: Avi Kivity <avi@redhat.com>
Cc: Sebastian Hetze <s.hetze@linux-ag.com>, kvm@vger.kernel.org
Subject: Re: syscall rmdir hangs with autofs
Date: Mon, 19 Jul 2010 14:48:46 +0200
Message-ID: <20100719124847.0CDBBA005F@mail.linux-ag.de>
In-Reply-To: <4C444358.8010500@redhat.com>

On Mon, Jul 19, 2010 at 03:21:44PM +0300, Avi Kivity wrote:
> On 07/19/2010 02:40 PM, Sebastian Hetze wrote:
>> On Mon, Jul 19, 2010 at 02:12:59PM +0300, Avi Kivity wrote:
>>    
>>> On 07/19/2010 11:39 AM, Sebastian Hetze wrote:
>>>      
>>>> Hi *,
>>>>
>>>> we are encountering occasional problems with autofs running inside
>>>> a KVM guest.
>>>>
>>>> [1387441.969106] INFO: task automount:26560 blocked for more than 120 seconds.
>>>> [1387441.969110] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> [1387441.969112] automount     D e8510198     0 26560   2702 0x00000000
>>>> [1387441.969117]  db0a1ef4 00000082 80000000 e8510198 0004ed69 c8266000 f6e85a40 00000000
>>>> [1387441.969123]  c08455e0 c08455e0 f41157f0 f4115a88 c55315e0 00000000 c0207c0a db0a1ef0
>>>> [1387441.969128]  f4115a88 f7222bbc f7222bb8 ffffffff db0a1f20 c05976ae db0a1f14 f41157f0
>>>> [1387441.969133] Call Trace:
>>>> [1387441.969140]  [<c0207c0a>] ? mntput_no_expire+0x1a/0xd0
>>>> [1387441.969146]  [<c05976ae>] __mutex_lock_slowpath+0xbe/0x120
>>>> [1387441.969149]  [<c05975d0>] mutex_lock+0x20/0x40
>>>> [1387441.969152]  [<c01fbc82>] do_rmdir+0x52/0xe0
>>>> [1387441.969155]  [<c059ae47>] ? do_page_fault+0x1d7/0x3a0
>>>> [1387441.969158]  [<c01fbd70>] sys_rmdir+0x10/0x20
>>>> [1387441.969161]  [<c01033cc>] syscall_call+0x7/0xb
>>>>
>>>> The block always occurs in sys_rmdir when automount tries to remove the
>>>> mountpoint right after umounting the filesystem. There is a successful lstat()
>>>> on the mountpoint directly preceding the rmdir call.
>>>>
>>>> It looks like we are triggering some sort of race condition here.
>>>>
>>>> We are currently using 2.6.31-20-generic-pae ubuntu kernel in the 6 CPU guest,
>>>> 2.6.34 vanilla and qemu-kvm-0.12.4 in the host. But the problem existed
>>>> long before with all different combinations of guest/host/qemu versions.
>>>> The virtual HD is if=ide,format=host_device,cache=none on a DRBD container
>>>> on top of an LVM device. FS is ext3.
>>>>
>>>> Unfortunately, the problem is not easily reproducible. It occurs every one
>>>> or two weeks. But since the hanging system call blocks the whole filesystem
>>>> we have to reboot the guest to get it into a usable state again.
>>>>
>>>> Any ideas what's going wrong here?
>>>>
>>>>
>>>>        
>>> Is there substantial I/O going on?
>>>
>>> If not, it may be an autofs bug unrelated to kvm.
>>>      
>> the autofs expire event occurred at 01:15:01
>>
>> sar shows
>>
>> 00:00:01   CPU     %user     %nice   %system   %iowait  %steal   %idle
>> 00:35:02   all      0,31      1,90      3,88     29,78    0,00   64,12
>> 00:45:02   all      0,72      1,99      3,56     23,93    0,00   69,80
>> 00:55:01   all      1,35      1,49      4,13     23,76    0,00   69,27
>> 01:05:01   all      0,77      1,84      4,43     28,34    0,00   64,62
>> 01:15:01   all      0,29      1,46      3,41     44,07    0,00   50,77
>> 01:25:02   all      0,22      1,25      2,63     45,34    0,00   50,56
>> 01:35:02   all      0,34      1,33      2,87     46,74    0,00   48,72
>> 01:45:02   all      0,30      0,90      2,57     40,03    0,00   56,20
>> 01:55:02   all      0,26      0,43      2,29      9,79    0,00   87,23
>>
>> 00:00:01          tps      rtps      wtps   bread/s   bwrtn/s
>> 00:35:02       461,69    407,75     53,94  35196,06  32673,83
>> 00:45:02       298,29    238,30     59,99  38553,34  33062,97
>> 00:55:01       294,81    241,08     53,73  35469,66  25948,30
>> 01:05:01       338,62    279,97     58,66  36164,27  31109,18
>> 01:15:01       462,22    406,24     55,97  28428,26  25725,05
>> 01:25:02       366,88    331,82     35,07  24160,53  22284,83
>> 01:35:02       394,73    358,21     36,52  25770,79  23516,81
>> 01:45:02       409,83    379,66     30,17  17874,79  15608,74
>> 01:55:02       453,18    448,62      4,56   3754,82     79,47
>>
>> so, yes there is substantial I/O going on.
>>    
>
> Looks like a false alarm then.  The rmdir is waiting for the mount to  
> flush everything to disk, which is slow and takes a while.
>
> Does it return eventually?

No, it does not return within hours (>10). And the problem occurs only
once in a while although the system is busy every day (and night)
and automount is mounting/expiring frequently.

I would expect the "/bin/umount dir" process (which is forked by
automount if I read the code correctly) to return only after the flush
is complete. So I expect rmdir to be called afterwards on a plain
empty directory.

BTW: the mount is just a bind mount, so no flush should be necessary
anyway.
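
To make the expected ordering concrete, here is a minimal sketch of the
sequence as I understand it: wait for umount to exit, lstat() the
mountpoint, then rmdir() it. This is not the automount code. The function
name expire_mountpoint and its do_umount switch are hypothetical, and the
demo runs on a plain temporary directory without any mount.

```python
import os
import subprocess
import tempfile
import time


def expire_mountpoint(path, umount_cmd=("/bin/umount",), do_umount=True):
    """Mimic the assumed expire sequence: umount, lstat, rmdir.

    Hypothetical helper for illustration only; returns the number of
    seconds spent inside rmdir().
    """
    if do_umount:
        # subprocess.run() returns only after the child has exited, so any
        # flush performed by umount is finished before we continue.
        subprocess.run([*umount_cmd, path], check=True)

    # The lstat() that succeeds directly before the rmdir call.
    os.lstat(path)

    start = time.monotonic()
    os.rmdir(path)  # the call reported as blocked in the trace
    return time.monotonic() - start


if __name__ == "__main__":
    # Demo on a plain empty directory; no mount is involved here.
    demo = tempfile.mkdtemp(prefix="expire-demo-")
    elapsed = expire_mountpoint(demo, do_umount=False)
    print(f"rmdir took {elapsed:.6f}s, still exists: {os.path.exists(demo)}")
```

On a plain empty directory the rmdir step should return almost
immediately, which is what I would expect after the umount as well.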

Thread overview: 14+ messages
2010-07-19  8:39 syscall rmdir hangs with autofs Sebastian Hetze
2010-07-19 11:12 ` Avi Kivity
2010-07-19 11:40   ` Sebastian Hetze
     [not found]   ` <20100719114034.62BDD30303F5@mail.linux-ag.de>
2010-07-19 12:21     ` Avi Kivity
2010-07-19 12:48       ` Sebastian Hetze [this message]
2010-07-19 13:09         ` Avi Kivity
2010-07-19 13:45           ` Sebastian Hetze
     [not found]           ` <20100719134558.A0CD2A005F@mail.linux-ag.de>
2010-07-19 14:00             ` Avi Kivity
2010-07-19 14:47               ` Sebastian Hetze
     [not found]               ` <20100719144750.334F2303001B@mail.linux-ag.de>
2010-07-19 15:03                 ` Avi Kivity
2010-07-19 15:23                   ` Sebastian Hetze
     [not found]                   ` <20100719152518.641BAB001A@mail.linux-ag.de>
2010-07-19 15:28                     ` Avi Kivity
2010-07-19 15:38                       ` Sebastian Hetze
     [not found]                       ` <20100719153816.1E33FB0016@mail.linux-ag.de>
2010-07-19 17:55                         ` Avi Kivity
