* Cannot unmount nfs4 sec=krb5 mount if network is down
From: Orion Poplawski @ 2012-05-10 20:24 UTC
To: linux-nfs
See https://bugzilla.redhat.com/show_bug.cgi?id=820707
We're using nfs4/krb5 mounts via autofs (although I get the same result without
autofs and mounting the directory directly):
earth:/export/home/orion on /home/orion type nfs4
(rw,noatime,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=1,acregmax=1,
acdirmin=1,acdirmax=1,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=krb5,
clientaddr=10.10.30.4,minorversion=0,local_lock=none,addr=10.10.10.1)
If the network is disconnected it is impossible to unmount, even if no
processes are accessing the mount. umount -f and umount -l both hang on
readlink("/home/orion"). At this point it is impossible to shut down cleanly
and you must hold the power button down until the machine powers off.
I can unmount non-krb5 nfs4/nfs3 mounts just fine.
I tried running rpc.gssd and nfsidmap with -vvv but nothing else showed up in
the log.
Sometimes I get:
May 10 12:00:10 makani kernel: [ 2018.272071] nfs: server earth not responding, still trying
but that's it. I can exit umount -l with ctrl-c.
Version-Release number of selected component (if applicable):
nfs-utils-1.2.5-15.fc17.x86_64
3.3.4-5.fc17.x86_64
How reproducible:
Every time
Steps to Reproduce:
1. mount an nfs4 filesystem with sec=krb5
2. disconnect the network
3. umount -l <mount>
Actual results:
umount -l hangs
Expected results:
umount succeeds, perhaps with delay.
* Re: Cannot unmount nfs4 sec=krb5 mount if network is down
From: Orion Poplawski @ 2012-05-16 21:34 UTC
To: linux-nfs
Orion Poplawski <orion@...> writes:
>
> See https://bugzilla.redhat.com/show_bug.cgi?id=820707
>
> If the network is disconnected it is impossible to unmount, even if no
> processes are accessing the mount. umount -f and umount -l both hang on
> readlink("/home/orion").
umount needs to canonicalize the path, so it calls readlink on the path given
to it. That call hangs. Here's the kernel trace:
[94630.673017] umount.nfs D 0000009c 0 14999 14882 0x00000080
[94630.673017] c30f5c38 00000086 00000001 0000009c ed110004 1b928142 0000560e 00000000
[94630.673017] c0c4b180 ed37c000 c0c4b180 f5007180 f6b37110 c32ef110 c30f5c28 f7fd6243
[94630.673017] c2f9c580 c30f5c20 f7fd9ff2 f82520c0 00000246 c30f5c0c c0927c33 c30f5c30
[94630.673017] Call Trace:
[94630.673017] [<f7fd6243>] ? xs_sendpages+0x63/0x1f0 [sunrpc]
[94630.673017] [<f7fd9ff2>] ? __rpc_sleep_on_priority+0x122/0x210 [sunrpc]
[94630.673017] [<c0927c33>] ? _raw_spin_unlock_bh+0x13/0x20
[94630.673017] [<c0927c33>] ? _raw_spin_unlock_bh+0x13/0x20
[94630.673017] [<c0926ed5>] schedule+0x35/0x50
[94630.673017] [<f7fd96fd>] rpc_wait_bit_killable+0x2d/0x70 [sunrpc]
[94630.673017] [<c09259a1>] __wait_on_bit+0x51/0x70
[94630.673017] [<f7fd96d0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
[94630.673017] [<f7fd96d0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
[94630.673017] [<c0925a21>] out_of_line_wait_on_bit+0x61/0x70
[94630.673017] [<c0455480>] ? autoremove_wake_function+0x50/0x50
[94630.673017] [<f7fda2e7>] __rpc_execute+0x187/0x2a0 [sunrpc]
[94630.673017] [<c0455423>] ? wake_up_bit+0x23/0x30
[94630.673017] [<f7fda548>] rpc_execute+0x38/0x40 [sunrpc]
[94630.673017] [<f7fd30a9>] rpc_run_task+0x59/0x70 [sunrpc]
[94630.673017] [<f7fd31bc>] rpc_call_sync+0x3c/0x60 [sunrpc]
[94630.673017] [<f84aff63>] _nfs4_call_sync+0x23/0x30 [nfs]
[94630.673017] [<f84afc3e>] _nfs4_proc_getattr+0x8e/0xa0 [nfs]
[94630.673017] [<f84b385b>] nfs4_proc_getattr+0x3b/0x60 [nfs]
[94630.673017] [<f849d311>] __nfs_revalidate_inode+0x81/0x210 [nfs]
[94630.673017] [<f849d5df>] nfs_revalidate_inode+0x2f/0x50 [nfs]
[94630.673017] [<f8496b3f>] nfs_check_verifier+0x4f/0x80 [nfs]
[94630.673017] [<f8498ca2>] nfs_lookup_revalidate+0x232/0x450 [nfs]
[94630.673017] [<c05ead5e>] ? autofs4_d_manage+0x8e/0xf0
[94630.673017] [<f8499811>] nfs_open_revalidate+0x41/0x220 [nfs]
[94630.673017] [<c053e79b>] ? follow_managed+0x19b/0x1f0
[94630.673017] [<c053ff00>] ? unlazy_walk+0xd0/0x180
[94630.673017] [<c0540153>] ? do_lookup+0x1a3/0x350
[94630.673017] [<c053f748>] complete_walk+0x88/0xc0
[94630.673017] [<c0540cc3>] path_lookupat+0x63/0x620
[94630.673017] [<c0523b89>] ? kmem_cache_alloc+0x29/0x120
[94630.673017] [<c065a998>] ? strncpy_from_user+0x38/0x70
[94630.673017] [<c05412aa>] do_path_lookup+0x2a/0xb0
[94630.673017] [<c0542466>] user_path_at_empty+0x46/0x80
[94630.673017] [<c092b557>] ? do_page_fault+0x1b7/0x450
[94630.673017] [<c050c074>] ? remove_vma+0x44/0x60
[94630.673017] [<c054e233>] ? mntput_no_expire+0x23/0x100
[94630.673017] [<c0539313>] sys_readlinkat+0x43/0xb0
[94630.673017] [<c05393ac>] sys_readlink+0x2c/0x30
[94630.673017] [<c0927ed4>] syscall_call+0x7/0xb
This appears to wait forever. This pretty much makes it impossible to use krb5
nfs4 with laptops where the network can disappear.
* Re: Cannot unmount nfs4 sec=krb5 mount if network is down
From: Karel Zak @ 2012-05-17 10:29 UTC
To: Orion Poplawski; +Cc: linux-nfs
On Wed, May 16, 2012 at 09:34:27PM +0000, Orion Poplawski wrote:
> Orion Poplawski <orion@...> writes:
> >
> > See https://bugzilla.redhat.com/show_bug.cgi?id=820707
> >
> > If the network is disconnected it is impossible to unmount, even if no
> > processes are accessing the mount. umount -f and umount -l both hang on
> > readlink("/home/orion").
>
> umount needs to canonicalize the path so it does a readlink on the path given to
> it.
It seems that the canonicalization is unnecessary (already fixed in libmount
upstream code). https://bugzilla.redhat.com/show_bug.cgi?id=820707
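One way to avoid touching a hung mount during lookup, sketched here in Python, is to compare the path given on the command line against the kernel's mount table as a plain string before any readlink() or stat() is attempted. This only illustrates the idea and is not taken from libmount; the helper names are hypothetical, and the text passed in is assumed to be in /proc/mounts format.

```python
import re


def _unescape(field):
    # /proc/mounts writes space, tab, newline and backslash as \ooo octal.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def is_listed_mount_point(path, mounts_text):
    """Return True if `path` appears verbatim as a mount point.

    Only string comparison is used, so the (possibly hung) filesystem
    behind `path` is never accessed.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the source, field 1 is the mount point.
        if len(fields) >= 2 and _unescape(fields[1]) == path:
            return True
    return False
```

A caller could pass the contents of /proc/mounts as mounts_text and fall back to canonicalization only when the path is not listed.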
> This appears to wait forever. This pretty much makes it impossible to use krb5
> nfs4 with laptops where the network can disappear.
Is it possible to interrupt this "wait" with a signal? If so, we could add
alarm() to critical sections in programs like umount or lsof.
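A minimal sketch of the alarm() idea, written in Python rather than C for brevity. The names call_with_alarm and AlarmTimeout are hypothetical. Note that the trace earlier in the thread is blocked in rpc_wait_bit_killable; if that wait responds only to fatal signals, a SIGALRM caught by a handler may not interrupt it, so this sketch is likely to help only where the wait is interruptible.

```python
import os
import signal


class AlarmTimeout(Exception):
    """Raised when the alarm fires before the guarded call returns."""


def _on_alarm(signum, frame):
    raise AlarmTimeout()


def call_with_alarm(func, seconds):
    """Run func(), giving up after `seconds` (a whole number >= 1).

    Must be called from the main thread, since it installs a signal handler.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        return func()
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

For example, call_with_alarm(lambda: os.readlink("/home/orion"), 5) would raise AlarmTimeout after five seconds if the call is still blocked and the signal is able to interrupt it.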
Right now lsof, for example, works around this problem with fork() and a
timeout in the parent, which is a pretty nasty solution :-(
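The fork-and-timeout pattern can be sketched roughly as follows. This is not lsof's actual code; call_in_child is a hypothetical helper, func is assumed to return a short bytes value, and the sketch assumes the stuck child can still be killed with SIGKILL.

```python
import os
import select
import signal


def call_in_child(func, timeout):
    """Run func() in a forked child; return its bytes result, or None
    if it fails or does not finish within `timeout` seconds."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: make the call that may hang, report the result, exit.
        status = 1
        try:
            os.close(read_fd)
            os.write(write_fd, func())
            status = 0
        finally:
            os._exit(status)
    # Parent: wait for the result, but no longer than `timeout`.
    os.close(write_fd)
    try:
        ready, _, _ = select.select([read_fd], [], [], timeout)
        if not ready:
            os.kill(pid, signal.SIGKILL)  # give up on the stuck child
            return None
        data = os.read(read_fd, 4096)
        return data or None               # empty read means the child failed
    finally:
        os.close(read_fd)
        os.waitpid(pid, 0)                # reap the child
```

A caller might use call_in_child(lambda: os.fsencode(os.readlink("/home/orion")), 5), so that only the child blocks on the unreachable server while the parent stays responsive.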
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
* Re: Cannot unmount nfs4 sec=krb5 mount if network is down
From: Orion Poplawski @ 2012-05-17 21:11 UTC
To: Karel Zak; +Cc: linux-nfs
On 05/17/2012 04:29 AM, Karel Zak wrote:
> On Wed, May 16, 2012 at 09:34:27PM +0000, Orion Poplawski wrote:
>> Orion Poplawski<orion@...> writes:
>>>
>>> See https://bugzilla.redhat.com/show_bug.cgi?id=820707
>>>
>>> If the network is disconnected it is impossible to unmount, even if no
>>> processes are accessing the mount. umount -f and umount -l both hang on
>>> readlink("/home/orion").
>>
>> umount needs to canonicalize the path so it does a readlink on the path given to
>> it.
>
> It seems that the canonicalization is unnecessary (already fixed in libmount
> upstream code). https://bugzilla.redhat.com/show_bug.cgi?id=820707
>
That appears to fix the issue for me. Thanks!
>> This appears to wait forever. This pretty much makes it impossible to use krb5
>> nfs4 with laptops where the network can disappear.
>
> Is it possible to interrupt this "wait" by signal? ... then we can add alarm()
> to critical sections in programs like umount or lsof.
>
> Now for example lsof resolves this problem by fork() and timeout in
> parent.. that's pretty nasty solution :-(
>
> Karel
>
Seems unnecessary with the above fix.
--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA, Boulder Office FAX: 303-415-9702
3380 Mitchell Lane orion@nwra.com
Boulder, CO 80301 http://www.nwra.com