* 'umount -f /mnt/foo' fails if server IP is gone.
From: Ben Greear @ 2013-10-15 18:29 UTC
To: linux-nfs@vger.kernel.org

Is 'umount -f' supposed to always work, even if the file server
goes away?

I have a user's system that just hangs forever in this case.

Could be local changes we have made, but I'm curious about
the expected behaviour before I go digging too deep...

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-15 18:29 'umount -f /mnt/foo' fails if server IP is gone Ben Greear @ 2013-10-17 17:35 ` Ben Greear 2013-10-17 18:03 ` Chuck Lever 2013-10-17 18:05 ` Myklebust, Trond 0 siblings, 2 replies; 13+ messages in thread From: Ben Greear @ 2013-10-17 17:35 UTC (permalink / raw) To: linux-nfs@vger.kernel.org On 10/15/2013 11:29 AM, Ben Greear wrote: > Is 'umount -f' supposed to always work, even if the file server > goes away? > > I have a user's system that just hangs forever in this case. > > Could be local changes we have made, but I'm curious about > the expected behaviour before I go digging too deep... Any input on this? I don't mind trying to fix it, but I would like to know how it is supposed to work. Older kernels do not hang (we tried 3.0.x), but I'm not sure exactly where the problem started. Test case was to set up NFSv3 mount, then pull the Ethernet cable on the nfs client machine. This system is running 3.9.11+ kernel. From /proc/mounts: 10.2.46.90:/nfs_export on /mnt/lf/nfs3-001 type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.2.46.90,mountvers=3,mountport=19408,mountproto=udp,srcaddr=10.2.46.91,local_lock=none,addr=10.2.46.90) # umount /mnt/lf/nfs3-001 ^C # umount -f /mnt/lf/nfs3-001 [hangs forever it seems, certainly for a long time] Here is a stack trace of hung processes, for instance: Oct 17 10:24:18 localhost kernel: [688601.930366] SysRq : Show Blocked State Oct 17 10:24:18 localhost kernel: [688601.931016] task PC stack pid father Oct 17 10:24:18 localhost kernel: [688601.931016] mkdir D f1bf6700 0 16898 16831 0x00000082 Oct 17 10:24:18 localhost kernel: [688601.931016] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400 Oct 17 10:24:18 localhost kernel: [688601.931016] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0 Oct 17 10:24:18 localhost kernel: [688601.931016] f8dc4fd0 
f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138 Oct 17 10:24:18 localhost kernel: [688601.931016] Call Trace: Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc] Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs] Oct 17 10:24:18 localhost kernel: [688601.931016] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb133>] schedule+0x23/0x60 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb1e6>] io_schedule+0x76/0xc0 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c051607d>] sleep_on_page+0xd/0x20 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09c8d4d>] __wait_on_bit+0x4d/0x70 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516070>] ? __lock_page+0x90/0x90 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516301>] wait_on_page_bit+0x91/0xa0 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0478710>] ? wake_atomic_t_function+0x50/0x50 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0 Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs] Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4030>] ? 
nfs_file_fsync_commit+0xb0/0xb0 [nfs] Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e1f9>] vfs_fsync_range+0x59/0x70 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e237>] vfs_fsync+0x27/0x30 Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs] Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05631a1>] filp_close+0x31/0x80 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057ea55>] put_files_struct+0x85/0xe0 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057eaf7>] exit_files+0x47/0x60 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045b83c>] do_exit+0x25c/0x980 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c056a0be>] ? SyS_stat64+0x2e/0x40 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045bf9e>] do_group_exit+0x3e/0xa0 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045c018>] SyS_exit_group+0x18/0x20 Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09d370d>] sysenter_do_call+0x12/0x28 Oct 17 10:24:18 localhost kernel: [688601.931016] umount.nfs D f11c4900 0 17150 17149 0x00000080 Oct 17 10:24:18 localhost kernel: [688602.225057] f3955d00 00000082 efea0d8c f11c4900 f3955c8c c08d9f96 f104e700 c0d7e400 Oct 17 10:24:18 localhost kernel: [688602.225057] c0d7e400 c0d7e400 c0d7e400 efea0d8c efea0c80 f79db400 f104e700 c0c3e980 Oct 17 10:24:18 localhost kernel: [688602.225057] f3955cd0 f3955cb4 f3955e90 0000002c 0000005c 132df575 efea0d80 0000005c Oct 17 10:24:18 localhost kernel: [688602.225057] Call Trace: Oct 17 10:24:18 localhost kernel: [688602.225057] [<c08d9f96>] ? __kfree_skb+0x36/0x90 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09cb133>] schedule+0x23/0x60 Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6edd>] rpc_wait_bit_killable+0x2d/0x70 [sunrpc] Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8d4d>] __wait_on_bit+0x4d/0x70 Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? 
__rpc_wait_for_completion_task+0x30/0x30 [sunrpc] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc] Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8e1b>] out_of_line_wait_on_bit+0xab/0xc0 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0478710>] ? wake_atomic_t_function+0x50/0x50 Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec7f9e>] __rpc_execute+0x11e/0x290 [sunrpc] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc] Oct 17 10:24:18 localhost kernel: [688602.225057] [<c047865f>] ? wake_up_bit+0x5f/0x70 Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec814c>] rpc_execute+0x3c/0xa0 [sunrpc] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec0f09>] rpc_run_task+0x59/0x70 [sunrpc] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec1022>] rpc_call_sync+0x42/0xa0 [sunrpc] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0b46c>] nfs3_rpc_wrapper.clone.0+0x5c/0xa0 [nfsv3] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0c0d4>] nfs3_proc_getattr+0x34/0x40 [nfsv3] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db7397>] __nfs_revalidate_inode+0xc7/0x140 [nfs] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db743f>] nfs_revalidate_inode+0x2f/0x60 [nfs] Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db14a8>] nfs_weak_revalidate+0x38/0x50 [nfs] Oct 17 10:24:18 localhost kernel: [688602.225057] [<c056fba8>] complete_walk+0xa8/0xf0 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0571e53>] path_lookupat+0x63/0x690 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05724ae>] filename_lookup+0x2e/0xc0 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733a3>] user_path_at_empty+0x43/0x80 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0578b9e>] ? 
__d_free+0x2e/0x50 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c064450c>] ? security_capable+0x1c/0x30 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733ff>] user_path_at+0x1f/0x30 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05807c3>] SyS_umount+0x83/0x380 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c04d2606>] ? __audit_syscall_exit+0x1f6/0x290 Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09d370d>] sysenter_do_call+0x12/0x28 .... Oct 17 10:24:42 localhost kernel: [688631.186190] INFO: task mkdir:16898 blocked for more than 180 seconds. Oct 17 10:24:42 localhost kernel: [688631.195666] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 17 10:24:42 localhost kernel: [688631.206304] mkdir D f1bf6700 0 16898 16831 0x00000082 Oct 17 10:24:42 localhost kernel: [688631.215220] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400 Oct 17 10:24:42 localhost kernel: [688631.225933] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0 Oct 17 10:24:42 localhost kernel: [688631.236712] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138 Oct 17 10:24:42 localhost kernel: [688631.247550] Call Trace: Oct 17 10:24:42 localhost kernel: [688631.252746] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc] Oct 17 10:24:42 localhost kernel: [688631.261369] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs] Oct 17 10:24:42 localhost kernel: [688631.270065] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110 Oct 17 10:24:42 localhost kernel: [688631.277724] [<c09cb133>] schedule+0x23/0x60 Oct 17 10:24:42 localhost kernel: [688631.285298] [<c09cb1e6>] io_schedule+0x76/0xc0 Oct 17 10:24:42 localhost kernel: [688631.292738] [<c051607d>] sleep_on_page+0xd/0x20 Oct 17 10:24:42 localhost kernel: [688631.300316] [<c09c8d4d>] __wait_on_bit+0x4d/0x70 Oct 17 10:24:42 localhost kernel: [688631.308117] [<c0516070>] ? 
__lock_page+0x90/0x90 Oct 17 10:24:42 localhost kernel: [688631.315731] [<c0516301>] wait_on_page_bit+0x91/0xa0 Oct 17 10:24:42 localhost kernel: [688631.323630] [<c0478710>] ? wake_atomic_t_function+0x50/0x50 Oct 17 10:24:42 localhost kernel: [688631.332536] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150 Oct 17 10:24:42 localhost kernel: [688631.341221] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0 Oct 17 10:24:42 localhost kernel: [688631.350224] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs] Oct 17 10:24:42 localhost kernel: [688631.358569] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs] Oct 17 10:24:42 localhost kernel: [688631.367764] [<c058e1f9>] vfs_fsync_range+0x59/0x70 Oct 17 10:24:42 localhost kernel: [688631.375818] [<c058e237>] vfs_fsync+0x27/0x30 Oct 17 10:24:42 localhost kernel: [688631.383346] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs] Oct 17 10:24:42 localhost kernel: [688631.392117] [<c05631a1>] filp_close+0x31/0x80 Oct 17 10:24:42 localhost kernel: [688631.399741] [<c057ea55>] put_files_struct+0x85/0xe0 Oct 17 10:24:42 localhost kernel: [688631.407871] [<c057eaf7>] exit_files+0x47/0x60 Oct 17 10:24:42 localhost kernel: [688631.415535] [<c045b83c>] do_exit+0x25c/0x980 Oct 17 10:24:42 localhost kernel: [688631.423133] [<c056a0be>] ? SyS_stat64+0x2e/0x40 Oct 17 10:24:42 localhost kernel: [688631.431078] [<c045bf9e>] do_group_exit+0x3e/0xa0 Oct 17 10:24:42 localhost kernel: [688631.439103] [<c045c018>] SyS_exit_group+0x18/0x20 Oct 17 10:24:42 localhost kernel: [688631.447169] [<c09d370d>] sysenter_do_call+0x12/0x28 Oct 17 10:24:54 localhost kernel: [688643.517069] RPC: AUTH_GSS upcall timed out. Thanks, Ben -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 13+ messages in thread
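For reference on the mount options quoted in this message, here is a minimal Python sketch that splits the option string and reports the retry settings. It assumes the nfs(5) semantics that `timeo` is expressed in tenths of a second and that a `hard` mount keeps retrying RPCs indefinitely. The function names are hypothetical and not part of any NFS tooling.

```python
# Option string copied from the mount line quoted in this message.
OPTS = ("rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,"
        "proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.2.46.90,"
        "mountvers=3,mountport=19408,mountproto=udp,srcaddr=10.2.46.91,"
        "local_lock=none,addr=10.2.46.90")


def parse_mount_options(opts):
    """Split a comma-separated mount option string into a dict.

    Flags without a value (for example "hard") map to True.
    """
    parsed = {}
    for item in opts.split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


def retry_summary(opts):
    """Return (mode, timeout_seconds, retrans) for an NFS option string.

    Assumption: timeo is in tenths of a second, per nfs(5). Values that
    are absent from the string are reported as None rather than guessed.
    """
    o = parse_mount_options(opts)
    if o.get("soft"):
        mode = "soft"
    elif o.get("hard"):
        mode = "hard"
    else:
        mode = None
    timeo = int(o["timeo"]) / 10 if "timeo" in o else None
    retrans = int(o["retrans"]) if "retrans" in o else None
    return mode, timeo, retrans


if __name__ == "__main__":
    print(retry_summary(OPTS))
```

Under those assumptions, the mount above is a hard mount with a 60-second timeout and 2 retransmissions, so a request to an unreachable server would keep being retried rather than returning an error to the caller.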
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 17:35 ` Ben Greear @ 2013-10-17 18:03 ` Chuck Lever 2013-10-17 18:08 ` Ben Greear 2013-10-17 18:16 ` Jeff Layton 2013-10-17 18:05 ` Myklebust, Trond 1 sibling, 2 replies; 13+ messages in thread From: Chuck Lever @ 2013-10-17 18:03 UTC (permalink / raw) To: Ben Greear; +Cc: Linux NFS Mailing List, Jeff Layton On Oct 17, 2013, at 1:35 PM, Ben Greear <greearb@candelatech.com> wrote: > On 10/15/2013 11:29 AM, Ben Greear wrote: >> Is 'umount -f' supposed to always work, even if the file server >> goes away? >> >> I have a user's system that just hangs forever in this case. >> >> Could be local changes we have made, but I'm curious about >> the expected behaviour before I go digging too deep... > > Any input on this? I don't mind trying to fix it, but I > would like to know how it is supposed to work. Recent kernels emit a GETATTR at umount time. It is probably this operation that is stuck. > Older kernels do not hang (we tried 3.0.x), but I'm not sure > exactly where the problem started. > > Test case was to set up NFSv3 mount, then pull the Ethernet cable > on the nfs client machine. This system is running 3.9.11+ kernel. 
> > From /proc/mounts: > > 10.2.46.90:/nfs_export on /mnt/lf/nfs3-001 type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.2.46.90,mountvers=3,mountport=19408,mountproto=udp,srcaddr=10.2.46.91,local_lock=none,addr=10.2.46.90) > > # umount /mnt/lf/nfs3-001 > ^C > # umount -f /mnt/lf/nfs3-001 > [hangs forever it seems, certainly for a long time] > > > Here is a stack trace of hung processes, for instance: > > Oct 17 10:24:18 localhost kernel: [688601.930366] SysRq : Show Blocked State > Oct 17 10:24:18 localhost kernel: [688601.931016] task PC stack pid father > Oct 17 10:24:18 localhost kernel: [688601.931016] mkdir D f1bf6700 0 16898 16831 0x00000082 > Oct 17 10:24:18 localhost kernel: [688601.931016] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400 > Oct 17 10:24:18 localhost kernel: [688601.931016] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0 > Oct 17 10:24:18 localhost kernel: [688601.931016] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138 > Oct 17 10:24:18 localhost kernel: [688601.931016] Call Trace: > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs] > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb133>] schedule+0x23/0x60 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb1e6>] io_schedule+0x76/0xc0 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c051607d>] sleep_on_page+0xd/0x20 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09c8d4d>] __wait_on_bit+0x4d/0x70 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516070>] ? 
__lock_page+0x90/0x90 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516301>] wait_on_page_bit+0x91/0xa0 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0478710>] ? wake_atomic_t_function+0x50/0x50 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs] > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs] > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e1f9>] vfs_fsync_range+0x59/0x70 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e237>] vfs_fsync+0x27/0x30 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs] > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05631a1>] filp_close+0x31/0x80 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057ea55>] put_files_struct+0x85/0xe0 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057eaf7>] exit_files+0x47/0x60 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045b83c>] do_exit+0x25c/0x980 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c056a0be>] ? 
SyS_stat64+0x2e/0x40 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045bf9e>] do_group_exit+0x3e/0xa0 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045c018>] SyS_exit_group+0x18/0x20 > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09d370d>] sysenter_do_call+0x12/0x28 > Oct 17 10:24:18 localhost kernel: [688601.931016] umount.nfs D f11c4900 0 17150 17149 0x00000080 > Oct 17 10:24:18 localhost kernel: [688602.225057] f3955d00 00000082 efea0d8c f11c4900 f3955c8c c08d9f96 f104e700 c0d7e400 > Oct 17 10:24:18 localhost kernel: [688602.225057] c0d7e400 c0d7e400 c0d7e400 efea0d8c efea0c80 f79db400 f104e700 c0c3e980 > Oct 17 10:24:18 localhost kernel: [688602.225057] f3955cd0 f3955cb4 f3955e90 0000002c 0000005c 132df575 efea0d80 0000005c > Oct 17 10:24:18 localhost kernel: [688602.225057] Call Trace: > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c08d9f96>] ? __kfree_skb+0x36/0x90 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09cb133>] schedule+0x23/0x60 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6edd>] rpc_wait_bit_killable+0x2d/0x70 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8d4d>] __wait_on_bit+0x4d/0x70 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8e1b>] out_of_line_wait_on_bit+0xab/0xc0 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0478710>] ? wake_atomic_t_function+0x50/0x50 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec7f9e>] __rpc_execute+0x11e/0x290 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? 
rpcproc_decode_null+0x10/0x10 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c047865f>] ? wake_up_bit+0x5f/0x70 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec814c>] rpc_execute+0x3c/0xa0 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec0f09>] rpc_run_task+0x59/0x70 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec1022>] rpc_call_sync+0x42/0xa0 [sunrpc] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0b46c>] nfs3_rpc_wrapper.clone.0+0x5c/0xa0 [nfsv3] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0c0d4>] nfs3_proc_getattr+0x34/0x40 [nfsv3] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db7397>] __nfs_revalidate_inode+0xc7/0x140 [nfs] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db743f>] nfs_revalidate_inode+0x2f/0x60 [nfs] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db14a8>] nfs_weak_revalidate+0x38/0x50 [nfs] > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c056fba8>] complete_walk+0xa8/0xf0 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0571e53>] path_lookupat+0x63/0x690 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05724ae>] filename_lookup+0x2e/0xc0 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733a3>] user_path_at_empty+0x43/0x80 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0578b9e>] ? __d_free+0x2e/0x50 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c064450c>] ? security_capable+0x1c/0x30 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733ff>] user_path_at+0x1f/0x30 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05807c3>] SyS_umount+0x83/0x380 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c04d2606>] ? __audit_syscall_exit+0x1f6/0x290 > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09d370d>] sysenter_do_call+0x12/0x28 > > .... > > Oct 17 10:24:42 localhost kernel: [688631.186190] INFO: task mkdir:16898 blocked for more than 180 seconds. 
> Oct 17 10:24:42 localhost kernel: [688631.195666] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > Oct 17 10:24:42 localhost kernel: [688631.206304] mkdir D f1bf6700 0 16898 16831 0x00000082 > Oct 17 10:24:42 localhost kernel: [688631.215220] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400 > Oct 17 10:24:42 localhost kernel: [688631.225933] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0 > Oct 17 10:24:42 localhost kernel: [688631.236712] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138 > Oct 17 10:24:42 localhost kernel: [688631.247550] Call Trace: > Oct 17 10:24:42 localhost kernel: [688631.252746] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc] > Oct 17 10:24:42 localhost kernel: [688631.261369] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs] > Oct 17 10:24:42 localhost kernel: [688631.270065] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110 > Oct 17 10:24:42 localhost kernel: [688631.277724] [<c09cb133>] schedule+0x23/0x60 > Oct 17 10:24:42 localhost kernel: [688631.285298] [<c09cb1e6>] io_schedule+0x76/0xc0 > Oct 17 10:24:42 localhost kernel: [688631.292738] [<c051607d>] sleep_on_page+0xd/0x20 > Oct 17 10:24:42 localhost kernel: [688631.300316] [<c09c8d4d>] __wait_on_bit+0x4d/0x70 > Oct 17 10:24:42 localhost kernel: [688631.308117] [<c0516070>] ? __lock_page+0x90/0x90 > Oct 17 10:24:42 localhost kernel: [688631.315731] [<c0516301>] wait_on_page_bit+0x91/0xa0 > Oct 17 10:24:42 localhost kernel: [688631.323630] [<c0478710>] ? wake_atomic_t_function+0x50/0x50 > Oct 17 10:24:42 localhost kernel: [688631.332536] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150 > Oct 17 10:24:42 localhost kernel: [688631.341221] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0 > Oct 17 10:24:42 localhost kernel: [688631.350224] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs] > Oct 17 10:24:42 localhost kernel: [688631.358569] [<f8db4030>] ? 
nfs_file_fsync_commit+0xb0/0xb0 [nfs] > Oct 17 10:24:42 localhost kernel: [688631.367764] [<c058e1f9>] vfs_fsync_range+0x59/0x70 > Oct 17 10:24:42 localhost kernel: [688631.375818] [<c058e237>] vfs_fsync+0x27/0x30 > Oct 17 10:24:42 localhost kernel: [688631.383346] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs] > Oct 17 10:24:42 localhost kernel: [688631.392117] [<c05631a1>] filp_close+0x31/0x80 > Oct 17 10:24:42 localhost kernel: [688631.399741] [<c057ea55>] put_files_struct+0x85/0xe0 > Oct 17 10:24:42 localhost kernel: [688631.407871] [<c057eaf7>] exit_files+0x47/0x60 > Oct 17 10:24:42 localhost kernel: [688631.415535] [<c045b83c>] do_exit+0x25c/0x980 > Oct 17 10:24:42 localhost kernel: [688631.423133] [<c056a0be>] ? SyS_stat64+0x2e/0x40 > Oct 17 10:24:42 localhost kernel: [688631.431078] [<c045bf9e>] do_group_exit+0x3e/0xa0 > Oct 17 10:24:42 localhost kernel: [688631.439103] [<c045c018>] SyS_exit_group+0x18/0x20 > Oct 17 10:24:42 localhost kernel: [688631.447169] [<c09d370d>] sysenter_do_call+0x12/0x28 > Oct 17 10:24:54 localhost kernel: [688643.517069] RPC: AUTH_GSS upcall timed out. > > > Thanks, > Ben > > > -- > Ben Greear <greearb@candelatech.com> > Candela Technologies Inc http://www.candelatech.com > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever chuck[dot]lever[at]oracle[dot]com ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: 'umount -f /mnt/foo' fails if server IP is gone.
From: Ben Greear @ 2013-10-17 18:08 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing List, Jeff Layton

On 10/17/2013 11:03 AM, Chuck Lever wrote:
>
> On Oct 17, 2013, at 1:35 PM, Ben Greear <greearb@candelatech.com> wrote:
>
>> On 10/15/2013 11:29 AM, Ben Greear wrote:
>>> Is 'umount -f' supposed to always work, even if the file server
>>> goes away?
>>>
>>> I have a user's system that just hangs forever in this case.
>>>
>>> Could be local changes we have made, but I'm curious about
>>> the expected behaviour before I go digging too deep...
>>
>> Any input on this? I don't mind trying to fix it, but I
>> would like to know how it is supposed to work.
>
> Recent kernels emit a GETATTR at umount time. It is probably this operation that is stuck.

It seems a 'mkdir' process is trying to complete at the same time, not
sure if that is cause or effect.

How can I go about cleaning up these stuck operations?

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
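As a userland complement to the SysRq "Show Blocked State" dump quoted earlier in the thread, the following Python sketch lists tasks that are currently in uninterruptible sleep (state D) by reading /proc/<pid>/stat. It does not clean anything up; it only identifies which processes, such as the mkdir and umount.nfs tasks in the dump, are blocked. The helper names are hypothetical, and the parsing assumes the stat layout described in proc(5): pid, then the command name in parentheses, then the state letter.

```python
import os


def parse_stat_line(line):
    """Extract (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')' characters, so the name is taken as everything between
    the first '(' and the last ')'.
    """
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for tasks in state D."""
    if not os.path.isdir(proc_root):
        return []
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            # The task may have exited, or the file may be unreadable.
            continue
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

This only inspects the main thread of each process; blocked threads under /proc/<pid>/task are not covered by the sketch.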
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 18:03 ` Chuck Lever 2013-10-17 18:08 ` Ben Greear @ 2013-10-17 18:16 ` Jeff Layton 1 sibling, 0 replies; 13+ messages in thread From: Jeff Layton @ 2013-10-17 18:16 UTC (permalink / raw) To: Chuck Lever; +Cc: Ben Greear, Linux NFS Mailing List On Thu, 17 Oct 2013 14:03:05 -0400 Chuck Lever <chuck.lever@oracle.com> wrote: > > On Oct 17, 2013, at 1:35 PM, Ben Greear <greearb@candelatech.com> wrote: > > > On 10/15/2013 11:29 AM, Ben Greear wrote: > >> Is 'umount -f' supposed to always work, even if the file server > >> goes away? > >> > >> I have a user's system that just hangs forever in this case. > >> > >> Could be local changes we have made, but I'm curious about > >> the expected behaviour before I go digging too deep... > > > > Any input on this? I don't mind trying to fix it, but I > > would like to know how it is supposed to work. > > Recent kernels emit a GETATTR at umount time. It is probably this operation that is stuck. > Yep. > > > Older kernels do not hang (we tried 3.0.x), but I'm not sure > > exactly where the problem started. > > > > Test case was to set up NFSv3 mount, then pull the Ethernet cable > > on the nfs client machine. This system is running 3.9.11+ kernel. 
> > > > From /proc/mounts: > > > > 10.2.46.90:/nfs_export on /mnt/lf/nfs3-001 type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.2.46.90,mountvers=3,mountport=19408,mountproto=udp,srcaddr=10.2.46.91,local_lock=none,addr=10.2.46.90) > > > > # umount /mnt/lf/nfs3-001 > > ^C > > # umount -f /mnt/lf/nfs3-001 > > [hangs forever it seems, certainly for a long time] > > > > > > Here is a stack trace of hung processes, for instance: > > > > Oct 17 10:24:18 localhost kernel: [688601.930366] SysRq : Show Blocked State > > Oct 17 10:24:18 localhost kernel: [688601.931016] task PC stack pid father > > Oct 17 10:24:18 localhost kernel: [688601.931016] mkdir D f1bf6700 0 16898 16831 0x00000082 > > Oct 17 10:24:18 localhost kernel: [688601.931016] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400 > > Oct 17 10:24:18 localhost kernel: [688601.931016] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0 > > Oct 17 10:24:18 localhost kernel: [688601.931016] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138 > > Oct 17 10:24:18 localhost kernel: [688601.931016] Call Trace: > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs] > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb133>] schedule+0x23/0x60 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb1e6>] io_schedule+0x76/0xc0 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c051607d>] sleep_on_page+0xd/0x20 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09c8d4d>] __wait_on_bit+0x4d/0x70 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516070>] ? 
__lock_page+0x90/0x90 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516301>] wait_on_page_bit+0x91/0xa0 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0478710>] ? wake_atomic_t_function+0x50/0x50 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs] > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs] > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e1f9>] vfs_fsync_range+0x59/0x70 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e237>] vfs_fsync+0x27/0x30 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs] > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05631a1>] filp_close+0x31/0x80 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057ea55>] put_files_struct+0x85/0xe0 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057eaf7>] exit_files+0x47/0x60 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045b83c>] do_exit+0x25c/0x980 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c056a0be>] ? 
SyS_stat64+0x2e/0x40 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045bf9e>] do_group_exit+0x3e/0xa0 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045c018>] SyS_exit_group+0x18/0x20 > > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09d370d>] sysenter_do_call+0x12/0x28 > > Oct 17 10:24:18 localhost kernel: [688601.931016] umount.nfs D f11c4900 0 17150 17149 0x00000080 > > Oct 17 10:24:18 localhost kernel: [688602.225057] f3955d00 00000082 efea0d8c f11c4900 f3955c8c c08d9f96 f104e700 c0d7e400 > > Oct 17 10:24:18 localhost kernel: [688602.225057] c0d7e400 c0d7e400 c0d7e400 efea0d8c efea0c80 f79db400 f104e700 c0c3e980 > > Oct 17 10:24:18 localhost kernel: [688602.225057] f3955cd0 f3955cb4 f3955e90 0000002c 0000005c 132df575 efea0d80 0000005c > > Oct 17 10:24:18 localhost kernel: [688602.225057] Call Trace: > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c08d9f96>] ? __kfree_skb+0x36/0x90 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09cb133>] schedule+0x23/0x60 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6edd>] rpc_wait_bit_killable+0x2d/0x70 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8d4d>] __wait_on_bit+0x4d/0x70 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8e1b>] out_of_line_wait_on_bit+0xab/0xc0 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0478710>] ? wake_atomic_t_function+0x50/0x50 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec7f9e>] __rpc_execute+0x11e/0x290 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? 
rpcproc_decode_null+0x10/0x10 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c047865f>] ? wake_up_bit+0x5f/0x70 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec814c>] rpc_execute+0x3c/0xa0 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec0f09>] rpc_run_task+0x59/0x70 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec1022>] rpc_call_sync+0x42/0xa0 [sunrpc] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0b46c>] nfs3_rpc_wrapper.clone.0+0x5c/0xa0 [nfsv3] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0c0d4>] nfs3_proc_getattr+0x34/0x40 [nfsv3] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db7397>] __nfs_revalidate_inode+0xc7/0x140 [nfs] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db743f>] nfs_revalidate_inode+0x2f/0x60 [nfs] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db14a8>] nfs_weak_revalidate+0x38/0x50 [nfs] > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c056fba8>] complete_walk+0xa8/0xf0 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0571e53>] path_lookupat+0x63/0x690 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05724ae>] filename_lookup+0x2e/0xc0 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733a3>] user_path_at_empty+0x43/0x80 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0578b9e>] ? __d_free+0x2e/0x50 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c064450c>] ? security_capable+0x1c/0x30 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733ff>] user_path_at+0x1f/0x30 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05807c3>] SyS_umount+0x83/0x380 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c04d2606>] ? __audit_syscall_exit+0x1f6/0x290 > > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09d370d>] sysenter_do_call+0x12/0x28 > > The umount here is stuck trying to revalidate the dentry at the root of the mount. 
This situation should be improved by commit 8033426e6b, which skips revalidating the last component of the lookup. > > > > Oct 17 10:24:42 localhost kernel: [688631.186190] INFO: task mkdir:16898 blocked for more than 180 seconds. > > Oct 17 10:24:42 localhost kernel: [688631.195666] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > Oct 17 10:24:42 localhost kernel: [688631.206304] mkdir D f1bf6700 0 16898 16831 0x00000082 > > Oct 17 10:24:42 localhost kernel: [688631.215220] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400 > > Oct 17 10:24:42 localhost kernel: [688631.225933] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0 > > Oct 17 10:24:42 localhost kernel: [688631.236712] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138 > > Oct 17 10:24:42 localhost kernel: [688631.247550] Call Trace: > > Oct 17 10:24:42 localhost kernel: [688631.252746] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc] > > Oct 17 10:24:42 localhost kernel: [688631.261369] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs] > > Oct 17 10:24:42 localhost kernel: [688631.270065] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110 > > Oct 17 10:24:42 localhost kernel: [688631.277724] [<c09cb133>] schedule+0x23/0x60 > > Oct 17 10:24:42 localhost kernel: [688631.285298] [<c09cb1e6>] io_schedule+0x76/0xc0 > > Oct 17 10:24:42 localhost kernel: [688631.292738] [<c051607d>] sleep_on_page+0xd/0x20 > > Oct 17 10:24:42 localhost kernel: [688631.300316] [<c09c8d4d>] __wait_on_bit+0x4d/0x70 > > Oct 17 10:24:42 localhost kernel: [688631.308117] [<c0516070>] ? __lock_page+0x90/0x90 > > Oct 17 10:24:42 localhost kernel: [688631.315731] [<c0516301>] wait_on_page_bit+0x91/0xa0 > > Oct 17 10:24:42 localhost kernel: [688631.323630] [<c0478710>] ? 
wake_atomic_t_function+0x50/0x50 > > Oct 17 10:24:42 localhost kernel: [688631.332536] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150 > > Oct 17 10:24:42 localhost kernel: [688631.341221] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0 > > Oct 17 10:24:42 localhost kernel: [688631.350224] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs] > > Oct 17 10:24:42 localhost kernel: [688631.358569] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs] > > Oct 17 10:24:42 localhost kernel: [688631.367764] [<c058e1f9>] vfs_fsync_range+0x59/0x70 > > Oct 17 10:24:42 localhost kernel: [688631.375818] [<c058e237>] vfs_fsync+0x27/0x30 > > Oct 17 10:24:42 localhost kernel: [688631.383346] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs] > > Oct 17 10:24:42 localhost kernel: [688631.392117] [<c05631a1>] filp_close+0x31/0x80 > > Oct 17 10:24:42 localhost kernel: [688631.399741] [<c057ea55>] put_files_struct+0x85/0xe0 > > Oct 17 10:24:42 localhost kernel: [688631.407871] [<c057eaf7>] exit_files+0x47/0x60 > > Oct 17 10:24:42 localhost kernel: [688631.415535] [<c045b83c>] do_exit+0x25c/0x980 > > Oct 17 10:24:42 localhost kernel: [688631.423133] [<c056a0be>] ? SyS_stat64+0x2e/0x40 > > Oct 17 10:24:42 localhost kernel: [688631.431078] [<c045bf9e>] do_group_exit+0x3e/0xa0 > > Oct 17 10:24:42 localhost kernel: [688631.439103] [<c045c018>] SyS_exit_group+0x18/0x20 > > Oct 17 10:24:42 localhost kernel: [688631.447169] [<c09d370d>] sysenter_do_call+0x12/0x28 > > Oct 17 10:24:54 localhost kernel: [688643.517069] RPC: AUTH_GSS upcall timed out. > > Of course, the mkdir process here might be holding references that will prevent you from unmounting, but that commit should at least keep the lookup from getting stuck. -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 17:35 ` Ben Greear 2013-10-17 18:03 ` Chuck Lever @ 2013-10-17 18:05 ` Myklebust, Trond 2013-10-17 18:11 ` Ben Greear 1 sibling, 1 reply; 13+ messages in thread From: Myklebust, Trond @ 2013-10-17 18:05 UTC (permalink / raw) To: Ben Greear; +Cc: linux-nfs@vger.kernel.org On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote: > On 10/15/2013 11:29 AM, Ben Greear wrote: > > Is 'umount -f' supposed to always work, even if the file server > > goes away? > > > > I have a user's system that just hangs forever in this case. > > > > Could be local changes we have made, but I'm curious about > > the expected behaviour before I go digging too deep... > > Any input on this? I don't mind trying to fix it, but I > would like to know how it is supposed to work. 'umount -f' has always been iffy. It just kills any pending RPC calls _before_ trying to unmount. Since the unmount itself can trigger writeback flushes (and hence more RPC calls), the trace you are seeing is indeed possible. -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@netapp.com www.netapp.com ^ permalink raw reply [flat|nested] 13+ messages in thread
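For reference, the umount options discussed in this thread map onto flags of the umount2(2) system call: 'umount -f' requests MNT_FORCE and 'umount -l' requests MNT_DETACH. The following is a minimal Python sketch, assuming a Linux client and root privileges. The helper names umount_flags() and umount2() are illustrative wrappers, not anything posted in this thread. As explained above, MNT_FORCE only aborts RPC calls that are already pending, so the call can still block on a hard mount when the server is unreachable.

```python
import ctypes
import ctypes.util
import os

# Flag values from <sys/mount.h> on Linux.
MNT_FORCE = 0x1   # what 'umount -f' requests: abort pending requests first
MNT_DETACH = 0x2  # what 'umount -l' requests: lazy unmount, detach from the tree now


def umount_flags(force=False, lazy=False):
    """Combine the umount2(2) flags corresponding to -f and -l."""
    flags = 0
    if force:
        flags |= MNT_FORCE
    if lazy:
        flags |= MNT_DETACH
    return flags


def umount2(target, flags):
    """Thin wrapper around libc's umount2(); raises OSError on failure.

    Needs root. May block if the kernel has to talk to an unreachable server.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    if libc.umount2(os.fsencode(target), flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), target)
```

A call such as umount2("/mnt/lf/nfs3-001", umount_flags(force=True, lazy=True)) would correspond to 'umount -f -l' on the mount point from the original report.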
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 18:05 ` Myklebust, Trond @ 2013-10-17 18:11 ` Ben Greear 2013-10-17 18:23 ` Christopher T Vogan 2013-10-17 18:32 ` Myklebust, Trond 0 siblings, 2 replies; 13+ messages in thread From: Ben Greear @ 2013-10-17 18:11 UTC (permalink / raw) To: Myklebust, Trond; +Cc: linux-nfs@vger.kernel.org On 10/17/2013 11:05 AM, Myklebust, Trond wrote: > On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote: >> On 10/15/2013 11:29 AM, Ben Greear wrote: >>> Is 'umount -f' supposed to always work, even if the file server >>> goes away? >>> >>> I have a user's system that just hangs forever in this case. >>> >>> Could be local changes we have made, but I'm curious about >>> the expected behaviour before I go digging too deep... >> >> Any input on this? I don't mind trying to fix it, but I >> would like to know how it is supposed to work. > > 'umount -f' has always been iffy. It just kills any pending RPC calls > _before_ trying to unmount. Since the unmount itself can trigger > writeback flushes (and hence more RPC calls), the trace you are seeing > is indeed possible. I tried 'umount -f -l', and that also does not work. Any ideas on how to fix this properly? Thanks, Ben -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 18:11 ` Ben Greear @ 2013-10-17 18:23 ` Christopher T Vogan 2013-10-17 18:32 ` Myklebust, Trond 1 sibling, 0 replies; 13+ messages in thread From: Christopher T Vogan @ 2013-10-17 18:23 UTC (permalink / raw) To: Ben Greear; +Cc: Myklebust, Trond, linux-nfs@vger.kernel.org, linux-nfs-owner I have reported two scenarios related to this issue, under the subjects "vfs: allow umount to handle mountpoints without revalidating them" and "NFSERR_STALE on umount with 3.10.0.RC5 kernel". The second is the more relevant to your problem. Christopher Vogan NFS Development & Test From: Ben Greear <greearb@candelatech.com> To: "Myklebust, Trond" <Trond.Myklebust@netapp.com>, Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org> Date: 10/17/2013 01:14 PM Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone. Sent by: linux-nfs-owner@vger.kernel.org On 10/17/2013 11:05 AM, Myklebust, Trond wrote: > On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote: >> On 10/15/2013 11:29 AM, Ben Greear wrote: >>> Is 'umount -f' supposed to always work, even if the file server >>> goes away? >>> >>> I have a user's system that just hangs forever in this case. >>> >>> Could be local changes we have made, but I'm curious about >>> the expected behaviour before I go digging too deep... >> >> Any input on this? I don't mind trying to fix it, but I >> would like to know how it is supposed to work. > > 'umount -f' has always been iffy. It just kills any pending RPC calls > _before_ trying to unmount. Since the unmount itself can trigger > writeback flushes (and hence more RPC calls), the trace you are seeing > is indeed possible. I tried 'umount -f -l', and that also does not work. Any ideas on how to fix this properly?
Thanks, Ben -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 18:11 ` Ben Greear 2013-10-17 18:23 ` Christopher T Vogan @ 2013-10-17 18:32 ` Myklebust, Trond 2013-10-17 18:35 ` Ben Greear 1 sibling, 1 reply; 13+ messages in thread From: Myklebust, Trond @ 2013-10-17 18:32 UTC (permalink / raw) To: Ben Greear; +Cc: linux-nfs@vger.kernel.org On Thu, 2013-10-17 at 11:11 -0700, Ben Greear wrote: > On 10/17/2013 11:05 AM, Myklebust, Trond wrote: > > On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote: > >> On 10/15/2013 11:29 AM, Ben Greear wrote: > >>> Is 'umount -f' supposed to always work, even if the file server > >>> goes away? > >>> > >>> I have a user's system that just hangs forever in this case. > >>> > >>> Could be local changes we have made, but I'm curious about > >>> the expected behaviour before I go digging too deep... > >> > >> Any input on this? I don't mind trying to fix it, but I > >> would like to know how it is supposed to work. > > > > 'umount -f' has always been iffy. It just kills any pending RPC calls > > _before_ trying to unmount. Since the unmount itself can trigger > > writeback flushes (and hence more RPC calls), the trace you are seeing > > is indeed possible. > > I tried 'umount -f -l', and that also does not work. > > Any ideas on how to fix this properly? 'umount -f -l' should normally work to at least hide the gruesome details of your hanging superblock. I'm guessing that you're falling afoul of the path revalidation that Chuck alluded to. There should already be a fix for that problem with the path_umountat() patches that went into Linux 3.12-rc1. Are those failing to help? -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@netapp.com www.netapp.com ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 18:32 ` Myklebust, Trond @ 2013-10-17 18:35 ` Ben Greear 2013-10-17 18:42 ` Myklebust, Trond 0 siblings, 1 reply; 13+ messages in thread From: Ben Greear @ 2013-10-17 18:35 UTC (permalink / raw) To: Myklebust, Trond; +Cc: linux-nfs@vger.kernel.org On 10/17/2013 11:32 AM, Myklebust, Trond wrote: > On Thu, 2013-10-17 at 11:11 -0700, Ben Greear wrote: >> On 10/17/2013 11:05 AM, Myklebust, Trond wrote: >>> On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote: >>>> On 10/15/2013 11:29 AM, Ben Greear wrote: >>>>> Is 'umount -f' supposed to always work, even if the file server >>>>> goes away? >>>>> >>>>> I have a user's system that just hangs forever in this case. >>>>> >>>>> Could be local changes we have made, but I'm curious about >>>>> the expected behaviour before I go digging too deep... >>>> >>>> Any input on this? I don't mind trying to fix it, but I >>>> would like to know how it is supposed to work. >>> >>> 'umount -f' has always been iffy. It just kills any pending RPC calls >>> _before_ trying to unmount. Since the unmount itself can trigger >>> writeback flushes (and hence more RPC calls), the trace you are seeing >>> is indeed possible. >> >> I tried 'umount -f -l', and that also does not work. >> >> Any ideas on how to fix this properly? > > 'umount -f -l' should normally work to at least hide the gruesome > details of your hanging superblock. > > I'm guessing that you're falling afoul of the path revalidation that > Chuck alluded to. There should already be a fix for that problem with > the path_umountat() patches that went into Linux 3.12-rc1. Are those > failing to help? I have not tried past 3.9.11+ kernel yet. I will go look for those patches you mention as well. Did any of this go to -stable by chance?
Thanks, Ben -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 18:35 ` Ben Greear @ 2013-10-17 18:42 ` Myklebust, Trond 2013-10-17 19:34 ` Ben Greear 0 siblings, 1 reply; 13+ messages in thread From: Myklebust, Trond @ 2013-10-17 18:42 UTC (permalink / raw) To: Ben Greear; +Cc: linux-nfs@vger.kernel.org On Thu, 2013-10-17 at 11:35 -0700, Ben Greear wrote: > On 10/17/2013 11:32 AM, Myklebust, Trond wrote: > > On Thu, 2013-10-17 at 11:11 -0700, Ben Greear wrote: > >> On 10/17/2013 11:05 AM, Myklebust, Trond wrote: > >>> On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote: > >>>> On 10/15/2013 11:29 AM, Ben Greear wrote: > >>>>> Is 'umount -f' supposed to always work, even if the file server > >>>>> goes away? > >>>>> > >>>>> I have a user's system that just hangs forever in this case. > >>>>> > >>>>> Could be local changes we have made, but I'm curious about > >>>>> the expected behaviour before I go digging too deep... > >>>> > >>>> Any input on this? I don't mind trying to fix it, but I > >>>> would like to know how it is supposed to work. > >>> > >>> 'umount -f' has always been iffy. It just kills any pending RPC calls > >>> _before_ trying to unmount. Since the unmount itself can trigger > >>> writeback flushes (and hence more RPC calls), the trace you are seeing > >>> is indeed possible. > >> > >> I tried 'umount -f -l', and that also does not work. > >> > >> Any ideas on how to fix this properly? > > > > 'umount -f -l' should normally work to at least hide the gruesome > > details of your hanging superblock. > > > > I'm guessing that you're falling afoul of the path revalidation that > > Chuck alluded to. There should already be a fix for that problem with > > the path_umountat() patches that went into Linux 3.12-rc1. Are those > > failing to help? > > I have not tried past 3.9.11 kernel yet. I will go look for those patches > you mention as well. Did any of this go to -stable by chance? Not as far as I know. 
The commit identifier is 8033426e6bdb2690d302872ac1e1fadaec1a5581 (vfs: allow umount to handle mountpoints without revalidating them) in case you are interested. -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@netapp.com www.netapp.com ^ permalink raw reply [flat|nested] 13+ messages in thread
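As a rough aid for readers checking their own systems: per the messages above, this commit went into mainline with Linux 3.12-rc1 and was not sent to -stable as far as the maintainer knew. The Python sketch below compares a kernel release string against that version. The function names are illustrative, and the check is only an assumption-laden heuristic: it says nothing about vendor or locally patched kernels, which may carry the commit as a backport.

```python
import platform
import re

# Commit named in this thread and the first mainline series reported to contain it.
FIX_COMMIT = "8033426e6bdb2690d302872ac1e1fadaec1a5581"
FIRST_MAINLINE_WITH_FIX = (3, 12)


def release_tuple(release):
    """Extract (major, minor) from a release string such as '3.9.11+'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def mainline_has_fix(release):
    """True if an unmodified mainline kernel of this release should include the commit.

    Heuristic only: backports to older kernels are not detected.
    """
    return release_tuple(release) >= FIRST_MAINLINE_WITH_FIX


if __name__ == "__main__":
    running = platform.release()
    print(running, "mainline includes fix:", mainline_has_fix(running))
```

On a tree with full history, 'git merge-base --is-ancestor <commit> HEAD' is the more reliable check, since it looks at the actual commits rather than the version number.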
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 18:42 ` Myklebust, Trond @ 2013-10-17 19:34 ` Ben Greear 2013-10-17 19:36 ` Ben Greear 0 siblings, 1 reply; 13+ messages in thread From: Ben Greear @ 2013-10-17 19:34 UTC (permalink / raw) To: Myklebust, Trond; +Cc: linux-nfs@vger.kernel.org On 10/17/2013 11:42 AM, Myklebust, Trond wrote: > On Thu, 2013-10-17 at 11:35 -0700, Ben Greear wrote: >>> 'umount -f -l' should normally work to at least hide the gruesome >>> details of your hanging superblock. >>> >>> I'm guessing that you're falling afoul of the path revalidation that >>> Chuck alluded to. There should already be a fix for that problem with >>> the path_umountat() patches that went into Linux 3.12-rc1. Are those >>> failing to help? >> >> I have not tried past 3.9.11 kernel yet. I will go look for those patches >> you mention as well. Did any of this go to -stable by chance? > > Not as far as I know. > > The commit identifier is 8033426e6bdb2690d302872ac1e1fadaec1a5581 (vfs: > allow umount to handle mountpoints without revalidating them) in case > you are interested. Ok, that is the one that Jeff pointed me to a bit ago. I re-ran the test with this patch (which applies cleanly to 3.9.11+). In this case, I see a hang in my file-io process, but 'umount -l foo' returns immediately and the mount is gone from /proc/mounts. I tried 'kill -9' but the btserver process won't die. I plugged the cable back in so that the mount could recover, but the process is still hung. Maybe because I did the 'umount -l'? After the cable was reconnected (and with the btserver process still hung), I tried to re-mount the same partition. Those mount calls are hanging as well. So, maybe some progress, but I think there are still some fixes needed.
[ 167.229748] r8169 0000:02:00.0 eth1: link down
[ 379.288195] INFO: task btserver:6895 blocked for more than 180 seconds.
[ 379.300366] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 379.313502] btserver D f3a3a2a4 0 6895 1431 0x00000080
[ 379.325191] f0615e08 00000086 00000282 f3a3a2a4 f0615dd8 f3a3a2a4 f1ed99a0 c0d41240
[ 379.338396] c0d41240 c0d41240 c0d41240 7913580e 00000027 f79db240 f1ed99a0 f5936680
[ 379.351591] f8e4ffd0 f0615dcc f3a3a2a4 f0615dcc f8e120df f0615e10 f8e4a3c7 f0f2a138
[ 379.365431] Call Trace:
[ 379.373114] [<f8e120df>] ? rpc_put_task+0xf/0x20 [sunrpc]
[ 379.384078] [<f8e4a3c7>] ? nfs_initiate_write+0xb7/0xe0 [nfs]
[ 379.395078] [<c04a076e>] ? ktime_get_ts+0x3e/0x110
[ 379.405192] [<c09baf43>] schedule+0x23/0x60
[ 379.414219] [<c09baff6>] io_schedule+0x76/0xc0
[ 379.423540] [<c05080bd>] sleep_on_page+0xd/0x20
[ 379.432895] [<c09b8fcd>] __wait_on_bit+0x4d/0x70
[ 379.442306] [<c05080b0>] ? __lock_page+0x90/0x90
[ 379.451693] [<c0508381>] wait_on_page_bit+0x91/0xa0
[ 379.461264] [<c0472690>] ? autoremove_wake_function+0x50/0x50
[ 379.472217] [<c050855b>] filemap_fdatawait_range+0xdb/0x150
[ 379.482471] [<c0508727>] filemap_write_and_wait_range+0x77/0x90
[ 379.493219] [<f8e3f074>] nfs_file_fsync+0x44/0xa0 [nfs]
[ 379.502922] [<f8e3f030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
[ 379.513423] [<c0581179>] vfs_fsync_range+0x59/0x70
[ 379.522692] [<c05811b7>] vfs_fsync+0x27/0x30
[ 379.531426] [<f8e3fabb>] nfs_file_flush+0x6b/0x90 [nfs]
[ 379.541135] [<c05546b1>] filp_close+0x31/0x80
[ 379.549817] [<c056fb9a>] __close_fd+0x6a/0x90
[ 379.558490] [<c055465c>] sys_close+0x1c/0x40
[ 379.567062] [<c09c26cd>] sysenter_do_call+0x12/0x28
....
Oct 17 12:25:09 localhost kernel: [ 1240.992796] SysRq : Show Blocked State Oct 17 12:25:09 localhost kernel: [ 1240.993012] task PC stack pid father Oct 17 12:25:09 localhost kernel: [ 1240.993012] btserver D f0f2a204 0 8701 1431 0x00000086 Oct 17 12:25:09 localhost kernel: [ 1240.993012] f5bc3c64 00000046 00000000 f0f2a204 00000000 f5aec010 f153e680 c0d41240 Oct 17 12:25:09 localhost kernel: [ 1240.993012] c0d41240 c0d41240 c0d41240 cbf49405 00000103 f79e9240 f153e680 f11a8000 Oct 17 12:25:09 localhost kernel: [ 1240.993012] f5bc3c28 c04a076e f582a148 00000246 00000246 f5bc3c5c c04d6ff6 00014993 Oct 17 12:25:09 localhost kernel: [ 1240.993012] Call Trace: Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c04a076e>] ? ktime_get_ts+0x3e/0x110 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c04d6ff6>] ? delayacct_end+0x96/0xb0 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c04a076e>] ? ktime_get_ts+0x3e/0x110 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c09baf43>] schedule+0x23/0x60 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c09baff6>] io_schedule+0x76/0xc0 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c05080bd>] sleep_on_page+0xd/0x20 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c09b8fcd>] __wait_on_bit+0x4d/0x70 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c05080b0>] ? __lock_page+0x90/0x90 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c0508381>] wait_on_page_bit+0x91/0xa0 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c0472690>] ? autoremove_wake_function+0x50/0x50 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c050855b>] filemap_fdatawait_range+0xdb/0x150 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c0508727>] filemap_write_and_wait_range+0x77/0x90 Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<f8e3f074>] nfs_file_fsync+0x44/0xa0 [nfs] Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<f8e3f030>] ? 
nfs_file_fsync_commit+0xb0/0xb0 [nfs] Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c0581179>] vfs_fsync_range+0x59/0x70 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05811b7>] vfs_fsync+0x27/0x30 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3fabb>] nfs_file_flush+0x6b/0x90 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05546b1>] filp_close+0x31/0x80 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0570085>] put_files_struct+0x85/0xe0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0570127>] exit_files+0x47/0x60 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c045653c>] do_exit+0x25c/0x980 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0456c9e>] do_group_exit+0x3e/0xa0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c046630b>] get_signal_to_deliver+0x1db/0x5f0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09ba9f3>] ? __schedule+0x3e3/0x7e0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04135aa>] do_signal+0x3a/0x920 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c047eedb>] ? update_rq_clock+0x3b/0x2b0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0456eee>] ? do_wait+0xfe/0x210 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c045707d>] ? sys_wait4+0x7d/0xb0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04c8126>] ? __audit_syscall_exit+0x1f6/0x280 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0454f70>] ? 
wait_noreap_copyout+0xd0/0xd0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0413eff>] do_notify_resume+0x6f/0xa0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09bc505>] work_notifysig+0x30/0x37 Oct 17 12:25:09 localhost kernel: [ 1241.175689] mkdir D f5aec010 0 8741 8701 0x00000082 Oct 17 12:25:09 localhost kernel: [ 1241.175689] f3abfd8c 00000046 00000282 f5aec010 f11a8000 f153e680 f11a8000 c0d41240 Oct 17 12:25:09 localhost kernel: [ 1241.175689] c0d41240 c0d41240 c0d41240 cbf72225 00000103 f79e9240 f11a8000 f3188cd0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] f3abfd50 c04a076e f15526e8 00000246 00000246 f3abfd84 c04d6ff6 00019454 Oct 17 12:25:09 localhost kernel: [ 1241.175689] Call Trace: Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04a076e>] ? ktime_get_ts+0x3e/0x110 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04d6ff6>] ? delayacct_end+0x96/0xb0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04a076e>] ? ktime_get_ts+0x3e/0x110 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09baf43>] schedule+0x23/0x60 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09baff6>] io_schedule+0x76/0xc0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05080bd>] sleep_on_page+0xd/0x20 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09b8fcd>] __wait_on_bit+0x4d/0x70 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05080b0>] ? __lock_page+0x90/0x90 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0508381>] wait_on_page_bit+0x91/0xa0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0472690>] ? autoremove_wake_function+0x50/0x50 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c050855b>] filemap_fdatawait_range+0xdb/0x150 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0508727>] filemap_write_and_wait_range+0x77/0x90 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3f074>] nfs_file_fsync+0x44/0xa0 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3f030>] ? 
nfs_file_fsync_commit+0xb0/0xb0 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0581179>] vfs_fsync_range+0x59/0x70 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05811b7>] vfs_fsync+0x27/0x30 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3fabb>] nfs_file_flush+0x6b/0x90 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05546b1>] filp_close+0x31/0x80 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0570085>] put_files_struct+0x85/0xe0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0570127>] exit_files+0x47/0x60 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c045653c>] do_exit+0x25c/0x980 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0456c9e>] do_group_exit+0x3e/0xa0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0456d18>] sys_exit_group+0x18/0x20 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09c26cd>] sysenter_do_call+0x12/0x28 Oct 17 12:25:09 localhost kernel: [ 1241.175689] mount.nfs D 00000000 0 9474 9473 0x00000080 Oct 17 12:25:09 localhost kernel: [ 1241.175689] f04d1be0 00000082 d07942dc 00000000 00000082 0000b800 f1fec010 c0d41240 Oct 17 12:25:09 localhost kernel: [ 1241.175689] c0d41240 c0d41240 c0d41240 f58bc570 00000000 f79db240 f1fec010 c0c19180 Oct 17 12:25:09 localhost kernel: [ 1241.175689] 00000000 00000000 00000020 00000000 f582b400 f79db240 00000000 f04d1c10 Oct 17 12:25:09 localhost kernel: [ 1241.175689] Call Trace: Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c048b2a0>] ? idle_balance+0x100/0x420 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09baf43>] schedule+0x23/0x60 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e123fd>] rpc_wait_bit_killable+0x2d/0x70 [sunrpc] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09b8fcd>] __wait_on_bit+0x4d/0x70 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e123d0>] ? rpc_queue_empty+0x40/0x40 [sunrpc] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e123d0>] ? 
rpc_queue_empty+0x40/0x40 [sunrpc] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09b909b>] out_of_line_wait_on_bit+0xab/0xc0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0472690>] ? autoremove_wake_function+0x50/0x50 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e134fe>] __rpc_execute+0x11e/0x2a0 [sunrpc] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e0a130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e0a130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c047262f>] ? wake_up_bit+0x5f/0x70 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e136b4>] rpc_execute+0x34/0x90 [sunrpc] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e0bc79>] rpc_run_task+0x59/0x70 [sunrpc] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e0bd92>] rpc_call_sync+0x42/0xa0 [sunrpc] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8c0547c>] nfs3_rpc_wrapper.clone.0+0x5c/0xa0 [nfsv3] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8c06153>] do_proc_fsinfo+0x33/0x40 [nfsv3] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8c06183>] nfs3_proc_fsinfo+0x23/0x50 [nfsv3] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3a97f>] nfs_probe_fsinfo+0x4f/0x500 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3bef1>] nfs_create_server+0x201/0x440 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8c050ae>] nfs3_create_server+0xe/0x30 [nfsv3] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e43fc1>] nfs_try_mount+0x151/0x280 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e42e1d>] ? nfs_get_option_ul+0x3d/0x50 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e45d1b>] ? nfs_fs_mount+0x6db/0x9c0 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3a7d8>] ? get_nfs_version+0x28/0x80 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3a7d8>] ? 
get_nfs_version+0x28/0x80 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0520453>] ? kstrndup+0x43/0x60 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e457cd>] nfs_fs_mount+0x18d/0x9c0 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e45450>] ? nfs_clone_super+0x150/0x150 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e43d50>] ? nfs_clone_sb_security+0x50/0x50 [nfs] Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0559036>] mount_fs+0x36/0x180 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0524b3f>] ? __alloc_percpu+0xf/0x20 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0572180>] vfs_kern_mount+0x50/0xc0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05737d8>] do_mount+0x2b8/0x810 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c050f68b>] ? __get_free_pages+0x2b/0x30 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05714e1>] ? copy_mount_options+0x41/0x120 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0573d9b>] sys_mount+0x6b/0xa0 Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09c26cd>] sysenter_do_call+0x12/0x28 Thanks, Ben -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 13+ messages in thread
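The "Show Blocked State" dumps above list tasks in uninterruptible sleep (state D). A similar list can be gathered from user space by reading /proc/<pid>/stat. The Python sketch below is a minimal illustration, assuming a Linux /proc; the helper names parse_stat() and blocked_tasks() are made up for this example. Unlike the SysRq output it shows no kernel stack, so it only answers which tasks are stuck, not where.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the first '(' and the last ')'.
    """
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for processes currently in state 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat"), errors="replace") as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited, or the entry could not be read or parsed
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

This scans processes only; individual threads appear under /proc/<pid>/task/ and are not covered here.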
* Re: 'umount -f /mnt/foo' fails if server IP is gone. 2013-10-17 19:34 ` Ben Greear @ 2013-10-17 19:36 ` Ben Greear 0 siblings, 0 replies; 13+ messages in thread From: Ben Greear @ 2013-10-17 19:36 UTC (permalink / raw) To: Myklebust, Trond; +Cc: linux-nfs@vger.kernel.org On 10/17/2013 12:34 PM, Ben Greear wrote: > After cable is reconnected, (and with btserver process still hung), > I tried to re-mount the same partition. Those mount calls are hanging > as well. > > So, maybe some progress, but I think there are still some fixes needed. About the time I finished composing this email and sent it, it appears everything cleaned up. So, maybe not quite as bad as it first looked, but still room for improvement in my opinion. Thanks, Ben -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, newest message 2013-10-17 19:36 UTC

Thread overview: 13+ messages
2013-10-15 18:29 'umount -f /mnt/foo' fails if server IP is gone Ben Greear
2013-10-17 17:35 ` Ben Greear
2013-10-17 18:03 ` Chuck Lever
2013-10-17 18:08 ` Ben Greear
2013-10-17 18:16 ` Jeff Layton
2013-10-17 18:05 ` Myklebust, Trond
2013-10-17 18:11 ` Ben Greear
2013-10-17 18:23 ` Christopher T Vogan
2013-10-17 18:32 ` Myklebust, Trond
2013-10-17 18:35 ` Ben Greear
2013-10-17 18:42 ` Myklebust, Trond
2013-10-17 19:34 ` Ben Greear
2013-10-17 19:36 ` Ben Greear