From: "bfields@fieldses.org" <bfields@fieldses.org>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "nix@esperi.org.uk" <nix@esperi.org.uk>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: steam-associated reproducible hard NFSv4.2 client hang (5.9, 5.10)
Date: Tue, 23 Feb 2021 21:01:40 -0500 [thread overview]
Message-ID: <20210224020140.GA26848@fieldses.org> (raw)
In-Reply-To: <fde7a43ac9b61a1aff53381d0ab7b48b78cb79db.camel@hammerspace.com>
On Tue, Feb 23, 2021 at 11:58:51PM +0000, Trond Myklebust wrote:
> On Tue, 2021-02-23 at 17:57 -0500, J. Bruce Fields wrote:
> > On Sun, Jan 03, 2021 at 02:27:51PM +0000, Nick Alcock wrote:
> > > Relevant exports, from /proc/fs/nfs/exports:
> > >
> > > / 192.168.16.0/24,silk.wkstn.nix(ro,insecure,no_root_squash,s
> > > ync,no_wdelay,no_subtree_check,v4root,fsid=0,uuid=0a4a4563:00764033
> > > :8c827c0e:989cf534,sec=390003:390004:390005:1)
> > > /home/.loom.srvr.nix *.srvr.nix,fold.srvr.nix(rw,root_squash,syn
> > > c,wdelay,no_subtree_check,uuid=0a4a4563:00764033:8c827c0e:989cf534,
> > > sec=1)
>
> Isn't that trying to export the same filesystem as '/' on the line
> above using conflicting export options?
Yes, and exporting something on the root filesystem is generally a bad
idea for the usual reasons.
I don't see any explanation for the write hang there, though.
Unless maybe if it leaves mountd unable to answer an upcall for some
reason, hm.
I think that would show up in /proc/net/rpc/nfsd.fh/content. Try
cat'ing that file after 'rpcdebug -m rpc -s cache' and that should show
if there's an upcall that's not getting answered.
--b.
>
>
> > > /pkg/non-
> > > free/steam 192.168.16.0/24,silk.wkstn.nix(rw,root_squash,sync,w
> > > delay,uuid=11b94016:cb6e4b64:beb6c134:9ea3d6b3,sec=1)
> > >
> > > Blocked tasks after the hang suggest that the client is hung
> > > waiting
> > > indefinitely for the server to reply, and I'd guess the server
> > > doesn't
> > > think it needs to. (I can do some server-side debugging dumps too
> > > if you
> > > think the problem might be there.)
> >
> > I seem to remember there have been some subtle changes to how dropped
> > replies are handled. If I remember right: a v4 server is never
> > supposed
> > to drop a reply without also closing the tcp connection, but it may
> > have
> > done so in the past. (I *think* all those cases are fixed?)
> >
> > And the client may have been changed to be more unforgiving about
> > that
> > requirement in recent years.
> >
> > But I don't know if that's where the issue is.
> >
> > I'm not sure how best to go about debugging. I suppose you could try
> > to
> > capture a network trace and work out if there's an RPC that isn't
> > getting a reply.
> >
> > Maybe turning on server debugging and looking for "svc: svc_process
> > dropit" messages timed with the start of the hang. I think what you
> > want is "rpdebug -m rpc -s svcdsp".
> >
> > Does rpc.mountd look healthy?
> >
> > --b.
> >
> >
> > >
> > > [ 322.843962] sysrq: Show Blocked State
> > > [ 322.844058] task:tee state:D stack: 0 pid: 1176
> > > ppid: 1173 flags:0x00000000
> > > [ 322.844064] Call Trace:
> > > [ 322.844074] __schedule+0x3db/0x7ee
> > > [ 322.844079] ? io_schedule+0x70/0x6d
> > > [ 322.844082] schedule+0x68/0xdc
> > > [ 322.844085] io_schedule+0x46/0x6d
> > > [ 322.844089] bit_wait_io+0x11/0x52
> > > [ 322.844092] __wait_on_bit+0x31/0x90
> > > [ 322.844096] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 322.844100] ? var_wake_function+0x30/0x27
> > > [ 322.844104] nfs_wait_on_request.part.0+0x31/0x33
> > > [ 322.844107] nfs_page_group_lock_head+0x31/0x78
> > > [ 322.844110] nfs_lock_and_join_requests+0x9d/0x26e
> > > [ 322.844114] nfs_updatepage+0x1be/0x985
> > > [ 322.844118] nfs_write_end+0x128/0x52a
> > > [ 322.844122] ? iov_iter_copy_from_user_atomic+0xe8/0x3ce
> > > [ 322.844127] generic_perform_write+0xed/0x17c
> > > [ 322.844131] nfs_file_write+0x14a/0x29f
> > > [ 322.844135] new_sync_write+0xfb/0x16b
> > > [ 322.844139] vfs_write+0x1ee/0x29a
> > > [ 322.844142] ? vfs_read+0x17b/0x198
> > > [ 322.844145] ksys_write+0x61/0xd0
> > > [ 322.844148] __x64_sys_write+0x1a/0x1c
> > > [ 322.844151] do_syscall_64+0x32/0x45
> > > [ 322.844154] entry_SYSCALL_64_after_hwframe+0x44/0x0
> > > [ 322.844158] RIP: 0033:0x7f4781f323a3
> > > [ 322.844160] RSP: 002b:00007ffcf8220408 EFLAGS: 00000246
> > > ORIG_RAX: 0000000000000001
> > > [ 322.844164] RAX: ffffffffffffffda RBX: 0000000000000092 RCX:
> > > 00007f4781f323a3
> > > [ 322.844166] RDX: 0000000000000092 RSI: 00007ffcf8220530 RDI:
> > > 0000000000000003
> > > [ 322.844168] RBP: 00007ffcf8220530 R08: 0000000000000000 R09:
> > > 0000000000000001
> > > [ 322.844170] R10: 0000000000400ad2 R11: 0000000000000246 R12:
> > > 0000000000000092
> > > [ 322.844172] R13: 0000000002361580 R14: 0000000000000092 R15:
> > > 00007f4782201720
> > > [ 322.844202] task:CIPCServer::Thr state:D stack: 0 pid: 1801
> > > ppid: 1540 flags:0x20020000
> > > [ 322.844207] Call Trace:
> > > [ 322.844212] __schedule+0x3db/0x7ee
> > > [ 322.844219] ?
> > > trace_event_raw_event_rpc_xdr_overflow+0x280/0x27e
> > > [ 322.844222] schedule+0x68/0xdc
> > > [ 322.844227] rpc_wait_bit_killable+0x11/0x52
> > > [ 322.844230] __wait_on_bit+0x31/0x90
> > > [ 322.844235] ? __rpc_atrun+0x20/0x1a
> > > [ 322.844238] ? __rpc_atrun+0x20/0x1a
> > > [ 322.844242] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 322.844246] ? var_wake_function+0x30/0x27
> > > [ 322.844250] __rpc_execute+0xe5/0x485
> > > [ 322.844254] ? sugov_get_util+0xf0/0xeb
> > > [ 322.844258] rpc_execute+0xa5/0xaa
> > > [ 322.844263] rpc_run_task+0x148/0x187
> > > [ 322.844267] nfs4_do_call_sync+0x61/0x81
> > > [ 322.844272] _nfs4_proc_getattr+0x10a/0x120
> > > [ 322.844278] nfs4_proc_getattr+0x67/0xee
> > > [ 322.844283] __nfs_revalidate_inode+0xb7/0x2df
> > > [ 322.844287] nfs_access_get_cached+0x14d/0x1df
> > > [ 322.844291] nfs_do_access+0x48/0x28e
> > > [ 322.844296] ? step_into+0x2bc/0x6a1
> > > [ 322.844301] nfs_permission+0xa2/0x1b8
> > > [ 322.844305] inode_permission+0x96/0xca
> > > [ 322.844308] ? unlazy_walk+0x4c/0x80
> > > [ 322.844312] link_path_walk.part.0+0x2a5/0x320
> > > [ 322.844315] ? path_init+0x2c2/0x3e5
> > > [ 322.844318] path_lookupat+0x3f/0x1b1
> > > [ 322.844322] filename_lookup+0x97/0x17f
> > > [ 322.844326] ? kmem_cache_alloc+0x32/0x200
> > > [ 322.844331] user_path_at_empty+0x59/0x8a
> > > [ 322.844333] do_faccessat+0x70/0x227
> > > [ 322.844337] __ia32_sys_access+0x1c/0x1e
> > > [ 322.844340] __do_fast_syscall_32+0x5f/0x8e
> > > [ 322.844342] do_fast_syscall_32+0x3d/0x80
> > > [ 322.844346] entry_SYSCALL_compat_after_hwframe+0x45/0x0
> > > [ 322.844348] RIP: 0023:0xf7ee6549
> > > [ 322.844350] RSP: 002b:00000000eb763b28 EFLAGS: 00200292
> > > ORIG_RAX: 0000000000000021
> > > [ 322.844353] RAX: ffffffffffffffda RBX: 000000005754d9e0 RCX:
> > > 0000000000000004
> > > [ 322.844355] RDX: 00000000f7ab3e1c RSI: 000000005754d9e0 RDI:
> > > 00000000eb763bc8
> > > [ 322.844357] RBP: 00000000f78a91c0 R08: 0000000000000000 R09:
> > > 0000000000000000
> > > [ 322.844359] R10: 0000000000000000 R11: 0000000000000000 R12:
> > > 0000000000000000
> > > [ 322.844360] R13: 0000000000000000 R14: 0000000000000000 R15:
> > > 0000000000000000
> > > [ 322.844364] task:CFileWriterThre state:D stack: 0 pid: 1810
> > > ppid: 1540 flags:0x20024000
> > > [ 322.844367] Call Trace:
> > > [ 322.844371] __schedule+0x3db/0x7ee
> > > [ 322.844375] ?
> > > trace_event_raw_event_rpc_xdr_overflow+0x280/0x27e
> > > [ 322.844379] schedule+0x68/0xdc
> > > [ 322.844382] rpc_wait_bit_killable+0x11/0x52
> > > [ 322.844385] __wait_on_bit+0x31/0x90
> > > [ 322.844389] ? call_reserveresult+0xa0/0x9f
> > > [ 322.844391] ? call_transmit_status+0x160/0x159
> > > [ 322.844395] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 322.844398] ? var_wake_function+0x30/0x27
> > > [ 322.844401] __rpc_execute+0xe5/0x485
> > > [ 322.844404] ? nfs_check_verifier+0x5b/0xa4
> > > [ 322.844407] rpc_execute+0xa5/0xaa
> > > [ 322.844410] rpc_run_task+0x148/0x187
> > > [ 322.844414] nfs4_do_call_sync+0x61/0x81
> > > [ 322.844417] _nfs4_proc_getattr+0x10a/0x120
> > > [ 322.844422] nfs4_proc_getattr+0x67/0xee
> > > [ 322.844426] __nfs_revalidate_inode+0xb7/0x2df
> > > [ 322.844430] nfs_getattr+0x159/0x374
> > > [ 322.844433] vfs_getattr_nosec+0x96/0xa4
> > > [ 322.844435] vfs_statx+0x7a/0xe5
> > > [ 322.844438] vfs_fstatat+0x1f/0x21
> > > [ 322.844443] __do_compat_sys_ia32_lstat64+0x2b/0x4e
> > > [ 322.844448] __ia32_compat_sys_ia32_lstat64+0x14/0x16
> > > [ 322.844450] __do_fast_syscall_32+0x5f/0x8e
> > > [ 322.844453] do_fast_syscall_32+0x3d/0x80
> > > [ 322.844457] entry_SYSCALL_compat_after_hwframe+0x45/0x0
> > > [ 322.844459] RIP: 0023:0xf7ee6549
> > > [ 322.844460] RSP: 002b:00000000eb561be8 EFLAGS: 00000292
> > > ORIG_RAX: 00000000000000c4
> > > [ 322.844463] RAX: ffffffffffffffda RBX: 00000000c521bf60 RCX:
> > > 00000000eb561e70
> > > [ 322.844465] RDX: 00000000f7ab3e1c RSI: 00000000eb561e70 RDI:
> > > 00000000c521bf60
> > > [ 322.844467] RBP: 0000000000000000 R08: 0000000000000000 R09:
> > > 0000000000000000
> > > [ 322.844468] R10: 0000000000000000 R11: 0000000000000000 R12:
> > > 0000000000000000
> > > [ 322.844470] R13: 0000000000000000 R14: 0000000000000000 R15:
> > > 0000000000000000
> > > [ 322.844477] task:Steam - DirWatc state:D stack: 0 pid: 1924
> > > ppid: 1540 flags:0x20024000
> > > [ 322.844480] Call Trace:
> > > [ 322.844484] __schedule+0x3db/0x7ee
> > > [ 322.844488] ?
> > > trace_event_raw_event_rpc_xdr_overflow+0x280/0x27e
> > > [ 322.844492] schedule+0x68/0xdc
> > > [ 322.844495] rpc_wait_bit_killable+0x11/0x52
> > > [ 322.844498] __wait_on_bit+0x31/0x90
> > > [ 322.844501] ? rpc_sleep_on_timeout+0x64/0x9c
> > > [ 322.844504] ? rpc_check_timeout+0x1e0/0x1e0
> > > [ 322.844507] ? call_connect_status+0x180/0x17a
> > > [ 322.844511] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 322.844513] ? var_wake_function+0x30/0x27
> > > [ 322.844516] __rpc_execute+0xe5/0x485
> > > [ 322.844520] rpc_execute+0xa5/0xaa
> > > [ 322.844523] rpc_run_task+0x148/0x187
> > > [ 322.844526] nfs4_do_call_sync+0x61/0x81
> > > [ 322.844530] _nfs4_proc_getattr+0x10a/0x120
> > > [ 322.844534] nfs4_proc_getattr+0x67/0xee
> > > [ 322.844538] __nfs_revalidate_inode+0xb7/0x2df
> > > [ 322.844542] nfs_lookup_verify_inode+0x67/0x82
> > > [ 322.844545] nfs_do_lookup_revalidate+0x19b/0x29c
> > > [ 322.844548] nfs4_do_lookup_revalidate+0x5d/0xc5
> > > [ 322.844552] nfs4_lookup_revalidate+0x6a/0x8d
> > > [ 322.844555] lookup_fast+0xb8/0x122
> > > [ 322.844559] path_openat+0xfd/0xfd8
> > > [ 322.844562] ? unlazy_walk+0x4c/0x80
> > > [ 322.844565] ? terminate_walk+0x62/0xed
> > > [ 322.844569] ? putname+0x4b/0x53
> > > [ 322.844572] do_filp_open+0x86/0x110
> > > [ 322.844576] ? kmem_cache_alloc+0x32/0x200
> > > [ 322.844579] ? __alloc_fd+0xb2/0x143
> > > [ 322.844583] do_sys_openat2+0x8d/0x13a
> > > [ 322.844585] ? do_faccessat+0x108/0x227
> > > [ 322.844588] __ia32_compat_sys_openat+0x48/0x63
> > > [ 322.844591] __do_fast_syscall_32+0x5f/0x8e
> > > [ 322.844594] do_fast_syscall_32+0x3d/0x80
> > > [ 322.844597] entry_SYSCALL_compat_after_hwframe+0x45/0x0
> > > [ 322.844600] RIP: 0023:0xf7ee6549
> > > [ 322.844601] RSP: 002b:000000008c8fbcb0 EFLAGS: 00200287
> > > ORIG_RAX: 0000000000000127
> > > [ 322.844604] RAX: ffffffffffffffda RBX: 00000000ffffff9c RCX:
> > > 000000008c8fbffc
> > > [ 322.844606] RDX: 0000000000098800 RSI: 0000000000000000 RDI:
> > > 00000000f7ab3e1c
> > > [ 322.844608] RBP: 0000000000000000 R08: 0000000000000000 R09:
> > > 0000000000000000
> > > [ 322.844609] R10: 0000000000000000 R11: 0000000000000000 R12:
> > > 0000000000000000
> > > [ 322.844611] R13: 0000000000000000 R14: 0000000000000000 R15:
> > > 0000000000000000
> > > [ 322.844614] task:CGenericAsyncFi state:D stack: 0 pid: 1927
> > > ppid: 1540 flags:0x20020000
> > > [ 322.844617] Call Trace:
> > > [ 322.844621] __schedule+0x3db/0x7ee
> > > [ 322.844625] schedule+0x68/0xdc
> > > [ 322.844628] io_schedule+0x46/0x6d
> > > [ 322.844631] wait_on_page_bit_common+0xf8/0x325
> > > [ 322.844634] ?
> > > trace_raw_output_file_check_and_advance_wb_err+0x80/0x7d
> > > [ 322.844638] wait_on_page_bit+0x3f/0x41
> > > [ 322.844642] wait_on_page_writeback.part.0+0x35/0x80
> > > [ 322.844645] wait_on_page_writeback+0x27/0x29
> > > [ 322.844647] __filemap_fdatawait_range+0x88/0xd9
> > > [ 322.844651] filemap_write_and_wait_range+0x45/0x80
> > > [ 322.844655] nfs_wb_all+0x30/0x13d
> > > [ 322.844658] nfs4_file_flush+0x85/0xc0
> > > [ 322.844661] filp_close+0x3e/0x78
> > > [ 322.844664] __close_fd+0x23/0x30
> > > [ 322.844667] __ia32_sys_close+0x22/0x50
> > > [ 322.844670] __do_fast_syscall_32+0x5f/0x8e
> > > [ 322.844673] do_fast_syscall_32+0x3d/0x80
> > > [ 322.844676] entry_SYSCALL_compat_after_hwframe+0x45/0x0
> > > [ 322.844678] RIP: 0023:0xf7ee6549
> > > [ 322.844680] RSP: 002b:000000008c5fddb0 EFLAGS: 00200286
> > > ORIG_RAX: 0000000000000006
> > > [ 322.844682] RAX: ffffffffffffffda RBX: 000000000000004d RCX:
> > > 0000000000000002
> > > [ 322.844684] RDX: 0000000000000000 RSI: 00000000f7d02e70 RDI:
> > > 000000000000004d
> > > [ 322.844686] RBP: 000000008c5fe110 R08: 0000000000000000 R09:
> > > 0000000000000000
> > > [ 322.844688] R10: 0000000000000000 R11: 0000000000000000 R12:
> > > 0000000000000000
> > > [ 322.844689] R13: 0000000000000000 R14: 0000000000000000 R15:
> > > 0000000000000000
> > > [ 322.844699] task:ThreadPoolForeg state:D stack: 0 pid: 1768
> > > ppid: 1735 flags:0x00004000
> > > [ 322.844702] Call Trace:
> > > [ 322.844705] __schedule+0x3db/0x7ee
> > > [ 322.844709] ?
> > > trace_event_raw_event_rpc_xdr_overflow+0x280/0x27e
> > > [ 322.844712] schedule+0x68/0xdc
> > > [ 322.844716] rpc_wait_bit_killable+0x11/0x52
> > > [ 322.844719] __wait_on_bit+0x31/0x90
> > > [ 322.844722] ? rpc_sleep_on_timeout+0x64/0x9c
> > > [ 322.844725] ? rpc_check_timeout+0x1e0/0x1e0
> > > [ 322.844728] ? call_connect_status+0x180/0x17a
> > > [ 322.844731] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 322.844734] ? var_wake_function+0x30/0x27
> > > [ 322.844737] __rpc_execute+0xe5/0x485
> > > [ 322.844740] ? nfs_check_verifier+0x5b/0xa4
> > > [ 322.844743] rpc_execute+0xa5/0xaa
> > > [ 322.844746] rpc_run_task+0x148/0x187
> > > [ 322.844750] nfs4_do_call_sync+0x61/0x81
> > > [ 322.844753] _nfs4_proc_getattr+0x10a/0x120
> > > [ 322.844758] nfs4_proc_getattr+0x67/0xee
> > > [ 322.844762] __nfs_revalidate_inode+0xb7/0x2df
> > > [ 322.844766] nfs_getattr+0x159/0x374
> > > [ 322.844769] vfs_getattr_nosec+0x96/0xa4
> > > [ 322.844771] vfs_statx+0x7a/0xe5
> > > [ 322.844774] __do_sys_newstat+0x31/0x4c
> > > [ 322.844778] __x64_sys_newstat+0x16/0x18
> > > [ 322.844781] do_syscall_64+0x32/0x45
> > > [ 322.844783] entry_SYSCALL_64_after_hwframe+0x44/0x0
> > > [ 322.844786] RIP: 0033:0x7fe70ebf19e6
> > > [ 322.844787] RSP: 002b:00007fe6c15b2068 EFLAGS: 00000246
> > > ORIG_RAX: 0000000000000004
> > > [ 322.844790] RAX: ffffffffffffffda RBX: 00000bd4193e1910 RCX:
> > > 00007fe70ebf19e6
> > > [ 322.844792] RDX: 00007fe6c15b2078 RSI: 00007fe6c15b2078 RDI:
> > > 00000bd4193e1910
> > > [ 322.844794] RBP: 00007fe6c15b2160 R08: 0000000000000001 R09:
> > > 00007ffd971ed090
> > > [ 322.844796] R10: 0000000000056764 R11: 0000000000000246 R12:
> > > 00007fe6c15b2078
> > > [ 322.844797] R13: 00000bd4192f6f60 R14: 00007fe6c15b2108 R15:
> > > 00007fe6c15b2120
> > > [ 322.844801] task:ThreadPoolForeg state:D stack: 0 pid: 1769
> > > ppid: 1735 flags:0x00000000
> > > [ 322.844804] Call Trace:
> > > [ 322.844807] __schedule+0x3db/0x7ee
> > > [ 322.844811] ?
> > > trace_event_raw_event_rpc_xdr_overflow+0x280/0x27e
> > > [ 322.844814] schedule+0x68/0xdc
> > > [ 322.844817] rpc_wait_bit_killable+0x11/0x52
> > > [ 322.844820] __wait_on_bit+0x31/0x90
> > > [ 322.844823] ? __rpc_atrun+0x20/0x1a
> > > [ 322.844826] ? __rpc_atrun+0x20/0x1a
> > > [ 322.844829] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 322.844831] ? var_wake_function+0x30/0x27
> > > [ 322.844834] __rpc_execute+0xe5/0x485
> > > [ 322.844837] ? update_cfs_group+0x9a/0x9e
> > > [ 322.844840] rpc_execute+0xa5/0xaa
> > > [ 322.844842] rpc_run_task+0x148/0x187
> > > [ 322.844845] nfs4_do_call_sync+0x61/0x81
> > > [ 322.844849] _nfs4_proc_getattr+0x10a/0x120
> > > [ 322.844853] nfs4_proc_getattr+0x67/0xee
> > > [ 322.844856] __nfs_revalidate_inode+0xb7/0x2df
> > > [ 322.844859] nfs_file_write+0xfc/0x29f
> > > [ 322.844862] new_sync_write+0xfb/0x16b
> > > [ 322.844866] vfs_write+0x1ee/0x29a
> > > [ 322.844868] ? __fget_light+0x31/0x72
> > > [ 322.844871] ksys_write+0x61/0xd0
> > > [ 322.844874] __x64_sys_write+0x1a/0x1c
> > > [ 322.844877] do_syscall_64+0x32/0x45
> > > [ 322.844880] entry_SYSCALL_64_after_hwframe+0x44/0x0
> > > [ 322.844881] RIP: 0033:0x7fe71859ab7f
> > > [ 322.844883] RSP: 002b:00007fe6bd5b0d30 EFLAGS: 00000293
> > > ORIG_RAX: 0000000000000001
> > > [ 322.844885] RAX: ffffffffffffffda RBX: 00007fe6bd5b0d68 RCX:
> > > 00007fe71859ab7f
> > > [ 322.844887] RDX: 0000000000000007 RSI: 00007fe6bd5b1090 RDI:
> > > 000000000000005b
> > > [ 322.844888] RBP: 00007fe6bd5b0de0 R08: 0000000000000000 R09:
> > > 00007ffd971ed090
> > > [ 322.844890] R10: 0000000000065fd2 R11: 0000000000000293 R12:
> > > 00000bd418d95020
> > > [ 322.844891] R13: 00007fe6bd5b1090 R14: 0000000000000007 R15:
> > > 0000000000000000
> > > [ 322.844900] task:ThreadPoolForeg state:D stack: 0 pid: 1807
> > > ppid: 1773 flags:0x00000000
> > > [ 322.844902] Call Trace:
> > > [ 322.844904] __schedule+0x3db/0x7ee
> > > [ 322.844907] ?
> > > trace_event_raw_event_rpc_xdr_overflow+0x280/0x27e
> > > [ 322.844908] schedule+0x68/0xdc
> > > [ 322.844911] rpc_wait_bit_killable+0x11/0x52
> > > [ 322.844913] __wait_on_bit+0x31/0x90
> > > [ 322.844915] ? __rpc_atrun+0x20/0x1a
> > > [ 322.844916] ? __rpc_atrun+0x20/0x1a
> > > [ 322.844918] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 322.844920] ? var_wake_function+0x30/0x27
> > > [ 322.844922] __rpc_execute+0xe5/0x485
> > > [ 322.844924] rpc_execute+0xa5/0xaa
> > > [ 322.844926] rpc_run_task+0x148/0x187
> > > [ 322.844928] nfs4_do_call_sync+0x61/0x81
> > > [ 322.844930] _nfs4_proc_getattr+0x10a/0x120
> > > [ 322.844933] nfs4_proc_getattr+0x67/0xee
> > > [ 322.844936] __nfs_revalidate_inode+0xb7/0x2df
> > > [ 322.844938] nfs_access_get_cached+0x14d/0x1df
> > > [ 322.844940] nfs_do_access+0x48/0x28e
> > > [ 322.844942] ? nfs4_lookup_revalidate+0x38/0x8d
> > > [ 322.844944] nfs_permission+0xa2/0x1b8
> > > [ 322.844946] inode_permission+0x96/0xca
> > > [ 322.844948] ? unlazy_walk+0x4c/0x80
> > > [ 322.844950] link_path_walk.part.0+0x2a5/0x320
> > > [ 322.844952] ? path_init+0x2c2/0x3e5
> > > [ 322.844954] path_lookupat+0x3f/0x1b1
> > > [ 322.844956] filename_lookup+0x97/0x17f
> > > [ 322.844959] ? do_unlk+0x9a/0xca
> > > [ 322.844960] ? kmem_cache_alloc+0x32/0x200
> > > [ 322.844963] user_path_at_empty+0x59/0x8a
> > > [ 322.844965] vfs_statx+0x64/0xe5
> > > [ 322.844967] __do_sys_newstat+0x31/0x4c
> > > [ 322.844968] ? do_fcntl+0x113/0x582
> > > [ 322.844971] ? fput+0x13/0x15
> > > [ 322.844972] ? __x64_sys_fcntl+0x81/0xa8
> > > [ 322.844974] __x64_sys_newstat+0x16/0x18
> > > [ 322.844976] do_syscall_64+0x32/0x45
> > > [ 322.844977] entry_SYSCALL_64_after_hwframe+0x44/0x0
> > > [ 322.844979] RIP: 0033:0x7f0d1fd0c9e6
> > > [ 322.844980] RSP: 002b:00007f0d08e12e18 EFLAGS: 00000246
> > > ORIG_RAX: 0000000000000004
> > > [ 322.844981] RAX: ffffffffffffffda RBX: 00007f0d08e12edc RCX:
> > > 00007f0d1fd0c9e6
> > > [ 322.844983] RDX: 00007f0d08e12e28 RSI: 00007f0d08e12e28 RDI:
> > > 00001605c2ff1169
> > > [ 322.844984] RBP: 00007f0d08e12ec0 R08: 0000000000000001 R09:
> > > 00007fffaddd9090
> > > [ 322.844985] R10: 00001605c3133480 R11: 0000000000000246 R12:
> > > 0000000000000000
> > > [ 322.844986] R13: 0000000000000000 R14: 00001605c2ff0f00 R15:
> > > 0000000000000000
> > > [ 322.844988] task:ThreadPoolForeg state:D stack: 0 pid: 2031
> > > ppid: 1773 flags:0x00004000
> > > [ 322.844990] Call Trace:
> > > [ 322.844992] __schedule+0x3db/0x7ee
> > > [ 322.844995] ?
> > > trace_event_raw_event_rpc_xdr_overflow+0x280/0x27e
> > > [ 322.844997] schedule+0x68/0xdc
> > > [ 322.844999] rpc_wait_bit_killable+0x11/0x52
> > > [ 322.845001] __wait_on_bit+0x31/0x90
> > > [ 322.845003] ? rpc_sleep_on_timeout+0x64/0x9c
> > > [ 322.845005] ? rpc_check_timeout+0x1e0/0x1e0
> > > [ 322.845006] ? call_connect_status+0x180/0x17a
> > > [ 322.845009] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 322.845010] ? var_wake_function+0x30/0x27
> > > [ 322.845012] __rpc_execute+0xe5/0x485
> > > [ 322.845014] rpc_execute+0xa5/0xaa
> > > [ 322.845016] rpc_run_task+0x148/0x187
> > > [ 322.845018] nfs4_do_call_sync+0x61/0x81
> > > [ 322.845020] _nfs4_proc_getattr+0x10a/0x120
> > > [ 322.845023] nfs4_proc_getattr+0x67/0xee
> > > [ 322.845026] __nfs_revalidate_inode+0xb7/0x2df
> > > [ 322.845028] nfs_access_get_cached+0x14d/0x1df
> > > [ 322.845030] nfs_do_access+0x48/0x28e
> > > [ 322.845032] ? nfs4_lookup_revalidate+0x38/0x8d
> > > [ 322.845034] nfs_permission+0xa2/0x1b8
> > > [ 322.845036] inode_permission+0x96/0xca
> > > [ 322.845038] ? unlazy_walk+0x4c/0x80
> > > [ 322.845040] link_path_walk.part.0+0x2a5/0x320
> > > [ 322.845042] ? path_init+0x2c2/0x3e5
> > > [ 322.845044] path_lookupat+0x3f/0x1b1
> > > [ 322.845046] filename_lookup+0x97/0x17f
> > > [ 322.845049] ? kmem_cache_alloc+0x32/0x200
> > > [ 322.845051] user_path_at_empty+0x59/0x8a
> > > [ 322.845054] ? mntput_no_expire+0x4f/0x264
> > > [ 322.845055] vfs_statx+0x64/0xe5
> > > [ 322.845057] ? putname+0x4b/0x53
> > > [ 322.845059] __do_sys_newstat+0x31/0x4c
> > > [ 322.845062] ? __x64_sys_futex+0x13a/0x159
> > > [ 322.845064] ? switch_fpu_return+0x58/0xf2
> > > [ 322.845066] __x64_sys_newstat+0x16/0x18
> > > [ 322.845068] do_syscall_64+0x32/0x45
> > > [ 322.845069] entry_SYSCALL_64_after_hwframe+0x44/0x0
> > > [ 322.845071] RIP: 0033:0x7f0d1fd0c9e6
> > > [ 322.845072] RSP: 002b:00007f0d00331108 EFLAGS: 00000246
> > > ORIG_RAX: 0000000000000004
> > > [ 322.845073] RAX: ffffffffffffffda RBX: 00001605c2fdc0a0 RCX:
> > > 00007f0d1fd0c9e6
> > > [ 322.845074] RDX: 00007f0d00331118 RSI: 00007f0d00331118 RDI:
> > > 00001605c2fdc0a0
> > > [ 322.845076] RBP: 00007f0d00331200 R08: 0000000000000001 R09:
> > > 00007fffaddd9090
> > > [ 322.845077] R10: 000000000005771c R11: 0000000000000246 R12:
> > > 00007f0d00331118
> > > [ 322.845078] R13: 00001605c3309c40 R14: 00007f0d003311a8 R15:
> > > 00007f0d003311c0
> > > [ 322.845087] task:192.168.16.8-ma state:D stack: 0 pid: 2044
> > > ppid: 2 flags:0x00004000
> > > [ 322.845089] Call Trace:
> > > [ 322.845091] __schedule+0x3db/0x7ee
> > > [ 322.845094] ?
> > > trace_event_raw_event_rpc_xdr_overflow+0x280/0x27e
> > > [ 322.845095] schedule+0x68/0xdc
> > > [ 322.845098] rpc_wait_bit_killable+0x11/0x52
> > > [ 322.845100] __wait_on_bit+0x31/0x90
> > > [ 322.845102] ? __rpc_atrun+0x20/0x1a
> > > [ 322.845103] ? __rpc_atrun+0x20/0x1a
> > > [ 322.845105] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 322.845107] ? var_wake_function+0x30/0x27
> > > [ 322.845109] __rpc_execute+0xe5/0x485
> > > [ 322.845111] rpc_execute+0xa5/0xaa
> > > [ 322.845113] rpc_run_task+0x148/0x187
> > > [ 322.845115] nfs4_do_call_sync+0x61/0x81
> > > [ 322.845117] _nfs4_proc_statfs+0xc1/0xcf
> > > [ 322.845120] nfs4_proc_statfs+0x55/0x74
> > > [ 322.845122] nfs_statfs+0x65/0x152
> > > [ 322.845124] ? _nfs4_proc_delegreturn+0x21e/0x2e5
> > > [ 322.845127] statfs_by_dentry+0x55/0x74
> > > [ 322.845129] vfs_statfs+0x24/0x43
> > > [ 322.845132] check_free_space+0x3c/0xe5
> > > [ 322.845135] do_acct_process+0x5d/0x5e1
> > > [ 322.845136] ? nfs_end_delegation_return+0x1cb/0x3aa
> > > [ 322.845138] ? complete+0x49/0x57
> > > [ 322.845141] acct_process+0xe5/0x105
> > > [ 322.845143] do_exit+0x781/0xa1b
> > > [ 322.845145] ? refcount_dec_and_lock+0x22/0x8c
> > > [ 322.845148] __module_put_and_exit+0x1b/0x1b
> > > [ 322.845150] nfs4_run_state_manager+0x293/0x933
> > > [ 322.845152] ? nfs4_do_reclaim+0x820/0x81c
> > > [ 322.845155] kthread+0x142/0x15f
> > > [ 322.845157] ? __kthread_create_worker+0x130/0x122
> > > [ 322.845159] ret_from_fork+0x1f/0x2a
> > > [ 369.859385] INFO: task tee:1176 blocked for more than 122
> > > seconds.
> > > [ 369.859391] Not tainted 5.10.4-00021-gde017294a539-dirty
> > > #3
> > > [ 369.859393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [ 369.859395] task:tee state:D stack: 0 pid: 1176
> > > ppid: 1173 flags:0x00000000
> > > [ 369.859399] Call Trace:
> > > [ 369.859407] __schedule+0x3db/0x7ee
> > > [ 369.859410] ? io_schedule+0x70/0x6d
> > > [ 369.859412] schedule+0x68/0xdc
> > > [ 369.859415] io_schedule+0x46/0x6d
> > > [ 369.859417] bit_wait_io+0x11/0x52
> > > [ 369.859420] __wait_on_bit+0x31/0x90
> > > [ 369.859422] out_of_line_wait_on_bit+0x7e/0x80
> > > [ 369.859426] ? var_wake_function+0x30/0x27
> > > [ 369.859429] nfs_wait_on_request.part.0+0x31/0x33
> > > [ 369.859431] nfs_page_group_lock_head+0x31/0x78
> > > [ 369.859433] nfs_lock_and_join_requests+0x9d/0x26e
> > > [ 369.859436] nfs_updatepage+0x1be/0x985
> > > [ 369.859439] nfs_write_end+0x128/0x52a
> > > [ 369.859443] ? iov_iter_copy_from_user_atomic+0xe8/0x3ce
> > > [ 369.859446] generic_perform_write+0xed/0x17c
> > > [ 369.859449] nfs_file_write+0x14a/0x29f
> > > [ 369.859452] new_sync_write+0xfb/0x16b
> > > [ 369.859455] vfs_write+0x1ee/0x29a
> > > [ 369.859457] ? vfs_read+0x17b/0x198
> > > [ 369.859459] ksys_write+0x61/0xd0
> > > [ 369.859462] __x64_sys_write+0x1a/0x1c
> > > [ 369.859464] do_syscall_64+0x32/0x45
> > > [ 369.859467] entry_SYSCALL_64_after_hwframe+0x44/0x0
> > > [ 369.859469] RIP: 0033:0x7f4781f323a3
> > > [ 369.859471] RSP: 002b:00007ffcf8220408 EFLAGS: 00000246
> > > ORIG_RAX: 0000000000000001
> > > [ 369.859474] RAX: ffffffffffffffda RBX: 0000000000000092 RCX:
> > > 00007f4781f323a3
> > > [ 369.859475] RDX: 0000000000000092 RSI: 00007ffcf8220530 RDI:
> > > 0000000000000003
> > > [ 369.859477] RBP: 00007ffcf8220530 R08: 0000000000000000 R09:
> > > 0000000000000001
> > > [ 369.859478] R10: 0000000000400ad2 R11: 0000000000000246 R12:
> > > 0000000000000092
> > > [ 369.859480] R13: 0000000002361580 R14: 0000000000000092 R15:
> > > 00007f4782201720
> > > [ 369.859495] INFO: task CGenericAsyncFi:1927 blocked for more
> > > than 122 seconds.
> > > [ 369.859497] Not tainted 5.10.4-00021-gde017294a539-dirty
> > > #3
> > > [ 369.859498] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [ 369.859499] task:CGenericAsyncFi state:D stack: 0 pid: 1927
> > > ppid: 1540 flags:0x20020000
> > > [ 369.859501] Call Trace:
> > > [ 369.859504] __schedule+0x3db/0x7ee
> > > [ 369.859506] schedule+0x68/0xdc
> > > [ 369.859508] io_schedule+0x46/0x6d
> > > [ 369.859510] wait_on_page_bit_common+0xf8/0x325
> > > [ 369.859512] ?
> > > trace_raw_output_file_check_and_advance_wb_err+0x80/0x7d
> > > [ 369.859515] wait_on_page_bit+0x3f/0x41
> > > [ 369.859517] wait_on_page_writeback.part.0+0x35/0x80
> > > [ 369.859519] wait_on_page_writeback+0x27/0x29
> > > [ 369.859521] __filemap_fdatawait_range+0x88/0xd9
> > > [ 369.859523] filemap_write_and_wait_range+0x45/0x80
> > > [ 369.859525] nfs_wb_all+0x30/0x13d
> > > [ 369.859528] nfs4_file_flush+0x85/0xc0
> > > [ 369.859530] filp_close+0x3e/0x78
> > > [ 369.859532] __close_fd+0x23/0x30
> > > [ 369.859534] __ia32_sys_close+0x22/0x50
> > > [ 369.859536] __do_fast_syscall_32+0x5f/0x8e
> > > [ 369.859538] do_fast_syscall_32+0x3d/0x80
> > > [ 369.859540] entry_SYSCALL_compat_after_hwframe+0x45/0x0
> > > [ 369.859541] RIP: 0023:0xf7ee6549
> > > [ 369.859543] RSP: 002b:000000008c5fddb0 EFLAGS: 00200286
> > > ORIG_RAX: 0000000000000006
> > > [ 369.859545] RAX: ffffffffffffffda RBX: 000000000000004d RCX:
> > > 0000000000000002
> > > [ 369.859546] RDX: 0000000000000000 RSI: 00000000f7d02e70 RDI:
> > > 000000000000004d
> > > [ 369.859547] RBP: 000000008c5fe110 R08: 0000000000000000 R09:
> > > 0000000000000000
> > > [ 369.859548] R10: 0000000000000000 R11: 0000000000000000 R12:
> > > 0000000000000000
> > > [ 369.859549] R13: 0000000000000000 R14: 0000000000000000 R15:
> > > 0000000000000000
> > > [ 394.423093] nfs: server loom not responding, still trying
> > >
> > > --
> > > NULL && (void)
>
> --
> Trond Myklebust
> CTO, Hammerspace Inc
> 4984 El Camino Real, Suite 208
> Los Altos, CA 94022
>
> www.hammer.space
>
next prev parent reply other threads:[~2021-02-24 2:02 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-03 14:27 steam-associated reproducible hard NFSv4.2 client hang (5.9, 5.10) Nick Alcock
2021-02-23 22:57 ` J. Bruce Fields
2021-02-23 23:58 ` Trond Myklebust
2021-02-24 2:01 ` bfields [this message]
2021-04-01 13:33 ` Nix
2021-04-01 13:44 ` bfields
2021-04-01 21:52 ` Nix
2021-04-02 19:20 ` bfields
2021-04-03 22:41 ` Nix
2021-04-05 11:48 ` Nix
2021-04-05 16:52 ` Nix
-- strict thread matches above, loose matches on Subject: below --
2021-02-13 15:21 Nick Alcock
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210224020140.GA26848@fieldses.org \
--to=bfields@fieldses.org \
--cc=linux-nfs@vger.kernel.org \
--cc=nix@esperi.org.uk \
--cc=trondmy@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox