* nfs client deadloop on 6.6.53
@ 2024-10-08 13:27 Wang Yugui
2024-10-08 14:47 ` Trond Myklebust
0 siblings, 1 reply; 5+ messages in thread
From: Wang Yugui @ 2024-10-08 13:27 UTC (permalink / raw)
To: linux-nfs
Hi,
The NFS client hangs in a dead loop on kernel 6.6.53. Blocked-task output from sysrq:
[ 9409.381322] sysrq: Show Blocked State
[ 9409.386146] task:bash state:D stack:0 pid:2323 ppid:2226 flags:0x00004002
[ 9409.395225] Call Trace:
[ 9409.398376] <TASK>
[ 9409.401172] __schedule+0x232/0x5d0
[ 9409.405370] schedule+0x5e/0xd0
[ 9409.409217] schedule_timeout+0x8c/0x170
[ 9409.413837] ? __pfx_process_timeout+0x10/0x10
[ 9409.418989] msleep+0x3b/0x50
[ 9409.422656] ff_layout_pg_init_read+0x1c1/0x290 [nfs_layout_flexfiles]
[ 9409.429910] __nfs_pageio_add_request+0x29b/0x480 [nfs]
[ 9409.435911] nfs_pageio_add_request+0x221/0x2a0 [nfs]
[ 9409.441715] nfs_read_add_folio+0x1a3/0x280 [nfs]
[ 9409.447175] nfs_readahead+0x235/0x2d0 [nfs]
[ 9409.452193] read_pages+0x56/0x2c0
[ 9409.456298] page_cache_ra_unbounded+0x134/0x1a0
[ 9409.461626] filemap_get_pages+0xf5/0x3a0
[ 9409.466355] ? __nfs_lookup_revalidate+0x53/0x140 [nfs]
[ 9409.472325] filemap_read+0xdc/0x350
[ 9409.476614] ? find_idlest_group+0x113/0x530
[ 9409.481614] nfs_file_read+0x74/0xc0 [nfs]
[ 9409.486461] __kernel_read+0xff/0x2b0
[ 9409.490838] search_binary_handler+0x70/0x250
[ 9409.495908] exec_binprm+0x50/0x1a0
[ 9409.500102] bprm_execve.part.0+0x17d/0x230
[ 9409.504993] do_execveat_common.isra.0+0x1a2/0x240
[ 9409.510489] __x64_sys_execve+0x37/0x50
[ 9409.515026] do_syscall_64+0x5a/0x90
[ 9409.519298] ? __count_memcg_events+0x4c/0xa0
[ 9409.524348] ? mm_account_fault+0x6c/0x100
[ 9409.529129] ? handle_mm_fault+0x154/0x280
[ 9409.533903] ? do_user_addr_fault+0x35f/0x680
[ 9409.538935] ? exc_page_fault+0x69/0x150
[ 9409.543537] entry_SYSCALL_64_after_hwframe+0x78/0xe2
[ 9409.549277] RIP: 0033:0x7f57378d987b
[ 9409.553572] RSP: 002b:00007ffdb5978708 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
[ 9409.561847] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f57378d987b
[ 9409.569690] RDX: 000055d26e403600 RSI: 000055d26e5cdc50 RDI: 000055d26e6ce7f0
[ 9409.577534] RBP: 000055d26e6ce7f0 R08: 000055d26e5a5b60 R09: 0000000000000000
[ 9409.585375] R10: 0000000000000008 R11: 0000000000000246 R12: 00000000ffffffff
[ 9409.593208] R13: 000055d26e5cdc50 R14: 000055d26e403600 R15: 000055d26e6ceb40
[ 9409.601047] </TASK>
[ 9409.603946] task:bash state:D stack:0 pid:2550 ppid:2462 flags:0x00004002
[ 9409.613027] Call Trace:
[ 9409.616185] <TASK>
[ 9409.618983] __schedule+0x232/0x5d0
[ 9409.623186] schedule+0x5e/0xd0
[ 9409.627033] io_schedule+0x46/0x70
[ 9409.631140] folio_wait_bit_common+0x133/0x390
[ 9409.636294] ? folio_wait_bit_common+0x100/0x390
[ 9409.641624] ? nfs4_do_open+0xcd/0x210 [nfsv4]
[ 9409.646854] ? __pfx_wake_page_function+0x10/0x10
[ 9409.652268] filemap_update_page+0x2bc/0x300
[ 9409.657242] filemap_get_pages+0x21d/0x3a0
[ 9409.662042] ? __nfs_lookup_revalidate+0x53/0x140 [nfs]
[ 9409.668010] filemap_read+0xdc/0x350
[ 9409.672299] nfs_file_read+0x74/0xc0 [nfs]
[ 9409.677126] __kernel_read+0xff/0x2b0
[ 9409.681476] search_binary_handler+0x70/0x250
[ 9409.686526] exec_binprm+0x50/0x1a0
[ 9409.690702] bprm_execve.part.0+0x17d/0x230
[ 9409.695573] do_execveat_common.isra.0+0x1a2/0x240
[ 9409.701047] __x64_sys_execve+0x37/0x50
[ 9409.705559] do_syscall_64+0x5a/0x90
[ 9409.709805] ? do_user_addr_fault+0x35f/0x680
[ 9409.714834] ? exc_page_fault+0x69/0x150
[ 9409.719414] entry_SYSCALL_64_after_hwframe+0x78/0xe2
[ 9409.725126] RIP: 0033:0x7f3c492d987b
[ 9409.729362] RSP: 002b:00007ffc6413a458 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
[ 9409.737609] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f3c492d987b
[ 9409.745429] RDX: 000055c6a8f07600 RSI: 000055c6a90e72a0 RDI: 000055c6a90f7890
[ 9409.753256] RBP: 000055c6a90f7890 R08: 000055c6a90f6250 R09: 0000000000000000
[ 9409.761078] R10: 0000000000000008 R11: 0000000000000246 R12: 00000000ffffffff
[ 9409.768904] R13: 000055c6a90e72a0 R14: 000055c6a8f07600 R15: 000055c6a90e1ea0
[ 9409.776732] </TASK>
Notes:
1. NFS server: kernel 6.6.54.
   The pnfs option is set in the server-side /etc/exports.
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2024/10/08
* Re: nfs client deadloop on 6.6.53
2024-10-08 13:27 nfs client deadloop on 6.6.53 Wang Yugui
@ 2024-10-08 14:47 ` Trond Myklebust
2024-10-13 22:10 ` Wang Yugui
0 siblings, 1 reply; 5+ messages in thread
From: Trond Myklebust @ 2024-10-08 14:47 UTC (permalink / raw)
To: linux-nfs@vger.kernel.org, wangyugui@e16-tech.com
On Tue, 2024-10-08 at 21:27 +0800, Wang Yugui wrote:
> Hi,
>
> nfs client deadloop on 6.6.53.
>
> Notice:
> 1, nfs server: kernel 6.6.54
> pnfs optin in the service side /etc/exports.
>
This is not a client bug.
The client has no choice other than to retry here. It is being given a
layout that it cannot use (probably because it has already discovered
that it cannot talk to the data server), but it is also being told by
the same layout that it is not allowed to fall back to doing I/O
through the metadata server.
IOW: This bug needs to be fixed on the server, which is handing out a
layout that is impossible to satisfy.
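
To illustrate the reasoning above, here is a minimal sketch of the decision the client faces on each read attempt. It is a toy model, not the kernel's code: the names `Action`, `choose_read_path`, and `attempts_until_io` are hypothetical, and the `no_io_thru_mds` argument is assumed to correspond to the flexfiles layout flag `FF_FLAGS_NO_IO_THRU_MDS` described in RFC 8435.

```python
from enum import Enum


class Action(Enum):
    USE_DATA_SERVER = "send the read to the data server"
    FALL_BACK_TO_MDS = "send the read through the metadata server"
    SLEEP_AND_RETRY = "drop the layout, sleep, and ask again"


def choose_read_path(ds_reachable: bool, no_io_thru_mds: bool) -> Action:
    """Pick the next step for one read attempt (hypothetical model)."""
    if ds_reachable:
        return Action.USE_DATA_SERVER
    if not no_io_thru_mds:
        # The layout permits falling back to the metadata server.
        return Action.FALL_BACK_TO_MDS
    # Unusable data server and no permitted fallback: only a retry is left.
    return Action.SLEEP_AND_RETRY


def attempts_until_io(ds_reachable: bool, no_io_thru_mds: bool,
                      max_attempts: int = 5):
    """Return the attempt on which I/O is issued, or None if it never is."""
    for attempt in range(1, max_attempts + 1):
        action = choose_read_path(ds_reachable, no_io_thru_mds)
        if action is not Action.SLEEP_AND_RETRY:
            return attempt
    return None
```

In this model, the combination of an unreachable data server and a layout that forbids I/O through the metadata server never issues any I/O, however many attempts are allowed. That would be consistent with the first trace, where the task is sleeping inside `ff_layout_pg_init_read`.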
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: nfs client deadloop on 6.6.53
2024-10-08 14:47 ` Trond Myklebust
@ 2024-10-13 22:10 ` Wang Yugui
2024-10-14 2:58 ` Trond Myklebust
0 siblings, 1 reply; 5+ messages in thread
From: Wang Yugui @ 2024-10-13 22:10 UTC (permalink / raw)
To: Trond Myklebust; +Cc: linux-nfs@vger.kernel.org
Hi,
> On Tue, 2024-10-08 at 21:27 +0800, Wang Yugui wrote:
> > Hi,
> >
> > nfs client deadloop on 6.6.53.
> >
> > Notice:
> > 1, nfs server: kernel 6.6.54
> > pnfs optin in the service side /etc/exports.
> >
>
> This is not a client bug.
>
> The client has no choice other than to retry here. It is being given a
> layout that it cannot use (probably because it has already discovered
> that it cannot talk to the data server), but it is also being told by
> the same layout that it is not allowed to fall back to doing I/O
> through the metadata server.
>
> IOW: This bug needs to be fixed on the server, which is handing out a
> layout that is impossible to satisfy.
It seems that pNFS needs NFSv3 over UDP,
but NFSv3/UDP is disabled on this server.
Thanks a lot for the reply.
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2024/10/14
* Re: nfs client deadloop on 6.6.53
2024-10-13 22:10 ` Wang Yugui
@ 2024-10-14 2:58 ` Trond Myklebust
2024-10-14 12:22 ` Wang Yugui
0 siblings, 1 reply; 5+ messages in thread
From: Trond Myklebust @ 2024-10-14 2:58 UTC (permalink / raw)
To: wangyugui@e16-tech.com; +Cc: linux-nfs@vger.kernel.org
On Mon, 2024-10-14 at 06:10 +0800, Wang Yugui wrote:
> Hi,
>
> > On Tue, 2024-10-08 at 21:27 +0800, Wang Yugui wrote:
> > > Hi,
> > >
> > > nfs client deadloop on 6.6.53.
> > >
> > > Notice:
> > > 1, nfs server: kernel 6.6.54
> > > pnfs optin in the service side /etc/exports.
> > >
> >
> > This is not a client bug.
> >
> > The client has no choice other than to retry here. It is being
> > given a
> > layout that it cannot use (probably because it has already
> > discovered
> > that it cannot talk to the data server), but it is also being told
> > by
> > the same layout that it is not allowed to fall back to doing I/O
> > through the metadata server.
> >
> > IOW: This bug needs to be fixed on the server, which is handing out
> > a
> > layout that is impossible to satisfy.
>
> It seems that pnfs need nfs3/udp.
> but the nfs3/udp is disabled on this server.
That is incorrect. There should be no need to enable RPC over UDP.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: nfs client deadloop on 6.6.53
2024-10-14 2:58 ` Trond Myklebust
@ 2024-10-14 12:22 ` Wang Yugui
0 siblings, 0 replies; 5+ messages in thread
From: Wang Yugui @ 2024-10-14 12:22 UTC (permalink / raw)
To: Trond Myklebust; +Cc: linux-nfs@vger.kernel.org
Hi,
> On Mon, 2024-10-14 at 06:10 +0800, Wang Yugui wrote:
> > Hi,
> >
> > > On Tue, 2024-10-08 at 21:27 +0800, Wang Yugui wrote:
> > > > Hi,
> > > >
> > > > nfs client deadloop on 6.6.53.
> > > >
> > > > Notice:
> > > > 1, nfs server: kernel 6.6.54
> > > > pnfs optin in the service side /etc/exports.
> > > >
> > >
> > > This is not a client bug.
> > >
> > > The client has no choice other than to retry here. It is being
> > > given a
> > > layout that it cannot use (probably because it has already
> > > discovered
> > > that it cannot talk to the data server), but it is also being told
> > > by
> > > the same layout that it is not allowed to fall back to doing I/O
> > > through the metadata server.
> > >
> > > IOW: This bug needs to be fixed on the server, which is handing out
> > > a
> > > layout that is impossible to satisfy.
> >
> > It seems that pnfs need nfs3/udp.
> > but the nfs3/udp is disabled on this server.
>
> That is incorrect. There should be no need to enable RPC over UDP.
After making two changes in the [nfsd] section of /etc/nfs.conf, the problem disappeared:
1. vers3=n => vers3=y
2. udp=n => udp=y
In addition, both the NFS server and the NFS client kernels were upgraded to 6.6.56.
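
For anyone trying to reproduce this, a small sketch for reading the two settings back from an nfs.conf-style file may help. It assumes the file parses as INI text, and the helper name `nfsd_settings` and the `SAMPLE` text are hypothetical. Since the kernel upgrade happened at the same time, and the earlier reply says RPC over UDP should not be required, this only shows which values are set, not which change made the problem go away.

```python
import configparser


def nfsd_settings(conf_text: str) -> dict:
    """Return the vers3 and udp values from the [nfsd] section, if present."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("nfsd"):
        return {}
    return {
        key: parser.get("nfsd", key)
        for key in ("vers3", "udp")
        if parser.has_option("nfsd", key)
    }


# Hypothetical sample matching the two changes listed above.
SAMPLE = """\
[nfsd]
vers3=y
udp=y
"""

if __name__ == "__main__":
    print(nfsd_settings(SAMPLE))
```

To check a real system, the text of /etc/nfs.conf could be passed in instead of `SAMPLE`.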
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2024/10/14