* nfsd becomes a zombie @ 2024-06-17 6:55 Harald Dunkel 2024-06-17 14:31 ` Chuck Lever III 0 siblings, 1 reply; 30+ messages in thread From: Harald Dunkel @ 2024-06-17 6:55 UTC (permalink / raw) To: linux-nfs

Hi folks,

what would be the reason for nfsd getting stuck somehow and becoming
an unkillable process? See

- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562
- https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568

Doesn't this mean that something inside the kernel gets stuck as
well? Seems odd to me.

Regards
Harri

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd becomes a zombie 2024-06-17 6:55 nfsd becomes a zombie Harald Dunkel @ 2024-06-17 14:31 ` Chuck Lever III 2024-06-17 19:20 ` Calum Mackay 2024-12-25 9:15 ` nfsd blocks indefinitely in nfsd4_destroy_session (was: Re: nfsd becomes a zombie) Salvatore Bonaccorso 0 siblings, 2 replies; 30+ messages in thread From: Chuck Lever III @ 2024-06-17 14:31 UTC (permalink / raw) To: Harald Dunkel; +Cc: Linux NFS Mailing List

> On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote:
>
> Hi folks,
>
> what would be the reason for nfsd getting stuck somehow and becoming
> an unkillable process? See
>
> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562
> - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568
>
> Doesn't this mean that something inside the kernel gets stuck as
> well? Seems odd to me.

I'm not familiar with the Debian or Ubuntu kernel packages. Can
the kernel release numbers be translated to LTS kernel releases
please? Need both "last known working" and "first broken" releases.

This:

[ 6596.911785] RPC: Could not send backchannel reply error: -110
[ 6596.972490] RPC: Could not send backchannel reply error: -110
[ 6837.281307] RPC: Could not send backchannel reply error: -110

is a known set of client backchannel bugs. Knowing the LTS kernel
releases (see above) will help us figure out what needs to be
backported to the LTS kernels in question.

This:

[11183.290619] wait_for_completion+0x88/0x150
[11183.290623] __flush_workqueue+0x140/0x3e0
[11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
[11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd]

is probably related to the backchannel errors on the client, but
client bugs shouldn't cause the server to hang like this. We
might be able to say more if you can provide the kernel release
translations (see above).

--
Chuck Lever

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd becomes a zombie 2024-06-17 14:31 ` Chuck Lever III @ 2024-06-17 19:20 ` Calum Mackay 2024-06-18 14:29 ` Harald Dunkel 2024-12-25 9:15 ` nfsd blocks indefinitely in nfsd4_destroy_session (was: Re: nfsd becomes a zombie) Salvatore Bonaccorso 1 sibling, 1 reply; 30+ messages in thread From: Calum Mackay @ 2024-06-17 19:20 UTC (permalink / raw) To: Chuck Lever III, Harald Dunkel; +Cc: Calum Mackay, Linux NFS Mailing List

On 17/06/2024 3:31 pm, Chuck Lever III wrote:
>
>> On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote:
>>
>> Hi folks,
>>
>> what would be the reason for nfsd getting stuck somehow and becoming
>> an unkillable process? See
>>
>> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562
>> - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568
>>
>> Doesn't this mean that something inside the kernel gets stuck as
>> well? Seems odd to me.
>
> I'm not familiar with the Debian or Ubuntu kernel packages. Can
> the kernel release numbers be translated to LTS kernel releases

The Debian NFS server's broken kernel corresponds to upstream stable
6.1.90. There doesn't seem to be any note of a working version.

Decoding the Ubuntu pkg version is harder (it is obviously from the
6.8.y series; the actual corresponding 'y' value is printed during
boot, but other than downloading the pkg source, I'm not sure how to
find it otherwise). The older Ubuntu mentioned in the bug as seeing
the nfsd issue will be 6.5.y.

Harald: do you have a Debian/Ubuntu kernel version that doesn't see the
issue, please? i.e. ideally from the same 6.1.y series…

cheers,
c.

> please? Need both "last known working" and "first broken" releases.
>
> This:
>
> [ 6596.911785] RPC: Could not send backchannel reply error: -110
> [ 6596.972490] RPC: Could not send backchannel reply error: -110
> [ 6837.281307] RPC: Could not send backchannel reply error: -110
>
> is a known set of client backchannel bugs. Knowing the LTS kernel
> releases (see above) will help us figure out what needs to be
> backported to the LTS kernels in question.
>
> This:
>
> [11183.290619] wait_for_completion+0x88/0x150
> [11183.290623] __flush_workqueue+0x140/0x3e0
> [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
> [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd]
>
> is probably related to the backchannel errors on the client, but
> client bugs shouldn't cause the server to hang like this. We
> might be able to say more if you can provide the kernel release
> translations (see above).
>
> --
> Chuck Lever

--
Calum Mackay
Linux Kernel Engineering
Oracle Linux and Virtualisation

^ permalink raw reply [flat|nested] 30+ messages in thread
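For anyone following along on a Debian box: the upstream stable release is embedded in the kernel package version string, before the first dash. A minimal sketch of the translation (the version strings below are examples taken from this thread; on a live system the string would come from `dpkg-query`):

```shell
# Debian kernel package versions look like "6.1.90-1": the upstream
# stable release, a dash, then the Debian revision. Strip the revision
# to get the version to report upstream. On a running system, obtain
# the string with:
#   dpkg-query -W -f='${Version}\n' "linux-image-$(uname -r)"
for v in "6.1.90-1" "6.1.76-1" "5.10.197-1"; do
    upstream="${v%%-*}"   # drop everything from the first '-' onward
    echo "$v -> v$upstream"
done
```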
* Re: nfsd becomes a zombie 2024-06-17 19:20 ` Calum Mackay @ 2024-06-18 14:29 ` Harald Dunkel 2024-06-18 14:32 ` Harald Dunkel 2024-06-18 14:52 ` Chuck Lever 0 siblings, 2 replies; 30+ messages in thread From: Harald Dunkel @ 2024-06-18 14:29 UTC (permalink / raw) To: Calum Mackay, Chuck Lever III; +Cc: Linux NFS Mailing List

On 2024-06-17 21:20:54, Calum Mackay wrote:
>
> Harald: do you have a Debian/Ubuntu kernel version that doesn't see the
> issue, please? i.e. ideally from the same 6.1.y series…
>

I had migrated the server from Debian 11 to Debian 12 on Dec 11th 2023.
The kernel version provided by Debian 11 at that time was 5.10.197. I
can't remember having seen a frozen nfsd with this version.

Since Dec 11th I had these kernels in use (all provided by Debian):

6.1.66  linux-image-6.1.0-15-amd64
6.1.69  linux-image-6.1.0-17-amd64
6.1.76  linux-image-6.1.0-18-amd64
6.1.85  linux-image-6.1.0-20-amd64
6.1.90  linux-image-6.1.0-21-amd64

These are all "upstream" version numbers. The second column lists
Debian's package names.

AFAIR nfsd got stuck only on the 6.1.90 kernel (installed and booted
on May 18th, first freeze on May 26th). But I've got a weak memory
in cases like this. I haven't found older kernel log files.

Please remember that I have seen a frozen nfsd only 3 times since
May 26th. It's not easy to reproduce.

Regards
Harri

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd becomes a zombie 2024-06-18 14:29 ` Harald Dunkel @ 2024-06-18 14:32 ` Harald Dunkel 2024-06-18 14:52 ` Chuck Lever 1 sibling, 0 replies; 30+ messages in thread From: Harald Dunkel @ 2024-06-18 14:32 UTC (permalink / raw) To: Calum Mackay, Chuck Lever III; +Cc: Linux NFS Mailing List PS: Do you think the problem could be related to running nfsd inside a container? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd becomes a zombie 2024-06-18 14:29 ` Harald Dunkel 2024-06-18 14:32 ` Harald Dunkel @ 2024-06-18 14:52 ` Chuck Lever 2024-06-19 7:32 ` Harald Dunkel 1 sibling, 1 reply; 30+ messages in thread From: Chuck Lever @ 2024-06-18 14:52 UTC (permalink / raw) To: Harald Dunkel; +Cc: Calum Mackay, Linux NFS Mailing List

On Tue, Jun 18, 2024 at 04:29:55PM +0200, Harald Dunkel wrote:
> On 2024-06-17 21:20:54, Calum Mackay wrote:
> >
> > Harald: do you have a Debian/Ubuntu kernel version that doesn't see the
> > issue, please? i.e. ideally from the same 6.1.y series…
> >
>
> I had migrated the server from Debian 11 to Debian 12 on Dec 11th 2023.
> The kernel version provided by Debian 11 at that time was 5.10.197. I
> can't remember having seen a frozen nfsd with this version.
>
> Since Dec 11th I had these kernels in use (all provided by Debian):
>
> 6.1.66  linux-image-6.1.0-15-amd64
> 6.1.69  linux-image-6.1.0-17-amd64
> 6.1.76  linux-image-6.1.0-18-amd64
> 6.1.85  linux-image-6.1.0-20-amd64
> 6.1.90  linux-image-6.1.0-21-amd64
>
> These are all "upstream" version numbers. The second column lists
> Debian's package names.
>
> AFAIR nfsd got stuck only on the 6.1.90 kernel (installed and booted
> on May 18th, first freeze on May 26th). But I've got a weak memory
> in cases like this. I haven't found older kernel log files.
>
> Please remember that I have seen a frozen nfsd only 3 times since
> May 26th. It's not easy to reproduce.

I can find no NFSD or SUNRPC changes between v6.1.85 and v6.1.90.
Between v6.1.76 and v6.1.85, there are 48 NFSD changes, 8 changes to
fs/nfs/, and 3 changes to net/sunrpc/.

Are there any NFS client changes during the second half of May that
correlate with the NFSD misbehavior?

--
Chuck Lever

^ permalink raw reply [flat|nested] 30+ messages in thread
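Change counts like the ones Chuck quotes can be reproduced against a stable kernel checkout; a sketch, assuming `$LINUX` is an environment variable pointing at a linux-stable git tree with the v6.1.y tags fetched (the fallback message exists only so the snippet runs cleanly without one):

```shell
# Enumerate NFSD/SUNRPC changes between two stable points.
LINUX="${LINUX:-/nonexistent}"
if git -C "$LINUX" rev-parse v6.1.85 >/dev/null 2>&1; then
    # no NFSD/SUNRPC commits expected in this range, per the thread
    git log -C "$LINUX" --oneline v6.1.85..v6.1.90 -- fs/nfsd/ net/sunrpc/
    # 48 NFSD commits expected in this one
    git -C "$LINUX" log --oneline v6.1.76..v6.1.85 -- fs/nfsd/ | wc -l
else
    msg="set LINUX to a linux-stable checkout with the v6.1.y tags"
    echo "$msg"
fi
```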
* Re: nfsd becomes a zombie 2024-06-18 14:52 ` Chuck Lever @ 2024-06-19 7:32 ` Harald Dunkel 2024-06-19 7:56 ` Harald Dunkel 2024-06-19 13:14 ` Chuck Lever III 0 siblings, 2 replies; 30+ messages in thread From: Harald Dunkel @ 2024-06-19 7:32 UTC (permalink / raw) To: Chuck Lever; +Cc: Calum Mackay, Linux NFS Mailing List

On 2024-06-18 16:52:00, Chuck Lever wrote:
>
> I can find no NFSD or SUNRPC changes between v6.1.85 and v6.1.90.
> Between v6.1.76 and v6.1.85, there are 48 NFSD changes, 8 changes to
> fs/nfs/, and 3 changes to net/sunrpc/.
>
> Are there any NFS client changes during the second half of May that
> correlate with the NFSD misbehavior?
>

There was a big change: Most clients (>100) were upgraded from Debian
11 to Debian 12 within the last couple of months. A big chunk of
about 40 hosts have been upgraded on May 21st. This included the
kernel upgrade to 6.1.xx, and the upgrade of the "nfs-common" package
from version 1.3.4 to 2.6.2.

The current nfs-common package provides these tools:

/sbin/mount.nfs
/sbin/mount.nfs4
/sbin/rpc.statd
/sbin/showmount
/sbin/sm-notify
/sbin/umount.nfs
/sbin/umount.nfs4
/usr/sbin/blkmapd
/usr/sbin/mountstats
/usr/sbin/nfsconf
/usr/sbin/nfsidmap
/usr/sbin/nfsiostat
/usr/sbin/nfsstat
/usr/sbin/rpc.gssd
/usr/sbin/rpc.idmapd
/usr/sbin/rpc.svcgssd
/usr/sbin/rpcctl
/usr/sbin/rpcdebug
/usr/sbin/start-statd

I have 3 clients with more recent kernel versions 5.10.209, 5.10.218
and 6.7.12. The 5.10.x hosts are running Debian 12 and this kernel
for several months, but the host with 6.7.12 is running this kernel
since May 16th.

These "backports" kernels were installed for testing purposes. Would
you recommend downgrading these kernels to 6.1.90?

As asked before, do you think the problem could be related to running
nfsd inside an LXC container?

Regards
Harri

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd becomes a zombie 2024-06-19 7:32 ` Harald Dunkel @ 2024-06-19 7:56 ` Harald Dunkel 2024-06-19 13:14 ` Chuck Lever III 1 sibling, 0 replies; 30+ messages in thread From: Harald Dunkel @ 2024-06-19 7:56 UTC (permalink / raw) To: Chuck Lever; +Cc: Calum Mackay, Linux NFS Mailing List On 2024-06-19 09:32:23, Harald Dunkel wrote: > > I have 3 clients with more recent kernel versions 5.10.209, 5.10.218 > and 6.7.12. The 5.10.x hosts are running Debian 12 and this kernel 5.10.x kernel means Debian *11*, of course. > for several months, but the host with 6.7.12 is running this kernel > since May 16th. > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd becomes a zombie 2024-06-19 7:32 ` Harald Dunkel 2024-06-19 7:56 ` Harald Dunkel @ 2024-06-19 13:14 ` Chuck Lever III 2024-06-20 5:29 ` Harald Dunkel 2024-07-02 17:25 ` Harald Dunkel 1 sibling, 2 replies; 30+ messages in thread From: Chuck Lever III @ 2024-06-19 13:14 UTC (permalink / raw) To: Harald Dunkel; +Cc: Calum Mackay, Linux NFS Mailing List

> On Jun 19, 2024, at 3:32 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote:
>
> On 2024-06-18 16:52:00, Chuck Lever wrote:
>> I can find no NFSD or SUNRPC changes between v6.1.85 and v6.1.90.
>> Between v6.1.76 and v6.1.85, there are 48 NFSD changes, 8 changes to
>> fs/nfs/, and 3 changes to net/sunrpc/.
>>
>> Are there any NFS client changes during the second half of May that
>> correlate with the NFSD misbehavior?
>
> There was a big change: Most clients (>100) were upgraded from Debian
> 11 to Debian 12 within the last couple of months. A big chunk of
> about 40 hosts have been upgraded on May 21st. This included the
> kernel upgrade to 6.1.xx, and the upgrade of the "nfs-common" package
> from version 1.3.4 to 2.6.2.
>
> The current nfs-common package provides these tools:
>
> /sbin/mount.nfs
> /sbin/mount.nfs4
> /sbin/rpc.statd
> /sbin/showmount
> /sbin/sm-notify
> /sbin/umount.nfs
> /sbin/umount.nfs4
> /usr/sbin/blkmapd
> /usr/sbin/mountstats
> /usr/sbin/nfsconf
> /usr/sbin/nfsidmap
> /usr/sbin/nfsiostat
> /usr/sbin/nfsstat
> /usr/sbin/rpc.gssd
> /usr/sbin/rpc.idmapd
> /usr/sbin/rpc.svcgssd
> /usr/sbin/rpcctl
> /usr/sbin/rpcdebug
> /usr/sbin/start-statd
>
> I have 3 clients with more recent kernel versions 5.10.209, 5.10.218
> and 6.7.12. The 5.10.x hosts are running Debian 12 and this kernel
> for several months, but the host with 6.7.12 is running this kernel
> since May 16th.

This makes it very difficult to say what might have introduced the
problem in your environment.

My next step might be capturing network traces, but you said the
problem is hard to reproduce and you seem to have a lot of clients,
making such a capture impractical.

> These "backports" kernels were installed for testing purposes. Would
> you recommend downgrading these kernels to 6.1.90?
>
> As asked before, do you think the problem could be related to running
> nfsd inside an LXC container?

There's no way to answer either of these questions since we have no
idea what the problem is yet.

--
Chuck Lever

^ permalink raw reply [flat|nested] 30+ messages in thread
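One way long-running server-side captures are sometimes kept practical, despite a reproduction time of days, is tcpdump's rotating ring-buffer mode, which bounds disk use; a sketch only (interface, snaplen, and file names are examples, and the command needs root, so it is merely printed here rather than executed):

```shell
# -C rotates the capture file every N million bytes, -W caps the number
# of files kept; a short snaplen keeps RPC headers while dropping most
# file payload, which also limits privacy exposure.
cmd="tcpdump -i eth0 -s 256 -C 100 -W 48 -w /var/tmp/nfs-hang.pcap port 2049"
echo "$cmd"   # run as root on the NFS server until the hang recurs
```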
* Re: nfsd becomes a zombie 2024-06-19 13:14 ` Chuck Lever III @ 2024-06-20 5:29 ` Harald Dunkel 2024-06-20 19:09 ` Chuck Lever III 0 siblings, 1 reply; 30+ messages in thread From: Harald Dunkel @ 2024-06-20 5:29 UTC (permalink / raw) To: Chuck Lever III; +Cc: Calum Mackay, Linux NFS Mailing List

Hi Chuck,

On 2024-06-19 15:14:15, Chuck Lever III wrote:
>
> There's no way to answer either of these questions
> since we have no idea what the problem is yet.
>

I am getting these messages again, similar to previous incidents:

[Wed Jun 19 16:46:14 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000a5ff94a6 xid f9d6a568
[Wed Jun 19 18:42:23 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000a5ff94a6 xid 59d8a568
[Wed Jun 19 18:43:15 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000a5ff94a6 xid 5cd8a568

nfsd is still running as usual. Assuming they are related to NFS,
is there a way to map these weird messages to a remote IP address?
I could set up a similar system and use it to capture the traffic
without breaking privacy.

Regards
Harri

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd becomes a zombie 2024-06-20 5:29 ` Harald Dunkel @ 2024-06-20 19:09 ` Chuck Lever III 0 siblings, 0 replies; 30+ messages in thread From: Chuck Lever III @ 2024-06-20 19:09 UTC (permalink / raw) To: Harald Dunkel; +Cc: Calum Mackay, Linux NFS Mailing List

> On Jun 20, 2024, at 1:29 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote:
>
> Hi Chuck,
>
> On 2024-06-19 15:14:15, Chuck Lever III wrote:
>> There's no way to answer either of these questions
>> since we have no idea what the problem is yet.
>
> I am getting these messages again, similar to previous incidents:
>
> [Wed Jun 19 16:46:14 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000a5ff94a6 xid f9d6a568
> [Wed Jun 19 18:42:23 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000a5ff94a6 xid 59d8a568
> [Wed Jun 19 18:43:15 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000a5ff94a6 xid 5cd8a568
>
> nfsd is still running as usual. Assuming they are related to NFS,
> is there a way to map these weird messages to a remote IP address?
> I could set up a similar system and use it to capture the traffic
> without breaking privacy.

These messages do come from kernel NFS code, but it's still not
possible to say how they are related to the hang/deadlock.

The best you can do here is enable server tracing:

  $ sudo su -
  # trace-cmd record -e nfsd:nfsd_cb\*

That will pick up NFSv4 callback related events. When you are done
tracing, ^C.

  # trace-cmd report | less

For trace points that record client address information, you might
also need the -R option to see everything in each trace record.

--
Chuck Lever

^ permalink raw reply [flat|nested] 30+ messages in thread
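Until such a trace is captured, the raw receive_cb_reply lines can at least be tallied: the `xpt_bc_xprt` value is a hashed kernel pointer that stays stable for the same backchannel transport within one boot, so counting distinct values gives a rough lower bound on how many client connections are implicated (no IP address, unfortunately, and pointer values can in principle be reused). A sketch over three sample lines in the log format quoted in this thread:

```shell
# Sample lines copied from the kern.log excerpts earlier in the thread.
log='receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000d1675728 xid 372e06af
receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000506887ca xid 5be3c4d4
receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000d1675728 xid e5d0daba'

# Count occurrences per hashed transport pointer, busiest first.
printf '%s\n' "$log" | grep -o 'xpt_bc_xprt [0-9a-f]*' | sort | uniq -c | sort -rn
```

In practice the input would be `journalctl -k` or /var/log/kern.log rather than an inline string.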
* Re: nfsd becomes a zombie 2024-06-19 13:14 ` Chuck Lever III 2024-06-20 5:29 ` Harald Dunkel @ 2024-07-02 17:25 ` Harald Dunkel 2024-07-02 18:17 ` Chuck Lever III 1 sibling, 1 reply; 30+ messages in thread From: Harald Dunkel @ 2024-07-02 17:25 UTC (permalink / raw) To: Chuck Lever III; +Cc: Calum Mackay, Linux NFS Mailing List

Hi folks,

my NAS ran into this problem again. NFS got stuck somehow, and
the nfsd couldn't be killed :-(.

dmesg:

[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000749c823f xid 5bf8d3d0
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000ce307050 xid 3b4fbd9f
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000f7f9161e xid 0a26635c
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000007c978512 xid 384cbf0c
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000dc3c09f6 xid 53cc0e3e
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000d1675728 xid 129006af
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 0000000047159b90 xid 0c06b6e0
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 0000000008b3b3ac xid 641bb0da
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000009eb832dc xid 005fcc99
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 0000000042dcce88 xid b3cf5de4
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000b66bbd6f xid d4f06b56
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000b5e5e5a3 xid c032dbba
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000e123efc9 xid 99fa75d9
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000ca43f6f0 xid e38d5b74
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000ad683927 xid 277cde8c
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000e8e01f09 xid 641df4a4
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000006223d195 xid 3dba2d2a
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000b73943aa xid a688e47f
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000004cd80e49 xid 64e688ca
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000ef92587f xid 70bf2e44
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000a5ff94a6 xid c0f7a668
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000fd9a0890 xid 0df7d2c7
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000c42ddaac xid 800e710e
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000f43275cf xid 8b05e704
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000009a1d5dcf xid 3c2ba924
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000007cad732d xid e73a0429
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000008e7d297f xid 075a98e5
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000ed964446 xid 8bb8e568
[Tue Jul 2 17:20:19 2024] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000b14782f0 xid 4c4ae7c5
[Tue Jul 2 17:23:28 2024] INFO: task nfsd:3037 blocked for more than 120 seconds.
[Tue Jul 2 17:23:28 2024] Not tainted 6.1.0-21-amd64 #1 Debian 6.1.90-1
[Tue Jul 2 17:23:28 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jul 2 17:23:28 2024] task:nfsd state:D stack:0 pid:3037 ppid:2 flags:0x00004000
[Tue Jul 2 17:23:28 2024] Call Trace:
[Tue Jul 2 17:23:28 2024] <TASK>
[Tue Jul 2 17:23:28 2024] __schedule+0x34d/0x9e0
[Tue Jul 2 17:23:28 2024] schedule+0x5a/0xd0
[Tue Jul 2 17:23:28 2024] schedule_timeout+0x118/0x150
[Tue Jul 2 17:23:28 2024] wait_for_completion+0x86/0x160
[Tue Jul 2 17:23:28 2024] __flush_workqueue+0x152/0x420
[Tue Jul 2 17:23:28 2024] nfsd4_destroy_session+0x1b6/0x250 [nfsd]
[Tue Jul 2 17:23:28 2024] nfsd4_proc_compound+0x355/0x660 [nfsd]
[Tue Jul 2 17:23:28 2024] nfsd_dispatch+0x1a1/0x2b0 [nfsd]
[Tue Jul 2 17:23:28 2024] svc_process_common+0x289/0x5e0 [sunrpc]
[Tue Jul 2 17:23:28 2024] ? svc_recv+0x4e5/0x890 [sunrpc]
[Tue Jul 2 17:23:28 2024] ? nfsd_svc+0x360/0x360 [nfsd]
[Tue Jul 2 17:23:28 2024] ? nfsd_shutdown_threads+0x90/0x90 [nfsd]
[Tue Jul 2 17:23:28 2024] svc_process+0xad/0x100 [sunrpc]
[Tue Jul 2 17:23:28 2024] nfsd+0xd5/0x190 [nfsd]
[Tue Jul 2 17:23:28 2024] kthread+0xda/0x100
[Tue Jul 2 17:23:28 2024] ? kthread_complete_and_exit+0x20/0x20
[Tue Jul 2 17:23:28 2024] ret_from_fork+0x22/0x30
[Tue Jul 2 17:23:28 2024] </TASK>
[Tue Jul 2 17:23:28 2024] INFO: task nfsd:3038 blocked for more than 120 seconds.
[Tue Jul 2 17:23:28 2024] Not tainted 6.1.0-21-amd64 #1 Debian 6.1.90-1
[Tue Jul 2 17:23:28 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

/var/log/kern.log:

2024-06-28T10:40:40.273493+02:00 nasl006b kernel: [959982.169372] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000d1675728 xid 372e06af
2024-06-28T10:40:40.273507+02:00 nasl006b kernel: [959982.169374] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000506887ca xid 5be3c4d4
2024-06-28T10:40:40.273508+02:00 nasl006b kernel: [959982.169379] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000b5e5e5a3 xid e5d0daba
2024-06-28T10:40:40.273509+02:00 nasl006b kernel: [959982.169423] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000b66bbd6f xid 69696b56
2024-06-28T10:40:40.273509+02:00 nasl006b kernel: [959982.169498] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 0000000008b3b3ac xid 89b9afda
2024-06-28T10:40:40.273510+02:00 nasl006b kernel: [959982.169504] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000a5ff94a6 xid e595a668
2024-06-28T10:40:40.273512+02:00 nasl006b kernel: [959982.169529] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000f7f9161e xid 2fc4625c
2024-06-28T10:40:40.273513+02:00 nasl006b kernel: [959982.169659] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000b73943aa xid cb26e47f
2024-06-28T10:40:40.273514+02:00 nasl006b kernel: [959982.169691] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000009a1d5dcf xid 61c9a824
2024-06-28T10:40:40.273514+02:00 nasl006b kernel: [959982.169697] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000749c823f xid 8096d3d0
2024-06-28T10:40:40.944609+02:00 nasl006b kernel: [959983.506736] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000190b801c xid bdd1dcd0
2024-06-28T10:40:40.948612+02:00 nasl006b kernel: [959983.512235] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 0000000042dcce88 xid d76d5de4
2024-06-28T10:40:40.952617+02:00 nasl006b kernel: [959983.514349] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000007cad732d xid 0bd90329
2024-06-28T10:40:40.952623+02:00 nasl006b kernel: [959983.514564] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000004cd80e49 xid 898488ca
2024-06-28T10:40:40.952624+02:00 nasl006b kernel: [959983.514951] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000abccd646 xid d0d28401
2024-06-28T10:40:40.952624+02:00 nasl006b kernel: [959983.515009] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000ef92587f xid 955d2e44
2024-06-28T10:40:40.952625+02:00 nasl006b kernel: [959983.515060] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000ed964446 xid b056e568
2024-07-02T17:20:23.113792+02:00 nasl006b kernel: [1329564.790305] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000b14782f0 xid 4c4ae7c5
2024-07-02T17:23:32.268700+02:00 nasl006b kernel: [1329753.944957] nfsd_dispatch+0x1a1/0x2b0 [nfsd]
2024-07-02T17:23:32.300740+02:00 nasl006b kernel: [1329753.969482] svc_process_common+0x289/0x5e0 [sunrpc]
2024-07-02T17:23:32.300757+02:00 nasl006b kernel: [1329753.969919] nfsd+0xd5/0x190 [nfsd]
2024-07-02T17:23:32.364636+02:00 nasl006b kernel: [1329754.041100] ? nfsd_svc+0x360/0x360 [nfsd]
2024-07-02T17:23:32.419012+02:00 nasl006b kernel: [1329754.088290] svc_process+0xad/0x100 [sunrpc]
2024-07-02T17:23:32.419020+02:00 nasl006b kernel: [1329754.088337] ret_from_fork+0x22/0x30
2024-07-02T17:23:32.443744+02:00 nasl006b kernel: [1329754.111842] svc_process+0xad/0x100 [sunrpc]
2024-07-02T17:23:32.443749+02:00 nasl006b kernel: [1329754.111882] ? kthread_complete_and_exit+0x20/0x20
2024-07-02T17:23:32.488628+02:00 nasl006b kernel: [1329754.161331] ? kthread_complete_and_exit+0x20/0x20

Regards
Harri

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd becomes a zombie 2024-07-02 17:25 ` Harald Dunkel @ 2024-07-02 18:17 ` Chuck Lever III 2024-07-03 4:14 ` Harald Dunkel 0 siblings, 1 reply; 30+ messages in thread From: Chuck Lever III @ 2024-07-02 18:17 UTC (permalink / raw) To: Harald Dunkel; +Cc: Calum Mackay, Linux NFS Mailing List

> On Jul 2, 2024, at 1:25 PM, Harald Dunkel <harald.dunkel@aixigo.com> wrote:
>
> Hi folks,
>
> my NAS ran into this problem again. NFS got stuck somehow, and
> the nfsd couldn't be killed :-(.
>
> [... full dmesg and kern.log excerpts, quoted verbatim from the previous message, snipped ...]

Harald, none of this is any more probative than the first report
you sent.
We can't tell what's going on unless you can help us debug the problem. We're just not set up as a help desk. Have you contacted your Linux vendor and asked for help? -- Chuck Lever ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd becomes a zombie 2024-07-02 18:17 ` Chuck Lever III @ 2024-07-03 4:14 ` Harald Dunkel 0 siblings, 0 replies; 30+ messages in thread From: Harald Dunkel @ 2024-07-03 4:14 UTC (permalink / raw) To: Chuck Lever III; +Cc: Calum Mackay, Linux NFS Mailing List On 2024-07-02 20:17:11, Chuck Lever III wrote: > > Harald, none of this is any more probative than the first > report you sent. We can't tell what's going on unless you > can help us debug the problem. We're just not set up as a > help desk. Have you contacted your Linux vendor and asked > for help? > Understood. I highly appreciate your work on the NFS implementation. Maybe it is not important, but did you notice that the most recent "Got unrecognized reply" messages in dmesg did not show up in /var/log/kern.log, except for the first message immediately before NFS got frozen? Usually both show the same. Regards Harri ^ permalink raw reply [flat|nested] 30+ messages in thread
* nfsd blocks indefinitely in nfsd4_destroy_session (was: Re: nfsd becomes a zombie) 2024-06-17 14:31 ` Chuck Lever III 2024-06-17 19:20 ` Calum Mackay @ 2024-12-25 9:15 ` Salvatore Bonaccorso 2025-01-01 19:24 ` nfsd blocks indefinitely in nfsd4_destroy_session Chuck Lever 1 sibling, 1 reply; 30+ messages in thread From: Salvatore Bonaccorso @ 2024-12-25 9:15 UTC (permalink / raw) To: Chuck Lever III, Benjamin Coddington, Trond Myklebust Cc: Harald Dunkel, Linux NFS Mailing List, herzog, Martin Svec, Michael Gernoth, Pellegrin Baptiste Hi Chuck, hi all, [it was not ideal to pick one of the messages for this follow-up; let me know if you want a completely new thread. Adding Benjamin and Trond as well, as they are involved in one of the mentioned patches.] On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: > > > > On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: > > > > Hi folks, > > > > what would be the reason for nfsd getting stuck somehow and becoming > > an unkillable process? See > > > > - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 > > - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 > > > > Doesn't this mean that something inside the kernel gets stuck as > > well? Seems odd to me. > > I'm not familiar with the Debian or Ubuntu kernel packages. Can > the kernel release numbers be translated to LTS kernel releases > please? Need both "last known working" and "first broken" releases. > > This: > > [ 6596.911785] RPC: Could not send backchannel reply error: -110 > [ 6596.972490] RPC: Could not send backchannel reply error: -110 > [ 6837.281307] RPC: Could not send backchannel reply error: -110 > > is a known set of client backchannel bugs. Knowing the LTS kernel > releases (see above) will help us figure out what needs to be > backported to the LTS kernels kernels in question.
> > This: > > [11183.290619] wait_for_completion+0x88/0x150 > [11183.290623] __flush_workqueue+0x140/0x3e0 > [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] > [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] > > is probably related to the backchannel errors on the client, but > client bugs shouldn't cause the server to hang like this. We > might be able to say more if you can provide the kernel release > translations (see above). In Debian we still have bug #1071562 open, and one person notified me off-list that the issue appears to have become more frequent since they updated the NFS client side from Ubuntu 20.04 to Debian bookworm with a 6.1.y based kernel. Some people around those issues seem to claim that the change mentioned in https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html would fix the issue, which is backchannel related as well. This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the nfs_client's rpc timeouts for backchannel"), which is not something that went back to 6.1.y, could it be that the backchannel refactoring and this final fix indeed fix the issue? As people report, it is not easily reproducible, which makes it harder to identify fixes correctly. I made a (short) attempt at backporting the commits up to 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again"), but this quickly indicated it is probably still not the right thing to backport to the older stable series. At least the following would be pre-requisites: 2009e32997ed568a305cf9bc7bf27d22e0f6ccda 4119bd0306652776cb0b7caa3aea5b2a93aecb89 163cdfca341b76c958567ae0966bd3575c5c6192 f4afc8fead386c81fda2593ad6162271d26667f8 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc 57331a59ac0d680f606403eb24edd3c35aecba31 and still there would be conflicting codepaths (and it does not seem right).
Chuck, Benjamin, Trond, is there anything we can provide on the reporters' side to help tackle this issue better? Regards, Salvatore ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2024-12-25 9:15 ` nfsd blocks indefinitely in nfsd4_destroy_session (was: Re: nfsd becomes a zombie) Salvatore Bonaccorso @ 2025-01-01 19:24 ` Chuck Lever 2025-01-07 21:17 ` Salvatore Bonaccorso 0 siblings, 1 reply; 30+ messages in thread From: Chuck Lever @ 2025-01-01 19:24 UTC (permalink / raw) To: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust Cc: Harald Dunkel, Linux NFS Mailing List, herzog, Martin Svec, Michael Gernoth, Pellegrin Baptiste On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: > Hi Chuck, hi all, > > [it was not ideal to pick one of the message for this followup, let me > know if you want a complete new thread, adding as well Benjamin and > Trond as they are involved in one mentioned patch] > > On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: >> >> >>> On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: >>> >>> Hi folks, >>> >>> what would be the reason for nfsd getting stuck somehow and becoming >>> an unkillable process? See >>> >>> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 >>> - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 >>> >>> Doesn't this mean that something inside the kernel gets stuck as >>> well? Seems odd to me. >> >> I'm not familiar with the Debian or Ubuntu kernel packages. Can >> the kernel release numbers be translated to LTS kernel releases >> please? Need both "last known working" and "first broken" releases. >> >> This: >> >> [ 6596.911785] RPC: Could not send backchannel reply error: -110 >> [ 6596.972490] RPC: Could not send backchannel reply error: -110 >> [ 6837.281307] RPC: Could not send backchannel reply error: -110 >> >> is a known set of client backchannel bugs. Knowing the LTS kernel >> releases (see above) will help us figure out what needs to be >> backported to the LTS kernels kernels in question. 
>> >> This: >> >> [11183.290619] wait_for_completion+0x88/0x150 >> [11183.290623] __flush_workqueue+0x140/0x3e0 >> [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >> [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >> >> is probably related to the backchannel errors on the client, but >> client bugs shouldn't cause the server to hang like this. We >> might be able to say more if you can provide the kernel release >> translations (see above). > > In Debian we hstill have the bug #1071562 open and one person notified > mye offlist that it appears that the issue get more frequent since > they updated on NFS client side from Ubuntu 20.04 to Debian bookworm > with a 6.1.y based kernel). > > Some people around those issues, seem to claim that the change > mentioned in > https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html > would fix the issue, which is as well backchannel related. > > This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, > again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the > nfs_client's rpc timeouts for backchannel") this is not something > which goes back to 6.1.y, could it be possible that hte backchannel > refactoring and this final fix indeeds fixes the issue? > > As people report it is not easily reproducible, so this makes it > harder to identify fixes correctly. > > I gave a (short) stance on trying to backport commits up to > 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly > seems to indicate it is probably still not the right thing for > backporting to the older stable series. > > As at least pre-requisites: > > 2009e32997ed568a305cf9bc7bf27d22e0f6ccda > 4119bd0306652776cb0b7caa3aea5b2a93aecb89 > 163cdfca341b76c958567ae0966bd3575c5c6192 > f4afc8fead386c81fda2593ad6162271d26667f8 > 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc > 57331a59ac0d680f606403eb24edd3c35aecba31 > > and still there would be conflicting codepaths (and does not seem > right). 
> > Chuck, Benjamin, Trond, is there anything we can provive on reporters > side that we can try to tackle this issue better? As I've indicated before, NFSD should not block no matter what the client may or may not be doing. I'd like to focus on the server first. What is the result of: $ cd <Bookworm's v6.1.90 kernel source > $ unset KBUILD_OUTPUT $ make -j `nproc` $ scripts/faddr2line \ fs/nfsd/nfs4state.o \ nfsd4_destroy_session+0x16d Since this issue appeared after v6.1.1, is it possible to bisect between v6.1.1 and v6.1.90 ? -- Chuck Lever ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-01 19:24 ` nfsd blocks indefinitely in nfsd4_destroy_session Chuck Lever @ 2025-01-07 21:17 ` Salvatore Bonaccorso 2025-01-08 13:33 ` Chuck Lever 0 siblings, 1 reply; 30+ messages in thread From: Salvatore Bonaccorso @ 2025-01-07 21:17 UTC (permalink / raw) To: Chuck Lever Cc: Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, herzog, Martin Svec, Michael Gernoth, Pellegrin Baptiste Hi Chuck, Thanks for your time on this, much appreciated. On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: > On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: > > Hi Chuck, hi all, > > > > [it was not ideal to pick one of the message for this followup, let me > > know if you want a complete new thread, adding as well Benjamin and > > Trond as they are involved in one mentioned patch] > > > > On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: > > > > > > > > > > On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: > > > > > > > > Hi folks, > > > > > > > > what would be the reason for nfsd getting stuck somehow and becoming > > > > an unkillable process? See > > > > > > > > - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 > > > > - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 > > > > > > > > Doesn't this mean that something inside the kernel gets stuck as > > > > well? Seems odd to me. > > > > > > I'm not familiar with the Debian or Ubuntu kernel packages. Can > > > the kernel release numbers be translated to LTS kernel releases > > > please? Need both "last known working" and "first broken" releases. > > > > > > This: > > > > > > [ 6596.911785] RPC: Could not send backchannel reply error: -110 > > > [ 6596.972490] RPC: Could not send backchannel reply error: -110 > > > [ 6837.281307] RPC: Could not send backchannel reply error: -110 > > > > > > is a known set of client backchannel bugs. 
Knowing the LTS kernel > > > releases (see above) will help us figure out what needs to be > > > backported to the LTS kernels kernels in question. > > > > > > This: > > > > > > [11183.290619] wait_for_completion+0x88/0x150 > > > [11183.290623] __flush_workqueue+0x140/0x3e0 > > > [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] > > > [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] > > > > > > is probably related to the backchannel errors on the client, but > > > client bugs shouldn't cause the server to hang like this. We > > > might be able to say more if you can provide the kernel release > > > translations (see above). > > > > In Debian we hstill have the bug #1071562 open and one person notified > > mye offlist that it appears that the issue get more frequent since > > they updated on NFS client side from Ubuntu 20.04 to Debian bookworm > > with a 6.1.y based kernel). > > > > Some people around those issues, seem to claim that the change > > mentioned in > > https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html > > would fix the issue, which is as well backchannel related. > > > > This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, > > again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the > > nfs_client's rpc timeouts for backchannel") this is not something > > which goes back to 6.1.y, could it be possible that hte backchannel > > refactoring and this final fix indeeds fixes the issue? > > > > As people report it is not easily reproducible, so this makes it > > harder to identify fixes correctly. > > > > I gave a (short) stance on trying to backport commits up to > > 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly > > seems to indicate it is probably still not the right thing for > > backporting to the older stable series. 
> > > > As at least pre-requisites: > > > > 2009e32997ed568a305cf9bc7bf27d22e0f6ccda > > 4119bd0306652776cb0b7caa3aea5b2a93aecb89 > > 163cdfca341b76c958567ae0966bd3575c5c6192 > > f4afc8fead386c81fda2593ad6162271d26667f8 > > 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc > > 57331a59ac0d680f606403eb24edd3c35aecba31 > > > > and still there would be conflicting codepaths (and does not seem > > right). > > > > Chuck, Benjamin, Trond, is there anything we can provive on reporters > > side that we can try to tackle this issue better? > > As I've indicated before, NFSD should not block no matter what the > client may or may not be doing. I'd like to focus on the server first. > > What is the result of: > > $ cd <Bookworm's v6.1.90 kernel source > > $ unset KBUILD_OUTPUT > $ make -j `nproc` > $ scripts/faddr2line \ > fs/nfsd/nfs4state.o \ > nfsd4_destroy_session+0x16d > > Since this issue appeared after v6.1.1, is it possible to bisect > between v6.1.1 and v6.1.90 ? First please note that, at least regarding triggering the issue in Debian, Debian has already moved to a 6.1.119-based kernel (and will move to 6.1.123 in the weekend's point release). That said, one of the users who is regularly hit by the issue was able to provide the information requested above, based on 6.1.119: ~/kernel/linux-source-6.1# make kernelversion 6.1.119 ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d nfsd4_destroy_session+0x16d/0x250: __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 They could also provide a decode_stacktrace output for the most recent hit (if that is helpful for you): [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds.
[Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 [Mon Jan 6 13:25:28 2025] Call Trace: [Mon Jan 6 13:25:28 2025] <TASK> [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 [Mon Jan 6 13:25:28 2025] ? 
kthread_complete_and_exit+0x20/0x20 [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 [Mon Jan 6 13:25:28 2025] </TASK> The question about bisection is harder: those are production systems, and I understand it is not possible to bisect there, while unfortunately the issue does not reproduce under "lab conditions". Does the above still give you a clue about what you were looking for? Regards, Salvatore ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-07 21:17 ` Salvatore Bonaccorso @ 2025-01-08 13:33 ` Chuck Lever 2025-01-08 14:54 ` Christian Herzog 0 siblings, 1 reply; 30+ messages in thread From: Chuck Lever @ 2025-01-08 13:33 UTC (permalink / raw) To: Salvatore Bonaccorso Cc: Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, herzog, Martin Svec, Michael Gernoth, Pellegrin Baptiste On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: > Hi Chuck, > > Thanks for your time on this, much appreciated. > > On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: >> On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: >>> Hi Chuck, hi all, >>> >>> [it was not ideal to pick one of the message for this followup, let me >>> know if you want a complete new thread, adding as well Benjamin and >>> Trond as they are involved in one mentioned patch] >>> >>> On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: >>>> >>>> >>>>> On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: >>>>> >>>>> Hi folks, >>>>> >>>>> what would be the reason for nfsd getting stuck somehow and becoming >>>>> an unkillable process? See >>>>> >>>>> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 >>>>> - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 >>>>> >>>>> Doesn't this mean that something inside the kernel gets stuck as >>>>> well? Seems odd to me. >>>> >>>> I'm not familiar with the Debian or Ubuntu kernel packages. Can >>>> the kernel release numbers be translated to LTS kernel releases >>>> please? Need both "last known working" and "first broken" releases. >>>> >>>> This: >>>> >>>> [ 6596.911785] RPC: Could not send backchannel reply error: -110 >>>> [ 6596.972490] RPC: Could not send backchannel reply error: -110 >>>> [ 6837.281307] RPC: Could not send backchannel reply error: -110 >>>> >>>> is a known set of client backchannel bugs. 
Knowing the LTS kernel >>>> releases (see above) will help us figure out what needs to be >>>> backported to the LTS kernels kernels in question. >>>> >>>> This: >>>> >>>> [11183.290619] wait_for_completion+0x88/0x150 >>>> [11183.290623] __flush_workqueue+0x140/0x3e0 >>>> [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >>>> [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >>>> >>>> is probably related to the backchannel errors on the client, but >>>> client bugs shouldn't cause the server to hang like this. We >>>> might be able to say more if you can provide the kernel release >>>> translations (see above). >>> >>> In Debian we hstill have the bug #1071562 open and one person notified >>> mye offlist that it appears that the issue get more frequent since >>> they updated on NFS client side from Ubuntu 20.04 to Debian bookworm >>> with a 6.1.y based kernel). >>> >>> Some people around those issues, seem to claim that the change >>> mentioned in >>> https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html >>> would fix the issue, which is as well backchannel related. >>> >>> This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, >>> again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the >>> nfs_client's rpc timeouts for backchannel") this is not something >>> which goes back to 6.1.y, could it be possible that hte backchannel >>> refactoring and this final fix indeeds fixes the issue? >>> >>> As people report it is not easily reproducible, so this makes it >>> harder to identify fixes correctly. >>> >>> I gave a (short) stance on trying to backport commits up to >>> 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly >>> seems to indicate it is probably still not the right thing for >>> backporting to the older stable series. 
>>> >>> As at least pre-requisites: >>> >>> 2009e32997ed568a305cf9bc7bf27d22e0f6ccda >>> 4119bd0306652776cb0b7caa3aea5b2a93aecb89 >>> 163cdfca341b76c958567ae0966bd3575c5c6192 >>> f4afc8fead386c81fda2593ad6162271d26667f8 >>> 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc >>> 57331a59ac0d680f606403eb24edd3c35aecba31 >>> >>> and still there would be conflicting codepaths (and does not seem >>> right). >>> >>> Chuck, Benjamin, Trond, is there anything we can provive on reporters >>> side that we can try to tackle this issue better? >> >> As I've indicated before, NFSD should not block no matter what the >> client may or may not be doing. I'd like to focus on the server first. >> >> What is the result of: >> >> $ cd <Bookworm's v6.1.90 kernel source > >> $ unset KBUILD_OUTPUT >> $ make -j `nproc` >> $ scripts/faddr2line \ >> fs/nfsd/nfs4state.o \ >> nfsd4_destroy_session+0x16d >> >> Since this issue appeared after v6.1.1, is it possible to bisect >> between v6.1.1 and v6.1.90 ? > > First please note, at least speaking of triggering the issue in > Debian, Debian has moved to 6.1.119 based kernel already (and soon in > the weekends point release move to 6.1.123). 
> > That said, one of the users which regularly seems to be hit by the > issue was able to provide the above requested information, based for > 6.1.119: > > ~/kernel/linux-source-6.1# make kernelversion > 6.1.119 > ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d > nfsd4_destroy_session+0x16d/0x250: > __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 > (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 > (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 > (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 > > They could provide as well a decode_stacktrace output for the recent > hit (if that is helpful for you): > > [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds. > [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 > [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 > [Mon Jan 6 13:25:28 2025] Call Trace: > [Mon Jan 6 13:25:28 2025] <TASK> > [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 > [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 > [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 > [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 > [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 > [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd > [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd > [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd > [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc > [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc > [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd > [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd > [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc > [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd > [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 > [Mon Jan 6 13:25:28 2025] ? kthread_complete_and_exit+0x20/0x20 > [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 > [Mon Jan 6 13:25:28 2025] </TASK> > > The question about bisection is actually harder, those are production > systems and I understand it's not possible to do bisect there, while > unfortunately not reprodcing the issue on "lab conditions". > > Does the above give us still a clue on what you were looking for? Thanks for the update. 
It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"), while not a specific fix for this issue, might offer some relief by preventing the DESTROY_SESSION hang from affecting all other clients of the degraded server. Not having a reproducer or the ability to bisect puts a damper on things. The next step, then, is to enable tracing on servers where this issue can come up, and wait for the hang to occur. The following command captures information only about callback operations, so it should not generate much data or impact server performance. # trace-cmd record -e nfsd:nfsd_cb\* Let that run until the problem occurs, then ^C, compress the resulting trace.dat file, and either attach it to Debian bug #1071562 or email it to me privately. -- Chuck Lever ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-08 13:33 ` Chuck Lever @ 2025-01-08 14:54 ` Christian Herzog 2025-01-08 15:07 ` Chuck Lever 0 siblings, 1 reply; 30+ messages in thread From: Christian Herzog @ 2025-01-08 14:54 UTC (permalink / raw) To: Chuck Lever Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Pellegrin Baptiste Dear Chuck, On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: > On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: > > Hi Chuck, > > > > Thanks for your time on this, much appreciated. > > > > On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: > > > On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: > > > > Hi Chuck, hi all, > > > > > > > > [it was not ideal to pick one of the message for this followup, let me > > > > know if you want a complete new thread, adding as well Benjamin and > > > > Trond as they are involved in one mentioned patch] > > > > > > > > On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: > > > > > > > > > > > > > > > > On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: > > > > > > > > > > > > Hi folks, > > > > > > > > > > > > what would be the reason for nfsd getting stuck somehow and becoming > > > > > > an unkillable process? See > > > > > > > > > > > > - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 > > > > > > - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 > > > > > > > > > > > > Doesn't this mean that something inside the kernel gets stuck as > > > > > > well? Seems odd to me. > > > > > > > > > > I'm not familiar with the Debian or Ubuntu kernel packages. Can > > > > > the kernel release numbers be translated to LTS kernel releases > > > > > please? Need both "last known working" and "first broken" releases. 
> > > > > 
> > > > > This: 
> > > > > 
> > > > > [ 6596.911785] RPC: Could not send backchannel reply error: -110 
> > > > > [ 6596.972490] RPC: Could not send backchannel reply error: -110 
> > > > > [ 6837.281307] RPC: Could not send backchannel reply error: -110 
> > > > > 
> > > > > is a known set of client backchannel bugs. Knowing the LTS kernel 
> > > > > releases (see above) will help us figure out what needs to be 
> > > > > backported to the LTS kernels in question. 
> > > > > 
> > > > > This: 
> > > > > 
> > > > > [11183.290619] wait_for_completion+0x88/0x150 
> > > > > [11183.290623] __flush_workqueue+0x140/0x3e0 
> > > > > [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] 
> > > > > [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] 
> > > > > 
> > > > > is probably related to the backchannel errors on the client, but 
> > > > > client bugs shouldn't cause the server to hang like this. We 
> > > > > might be able to say more if you can provide the kernel release 
> > > > > translations (see above). 
> > > > 
> > > > In Debian we still have the bug #1071562 open, and one person notified 
> > > > me offlist that the issue appears to have become more frequent since 
> > > > they updated the NFS client side from Ubuntu 20.04 to Debian bookworm 
> > > > with a 6.1.y based kernel. 
> > > > 
> > > > Some people around those issues seem to claim that the change 
> > > > mentioned in 
> > > > https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html 
> > > > would fix the issue, which is as well backchannel related. 
> > > > 
> > > > This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, 
> > > > again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the 
> > > > nfs_client's rpc timeouts for backchannel"), which is not something 
> > > > that goes back to 6.1.y, could it be possible that the backchannel 
> > > > refactoring and this final fix indeed fixes the issue? 
> > > > 
> > > > As people report it is not easily reproducible, so this makes it 
> > > > harder to identify fixes correctly. 
> > > > 
> > > > I made a (short) attempt at backporting commits up to 
> > > > 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again"), but this quickly 
> > > > seems to indicate it is probably still not the right thing for 
> > > > backporting to the older stable series. 
> > > > 
> > > > As at least pre-requisites: 
> > > > 
> > > > 2009e32997ed568a305cf9bc7bf27d22e0f6ccda 
> > > > 4119bd0306652776cb0b7caa3aea5b2a93aecb89 
> > > > 163cdfca341b76c958567ae0966bd3575c5c6192 
> > > > f4afc8fead386c81fda2593ad6162271d26667f8 
> > > > 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc 
> > > > 57331a59ac0d680f606403eb24edd3c35aecba31 
> > > > 
> > > > and still there would be conflicting codepaths (and it does not seem 
> > > > right). 
> > > > 
> > > > Chuck, Benjamin, Trond, is there anything we can provide on the 
> > > > reporters' side so that we can tackle this issue better? 
> > > 
> > > As I've indicated before, NFSD should not block no matter what the 
> > > client may or may not be doing. I'd like to focus on the server first. 
> > > 
> > > What is the result of: 
> > > 
> > > $ cd <Bookworm's v6.1.90 kernel source> 
> > > $ unset KBUILD_OUTPUT 
> > > $ make -j `nproc` 
> > > $ scripts/faddr2line \ 
> > > fs/nfsd/nfs4state.o \ 
> > > nfsd4_destroy_session+0x16d 
> > > 
> > > Since this issue appeared after v6.1.1, is it possible to bisect 
> > > between v6.1.1 and v6.1.90 ? 
> > 
> > First please note, at least speaking of triggering the issue in 
> > Debian: Debian has already moved to a 6.1.119 based kernel (and will, 
> > in the weekend's point release, move to 6.1.123). 
> > 
> > That said, one of the users who regularly seems to be hit by the 
> > issue was able to provide the above requested information, based on 
> > 6.1.119: 
> > 
> > ~/kernel/linux-source-6.1# make kernelversion 
> > 6.1.119 
> > ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d 
> > nfsd4_destroy_session+0x16d/0x250: 
> > __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 
> > (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 
> > (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 
> > (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 
> > 
> > They could also provide a decode_stacktrace output for the most recent 
> > hit (if that is helpful for you): 
> > 
> > [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds. 
> > [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 
> > [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> > [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 
> > [Mon Jan 6 13:25:28 2025] Call Trace: 
> > [Mon Jan 6 13:25:28 2025] <TASK> 
> > [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 
> > [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 
> > [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 
> > [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 
> > [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 
> > [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd 
> > [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd 
> > [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd 
> > [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc 
> > [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc 
> > [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd 
> > [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd 
> > [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc 
> > [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd 
> > [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 
> > [Mon Jan 6 13:25:28 2025] ? kthread_complete_and_exit+0x20/0x20 
> > [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 
> > [Mon Jan 6 13:25:28 2025] </TASK> 
> > 
> > The question about bisection is actually harder: those are production 
> > systems, and I understand it's not possible to bisect there, while we 
> > unfortunately cannot reproduce the issue under "lab conditions". 
> > > > Does the above give us still a clue on what you were looking for? > > Thanks for the update. > > It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct > nfs4_client"), while not a specific fix for this issue, might offer some > relief by preventing the DESTROY_SESSION hang from affecting all other > clients of the degraded server. > > Not having a reproducer or the ability to bisect puts a damper on > things. The next step, then, is to enable tracing on servers where this > issue can come up, and wait for the hang to occur. The following command > captures information only about callback operation, so it should not > generate much data or impact server performance. > > # trace-cmd record -e nfsd:nfsd_cb\* > > Let that run until the problem occurs, then ^C, compress the resulting > trace.dat file, and either attach it to 1071562 or email it to me > privately. thanks for the follow-up. I am the "customer" with two affected file servers. We have since moved those servers to the backports kernel (6.11.10) in the hope of forward fixing the issue. If this kernel is stable, I'm afraid I can't go back to the 'bad' kernel (700+ researchers affected..), and we're also not able to trigger the issue on our test environment. I'm sorry I can't be of more help. best, -Christian -- Dr. Christian Herzog <herzog@phys.ethz.ch> support: +41 44 633 26 68 Head, IT Services Group, HPT H 8 voice: +41 44 633 39 50 Department of Physics, ETH Zurich 8093 Zurich, Switzerland http://isg.phys.ethz.ch/ ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-08 14:54 ` Christian Herzog @ 2025-01-08 15:07 ` Chuck Lever 2025-01-09 11:56 ` Christian Herzog 0 siblings, 1 reply; 30+ messages in thread From: Chuck Lever @ 2025-01-08 15:07 UTC (permalink / raw) To: Christian Herzog Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Pellegrin Baptiste On 1/8/25 9:54 AM, Christian Herzog wrote: > Dear Chuck, > > On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: >> On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: >>> Hi Chuck, >>> >>> Thanks for your time on this, much appreciated. >>> >>> On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: >>>> On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: >>>>> Hi Chuck, hi all, >>>>> >>>>> [it was not ideal to pick one of the message for this followup, let me >>>>> know if you want a complete new thread, adding as well Benjamin and >>>>> Trond as they are involved in one mentioned patch] >>>>> >>>>> On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: >>>>>> >>>>>> >>>>>>> On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: >>>>>>> >>>>>>> Hi folks, >>>>>>> >>>>>>> what would be the reason for nfsd getting stuck somehow and becoming >>>>>>> an unkillable process? See >>>>>>> >>>>>>> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 >>>>>>> - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 >>>>>>> >>>>>>> Doesn't this mean that something inside the kernel gets stuck as >>>>>>> well? Seems odd to me. >>>>>> >>>>>> I'm not familiar with the Debian or Ubuntu kernel packages. Can >>>>>> the kernel release numbers be translated to LTS kernel releases >>>>>> please? Need both "last known working" and "first broken" releases. 
>>>>>> >>>>>> This: >>>>>> >>>>>> [ 6596.911785] RPC: Could not send backchannel reply error: -110 >>>>>> [ 6596.972490] RPC: Could not send backchannel reply error: -110 >>>>>> [ 6837.281307] RPC: Could not send backchannel reply error: -110 >>>>>> >>>>>> is a known set of client backchannel bugs. Knowing the LTS kernel >>>>>> releases (see above) will help us figure out what needs to be >>>>>> backported to the LTS kernels kernels in question. >>>>>> >>>>>> This: >>>>>> >>>>>> [11183.290619] wait_for_completion+0x88/0x150 >>>>>> [11183.290623] __flush_workqueue+0x140/0x3e0 >>>>>> [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >>>>>> [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >>>>>> >>>>>> is probably related to the backchannel errors on the client, but >>>>>> client bugs shouldn't cause the server to hang like this. We >>>>>> might be able to say more if you can provide the kernel release >>>>>> translations (see above). >>>>> >>>>> In Debian we hstill have the bug #1071562 open and one person notified >>>>> mye offlist that it appears that the issue get more frequent since >>>>> they updated on NFS client side from Ubuntu 20.04 to Debian bookworm >>>>> with a 6.1.y based kernel). >>>>> >>>>> Some people around those issues, seem to claim that the change >>>>> mentioned in >>>>> https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html >>>>> would fix the issue, which is as well backchannel related. >>>>> >>>>> This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, >>>>> again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the >>>>> nfs_client's rpc timeouts for backchannel") this is not something >>>>> which goes back to 6.1.y, could it be possible that hte backchannel >>>>> refactoring and this final fix indeeds fixes the issue? >>>>> >>>>> As people report it is not easily reproducible, so this makes it >>>>> harder to identify fixes correctly. 
>>>>> >>>>> I gave a (short) stance on trying to backport commits up to >>>>> 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly >>>>> seems to indicate it is probably still not the right thing for >>>>> backporting to the older stable series. >>>>> >>>>> As at least pre-requisites: >>>>> >>>>> 2009e32997ed568a305cf9bc7bf27d22e0f6ccda >>>>> 4119bd0306652776cb0b7caa3aea5b2a93aecb89 >>>>> 163cdfca341b76c958567ae0966bd3575c5c6192 >>>>> f4afc8fead386c81fda2593ad6162271d26667f8 >>>>> 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc >>>>> 57331a59ac0d680f606403eb24edd3c35aecba31 >>>>> >>>>> and still there would be conflicting codepaths (and does not seem >>>>> right). >>>>> >>>>> Chuck, Benjamin, Trond, is there anything we can provive on reporters >>>>> side that we can try to tackle this issue better? >>>> >>>> As I've indicated before, NFSD should not block no matter what the >>>> client may or may not be doing. I'd like to focus on the server first. >>>> >>>> What is the result of: >>>> >>>> $ cd <Bookworm's v6.1.90 kernel source > >>>> $ unset KBUILD_OUTPUT >>>> $ make -j `nproc` >>>> $ scripts/faddr2line \ >>>> fs/nfsd/nfs4state.o \ >>>> nfsd4_destroy_session+0x16d >>>> >>>> Since this issue appeared after v6.1.1, is it possible to bisect >>>> between v6.1.1 and v6.1.90 ? >>> >>> First please note, at least speaking of triggering the issue in >>> Debian, Debian has moved to 6.1.119 based kernel already (and soon in >>> the weekends point release move to 6.1.123). 
>>> >>> That said, one of the users which regularly seems to be hit by the >>> issue was able to provide the above requested information, based for >>> 6.1.119: >>> >>> ~/kernel/linux-source-6.1# make kernelversion >>> 6.1.119 >>> ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d >>> nfsd4_destroy_session+0x16d/0x250: >>> __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 >>> (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 >>> (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 >>> (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 >>> >>> They could provide as well a decode_stacktrace output for the recent >>> hit (if that is helpful for you): >>> >>> [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds. >>> [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 >>> [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
>>> [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 >>> [Mon Jan 6 13:25:28 2025] Call Trace: >>> [Mon Jan 6 13:25:28 2025] <TASK> >>> [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 >>> [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 >>> [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 >>> [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 >>> [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 >>> [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd >>> [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd >>> [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd >>> [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc >>> [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc >>> [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd >>> [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd >>> [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc >>> [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd >>> [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 >>> [Mon Jan 6 13:25:28 2025] ? kthread_complete_and_exit+0x20/0x20 >>> [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 >>> [Mon Jan 6 13:25:28 2025] </TASK> >>> >>> The question about bisection is actually harder, those are production >>> systems and I understand it's not possible to do bisect there, while >>> unfortunately not reprodcing the issue on "lab conditions". 
>>> >>> Does the above give us still a clue on what you were looking for? >> >> Thanks for the update. >> >> It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct >> nfs4_client"), while not a specific fix for this issue, might offer some >> relief by preventing the DESTROY_SESSION hang from affecting all other >> clients of the degraded server. >> >> Not having a reproducer or the ability to bisect puts a damper on >> things. The next step, then, is to enable tracing on servers where this >> issue can come up, and wait for the hang to occur. The following command >> captures information only about callback operation, so it should not >> generate much data or impact server performance. >> >> # trace-cmd record -e nfsd:nfsd_cb\* >> >> Let that run until the problem occurs, then ^C, compress the resulting >> trace.dat file, and either attach it to 1071562 or email it to me >> privately. > thanks for the follow-up. > > I am the "customer" with two affected file servers. We have since moved those > servers to the backports kernel (6.11.10) in the hope of forward fixing the > issue. If this kernel is stable, I'm afraid I can't go back to the 'bad' > kernel (700+ researchers affected..), and we're also not able to trigger the > issue on our test environment. Hello Dr. Herzog - If the problem recurs on 6.11, the trace-cmd I suggest above works there as well. If not, there are several "me too!" reports in the bug. Anyone who hits this issue can try the trace-cmd and report back. -- Chuck Lever ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-08 15:07 ` Chuck Lever @ 2025-01-09 11:56 ` Christian Herzog 2025-01-09 12:42 ` Jeff Layton 2025-01-09 15:49 ` Chuck Lever 0 siblings, 2 replies; 30+ messages in thread From: Christian Herzog @ 2025-01-09 11:56 UTC (permalink / raw) To: Chuck Lever Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Pellegrin Baptiste Dear Chuck, On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: > On 1/8/25 9:54 AM, Christian Herzog wrote: > > Dear Chuck, > > > > On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: > > > On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: > > > > Hi Chuck, > > > > > > > > Thanks for your time on this, much appreciated. > > > > > > > > On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: > > > > > On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: > > > > > > Hi Chuck, hi all, > > > > > > > > > > > > [it was not ideal to pick one of the message for this followup, let me > > > > > > know if you want a complete new thread, adding as well Benjamin and > > > > > > Trond as they are involved in one mentioned patch] > > > > > > > > > > > > On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: > > > > > > > > > > > > > > > > > > > > > > On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: > > > > > > > > > > > > > > > > Hi folks, > > > > > > > > > > > > > > > > what would be the reason for nfsd getting stuck somehow and becoming > > > > > > > > an unkillable process? See > > > > > > > > > > > > > > > > - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 > > > > > > > > - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 > > > > > > > > > > > > > > > > Doesn't this mean that something inside the kernel gets stuck as > > > > > > > > well? Seems odd to me. 
> > > > > > > > > > > > > > I'm not familiar with the Debian or Ubuntu kernel packages. Can > > > > > > > the kernel release numbers be translated to LTS kernel releases > > > > > > > please? Need both "last known working" and "first broken" releases. > > > > > > > > > > > > > > This: > > > > > > > > > > > > > > [ 6596.911785] RPC: Could not send backchannel reply error: -110 > > > > > > > [ 6596.972490] RPC: Could not send backchannel reply error: -110 > > > > > > > [ 6837.281307] RPC: Could not send backchannel reply error: -110 > > > > > > > > > > > > > > is a known set of client backchannel bugs. Knowing the LTS kernel > > > > > > > releases (see above) will help us figure out what needs to be > > > > > > > backported to the LTS kernels kernels in question. > > > > > > > > > > > > > > This: > > > > > > > > > > > > > > [11183.290619] wait_for_completion+0x88/0x150 > > > > > > > [11183.290623] __flush_workqueue+0x140/0x3e0 > > > > > > > [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] > > > > > > > [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] > > > > > > > > > > > > > > is probably related to the backchannel errors on the client, but > > > > > > > client bugs shouldn't cause the server to hang like this. We > > > > > > > might be able to say more if you can provide the kernel release > > > > > > > translations (see above). > > > > > > > > > > > > In Debian we hstill have the bug #1071562 open and one person notified > > > > > > mye offlist that it appears that the issue get more frequent since > > > > > > they updated on NFS client side from Ubuntu 20.04 to Debian bookworm > > > > > > with a 6.1.y based kernel). > > > > > > > > > > > > Some people around those issues, seem to claim that the change > > > > > > mentioned in > > > > > > https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html > > > > > > would fix the issue, which is as well backchannel related. 
> > > > > > > > > > > > This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, > > > > > > again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the > > > > > > nfs_client's rpc timeouts for backchannel") this is not something > > > > > > which goes back to 6.1.y, could it be possible that hte backchannel > > > > > > refactoring and this final fix indeeds fixes the issue? > > > > > > > > > > > > As people report it is not easily reproducible, so this makes it > > > > > > harder to identify fixes correctly. > > > > > > > > > > > > I gave a (short) stance on trying to backport commits up to > > > > > > 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly > > > > > > seems to indicate it is probably still not the right thing for > > > > > > backporting to the older stable series. > > > > > > > > > > > > As at least pre-requisites: > > > > > > > > > > > > 2009e32997ed568a305cf9bc7bf27d22e0f6ccda > > > > > > 4119bd0306652776cb0b7caa3aea5b2a93aecb89 > > > > > > 163cdfca341b76c958567ae0966bd3575c5c6192 > > > > > > f4afc8fead386c81fda2593ad6162271d26667f8 > > > > > > 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc > > > > > > 57331a59ac0d680f606403eb24edd3c35aecba31 > > > > > > > > > > > > and still there would be conflicting codepaths (and does not seem > > > > > > right). > > > > > > > > > > > > Chuck, Benjamin, Trond, is there anything we can provive on reporters > > > > > > side that we can try to tackle this issue better? > > > > > > > > > > As I've indicated before, NFSD should not block no matter what the > > > > > client may or may not be doing. I'd like to focus on the server first. 
> > > > > > > > > > What is the result of: > > > > > > > > > > $ cd <Bookworm's v6.1.90 kernel source > > > > > > $ unset KBUILD_OUTPUT > > > > > $ make -j `nproc` > > > > > $ scripts/faddr2line \ > > > > > fs/nfsd/nfs4state.o \ > > > > > nfsd4_destroy_session+0x16d > > > > > > > > > > Since this issue appeared after v6.1.1, is it possible to bisect > > > > > between v6.1.1 and v6.1.90 ? > > > > > > > > First please note, at least speaking of triggering the issue in > > > > Debian, Debian has moved to 6.1.119 based kernel already (and soon in > > > > the weekends point release move to 6.1.123). > > > > > > > > That said, one of the users which regularly seems to be hit by the > > > > issue was able to provide the above requested information, based for > > > > 6.1.119: > > > > > > > > ~/kernel/linux-source-6.1# make kernelversion > > > > 6.1.119 > > > > ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d > > > > nfsd4_destroy_session+0x16d/0x250: > > > > __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 > > > > (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 > > > > (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 > > > > (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 > > > > > > > > They could provide as well a decode_stacktrace output for the recent > > > > hit (if that is helpful for you): > > > > > > > > [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds. > > > > [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 > > > > [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> > > > [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 > > > > [Mon Jan 6 13:25:28 2025] Call Trace: > > > > [Mon Jan 6 13:25:28 2025] <TASK> > > > > [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 > > > > [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 > > > > [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 > > > > [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 > > > > [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 > > > > [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd > > > > [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd > > > > [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd > > > > [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc > > > > [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc > > > > [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd > > > > [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd > > > > [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc > > > > [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd > > > > [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 > > > > [Mon Jan 6 13:25:28 2025] ? 
kthread_complete_and_exit+0x20/0x20 
> > > > [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 
> > > > [Mon Jan 6 13:25:28 2025] </TASK> 
> > > > 
> > > > The question about bisection is actually harder: those are production 
> > > > systems, and I understand it's not possible to bisect there, while we 
> > > > unfortunately cannot reproduce the issue under "lab conditions". 
> > > > 
> > > > Does the above give us still a clue on what you were looking for? 
> > > 
> > > Thanks for the update. 
> > > 
> > > It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct 
> > > nfs4_client"), while not a specific fix for this issue, might offer some 
> > > relief by preventing the DESTROY_SESSION hang from affecting all other 
> > > clients of the degraded server. 
> > > 
> > > Not having a reproducer or the ability to bisect puts a damper on 
> > > things. The next step, then, is to enable tracing on servers where this 
> > > issue can come up, and wait for the hang to occur. The following command 
> > > captures information only about callback operation, so it should not 
> > > generate much data or impact server performance. 
> > > 
> > > # trace-cmd record -e nfsd:nfsd_cb\* 
> > > 
> > > Let that run until the problem occurs, then ^C, compress the resulting 
> > > trace.dat file, and either attach it to 1071562 or email it to me 
> > > privately. 
> > thanks for the follow-up. 
> > 
> > I am the "customer" with two affected file servers. We have since moved those 
> > servers to the backports kernel (6.11.10) in the hope of forward fixing the 
> > issue. If this kernel is stable, I'm afraid I can't go back to the 'bad' 
> > kernel (700+ researchers affected..), and we're also not able to trigger the 
> > issue on our test environment. 
> 
> Hello Dr. Herzog - 
> 
> If the problem recurs on 6.11, the trace-cmd I suggest above works 
> there as well. 

the bad news is: it just happened again with the bpo kernel. 
We then immediately started trace-cmd since usually there are several call traces in a row and we did get a trace.dat: http://people.phys.ethz.ch/~daduke/trace.dat we did notice however that the stack trace looked a bit different this time: INFO: task nfsd:2566 blocked for more than 5799 seconds. Tainted: G W 6.11.10+bpo-amd64 #1 Debia> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables t> task:nfsd state:D stack:0 pid:2566 tgid:2566 > Call Trace: <TASK> __schedule+0x400/0xad0 schedule+0x27/0xf0 nfsd4_shutdown_callback+0xfe/0x140 [nfsd] ? __pfx_var_wake_function+0x10/0x10 __destroy_client+0x1f0/0x290 [nfsd] nfsd4_destroy_clientid+0xf1/0x1e0 [nfsd] ? svcauth_unix_set_client+0x586/0x5f0 [sunrpc] nfsd4_proc_compound+0x34d/0x670 [nfsd] nfsd_dispatch+0xfd/0x220 [nfsd] svc_process_common+0x2f7/0x700 [sunrpc] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd] svc_process+0x131/0x180 [sunrpc] svc_recv+0x830/0xa10 [sunrpc] ? __pfx_nfsd+0x10/0x10 [nfsd] nfsd+0x87/0xf0 [nfsd] kthread+0xcf/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> and also the state of the offending client in `/proc/fs/nfsd/clients/*/info` used to be callback state: UNKNOWN while now it is DOWN or FAULT. No idea what it means, but I thought it was worth mentioning. thanks! -Christian -- Dr. Christian Herzog <herzog@phys.ethz.ch> support: +41 44 633 26 68 Head, IT Services Group, HPT H 8 voice: +41 44 633 39 50 Department of Physics, ETH Zurich 8093 Zurich, Switzerland http://isg.phys.ethz.ch/ ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-09 11:56 ` Christian Herzog @ 2025-01-09 12:42 ` Jeff Layton 2025-01-09 13:56 ` Chuck Lever 2025-01-09 16:32 ` Chuck Lever 2025-01-09 15:49 ` Chuck Lever 1 sibling, 2 replies; 30+ messages in thread From: Jeff Layton @ 2025-01-09 12:42 UTC (permalink / raw) To: Christian Herzog, Chuck Lever Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Pellegrin Baptiste On Thu, 2025-01-09 at 12:56 +0100, Christian Herzog wrote: > Dear Chuck, > > On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: > > On 1/8/25 9:54 AM, Christian Herzog wrote: > > > Dear Chuck, > > > > > > On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: > > > > On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: > > > > > Hi Chuck, > > > > > > > > > > Thanks for your time on this, much appreciated. > > > > > > > > > > On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: > > > > > > On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: > > > > > > > Hi Chuck, hi all, > > > > > > > > > > > > > > [it was not ideal to pick one of the message for this followup, let me > > > > > > > know if you want a complete new thread, adding as well Benjamin and > > > > > > > Trond as they are involved in one mentioned patch] > > > > > > > > > > > > > > On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: > > > > > > > > > > > > > > > > > > Hi folks, > > > > > > > > > > > > > > > > > > what would be the reason for nfsd getting stuck somehow and becoming > > > > > > > > > an unkillable process? 
See > > > > > > > > > > > > > > > > > > - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 > > > > > > > > > - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 > > > > > > > > > > > > > > > > > > Doesn't this mean that something inside the kernel gets stuck as > > > > > > > > > well? Seems odd to me. > > > > > > > > > > > > > > > > I'm not familiar with the Debian or Ubuntu kernel packages. Can > > > > > > > > the kernel release numbers be translated to LTS kernel releases > > > > > > > > please? Need both "last known working" and "first broken" releases. > > > > > > > > > > > > > > > > This: > > > > > > > > > > > > > > > > [ 6596.911785] RPC: Could not send backchannel reply error: -110 > > > > > > > > [ 6596.972490] RPC: Could not send backchannel reply error: -110 > > > > > > > > [ 6837.281307] RPC: Could not send backchannel reply error: -110 > > > > > > > > > > > > > > > > is a known set of client backchannel bugs. Knowing the LTS kernel > > > > > > > > releases (see above) will help us figure out what needs to be > > > > > > > > backported to the LTS kernels kernels in question. > > > > > > > > > > > > > > > > This: > > > > > > > > > > > > > > > > [11183.290619] wait_for_completion+0x88/0x150 > > > > > > > > [11183.290623] __flush_workqueue+0x140/0x3e0 > > > > > > > > [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] > > > > > > > > [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] > > > > > > > > > > > > > > > > is probably related to the backchannel errors on the client, but > > > > > > > > client bugs shouldn't cause the server to hang like this. We > > > > > > > > might be able to say more if you can provide the kernel release > > > > > > > > translations (see above). 
> > > > > > > > > > > > > > In Debian we hstill have the bug #1071562 open and one person notified > > > > > > > mye offlist that it appears that the issue get more frequent since > > > > > > > they updated on NFS client side from Ubuntu 20.04 to Debian bookworm > > > > > > > with a 6.1.y based kernel). > > > > > > > > > > > > > > Some people around those issues, seem to claim that the change > > > > > > > mentioned in > > > > > > > https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html > > > > > > > would fix the issue, which is as well backchannel related. > > > > > > > > > > > > > > This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, > > > > > > > again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the > > > > > > > nfs_client's rpc timeouts for backchannel") this is not something > > > > > > > which goes back to 6.1.y, could it be possible that hte backchannel > > > > > > > refactoring and this final fix indeeds fixes the issue? > > > > > > > > > > > > > > As people report it is not easily reproducible, so this makes it > > > > > > > harder to identify fixes correctly. > > > > > > > > > > > > > > I gave a (short) stance on trying to backport commits up to > > > > > > > 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly > > > > > > > seems to indicate it is probably still not the right thing for > > > > > > > backporting to the older stable series. > > > > > > > > > > > > > > As at least pre-requisites: > > > > > > > > > > > > > > 2009e32997ed568a305cf9bc7bf27d22e0f6ccda > > > > > > > 4119bd0306652776cb0b7caa3aea5b2a93aecb89 > > > > > > > 163cdfca341b76c958567ae0966bd3575c5c6192 > > > > > > > f4afc8fead386c81fda2593ad6162271d26667f8 > > > > > > > 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc > > > > > > > 57331a59ac0d680f606403eb24edd3c35aecba31 > > > > > > > > > > > > > > and still there would be conflicting codepaths (and does not seem > > > > > > > right). 
> > > > > > > > > > > > > > Chuck, Benjamin, Trond, is there anything we can provive on reporters > > > > > > > side that we can try to tackle this issue better? > > > > > > > > > > > > As I've indicated before, NFSD should not block no matter what the > > > > > > client may or may not be doing. I'd like to focus on the server first. > > > > > > > > > > > > What is the result of: > > > > > > > > > > > > $ cd <Bookworm's v6.1.90 kernel source > > > > > > > $ unset KBUILD_OUTPUT > > > > > > $ make -j `nproc` > > > > > > $ scripts/faddr2line \ > > > > > > fs/nfsd/nfs4state.o \ > > > > > > nfsd4_destroy_session+0x16d > > > > > > > > > > > > Since this issue appeared after v6.1.1, is it possible to bisect > > > > > > between v6.1.1 and v6.1.90 ? > > > > > > > > > > First please note, at least speaking of triggering the issue in > > > > > Debian, Debian has moved to 6.1.119 based kernel already (and soon in > > > > > the weekends point release move to 6.1.123). > > > > > > > > > > That said, one of the users which regularly seems to be hit by the > > > > > issue was able to provide the above requested information, based for > > > > > 6.1.119: > > > > > > > > > > ~/kernel/linux-source-6.1# make kernelversion > > > > > 6.1.119 > > > > > ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d > > > > > nfsd4_destroy_session+0x16d/0x250: > > > > > __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 > > > > > (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 > > > > > (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 > > > > > (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 > > > > > > > > > > They could provide as well a decode_stacktrace output for the recent > > > > > hit (if that is helpful for you): > > > > > > > > > > [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 
seconds. > > > > > [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 > > > > > [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > > > [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 > > > > > [Mon Jan 6 13:25:28 2025] Call Trace: > > > > > [Mon Jan 6 13:25:28 2025] <TASK> > > > > > [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 > > > > > [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 > > > > > [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 > > > > > [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 > > > > > [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 > > > > > [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd > > > > > [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd > > > > > [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd > > > > > [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc > > > > > [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc > > > > > [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd > > > > > [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd > > > > > [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc > > > > > [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd > > > > > [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 > > > > > [Mon Jan 6 13:25:28 2025] ? 
kthread_complete_and_exit+0x20/0x20 > > > > > [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 > > > > > [Mon Jan 6 13:25:28 2025] </TASK> > > > > > > > > > > The question about bisection is actually harder, those are production > > > > > systems and I understand it's not possible to do bisect there, while > > > > > unfortunately not reprodcing the issue on "lab conditions". > > > > > > > > > > Does the above give us still a clue on what you were looking for? > > > > > > > > Thanks for the update. > > > > > > > > It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct > > > > nfs4_client"), while not a specific fix for this issue, might offer some > > > > relief by preventing the DESTROY_SESSION hang from affecting all other > > > > clients of the degraded server. > > > > > > > > Not having a reproducer or the ability to bisect puts a damper on > > > > things. The next step, then, is to enable tracing on servers where this > > > > issue can come up, and wait for the hang to occur. The following command > > > > captures information only about callback operation, so it should not > > > > generate much data or impact server performance. > > > > > > > > # trace-cmd record -e nfsd:nfsd_cb\* > > > > > > > > Let that run until the problem occurs, then ^C, compress the resulting > > > > trace.dat file, and either attach it to 1071562 or email it to me > > > > privately. > > > thanks for the follow-up. > > > > > > I am the "customer" with two affected file servers. We have since moved those > > > servers to the backports kernel (6.11.10) in the hope of forward fixing the > > > issue. If this kernel is stable, I'm afraid I can't go back to the 'bad' > > > kernel (700+ researchers affected..), and we're also not able to trigger the > > > issue on our test environment. > > > > Hello Dr. Herzog - > > > > If the problem recurs on 6.11, the trace-cmd I suggest above works > > there as well. > the bad news is: it just happened again with the bpo kernel. 
>
> We then immediately started trace-cmd since usually there are several call
> traces in a row and we did get a trace.dat:
> http://people.phys.ethz.ch/~daduke/trace.dat
>
> we did notice however that the stack trace looked a bit different this time:
>
> INFO: task nfsd:2566 blocked for more than 5799 seconds.
> Tainted: G W 6.11.10+bpo-amd64 #1 Debia>
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables t>
> task:nfsd state:D stack:0 pid:2566 tgid:2566 >
> Call Trace:
> <TASK>
> __schedule+0x400/0xad0
> schedule+0x27/0xf0
> nfsd4_shutdown_callback+0xfe/0x140 [nfsd]
> ? __pfx_var_wake_function+0x10/0x10
> __destroy_client+0x1f0/0x290 [nfsd]
> nfsd4_destroy_clientid+0xf1/0x1e0 [nfsd]
> ? svcauth_unix_set_client+0x586/0x5f0 [sunrpc]
> nfsd4_proc_compound+0x34d/0x670 [nfsd]
> nfsd_dispatch+0xfd/0x220 [nfsd]
> svc_process_common+0x2f7/0x700 [sunrpc]
> ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
> svc_process+0x131/0x180 [sunrpc]
> svc_recv+0x830/0xa10 [sunrpc]
> ? __pfx_nfsd+0x10/0x10 [nfsd]
> nfsd+0x87/0xf0 [nfsd]
> kthread+0xcf/0x100
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x31/0x50
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> and also the state of the offending client in `/proc/fs/nfsd/clients/*/info`
> used to be callback state: UNKNOWN while now it is DOWN or FAULT. No idea
> what it means, but I thought it was worth mentioning.
>

Looks like this is hung in nfsd41_cb_inflight_wait_complete() ? That
probably means that the cl_cb_inflight counter is stuck at >0. I'm
guessing that means that there is some callback that it's expecting to
complete that isn't. From nfsd4_shutdown_callback():

        /*
         * Note this won't actually result in a null callback;
         * instead, nfsd4_run_cb_null() will detect the killed
         * client, destroy the rpc client, and stop:
         */
        nfsd4_run_cb(&clp->cl_cb_null);
        flush_workqueue(clp->cl_callback_wq);
        nfsd41_cb_inflight_wait_complete(clp);

...it sounds like that isn't happening properly though.
It might be interesting to see if you can track down the callback
client in /sys/kernel/debug/sunrpc and see what it's doing.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-09 12:42 ` Jeff Layton @ 2025-01-09 13:56 ` Chuck Lever 2025-01-09 16:32 ` Chuck Lever 1 sibling, 0 replies; 30+ messages in thread From: Chuck Lever @ 2025-01-09 13:56 UTC (permalink / raw) To: Jeff Layton, Christian Herzog Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Pellegrin Baptiste On 1/9/25 7:42 AM, Jeff Layton wrote: > On Thu, 2025-01-09 at 12:56 +0100, Christian Herzog wrote: >> Dear Chuck, >> >> On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: >>> On 1/8/25 9:54 AM, Christian Herzog wrote: >>>> Dear Chuck, >>>> >>>> On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: >>>>> On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: >>>>>> Hi Chuck, >>>>>> >>>>>> Thanks for your time on this, much appreciated. >>>>>> >>>>>> On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: >>>>>>> On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: >>>>>>>> Hi Chuck, hi all, >>>>>>>> >>>>>>>> [it was not ideal to pick one of the message for this followup, let me >>>>>>>> know if you want a complete new thread, adding as well Benjamin and >>>>>>>> Trond as they are involved in one mentioned patch] >>>>>>>> >>>>>>>> On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: >>>>>>>>>> >>>>>>>>>> Hi folks, >>>>>>>>>> >>>>>>>>>> what would be the reason for nfsd getting stuck somehow and becoming >>>>>>>>>> an unkillable process? See >>>>>>>>>> >>>>>>>>>> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 >>>>>>>>>> - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 >>>>>>>>>> >>>>>>>>>> Doesn't this mean that something inside the kernel gets stuck as >>>>>>>>>> well? Seems odd to me. >>>>>>>>> >>>>>>>>> I'm not familiar with the Debian or Ubuntu kernel packages. 
Can >>>>>>>>> the kernel release numbers be translated to LTS kernel releases >>>>>>>>> please? Need both "last known working" and "first broken" releases. >>>>>>>>> >>>>>>>>> This: >>>>>>>>> >>>>>>>>> [ 6596.911785] RPC: Could not send backchannel reply error: -110 >>>>>>>>> [ 6596.972490] RPC: Could not send backchannel reply error: -110 >>>>>>>>> [ 6837.281307] RPC: Could not send backchannel reply error: -110 >>>>>>>>> >>>>>>>>> is a known set of client backchannel bugs. Knowing the LTS kernel >>>>>>>>> releases (see above) will help us figure out what needs to be >>>>>>>>> backported to the LTS kernels kernels in question. >>>>>>>>> >>>>>>>>> This: >>>>>>>>> >>>>>>>>> [11183.290619] wait_for_completion+0x88/0x150 >>>>>>>>> [11183.290623] __flush_workqueue+0x140/0x3e0 >>>>>>>>> [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >>>>>>>>> [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >>>>>>>>> >>>>>>>>> is probably related to the backchannel errors on the client, but >>>>>>>>> client bugs shouldn't cause the server to hang like this. We >>>>>>>>> might be able to say more if you can provide the kernel release >>>>>>>>> translations (see above). >>>>>>>> >>>>>>>> In Debian we hstill have the bug #1071562 open and one person notified >>>>>>>> mye offlist that it appears that the issue get more frequent since >>>>>>>> they updated on NFS client side from Ubuntu 20.04 to Debian bookworm >>>>>>>> with a 6.1.y based kernel). >>>>>>>> >>>>>>>> Some people around those issues, seem to claim that the change >>>>>>>> mentioned in >>>>>>>> https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html >>>>>>>> would fix the issue, which is as well backchannel related. >>>>>>>> >>>>>>>> This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, >>>>>>>> again"). 
While this commit fixes 57331a59ac0d ("NFSv4.1: Use the >>>>>>>> nfs_client's rpc timeouts for backchannel") this is not something >>>>>>>> which goes back to 6.1.y, could it be possible that hte backchannel >>>>>>>> refactoring and this final fix indeeds fixes the issue? >>>>>>>> >>>>>>>> As people report it is not easily reproducible, so this makes it >>>>>>>> harder to identify fixes correctly. >>>>>>>> >>>>>>>> I gave a (short) stance on trying to backport commits up to >>>>>>>> 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly >>>>>>>> seems to indicate it is probably still not the right thing for >>>>>>>> backporting to the older stable series. >>>>>>>> >>>>>>>> As at least pre-requisites: >>>>>>>> >>>>>>>> 2009e32997ed568a305cf9bc7bf27d22e0f6ccda >>>>>>>> 4119bd0306652776cb0b7caa3aea5b2a93aecb89 >>>>>>>> 163cdfca341b76c958567ae0966bd3575c5c6192 >>>>>>>> f4afc8fead386c81fda2593ad6162271d26667f8 >>>>>>>> 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc >>>>>>>> 57331a59ac0d680f606403eb24edd3c35aecba31 >>>>>>>> >>>>>>>> and still there would be conflicting codepaths (and does not seem >>>>>>>> right). >>>>>>>> >>>>>>>> Chuck, Benjamin, Trond, is there anything we can provive on reporters >>>>>>>> side that we can try to tackle this issue better? >>>>>>> >>>>>>> As I've indicated before, NFSD should not block no matter what the >>>>>>> client may or may not be doing. I'd like to focus on the server first. >>>>>>> >>>>>>> What is the result of: >>>>>>> >>>>>>> $ cd <Bookworm's v6.1.90 kernel source > >>>>>>> $ unset KBUILD_OUTPUT >>>>>>> $ make -j `nproc` >>>>>>> $ scripts/faddr2line \ >>>>>>> fs/nfsd/nfs4state.o \ >>>>>>> nfsd4_destroy_session+0x16d >>>>>>> >>>>>>> Since this issue appeared after v6.1.1, is it possible to bisect >>>>>>> between v6.1.1 and v6.1.90 ? 
>>>>>> >>>>>> First please note, at least speaking of triggering the issue in >>>>>> Debian, Debian has moved to 6.1.119 based kernel already (and soon in >>>>>> the weekends point release move to 6.1.123). >>>>>> >>>>>> That said, one of the users which regularly seems to be hit by the >>>>>> issue was able to provide the above requested information, based for >>>>>> 6.1.119: >>>>>> >>>>>> ~/kernel/linux-source-6.1# make kernelversion >>>>>> 6.1.119 >>>>>> ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d >>>>>> nfsd4_destroy_session+0x16d/0x250: >>>>>> __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 >>>>>> (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 >>>>>> (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 >>>>>> (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 >>>>>> >>>>>> They could provide as well a decode_stacktrace output for the recent >>>>>> hit (if that is helpful for you): >>>>>> >>>>>> [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds. >>>>>> [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 >>>>>> [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
>>>>>> [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 >>>>>> [Mon Jan 6 13:25:28 2025] Call Trace: >>>>>> [Mon Jan 6 13:25:28 2025] <TASK> >>>>>> [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 >>>>>> [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 >>>>>> [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 >>>>>> [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 >>>>>> [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 >>>>>> [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc >>>>>> [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc >>>>>> [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc >>>>>> [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 >>>>>> [Mon Jan 6 13:25:28 2025] ? 
kthread_complete_and_exit+0x20/0x20 >>>>>> [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 >>>>>> [Mon Jan 6 13:25:28 2025] </TASK> >>>>>> >>>>>> The question about bisection is actually harder, those are production >>>>>> systems and I understand it's not possible to do bisect there, while >>>>>> unfortunately not reprodcing the issue on "lab conditions". >>>>>> >>>>>> Does the above give us still a clue on what you were looking for? >>>>> >>>>> Thanks for the update. >>>>> >>>>> It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct >>>>> nfs4_client"), while not a specific fix for this issue, might offer some >>>>> relief by preventing the DESTROY_SESSION hang from affecting all other >>>>> clients of the degraded server. >>>>> >>>>> Not having a reproducer or the ability to bisect puts a damper on >>>>> things. The next step, then, is to enable tracing on servers where this >>>>> issue can come up, and wait for the hang to occur. The following command >>>>> captures information only about callback operation, so it should not >>>>> generate much data or impact server performance. >>>>> >>>>> # trace-cmd record -e nfsd:nfsd_cb\* >>>>> >>>>> Let that run until the problem occurs, then ^C, compress the resulting >>>>> trace.dat file, and either attach it to 1071562 or email it to me >>>>> privately. >>>> thanks for the follow-up. >>>> >>>> I am the "customer" with two affected file servers. We have since moved those >>>> servers to the backports kernel (6.11.10) in the hope of forward fixing the >>>> issue. If this kernel is stable, I'm afraid I can't go back to the 'bad' >>>> kernel (700+ researchers affected..), and we're also not able to trigger the >>>> issue on our test environment. >>> >>> Hello Dr. Herzog - >>> >>> If the problem recurs on 6.11, the trace-cmd I suggest above works >>> there as well. >> the bad news is: it just happened again with the bpo kernel. 
>> >> We then immediately started trace-cmd since usually there are several call >> traces in a row and we did get a trace.dat: >> http://people.phys.ethz.ch/~daduke/trace.dat >> >> we did notice however that the stack trace looked a bit different this time: >> >> INFO: task nfsd:2566 blocked for more than 5799 seconds. >> Tainted: G W 6.11.10+bpo-amd64 #1 Debia> >> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables t> >> task:nfsd state:D stack:0 pid:2566 tgid:2566 > >> Call Trace: >> <TASK> >> __schedule+0x400/0xad0 >> schedule+0x27/0xf0 >> nfsd4_shutdown_callback+0xfe/0x140 [nfsd] >> ? __pfx_var_wake_function+0x10/0x10 >> __destroy_client+0x1f0/0x290 [nfsd] >> nfsd4_destroy_clientid+0xf1/0x1e0 [nfsd] >> ? svcauth_unix_set_client+0x586/0x5f0 [sunrpc] >> nfsd4_proc_compound+0x34d/0x670 [nfsd] >> nfsd_dispatch+0xfd/0x220 [nfsd] >> svc_process_common+0x2f7/0x700 [sunrpc] >> ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd] >> svc_process+0x131/0x180 [sunrpc] >> svc_recv+0x830/0xa10 [sunrpc] >> ? __pfx_nfsd+0x10/0x10 [nfsd] >> nfsd+0x87/0xf0 [nfsd] >> kthread+0xcf/0x100 >> ? __pfx_kthread+0x10/0x10 >> ret_from_fork+0x31/0x50 >> ? __pfx_kthread+0x10/0x10 >> ret_from_fork_asm+0x1a/0x30 >> </TASK> >> >> and also the state of the offending client in `/proc/fs/nfsd/clients/*/info` >> used to be callback state: UNKNOWN while now it is DOWN or FAULT. No idea >> what it means, but I thought it was worth mentioning. >> > > Looks like this is hung in nfsd41_cb_inflight_wait_complete() ? That > probably means that the cl_cb_inflight counter is stuck at >0. I'm > guessing that means that there is some callback that it's expecting to > complete that isn't. 
> From nfsd4_shutdown_callback():
>
>         /*
>          * Note this won't actually result in a null callback;
>          * instead, nfsd4_run_cb_null() will detect the killed
>          * client, destroy the rpc client, and stop:
>          */
>         nfsd4_run_cb(&clp->cl_cb_null);
>         flush_workqueue(clp->cl_callback_wq);
>         nfsd41_cb_inflight_wait_complete(clp);
>
> ...it sounds like that isn't happening properly though.
>
> It might be interesting to see if you can track down the callback
> client in /sys/kernel/debug/sunrpc and see what it's doing.

Could be that the RPC client was shutdown while there were outstanding
callbacks, and that means that counter can now never go to zero?

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-09 12:42 ` Jeff Layton 2025-01-09 13:56 ` Chuck Lever @ 2025-01-09 16:32 ` Chuck Lever [not found] ` <f0705a65549ef253.67823675@ac-grenoble.fr> 1 sibling, 1 reply; 30+ messages in thread From: Chuck Lever @ 2025-01-09 16:32 UTC (permalink / raw) To: Jeff Layton Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Pellegrin Baptiste, Christian Herzog On 1/9/25 7:42 AM, Jeff Layton wrote: > On Thu, 2025-01-09 at 12:56 +0100, Christian Herzog wrote: >> Dear Chuck, >> >> On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: >>> On 1/8/25 9:54 AM, Christian Herzog wrote: >>>> Dear Chuck, >>>> >>>> On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: >>>>> On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: >>>>>> Hi Chuck, >>>>>> >>>>>> Thanks for your time on this, much appreciated. >>>>>> >>>>>> On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: >>>>>>> On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: >>>>>>>> Hi Chuck, hi all, >>>>>>>> >>>>>>>> [it was not ideal to pick one of the message for this followup, let me >>>>>>>> know if you want a complete new thread, adding as well Benjamin and >>>>>>>> Trond as they are involved in one mentioned patch] >>>>>>>> >>>>>>>> On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: >>>>>>>>>> >>>>>>>>>> Hi folks, >>>>>>>>>> >>>>>>>>>> what would be the reason for nfsd getting stuck somehow and becoming >>>>>>>>>> an unkillable process? See >>>>>>>>>> >>>>>>>>>> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 >>>>>>>>>> - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 >>>>>>>>>> >>>>>>>>>> Doesn't this mean that something inside the kernel gets stuck as >>>>>>>>>> well? Seems odd to me. 
>>>>>>>>> >>>>>>>>> I'm not familiar with the Debian or Ubuntu kernel packages. Can >>>>>>>>> the kernel release numbers be translated to LTS kernel releases >>>>>>>>> please? Need both "last known working" and "first broken" releases. >>>>>>>>> >>>>>>>>> This: >>>>>>>>> >>>>>>>>> [ 6596.911785] RPC: Could not send backchannel reply error: -110 >>>>>>>>> [ 6596.972490] RPC: Could not send backchannel reply error: -110 >>>>>>>>> [ 6837.281307] RPC: Could not send backchannel reply error: -110 >>>>>>>>> >>>>>>>>> is a known set of client backchannel bugs. Knowing the LTS kernel >>>>>>>>> releases (see above) will help us figure out what needs to be >>>>>>>>> backported to the LTS kernels kernels in question. >>>>>>>>> >>>>>>>>> This: >>>>>>>>> >>>>>>>>> [11183.290619] wait_for_completion+0x88/0x150 >>>>>>>>> [11183.290623] __flush_workqueue+0x140/0x3e0 >>>>>>>>> [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >>>>>>>>> [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >>>>>>>>> >>>>>>>>> is probably related to the backchannel errors on the client, but >>>>>>>>> client bugs shouldn't cause the server to hang like this. We >>>>>>>>> might be able to say more if you can provide the kernel release >>>>>>>>> translations (see above). >>>>>>>> >>>>>>>> In Debian we hstill have the bug #1071562 open and one person notified >>>>>>>> mye offlist that it appears that the issue get more frequent since >>>>>>>> they updated on NFS client side from Ubuntu 20.04 to Debian bookworm >>>>>>>> with a 6.1.y based kernel). >>>>>>>> >>>>>>>> Some people around those issues, seem to claim that the change >>>>>>>> mentioned in >>>>>>>> https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html >>>>>>>> would fix the issue, which is as well backchannel related. >>>>>>>> >>>>>>>> This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, >>>>>>>> again"). 
While this commit fixes 57331a59ac0d ("NFSv4.1: Use the >>>>>>>> nfs_client's rpc timeouts for backchannel") this is not something >>>>>>>> which goes back to 6.1.y, could it be possible that hte backchannel >>>>>>>> refactoring and this final fix indeeds fixes the issue? >>>>>>>> >>>>>>>> As people report it is not easily reproducible, so this makes it >>>>>>>> harder to identify fixes correctly. >>>>>>>> >>>>>>>> I gave a (short) stance on trying to backport commits up to >>>>>>>> 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly >>>>>>>> seems to indicate it is probably still not the right thing for >>>>>>>> backporting to the older stable series. >>>>>>>> >>>>>>>> As at least pre-requisites: >>>>>>>> >>>>>>>> 2009e32997ed568a305cf9bc7bf27d22e0f6ccda >>>>>>>> 4119bd0306652776cb0b7caa3aea5b2a93aecb89 >>>>>>>> 163cdfca341b76c958567ae0966bd3575c5c6192 >>>>>>>> f4afc8fead386c81fda2593ad6162271d26667f8 >>>>>>>> 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc >>>>>>>> 57331a59ac0d680f606403eb24edd3c35aecba31 >>>>>>>> >>>>>>>> and still there would be conflicting codepaths (and does not seem >>>>>>>> right). >>>>>>>> >>>>>>>> Chuck, Benjamin, Trond, is there anything we can provive on reporters >>>>>>>> side that we can try to tackle this issue better? >>>>>>> >>>>>>> As I've indicated before, NFSD should not block no matter what the >>>>>>> client may or may not be doing. I'd like to focus on the server first. >>>>>>> >>>>>>> What is the result of: >>>>>>> >>>>>>> $ cd <Bookworm's v6.1.90 kernel source > >>>>>>> $ unset KBUILD_OUTPUT >>>>>>> $ make -j `nproc` >>>>>>> $ scripts/faddr2line \ >>>>>>> fs/nfsd/nfs4state.o \ >>>>>>> nfsd4_destroy_session+0x16d >>>>>>> >>>>>>> Since this issue appeared after v6.1.1, is it possible to bisect >>>>>>> between v6.1.1 and v6.1.90 ? 
>>>>>> >>>>>> First please note, at least speaking of triggering the issue in >>>>>> Debian, Debian has moved to 6.1.119 based kernel already (and soon in >>>>>> the weekends point release move to 6.1.123). >>>>>> >>>>>> That said, one of the users which regularly seems to be hit by the >>>>>> issue was able to provide the above requested information, based for >>>>>> 6.1.119: >>>>>> >>>>>> ~/kernel/linux-source-6.1# make kernelversion >>>>>> 6.1.119 >>>>>> ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d >>>>>> nfsd4_destroy_session+0x16d/0x250: >>>>>> __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 >>>>>> (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 >>>>>> (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 >>>>>> (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 >>>>>> >>>>>> They could provide as well a decode_stacktrace output for the recent >>>>>> hit (if that is helpful for you): >>>>>> >>>>>> [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds. >>>>>> [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 >>>>>> [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
>>>>>> [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 >>>>>> [Mon Jan 6 13:25:28 2025] Call Trace: >>>>>> [Mon Jan 6 13:25:28 2025] <TASK> >>>>>> [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 >>>>>> [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 >>>>>> [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 >>>>>> [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 >>>>>> [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 >>>>>> [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc >>>>>> [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc >>>>>> [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc >>>>>> [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd >>>>>> [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 >>>>>> [Mon Jan 6 13:25:28 2025] ? 
kthread_complete_and_exit+0x20/0x20 >>>>>> [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 >>>>>> [Mon Jan 6 13:25:28 2025] </TASK> >>>>>> >>>>>> The question about bisection is actually harder, those are production >>>>>> systems and I understand it's not possible to do bisect there, while >>>>>> unfortunately not reproducing the issue on "lab conditions". >>>>>> >>>>>> Does the above give us still a clue on what you were looking for? >>>>> >>>>> Thanks for the update. >>>>> >>>>> It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct >>>>> nfs4_client"), while not a specific fix for this issue, might offer some >>>>> relief by preventing the DESTROY_SESSION hang from affecting all other >>>>> clients of the degraded server. >>>>> >>>>> Not having a reproducer or the ability to bisect puts a damper on >>>>> things. The next step, then, is to enable tracing on servers where this >>>>> issue can come up, and wait for the hang to occur. The following command >>>>> captures information only about callback operation, so it should not >>>>> generate much data or impact server performance. >>>>> >>>>> # trace-cmd record -e nfsd:nfsd_cb\* >>>>> >>>>> Let that run until the problem occurs, then ^C, compress the resulting >>>>> trace.dat file, and either attach it to 1071562 or email it to me >>>>> privately. >>>> thanks for the follow-up. >>>> >>>> I am the "customer" with two affected file servers. We have since moved those >>>> servers to the backports kernel (6.11.10) in the hope of forward fixing the >>>> issue. If this kernel is stable, I'm afraid I can't go back to the 'bad' >>>> kernel (700+ researchers affected..), and we're also not able to trigger the >>>> issue on our test environment. >>> >>> Hello Dr. Herzog - >>> >>> If the problem recurs on 6.11, the trace-cmd I suggest above works >>> there as well. >> the bad news is: it just happened again with the bpo kernel. 
>> >> We then immediately started trace-cmd since usually there are several call >> traces in a row and we did get a trace.dat: >> http://people.phys.ethz.ch/~daduke/trace.dat >> >> we did notice however that the stack trace looked a bit different this time: >> >> INFO: task nfsd:2566 blocked for more than 5799 seconds. >> Tainted: G W 6.11.10+bpo-amd64 #1 Debia> >> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables t> >> task:nfsd state:D stack:0 pid:2566 tgid:2566 > >> Call Trace: >> <TASK> >> __schedule+0x400/0xad0 >> schedule+0x27/0xf0 >> nfsd4_shutdown_callback+0xfe/0x140 [nfsd] >> ? __pfx_var_wake_function+0x10/0x10 >> __destroy_client+0x1f0/0x290 [nfsd] >> nfsd4_destroy_clientid+0xf1/0x1e0 [nfsd] >> ? svcauth_unix_set_client+0x586/0x5f0 [sunrpc] >> nfsd4_proc_compound+0x34d/0x670 [nfsd] >> nfsd_dispatch+0xfd/0x220 [nfsd] >> svc_process_common+0x2f7/0x700 [sunrpc] >> ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd] >> svc_process+0x131/0x180 [sunrpc] >> svc_recv+0x830/0xa10 [sunrpc] >> ? __pfx_nfsd+0x10/0x10 [nfsd] >> nfsd+0x87/0xf0 [nfsd] >> kthread+0xcf/0x100 >> ? __pfx_kthread+0x10/0x10 >> ret_from_fork+0x31/0x50 >> ? __pfx_kthread+0x10/0x10 >> ret_from_fork_asm+0x1a/0x30 >> </TASK> >> >> and also the state of the offending client in `/proc/fs/nfsd/clients/*/info` >> used to be callback state: UNKNOWN while now it is DOWN or FAULT. No idea >> what it means, but I thought it was worth mentioning. >> > > Looks like this is hung in nfsd41_cb_inflight_wait_complete() ? That > probably means that the cl_cb_inflight counter is stuck at >0. I'm > guessing that means that there is some callback that it's expecting to > complete that isn't. 
From nfsd4_shutdown_callback(): > > /* > * Note this won't actually result in a null callback; > * instead, nfsd4_run_cb_null() will detect the killed > * client, destroy the rpc client, and stop: > */ > nfsd4_run_cb(&clp->cl_cb_null); > flush_workqueue(clp->cl_callback_wq); > nfsd41_cb_inflight_wait_complete(clp); > > ...it sounds like that isn't happening properly though. > > It might be interesting to see if you can track down the callback > client in /sys/kernel/debug/sunrpc and see what it's doing. Here's a possible scenario: The trace.dat shows the server is sending a lot of CB_RECALL_ANY operations. Normally a callback occurs only due to some particular client request. Such requests would be completed before a client unmounts. CB_RECALL_ANY is an operation which does not occur due to a particular client operation and can occur at any time. I believe this is the first operation of this type we've added to NFSD. My guess is that the server's laundromat has sent some CB_RECALL_ANY operations while the client is umounting -- DESTROY_SESSION is racing with those callback operations. deleg_reaper() was backported to 6.1.y in 6.1.81, which lines up more or less with when the issues started to show up. A quick way to test this theory would be to make deleg_reaper() a no-op. If this helps, then we know that shutting down the server's callback client needs to be more careful about cleaning up outstanding callbacks. -- Chuck Lever ^ permalink raw reply [flat|nested] 30+ messages in thread
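A quick way to sketch Chuck's suggested deleg_reaper() no-op experiment is an early return at the top of the function. The hunk below is illustrative only, not an applyable patch -- the surrounding context in fs/nfsd/nfs4state.c differs between stable series, and the `return` may need to sit just after the local variable declarations to keep -Wdeclaration-after-statement quiet:

```diff
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ (first statement inside deleg_reaper())
+	/* XXX: temporarily disable CB_RECALL_ANY generation to test
+	 * whether it races with DESTROY_SESSION / __destroy_client */
+	return;
```

While waiting for a recurrence, the callback RPC client can also be watched along the lines Jeff suggested, e.g. by reading the `tasks` files under /sys/kernel/debug/sunrpc/rpc_clnt/ to see whether a callback task is stuck.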
[parent not found: <f0705a65549ef253.67823675@ac-grenoble.fr>]
* Re: nfsd blocks indefinitely in nfsd4_destroy_session [not found] ` <f0705a65549ef253.67823675@ac-grenoble.fr> @ 2025-01-16 20:07 ` Chuck Lever 2025-01-17 19:43 ` Baptiste PELLEGRIN 0 siblings, 1 reply; 30+ messages in thread From: Chuck Lever @ 2025-01-16 20:07 UTC (permalink / raw) To: Pellegrin Baptiste, Rik Theys, Christian Herzog Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Jeff Layton On 1/11/25 3:14 AM, Pellegrin Baptiste wrote: > On 09/01/25 17:32, *Chuck Lever * <chuck.lever@oracle.com> wrote: >> On 1/9/25 7:42 AM, Jeff Layton wrote: >> >On Thu, 2025-01-09 at 12:56 +0100, Christian Herzog wrote: >> >>Dear Chuck, >> >> >> >>On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: >> >>>On 1/8/25 9:54 AM, Christian Herzog wrote: >> >>>>Dear Chuck, >> >>>> >> >>>>On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: >> >>>>>On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: >> >>>>>>Hi Chuck, >> >>>>>> >> >>>>>>Thanks for your time on this, much appreciated. >> >>>>>> >> >>>>>>On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: >> >>>>>>>On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: >> >>>>>>>>Hi Chuck, hi all, >> >>>>>>>> >> >>>>>>>>[it was not ideal to pick one of the message for this >> followup, let me >> >>>>>>>>know if you want a complete new thread, adding as well >> Benjamin and >> >>>>>>>>Trond as they are involved in one mentioned patch] >> >>>>>>>> >> >>>>>>>>On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>>>On Jun 17, 2024, at 2:55 AM, Harald Dunkel >> <harald.dunkel@aixigo.com> wrote: >> >>>>>>>>>> >> >>>>>>>>>>Hi folks, >> >>>>>>>>>> >> >>>>>>>>>>what would be the reason for nfsd getting stuck somehow and >> becoming >> >>>>>>>>>>an unkillable process? 
See >> >>>>>>>>>> >> >>>>>>>>>>- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 >> <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562> >> >>>>>>>>>>- https://bugs.launchpad.net/ubuntu/+source/nfs-utils/ >> +bug/2062568 <https://bugs.launchpad.net/ubuntu/+source/nfs-utils/ >> +bug/2062568> >> >>>>>>>>>> >> >>>>>>>>>>Doesn't this mean that something inside the kernel gets stuck as >> >>>>>>>>>>well? Seems odd to me. >> >>>>>>>>> >> >>>>>>>>>I'm not familiar with the Debian or Ubuntu kernel packages. Can >> >>>>>>>>>the kernel release numbers be translated to LTS kernel releases >> >>>>>>>>>please? Need both "last known working" and "first broken" >> releases. >> >>>>>>>>> >> >>>>>>>>>This: >> >>>>>>>>> >> >>>>>>>>>[ 6596.911785] RPC: Could not send backchannel reply error: -110 >> >>>>>>>>>[ 6596.972490] RPC: Could not send backchannel reply error: -110 >> >>>>>>>>>[ 6837.281307] RPC: Could not send backchannel reply error: -110 >> >>>>>>>>> >> >>>>>>>>>is a known set of client backchannel bugs. Knowing the LTS kernel >> >>>>>>>>>releases (see above) will help us figure out what needs to be >> >>>>>>>>>backported to the LTS kernels kernels in question. >> >>>>>>>>> >> >>>>>>>>>This: >> >>>>>>>>> >> >>>>>>>>>[11183.290619] wait_for_completion+0x88/0x150 >> >>>>>>>>>[11183.290623] __flush_workqueue+0x140/0x3e0 >> >>>>>>>>>[11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >> >>>>>>>>>[11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >> >>>>>>>>> >> >>>>>>>>>is probably related to the backchannel errors on the client, but >> >>>>>>>>>client bugs shouldn't cause the server to hang like this. We >> >>>>>>>>>might be able to say more if you can provide the kernel release >> >>>>>>>>>translations (see above). 
>> >>>>>>>> >> >>>>>>>>In Debian we still have the bug #1071562 open and one person >> notified >> >>>>>>>>me offlist that it appears that the issue gets more frequent since >> >>>>>>>>they updated on NFS client side from Ubuntu 20.04 to Debian >> bookworm >> >>>>>>>>with a 6.1.y based kernel). >> >>>>>>>> >> >>>>>>>>Some people around those issues, seem to claim that the change >> >>>>>>>>mentioned in >> >>>>>>>>https://lists.proxmox.com/pipermail/pve-devel/2024- >> July/064614.html <https://lists.proxmox.com/pipermail/pve-devel/2024- >> July/064614.html> >> >>>>>>>>would fix the issue, which is as well backchannel related. >> >>>>>>>> >> >>>>>>>>This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, >> >>>>>>>>again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the >> >>>>>>>>nfs_client's rpc timeouts for backchannel") this is not something >> >>>>>>>>which goes back to 6.1.y, could it be possible that the >> backchannel >> >>>>>>>>refactoring and this final fix indeed fixes the issue? >> >>>>>>>> >> >>>>>>>>As people report it is not easily reproducible, so this makes it >> >>>>>>>>harder to identify fixes correctly. >> >>>>>>>> >> >>>>>>>>I gave a (short) stance on trying to backport commits up to >> >>>>>>>>6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly >> >>>>>>>>seems to indicate it is probably still not the right thing for >> >>>>>>>>backporting to the older stable series. >> >>>>>>>> >> >>>>>>>>As at least pre-requisites: >> >>>>>>>> >> >>>>>>>>2009e32997ed568a305cf9bc7bf27d22e0f6ccda >> >>>>>>>>4119bd0306652776cb0b7caa3aea5b2a93aecb89 >> >>>>>>>>163cdfca341b76c958567ae0966bd3575c5c6192 >> >>>>>>>>f4afc8fead386c81fda2593ad6162271d26667f8 >> >>>>>>>>6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc >> >>>>>>>>57331a59ac0d680f606403eb24edd3c35aecba31 >> >>>>>>>> >> >>>>>>>>and still there would be conflicting codepaths (and does not seem >> >>>>>>>>right). 
>> >>>>>>>> >> >>>>>>>>Chuck, Benjamin, Trond, is there anything we can provide on >> reporters >> >>>>>>>>side that we can try to tackle this issue better? >> >>>>>>> >> >>>>>>>As I've indicated before, NFSD should not block no matter what the >> >>>>>>>client may or may not be doing. I'd like to focus on the server >> first. >> >>>>>>> >> >>>>>>>What is the result of: >> >>>>>>> >> >>>>>>>$ cd <Bookworm's v6.1.90 kernel source > >> >>>>>>>$ unset KBUILD_OUTPUT >> >>>>>>>$ make -j `nproc` >> >>>>>>>$ scripts/faddr2line \ >> >>>>>>> fs/nfsd/nfs4state.o \ >> >>>>>>> nfsd4_destroy_session+0x16d >> >>>>>>> >> >>>>>>>Since this issue appeared after v6.1.1, is it possible to bisect >> >>>>>>>between v6.1.1 and v6.1.90 ? >> >>>>>> >> >>>>>>First please note, at least speaking of triggering the issue in >> >>>>>>Debian, Debian has moved to 6.1.119 based kernel already (and >> soon in >> >>>>>>the weekends point release move to 6.1.123). >> >>>>>> >> >>>>>>That said, one of the users which regularly seems to be hit by the >> >>>>>>issue was able to provide the above requested information, based for >> >>>>>>6.1.119: >> >>>>>> >> >>>>>>~/kernel/linux-source-6.1# make kernelversion >> >>>>>>6.1.119 >> >>>>>>~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/ >> nfs4state.o nfsd4_destroy_session+0x16d >> >>>>>>nfsd4_destroy_session+0x16d/0x250: >> >>>>>>__list_del_entry at /root/kernel/linux-source-6.1/./include/ >> linux/list.h:134 >> >>>>>>(inlined by) list_del at /root/kernel/linux-source-6.1/./ >> include/linux/list.h:148 >> >>>>>>(inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/ >> nfsd/nfs4state.c:2062 >> >>>>>>(inlined by) nfsd4_destroy_session at /root/kernel/linux- >> source-6.1/fs/nfsd/nfs4state.c:3856 >> >>>>>> >> >>>>>>They could provide as well a decode_stacktrace output for the recent >> >>>>>>hit (if that is helpful for you): >> >>>>>> >> >>>>>>[Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more >> than 6883 seconds. 
>> >>>>>>[Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 >> Debian 6.1.119-1 >> >>>>>>[Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/ >> hung_task_timeout_secs" disables this message. >> >>>>>>[Mon Jan 6 13:25:28 2025] task:nfsd state:D >> stack:0 pid:55306 ppid:2 flags:0x00004000 >> >>>>>>[Mon Jan 6 13:25:28 2025] Call Trace: >> >>>>>>[Mon Jan 6 13:25:28 2025] <TASK> >> >>>>>>[Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 >> >>>>>>[Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 >> >>>>>>[Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 >> >>>>>>[Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 >> >>>>>>[Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 >> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/ >> build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/ >> build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd >> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/ >> build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd >> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/ >> build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd >> >>>>>>[Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/ >> build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc >> >>>>>>[Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/ >> build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc >> >>>>>>[Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/ >> build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd >> >>>>>>[Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/ >> build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd >> >>>>>>[Mon Jan 6 13:25:28 2025] svc_process (debian/build/ >> build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc >> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd (debian/build/ >> build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd >> >>>>>>[Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 >> >>>>>>[Mon Jan 6 13:25:28 2025] ? 
kthread_complete_and_exit+0x20/0x20 >> >>>>>>[Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 >> >>>>>>[Mon Jan 6 13:25:28 2025] </TASK> >> >>>>>> >> >>>>>>The question about bisection is actually harder, those are >> production >> >>>>>>systems and I understand it's not possible to do bisect there, while >> >>>>>>unfortunately not reprodcing the issue on "lab conditions". >> >>>>>> >> >>>>>>Does the above give us still a clue on what you were looking for? >> >>>>> >> >>>>>Thanks for the update. >> >>>>> >> >>>>>It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct >> >>>>>nfs4_client"), while not a specific fix for this issue, might >> offer some >> >>>>>relief by preventing the DESTROY_SESSION hang from affecting all >> other >> >>>>>clients of the degraded server. >> >>>>> >> >>>>>Not having a reproducer or the ability to bisect puts a damper on >> >>>>>things. The next step, then, is to enable tracing on servers >> where this >> >>>>>issue can come up, and wait for the hang to occur. The following >> command >> >>>>>captures information only about callback operation, so it should not >> >>>>>generate much data or impact server performance. >> >>>>> >> >>>>> # trace-cmd record -e nfsd:nfsd_cb\* >> >>>>> >> >>>>>Let that run until the problem occurs, then ^C, compress the >> resulting >> >>>>>trace.dat file, and either attach it to 1071562 or email it to me >> >>>>>privately. >> >>>>thanks for the follow-up. >> >>>> >> >>>>I am the "customer" with two affected file servers. We have since >> moved those >> >>>>servers to the backports kernel (6.11.10) in the hope of forward >> fixing the >> >>>>issue. If this kernel is stable, I'm afraid I can't go back to the >> 'bad' >> >>>>kernel (700+ researchers affected..), and we're also not able to >> trigger the >> >>>>issue on our test environment. >> >>> >> >>>Hello Dr. Herzog - >> >>> >> >>>If the problem recurs on 6.11, the trace-cmd I suggest above works >> >>>there as well. 
>> >>the bad news is: it just happened again with the bpo kernel. >> >> >> >>We then immediately started trace-cmd since usually there are >> several call >> >>traces in a row and we did get a trace.dat: >> >>http://people.phys.ethz.ch/~daduke/trace.dat <http:// >> people.phys.ethz.ch/~daduke/trace.dat> >> >> >> >>we did notice however that the stack trace looked a bit different >> this time: >> >> >> >> INFO: task nfsd:2566 blocked for more than 5799 seconds. >> >> Tainted: G W 6.11.10+bpo-amd64 #1 Debia> >> >> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables t> >> >> task:nfsd state:D stack:0 pid:2566 tgid:2566 > >> >> Call Trace: >> >> <TASK> >> >> __schedule+0x400/0xad0 >> >> schedule+0x27/0xf0 >> >> nfsd4_shutdown_callback+0xfe/0x140 [nfsd] >> >> ? __pfx_var_wake_function+0x10/0x10 >> >> __destroy_client+0x1f0/0x290 [nfsd] >> >> nfsd4_destroy_clientid+0xf1/0x1e0 [nfsd] >> >> ? svcauth_unix_set_client+0x586/0x5f0 [sunrpc] >> >> nfsd4_proc_compound+0x34d/0x670 [nfsd] >> >> nfsd_dispatch+0xfd/0x220 [nfsd] >> >> svc_process_common+0x2f7/0x700 [sunrpc] >> >> ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd] >> >> svc_process+0x131/0x180 [sunrpc] >> >> svc_recv+0x830/0xa10 [sunrpc] >> >> ? __pfx_nfsd+0x10/0x10 [nfsd] >> >> nfsd+0x87/0xf0 [nfsd] >> >> kthread+0xcf/0x100 >> >> ? __pfx_kthread+0x10/0x10 >> >> ret_from_fork+0x31/0x50 >> >> ? __pfx_kthread+0x10/0x10 >> >> ret_from_fork_asm+0x1a/0x30 >> >> </TASK> >> >> >> >>and also the state of the offending client in `/proc/fs/nfsd/ >> clients/*/info` >> >>used to be callback state: UNKNOWN while now it is DOWN or FAULT. No >> idea >> >>what it means, but I thought it was worth mentioning. >> >> >> > >> >Looks like this is hung in nfsd41_cb_inflight_wait_complete() ? That >> >probably means that the cl_cb_inflight counter is stuck at >0. I'm >> >guessing that means that there is some callback that it's expecting to >> >complete that isn't. 
From nfsd4_shutdown_callback(): >> > >> > /* >> > * Note this won't actually result in a null callback; >> > * instead, nfsd4_run_cb_null() will detect the killed >> > * client, destroy the rpc client, and stop: >> > */ >> > nfsd4_run_cb(&clp->cl_cb_null); >> > flush_workqueue(clp->cl_callback_wq); >> > nfsd41_cb_inflight_wait_complete(clp); >> > >> >...it sounds like that isn't happening properly though. >> > >> >It might be interesting to see if you can track down the callback >> >client in /sys/kernel/debug/sunrpc and see what it's doing. >> >> Here's a possible scenario: >> >> The trace.dat shows the server is sending a lot of CB_RECALL_ANY >> operations. >> >> Normally a callback occurs only due to some particular client request. >> Such requests would be completed before a client unmounts. >> >> CB_RECALL_ANY is an operation which does not occur due to a particular >> client operation and can occur at any time. I believe this is the >> first operation of this type we've added to NFSD. >> >> My guess is that the server's laundromat has sent some CB_RECALL_ANY >> operations while the client is umounting -- DESTROY_SESSION is >> racing with those callback operations. >> >> deleg_reaper() was backported to 6.1.y in 6.1.81, which lines up >> more or less with when the issues started to show up. >> >> A quick way to test this theory would be to make deleg_reaper() a >> no-op. If this helps, then we know that shutting down the server's >> callback client needs to be more careful about cleaning up outstanding >> callbacks. >> >> >> -- >> Chuck Lever > > Hello. > > Thanks a lot for all of this work ! > > I have started recording traces using : > > # trace-cmd record -e nfsd:nfsd_cb\* > > and currently waiting for the bug to trigger. This generally happen in > one/two weeks for me. > > If you want I adjust some recording parameters let me know. After looking at your data and Rik's data, I decided to try reproducing the problem in my lab. No luck so far. 
The signatures of these issues are slightly different from each other:

v6.1:
- nfsd4_destroy_session hangs on the flush_workqueue in nfsd4_shutdown_callback()
- There might be outstanding callback RPCs that are not exiting
- Lots of CB_RECALL_ANY traffic. Only place that's generated in v6.1 is the NFSD shrinker when memory is short
- There appear to be some unreachable clients -- evidence of client crashes or network outages, perhaps

v6.12:
- destroy_client() hangs on the wait_var_event in nfsd4_shutdown_callback()
- There are no callback RPCs left after the hang
- There are some stuck clients listed under /proc/fs/nfsd/

Similar, but not the same. I'm still open to clues, so please continue the regimen of starting the trace-cmd when the server boots, and doing an "echo t > /proc/sysrq-trigger" when the hang appears.

 # trace-cmd record -e nfsd:nfsd_cb\* -e sunrpc:svc_xprt_detach \
     -p function -l nfsd4_destroy_session

-- Chuck Lever ^ permalink raw reply [flat|nested] 30+ messages in thread
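To keep that trace running from every boot, one option is a small systemd unit wrapping the command above. The unit name and output path below are made up for illustration; note that systemd does not pass ExecStart through a shell, so the `*` in the event glob needs no backslash escaping:

```ini
# /etc/systemd/system/nfsd-cb-trace.service  (illustrative name)
[Unit]
Description=Record NFSD callback tracepoints for the DESTROY_SESSION hang
After=nfs-server.service

[Service]
ExecStart=/usr/bin/trace-cmd record -o /var/tmp/nfsd-cb-trace.dat \
    -e nfsd:nfsd_cb* -e sunrpc:svc_xprt_detach \
    -p function -l nfsd4_destroy_session

[Install]
WantedBy=multi-user.target
```

When the hang appears, run `echo t > /proc/sysrq-trigger` (the task dump lands in dmesg), stop the unit, and compress the resulting trace.dat before sending it along.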
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-16 20:07 ` Chuck Lever @ 2025-01-17 19:43 ` Baptiste PELLEGRIN 2025-01-17 20:27 ` Chuck Lever 0 siblings, 1 reply; 30+ messages in thread From: Baptiste PELLEGRIN @ 2025-01-17 19:43 UTC (permalink / raw) To: Chuck Lever, Rik Theys, Christian Herzog Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Jeff Layton On 1/16/25 9:07 PM, Chuck Lever wrote: > On 1/11/25 3:14 AM, Pellegrin Baptiste wrote: >> On 09/01/25 17:32, *Chuck Lever * <chuck.lever@oracle.com> wrote: >>> On 1/9/25 7:42 AM, Jeff Layton wrote: >>> >On Thu, 2025-01-09 at 12:56 +0100, Christian Herzog wrote: >>> >>Dear Chuck, >>> >> >>> >>On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: >>> >>>On 1/8/25 9:54 AM, Christian Herzog wrote: >>> >>>>Dear Chuck, >>> >>>> >>> >>>>On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: >>> >>>>>On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: >>> >>>>>>Hi Chuck, >>> >>>>>> >>> >>>>>>Thanks for your time on this, much appreciated. >>> >>>>>> >>> >>>>>>On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: >>> >>>>>>>On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: >>> >>>>>>>>Hi Chuck, hi all, >>> >>>>>>>> >>> >>>>>>>>[it was not ideal to pick one of the message for this >>> followup, let me >>> >>>>>>>>know if you want a complete new thread, adding as well >>> Benjamin and >>> >>>>>>>>Trond as they are involved in one mentioned patch] >>> >>>>>>>> >>> >>>>>>>>On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: >>> >>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>>>On Jun 17, 2024, at 2:55 AM, Harald Dunkel >>> <harald.dunkel@aixigo.com> wrote: >>> >>>>>>>>>> >>> >>>>>>>>>>Hi folks, >>> >>>>>>>>>> >>> >>>>>>>>>>what would be the reason for nfsd getting stuck somehow and >>> becoming >>> >>>>>>>>>>an unkillable process? 
See >>> >>>>>>>>>> >>> >>>>>>>>>>- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 >>> <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562> >>> >>>>>>>>>>- https://bugs.launchpad.net/ubuntu/+source/nfs-utils/ >>> +bug/2062568 <https://bugs.launchpad.net/ubuntu/+source/nfs-utils/ >>> +bug/2062568> >>> >>>>>>>>>> >>> >>>>>>>>>>Doesn't this mean that something inside the kernel gets >>> stuck as >>> >>>>>>>>>>well? Seems odd to me. >>> >>>>>>>>> >>> >>>>>>>>>I'm not familiar with the Debian or Ubuntu kernel packages. Can >>> >>>>>>>>>the kernel release numbers be translated to LTS kernel releases >>> >>>>>>>>>please? Need both "last known working" and "first broken" >>> releases. >>> >>>>>>>>> >>> >>>>>>>>>This: >>> >>>>>>>>> >>> >>>>>>>>>[ 6596.911785] RPC: Could not send backchannel reply error: >>> -110 >>> >>>>>>>>>[ 6596.972490] RPC: Could not send backchannel reply error: >>> -110 >>> >>>>>>>>>[ 6837.281307] RPC: Could not send backchannel reply error: >>> -110 >>> >>>>>>>>> >>> >>>>>>>>>is a known set of client backchannel bugs. Knowing the LTS >>> kernel >>> >>>>>>>>>releases (see above) will help us figure out what needs to be >>> >>>>>>>>>backported to the LTS kernels kernels in question. >>> >>>>>>>>> >>> >>>>>>>>>This: >>> >>>>>>>>> >>> >>>>>>>>>[11183.290619] wait_for_completion+0x88/0x150 >>> >>>>>>>>>[11183.290623] __flush_workqueue+0x140/0x3e0 >>> >>>>>>>>>[11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >>> >>>>>>>>>[11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >>> >>>>>>>>> >>> >>>>>>>>>is probably related to the backchannel errors on the client, >>> but >>> >>>>>>>>>client bugs shouldn't cause the server to hang like this. We >>> >>>>>>>>>might be able to say more if you can provide the kernel release >>> >>>>>>>>>translations (see above). 
>>> >>>>>>>> >>> >>>>>>>>In Debian we hstill have the bug #1071562 open and one person >>> notified >>> >>>>>>>>mye offlist that it appears that the issue get more frequent >>> since >>> >>>>>>>>they updated on NFS client side from Ubuntu 20.04 to Debian >>> bookworm >>> >>>>>>>>with a 6.1.y based kernel). >>> >>>>>>>> >>> >>>>>>>>Some people around those issues, seem to claim that the change >>> >>>>>>>>mentioned in >>> >>>>>>>>https://lists.proxmox.com/pipermail/pve-devel/2024- >>> July/064614.html <https://lists.proxmox.com/pipermail/pve-devel/2024- >>> July/064614.html> >>> >>>>>>>>would fix the issue, which is as well backchannel related. >>> >>>>>>>> >>> >>>>>>>>This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, >>> >>>>>>>>again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the >>> >>>>>>>>nfs_client's rpc timeouts for backchannel") this is not >>> something >>> >>>>>>>>which goes back to 6.1.y, could it be possible that hte >>> backchannel >>> >>>>>>>>refactoring and this final fix indeeds fixes the issue? >>> >>>>>>>> >>> >>>>>>>>As people report it is not easily reproducible, so this makes it >>> >>>>>>>>harder to identify fixes correctly. >>> >>>>>>>> >>> >>>>>>>>I gave a (short) stance on trying to backport commits up to >>> >>>>>>>>6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but >>> this quickly >>> >>>>>>>>seems to indicate it is probably still not the right thing for >>> >>>>>>>>backporting to the older stable series. >>> >>>>>>>> >>> >>>>>>>>As at least pre-requisites: >>> >>>>>>>> >>> >>>>>>>>2009e32997ed568a305cf9bc7bf27d22e0f6ccda >>> >>>>>>>>4119bd0306652776cb0b7caa3aea5b2a93aecb89 >>> >>>>>>>>163cdfca341b76c958567ae0966bd3575c5c6192 >>> >>>>>>>>f4afc8fead386c81fda2593ad6162271d26667f8 >>> >>>>>>>>6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc >>> >>>>>>>>57331a59ac0d680f606403eb24edd3c35aecba31 >>> >>>>>>>> >>> >>>>>>>>and still there would be conflicting codepaths (and does not >>> seem >>> >>>>>>>>right). 
>>> >>>>>>>> >>> >>>>>>>>Chuck, Benjamin, Trond, is there anything we can provive on >>> reporters >>> >>>>>>>>side that we can try to tackle this issue better? >>> >>>>>>> >>> >>>>>>>As I've indicated before, NFSD should not block no matter what >>> the >>> >>>>>>>client may or may not be doing. I'd like to focus on the >>> server first. >>> >>>>>>> >>> >>>>>>>What is the result of: >>> >>>>>>> >>> >>>>>>>$ cd <Bookworm's v6.1.90 kernel source > >>> >>>>>>>$ unset KBUILD_OUTPUT >>> >>>>>>>$ make -j `nproc` >>> >>>>>>>$ scripts/faddr2line \ >>> >>>>>>> fs/nfsd/nfs4state.o \ >>> >>>>>>> nfsd4_destroy_session+0x16d >>> >>>>>>> >>> >>>>>>>Since this issue appeared after v6.1.1, is it possible to bisect >>> >>>>>>>between v6.1.1 and v6.1.90 ? >>> >>>>>> >>> >>>>>>First please note, at least speaking of triggering the issue in >>> >>>>>>Debian, Debian has moved to 6.1.119 based kernel already (and >>> soon in >>> >>>>>>the weekends point release move to 6.1.123). >>> >>>>>> >>> >>>>>>That said, one of the users which regularly seems to be hit by the >>> >>>>>>issue was able to provide the above requested information, >>> based for >>> >>>>>>6.1.119: >>> >>>>>> >>> >>>>>>~/kernel/linux-source-6.1# make kernelversion >>> >>>>>>6.1.119 >>> >>>>>>~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/ >>> nfs4state.o nfsd4_destroy_session+0x16d >>> >>>>>>nfsd4_destroy_session+0x16d/0x250: >>> >>>>>>__list_del_entry at /root/kernel/linux-source-6.1/./include/ >>> linux/list.h:134 >>> >>>>>>(inlined by) list_del at /root/kernel/linux-source-6.1/./ >>> include/linux/list.h:148 >>> >>>>>>(inlined by) unhash_session at /root/kernel/linux-source-6.1/ >>> fs/ nfsd/nfs4state.c:2062 >>> >>>>>>(inlined by) nfsd4_destroy_session at /root/kernel/linux- >>> source-6.1/fs/nfsd/nfs4state.c:3856 >>> >>>>>> >>> >>>>>>They could provide as well a decode_stacktrace output for the >>> recent >>> >>>>>>hit (if that is helpful for you): >>> >>>>>> >>> >>>>>>[Mon Jan 6 13:25:28 2025] INFO: task 
nfsd:55306 blocked for >>> more than 6883 seconds. >>> >>>>>>[Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 >>> Debian 6.1.119-1 >>> >>>>>>[Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/ >>> hung_task_timeout_secs" disables this message. >>> >>>>>>[Mon Jan 6 13:25:28 2025] task:nfsd state:D >>> stack:0 pid:55306 ppid:2 flags:0x00004000 >>> >>>>>>[Mon Jan 6 13:25:28 2025] Call Trace: >>> >>>>>>[Mon Jan 6 13:25:28 2025] <TASK> >>> >>>>>>[Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 >>> >>>>>>[Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 >>> >>>>>>[Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 >>> >>>>>>[Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 >>> >>>>>>[Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 >>> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/ >>> build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/ >>> build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd >>> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/ >>> build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd >>> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/ >>> build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd >>> >>>>>>[Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/ >>> build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc >>> >>>>>>[Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/ >>> build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc >>> >>>>>>[Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/ >>> build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd >>> >>>>>>[Mon Jan 6 13:25:28 2025] ? 
nfsd_inet6addr_event (debian/build/ >>> build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd >>> >>>>>>[Mon Jan 6 13:25:28 2025] svc_process (debian/build/ >>> build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc >>> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd (debian/build/ >>> build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd >>> >>>>>>[Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 >>> >>>>>>[Mon Jan 6 13:25:28 2025] ? kthread_complete_and_exit+0x20/0x20 >>> >>>>>>[Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 >>> >>>>>>[Mon Jan 6 13:25:28 2025] </TASK> >>> >>>>>> >>> >>>>>>The question about bisection is actually harder, those are >>> production >>> >>>>>>systems and I understand it's not possible to do bisect there, >>> while >>> >>>>>>unfortunately not reprodcing the issue on "lab conditions". >>> >>>>>> >>> >>>>>>Does the above give us still a clue on what you were looking for? >>> >>>>> >>> >>>>>Thanks for the update. >>> >>>>> >>> >>>>>It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into >>> struct >>> >>>>>nfs4_client"), while not a specific fix for this issue, might >>> offer some >>> >>>>>relief by preventing the DESTROY_SESSION hang from affecting all >>> other >>> >>>>>clients of the degraded server. >>> >>>>> >>> >>>>>Not having a reproducer or the ability to bisect puts a damper on >>> >>>>>things. The next step, then, is to enable tracing on servers >>> where this >>> >>>>>issue can come up, and wait for the hang to occur. The following >>> command >>> >>>>>captures information only about callback operation, so it should >>> not >>> >>>>>generate much data or impact server performance. >>> >>>>> >>> >>>>> # trace-cmd record -e nfsd:nfsd_cb\* >>> >>>>> >>> >>>>>Let that run until the problem occurs, then ^C, compress the >>> resulting >>> >>>>>trace.dat file, and either attach it to 1071562 or email it to me >>> >>>>>privately. >>> >>>>thanks for the follow-up. >>> >>>> >>> >>>>I am the "customer" with two affected file servers. 
We have since >>> moved those >>> >>>>servers to the backports kernel (6.11.10) in the hope of forward >>> fixing the >>> >>>>issue. If this kernel is stable, I'm afraid I can't go back to >>> the 'bad' >>> >>>>kernel (700+ researchers affected..), and we're also not able to >>> trigger the >>> >>>>issue on our test environment. >>> >>> >>> >>>Hello Dr. Herzog - >>> >>> >>> >>>If the problem recurs on 6.11, the trace-cmd I suggest above works >>> >>>there as well. >>> >>the bad news is: it just happened again with the bpo kernel. >>> >> >>> >>We then immediately started trace-cmd since usually there are >>> several call >>> >>traces in a row and we did get a trace.dat: >>> >>http://people.phys.ethz.ch/~daduke/trace.dat <http:// >>> people.phys.ethz.ch/~daduke/trace.dat> >>> >> >>> >>we did notice however that the stack trace looked a bit different >>> this time: >>> >> >>> >> INFO: task nfsd:2566 blocked for more than 5799 seconds. >>> >> Tainted: G W 6.11.10+bpo-amd64 #1 Debia> >>> >> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables t> >>> >> task:nfsd state:D stack:0 pid:2566 tgid:2566 > >>> >> Call Trace: >>> >> <TASK> >>> >> __schedule+0x400/0xad0 >>> >> schedule+0x27/0xf0 >>> >> nfsd4_shutdown_callback+0xfe/0x140 [nfsd] >>> >> ? __pfx_var_wake_function+0x10/0x10 >>> >> __destroy_client+0x1f0/0x290 [nfsd] >>> >> nfsd4_destroy_clientid+0xf1/0x1e0 [nfsd] >>> >> ? svcauth_unix_set_client+0x586/0x5f0 [sunrpc] >>> >> nfsd4_proc_compound+0x34d/0x670 [nfsd] >>> >> nfsd_dispatch+0xfd/0x220 [nfsd] >>> >> svc_process_common+0x2f7/0x700 [sunrpc] >>> >> ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd] >>> >> svc_process+0x131/0x180 [sunrpc] >>> >> svc_recv+0x830/0xa10 [sunrpc] >>> >> ? __pfx_nfsd+0x10/0x10 [nfsd] >>> >> nfsd+0x87/0xf0 [nfsd] >>> >> kthread+0xcf/0x100 >>> >> ? __pfx_kthread+0x10/0x10 >>> >> ret_from_fork+0x31/0x50 >>> >> ? 
__pfx_kthread+0x10/0x10 >>> >> ret_from_fork_asm+0x1a/0x30 >>> >> </TASK> >>> >> >>> >>and also the state of the offending client in `/proc/fs/nfsd/ >>> clients/*/info` >>> >>used to be callback state: UNKNOWN while now it is DOWN or FAULT. >>> No idea >>> >>what it means, but I thought it was worth mentioning. >>> >> >>> > >>> >Looks like this is hung in nfsd41_cb_inflight_wait_complete() ? That >>> >probably means that the cl_cb_inflight counter is stuck at >0. I'm >>> >guessing that means that there is some callback that it's expecting to >>> >complete that isn't. From nfsd4_shutdown_callback(): >>> > >>> > /* >>> > * Note this won't actually result in a null callback; >>> > * instead, nfsd4_run_cb_null() will detect the killed >>> > * client, destroy the rpc client, and stop: >>> > */ >>> > nfsd4_run_cb(&clp->cl_cb_null); >>> > flush_workqueue(clp->cl_callback_wq); >>> > nfsd41_cb_inflight_wait_complete(clp); >>> > >>> >...it sounds like that isn't happening properly though. >>> > >>> >It might be interesting to see if you can track down the callback >>> >client in /sys/kernel/debug/sunrpc and see what it's doing. >>> >>> Here's a possible scenario: >>> >>> The trace.dat shows the server is sending a lot of CB_RECALL_ANY >>> operations. >>> >>> Normally a callback occurs only due to some particular client request. >>> Such requests would be completed before a client unmounts. >>> >>> CB_RECALL_ANY is an operation which does not occur due to a particular >>> client operation and can occur at any time. I believe this is the >>> first operation of this type we've added to NFSD. >>> >>> My guess is that the server's laundromat has sent some CB_RECALL_ANY >>> operations while the client is umounting -- DESTROY_SESSION is >>> racing with those callback operations. >>> >>> deleg_reaper() was backported to 6.1.y in 6.1.81, which lines up >>> more or less with when the issues started to show up. 
>>> A quick way to test this theory would be to make deleg_reaper() a
>>> no-op. If this helps, then we know that shutting down the server's
>>> callback client needs to be more careful about cleaning up outstanding
>>> callbacks.
>>>
>>>
>>> --
>>> Chuck Lever
>>
>> Hello.
>>
>> Thanks a lot for all of this work!
>>
>> I have started recording traces using:
>>
>> # trace-cmd record -e nfsd:nfsd_cb\*
>>
>> and am currently waiting for the bug to trigger. This generally happens in
>> one/two weeks for me.
>>
>> If you want me to adjust some recording parameters, let me know.
>
> After looking at your data and Rik's data, I decided to try reproducing
> the problem in my lab. No luck so far.
>
> The signatures of these issues are slightly different from each other:
>
> v6.1:
> - nfsd4_destroy_session hangs on the flush_workqueue in
>   nfsd4_shutdown_callback()
> - There might be outstanding callback RPCs that are not exiting
> - Lots of CB_RECALL_ANY traffic. Only place that's generated in
>   v6.1 is the NFSD shrinker when memory is short
> - There appear to be some unreachable clients -- evidence of client
>   crashes or network outages, perhaps
>
> v6.12:
> - destroy_client() hangs on the wait_var_event in
>   nfsd4_shutdown_callback()
> - There are no callback RPCs left after the hang
> - There are some stuck clients listed under /proc/fs/nfsd/
>
> Similar, but not the same.
>
> I'm still open to clues, so please continue the regimen of starting
> the trace-cmd when the server boots, and doing an "echo t >
> /proc/sysrq-trigger" when the hang appears.
>
> # trace-cmd record -e nfsd:nfsd_cb\* -e sunrpc:svc_xprt_detach \
>       -p function -l nfsd4_destroy_session
>

Hello Chuck.

Sorry for my late reply, but I needed to find something interesting
for you.

Did you see my call trace analysis here?

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562

On v6.1, did you see that nfsd4_shutdown_callback is reached from
both nfsd4 laundromat_main and nfsd4_state_shrinker_worker?
For me, the hang in v6.1 also comes from wait_var_event.

Here is the state of the RAM on my server under some load:

               total        used        free      shared  buff/cache   available
Mem:           32083        4982         651           5       26919       27101
Swap:          16629         842       15787

My desktop clients use suspend heavily. The computers are almost never
rebooted. When a user logs out, a suspend is automatically triggered.
Do you think that delegations could accumulate, held by many suspended
hosts, and cause the high memory/cache usage?

Here is my crash timeline between the desktop client salleprofs13 (the
one you think is unreachable) and the server fichdc02. The failure comes
from suspend.

---
Client salleprofs13 woke from sleep at 09:55:37 and was attached by the
server. The callback channel is initialized.
---

janv. 14 07:20:32 salleprofs13 kernel: PM: suspend entry (deep)
janv. 14 09:55:37 salleprofs13 kernel: PM: suspend exit
janv. 14 09:55:37 salleprofs13 ModemManager[621]: <info> [sleep-monitor-systemd] system is resuming
janv. 14 09:55:37 salleprofs13 NetworkManager[576]: <info> [1736844937.2382] manager: sleep: wake requested (sleeping: yes enabled: yes)
janv. 14 09:55:37 salleprofs13 systemd[1]: Stopping gssd-bug.service - Gssd credential cache hook...
janv. 14 09:55:37 salleprofs13 systemd[1]: Starting winbind.service - Samba Winbind Daemon...
janv. 14 09:55:37 salleprofs13 systemd[1]: Started winbind.service - Samba Winbind Daemon.
janv. 14 09:55:40 salleprofs13 NetworkManager[576]: <info> [1736844940.6304] device (enp2s0): carrier: link connected
janv. 14 09:55:40 salleprofs13 kernel: r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off

nfsd-1695 [047] ..... 190512.862590: nfsd_cb_probe: addr=172.16.118.121:0 client 67833e73:4dd32903 state=UNKNOWN
kworker/u96:8-225529 [047] ..... 190512.862782: nfsd_cb_setup: addr=172.16.118.121:0 client 67833e73:4dd32903 proto=tcp flavor=sys
kworker/u96:8-225529 [047] .....
190512.862784: nfsd_cb_state: addr=172.16.118.121:0 client 67833e73:4dd32903 state=UP
2025-01-14T09:55:48.290546+01:00 fichdc02 rpc.mountd[1363]: v4.2 client attached: 0x4dd3290367833e73 from "172.16.118.121:825"

---
Someone logged on to the station. Immediately the server started sending
nfsd_cb_recall_any. Strange, no? What could be recalled on a host that
has only just woken up...
---

janv. 14 09:55:53 salleprofs13 nfsrahead[121742]: setting /dnfs/wine readahead to 128
janv. 14 09:56:13 salleprofs13 systemd-logind[569]: New session 36 of user xxxxxxxx.

kworker/u96:3-227781 [047] ..... 190541.974364: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:2-220501 [013] ..... 190541.977732: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:2-220501 [000] ..... 190550.387742: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:0-228547 [003] ..... 190550.389602: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:2-220501 [021] ..... 190565.367759: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:1-230409 [000] ..... 190565.371391: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:2-220501 [013] ..... 190579.748430: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:8-225529 [026] ..... 190579.749946: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:8-225529 [013] ..... 190592.749738: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:2-220501 [001] ..... 190592.751192: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:0-228547 [000] ..... 190599.849333: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:0-228547 [000] .....
190599.850872: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:2-220501 [045] ..... 190686.760138: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:1-230409 [014] ..... 190686.761181: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:1-230409 [013] ..... 190694.103480: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:8-225529 [022] ..... 190694.106120: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:8-225529 [000] ..... 190749.678378: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:2-220501 [007] ..... 190749.679870: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:4-230705 [014] ..... 190782.830144: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:1-230409 [010] ..... 190782.831220: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0

---
Next the user suspended the host, maybe by pressing the power button
without logging out, so the last nfsd_cb_recall_any failed. Maybe the
"unrecognized reply" messages are related?
---

janv. 14 10:01:18 salleprofs13 systemd-logind[569]: The system will suspend now!
janv. 14 10:01:18 salleprofs13 NetworkManager[576]: <info> [1736845278.5631] manager: sleep: sleep requested (sleeping: no enabled: yes)
janv. 14 10:01:18 salleprofs13 ModemManager[621]: <info> [sleep-monitor-systemd] system is about to suspend
janv. 14 10:01:20 salleprofs13 kernel: PM: suspend entry (deep)

kworker/u96:2-220501 [047] ..... 190851.843698: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:2-220501 [047] .....
190861.062828: nfsd_cb_state: addr=172.16.118.121:0 client 67833e73:4dd32903 state=FAULT
2025-01-14T10:06:06.586140+01:00 fichdc02 kernel: [191131.106081] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000ad57929d xid 52751431

---
Got 5 unrecognized replies. Seems unrelated.
---

2025-01-14T10:06:56.762137+01:00 fichdc02 kernel: [191181.279788] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 5d819d7d
2025-01-14T10:06:56.762197+01:00 fichdc02 kernel: [191181.280005] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 61819d7d
2025-01-14T10:06:56.762201+01:00 fichdc02 kernel: [191181.280054] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 60819d7d
2025-01-14T10:06:56.762206+01:00 fichdc02 kernel: [191181.280223] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 5f819d7d
2025-01-14T10:06:56.762209+01:00 fichdc02 kernel: [191181.280364] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 5e819d7d

---
Just 10 minutes after the suspend, the host woke up (with xxxxxxxx still
logged on). The gssd-bug.service is a custom script that kills all of
the user's processes if the Kerberos cache file has expired. It works
around a rpc.gssd bug that hammers the server when the cache has expired
during suspend. It runs "killall -u xxxxxxxx". But here, I don't know
whether the killall was called or not.

Just after the wake, the server sent another nfsd_cb_recall_any. But
this time there is no nfsd_cb_state with state=FAULT. From this point
there are no more kworker cb calls. I think this is where the hang
starts. And we got another unrecognized reply.
---

janv. 14 10:10:24 salleprofs13 kernel: PM: suspend exit
janv. 14 10:10:24 salleprofs13 systemd-sleep[124910]: System returned from sleep state.
janv. 14 10:10:24 salleprofs13 systemd[1]: Stopping gssd-bug.service - Gssd credential cache hook...
janv. 14 10:10:24 salleprofs13 NetworkManager[576]: <info> [1736845824.4117] manager: sleep: wake requested (sleeping: yes enabled: yes)
janv. 14 10:10:24 salleprofs13 check-gssd-cache[124962]: id: « xxxxxxxx »: the user doesn't exist
janv. 14 10:10:24 salleprofs13 systemd[1]: Started winbind.service - Samba Winbind Daemon.
janv. 14 10:10:27 salleprofs13 kernel: r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off
janv. 14 10:10:27 salleprofs13 NetworkManager[576]: <info> [1736845827.6412] device (enp2s0): carrier: link connected
janv. 14 10:10:35 salleprofs13 systemd[1]: Stopped gssd-bug.service - Gssd credential cache hook.

kworker/u96:2-220501 [012] ..... 191417.544355: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
2025-01-14T10:10:58.426113+01:00 fichdc02 kernel: [191422.934976] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000007205cdb2 xid ee91a9d8

---
nfsd probes a failed state. Normal. But as the kworker hangs, it can't
reinitialize the callback channel after the nfsd_cb_state
"state=UNKNOWN". I don't know why, but nfsd also started to send
nfsd_cb_recall... Note that this is less than 120 seconds before the
"blocked" message, so the thread may already be locked.
---

nfsd-1690 [014] ..... 191424.483249: nfsd_cb_probe: addr=172.16.118.121:0 client 67833e73:4dd32903 state=FAULT
nfsd-1690 [014] ..... 191424.483252: nfsd_cb_state: addr=172.16.118.121:0 client 67833e73:4dd32903 state=UNKNOWN
nfsd-1695 [013] ...1. 191441.265047: nfsd_cb_recall: addr=172.16.118.121:0 client 67833e73:4dd32903 stateid 00001c6e:00000001
nfsd-1695 [014] ...1. 191441.646104: nfsd_cb_recall: addr=172.16.118.121:0 client 67833e73:4dd32903 stateid 000015fc:00000001

---
Got some strange logs on the client, and 7 unrecognized replies on the
server that seem unrelated.
---

janv. 14 10:11:00 salleprofs13 PackageKit[122249]: daemon quit
janv. 14 10:11:00 salleprofs13 systemd[1]: packagekit.service: Deactivated successfully.
janv.
14 10:11:00 salleprofs13 systemd[1]: packagekit.service: Consumed 28.223s CPU time. janv. 14 10:11:02 salleprofs13 gnome-shell[122127]: JS ERROR: Gio.IOErrorEnum: Erreur lors de l’appel de StartServiceByName pour org.vmware.viewagent.Credentials : Le délai d’> _injectToMethod/klass[method]@resource:///org/gnome/gjs/modules/core/overrides/Gio.js:287:25 VmwareCredentials@resource:///org/gnome/shell/gdm/vmware.js:34:10 VmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:41:29 getVmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:51:37 ShellUserVerifier@resource:///org/gnome/shell/gdm/util.js:183:64 _init@resource:///org/gnome/shell/gdm/authPrompt.js:74:30 AuthPrompt@resource:///org/gnome/shell/gdm/authPrompt.js:51:4 _ensureAuthPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:686:32 _showPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:725:14 vfunc_key_press_event@resource:///org/gnome/shell/ui/unlockDialog.js:620:14 janv. 14 10:11:05 salleprofs13 gsd-power[122329]: Error setting property 'PowerSaveMode' on interface org.gnome.Mutter.DisplayConfig: Le délai d’attente est dépassé (g-io-erro> janv. 
14 10:11:27 salleprofs13 gnome-shell[122127]: JS ERROR: Gio.IOErrorEnum: Erreur lors de l’appel de StartServiceByName pour org.vmware.viewagent.Credentials : Le délai d’> _injectToMethod/klass[method]@resource:///org/gnome/gjs/modules/core/overrides/Gio.js:287:25 VmwareCredentials@resource:///org/gnome/shell/gdm/vmware.js:34:10 VmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:41:29 getVmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:51:37 ShellUserVerifier@resource:///org/gnome/shell/gdm/util.js:183:64 _init@resource:///org/gnome/shell/gdm/authPrompt.js:74:30 AuthPrompt@resource:///org/gnome/shell/gdm/authPrompt.js:51:4 _ensureAuthPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:686:32 _showPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:725:14 vfunc_key_press_event@resource:///org/gnome/shell/ui/unlockDialog.js:620:14 2025-01-14T10:12:36.730114+01:00 fichdc02 kernel: [191521.235822] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid d591b1ee 2025-01-14T10:12:36.730165+01:00 fichdc02 kernel: [191521.236547] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid d691b1ee 2025-01-14T10:12:36.730169+01:00 fichdc02 kernel: [191521.236666] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid da91b1ee 2025-01-14T10:12:36.730171+01:00 fichdc02 kernel: [191521.236743] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid db91b1ee 2025-01-14T10:12:36.730172+01:00 fichdc02 kernel: [191521.236769] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid d991b1ee 2025-01-14T10:12:36.730174+01:00 fichdc02 kernel: [191521.236824] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid d891b1ee 2025-01-14T10:12:36.730177+01:00 fichdc02 kernel: [191521.236840] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 
xid d791b1ee

---
And finally, the hang syslog messages:
---

2025-01-14T10:14:29.370164+01:00 fichdc02 kernel: [191633.871204] INFO: task nfsd:1690 blocked for more than 120 seconds.
2025-01-14T10:14:29.374846+01:00 fichdc02 kernel: [191633.875410] INFO: task nfsd:1691 blocked for more than 120 seconds.
2025-01-14T10:14:29.374966+01:00 fichdc02 kernel: [191633.878106] INFO: task nfsd:1693 blocked for more than 120 seconds.
2025-01-14T10:14:29.378303+01:00 fichdc02 kernel: [191633.880748] INFO: task nfsd:1695 blocked for more than 120 seconds.

Hope this helps. I'm still recording.

Regards,
Baptiste.

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-17 19:43 ` Baptiste PELLEGRIN @ 2025-01-17 20:27 ` Chuck Lever 0 siblings, 0 replies; 30+ messages in thread From: Chuck Lever @ 2025-01-17 20:27 UTC (permalink / raw) To: Baptiste PELLEGRIN, Rik Theys, Christian Herzog Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Jeff Layton On 1/17/25 2:43 PM, Baptiste PELLEGRIN wrote: > > > On 1/16/25 9:07 PM, Chuck Lever wrote: >> On 1/11/25 3:14 AM, Pellegrin Baptiste wrote: >>> On 09/01/25 17:32, *Chuck Lever * <chuck.lever@oracle.com> wrote: >>>> On 1/9/25 7:42 AM, Jeff Layton wrote: >>>> >On Thu, 2025-01-09 at 12:56 +0100, Christian Herzog wrote: >>>> >>Dear Chuck, >>>> >> >>>> >>On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: >>>> >>>On 1/8/25 9:54 AM, Christian Herzog wrote: >>>> >>>>Dear Chuck, >>>> >>>> >>>> >>>>On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: >>>> >>>>>On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: >>>> >>>>>>Hi Chuck, >>>> >>>>>> >>>> >>>>>>Thanks for your time on this, much appreciated. >>>> >>>>>> >>>> >>>>>>On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: >>>> >>>>>>>On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: >>>> >>>>>>>>Hi Chuck, hi all, >>>> >>>>>>>> >>>> >>>>>>>>[it was not ideal to pick one of the message for this >>>> followup, let me >>>> >>>>>>>>know if you want a complete new thread, adding as well >>>> Benjamin and >>>> >>>>>>>>Trond as they are involved in one mentioned patch] >>>> >>>>>>>> >>>> >>>>>>>>On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III >>>> wrote: >>>> >>>>>>>>> >>>> >>>>>>>>> >>>> >>>>>>>>>>On Jun 17, 2024, at 2:55 AM, Harald Dunkel >>>> <harald.dunkel@aixigo.com> wrote: >>>> >>>>>>>>>> >>>> >>>>>>>>>>Hi folks, >>>> >>>>>>>>>> >>>> >>>>>>>>>>what would be the reason for nfsd getting stuck somehow >>>> and becoming >>>> >>>>>>>>>>an unkillable process? 
See >>>> >>>>>>>>>> >>>> >>>>>>>>>>- https://bugs.debian.org/cgi-bin/bugreport.cgi? >>>> bug=1071562 <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562> >>>> >>>>>>>>>>- https://bugs.launchpad.net/ubuntu/+source/nfs-utils/ >>>> +bug/2062568 <https://bugs.launchpad.net/ubuntu/+source/nfs-utils/ >>>> +bug/2062568> >>>> >>>>>>>>>> >>>> >>>>>>>>>>Doesn't this mean that something inside the kernel gets >>>> stuck as >>>> >>>>>>>>>>well? Seems odd to me. >>>> >>>>>>>>> >>>> >>>>>>>>>I'm not familiar with the Debian or Ubuntu kernel packages. >>>> Can >>>> >>>>>>>>>the kernel release numbers be translated to LTS kernel >>>> releases >>>> >>>>>>>>>please? Need both "last known working" and "first broken" >>>> releases. >>>> >>>>>>>>> >>>> >>>>>>>>>This: >>>> >>>>>>>>> >>>> >>>>>>>>>[ 6596.911785] RPC: Could not send backchannel reply error: >>>> -110 >>>> >>>>>>>>>[ 6596.972490] RPC: Could not send backchannel reply error: >>>> -110 >>>> >>>>>>>>>[ 6837.281307] RPC: Could not send backchannel reply error: >>>> -110 >>>> >>>>>>>>> >>>> >>>>>>>>>is a known set of client backchannel bugs. Knowing the LTS >>>> kernel >>>> >>>>>>>>>releases (see above) will help us figure out what needs to be >>>> >>>>>>>>>backported to the LTS kernels kernels in question. >>>> >>>>>>>>> >>>> >>>>>>>>>This: >>>> >>>>>>>>> >>>> >>>>>>>>>[11183.290619] wait_for_completion+0x88/0x150 >>>> >>>>>>>>>[11183.290623] __flush_workqueue+0x140/0x3e0 >>>> >>>>>>>>>[11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >>>> >>>>>>>>>[11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >>>> >>>>>>>>> >>>> >>>>>>>>>is probably related to the backchannel errors on the >>>> client, but >>>> >>>>>>>>>client bugs shouldn't cause the server to hang like this. We >>>> >>>>>>>>>might be able to say more if you can provide the kernel >>>> release >>>> >>>>>>>>>translations (see above). 
>>>> >>>>>>>> >>>> >>>>>>>>In Debian we hstill have the bug #1071562 open and one >>>> person notified >>>> >>>>>>>>mye offlist that it appears that the issue get more frequent >>>> since >>>> >>>>>>>>they updated on NFS client side from Ubuntu 20.04 to Debian >>>> bookworm >>>> >>>>>>>>with a 6.1.y based kernel). >>>> >>>>>>>> >>>> >>>>>>>>Some people around those issues, seem to claim that the change >>>> >>>>>>>>mentioned in >>>> >>>>>>>>https://lists.proxmox.com/pipermail/pve-devel/2024- >>>> July/064614.html <https://lists.proxmox.com/pipermail/pve- >>>> devel/2024- July/064614.html> >>>> >>>>>>>>would fix the issue, which is as well backchannel related. >>>> >>>>>>>> >>>> >>>>>>>>This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, >>>> >>>>>>>>again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use >>>> the >>>> >>>>>>>>nfs_client's rpc timeouts for backchannel") this is not >>>> something >>>> >>>>>>>>which goes back to 6.1.y, could it be possible that hte >>>> backchannel >>>> >>>>>>>>refactoring and this final fix indeeds fixes the issue? >>>> >>>>>>>> >>>> >>>>>>>>As people report it is not easily reproducible, so this >>>> makes it >>>> >>>>>>>>harder to identify fixes correctly. >>>> >>>>>>>> >>>> >>>>>>>>I gave a (short) stance on trying to backport commits up to >>>> >>>>>>>>6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but >>>> this quickly >>>> >>>>>>>>seems to indicate it is probably still not the right thing for >>>> >>>>>>>>backporting to the older stable series. 
>>>> >>>>>>>> >>>> >>>>>>>>As at least pre-requisites: >>>> >>>>>>>> >>>> >>>>>>>>2009e32997ed568a305cf9bc7bf27d22e0f6ccda >>>> >>>>>>>>4119bd0306652776cb0b7caa3aea5b2a93aecb89 >>>> >>>>>>>>163cdfca341b76c958567ae0966bd3575c5c6192 >>>> >>>>>>>>f4afc8fead386c81fda2593ad6162271d26667f8 >>>> >>>>>>>>6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc >>>> >>>>>>>>57331a59ac0d680f606403eb24edd3c35aecba31 >>>> >>>>>>>> >>>> >>>>>>>>and still there would be conflicting codepaths (and does not >>>> seem >>>> >>>>>>>>right). >>>> >>>>>>>> >>>> >>>>>>>>Chuck, Benjamin, Trond, is there anything we can provive on >>>> reporters >>>> >>>>>>>>side that we can try to tackle this issue better? >>>> >>>>>>> >>>> >>>>>>>As I've indicated before, NFSD should not block no matter >>>> what the >>>> >>>>>>>client may or may not be doing. I'd like to focus on the >>>> server first. >>>> >>>>>>> >>>> >>>>>>>What is the result of: >>>> >>>>>>> >>>> >>>>>>>$ cd <Bookworm's v6.1.90 kernel source > >>>> >>>>>>>$ unset KBUILD_OUTPUT >>>> >>>>>>>$ make -j `nproc` >>>> >>>>>>>$ scripts/faddr2line \ >>>> >>>>>>> fs/nfsd/nfs4state.o \ >>>> >>>>>>> nfsd4_destroy_session+0x16d >>>> >>>>>>> >>>> >>>>>>>Since this issue appeared after v6.1.1, is it possible to bisect >>>> >>>>>>>between v6.1.1 and v6.1.90 ? >>>> >>>>>> >>>> >>>>>>First please note, at least speaking of triggering the issue in >>>> >>>>>>Debian, Debian has moved to 6.1.119 based kernel already (and >>>> soon in >>>> >>>>>>the weekends point release move to 6.1.123). 
>>>> >>>>>> >>>> >>>>>>That said, one of the users which regularly seems to be hit by >>>> the >>>> >>>>>>issue was able to provide the above requested information, >>>> based for >>>> >>>>>>6.1.119: >>>> >>>>>> >>>> >>>>>>~/kernel/linux-source-6.1# make kernelversion >>>> >>>>>>6.1.119 >>>> >>>>>>~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/ >>>> nfs4state.o nfsd4_destroy_session+0x16d >>>> >>>>>>nfsd4_destroy_session+0x16d/0x250: >>>> >>>>>>__list_del_entry at /root/kernel/linux-source-6.1/./include/ >>>> linux/list.h:134 >>>> >>>>>>(inlined by) list_del at /root/kernel/linux-source-6.1/./ >>>> include/linux/list.h:148 >>>> >>>>>>(inlined by) unhash_session at /root/kernel/linux-source-6.1/ >>>> fs/ nfsd/nfs4state.c:2062 >>>> >>>>>>(inlined by) nfsd4_destroy_session at /root/kernel/linux- >>>> source-6.1/fs/nfsd/nfs4state.c:3856 >>>> >>>>>> >>>> >>>>>>They could provide as well a decode_stacktrace output for the >>>> recent >>>> >>>>>>hit (if that is helpful for you): >>>> >>>>>> >>>> >>>>>>[Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for >>>> more than 6883 seconds. >>>> >>>>>>[Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 >>>> Debian 6.1.119-1 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/ >>>> hung_task_timeout_secs" disables this message. 
>>>> >>>>>>[Mon Jan 6 13:25:28 2025] task:nfsd state:D >>>> stack:0 pid:55306 ppid:2 flags:0x00004000 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] Call Trace: >>>> >>>>>>[Mon Jan 6 13:25:28 2025] <TASK> >>>> >>>>>>[Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/ >>>> build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/ >>>> build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd >>>> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/ >>>> build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd >>>> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/ >>>> build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd >>>> >>>>>>[Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/ >>>> build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc >>>> >>>>>>[Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/ >>>> build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc >>>> >>>>>>[Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/ >>>> build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd >>>> >>>>>>[Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/ >>>> build/ build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd >>>> >>>>>>[Mon Jan 6 13:25:28 2025] svc_process (debian/build/ >>>> build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc >>>> >>>>>>[Mon Jan 6 13:25:28 2025] nfsd (debian/build/ >>>> build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd >>>> >>>>>>[Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] ? 
kthread_complete_and_exit+0x20/0x20 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 >>>> >>>>>>[Mon Jan 6 13:25:28 2025] </TASK> >>>> >>>>>> >>>> >>>>>>The question about bisection is actually harder, those are >>>> production >>>> >>>>>>systems and I understand it's not possible to do bisect there, >>>> while >>>> >>>>>>unfortunately not reprodcing the issue on "lab conditions". >>>> >>>>>> >>>> >>>>>>Does the above give us still a clue on what you were looking for? >>>> >>>>> >>>> >>>>>Thanks for the update. >>>> >>>>> >>>> >>>>>It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into >>>> struct >>>> >>>>>nfs4_client"), while not a specific fix for this issue, might >>>> offer some >>>> >>>>>relief by preventing the DESTROY_SESSION hang from affecting >>>> all other >>>> >>>>>clients of the degraded server. >>>> >>>>> >>>> >>>>>Not having a reproducer or the ability to bisect puts a damper on >>>> >>>>>things. The next step, then, is to enable tracing on servers >>>> where this >>>> >>>>>issue can come up, and wait for the hang to occur. The >>>> following command >>>> >>>>>captures information only about callback operation, so it >>>> should not >>>> >>>>>generate much data or impact server performance. >>>> >>>>> >>>> >>>>> # trace-cmd record -e nfsd:nfsd_cb\* >>>> >>>>> >>>> >>>>>Let that run until the problem occurs, then ^C, compress the >>>> resulting >>>> >>>>>trace.dat file, and either attach it to 1071562 or email it to me >>>> >>>>>privately. >>>> >>>>thanks for the follow-up. >>>> >>>> >>>> >>>>I am the "customer" with two affected file servers. We have >>>> since moved those >>>> >>>>servers to the backports kernel (6.11.10) in the hope of forward >>>> fixing the >>>> >>>>issue. If this kernel is stable, I'm afraid I can't go back to >>>> the 'bad' >>>> >>>>kernel (700+ researchers affected..), and we're also not able to >>>> trigger the >>>> >>>>issue on our test environment. >>>> >>> >>>> >>>Hello Dr. 
Herzog - >>>> >>> >>>> >>>If the problem recurs on 6.11, the trace-cmd I suggest above works >>>> >>>there as well. >>>> >>the bad news is: it just happened again with the bpo kernel. >>>> >> >>>> >>We then immediately started trace-cmd since usually there are >>>> several call >>>> >>traces in a row and we did get a trace.dat: >>>> >>http://people.phys.ethz.ch/~daduke/trace.dat <http:// >>>> people.phys.ethz.ch/~daduke/trace.dat> >>>> >> >>>> >>we did notice however that the stack trace looked a bit different >>>> this time: >>>> >> >>>> >> INFO: task nfsd:2566 blocked for more than 5799 seconds. >>>> >> Tainted: G W 6.11.10+bpo-amd64 #1 Debia> >>>> >> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables t> >>>> >> task:nfsd state:D stack:0 pid:2566 tgid:2566 > >>>> >> Call Trace: >>>> >> <TASK> >>>> >> __schedule+0x400/0xad0 >>>> >> schedule+0x27/0xf0 >>>> >> nfsd4_shutdown_callback+0xfe/0x140 [nfsd] >>>> >> ? __pfx_var_wake_function+0x10/0x10 >>>> >> __destroy_client+0x1f0/0x290 [nfsd] >>>> >> nfsd4_destroy_clientid+0xf1/0x1e0 [nfsd] >>>> >> ? svcauth_unix_set_client+0x586/0x5f0 [sunrpc] >>>> >> nfsd4_proc_compound+0x34d/0x670 [nfsd] >>>> >> nfsd_dispatch+0xfd/0x220 [nfsd] >>>> >> svc_process_common+0x2f7/0x700 [sunrpc] >>>> >> ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd] >>>> >> svc_process+0x131/0x180 [sunrpc] >>>> >> svc_recv+0x830/0xa10 [sunrpc] >>>> >> ? __pfx_nfsd+0x10/0x10 [nfsd] >>>> >> nfsd+0x87/0xf0 [nfsd] >>>> >> kthread+0xcf/0x100 >>>> >> ? __pfx_kthread+0x10/0x10 >>>> >> ret_from_fork+0x31/0x50 >>>> >> ? __pfx_kthread+0x10/0x10 >>>> >> ret_from_fork_asm+0x1a/0x30 >>>> >> </TASK> >>>> >> >>>> >>and also the state of the offending client in `/proc/fs/nfsd/ >>>> clients/*/info` >>>> >>used to be callback state: UNKNOWN while now it is DOWN or FAULT. >>>> No idea >>>> >>what it means, but I thought it was worth mentioning. >>>> >> >>>> > >>>> >Looks like this is hung in nfsd41_cb_inflight_wait_complete() ? 
That >>>> >probably means that the cl_cb_inflight counter is stuck at >0. I'm >>>> >guessing that means that there is some callback that it's expecting to >>>> >complete that isn't. From nfsd4_shutdown_callback(): >>>> > >>>> > /* >>>> > * Note this won't actually result in a null callback; >>>> > * instead, nfsd4_run_cb_null() will detect the killed >>>> > * client, destroy the rpc client, and stop: >>>> > */ >>>> > nfsd4_run_cb(&clp->cl_cb_null); >>>> > flush_workqueue(clp->cl_callback_wq); >>>> > nfsd41_cb_inflight_wait_complete(clp); >>>> > >>>> >...it sounds like that isn't happening properly though. >>>> > >>>> >It might be interesting to see if you can track down the callback >>>> >client in /sys/kernel/debug/sunrpc and see what it's doing. >>>> >>>> Here's a possible scenario: >>>> >>>> The trace.dat shows the server is sending a lot of CB_RECALL_ANY >>>> operations. >>>> >>>> Normally a callback occurs only due to some particular client request. >>>> Such requests would be completed before a client unmounts. >>>> >>>> CB_RECALL_ANY is an operation which does not occur due to a particular >>>> client operation and can occur at any time. I believe this is the >>>> first operation of this type we've added to NFSD. >>>> >>>> My guess is that the server's laundromat has sent some CB_RECALL_ANY >>>> operations while the client is umounting -- DESTROY_SESSION is >>>> racing with those callback operations. >>>> >>>> deleg_reaper() was backported to 6.1.y in 6.1.81, which lines up >>>> more or less with when the issues started to show up. >>>> >>>> A quick way to test this theory would be to make deleg_reaper() a >>>> no-op. If this helps, then we know that shutting down the server's >>>> callback client needs to be more careful about cleaning up outstanding >>>> callbacks. >>>> >>>> >>>> -- >>>> Chuck Lever >>> >>> Hello. >>> >>> Thanks a lot for all of this work ! 
>>> >>> I have started recording traces using : >>> >>> # trace-cmd record -e nfsd:nfsd_cb\* >>> >>> and currently waiting for the bug to trigger. This generally happens >>> in one/two weeks for me. >>> >>> If you want me to adjust any recording parameters, let me know. >> >> After looking at your data and Rik's data, I decided to try reproducing >> the problem in my lab. No luck so far. >> >> The signatures of these issues are slightly different from each other: >> >> v6.1: >> - nfsd4_destroy_session hangs on the flush_workqueue in >> nfsd4_shutdown_callback() >> - There might be outstanding callback RPCs that are not exiting >> - Lots of CB_RECALL_ANY traffic. Only place that's generated in >> v6.1 is the NFSD shrinker when memory is short >> - There appear to be some unreachable clients -- evidence of client >> crashes or network outages, perhaps >> >> v6.12: >> - destroy_client() hangs on the wait_var_event in >> nfsd4_shutdown_callback() >> - There are no callback RPCs left after the hang >> - There are some stuck clients listed under /proc/fs/nfsd/ >> >> Similar, but not the same. >> >> I'm still open to clues, so please continue the regimen of starting >> the trace-cmd when the server boots, and doing an "echo t > >> /proc/sysrq-trigger" when the hang appears. >> >> # trace-cmd record -e nfsd:nfsd_cb\* -e sunrpc:svc_xprt_detach \ >> -p function -l nfsd4_destroy_session >> >> > > Hello Chuck. > > Sorry for my late reply, but I needed to find something > interesting for you. > > Did you see my call trace analysis here? > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 > > On v6.1 did you see that nfsd4_shutdown_callback is reached from > both nfsd4 laundromat_main and nfsd4_state_shrinker_worker? > For me the hang in v6.1 also comes from wait_var_event. > > Here is the state of my RAM on my server with some load.
> > total used free shared buff/cache > available > Mem: 32083 4982 651 5 26919 > 27101 > Swap: 16629 842 15787 > > My desktop clients make heavy use of suspend. The computers are almost > never rebooted. When a user logs out, suspend is automatically > triggered. Do you think that delegations could accumulate, > held by many suspended hosts, and cause the > high memory/cache usage? v6.1 I think has courteous server support, meaning the server will attempt to preserve all client state when the client stops sending state-renewing operations. Suspended clients probably don't clean up their network connections. The server could see that as a network partition. Still, the server is supposed to clear out an idle connection after 6 minutes. So these hangs should clear up if this were the only issue. > Here is my crash timeline between the desktop client salleprof13 > (the one you think is unreachable) and the server fichdc02. > The failure comes from suspend. > > > --- > Client salleprof13 woke from sleep at 09:55:37 > and was attached by the server. > The callback channel is initialized. > --- > > janv. 14 07:20:32 salleprofs13 kernel: PM: suspend entry (deep) > > janv. 14 09:55:37 salleprofs13 kernel: PM: suspend exit > janv. 14 09:55:37 salleprofs13 ModemManager[621]: <info> [sleep-monitor- > systemd] system is resuming > janv. 14 09:55:37 salleprofs13 NetworkManager[576]: <info> > [1736844937.2382] manager: sleep: wake requested (sleeping: yes enabled: > yes) > janv. 14 09:55:37 salleprofs13 systemd[1]: Stopping gssd-bug.service - > Gssd credential cache hook... > janv. 14 09:55:37 salleprofs13 systemd[1]: Starting winbind.service - > Samba Winbind Daemon... > janv. 14 09:55:37 salleprofs13 systemd[1]: Started winbind.service - > Samba Winbind Daemon. > janv. 14 09:55:40 salleprofs13 NetworkManager[576]: <info> > [1736844940.6304] device (enp2s0): carrier: link connected > janv.
14 09:55:40 salleprofs13 kernel: r8169 0000:02:00.0 enp2s0: Link > is Up - 1Gbps/Full - flow control off > > nfsd-1695 [047] ..... 190512.862590: nfsd_cb_probe: > addr=172.16.118.121:0 client 67833e73:4dd32903 state=UNKNOWN > kworker/u96:8-225529 [047] ..... 190512.862782: nfsd_cb_setup: > addr=172.16.118.121:0 client 67833e73:4dd32903 proto=tcp flavor=sys > kworker/u96:8-225529 [047] ..... 190512.862784: nfsd_cb_state: > addr=172.16.118.121:0 client 67833e73:4dd32903 state=UP > 2025-01-14T09:55:48.290546+01:00 fichdc02 rpc.mountd[1363]: v4.2 client > attached: 0x4dd3290367833e73 from "172.16.118.121:825" > > > --- > Someone logon on the station. > Immediately the server start sending nfsd_cb_recall_any. > Strange no ? What can be recalled on that host that just > wake up... The server attempts to invoke CB_RECALL_ANY on all clients it knows about. If it believes the client has state (even when in courtesy) then the server will send this request to that client. > --- > > janv. 14 09:55:53 salleprofs13 nfsrahead[121742]: setting /dnfs/wine > readahead to 128 > janv. 14 09:56:13 salleprofs13 systemd-logind[569]: New session 36 of > user xxxxxxxx. > > kworker/u96:3-227781 [047] ..... 190541.974364: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:2-220501 [013] ..... 190541.977732: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > kworker/u96:2-220501 [000] ..... 190550.387742: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:0-228547 [003] ..... 190550.389602: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > kworker/u96:2-220501 [021] ..... 190565.367759: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:1-230409 [000] ..... 190565.371391: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > kworker/u96:2-220501 [013] ..... 
190579.748430: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:8-225529 [026] ..... 190579.749946: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > kworker/u96:8-225529 [013] ..... 190592.749738: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:2-220501 [001] ..... 190592.751192: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > kworker/u96:0-228547 [000] ..... 190599.849333: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:0-228547 [000] ..... 190599.850872: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > kworker/u96:2-220501 [045] ..... 190686.760138: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:1-230409 [014] ..... 190686.761181: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > kworker/u96:1-230409 [013] ..... 190694.103480: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:8-225529 [022] ..... 190694.106120: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > kworker/u96:8-225529 [000] ..... 190749.678378: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:2-220501 [007] ..... 190749.679870: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > kworker/u96:4-230705 [014] ..... 190782.830144: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:1-230409 [010] ..... 190782.831220: nfsd_cb_recall_any_done: > client 67833e73:4dd32903 status=0 > > > --- > Next the user suspend the host maybe by pressing > the power button without login out so the last > nfsd_cb_recall_any fail > Maybe the "unrecognized reply" related ? > --- > > janv. 
14 10:01:18 salleprofs13 systemd-logind[569]: The system will > suspend now! > janv. 14 10:01:18 salleprofs13 NetworkManager[576]: <info> > [1736845278.5631] manager: sleep: sleep requested (sleeping: no enabled: > yes) > janv. 14 10:01:18 salleprofs13 ModemManager[621]: <info> [sleep-monitor- > systemd] system is about to suspend > janv. 14 10:01:20 salleprofs13 kernel: PM: suspend entry (deep) > > kworker/u96:2-220501 [047] ..... 190851.843698: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > kworker/u96:2-220501 [047] ..... 190861.062828: nfsd_cb_state: > addr=172.16.118.121:0 client 67833e73:4dd32903 state=FAULT > > 2025-01-14T10:06:06.586140+01:00 fichdc02 kernel: [191131.106081] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 00000000ad57929d xid 52751431 > > > --- > Got 5 unrecognized reply. > Seems not related . > --- > > 2025-01-14T10:06:56.762137+01:00 fichdc02 kernel: [191181.279788] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 000000002dcd99c8 xid 5d819d7d > 2025-01-14T10:06:56.762197+01:00 fichdc02 kernel: [191181.280005] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 000000002dcd99c8 xid 61819d7d > 2025-01-14T10:06:56.762201+01:00 fichdc02 kernel: [191181.280054] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 000000002dcd99c8 xid 60819d7d > 2025-01-14T10:06:56.762206+01:00 fichdc02 kernel: [191181.280223] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 000000002dcd99c8 xid 5f819d7d > 2025-01-14T10:06:56.762209+01:00 fichdc02 kernel: [191181.280364] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 000000002dcd99c8 xid 5e819d7d > > > --- > Just 10 minutes after suspend the host wake up (with xxxxxxxx still > logged on). > The gssd-bug.service is a custom script that kill all the user processes > if the kerberos cache file has expired. 
This is due to a rpc.gssd bug > that hammers the server when the cache has expired during suspend. > It runs "killall -u xxxxxxxx". But here, I don't know if the killall > was called or not. > > Just after the wake, the server again sends an nfsd_cb_recall_any. But > this time > there is no nfsd_cb_state with state=FAULT. From this point there are no more > kworker > cb calls. I think this is where the hang starts. I'm thinking the hang started a little earlier, but I agree that this is the first symptom of it. I was attempting to reproduce by periodically dropping the connection, but it appears there needs to be one or two more steps to make it behave like a suspended client. I will see what can be done. > And we again got an unrecognized reply. > --- > > janv. 14 10:10:24 salleprofs13 kernel: PM: suspend exit > janv. 14 10:10:24 salleprofs13 systemd-sleep[124910]: System returned > from sleep state. > janv. 14 10:10:24 salleprofs13 systemd[1]: Stopping gssd-bug.service - > Gssd credential cache hook... > janv. 14 10:10:24 salleprofs13 NetworkManager[576]: <info> > [1736845824.4117] manager: sleep: wake requested (sleeping: yes enabled: > yes) > janv. 14 10:10:24 salleprofs13 check-gssd-cache[124962]: id: > « xxxxxxxx » : the user don't exist > janv. 14 10:10:24 salleprofs13 systemd[1]: Started winbind.service - > Samba Winbind Daemon. > janv. 14 10:10:27 salleprofs13 kernel: r8169 0000:02:00.0 enp2s0: Link > is Up - 1Gbps/Full - flow control off > janv. 14 10:10:27 salleprofs13 NetworkManager[576]: <info> > [1736845827.6412] device (enp2s0): carrier: link connected > janv. 14 10:10:35 salleprofs13 systemd[1]: Stopped gssd-bug.service - > Gssd credential cache hook. > > kworker/u96:2-220501 [012] .....
191417.544355: nfsd_cb_recall_any: > addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG > > 2025-01-14T10:10:58.426113+01:00 fichdc02 kernel: [191422.934976] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 000000007205cdb2 xid ee91a9d8 > > > --- > nfsd probes a failed state. Normal. > But as the kworker hangs, it can't reinit the callback channel > after the nfsd_cb_state "state=UNKNOWN". > > I don't know why, but nfsd also started > to send nfsd_cb_recall... > Note this is less than 120 seconds before the "BLOCKED" message, > so the thread may already be locked. > --- > > nfsd-1690 [014] ..... 191424.483249: nfsd_cb_probe: > addr=172.16.118.121:0 client 67833e73:4dd32903 state=FAULT > nfsd-1690 [014] ..... 191424.483252: nfsd_cb_state: > addr=172.16.118.121:0 client 67833e73:4dd32903 state=UNKNOWN > nfsd-1695 [013] ...1. 191441.265047: nfsd_cb_recall: > addr=172.16.118.121:0 client 67833e73:4dd32903 stateid 00001c6e:00000001 > nfsd-1695 [014] ...1. 191441.646104: nfsd_cb_recall: > addr=172.16.118.121:0 client 67833e73:4dd32903 stateid 000015fc:00000001 > > > --- > Got some strange logs on the client, > and 7 unrecognized replies on the server that seem unrelated. > --- > > janv. 14 10:11:00 salleprofs13 PackageKit[122249]: daemon quit > janv. 14 10:11:00 salleprofs13 systemd[1]: packagekit.service: > Deactivated successfully. > janv. 14 10:11:00 salleprofs13 systemd[1]: packagekit.service: Consumed > 28.223s CPU time. > janv.
14 10:11:02 salleprofs13 gnome-shell[122127]: JS ERROR: > Gio.IOErrorEnum: Erreur lors de l’appel de StartServiceByName pour > org.vmware.viewagent.Credentials : Le délai d’> > > _injectToMethod/klass[method]@resource:///org/gnome/gjs/modules/core/ > overrides/Gio.js:287:25 > > VmwareCredentials@resource:///org/gnome/shell/gdm/vmware.js:34:10 > > VmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:41:29 > > getVmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:51:37 > > ShellUserVerifier@resource:///org/gnome/shell/gdm/util.js:183:64 > > _init@resource:///org/gnome/shell/gdm/authPrompt.js:74:30 > > AuthPrompt@resource:///org/gnome/shell/gdm/authPrompt.js:51:4 > > _ensureAuthPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:686:32 > > _showPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:725:14 > > vfunc_key_press_event@resource:///org/gnome/shell/ui/unlockDialog.js:620:14 > janv. 14 10:11:05 salleprofs13 gsd-power[122329]: Error setting property > 'PowerSaveMode' on interface org.gnome.Mutter.DisplayConfig: Le délai > d’attente est dépassé (g-io-erro> > janv. 
14 10:11:27 salleprofs13 gnome-shell[122127]: JS ERROR: > Gio.IOErrorEnum: Erreur lors de l’appel de StartServiceByName pour > org.vmware.viewagent.Credentials : Le délai d’> > > _injectToMethod/klass[method]@resource:///org/gnome/gjs/modules/core/ > overrides/Gio.js:287:25 > > VmwareCredentials@resource:///org/gnome/shell/gdm/vmware.js:34:10 > > VmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:41:29 > > getVmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:51:37 > > ShellUserVerifier@resource:///org/gnome/shell/gdm/util.js:183:64 > > _init@resource:///org/gnome/shell/gdm/authPrompt.js:74:30 > > AuthPrompt@resource:///org/gnome/shell/gdm/authPrompt.js:51:4 > > _ensureAuthPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:686:32 > > _showPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:725:14 > > vfunc_key_press_event@resource:///org/gnome/shell/ui/unlockDialog.js:620:14 > > > > 2025-01-14T10:12:36.730114+01:00 fichdc02 kernel: [191521.235822] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 00000000191d4459 xid d591b1ee > 2025-01-14T10:12:36.730165+01:00 fichdc02 kernel: [191521.236547] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 00000000191d4459 xid d691b1ee > 2025-01-14T10:12:36.730169+01:00 fichdc02 kernel: [191521.236666] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 00000000191d4459 xid da91b1ee > 2025-01-14T10:12:36.730171+01:00 fichdc02 kernel: [191521.236743] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 00000000191d4459 xid db91b1ee > 2025-01-14T10:12:36.730172+01:00 fichdc02 kernel: [191521.236769] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 00000000191d4459 xid d991b1ee > 2025-01-14T10:12:36.730174+01:00 fichdc02 kernel: [191521.236824] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 00000000191d4459 xid d891b1ee > 2025-01-14T10:12:36.730177+01:00 fichdc02 kernel: 
[191521.236840] > receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt > 00000000191d4459 xid d791b1ee > > > --- > And finally the hang syslog messages > --- > > 2025-01-14T10:14:29.370164+01:00 fichdc02 kernel: [191633.871204] INFO: > task nfsd:1690 blocked for more than 120 seconds. > 2025-01-14T10:14:29.374846+01:00 fichdc02 kernel: [191633.875410] INFO: > task nfsd:1691 blocked for more than 120 seconds. > 2025-01-14T10:14:29.374966+01:00 fichdc02 kernel: [191633.878106] INFO: > task nfsd:1693 blocked for more than 120 seconds. > 2025-01-14T10:14:29.378303+01:00 fichdc02 kernel: [191633.880748] INFO: > task nfsd:1695 blocked for more than 120 seconds. > > > Hope this helps. > > I'm still recording. > > Regards, > > Baptiste. > > -- Chuck Lever ^ permalink raw reply [flat|nested] 30+ messages in thread
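[Editor's illustration] The in-flight callback accounting discussed in the message above -- client destruction blocks in nfsd41_cb_inflight_wait_complete() until the client's cl_cb_inflight count drops to zero, so one callback whose completion never runs leaves the nfsd thread stuck in D state -- can be sketched as a small userspace model. This is an illustrative analogue only, not the kernel code: `struct cb_client`, the `cb_inflight_*` helpers, and `run_demo()` are invented names standing in for the NFSD internals.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Toy analogue of the nfs4_client's cl_cb_inflight accounting. */
struct cb_client {
	atomic_int inflight;	/* like cl_cb_inflight */
	pthread_mutex_t lock;
	pthread_cond_t zero;	/* stands in for the wait_var_event waitqueue */
};

/* Taken before a callback RPC is sent. */
static void cb_inflight_begin(struct cb_client *clp)
{
	atomic_fetch_add(&clp->inflight, 1);
}

/* Dropped when that RPC completes; wakes the waiter on the last drop. */
static void cb_inflight_end(struct cb_client *clp)
{
	pthread_mutex_lock(&clp->lock);
	if (atomic_fetch_sub(&clp->inflight, 1) == 1)
		pthread_cond_broadcast(&clp->zero);
	pthread_mutex_unlock(&clp->lock);
}

/* Analogue of nfsd41_cb_inflight_wait_complete(): block until zero. */
static void cb_inflight_wait_complete(struct cb_client *clp)
{
	pthread_mutex_lock(&clp->lock);
	while (atomic_load(&clp->inflight) > 0)
		pthread_cond_wait(&clp->zero, &clp->lock);
	pthread_mutex_unlock(&clp->lock);
}

/* Demo: a balanced begin/end pair lets the waiter return. A begin with
 * no matching end (a callback whose completion is lost) would leave
 * inflight > 0 and the waiter blocked forever -- the hang seen above. */
static int run_demo(void)
{
	struct cb_client clp = {
		.inflight = 0,
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.zero = PTHREAD_COND_INITIALIZER,
	};

	cb_inflight_begin(&clp);	/* e.g. a CB_RECALL_ANY is queued */
	cb_inflight_end(&clp);		/* ...and its RPC completes */
	cb_inflight_wait_complete(&clp);
	return atomic_load(&clp.inflight);
}
```

In the balanced case run_demo() returns 0 and the waiter exits immediately. If cb_inflight_end() were skipped for one callback -- say, a CB_RECALL_ANY whose RPC is lost while the client is suspended -- the count would stay at 1 and cb_inflight_wait_complete() would never return, matching the hung-task traces in this thread.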
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-09 11:56 ` Christian Herzog 2025-01-09 12:42 ` Jeff Layton @ 2025-01-09 15:49 ` Chuck Lever 2025-01-09 15:58 ` Christian Herzog 1 sibling, 1 reply; 30+ messages in thread From: Chuck Lever @ 2025-01-09 15:49 UTC (permalink / raw) To: Christian Herzog Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Pellegrin Baptiste On 1/9/25 6:56 AM, Christian Herzog wrote: > Dear Chuck, > > On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: >> On 1/8/25 9:54 AM, Christian Herzog wrote: >>> Dear Chuck, >>> >>> On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: >>>> On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: >>>>> Hi Chuck, >>>>> >>>>> Thanks for your time on this, much appreciated. >>>>> >>>>> On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: >>>>>> On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: >>>>>>> Hi Chuck, hi all, >>>>>>> >>>>>>> [it was not ideal to pick one of the message for this followup, let me >>>>>>> know if you want a complete new thread, adding as well Benjamin and >>>>>>> Trond as they are involved in one mentioned patch] >>>>>>> >>>>>>> On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: >>>>>>>>> >>>>>>>>> Hi folks, >>>>>>>>> >>>>>>>>> what would be the reason for nfsd getting stuck somehow and becoming >>>>>>>>> an unkillable process? See >>>>>>>>> >>>>>>>>> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 >>>>>>>>> - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 >>>>>>>>> >>>>>>>>> Doesn't this mean that something inside the kernel gets stuck as >>>>>>>>> well? Seems odd to me. >>>>>>>> >>>>>>>> I'm not familiar with the Debian or Ubuntu kernel packages. 
Can >>>>>>>> the kernel release numbers be translated to LTS kernel releases >>>>>>>> please? Need both "last known working" and "first broken" releases. >>>>>>>> >>>>>>>> This: >>>>>>>> >>>>>>>> [ 6596.911785] RPC: Could not send backchannel reply error: -110 >>>>>>>> [ 6596.972490] RPC: Could not send backchannel reply error: -110 >>>>>>>> [ 6837.281307] RPC: Could not send backchannel reply error: -110 >>>>>>>> >>>>>>>> is a known set of client backchannel bugs. Knowing the LTS kernel >>>>>>>> releases (see above) will help us figure out what needs to be >>>>>>>> backported to the LTS kernels in question. >>>>>>>> >>>>>>>> This: >>>>>>>> >>>>>>>> [11183.290619] wait_for_completion+0x88/0x150 >>>>>>>> [11183.290623] __flush_workqueue+0x140/0x3e0 >>>>>>>> [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >>>>>>>> [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >>>>>>>> >>>>>>>> is probably related to the backchannel errors on the client, but >>>>>>>> client bugs shouldn't cause the server to hang like this. We >>>>>>>> might be able to say more if you can provide the kernel release >>>>>>>> translations (see above). >>>>>>> >>>>>>> In Debian we still have the bug #1071562 open, and one person notified >>>>>>> me offlist that it appears the issue gets more frequent since >>>>>>> they updated on the NFS client side from Ubuntu 20.04 to Debian bookworm >>>>>>> with a 6.1.y based kernel. >>>>>>> >>>>>>> Some people around those issues seem to claim that the change >>>>>>> mentioned in >>>>>>> https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html >>>>>>> would fix the issue, which is backchannel related as well. >>>>>>> >>>>>>> This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, >>>>>>> again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the >>>>>>> nfs_client's rpc timeouts for backchannel"), this is not something >>>>>>> which goes back to 6.1.y. Could it be possible that the backchannel >>>>>>> refactoring and this final fix indeed fixes the issue? >>>>>>> >>>>>>> As people report, it is not easily reproducible, so this makes it >>>>>>> harder to identify fixes correctly. >>>>>>> >>>>>>> I made a (short) attempt at backporting commits up to >>>>>>> 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly >>>>>>> seems to indicate it is probably still not the right thing for >>>>>>> backporting to the older stable series. >>>>>>> >>>>>>> As at least pre-requisites: >>>>>>> >>>>>>> 2009e32997ed568a305cf9bc7bf27d22e0f6ccda >>>>>>> 4119bd0306652776cb0b7caa3aea5b2a93aecb89 >>>>>>> 163cdfca341b76c958567ae0966bd3575c5c6192 >>>>>>> f4afc8fead386c81fda2593ad6162271d26667f8 >>>>>>> 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc >>>>>>> 57331a59ac0d680f606403eb24edd3c35aecba31 >>>>>>> >>>>>>> and still there would be conflicting codepaths (and it does not seem >>>>>>> right). >>>>>>> >>>>>>> Chuck, Benjamin, Trond, is there anything we can provide on the reporters' >>>>>>> side so that we can try to tackle this issue better? >>>>>> >>>>>> As I've indicated before, NFSD should not block no matter what the >>>>>> client may or may not be doing. I'd like to focus on the server first. >>>>>> >>>>>> What is the result of: >>>>>> >>>>>> $ cd <Bookworm's v6.1.90 kernel source > >>>>>> $ unset KBUILD_OUTPUT >>>>>> $ make -j `nproc` >>>>>> $ scripts/faddr2line \ >>>>>> fs/nfsd/nfs4state.o \ >>>>>> nfsd4_destroy_session+0x16d >>>>>> >>>>>> Since this issue appeared after v6.1.1, is it possible to bisect >>>>>> between v6.1.1 and v6.1.90 ? >>>>> >>>>> First please note, at least speaking of triggering the issue in >>>>> Debian, Debian has already moved to a 6.1.119 based kernel (and in >>>>> the weekend's point release will soon move to 6.1.123).
>>>>> >>>>> That said, one of the users which regularly seems to be hit by the >>>>> issue was able to provide the above requested information, based for >>>>> 6.1.119: >>>>> >>>>> ~/kernel/linux-source-6.1# make kernelversion >>>>> 6.1.119 >>>>> ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d >>>>> nfsd4_destroy_session+0x16d/0x250: >>>>> __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 >>>>> (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 >>>>> (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 >>>>> (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 >>>>> >>>>> They could provide as well a decode_stacktrace output for the recent >>>>> hit (if that is helpful for you): >>>>> >>>>> [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds. >>>>> [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 >>>>> [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
>>>>> [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 >>>>> [Mon Jan 6 13:25:28 2025] Call Trace: >>>>> [Mon Jan 6 13:25:28 2025] <TASK> >>>>> [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 >>>>> [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 >>>>> [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 >>>>> [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 >>>>> [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 >>>>> [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd >>>>> [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd >>>>> [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd >>>>> [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc >>>>> [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc >>>>> [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd >>>>> [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd >>>>> [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc >>>>> [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd >>>>> [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 >>>>> [Mon Jan 6 13:25:28 2025] ? kthread_complete_and_exit+0x20/0x20 >>>>> [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 >>>>> [Mon Jan 6 13:25:28 2025] </TASK> >>>>> >>>>> The question about bisection is actually harder: those are production >>>>> systems and I understand it's not possible to bisect there, while >>>>> unfortunately we are not reproducing the issue under "lab conditions".
>>>>> >>>>> Does the above give us still a clue on what you were looking for? >>>> >>>> Thanks for the update. >>>> >>>> It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct >>>> nfs4_client"), while not a specific fix for this issue, might offer some >>>> relief by preventing the DESTROY_SESSION hang from affecting all other >>>> clients of the degraded server. >>>> >>>> Not having a reproducer or the ability to bisect puts a damper on >>>> things. The next step, then, is to enable tracing on servers where this >>>> issue can come up, and wait for the hang to occur. The following command >>>> captures information only about callback operation, so it should not >>>> generate much data or impact server performance. >>>> >>>> # trace-cmd record -e nfsd:nfsd_cb\* >>>> >>>> Let that run until the problem occurs, then ^C, compress the resulting >>>> trace.dat file, and either attach it to 1071562 or email it to me >>>> privately. >>> thanks for the follow-up. >>> >>> I am the "customer" with two affected file servers. We have since moved those >>> servers to the backports kernel (6.11.10) in the hope of forward fixing the >>> issue. If this kernel is stable, I'm afraid I can't go back to the 'bad' >>> kernel (700+ researchers affected..), and we're also not able to trigger the >>> issue on our test environment. >> >> Hello Dr. Herzog - >> >> If the problem recurs on 6.11, the trace-cmd I suggest above works >> there as well. > the bad news is: it just happened again with the bpo kernel. > > We then immediately started trace-cmd since usually there are several call > traces in a row and we did get a trace.dat: > http://people.phys.ethz.ch/~daduke/trace.dat I'd prefer to have the trace running from before the first occurrence of the problem. Even better would be to get it started before the first client mounts the server. But I will have a look at your trace soon. It would be extremely helpful if we could reproduce this problem in our own labs. 
> we did notice however that the stack trace looked a bit different this time: > > INFO: task nfsd:2566 blocked for more than 5799 seconds. > Tainted: G W 6.11.10+bpo-amd64 #1 Debia> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables t> > task:nfsd state:D stack:0 pid:2566 tgid:2566 > > Call Trace: > <TASK> > __schedule+0x400/0xad0 > schedule+0x27/0xf0 > nfsd4_shutdown_callback+0xfe/0x140 [nfsd] > ? __pfx_var_wake_function+0x10/0x10 > __destroy_client+0x1f0/0x290 [nfsd] > nfsd4_destroy_clientid+0xf1/0x1e0 [nfsd] > ? svcauth_unix_set_client+0x586/0x5f0 [sunrpc] > nfsd4_proc_compound+0x34d/0x670 [nfsd] > nfsd_dispatch+0xfd/0x220 [nfsd] > svc_process_common+0x2f7/0x700 [sunrpc] > ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd] > svc_process+0x131/0x180 [sunrpc] > svc_recv+0x830/0xa10 [sunrpc] > ? __pfx_nfsd+0x10/0x10 [nfsd] > nfsd+0x87/0xf0 [nfsd] > kthread+0xcf/0x100 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x31/0x50 > ? __pfx_kthread+0x10/0x10 > ret_from_fork_asm+0x1a/0x30 > </TASK> > > and also the state of the offending client in `/proc/fs/nfsd/clients/*/info` > used to be callback state: UNKNOWN while now it is DOWN or FAULT. No idea > what it means, but I thought it was worth mentioning. -- Chuck Lever ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-09 15:49 ` Chuck Lever @ 2025-01-09 15:58 ` Christian Herzog 2025-01-09 16:09 ` Chuck Lever 0 siblings, 1 reply; 30+ messages in thread From: Christian Herzog @ 2025-01-09 15:58 UTC (permalink / raw) To: Chuck Lever Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Pellegrin Baptiste On Thu, Jan 09, 2025 at 10:49:37AM -0500, Chuck Lever wrote: > On 1/9/25 6:56 AM, Christian Herzog wrote: > > Dear Chuck, > > > > On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: > > > On 1/8/25 9:54 AM, Christian Herzog wrote: > > > > Dear Chuck, > > > > > > > > On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: > > > > > On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: > > > > > > Hi Chuck, > > > > > > > > > > > > Thanks for your time on this, much appreciated. > > > > > > > > > > > > On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: > > > > > > > On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: > > > > > > > > Hi Chuck, hi all, > > > > > > > > > > > > > > > > [it was not ideal to pick one of the message for this followup, let me > > > > > > > > know if you want a complete new thread, adding as well Benjamin and > > > > > > > > Trond as they are involved in one mentioned patch] > > > > > > > > > > > > > > > > On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: > > > > > > > > > > > > > > > > > > > > Hi folks, > > > > > > > > > > > > > > > > > > > > what would be the reason for nfsd getting stuck somehow and becoming > > > > > > > > > > an unkillable process? 
See > > > > > > > > > > > > > > > > > > > > - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 > > > > > > > > > > - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 > > > > > > > > > > > > > > > > > > > > Doesn't this mean that something inside the kernel gets stuck as > > > > > > > > > > well? Seems odd to me. > > > > > > > > > > > > > > > > > > I'm not familiar with the Debian or Ubuntu kernel packages. Can > > > > > > > > > the kernel release numbers be translated to LTS kernel releases > > > > > > > > > please? Need both "last known working" and "first broken" releases. > > > > > > > > > > > > > > > > > > This: > > > > > > > > > > > > > > > > > > [ 6596.911785] RPC: Could not send backchannel reply error: -110 > > > > > > > > > [ 6596.972490] RPC: Could not send backchannel reply error: -110 > > > > > > > > > [ 6837.281307] RPC: Could not send backchannel reply error: -110 > > > > > > > > > > > > > > > > > > is a known set of client backchannel bugs. Knowing the LTS kernel > > > > > > > > > releases (see above) will help us figure out what needs to be > > > > > > > > > backported to the LTS kernels kernels in question. > > > > > > > > > > > > > > > > > > This: > > > > > > > > > > > > > > > > > > [11183.290619] wait_for_completion+0x88/0x150 > > > > > > > > > [11183.290623] __flush_workqueue+0x140/0x3e0 > > > > > > > > > [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] > > > > > > > > > [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] > > > > > > > > > > > > > > > > > > is probably related to the backchannel errors on the client, but > > > > > > > > > client bugs shouldn't cause the server to hang like this. We > > > > > > > > > might be able to say more if you can provide the kernel release > > > > > > > > > translations (see above). 
> > > > > > > > > > > > > > > > In Debian we still have the bug #1071562 open and one person notified > > > > > > > > me offlist that it appears that the issue gets more frequent since > > > > > > > > they updated on NFS client side from Ubuntu 20.04 to Debian bookworm > > > > > > > > with a 6.1.y based kernel). > > > > > > > > > > > > > > > > Some people around those issues, seem to claim that the change > > > > > > > > mentioned in > > > > > > > > https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html > > > > > > > > would fix the issue, which is as well backchannel related. > > > > > > > > > > > > > > > > This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, > > > > > > > > again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the > > > > > > > > nfs_client's rpc timeouts for backchannel") this is not something > > > > > > > > which goes back to 6.1.y, could it be possible that the backchannel > > > > > > > > refactoring and this final fix indeed fixes the issue? > > > > > > > > > > > > > > > > As people report it is not easily reproducible, so this makes it > > > > > > > > harder to identify fixes correctly. > > > > > > > > > > > > > > > > I gave a (short) stance on trying to backport commits up to > > > > > > > > 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly > > > > > > > > seems to indicate it is probably still not the right thing for > > > > > > > > backporting to the older stable series. 
> > > > > > > > > > > > > > > > As at least pre-requisites: > > > > > > > > > > > > > > > > 2009e32997ed568a305cf9bc7bf27d22e0f6ccda > > > > > > > > 4119bd0306652776cb0b7caa3aea5b2a93aecb89 > > > > > > > > 163cdfca341b76c958567ae0966bd3575c5c6192 > > > > > > > > f4afc8fead386c81fda2593ad6162271d26667f8 > > > > > > > > 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc > > > > > > > > 57331a59ac0d680f606403eb24edd3c35aecba31 > > > > > > > > > > > > > > > > and still there would be conflicting codepaths (and does not seem > > > > > > > > right). > > > > > > > > > > > > > > > > Chuck, Benjamin, Trond, is there anything we can provide on reporters > > > > > > > > side that we can try to tackle this issue better? > > > > > > > > > > > > > > As I've indicated before, NFSD should not block no matter what the > > > > > > > client may or may not be doing. I'd like to focus on the server first. > > > > > > > > > > > > > > What is the result of: > > > > > > > > > > > > > > $ cd <Bookworm's v6.1.90 kernel source > > > > > > > > $ unset KBUILD_OUTPUT > > > > > > > $ make -j `nproc` > > > > > > > $ scripts/faddr2line \ > > > > > > > fs/nfsd/nfs4state.o \ > > > > > > > nfsd4_destroy_session+0x16d > > > > > > > > > > > > > > Since this issue appeared after v6.1.1, is it possible to bisect > > > > > > > between v6.1.1 and v6.1.90 ? > > > > > > First please note, at least speaking of triggering the issue in > > > > > > Debian, Debian has moved to 6.1.119 based kernel already (and soon in > > > > > > the weekends point release move to 6.1.123). 
> > > > > > > > > > > > That said, one of the users which regularly seems to be hit by the > > > > > > issue was able to provide the above requested information, based for > > > > > > 6.1.119: > > > > > > > > > > > > ~/kernel/linux-source-6.1# make kernelversion > > > > > > 6.1.119 > > > > > > ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d > > > > > > nfsd4_destroy_session+0x16d/0x250: > > > > > > __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 > > > > > > (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 > > > > > > (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 > > > > > > (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 > > > > > > > > > > > > They could provide as well a decode_stacktrace output for the recent > > > > > > hit (if that is helpful for you): > > > > > > > > > > > > [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds. > > > > > > [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 > > > > > > [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> > > > > > [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 > > > > > > [Mon Jan 6 13:25:28 2025] Call Trace: > > > > > > [Mon Jan 6 13:25:28 2025] <TASK> > > > > > > [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 > > > > > > [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 > > > > > > [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 > > > > > > [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 > > > > > > [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 > > > > > > [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd > > > > > > [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd > > > > > > [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd > > > > > > [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc > > > > > > [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc > > > > > > [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd > > > > > > [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd > > > > > > [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc > > > > > > [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd > > > > > > [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 > > > > > > [Mon Jan 6 13:25:28 2025] ? 
kthread_complete_and_exit+0x20/0x20 > > > > > > [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 > > > > > > [Mon Jan 6 13:25:28 2025] </TASK> > > > > > > > > > > > > The question about bisection is actually harder, those are production > > > > > > systems and I understand it's not possible to do bisect there, while > > > > > > unfortunately not reproducing the issue on "lab conditions". > > > > > > > > > > > > Does the above give us still a clue on what you were looking for? > > > > > Thanks for the update. > > > > > > > > > > It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct > > > > > nfs4_client"), while not a specific fix for this issue, might offer some > > > > > relief by preventing the DESTROY_SESSION hang from affecting all other > > > > > clients of the degraded server. > > > > > > > > > > Not having a reproducer or the ability to bisect puts a damper on > > > > > things. The next step, then, is to enable tracing on servers where this > > > > > issue can come up, and wait for the hang to occur. The following command > > > > > captures information only about callback operation, so it should not > > > > > generate much data or impact server performance. > > > > > > > > > > # trace-cmd record -e nfsd:nfsd_cb\* > > > > > > > > > > Let that run until the problem occurs, then ^C, compress the resulting > > > > > trace.dat file, and either attach it to 1071562 or email it to me > > > > > privately. > > > > thanks for the follow-up. > > > > > > > > I am the "customer" with two affected file servers. We have since moved those > > > > servers to the backports kernel (6.11.10) in the hope of forward fixing the > > > > issue. If this kernel is stable, I'm afraid I can't go back to the 'bad' > > > > kernel (700+ researchers affected..), and we're also not able to trigger the > > > > issue on our test environment. > > > > > > Hello Dr. 
Herzog - > > > > > > If the problem recurs on 6.11, the trace-cmd I suggest above works > > > there as well. > > the bad news is: it just happened again with the bpo kernel. > > > > We then immediately started trace-cmd since usually there are several call > > traces in a row and we did get a trace.dat: > > http://people.phys.ethz.ch/~daduke/trace.dat > > I'd prefer to have the trace running from before the first occurrence > of the problem. Even better would be to get it started before the > first client mounts the server. > > But I will have a look at your trace soon. before you invest significant time in the trace: there is a chance it's a > false alarm: we do see nfsd in D, but they're not spreading like they used to. > And since someone is pulling 400 Mb/s from that server via globus gridftp, > it might just be real IO overload. We're still trying to figure out what we're > dealing with... > It would be extremely helpful if we could reproduce this problem in > our own labs. easy: just hire 700 physicists and let 'em loose thanks, -Christian
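The hung-task reports traded back and forth in this thread all share one fixed dmesg shape ("INFO: task <comm>:<pid> blocked for more than <N> seconds."). A small helper along these lines (an illustrative sketch; the function name is invented, not part of the thread) can pull the stuck task, PID, and blocked time out of a captured log, which helps when correlating recurrences across servers:

```python
import re

# Matches hung-task reports such as:
#   INFO: task nfsd:2566 blocked for more than 5799 seconds.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

def find_hung_tasks(dmesg_text):
    """Return (comm, pid, seconds) for each hung-task report in the log."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_RE.finditer(dmesg_text)
    ]
```

Fed the two reports quoted in this thread, it would yield `("nfsd", 2566, 5799)` and `("nfsd", 55306, 6883)`.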
* Re: nfsd blocks indefinitely in nfsd4_destroy_session 2025-01-09 15:58 ` Christian Herzog @ 2025-01-09 16:09 ` Chuck Lever 0 siblings, 0 replies; 30+ messages in thread From: Chuck Lever @ 2025-01-09 16:09 UTC (permalink / raw) To: Christian Herzog Cc: Salvatore Bonaccorso, Benjamin Coddington, Trond Myklebust, Harald Dunkel, Linux NFS Mailing List, Martin Svec, Michael Gernoth, Pellegrin Baptiste On 1/9/25 10:58 AM, Christian Herzog wrote: > On Thu, Jan 09, 2025 at 10:49:37AM -0500, Chuck Lever wrote: >> On 1/9/25 6:56 AM, Christian Herzog wrote: >>> Dear Chuck, >>> >>> On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote: >>>> On 1/8/25 9:54 AM, Christian Herzog wrote: >>>>> Dear Chuck, >>>>> >>>>> On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote: >>>>>> On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote: >>>>>>> Hi Chuck, >>>>>>> >>>>>>> Thanks for your time on this, much appreciated. >>>>>>> >>>>>>> On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote: >>>>>>>> On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote: >>>>>>>>> Hi Chuck, hi all, >>>>>>>>> >>>>>>>>> [it was not ideal to pick one of the message for this followup, let me >>>>>>>>> know if you want a complete new thread, adding as well Benjamin and >>>>>>>>> Trond as they are involved in one mentioned patch] >>>>>>>>> >>>>>>>>> On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi folks, >>>>>>>>>>> >>>>>>>>>>> what would be the reason for nfsd getting stuck somehow and becoming >>>>>>>>>>> an unkillable process? See >>>>>>>>>>> >>>>>>>>>>> - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562 >>>>>>>>>>> - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568 >>>>>>>>>>> >>>>>>>>>>> Doesn't this mean that something inside the kernel gets stuck as >>>>>>>>>>> well? Seems odd to me. 
>>>>>>>>>> >>>>>>>>>> I'm not familiar with the Debian or Ubuntu kernel packages. Can >>>>>>>>>> the kernel release numbers be translated to LTS kernel releases >>>>>>>>>> please? Need both "last known working" and "first broken" releases. >>>>>>>>>> >>>>>>>>>> This: >>>>>>>>>> >>>>>>>>>> [ 6596.911785] RPC: Could not send backchannel reply error: -110 >>>>>>>>>> [ 6596.972490] RPC: Could not send backchannel reply error: -110 >>>>>>>>>> [ 6837.281307] RPC: Could not send backchannel reply error: -110 >>>>>>>>>> >>>>>>>>>> is a known set of client backchannel bugs. Knowing the LTS kernel >>>>>>>>>> releases (see above) will help us figure out what needs to be >>>>>>>>>> backported to the LTS kernels kernels in question. >>>>>>>>>> >>>>>>>>>> This: >>>>>>>>>> >>>>>>>>>> [11183.290619] wait_for_completion+0x88/0x150 >>>>>>>>>> [11183.290623] __flush_workqueue+0x140/0x3e0 >>>>>>>>>> [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd] >>>>>>>>>> [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd] >>>>>>>>>> >>>>>>>>>> is probably related to the backchannel errors on the client, but >>>>>>>>>> client bugs shouldn't cause the server to hang like this. We >>>>>>>>>> might be able to say more if you can provide the kernel release >>>>>>>>>> translations (see above). >>>>>>>>> >>>>>>>>> In Debian we still have the bug #1071562 open and one person notified >>>>>>>>> me offlist that it appears that the issue gets more frequent since >>>>>>>>> they updated on NFS client side from Ubuntu 20.04 to Debian bookworm >>>>>>>>> with a 6.1.y based kernel). >>>>>>>>> >>>>>>>>> Some people around those issues, seem to claim that the change >>>>>>>>> mentioned in >>>>>>>>> https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html >>>>>>>>> would fix the issue, which is as well backchannel related. >>>>>>>>> >>>>>>>>> This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, >>>>>>>>> again"). 
While this commit fixes 57331a59ac0d ("NFSv4.1: Use the >>>>>>>>> nfs_client's rpc timeouts for backchannel") this is not something >>>>>>>>> which goes back to 6.1.y, could it be possible that the backchannel >>>>>>>>> refactoring and this final fix indeed fixes the issue? >>>>>>>>> >>>>>>>>> As people report it is not easily reproducible, so this makes it >>>>>>>>> harder to identify fixes correctly. >>>>>>>>> >>>>>>>>> I gave a (short) stance on trying to backport commits up to >>>>>>>>> 6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly >>>>>>>>> seems to indicate it is probably still not the right thing for >>>>>>>>> backporting to the older stable series. >>>>>>>>> >>>>>>>>> As at least pre-requisites: >>>>>>>>> >>>>>>>>> 2009e32997ed568a305cf9bc7bf27d22e0f6ccda >>>>>>>>> 4119bd0306652776cb0b7caa3aea5b2a93aecb89 >>>>>>>>> 163cdfca341b76c958567ae0966bd3575c5c6192 >>>>>>>>> f4afc8fead386c81fda2593ad6162271d26667f8 >>>>>>>>> 6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc >>>>>>>>> 57331a59ac0d680f606403eb24edd3c35aecba31 >>>>>>>>> >>>>>>>>> and still there would be conflicting codepaths (and does not seem >>>>>>>>> right). >>>>>>>>> >>>>>>>>> Chuck, Benjamin, Trond, is there anything we can provide on reporters >>>>>>>>> side that we can try to tackle this issue better? >>>>>>>> >>>>>>>> As I've indicated before, NFSD should not block no matter what the >>>>>>>> client may or may not be doing. I'd like to focus on the server first. >>>>>>>> >>>>>>>> What is the result of: >>>>>>>> >>>>>>>> $ cd <Bookworm's v6.1.90 kernel source > >>>>>>>> $ unset KBUILD_OUTPUT >>>>>>>> $ make -j `nproc` >>>>>>>> $ scripts/faddr2line \ >>>>>>>> fs/nfsd/nfs4state.o \ >>>>>>>> nfsd4_destroy_session+0x16d >>>>>>>> >>>>>>>> Since this issue appeared after v6.1.1, is it possible to bisect >>>>>>>> between v6.1.1 and v6.1.90 ? 
>>>>>>> >>>>>>> First please note, at least speaking of triggering the issue in >>>>>>> Debian, Debian has moved to 6.1.119 based kernel already (and soon in >>>>>>> the weekends point release move to 6.1.123). >>>>>>> >>>>>>> That said, one of the users which regularly seems to be hit by the >>>>>>> issue was able to provide the above requested information, based for >>>>>>> 6.1.119: >>>>>>> >>>>>>> ~/kernel/linux-source-6.1# make kernelversion >>>>>>> 6.1.119 >>>>>>> ~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d >>>>>>> nfsd4_destroy_session+0x16d/0x250: >>>>>>> __list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134 >>>>>>> (inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148 >>>>>>> (inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062 >>>>>>> (inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856 >>>>>>> >>>>>>> They could provide as well a decode_stacktrace output for the recent >>>>>>> hit (if that is helpful for you): >>>>>>> >>>>>>> [Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds. >>>>>>> [Mon Jan 6 13:25:28 2025] Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1 >>>>>>> [Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
>>>>>>> [Mon Jan 6 13:25:28 2025] task:nfsd state:D stack:0 pid:55306 ppid:2 flags:0x00004000 >>>>>>> [Mon Jan 6 13:25:28 2025] Call Trace: >>>>>>> [Mon Jan 6 13:25:28 2025] <TASK> >>>>>>> [Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0 >>>>>>> [Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0 >>>>>>> [Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150 >>>>>>> [Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160 >>>>>>> [Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420 >>>>>>> [Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd >>>>>>> [Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd >>>>>>> [Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd >>>>>>> [Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc >>>>>>> [Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc >>>>>>> [Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd >>>>>>> [Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd >>>>>>> [Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc >>>>>>> [Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd >>>>>>> [Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100 >>>>>>> [Mon Jan 6 13:25:28 2025] ? 
kthread_complete_and_exit+0x20/0x20 >>>>>>> [Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30 >>>>>>> [Mon Jan 6 13:25:28 2025] </TASK> >>>>>>> >>>>>>> The question about bisection is actually harder, those are production >>>>>>> systems and I understand it's not possible to do bisect there, while >>>>>>> unfortunately not reproducing the issue on "lab conditions". >>>>>>> >>>>>>> Does the above give us still a clue on what you were looking for? >>>>>> >>>>>> Thanks for the update. >>>>>> >>>>>> It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct >>>>>> nfs4_client"), while not a specific fix for this issue, might offer some >>>>>> relief by preventing the DESTROY_SESSION hang from affecting all other >>>>>> clients of the degraded server. >>>>>> >>>>>> Not having a reproducer or the ability to bisect puts a damper on >>>>>> things. The next step, then, is to enable tracing on servers where this >>>>>> issue can come up, and wait for the hang to occur. The following command >>>>>> captures information only about callback operation, so it should not >>>>>> generate much data or impact server performance. >>>>>> >>>>>> # trace-cmd record -e nfsd:nfsd_cb\* >>>>>> >>>>>> Let that run until the problem occurs, then ^C, compress the resulting >>>>>> trace.dat file, and either attach it to 1071562 or email it to me >>>>>> privately. >>>>> thanks for the follow-up. >>>>> >>>>> I am the "customer" with two affected file servers. We have since moved those >>>>> servers to the backports kernel (6.11.10) in the hope of forward fixing the >>>>> issue. If this kernel is stable, I'm afraid I can't go back to the 'bad' >>>>> kernel (700+ researchers affected..), and we're also not able to trigger the >>>>> issue on our test environment. >>>> >>>> Hello Dr. Herzog - >>>> >>>> If the problem recurs on 6.11, the trace-cmd I suggest above works >>>> there as well. >>> the bad news is: it just happened again with the bpo kernel. 
>>> >>> We then immediately started trace-cmd since usually there are several call >>> traces in a row and we did get a trace.dat: >>> http://people.phys.ethz.ch/~daduke/trace.dat >> >> I'd prefer to have the trace running from before the first occurrence >> of the problem. Even better would be to get it started before the >> first client mounts the server. >> >> But I will have a look at your trace soon. > before you invest significant time in the trace: there is a chance it's a > false alarm: we do see nfsd in D, but they're not spreading like they used to. > And since someone is pulling 400 Mb/s from that server via globus gridftp, > it might just be real IO overload. We're still trying to figure out what we're > dealing with... 6.11 is going to behave a little differently, IIRC. It should allow other clients to continue working normally after one client hits this issue. >> It would be extremely helpful if we could reproduce this problem in >> our own labs. > easy: just hire 700 physicists and let 'em loose Yeah, that's the problem. I ain't got 700 physicists in my back pocket who would put up with NFS server outages ;-) Jeff and I have discussed a potential pynfs-based reproducer, but no guarantee it's the same issue as yours. -- Chuck Lever
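The faddr2line output quoted earlier in the thread has a regular shape: one outermost `symbol at file:line` frame followed by `(inlined by)` lines, innermost first. A throwaway parser along these lines (a sketch; the helper name is invented and not from the thread) turns that output into `(function, file, line)` tuples, which makes it easier to diff resolved frames between kernel versions:

```python
import re

# Matches faddr2line body lines such as:
#   __list_del_entry at /src/include/linux/list.h:134
#   (inlined by) unhash_session at /src/fs/nfsd/nfs4state.c:2062
FRAME_RE = re.compile(
    r"^(?:\(inlined by\) )?(?P<func>\w+) at (?P<file>\S+):(?P<line>\d+)$"
)

def parse_faddr2line(output):
    """Return (function, file, line) tuples, innermost frame first.

    The 'symbol+0xNN/0xNN:' header line does not match FRAME_RE and
    is skipped.
    """
    frames = []
    for raw in output.splitlines():
        m = FRAME_RE.match(raw.strip())
        if m:
            frames.append(
                (m.group("func"), m.group("file"), int(m.group("line")))
            )
    return frames
```

Run against the 6.1.119 output posted above, this would report `unhash_session` at fs/nfsd/nfs4state.c:2062 as the second-outermost frame, matching the `list_del` call site discussed in the thread.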
Thread overview: 30+ messages
2024-06-17 6:55 nfsd becomes a zombie Harald Dunkel
2024-06-17 14:31 ` Chuck Lever III
2024-06-17 19:20 ` Calum Mackay
2024-06-18 14:29 ` Harald Dunkel
2024-06-18 14:32 ` Harald Dunkel
2024-06-18 14:52 ` Chuck Lever
2024-06-19 7:32 ` Harald Dunkel
2024-06-19 7:56 ` Harald Dunkel
2024-06-19 13:14 ` Chuck Lever III
2024-06-20 5:29 ` Harald Dunkel
2024-06-20 19:09 ` Chuck Lever III
2024-07-02 17:25 ` Harald Dunkel
2024-07-02 18:17 ` Chuck Lever III
2024-07-03 4:14 ` Harald Dunkel
2024-12-25 9:15 ` nfsd blocks indefinitely in nfsd4_destroy_session (was: Re: nfsd becomes a zombie) Salvatore Bonaccorso
2025-01-01 19:24 ` nfsd blocks indefinitely in nfsd4_destroy_session Chuck Lever
2025-01-07 21:17 ` Salvatore Bonaccorso
2025-01-08 13:33 ` Chuck Lever
2025-01-08 14:54 ` Christian Herzog
2025-01-08 15:07 ` Chuck Lever
2025-01-09 11:56 ` Christian Herzog
2025-01-09 12:42 ` Jeff Layton
2025-01-09 13:56 ` Chuck Lever
2025-01-09 16:32 ` Chuck Lever
[not found] ` <f0705a65549ef253.67823675@ac-grenoble.fr>
2025-01-16 20:07 ` Chuck Lever
2025-01-17 19:43 ` Baptiste PELLEGRIN
2025-01-17 20:27 ` Chuck Lever
2025-01-09 15:49 ` Chuck Lever
2025-01-09 15:58 ` Christian Herzog
2025-01-09 16:09 ` Chuck Lever