From: Salvatore Bonaccorso <carnil@debian.org>
To: Chuck Lever III <chuck.lever@oracle.com>,
Benjamin Coddington <bcodding@redhat.com>,
Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Harald Dunkel <harald.dunkel@aixigo.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
herzog@phys.ethz.ch, Martin Svec <martin.svec@zoner.cz>,
Michael Gernoth <debian@zerfleddert.de>,
Pellegrin Baptiste <Baptiste.Pellegrin@ac-grenoble.fr>
Subject: nfsd blocks indefinitely in nfsd4_destroy_session (was: Re: nfsd becomes a zombie)
Date: Wed, 25 Dec 2024 10:15:47 +0100
Message-ID: <Z2vNQ6HXfG_LqBQc@eldamar.lan>
In-Reply-To: <C1CE3A96-599C-4D73-BCC0-3587EC68FCB0@oracle.com>

Hi Chuck, hi all,

[It was not ideal to pick one of the messages for this follow-up; let me
know if you want a completely new thread. I am also adding Benjamin and
Trond, as they are involved in one of the mentioned patches.]

On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote:
>
>
> > On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@aixigo.com> wrote:
> >
> > Hi folks,
> >
> > what would be the reason for nfsd getting stuck somehow and becoming
> > an unkillable process? See
> >
> > - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562
> > - https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568
> >
> > Doesn't this mean that something inside the kernel gets stuck as
> > well? Seems odd to me.
>
> I'm not familiar with the Debian or Ubuntu kernel packages. Can
> the kernel release numbers be translated to LTS kernel releases
> please? Need both "last known working" and "first broken" releases.
>
> This:
>
> [ 6596.911785] RPC: Could not send backchannel reply error: -110
> [ 6596.972490] RPC: Could not send backchannel reply error: -110
> [ 6837.281307] RPC: Could not send backchannel reply error: -110
>
> is a known set of client backchannel bugs. Knowing the LTS kernel
> releases (see above) will help us figure out what needs to be
> backported to the LTS kernels in question.
>
> This:
>
> [11183.290619] wait_for_completion+0x88/0x150
> [11183.290623] __flush_workqueue+0x140/0x3e0
> [11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
> [11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd]
>
> is probably related to the backchannel errors on the client, but
> client bugs shouldn't cause the server to hang like this. We
> might be able to say more if you can provide the kernel release
> translations (see above).

In Debian we still have bug #1071562 open, and one person notified
me off-list that the issue appears to have become more frequent since
they updated the NFS client side from Ubuntu 20.04 to Debian bookworm
(with a 6.1.y-based kernel).

Some people following those issues claim that the change mentioned in
https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html
would fix the issue, which is backchannel related as well.

This is upstream commit 6ddc9deacc13 ("SUNRPC: Fix backchannel reply,
again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the
nfs_client's rpc timeouts for backchannel"), this is not something
which goes back to 6.1.y. Could it be possible that the backchannel
refactoring and this final fix indeed fix the issue?

As people report, the issue is not easily reproducible, which makes it
harder to identify fixes correctly.

I made a (short) attempt at backporting commits up to
6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again"), but this quickly
indicated that it is probably still not the right thing to
backport to the older stable series.

At least the following commits would be prerequisites:

2009e32997ed568a305cf9bc7bf27d22e0f6ccda
4119bd0306652776cb0b7caa3aea5b2a93aecb89
163cdfca341b76c958567ae0966bd3575c5c6192
f4afc8fead386c81fda2593ad6162271d26667f8
6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc
57331a59ac0d680f606403eb24edd3c35aecba31

and even then there would still be conflicting code paths (and the
result does not seem right).

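For anyone who wants to check which of these prerequisites a given
stable branch already carries, here is a minimal sketch. It relies on
two assumptions that are not from this thread: that stable backports
mention the full upstream commit ID in their commit message, and that a
local clone has a branch named linux-6.1.y. The helper names are
hypothetical and only for illustration.

```python
import subprocess

# Upstream prerequisite commits, in the order listed above.
PREREQS = [
    "2009e32997ed568a305cf9bc7bf27d22e0f6ccda",
    "4119bd0306652776cb0b7caa3aea5b2a93aecb89",
    "163cdfca341b76c958567ae0966bd3575c5c6192",
    "f4afc8fead386c81fda2593ad6162271d26667f8",
    "6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc",
    "57331a59ac0d680f606403eb24edd3c35aecba31",
]


def grep_command(sha, branch):
    """Build a git command that looks on `branch` for one commit whose
    message mentions the upstream commit ID `sha`."""
    return ["git", "log", "--oneline", "-1", f"--grep={sha}", branch]


def is_backported(sha, branch, repo="."):
    """Return True if some commit on `branch` mentions `sha`.

    Assumption: the backport's commit message quotes the upstream ID.
    """
    result = subprocess.run(
        grep_command(sha, branch),
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    return bool(result.stdout.strip())


def missing_prereqs(branch, repo=".", check=is_backported):
    """List the prerequisite commits not found on `branch`."""
    return [sha for sha in PREREQS if not check(sha, branch, repo)]


if __name__ == "__main__":
    # "linux-6.1.y" is an assumed branch name in a local stable clone.
    for sha in missing_prereqs("linux-6.1.y"):
        print("missing:", sha)
```

This only reports what is absent; it says nothing about whether the
missing commits would apply cleanly, which is the actual problem
described above.
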
Chuck, Benjamin, Trond, is there anything we can provide from the
reporters' side so that we can try to tackle this issue better?

Regards,
Salvatore