From: Jeff Layton via Bugspray Bot <bugbot@kernel.org>
To: cel@kernel.org, trondmy@kernel.org, carnil@debian.org,
anna@kernel.org, linux-nfs@vger.kernel.org, herzog@phys.ethz.ch,
harald.dunkel@aixigo.com, chuck.lever@oracle.com,
jlayton@kernel.org, baptiste.pellegrin@ac-grenoble.fr,
benoit.gschwind@minesparis.psl.eu
Subject: Re: NFSD threads hang when destroying a session or client ID
Date: Tue, 21 Jan 2025 17:35:14 +0000 [thread overview]
Message-ID: <20250121-b219710c10-7af78b1a3afc@bugzilla.kernel.org> (raw)
In-Reply-To: <20250121-b219710c7-773af1987926@bugzilla.kernel.org>
Jeff Layton writes via Kernel.org Bugzilla:
(In reply to Chuck Lever from comment #7)
> The trace captures I've reviewed suggest that a callback session is in use,
> so I would say the NFS minor version is 1 or higher. Perhaps it's not the
> RPC_SIGNALLED test above that is the problem, but the one later in
> nfsd4_cb_sequence_done().
Ok, good. Knowing that it's not v4.0 allows us to rule out some codepaths.
There are a couple of other cases where we goto need_restart:
The NFS4ERR_BADSESSION case does this, and also if it doesn't get a reply at all (case 1).
There is also this that looks a little sketchy:
------------8<-------------------
trace_nfsd_cb_free_slot(task, cb);
nfsd41_cb_release_slot(cb);
if (RPC_SIGNALLED(task))
goto need_restart;
out:
return ret;
retry_nowait:
if (rpc_restart_call_prepare(task))
ret = false;
goto out;
need_restart:
if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
trace_nfsd_cb_restart(clp, cb);
task->tk_status = 0;
cb->cb_need_restart = true;
}
return false;
------------8<-------------------
Probably now the same bug, but it looks like if RPC_SIGNALLED returns true, then it'll restart the RPC after releasing the slot. It seems like that could break the reply cache handling, as the restarted call could be on a different slot. I'll look at patching that, at least, though I'm not sure it's related to the hang.
More notes. The only way RPC_TASK_SIGNALLED gets set is:
nfsd4_process_cb_update()
rpc_shutdown_client()
rpc_killall_tasks()
That gets called if:
if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)
nfsd4_process_cb_update(cb);
Which means that NFSD4_CLIENT_CB_UPDATE was probably set? NFSD4_CLIENT_CB_KILL seems less likely since that would nerf the cb_need_restart handling.
View: https://bugzilla.kernel.org/show_bug.cgi?id=219710#c10
You can reply to this message to join the discussion.
--
Deet-doot-dot, I am a bot.
Kernel.org Bugzilla (bugspray 0.1-dev)
next prev parent reply other threads:[~2025-01-21 17:34 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-20 15:00 NFSD threads hang when destroying a session or client ID Chuck Lever via Bugspray Bot
2025-01-20 15:14 ` Chuck Lever
2025-01-20 15:25 ` Chuck Lever via Bugspray Bot
2025-01-20 15:40 ` Chuck Lever via Bugspray Bot
2025-01-20 19:00 ` Chuck Lever via Bugspray Bot
2025-01-20 20:35 ` Baptiste PELLEGRIN via Bugspray Bot
2025-01-21 14:40 ` Jeff Layton via Bugspray Bot
2025-01-21 16:10 ` Chuck Lever via Bugspray Bot
2025-01-21 17:35 ` Jeff Layton via Bugspray Bot [this message]
2025-01-21 19:38 ` Tom Talpey
2025-01-21 19:43 ` Chuck Lever
2025-01-21 16:25 ` Baptiste PELLEGRIN via Bugspray Bot
2025-01-21 16:35 ` Chuck Lever via Bugspray Bot
2025-01-22 11:40 ` Baptiste PELLEGRIN via Bugspray Bot
2025-01-22 14:19 ` Chuck Lever
2025-01-22 21:25 ` JJ Jordan via Bugspray Bot
2025-01-22 21:25 ` JJ Jordan via Bugspray Bot
2025-01-23 2:10 ` Li Lingfeng via Bugspray Bot
2025-01-23 13:50 ` Jeff Layton via Bugspray Bot
2025-01-23 14:22 ` Chuck Lever
2025-01-23 20:25 ` Baptiste PELLEGRIN via Bugspray Bot
2025-01-23 21:45 ` Chuck Lever via Bugspray Bot
2025-01-26 9:25 ` Baptiste PELLEGRIN via Bugspray Bot
2025-01-26 17:05 ` Chuck Lever via Bugspray Bot
2025-01-29 13:15 ` rik.theys via Bugspray Bot
2025-01-29 19:40 ` Chuck Lever via Bugspray Bot
2025-01-30 14:05 ` rik.theys via Bugspray Bot
2025-01-29 19:50 ` Chuck Lever via Bugspray Bot
2025-02-10 12:05 ` Baptiste PELLEGRIN via Bugspray Bot
2025-02-21 13:42 ` Salvatore Bonaccorso
2025-02-21 13:57 ` Harald Dunkel
2025-02-21 14:31 ` Salvatore Bonaccorso
2025-02-21 14:50 ` Jeff Layton via Bugspray Bot
2025-02-21 16:00 ` Chuck Lever via Bugspray Bot
2025-02-21 14:45 ` Jeff Layton via Bugspray Bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250121-b219710c10-7af78b1a3afc@bugzilla.kernel.org \
--to=bugbot@kernel.org \
--cc=anna@kernel.org \
--cc=baptiste.pellegrin@ac-grenoble.fr \
--cc=benoit.gschwind@minesparis.psl.eu \
--cc=carnil@debian.org \
--cc=cel@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=harald.dunkel@aixigo.com \
--cc=herzog@phys.ethz.ch \
--cc=jlayton@kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=trondmy@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox