From: Trond Myklebust <trondmy@kernel.org>
To: Harshvardhan Jha <harshvardhan.j.jha@oracle.com>,
NeilBrown <neil@brown.name>, Anna Schumaker <anna@kernel.org>
Cc: Mark Brown <broonie@kernel.org>, linux-nfs@vger.kernel.org
Subject: Re: [PATCH] sunrpc: don't fail immediately in rpc_wait_bit_killable()
Date: Fri, 05 Sep 2025 15:38:14 -0400 [thread overview]
Message-ID: <3e558c7f675b1f2e87098e58ef06c6f45ecf0a58.camel@kernel.org> (raw)
In-Reply-To: <b409c469-260e-4bd5-9cf8-49f524f3fd5a@oracle.com>
On Thu, 2025-08-28 at 18:12 +0530, Harshvardhan Jha wrote:
> Hi there,
>
> On 20/08/25 3:08 AM, NeilBrown wrote:
> > rpc_wait_bit_killable() is called when it is appropriate for a
> > fatal
> > signal to abort the wait.
> >
> > If it is called late during process exit after exit_signals() is
> > called
> > (and when PF_EXITING is set), it cannot receive a fatal signal so
> > waiting indefinitely is not safe.
> >
> > However aborting immediately, as it currently does, is not ideal as
> > it
> > mean that the related NFS request cannot succeed, even if the
> > network
> > and server are working properly.
> >
> > One of the causes of filesystem IO when PF_EXITING is set is
> > acct_process() which may access the process accounting file. For a
> > NFS-root configuration, this can be accessed over NFS.
> >
> > In this configuration LTP test "acct02" fails.
> >
> > Though waiting indefinitely is not appropriate, aborting
> > immediately is
> > also not desirable. This patch aims for a middle ground of waiting
> > at
> > most 5 seconds. This should be enough when NFS service is working,
> > but
> > not so much as to delay process exit excessively when NFS service
> > is not
> > functioning.
> >
> > Reported-by: Mark Brown <broonie@kernel.org>
> > Reported-and-tested-by: Harshvardhan Jha
> > <harshvardhan.j.jha@oracle.com>
> > Link:
> > https://urldefense.com/v3/__https://lore.kernel.org/linux-nfs/7d4d57b0-39a3-49f1-8ada-60364743e3b4@sirena.org.uk/__;!!ACWV5N9M2RV99hQ!LaRJdjZulcG71nHFWdEAszB9mJEhezxPsDxHO8xeQJ7P8a9UfYNRIm1ziuuHU5wxgEXW14vAqC1dlpSQraWaxA$
> >
> > Fixes: 14e41b16e8cb ("SUNRPC: Don't allow waiting for exiting
> > tasks")
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> > net/sunrpc/sched.c | 14 +++++++++-----
> > 1 file changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> > index 73bc39281ef5..92f39e828fbe 100644
> > --- a/net/sunrpc/sched.c
> > +++ b/net/sunrpc/sched.c
> > @@ -276,11 +276,15 @@ EXPORT_SYMBOL_GPL(rpc_destroy_wait_queue);
> >
> > static int rpc_wait_bit_killable(struct wait_bit_key *key, int
> > mode)
> > {
> > - if (unlikely(current->flags & PF_EXITING))
> > - return -EINTR;
> > - schedule();
> > - if (signal_pending_state(mode, current))
> > - return -ERESTARTSYS;
> > + if (unlikely(current->flags & PF_EXITING)) {
> > + /* Cannot be killed by a signal, so don't wait
> > indefinitely */
> > + if (schedule_timeout(5 * HZ) == 0)
> > + return -EINTR;
> > + } else {
> > + schedule();
> > + if (signal_pending_state(mode, current))
> > + return -ERESTARTSYS;
> > + }
> > return 0;
> > }
> >
> Is it possible to get this merged in 6.17? I have tested this and the
> LTP tests pass.
After much thought, I think I'd rather just revert the commit that
caused the issue. I'll work on an alternative for the 6.18 timeframe
instead.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@kernel.org, trond.myklebust@hammerspace.com
next prev parent reply other threads:[~2025-09-05 19:38 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-19 21:38 [PATCH] sunrpc: don't fail immediately in rpc_wait_bit_killable() NeilBrown
2025-08-28 12:42 ` Harshvardhan Jha
2025-08-28 13:10 ` Trond Myklebust
2025-08-29 1:04 ` NeilBrown
2025-09-05 19:38 ` Trond Myklebust [this message]
2025-09-05 22:45 ` NeilBrown
2025-09-05 23:16 ` Trond Myklebust
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3e558c7f675b1f2e87098e58ef06c6f45ecf0a58.camel@kernel.org \
--to=trondmy@kernel.org \
--cc=anna@kernel.org \
--cc=broonie@kernel.org \
--cc=harshvardhan.j.jha@oracle.com \
--cc=linux-nfs@vger.kernel.org \
--cc=neil@brown.name \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox