Linux NFS development
From: Trond Myklebust <trondmy@kernel.org>
To: Harshvardhan Jha <harshvardhan.j.jha@oracle.com>,
	NeilBrown <neil@brown.name>, Anna Schumaker <anna@kernel.org>
Cc: Mark Brown <broonie@kernel.org>, linux-nfs@vger.kernel.org
Subject: Re: [PATCH] sunrpc: don't fail immediately in rpc_wait_bit_killable()
Date: Fri, 05 Sep 2025 15:38:14 -0400
Message-ID: <3e558c7f675b1f2e87098e58ef06c6f45ecf0a58.camel@kernel.org>
In-Reply-To: <b409c469-260e-4bd5-9cf8-49f524f3fd5a@oracle.com>

On Thu, 2025-08-28 at 18:12 +0530, Harshvardhan Jha wrote:
> Hi there,
> 
> On 20/08/25 3:08 AM, NeilBrown wrote:
> > rpc_wait_bit_killable() is called when it is appropriate for a fatal
> > signal to abort the wait.
> > 
> > If it is called late during process exit after exit_signals() is called
> > (and when PF_EXITING is set), it cannot receive a fatal signal so
> > waiting indefinitely is not safe.
> > 
> > However aborting immediately, as it currently does, is not ideal as it
> > means that the related NFS request cannot succeed, even if the network
> > and server are working properly.
> > 
> > One of the causes of filesystem IO when PF_EXITING is set is
> > acct_process() which may access the process accounting file.  For an
> > NFS-root configuration, this can be accessed over NFS.
> > 
> > In this configuration LTP test "acct02" fails.
> > 
> > Though waiting indefinitely is not appropriate, aborting immediately is
> > also not desirable.  This patch aims for a middle ground of waiting at
> > most 5 seconds.  This should be enough when NFS service is working, but
> > not so much as to delay process exit excessively when NFS service is not
> > functioning.
> > 
> > Reported-by: Mark Brown <broonie@kernel.org>
> > Reported-and-tested-by: Harshvardhan Jha <harshvardhan.j.jha@oracle.com>
> > Link: https://lore.kernel.org/linux-nfs/7d4d57b0-39a3-49f1-8ada-60364743e3b4@sirena.org.uk/
> > Fixes: 14e41b16e8cb ("SUNRPC: Don't allow waiting for exiting tasks")
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  net/sunrpc/sched.c | 14 +++++++++-----
> >  1 file changed, 9 insertions(+), 5 deletions(-)
> > 
> > diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> > index 73bc39281ef5..92f39e828fbe 100644
> > --- a/net/sunrpc/sched.c
> > +++ b/net/sunrpc/sched.c
> > @@ -276,11 +276,15 @@ EXPORT_SYMBOL_GPL(rpc_destroy_wait_queue);
> >  
> >  static int rpc_wait_bit_killable(struct wait_bit_key *key, int mode)
> >  {
> > -	if (unlikely(current->flags & PF_EXITING))
> > -		return -EINTR;
> > -	schedule();
> > -	if (signal_pending_state(mode, current))
> > -		return -ERESTARTSYS;
> > +	if (unlikely(current->flags & PF_EXITING)) {
> > +		/* Cannot be killed by a signal, so don't wait indefinitely */
> > +		if (schedule_timeout(5 * HZ) == 0)
> > +			return -EINTR;
> > +	} else {
> > +		schedule();
> > +		if (signal_pending_state(mode, current))
> > +			return -ERESTARTSYS;
> > +	}
> >  	return 0;
> >  }
> >  
> Is it possible to get this merged in 6.17? I have tested this and the
> LTP tests pass.
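
The control flow of the patched function in the quoted diff can be
modelled in userspace C as a rough sketch. Nothing here is kernel code:
struct task_model, wait_bit_killable_model() and MODEL_ERESTARTSYS are
hypothetical names introduced only for illustration. The "remaining"
field stands in for the return value of schedule_timeout(5 * HZ), which
is 0 when the full timeout elapsed, and "signal_pending" stands in for
signal_pending_state(mode, current).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* ERESTARTSYS is kernel-internal (include/linux/errno.h) and is not
 * available in userspace headers, so the model defines its own copy. */
#define MODEL_ERESTARTSYS 512

/* Hypothetical stand-in for the task state the real function inspects. */
struct task_model {
	bool exiting;        /* models current->flags & PF_EXITING */
	bool signal_pending; /* models signal_pending_state(mode, current) */
	long remaining;      /* models the result of schedule_timeout(5 * HZ) */
};

/* Mirrors the branch structure of the patched function in the diff. */
static int wait_bit_killable_model(const struct task_model *t)
{
	if (t->exiting) {
		/* Cannot be killed by a signal: give up once the timeout expires. */
		if (t->remaining == 0)
			return -EINTR;
	} else {
		/* Normal case: the wait is unbounded but a signal aborts it. */
		if (t->signal_pending)
			return -MODEL_ERESTARTSYS;
	}
	return 0;
}

/* Exercises the paths through the model. */
void model_self_check(void)
{
	struct task_model timed_out = { .exiting = true, .remaining = 0 };
	struct task_model woken     = { .exiting = true, .remaining = 120 };
	struct task_model signalled = { .exiting = false, .signal_pending = true };
	struct task_model normal    = { .exiting = false };

	assert(wait_bit_killable_model(&timed_out) == -EINTR);
	assert(wait_bit_killable_model(&woken) == 0);
	assert(wait_bit_killable_model(&signalled) == -MODEL_ERESTARTSYS);
	assert(wait_bit_killable_model(&normal) == 0);
}
```

In this model an exiting task that is woken before the timeout expires
returns 0, just as a non-exiting task does when no signal is pending.
Only the expired-timeout case yields -EINTR, which is the bounded-wait
behaviour the patch description asks for.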

After much thought, I think I'd rather just revert the commit that
caused the issue. I'll work on an alternative for the 6.18 timeframe
instead.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@kernel.org, trond.myklebust@hammerspace.com

Thread overview: 7+ messages
2025-08-19 21:38 [PATCH] sunrpc: don't fail immediately in rpc_wait_bit_killable() NeilBrown
2025-08-28 12:42 ` Harshvardhan Jha
2025-08-28 13:10   ` Trond Myklebust
2025-08-29  1:04     ` NeilBrown
2025-09-05 19:38   ` Trond Myklebust [this message]
2025-09-05 22:45     ` NeilBrown
2025-09-05 23:16       ` Trond Myklebust
