Linux NFS development
From: Scott Mayhew <smayhew@redhat.com>
To: NeilBrown <neilb@ownmail.net>
Cc: steved@redhat.com, bcodding@redhat.com, yoyang@redhat.com,
	linux-nfs@vger.kernel.org
Subject: Re: [nfs-utils PATCH] rpc-statd.service: weaken the dependency on rpcbind.socket
Date: Mon, 8 Sep 2025 15:07:18 -0400
Message-ID: <aL8pZlSMHQBZdL3k@aion>
In-Reply-To: <175711562246.2850467.6098728603666668070@noble.neil.brown.name>

On Sat, 06 Sep 2025, NeilBrown wrote:

> On Sat, 06 Sep 2025, Scott Mayhew wrote:
> > In 91da135f ("systemd unit files: fix up dependencies on rpcbind"),
> > Neil laid out the rationale for how the nfs services should define their
> > dependencies on rpcbind.  In a nutshell:
> > 
> > 1. Dependencies should only be defined using rpcbind.socket
> > 2. Ordering for dependencies should only be defined using "After="
> > 3. nfs-server.service should use "Wants=rpcbind.socket", to allow
> >    rpcbind.socket to be masked in NFSv4-only setups.
> > 4. rpc-statd.service should use "Requires=rpcbind.socket", as rpc.statd
> >    is useless if it can't register with rpcbind.
> > 
> > Then in https://bugzilla.redhat.com/show_bug.cgi?id=2100395, Ben noted
> > that due to the way the dependencies are ordered, when 'systemctl stop
> > rpcbind.socket' is run, systemd first sends SIGTERM to rpcbind, then
> > SIGTERM to rpc.statd.  On SIGTERM, rpcbind tears down /var/run/rpcbind.sock.
> > However, rpc-statd on SIGTERM attempts to unregister from rpcbind.  This
> > results in a long delay:
> > 
> > [root@rawhide ~]# time systemctl restart rpcbind.socket
> > 
> > real	1m0.147s
> > user	0m0.004s
> > sys	0m0.003s
> > 
> > 8a835ceb ("rpc-statd.service: Stop rpcbind and rpc.stat in an exit race")
> > fixed this by changing the dependency in rpc-statd.service to use
> > "After=rpcbind.service", bending rule #1 from above.
> 
> Thanks for the thorough and detailed explanation.
> 
> I'd like to suggest a different fix.  Change rpc-statd.service to
> declare:
> 
> After=network-online.target nss-lookup.target rpcbind.socket rpcbind.service
> 
> i.e. it is declared to be After both the socket and the service.
> 
> "After" declarations only have effect if the units are in the same
> transaction.  If the Unit is not being started or stopped, the After
> declaration has no effect.
> 
> So on startup, this will ensure rpcbind.socket is started before
> rpc-statd.service.
> On shutdown in a transaction that stops both rpc-statd.service and
> rpcbind.service, rpcbind.service won't be stopped until after
> rpc-statd.service is stopped.

That works too.
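
For reference, here is roughly what the [Unit] section of
rpc-statd.service might look like with your After= line combined with
the Wants= change from my patch.  This is an untested sketch rather than
a v2; the comments are mine and everything else comes from the existing
unit file and the two proposals:

```ini
[Unit]
Description=NFS status monitor for NFSv2/3 locking.
Documentation=man:rpc.statd(8)
DefaultDependencies=no
Conflicts=umount.target
Requires=nss-lookup.target
# Weak dependency: pull in rpcbind.socket when rpc-statd starts, but
# don't stop or restart rpc-statd when rpcbind is stopped or restarted.
Wants=rpcbind.socket
Wants=network-online.target
Wants=rpc-statd-notify.service
# Ordered after the socket (for startup) and after the service (so that
# on shutdown rpcbind.service is not stopped until rpc-statd has stopped).
After=network-online.target nss-lookup.target rpcbind.socket rpcbind.service
```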

> 
> I agree that it isn't necessary to restart rpc-statd when rpcbind is
> restarted.
> Maybe that is a justification to use Wants instead of Requires.
> Or maybe Upholds would be even better.

I think Upholds is confusing, especially since there aren't any
existing unit files using it, at least on a stock Fedora Rawhide
system.  I don't see it being used on openSUSE Tumbleweed or Debian
Trixie either.  I think it's going to confuse users if they try to stop
rpcbind.socket and then find that it's still running.  Finally, when I
tested it, it prevented me from stopping rpc-statd.  Eventually the
shutdown timer hit and systemd sent rpc-statd a SIGABRT, which in turn
triggered the systemd-coredump handler.  That's a whole mess of syslog
entries that's going to lead to more bug reports.  I'd rather stick with
Wants.
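
For anyone who wants to repeat the "is anything using Upholds?" check on
another distro, something like the following should do it.  It is only a
sketch: the directory list is my assumption about the usual unit search
paths, and find_upholds is just a helper name made up for illustration.

```shell
# Print the unit files under the given directories that contain an
# Upholds= directive.  No output means nothing uses it.
find_upholds() {
    # -r: recurse, -l: print file names only, -s: ignore missing dirs
    grep -rlsE '^[[:space:]]*Upholds=' "$@"
}

# Assumed unit search paths; adjust for the system being checked.
find_upholds /usr/lib/systemd/system /etc/systemd/system /run/systemd/system
```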
 
> 
> I wonder if putting
> 
>  ConditionPathIsSymbolicLink=!/etc/systemd/system/rpcbind.socket

I'm lost.  What would cause the rpcbind.socket symlink to be created
directly in /etc/systemd/system?  I've seen it get created in
/etc/systemd/system/sockets.target.wants or
/etc/systemd/system/multi-user.target.wants, but never directly in
/etc/systemd/system.

-Scott
> 
> in rpc-statd.service would be a suitable way to stop rpc-statd from
> starting if rpcbind.socket is masked.
> 
> In any case I think there are two separate issues here which deserve two
> separate patches.
> 1/ shutdown ordering isn't handled correctly.  Adding the extra After
>    directive should fix that
> 2/ rpc.statd is restarted unnecessarily.  Wants or Upholds might be part
>    of the solution.
> 
> Thanks,
> NeilBrown
> 
>    
> 
> > 
> > Yongcheng recently noted that when running the following test:
> > 
> > [root@rawhide ~]# for i in `seq 10`; do systemctl reset-failed; \
> > 	systemctl stop rpcbind rpcbind.socket ; systemctl restart nfs-server ; \
> > 	systemctl status rpc-statd; done
> > 
> > rpc-statd.service would often fail to start:
> > 
> > × rpc-statd.service - NFS status monitor for NFSv2/3 locking.
> >      Loaded: loaded (/usr/lib/systemd/system/rpc-statd.service; enabled-runtime; preset: disabled)
> >     Drop-In: /usr/lib/systemd/system/service.d
> >              └─10-timeout-abort.conf
> >      Active: failed (Result: exit-code) since Fri 2025-09-05 18:01:15 EDT; 229ms ago
> >    Duration: 228ms
> >  Invocation: bafb2bb00761439ebc348000704e8fbb
> >        Docs: man:rpc.statd(8)
> >     Process: 29937 ExecStart=/usr/sbin/rpc.statd (code=exited, status=1/FAILURE)
> >    Mem peak: 1.5M
> >         CPU: 7ms
> > 
> > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Version 2.8.2 starting
> > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Flags: TI-RPC
> > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, udp): svc_reg() err: RPC: Remote system error - Connection refused
> > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, tcp): svc_reg() err: RPC: Success
> > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, udp6): svc_reg() err: RPC: Success
> > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, tcp6): svc_reg() err: RPC: Success
> > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: failed to create RPC listeners, exiting
> > Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: rpc-statd.service: Control process exited, code=exited, status=1/FAILURE
> > Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: rpc-statd.service: Failed with result 'exit-code'.
> > Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: Failed to start rpc-statd.service - NFS status monitor for NFSv2/3 locking..
> > 
> > I propose we revert the change from 8a835ceb and instead turn the
> > dependency into a weak dependency by using "Wants=rpcbind.socket"
> > instead of "Requires=rpcbind.socket".  This bends rule #4 above and will
> > make it so that systemd will try to start rpcbind.socket if it isn't
> > already running when rpc-statd.service starts, but it won't restart
> > rpc-statd.service whenever rpcbind is restarted.  Frankly, we shouldn't
> > need to restart services whenever rpcbind is restarted (that's why
> > rpcbind has the warmstart feature).  The only drawback is that now if an
> > admin wants to set up an NFSv4-only server by masking rpcbind.socket,
> > they'll need to mask rpc-statd.service as well.  I don't think that's
> > too much to ask, so the nfs.systemd man page has been updated
> > accordingly.
> > 
> > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > ---
> >  systemd/nfs.systemd.man   | 10 +++++++---
> >  systemd/rpc-statd.service |  5 +++--
> >  2 files changed, 10 insertions(+), 5 deletions(-)
> > 
> > diff --git a/systemd/nfs.systemd.man b/systemd/nfs.systemd.man
> > index a8476038..93fb87cd 100644
> > --- a/systemd/nfs.systemd.man
> > +++ b/systemd/nfs.systemd.man
> > @@ -137,7 +137,9 @@ NFSv2) and does not want
> >  .I rpcbind
> >  to be running, the correct approach is to run
> >  .RS
> > -.B systemctl mask rpcbind
> > +.B systemctl mask rpcbind.socket
> > +.br
> > +.B systemctl mask rpc-statd.service
> >  .RE
> >  This will disable
> >  .IR rpcbind ,
> > @@ -145,9 +147,11 @@ and the various NFS services which depend on it (and are only needed
> >  for NFSv3) will refuse to start, without interfering with the
> >  operation of NFSv4 services.  In particular,
> >  .I rpc.statd
> > -will not run when
> > +will fail to start when
> >  .I rpcbind
> > -is masked.
> > +is masked, so
> > +.I rpc-statd.service
> > +should be masked as well.
> >  .PP
> >  .I idmapd
> >  is only needed for NFSv4, and even then is not needed when the client
> > diff --git a/systemd/rpc-statd.service b/systemd/rpc-statd.service
> > index 660ed861..4e138f69 100644
> > --- a/systemd/rpc-statd.service
> > +++ b/systemd/rpc-statd.service
> > @@ -3,10 +3,11 @@ Description=NFS status monitor for NFSv2/3 locking.
> >  Documentation=man:rpc.statd(8)
> >  DefaultDependencies=no
> >  Conflicts=umount.target
> > -Requires=nss-lookup.target rpcbind.socket
> > +Requires=nss-lookup.target
> > +Wants=rpcbind.socket
> >  Wants=network-online.target
> >  Wants=rpc-statd-notify.service
> > -After=network-online.target nss-lookup.target rpcbind.service
> > +After=network-online.target nss-lookup.target rpcbind.socket
> >  
> >  PartOf=nfs-utils.service
> >  IgnoreOnIsolate=yes
> > -- 
> > 2.50.1
> > 
> > 
> 
> 


