From: Jeff Layton <jlayton@kernel.org>
To: Trond Myklebust <trondmy@hammerspace.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"chuck.lever@oracle.com" <chuck.lever@oracle.com>,
"pabeni@redhat.com" <pabeni@redhat.com>,
"okorniev@redhat.com" <okorniev@redhat.com>,
"tom@talpey.com" <tom@talpey.com>,
"anna@kernel.org" <anna@kernel.org>,
"horms@kernel.org" <horms@kernel.org>,
"kuba@kernel.org" <kuba@kernel.org>,
"Dai.Ngo@oracle.com" <Dai.Ngo@oracle.com>,
"edumazet@google.com" <edumazet@google.com>,
"neilb@suse.de" <neilb@suse.de>
Cc: "josef@toxicpanda.com" <josef@toxicpanda.com>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"bcodding@redhat.com" <bcodding@redhat.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [PATCH RFC 0/9] nfs/sunrpc: stop holding netns references in client-side NFS and RPC objects
Date: Tue, 18 Mar 2025 07:30:13 -0400
Message-ID: <d05913314baff82463eb67a26605a16659dc1862.camel@kernel.org>
In-Reply-To: <8e781348f00184205c38e96e9e8af046d4d2f500.camel@hammerspace.com>

On Mon, 2025-03-17 at 22:11 +0000, Trond Myklebust wrote:
> On Mon, 2025-03-17 at 17:57 -0400, Jeff Layton wrote:
> > On Mon, 2025-03-17 at 21:35 +0000, Trond Myklebust wrote:
> > > On Mon, 2025-03-17 at 16:59 -0400, Jeff Layton wrote:
> > > > We have a long-standing problem with containers that have NFS mounts
> > > > in them. Best practice is to unmount gracefully, of course, but
> > > > sometimes containers just spontaneously die (e.g. SIGSEGV in the init
> > > > task in the container). When that happens the orchestrator will see
> > > > that all of the tasks are dead, and will detach the mount namespace
> > > > and kill off the network connection.
> > > >
> > > > If there are RPCs in flight at the time, the rpc_clnt will try to
> > > > retransmit them indefinitely, but there is no hope of them ever
> > > > contacting the server since nothing in userland can reach the netns
> > > > at that point to fix anything.
> > > >
> > > > This patchset takes the approach of changing various nfs client and
> > > > sunrpc objects to not hold a netns reference. Instead, when a nfs_net
> > > > or sunrpc_net is exiting, all nfs_server, nfs_client and rpc_clnt
> > > > objects associated with it are shut down, and the pre_exit functions
> > > > block until they are gone.
> > > >
> > > > With this approach, when the last userland task in the container
> > > > exits, the NFS and RPC clients get cleaned up automatically. As a
> > > > bonus, this fixes another bug with the gssproxy RPC client that causes
> > > > net namespace leaks in any container where it runs (details in the
> > > > patch descriptions).
> > > >
> > >
> > > So with this approach, what happens if the NFS mount was created in a
> > > container, but got bind mounted somewhere else?
> > >
> >
> > The lifetime of these objects are tied to the net namespace. If it gets
> > bind-mounted into a different mount namespace, while the tasks are
> > setns()'ed into the correct net namespace, then I expect the mount
> > would end up shut down at that point and be unusable, just like if you
> > echo 1 into the shutdown file in sysfs.
> >
> > Hopefully no one is doing anything that silly. You wouldn't be able to
> > upcall, for one thing, since there wouldn't be any more userland
> > processes attached to the netns.
> >
> > I'll test that scenario and get back to you though. I do want to make
> > sure that that's not going to lead to a crash or anything.
>
> I agree with you that it's not a sane scenario, and that there is no
> need to try to make it work. However the user space tools are there to
> allow it to happen, so we need to ensure that the kernel won't panic or
> cause any new exotic hangs.

Unfortunately, this does create a hang.

Bind-mounting it will cause the superblock's refcount to increase,
which keeps the nfs_server struct active. That holds a reference to the
nfs_client, which prevents everything from coming down properly in
pre_exit.
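
To make that reference chain concrete, here is a minimal userspace sketch
in C. It is a toy model under stated assumptions, not kernel code: every
toy_* name is hypothetical, the structs are not the real layouts, and
"pre_exit can finish" is reduced to "no references to the client remain".
It only illustrates why an extra superblock reference from a bind mount
would keep the client pinned.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical), loosely mirroring the objects named above. */
struct toy_client { int refs; };               /* plays the role of nfs_client */
struct toy_server { struct toy_client *clp; }; /* plays the role of nfs_server */
struct toy_sb {                                /* plays the role of the superblock */
	int active;                            /* one count per mount of this sb */
	struct toy_server *server;
};

/* Initial mount: the server pins the client, the sb pins the server. */
static void toy_mount(struct toy_sb *sb, struct toy_server *srv,
		      struct toy_client *clp)
{
	clp->refs++;
	srv->clp = clp;
	sb->server = srv;
	sb->active = 1;
}

/* A bind mount elsewhere takes another reference on the same superblock. */
static void toy_bind_mount(struct toy_sb *sb)
{
	sb->active++;
}

/* Only the last unmount tears down the server and releases the client. */
static void toy_umount(struct toy_sb *sb)
{
	if (--sb->active == 0) {
		sb->server->clp->refs--;
		sb->server = NULL;
	}
}

/* Assumption: pre_exit blocks until nothing references the client. */
static bool toy_pre_exit_can_finish(const struct toy_client *clp)
{
	return clp->refs == 0;
}

/* Container dies; its mount namespace is detached, the bind mount is not. */
static bool toy_container_death(bool bind_mounted_elsewhere)
{
	struct toy_client clp = { .refs = 0 };
	struct toy_server srv = { .clp = NULL };
	struct toy_sb sb = { .active = 0, .server = NULL };

	toy_mount(&sb, &srv, &clp);
	if (bind_mounted_elsewhere)
		toy_bind_mount(&sb);
	toy_umount(&sb); /* only the container's own mount goes away */
	return toy_pre_exit_can_finish(&clp);
}
```

In this model toy_container_death(false) returns true (teardown can
complete), while toy_container_death(true) returns false: the bind mount
still holds the superblock, so the client never goes away and a blocking
pre_exit would wait indefinitely.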

I'll have to think about how we can solve that. Let me know if you have
ideas.
--
Jeff Layton <jlayton@kernel.org>