From: Mike Snitzer <snitzer@kernel.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>, NeilBrown <neil@brown.name>,
linux-nfs@vger.kernel.org
Subject: Re: unable to run NFSD in container if "options sunrpc pool_mode=pernode"
Date: Fri, 23 May 2025 19:09:27 -0400 [thread overview]
Message-ID: <aDEAJzELBTH0CqHI@kernel.org> (raw)
In-Reply-To: <6bb9e9cce27e2a222bf55e272d690aab8f0eef13.camel@kernel.org>
On Fri, May 23, 2025 at 06:40:45PM -0400, Jeff Layton wrote:
> On Fri, 2025-05-23 at 18:19 -0400, Mike Snitzer wrote:
> > On Fri, May 23, 2025 at 02:40:17PM -0400, Jeff Layton wrote:
> > > On Fri, 2025-05-23 at 14:29 -0400, Mike Snitzer wrote:
> > > > I don't know if $SUBJECT ever worked... but with latest 6.15 or
> > > > nfsd-testing if I just use pool_mode=global then all is fine.
> > > >
> > > > If pool_mode=pernode then mounting the container's NFSv3 export fails.
> > > >
> > > > I haven't started to dig into code yet but pool_mode=pernode works
> > > > perfectly fine if NFSD isn't running in a container.
> > > >
>
> Oops, I went and looked and nfsd isn't running in a container on these
> boxes. There are some other containerized apps running on the box, but
> nfsd isn't running in a container.
OK.
> > > > ps. yet another reason why pool_mode=pernode should be the default if
> > > > more than 1 NUMA node ;)
> > >
> > > Huh, strange. I've no idea why that would be. What kernel is this?
> >
> > It is this 6.12.24 based frankenbeast-ish kernel:
> > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=kernel-6.12.24/main-testing
> >
> > Basically just 6.12.24 + NFS and NFSD sync'd through nfs-testing and
> > nfsd-testing (so 6.15 NFS and NFSD going on 6.16).
> >
> > But I also just verified that this kernel built on Chuck's
> > nfsd-testing branch (with 2 extra patches) has the same issue:
> > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=cel-nfsd-testing-6.16
> >
> > Here is the NFS related config:
> >
> > CONFIG_NETWORK_FILESYSTEMS=y
> > CONFIG_NFS_FS=m
> > # CONFIG_NFS_V2 is not set
> > CONFIG_NFS_V3=m
> > CONFIG_NFS_V3_ACL=y
> > CONFIG_NFS_V4=m
> > # CONFIG_NFS_SWAP is not set
> > CONFIG_NFS_V4_1=y
> > CONFIG_NFS_V4_2=y
> > CONFIG_PNFS_FILE_LAYOUT=m
> > CONFIG_PNFS_BLOCK=m
> > CONFIG_PNFS_FLEXFILE_LAYOUT=m
> > CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
> > # CONFIG_NFS_V4_1_MIGRATION is not set
> > CONFIG_NFS_V4_SECURITY_LABEL=y
> > CONFIG_NFS_FSCACHE=y
> > # CONFIG_NFS_USE_LEGACY_DNS is not set
> > CONFIG_NFS_USE_KERNEL_DNS=y
> > CONFIG_NFS_DEBUG=y
> > CONFIG_NFS_DISABLE_UDP_SUPPORT=y
> > # CONFIG_NFS_V4_2_READ_PLUS is not set
> > CONFIG_NFSD=m
> > # CONFIG_NFSD_V2 is not set
> > CONFIG_NFSD_V3_ACL=y
> > CONFIG_NFSD_V4=y
> > CONFIG_NFSD_PNFS=y
> > # CONFIG_NFSD_BLOCKLAYOUT is not set
> > CONFIG_NFSD_SCSILAYOUT=y
> > # CONFIG_NFSD_FLEXFILELAYOUT is not set
> > # CONFIG_NFSD_V4_2_INTER_SSC is not set
> > CONFIG_NFSD_V4_SECURITY_LABEL=y
> > # CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set
> > # CONFIG_NFSD_V4_DELEG_TIMESTAMPS is not set
> > CONFIG_GRACE_PERIOD=m
> > CONFIG_LOCKD=m
> > CONFIG_LOCKD_V4=y
> > CONFIG_NFS_ACL_SUPPORT=m
> > CONFIG_NFS_COMMON=y
> > CONFIG_NFS_COMMON_LOCALIO_SUPPORT=m
> > CONFIG_NFS_LOCALIO=y
> > CONFIG_NFS_V4_2_SSC_HELPER=y
> > CONFIG_SUNRPC=m
> > CONFIG_SUNRPC_GSS=m
> > CONFIG_SUNRPC_BACKCHANNEL=y
> > CONFIG_RPCSEC_GSS_KRB5=m
> > CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1=y
> > CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA2=y
> > CONFIG_SUNRPC_DEBUG=y
> > CONFIG_SUNRPC_XPRT_RDMA=m
> >
> > > FWIW, I just built a localio-enabled on a v6.12-uek kernel for our own
> > > purposes yesterday and it's running pool_mode=pernode. It seemed to
> > > work fine as a v3 DS, but I didn't test mounting the container's export
> > > directly.
> >
> > OK, but you were able to access the v3 DS just fine (assuming pNFS
> > flexfiles layouts that point to your DS that is running NFSD in a
> > container) ?
> >
> > I'm using nfs-utils-2.8.2. I don't see any nfsd threads running if I
> > use "options sunrpc pool_mode=pernode".
> >
>
> I'll have a look soon, but if you figure it out in the meantime, let us
> know.
Will do.
Just the latest info I have, with sunrpc's pool_mode=pernode dd hangs
with this stack trace:
# cat /proc/8087/stack
[<0>] rpc_wait_bit_killable+0x25/0x80 [sunrpc]
[<0>] __rpc_execute+0x151/0x480 [sunrpc]
[<0>] rpc_execute+0xca/0xf0 [sunrpc]
[<0>] rpc_run_task+0x110/0x180 [sunrpc]
[<0>] nfs4_call_sync_custom+0xb/0x30 [nfsv4]
[<0>] nfs4_do_call_sync+0x69/0x90 [nfsv4]
[<0>] _nfs4_proc_getattr+0x128/0x160 [nfsv4]
[<0>] nfs4_proc_getattr+0x73/0x100 [nfsv4]
[<0>] nfs4_do_open+0x775/0x9d0 [nfsv4]
[<0>] nfs4_atomic_open+0xf7/0x100 [nfsv4]
[<0>] nfs_atomic_open+0x1e7/0x6c0 [nfs]
[<0>] path_openat+0xd38/0x11f0
[<0>] do_filp_open+0xae/0x120
[<0>] do_sys_openat2+0x24d/0x2a0
[<0>] do_sys_open+0x4f/0x90
[<0>] do_syscall_64+0x7b/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
And if I just try to mount using v3 it fails with:
# mount -vvvvvvv -o vers=3,nolock 10.200.80.89:/cvol_12_0 /mnt/test
mount.nfs: timeout set for Fri May 23 22:52:04 2025
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: portmap query retrying: RPC: Timed out
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: portmap query retrying: RPC: Timed out
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: portmap query retrying: RPC: Timed out
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: requested NFS version or transport protocol is not supported for /mnt/test
# rpcinfo -p 10.200.80.89
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100005 1 udp 20048 mountd
100005 1 tcp 20048 mountd
100005 2 udp 20048 mountd
100005 2 tcp 20048 mountd
100024 1 udp 45252 status
100024 1 tcp 60557 status
100005 3 udp 20048 mountd
100005 3 tcp 20048 mountd
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100227 3 tcp 2049 nfs_acl
100021 1 udp 40987 nlockmgr
100021 3 udp 40987 nlockmgr
100021 4 udp 40987 nlockmgr
100021 1 tcp 36527 nlockmgr
100021 3 tcp 36527 nlockmgr
100021 4 tcp 36527 nlockmgr
(Not sure what's up with portmap issues and it not progressing to
trying program 100005.. which as you can see below it does)
But if I just use sunrpc's default pool_mode=global:
# mount -vvvvvvv -o vers=3,nolock 10.200.80.89:/cvol_12_0 /mnt/test
mount.nfs: timeout set for Fri May 23 22:55:43 2025
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 10.200.80.89 prog 100005 vers 3 prot UDP port 20048
# rpcinfo -p 10.200.80.89
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 54037 status
100024 1 tcp 46339 status
100005 1 udp 20048 mountd
100005 1 tcp 20048 mountd
100005 2 udp 20048 mountd
100005 2 tcp 20048 mountd
100005 3 udp 20048 mountd
100005 3 tcp 20048 mountd
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100227 3 tcp 2049 nfs_acl
100021 1 udp 36268 nlockmgr
100021 3 udp 36268 nlockmgr
100021 4 udp 36268 nlockmgr
100021 1 tcp 44195 nlockmgr
100021 3 tcp 44195 nlockmgr
100021 4 tcp 44195 nlockmgr
next prev parent reply other threads:[~2025-05-23 23:09 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-23 18:29 unable to run NFSD in container if "options sunrpc pool_mode=pernode" Mike Snitzer
2025-05-23 18:40 ` Jeff Layton
2025-05-23 22:19 ` Mike Snitzer
2025-05-23 22:38 ` Mike Snitzer
2025-05-23 22:40 ` Jeff Layton
2025-05-23 23:09 ` Mike Snitzer [this message]
2025-05-24 3:53 ` Mike Snitzer
2025-05-24 10:26 ` Jeff Layton
2025-05-24 12:05 ` Jeff Layton
2025-05-24 14:33 ` Mike Snitzer
2025-05-24 15:10 ` Jeff Layton
2025-05-27 13:50 ` Jeff Layton
2025-05-27 21:59 ` Jeff Layton
2025-06-13 12:32 ` Jeff Layton
2025-05-23 18:40 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aDEAJzELBTH0CqHI@kernel.org \
--to=snitzer@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=jlayton@kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neil@brown.name \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.