From: Mike Snitzer <snitzer@kernel.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>, NeilBrown <neil@brown.name>,
	linux-nfs@vger.kernel.org
Subject: Re: unable to run NFSD in container if "options sunrpc pool_mode=pernode"
Date: Fri, 23 May 2025 19:09:27 -0400
Message-ID: <aDEAJzELBTH0CqHI@kernel.org>
In-Reply-To: <6bb9e9cce27e2a222bf55e272d690aab8f0eef13.camel@kernel.org>

On Fri, May 23, 2025 at 06:40:45PM -0400, Jeff Layton wrote:
> On Fri, 2025-05-23 at 18:19 -0400, Mike Snitzer wrote:
> > On Fri, May 23, 2025 at 02:40:17PM -0400, Jeff Layton wrote:
> > > On Fri, 2025-05-23 at 14:29 -0400, Mike Snitzer wrote:
> > > > I don't know if $SUBJECT ever worked... but with latest 6.15 or
> > > > nfsd-testing if I just use pool_mode=global then all is fine.
> > > > 
> > > > If pool_mode=pernode then mounting the container's NFSv3 export fails.
> > > > 
> > > > I haven't started to dig into code yet but pool_mode=pernode works
> > > > perfectly fine if NFSD isn't running in a container.
> > > > 
> 
> Oops, I went and looked and nfsd isn't running in a container on these
> boxes. There are some other containerized apps running on the box, but
> nfsd isn't running in a container.

OK.

> > > > ps. yet another reason why pool_mode=pernode should be the default if
> > > > more than 1 NUMA node ;)
> > > 
> > > Huh, strange. I've no idea why that would be. What kernel is this?
> > 
> > It is this 6.12.24 based frankenbeast-ish kernel:
> > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=kernel-6.12.24/main-testing
> > 
> > Basically just 6.12.24 + NFS and NFSD sync'd through nfs-testing and
> > nfsd-testing (so 6.15 NFS and NFSD going on 6.16).
> > 
> > But I also just verified that this kernel built on Chuck's
> > nfsd-testing branch (with 2 extra patches) has the same issue:
> > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=cel-nfsd-testing-6.16
> > 
> > Here is the NFS related config:
> > 
> > CONFIG_NETWORK_FILESYSTEMS=y
> > CONFIG_NFS_FS=m
> > # CONFIG_NFS_V2 is not set
> > CONFIG_NFS_V3=m
> > CONFIG_NFS_V3_ACL=y
> > CONFIG_NFS_V4=m
> > # CONFIG_NFS_SWAP is not set
> > CONFIG_NFS_V4_1=y
> > CONFIG_NFS_V4_2=y
> > CONFIG_PNFS_FILE_LAYOUT=m
> > CONFIG_PNFS_BLOCK=m
> > CONFIG_PNFS_FLEXFILE_LAYOUT=m
> > CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
> > # CONFIG_NFS_V4_1_MIGRATION is not set
> > CONFIG_NFS_V4_SECURITY_LABEL=y
> > CONFIG_NFS_FSCACHE=y
> > # CONFIG_NFS_USE_LEGACY_DNS is not set
> > CONFIG_NFS_USE_KERNEL_DNS=y
> > CONFIG_NFS_DEBUG=y
> > CONFIG_NFS_DISABLE_UDP_SUPPORT=y
> > # CONFIG_NFS_V4_2_READ_PLUS is not set
> > CONFIG_NFSD=m
> > # CONFIG_NFSD_V2 is not set
> > CONFIG_NFSD_V3_ACL=y
> > CONFIG_NFSD_V4=y
> > CONFIG_NFSD_PNFS=y
> > # CONFIG_NFSD_BLOCKLAYOUT is not set
> > CONFIG_NFSD_SCSILAYOUT=y
> > # CONFIG_NFSD_FLEXFILELAYOUT is not set
> > # CONFIG_NFSD_V4_2_INTER_SSC is not set
> > CONFIG_NFSD_V4_SECURITY_LABEL=y
> > # CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set
> > # CONFIG_NFSD_V4_DELEG_TIMESTAMPS is not set
> > CONFIG_GRACE_PERIOD=m
> > CONFIG_LOCKD=m
> > CONFIG_LOCKD_V4=y
> > CONFIG_NFS_ACL_SUPPORT=m
> > CONFIG_NFS_COMMON=y
> > CONFIG_NFS_COMMON_LOCALIO_SUPPORT=m
> > CONFIG_NFS_LOCALIO=y
> > CONFIG_NFS_V4_2_SSC_HELPER=y
> > CONFIG_SUNRPC=m
> > CONFIG_SUNRPC_GSS=m
> > CONFIG_SUNRPC_BACKCHANNEL=y
> > CONFIG_RPCSEC_GSS_KRB5=m
> > CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1=y
> > CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA2=y
> > CONFIG_SUNRPC_DEBUG=y
> > CONFIG_SUNRPC_XPRT_RDMA=m
> > 
> > > FWIW, I just built a localio-enabled on a v6.12-uek kernel for our own
> > > purposes yesterday and it's running pool_mode=pernode. It seemed to
> > > work fine as a v3 DS, but I didn't test mounting the container's export
> > > directly.
> > 
> > OK, but you were able to access the v3 DS just fine (assuming pNFS
> > flexfiles layouts that point to your DS that is running NFSD in a
> > container) ?
> > 
> > I'm using nfs-utils-2.8.2.  I don't see any nfsd threads running if I
> > use "options sunrpc pool_mode=pernode".
> > 
> 
> I'll have a look soon, but if you figure it out in the meantime, let us
> know.

Will do.

Here is the latest info I have: with sunrpc's pool_mode=pernode, dd hangs
with this stack trace:

# cat /proc/8087/stack
[<0>] rpc_wait_bit_killable+0x25/0x80 [sunrpc]
[<0>] __rpc_execute+0x151/0x480 [sunrpc]
[<0>] rpc_execute+0xca/0xf0 [sunrpc]
[<0>] rpc_run_task+0x110/0x180 [sunrpc]
[<0>] nfs4_call_sync_custom+0xb/0x30 [nfsv4]
[<0>] nfs4_do_call_sync+0x69/0x90 [nfsv4]
[<0>] _nfs4_proc_getattr+0x128/0x160 [nfsv4]
[<0>] nfs4_proc_getattr+0x73/0x100 [nfsv4]
[<0>] nfs4_do_open+0x775/0x9d0 [nfsv4]
[<0>] nfs4_atomic_open+0xf7/0x100 [nfsv4]
[<0>] nfs_atomic_open+0x1e7/0x6c0 [nfs]
[<0>] path_openat+0xd38/0x11f0
[<0>] do_filp_open+0xae/0x120
[<0>] do_sys_openat2+0x24d/0x2a0
[<0>] do_sys_open+0x4f/0x90
[<0>] do_syscall_64+0x7b/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

And if I just try to mount using v3 it fails with:

# mount -vvvvvvv -o vers=3,nolock 10.200.80.89:/cvol_12_0 /mnt/test
mount.nfs: timeout set for Fri May 23 22:52:04 2025
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: portmap query retrying: RPC: Timed out
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: portmap query retrying: RPC: Timed out
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: portmap query retrying: RPC: Timed out
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: requested NFS version or transport protocol is not supported for /mnt/test

# rpcinfo -p 10.200.80.89
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp  20048  mountd
    100005    1   tcp  20048  mountd
    100005    2   udp  20048  mountd
    100005    2   tcp  20048  mountd
    100024    1   udp  45252  status
    100024    1   tcp  60557  status
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100021    1   udp  40987  nlockmgr
    100021    3   udp  40987  nlockmgr
    100021    4   udp  40987  nlockmgr
    100021    1   tcp  36527  nlockmgr
    100021    3   tcp  36527  nlockmgr
    100021    4   tcp  36527  nlockmgr

(Not sure what's up with the portmap issues, or why it never progresses
to trying program 100005, which it does with pool_mode=global, as you
can see below.)
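
For reference, in the mount output prot=6 is TCP and prot=17 is UDP. In
the rpcinfo listing above, program 100003 (nfs) appears only with tcp
entries, so the "Program not registered" result for prot=17 matches the
listing; it does not explain the TCP timeout. A minimal sketch of that
check follows. The helper name parse_rpcinfo and the trimmed SAMPLE text
are illustrative assumptions, not part of nfs-utils or rpcbind.

```python
def parse_rpcinfo(text):
    """Parse `rpcinfo -p` style output into a set of
    (program, version, proto) tuples."""
    registered = set()
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and any blank or malformed lines.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        program, version, proto = int(fields[0]), int(fields[1]), fields[2]
        registered.add((program, version, proto))
    return registered


# Trimmed copy of a few rows from the listing above (hypothetical subset).
SAMPLE = """\
   program vers proto   port  service
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
"""

regs = parse_rpcinfo(SAMPLE)
print((100003, 3, "tcp") in regs)  # NFSv3 over TCP: True
print((100003, 3, "udp") in regs)  # NFSv3 over UDP: False
```

Feeding it the full output of `rpcinfo -p` from both pool modes would
show whether the registrations themselves differ between the two runs.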

But if I just use sunrpc's default pool_mode=global:

# mount -vvvvvvv -o vers=3,nolock 10.200.80.89:/cvol_12_0 /mnt/test
mount.nfs: timeout set for Fri May 23 22:55:43 2025
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 10.200.80.89 prog 100005 vers 3 prot UDP port 20048

# rpcinfo -p 10.200.80.89
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  54037  status
    100024    1   tcp  46339  status
    100005    1   udp  20048  mountd
    100005    1   tcp  20048  mountd
    100005    2   udp  20048  mountd
    100005    2   tcp  20048  mountd
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100021    1   udp  36268  nlockmgr
    100021    3   udp  36268  nlockmgr
    100021    4   udp  36268  nlockmgr
    100021    1   tcp  44195  nlockmgr
    100021    3   tcp  44195  nlockmgr
    100021    4   tcp  44195  nlockmgr
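
When comparing the two modes it may help to record the active pool_mode
and the nfsd thread count next to each test run. The sketch below
assumes the sunrpc module parameter is readable at
/sys/module/sunrpc/parameters/pool_mode and that the nfsd filesystem is
mounted at /proc/fs/nfsd in the namespace where the script runs; the
function names are made up for illustration.

```python
from pathlib import Path


def read_pool_mode(path="/sys/module/sunrpc/parameters/pool_mode"):
    """Return the sunrpc pool_mode string (e.g. "global" or "pernode")."""
    return Path(path).read_text().strip()


def read_nfsd_threads(path="/proc/fs/nfsd/threads"):
    """Return the total number of nfsd threads as an integer."""
    return int(Path(path).read_text().split()[0])


def main():
    # Both paths are assumptions about the running system; a missing
    # file is reported rather than treated as fatal.
    for label, reader in (("pool_mode", read_pool_mode),
                          ("nfsd threads", read_nfsd_threads)):
        try:
            print(f"{label}: {reader()}")
        except OSError as err:
            print(f"{label}: unavailable ({err})")


if __name__ == "__main__":
    main()
```

A thread count of 0 with pool_mode=pernode would be consistent with the
earlier observation that no nfsd threads are running in that mode.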
