[RFC] NFS: named client identities for mTLS mounts and a per-namespace .nfs keyring

Linux NFS development

* [RFC] NFS: named client identities for mTLS mounts and a per-namespace .nfs keyring
@ 2026-06-02 15:47 Chuck Lever
  2026-06-03  1:39 ` Hannes Reinecke
  0 siblings, 1 reply; 3+ messages in thread
From: Chuck Lever @ 2026-06-02 15:47 UTC
  To: linux-nfs
  Cc: keyrings, kernel-tls-handshake, netdev, Trond Myklebust,
	Anna Schumaker, Christoph Hellwig, Hannes Reinecke, David Howells,
	Jarkko Sakkinen, Sagi Grimberg

Today, exactly one x.509 certificate and private key pair can be
used at a time for all NFS mounts. The location of that pair is
set in /etc/tlshd/config.

We currently have an awkward experimental mechanism for specifying
an alternative x.509 certificate and private key for an xprtsec=mtls
NFS mount, but it needs to be completed so it can be documented and
advertised for use.

I asked Claude to write a rough draft of a design document that
outlines what needs to be done to finish the work. I would like
input on the kernel-side mechanism, in particular the
per-network-namespace keyring and the way userspace reaches it.

Problem
=======

NFS mutual-TLS mounts (xprtsec=mtls) need the client to present an
x.509 certificate and prove possession of its private key. The
handshake runs in userspace in tlshd; the kernel hands tlshd the
credentials as key serial numbers over the handshake genetlink
upcall.

The only front end today is two undocumented integer mount options:

    mount -o xprtsec=mtls,cert_serial=723847,privkey_serial=723848 \
          server:/export /mnt

The administrator must load the cert and key into the keyring out of
band, discover the integer serials, and paste them onto the command
line. Serials are opaque, non-reproducible across boots, and easy to
transpose. There is also no isolation: nfs_tls_key_verify() does a
global key_lookup() on the serial, and the .nfs keyring created in
fs/nfs/inode.c is module-global and never referenced again -- any tlshd
that learns a serial can read the key.

This RFC proposes a named, per-mount client-identity interface backed
by a provisioning CLI, and fixes the keyring to isolate credentials per
network namespace. The kernel handshake ABI (integer serials over
genetlink) does not change.

The cross-subsystem ask: a per-netns .nfs keyring
=================================================

Network namespace is the correct isolation domain. tlshd is bound to a
network namespace, not a user namespace: it services sockets passed up
from the kernel over the per-netns handshake genetlink socket, and one
tlshd runs per network namespace that needs TLS-protected mounts.

  - Replace the dead module-global .nfs keyring with one keyring per
    network namespace, held in struct nfs_net (fs/nfs/netns.h) and
    allocated at nfs_net_init(). The keys subsystem otherwise
    namespaces on user_namespace, so this is a kernel-held object
    referenced from nfs_net (like today's global keyring, but one per
    netns). The DNS resolver's per-netns key scoping (net->key_domain,
    request_key_net()) is precedent that netns-scoped key handling is
    acceptable.

  - tlshd attaches at handshake time, not at launch. This matters: the
    keyring may be empty or freshly created when tlshd starts, so
    linking it by name at startup is the wrong model. Instead NFS sets
    ta_keyring to the netns .nfs keyring serial in
    xs_tls_handshake_sync(), the kernel sends it as
    HANDSHAKE_A_ACCEPT_KEYRING, and tlshd links that serial into its
    session keyring per handshake -- the path tlshd already implements.
    Linking grants tlshd possession of the keyring and, through it, of
    the possessor-scoped cert and privkey keys.

  - Credential keys are created possessor-readable only (no
    KEY_USR_READ). That is what makes isolation enforceable rather
    than advisory: a key provisioned in namespace A is absent from B's
    keyring and unreadable by B's tlshd even if its serial leaks.
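
To illustrate the last point, here is a minimal sketch of what a
possessor-only permission mask could look like. The KEY_POS_* and
KEY_USR_* values mirror include/linux/key.h; the NFS_TLS_CRED_PERM
name, the exact set of bits chosen, and the nfs_tls_may_read() helper
are assumptions made for this example. The helper is a deliberately
simplified model of the kernel's permission check (it ignores the group
and other bits), not the real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Permission bits, same values as include/linux/key.h. */
#define KEY_POS_VIEW   0x01000000u  /* possessor may view attributes */
#define KEY_POS_READ   0x02000000u  /* possessor may read the payload */
#define KEY_POS_SEARCH 0x08000000u  /* possessor may find the key by search */
#define KEY_USR_VIEW   0x00010000u  /* owning UID may view attributes */
#define KEY_USR_READ   0x00020000u  /* owning UID may read the payload */

/*
 * Hypothetical mask for cert/privkey keys: the payload is readable only
 * through possession. The owner can still see that the key exists.
 */
#define NFS_TLS_CRED_PERM \
	(KEY_POS_VIEW | KEY_POS_READ | KEY_POS_SEARCH | KEY_USR_VIEW)

/* The isolation argument depends on this bit staying clear. */
static_assert((NFS_TLS_CRED_PERM & KEY_USR_READ) == 0,
	      "credential keys must not be owner-readable");

/*
 * Simplified model of the read check: possession contributes the
 * possessor bits, UID ownership contributes the user bits.
 */
bool nfs_tls_may_read(uint32_t perm, bool possessor, bool is_owner)
{
	if (possessor && (perm & KEY_POS_READ))
		return true;
	if (is_owner && (perm & KEY_USR_READ))
		return true;
	return false;
}
```

Under this model a tlshd that has linked the netns keyring (and so
possesses the keys in it) can read the payload, while a process that
merely runs as the owning UID and knows the serial cannot.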

Open question, and where I most want input: userspace -- the
provisioning CLI and mount.nfs -- needs to name the kernel-held netns
keyring in order to add and search keys. Candidates, modeled on
KEYCTL_GET_PERSISTENT (security/keys/persistent.c):

  (a) a new keyctl command that links the caller's netns .nfs keyring
      into a destination keyring and returns its serial;
  (b) an NFS-specific request_key key type the module instantiates to
      point at the netns keyring;
  (c) a per-netns serial exported via procfs or netlink.

The per-netns keyring decision itself I consider settled; the retrieval
primitive is the open one. There is also a user_namespace accounting
nuance: keys added by userspace are quota-charged against a key_user
keyed by user_namespace even though the keyring lives in nfs_net. I
would like the keyrings folks to confirm the quota and ownership
interaction is sane when the user_ns and net_ns boundaries do not
coincide.

Userspace front end
===================

With the keyring in place the front end is straightforward and follows
the nvme-cli / cifscreds pattern.

A new nfs-utils tool -- working name nfstlskey, fitting the nfsidmap /
nfsconf family -- manages x.509 client identities:

    nfstlskey add  <identity> --cert cert.pem --key key.pem
    nfstlskey list
    nfstlskey remove <identity>

The add subcommand reads the PEM cert and key, converts each to DER,
and creates two "user" keys on the netns .nfs keyring ("user" because
tlshd consumes raw DER via keyctl_read_alloc()), with possessor-only
read. Description convention:

    nfs:x509:<identity>:cert
    nfs:x509:<identity>:privkey
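
A small sketch of how a tool might build these descriptions. The
function name nfs_tls_key_desc() is hypothetical, and rejecting
identities that contain ':' is an assumption of this example (it keeps
the description unambiguous to parse); the RFC text does not specify
the allowed identity syntax.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "nfs:x509:<identity>:<part>" into buf, where part is "cert" or
 * "privkey". Returns 0 on success, -1 if the identity is empty, contains
 * ':' (an assumption of this sketch), or the result does not fit in buf.
 */
int nfs_tls_key_desc(char *buf, size_t len, const char *identity,
		     const char *part)
{
	int n;

	assert(buf != NULL && part != NULL);

	if (identity == NULL || identity[0] == '\0' ||
	    strchr(identity, ':') != NULL)
		return -1;

	n = snprintf(buf, len, "nfs:x509:%s:%s", identity, part);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* encoding error or truncated */
	return 0;
}
```

For example, the identity "backup-client" with part "cert" yields the
description "nfs:x509:backup-client:cert".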

The mount command names the identity:

    mount -o xprtsec=mtls,tls_identity=<identity> server:/export /mnt

mount.nfs runs in the caller's namespace, searches the .nfs keyring for
the two descriptions, and passes the existing cert_serial= and
privkey_serial= options to the kernel. tls_identity= is purely a
userspace convenience that resolves a name to the serials the kernel
already accepts; the raw serial options remain as a documented escape
hatch. Both get documented in nfs(5), with a new nfstlskey(8) page.
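
The resolution step described above could be sketched roughly as
follows. Everything here is illustrative: nfs_tls_expand_identity() and
the lookup callback type are hypothetical names, and
nfs_tls_demo_lookup() is a stand-in for a real search of the netns .nfs
keyring (for instance with keyctl_search() from libkeyutils), which
depends on the still-open retrieval primitive. The serials it returns
are the ones from the mount example earlier in this mail.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef int32_t key_serial_t;	/* same underlying type as in keyutils.h */

/* Hypothetical callback: serial of the key named desc, or -1. */
typedef key_serial_t (*nfs_tls_lookup_fn)(const char *desc);

/* Stand-in for a keyring search; knows a single identity, "backup". */
key_serial_t nfs_tls_demo_lookup(const char *desc)
{
	if (strcmp(desc, "nfs:x509:backup:cert") == 0)
		return 723847;
	if (strcmp(desc, "nfs:x509:backup:privkey") == 0)
		return 723848;
	return -1;
}

/*
 * Copy the comma-separated mount options in opts to out, replacing any
 * tls_identity=<name> with cert_serial=<n>,privkey_serial=<m>.
 * Returns 0 on success, -1 on lookup failure or if out is too small.
 */
int nfs_tls_expand_identity(const char *opts, nfs_tls_lookup_fn lookup,
			    char *out, size_t outlen)
{
	static const char prefix[] = "tls_identity=";
	const size_t plen = sizeof(prefix) - 1;
	const char *p = opts;
	size_t used = 0;

	assert(opts != NULL && lookup != NULL && out != NULL);
	if (outlen == 0)
		return -1;
	out[0] = '\0';

	while (*p != '\0') {
		size_t toklen = strcspn(p, ",");
		const char *sep = used ? "," : "";
		int n;

		if (toklen > plen && strncmp(p, prefix, plen) == 0) {
			char cert[256], priv[256];
			int idlen = (int)(toklen - plen);
			key_serial_t c, k;

			if (snprintf(cert, sizeof(cert), "nfs:x509:%.*s:cert",
				     idlen, p + plen) >= (int)sizeof(cert) ||
			    snprintf(priv, sizeof(priv), "nfs:x509:%.*s:privkey",
				     idlen, p + plen) >= (int)sizeof(priv))
				return -1;
			c = lookup(cert);
			k = lookup(priv);
			if (c <= 0 || k <= 0)
				return -1;	/* identity not provisioned */
			n = snprintf(out + used, outlen - used,
				     "%scert_serial=%d,privkey_serial=%d",
				     sep, (int)c, (int)k);
		} else {
			/* Any other option passes through unchanged. */
			n = snprintf(out + used, outlen - used, "%s%.*s",
				     sep, (int)toklen, p);
		}
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
		p += toklen;
		if (*p == ',')
			p++;
	}
	return 0;
}
```

With the stand-in lookup, "xprtsec=mtls,tls_identity=backup" expands to
"xprtsec=mtls,cert_serial=723847,privkey_serial=723848", which is the
form the kernel already accepts. An unknown identity makes the function
fail rather than pass an unresolved option to the kernel.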

tlshd changes are minimal: confirm the per-handshake link of the passed
keyring happens before the cert and privkey serials are read, and
retire the now-unnecessary .nfs entry in the keyrings= startup path.

Future work: mTLS-protected NFSROOT
===================================

NFSROOT mounts run in the kernel at boot, in the initial network
namespace, before any userspace mount.nfs, nfstlskey, or possibly tlshd
exists. mTLS for NFSROOT therefore cannot rely on userspace
provisioning: the kernel must obtain the client cert and key itself --
the leading candidate is extracting key material from a TPM -- and
place it on the initial-netns .nfs keyring before the handshake, with a
tlshd available early (initramfs). The kernel-owned per-netns keyring
chosen here is a prerequisite: a userspace-created keyring could not be
populated at NFSROOT time. Out of scope for the initial work, but the
design must not foreclose it, and the same TPM-resident-key path would
inform an eventual PKCS#11/TPM identity-naming scheme.

Alternatives considered
=======================

  - request-key upcall keyed by server name (the nfsidmap model): most
    NFS-native and needs no mount option, but makes selection automatic
    rather than per-mount explicit. Worth revisiting if per-server
    auto-selection is ever wanted.

  - File paths in the mount options (strongSwan / wpa_supplicant model):
    forces mount.nfs to read key files and load the keyring on every
    mount and offers no reusable provisioned identity.

  - Document the raw serials only: does nothing for namespacing.

Comments welcome, particularly on the keyring retrieval primitive and
whether network namespace is the binding you would expect.

--
Chuck Lever


* Re: [RFC] NFS: named client identities for mTLS mounts and a per-namespace .nfs keyring
  2026-06-02 15:47 [RFC] NFS: named client identities for mTLS mounts and a per-namespace .nfs keyring Chuck Lever
@ 2026-06-03  1:39 ` Hannes Reinecke
  2026-06-03 14:27   ` Chuck Lever
  0 siblings, 1 reply; 3+ messages in thread
From: Hannes Reinecke @ 2026-06-03  1:39 UTC
  To: Chuck Lever, linux-nfs
  Cc: keyrings, kernel-tls-handshake, netdev, Trond Myklebust,
	Anna Schumaker, Christoph Hellwig, David Howells, Jarkko Sakkinen,
	Sagi Grimberg

On 6/2/26 17:47, Chuck Lever wrote:
> Today, exactly one x.509 certificate and private key pair can be
> used at a time for all NFS mounts. The location of that pair is
> set in /etc/tlshd/config.
> 
> We currently have an awkward experimental mechanism for specifying
> an alternative x.509 certificate and private key for an xprtsec=mtls
> NFS mount, but it needs to be completed so it can be documented and
> advertised for use.
> 
> I asked Claude to write a rough draft of a design document that
> outlines what needs to be done to finish the work. I would like
> input on the kernel-side mechanism in particular for the
> per-network-namespace keyring and the way userspace reaches it.
> 
> 
> Problem
> =======
> 
> NFS mutual-TLS mounts (xprtsec=mtls) need the client to present an
> x.509 certificate and prove possession of its private key. The
> handshake runs in userspace in tlshd; the kernel hands tlshd the
> credentials by keyring serial number over the handshake genetlink
> upcall.
> 
> The only front end today is two undocumented integer mount options:
> 
>      mount -o xprtsec=mtls,cert_serial=723847,privkey_serial=723848 \
>            server:/export /mnt
> 
> The administrator must load the cert and key into the keyring out of
> band, discover the integer serials, and paste them onto the command
> line. Serials are opaque, non-reproducible across boots, and easy to
> transpose. There is also no isolation: nfs_tls_key_verify() does a
> global key_lookup() on the serial, and the .nfs keyring created in
> fs/nfs/inode.c is module-global and never referenced again -- any tlshd
> that learns a serial can read the key.
> 
> This RFC proposes a named, per-mount client-identity interface backed
> by a provisioning CLI, and fixes the keyring to isolate credentials per
> network namespace. The kernel handshake ABI (integer serials over
> genetlink) does not change.
> 
> 
> The cross-subsystem ask: a per-netns .nfs keyring
> =================================================
> 
> Network namespace is the correct isolation domain. tlshd is bound to a
> network namespace, not a user namespace: it services sockets passed up
> from the kernel over the per-netns handshake genetlink socket, and one
> tlshd runs per network namespace that needs TLS-protected mounts.
> 
>    - Replace the dead module-global .nfs keyring with one keyring per
>      network namespace, held in struct nfs_net (fs/nfs/netns.h) and
>      allocated at nfs_net_init(). The keys subsystem otherwise
>      namespaces on user_namespace, so this is a kernel-held object
>      referenced from nfs_net (like today's global keyring, but one per
>      netns). The DNS resolver's per-netns key scoping (net->key_domain,
>      request_key_net()) is precedent that netns-scoped key handling is
>      acceptable.
> 
>    - tlshd attaches at handshake time, not at launch. This matters: the
>      keyring may be empty or freshly created when tlshd starts, so
>      linking it by name at startup is the wrong model. Instead NFS sets
>      ta_keyring to the netns .nfs keyring serial in
>      xs_tls_handshake_sync(), the kernel sends it as
>      HANDSHAKE_A_ACCEPT_KEYRING, and tlshd links that serial into its
>      session keyring per handshake -- the path tlshd already implements.
>      Linking grants tlshd possession of the keyring and, through it, of
>      the possessor-scoped cert and privkey keys.
> 
>    - Credential keys are created possessor-readable only (no
>      KEY_USR_READ). That is what makes isolation enforceable rather
>      than advisory: a key provisioned in namespace A is absent from B's
>      keyring and unreadable by B's tlshd even if its serial leaks.
> 
> Open question, and where I most want input: userspace -- the
> provisioning CLI and mount.nfs -- needs to name the kernel-held netns
> keyring in order to add and search keys. Candidates, modeled on
> KEYCTL_GET_PERSISTENT (security/keys/persistent.c):
> 
>    (a) a new keyctl command that links the caller's netns .nfs keyring
>        into a destination keyring and returns its serial;
>    (b) an NFS-specific request_key key type the module instantiates to
>        point at the netns keyring;
>    (c) a per-netns serial exported via procfs or netlink.
> 
> The per-netns keyring decision itself I consider settled; the retrieval
> primitive is the open one. There is also a user_namespace accounting
> nuance: keys added by userspace are quota-charged against a key_user
> keyed by user_namespace even though the keyring lives in nfs_net. I
> would like the keyrings folks to confirm the quota and ownership
> interaction is sane when the user_ns and net_ns boundaries do not
> coincide.
>

I am all for making keyrings namespace-aware. Logically I _think_ they
should be tagged per user namespace, as this really is about the
filesystem (and as such might even warrant tagging per mount ns).
Tagging per net-namespace is not a great fit (well, for me, at
least), as block devices might also require keys to present the
bdev (e.g. NVMe authentication).

I might be okay with having it tagged per net-namespace, though, as all
current users are network related in some shape or form.
But I'm not sure it will stay that way, so I am worried that we're
restricting ourselves too much by that choice.
Really, the question is: what is driving the namespace selection?
Is it the _requesting_ layer, i.e. the layer issuing the mount() call?
Or is it the _providing_ layer, i.e. the layer providing the
devices/interfaces that the mount() call operates on?
If it's the former, then we need to tag it as
net-namespace. If it's the latter, then we need to tag it as a
user-namespace / mount-ns.

We should probably ask Christian ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


* Re: [RFC] NFS: named client identities for mTLS mounts and a per-namespace .nfs keyring
  2026-06-03  1:39 ` Hannes Reinecke
@ 2026-06-03 14:27   ` Chuck Lever
  0 siblings, 0 replies; 3+ messages in thread
From: Chuck Lever @ 2026-06-03 14:27 UTC
  To: Hannes Reinecke, linux-nfs
  Cc: keyrings, kernel-tls-handshake, netdev, Trond Myklebust,
	Anna Schumaker, Christoph Hellwig, David Howells, Jarkko Sakkinen,
	Sagi Grimberg


On Tue, Jun 2, 2026, at 6:39 PM, Hannes Reinecke wrote:
> I am all for making keyrings namespace-aware. Logically I _think_ they
> should be tagged per user namespace, as this really is about the
> filesystem (and as such might even warrant tagging per mount ns).
> Tagging per net-namespace is not a great fit (well, for me, at
> least), as block devices might also require keys to present the
> bdev (e.g. NVMe authentication).

My understanding of the proposal is that there is one keyring on the
system for .nfs and the keys in it are visible only in the namespace
where they were created.

Therefore the consumer (say, NFS, or NFSD) is running in a particular
network namespace. It will create keys on the one .nfs keyring, but
only the tlshd in that same network namespace will have access to
those keys.


> I might be okay with having it tagged per net-namespace, though, as all
> current users are network related in some shape or form.
> But I'm not sure it will stay that way, so I am worried that we're
> restricting ourselves too much by that choice.
> Really, the question is: what is driving the namespace selection?
> Is it the _requesting_ layer, i.e. the layer issuing the mount() call?
> Or is it the _providing_ layer, i.e. the layer providing the
> devices/interfaces that the mount() call operates on?
> If it's the former, then we need to tag it as
> net-namespace. If it's the latter, then we need to tag it as a
> user-namespace / mount-ns.

tlshd is a network layer service, so it doesn't make sense to bind
it to a user or mount namespace, IMHO.


-- 
Chuck Lever

