The Linux Kernel Mailing List
* Re: [RFC PATCH v1 11/11] landlock: Add documentation for capability and namespace restrictions
       [not found]     ` <20260423.yipaikooJ6oo@digikod.net>
@ 2026-05-08 15:13       ` Günther Noack
  0 siblings, 0 replies; 3+ messages in thread
From: Günther Noack @ 2026-05-08 15:13 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, Christian Brauner, Paul Moore,
	Serge E . Hallyn, Justin Suess, Lennart Poettering,
	Mikhail Ivanov, Nicolas Bouchinet, Shervin Oloumi, Tingmao Wang,
	kernel-team, linux-fsdevel, linux-kernel, linux-security-module

On Thu, Apr 23, 2026 at 03:52:12PM +0200, Mickaël Salaün wrote:
> On Wed, Apr 22, 2026 at 10:38:33PM +0200, Günther Noack wrote:
> > Hello!
> > 
> > On Thu, Mar 12, 2026 at 11:04:44AM +0100, Mickaël Salaün wrote:
> > > Document the two new Landlock permission categories in the userspace
> > > API guide, admin guide, and kernel security documentation.
> > > 
> > > The userspace API guide adds sections on capability restriction
> > > (LANDLOCK_PERM_CAPABILITY_USE with LANDLOCK_RULE_CAPABILITY), namespace
> > > restriction (LANDLOCK_PERM_NAMESPACE_ENTER with LANDLOCK_RULE_NAMESPACE
> > > covering creation via unshare/clone and entry via setns), and the
> > > backward-compatible degradation pattern for ABI < 9.  A table documents
> > > the per-namespace-type capability requirements for both creation and
> > > entry.
> > > 
> > > The admin guide adds the new perm.namespace_enter and
> > > perm.capability_use audit blocker names with their object identification
> > > fields (namespace_type, namespace_inum, capability).
> > > 
> > > The kernel security documentation adds a "Ruleset restriction models"
> > > section defining the three models (handled_access_*, handled_perm,
> > > scoped), their coverage and compatibility properties, and the criteria
> > > for choosing between them for future features.  It also documents
> > > composability with user namespaces and adds kernel-doc references for
> > > the new capability and namespace headers.
> > > 
> > > Cc: Christian Brauner <brauner@kernel.org>
> > > Cc: Günther Noack <gnoack@google.com>
> > > Cc: Paul Moore <paul@paul-moore.com>
> > > Cc: Serge E. Hallyn <serge@hallyn.com>
> > > Signed-off-by: Mickaël Salaün <mic@digikod.net>
> > > ---
> > >  Documentation/admin-guide/LSM/landlock.rst |  19 ++-
> > >  Documentation/security/landlock.rst        |  80 ++++++++++-
> > >  Documentation/userspace-api/landlock.rst   | 156 ++++++++++++++++++++-
> > >  3 files changed, 245 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/Documentation/admin-guide/LSM/landlock.rst b/Documentation/admin-guide/LSM/landlock.rst
> > > index 9923874e2156..99c6a599ce9e 100644
> > > --- a/Documentation/admin-guide/LSM/landlock.rst
> > > +++ b/Documentation/admin-guide/LSM/landlock.rst
> > > @@ -6,7 +6,7 @@ Landlock: system-wide management
> > >  ================================
> > >  
> > >  :Author: Mickaël Salaün
> > > -:Date: January 2026
> > > +:Date: March 2026
> > >  
> > >  Landlock can leverage the audit framework to log events.
> > >  
> > > @@ -59,14 +59,25 @@ AUDIT_LANDLOCK_ACCESS
> > >          - scope.abstract_unix_socket - Abstract UNIX socket connection denied
> > >          - scope.signal - Signal sending denied
> > >  
> > > +    **perm.*** - Permission restrictions (ABI 9+):
> > > +        - perm.namespace_enter - Namespace entry was denied (creation via
> > > +          :manpage:`unshare(2)` / :manpage:`clone(2)` or joining via
> > > +          :manpage:`setns(2)`);
> > > +          ``namespace_type`` indicates the type (hex CLONE_NEW* bitmask),
> > > +          ``namespace_inum`` identifies the target namespace for
> > > +          :manpage:`setns(2)` operations
> > > +        - perm.capability_use - Capability use was denied;
> > > +          ``capability`` indicates the capability number
> > > +
> > >      Multiple blockers can appear in a single event (comma-separated) when
> > >      multiple access rights are missing. For example, creating a regular file
> > >      in a directory that lacks both ``make_reg`` and ``refer`` rights would show
> > >      ``blockers=fs.make_reg,fs.refer``.
> > >  
> > > -    The object identification fields (path, dev, ino for filesystem; opid,
> > > -    ocomm for signals) depend on the type of access being blocked and provide
> > > -    context about what resource was involved in the denial.
> > > +    The object identification fields depend on the type of access being blocked:
> > > +    ``path``, ``dev``, ``ino`` for filesystem; ``opid``, ``ocomm`` for signals;
> > > +    ``namespace_type`` and ``namespace_inum`` for namespace operations;
> > > +    ``capability`` for capability use.
> > >  
> > >  
> > >  AUDIT_LANDLOCK_DOMAIN
> > > diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
> > > index 3e4d4d04cfae..cd3d640ca5c9 100644
> > > --- a/Documentation/security/landlock.rst
> > > +++ b/Documentation/security/landlock.rst
> > > @@ -7,7 +7,7 @@ Landlock LSM: kernel documentation
> > >  ==================================
> > >  
> > >  :Author: Mickaël Salaün
> > > -:Date: September 2025
> > > +:Date: March 2026
> > >  
> > >  Landlock's goal is to create scoped access-control (i.e. sandboxing).  To
> > >  harden a whole system, this feature should be available to any process,
> > > @@ -89,6 +89,72 @@ this is required to keep access controls consistent over the whole system, and
> > >  this avoids unattended bypasses through file descriptor passing (i.e. confused
> > >  deputy attack).
> > >  
> > > +Composability with user namespaces
> > > +----------------------------------
> > > +
> > > +Landlock domain-based scoping and the kernel's user namespace-based capability
> > > +scoping enforce isolation over independent hierarchies.  Landlock checks domain
> > > +ancestry; the kernel's ``ns_capable()`` checks user namespace ancestry.  These
> > > +hierarchies are orthogonal: Landlock enforcement is deterministic with respect
> > > +to its own configuration, regardless of namespace or capability state, and vice
> > > +versa.  This orthogonality is a design invariant that must hold for all new
> > > +scoped features.
> > > +
> > > +Ruleset restriction models
> > > +--------------------------
> > 
> > I have to second Justin, it's a good idea to introduce this explanation.
> > 
> > > +
> > > +Landlock provides three restriction models, each with different coverage
> > > +and compatibility properties.
> > > +
> > > +Access rights (``handled_access_*``)
> > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +Access rights control **enumerated operations on kernel objects**
> > > +identified by a rule key (a file hierarchy or a network port).  Each
> > > +``handled_access_*`` field declares a set of access rights that the
> > > +ruleset restricts.  Multiple access rights share a single rule type.
> > > +Operations for which no access right exists yet remain uncontrolled;
> > > +new rights are added incrementally across ABI versions.
> > > +
> > > +Permissions (``handled_perm``)
> > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +Permissions control **broad operations enforced at single kernel
> > > +chokepoints**, achieving complete deny-by-default coverage.  Each
> > > +``LANDLOCK_PERM_*`` flag maps to its own rule type.  When a ruleset
> > > +handles a permission, all instances of that operation are denied unless
> > > +explicitly allowed by a rule.  New kernel values (new ``CAP_*``
> > > +capabilities, new ``CLONE_NEW*`` namespace types) are automatically
> > > +denied without any Landlock update.
> > 
> > I find the terminology of "chokepoints" and "gateways" in this and the
> > header documentation a bit vague; you could argue that opening a file
> > for reading is also a chokepoint/gateway for using read() later on;
> > it's not immediately clear to me how that's delineated.
> 
> Yeah, I wanted to express something wider than a fine-grained access
> right.  Any alternative words that would fit better?

I find it also difficult to explain.  A "critical enforcement point",
maybe?

     Permissions control **permission checks at critical enforcement
     points**, independent of individual kernel objects.  They guard
     critical features which are prerequisites for further access, such
     as entering namespaces and using capabilities, and do so in a
     deny-by-default manner (all namespace and capability types are
     denied without having to list these individually in the ruleset).

WDYT?

(FWIW, I also found the term "Policy Enforcement Point" on the web, but
that seems to be an Enterprise Software term which probably has more
specific meaning there; probably better to avoid that name.)
        
        
> > In my mind, the handled_* groups of access rights are usually defined
> > by the "namespace" of the objects they are protecting, more than
> > anything else: handled_access_fs: file paths, handled_access_net:
> > struct sockaddr (which we only expose as "port" for now).
> > 
> > To play the devil's advocate, a possible alternative would have been
> > to introduce:
> > 
> >   handled_access_ns with values LANDLOCK_ACCESS_NS_FOO_ENTER,
> >   LANDLOCK_ACCESS_NS_BAR_ENTER, etc. (and documenting somewhere that
> >   these are guaranteed to stay in sync; a static assert is enough to
> >   make sure they do).
> 
> That was actually one of my initial versions, but I couldn't find any
> meaningful other access rights that would both be useful for the
> sandboxing use case and worth the implementation.  At the end I
> concluded that we needed "ambient" access rights for things that are not
> really tied to existing kernel objects, and to be able to fully express
> current and future properties, hence using non-Landlock UAPI
> (capabilities, namespace types...).  The handled_perm name was the least
> ambiguous one I could find, which still makes sense.
> 
> Another important property is that the permissions rules don't have
> access rights, only *one* permission bit which could be removed.  I
> chose to keep it as a safeguard (for UAPI check) and to still be able
> to add new ones for such rule if one day we really find a useful use
> case.  Anyway, it's basically free.

Yes, sounds fair.  I also think these two points are the crucial ones
here, namely (a) it's not specific to a kernel object, and (b) the
deny-by-default property (you don't need to list out all the types in
the ruleset to block them all).  (My suggested rephrasing above talks
about these too.)
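
To make (b) concrete, here is a minimal userspace model of the per-layer
check, loosely following landlock_perm_is_denied() from patch 05/11.  It
is only a sketch under my own assumptions: struct model_layer,
MODEL_PERM_NAMESPACE_ENTER and model_perm_is_denied() are names made up
for this example and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for LANDLOCK_PERM_NAMESPACE_ENTER. */
#define MODEL_PERM_NAMESPACE_ENTER (1ULL << 0)

/* One layer of a domain: which permissions it handles, what it allows. */
struct model_layer {
	uint64_t handled_perm;
	uint64_t allowed_ns;
};

/*
 * Walks the layers from the youngest to the oldest.  Returns the index
 * of the youngest denying layer + 1, or 0 if every layer that handles
 * @perm_bit allows @request_value.
 */
static size_t model_perm_is_denied(const struct model_layer *layers,
				   size_t num_layers, uint64_t perm_bit,
				   uint64_t request_value)
{
	for (size_t i = num_layers; i > 0; i--) {
		const struct model_layer *layer = &layers[i - 1];

		/* A layer that does not handle the permission is skipped. */
		if (!(layer->handled_perm & perm_bit))
			continue;

		/* Deny-by-default: anything not allow-listed is denied. */
		if (!(layer->allowed_ns & request_value))
			return i;
	}
	return 0;
}
```

A layer with handled_perm set and an empty allowed_ns blocks every type,
including ones that are only added to the kernel later, without the
ruleset having to name any of them.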


> >   handled_access_caps with values LANDLOCK_ACCESS_CAPS_USE_FOO,
> >   LANDLOCK_ACCESS_CAPS_USE_BAR, etc., also guaranteed to stay in sync.
> 
> Genuine question: what would be these FOO and BAR?  I couldn't find
> anything worth it.  The idea is to have a simple interface.  In fact,
> initially I didn't have these suffixes (i.e. _USE, _ENTER), and they are
> not really needed, but these are also safeguards in the case we would
> need one, and the main motivation is to make the semantic clear to
> users (and more consistent with other Landlock access rights).

By "FOO" and "BAR" I meant to imply the different capabilities, e.g.,
LANDLOCK_ACCESS_CAP_USE_AUDIT_CONTROL,
LANDLOCK_ACCESS_CAP_USE_AUDIT_READ, LANDLOCK_ACCESS_CAP_USE_AUDIT_WRITE,
LANDLOCK_ACCESS_CAP_USE_BLOCK_SUSPEND, etc.
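
A minimal sketch of what that "rolled out" variant with the static
asserts could have looked like.  The LANDLOCK_ACCESS_CAP_USE_* names are
hypothetical (they only exist in this discussion), and deriving each bit
from the CAP_* number is my assumption about how "staying in sync" would
be defined:

```c
#include <assert.h>
#include <linux/capability.h>

/* Hypothetical flags, not part of any Landlock UAPI. */
#define LANDLOCK_ACCESS_CAP_USE_AUDIT_WRITE	(1ULL << 29)
#define LANDLOCK_ACCESS_CAP_USE_AUDIT_CONTROL	(1ULL << 30)
#define LANDLOCK_ACCESS_CAP_USE_BLOCK_SUSPEND	(1ULL << 36)
#define LANDLOCK_ACCESS_CAP_USE_AUDIT_READ	(1ULL << 37)

/* The build fails if a flag drifts away from its CAP_* number. */
static_assert(LANDLOCK_ACCESS_CAP_USE_AUDIT_WRITE ==
		      (1ULL << CAP_AUDIT_WRITE),
	      "AUDIT_WRITE out of sync");
static_assert(LANDLOCK_ACCESS_CAP_USE_AUDIT_CONTROL ==
		      (1ULL << CAP_AUDIT_CONTROL),
	      "AUDIT_CONTROL out of sync");
static_assert(LANDLOCK_ACCESS_CAP_USE_BLOCK_SUSPEND ==
		      (1ULL << CAP_BLOCK_SUSPEND),
	      "BLOCK_SUSPEND out of sync");
static_assert(LANDLOCK_ACCESS_CAP_USE_AUDIT_READ ==
		      (1ULL << CAP_AUDIT_READ),
	      "AUDIT_READ out of sync");

/* Hypothetical helper: the access bit for a given capability number. */
static inline unsigned long long cap_to_access_bit(int cap)
{
	return 1ULL << cap;
}
```

In this variant, every new capability needs a matching new flag (and
therefore a new Landlock ABI version) before it can be allow-listed.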

> > That way the blocked accesses would still be "operations", and we
> > would not need to have rules for them because the "object" being
> > protected are the processes within the Landlock domain, so to say.
> 
> I'm not sure I understand, but another previous version was to just
> put the capability (and namespace type) bits directly in the ruleset
> struct.  The issue with this approach is that it doesn't work well with
> a deny-by-default enforcement, and this would not be extensible, and
> this would not handle well compatibility (fields set to zero by
> default).
> 
> > 
> > Arguably, the LANDLOCK_ACCESS_FS_MAKE_* rights already follow a
> > similar pattern.
> 
> Hmm, I'm not following.

What I meant is that these are "rolled out" in a similar way to my
LANDLOCK_ACCESS_CAP_USE_... examples above, because they list the
different file types in LANDLOCK_ACCESS_FS_MAKE_CHAR, ..._MAKE_DIR,
..._MAKE_REG, ..._MAKE_SOCK, etc.
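
For illustration, a small userspace sketch of that existing pattern: one
LANDLOCK_ACCESS_FS_MAKE_* right per file type.  The constants come from
<linux/landlock.h>; the helper name make_right_for_mode() is made up for
this example and only mirrors the idea, not the kernel's implementation.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <linux/landlock.h>
#include <stdint.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns the MAKE_* access right that covers the
 * creation of a file of the given type, or 0 for an unknown type.
 */
static uint64_t make_right_for_mode(mode_t mode)
{
	switch (mode & S_IFMT) {
	case S_IFCHR:
		return LANDLOCK_ACCESS_FS_MAKE_CHAR;
	case S_IFDIR:
		return LANDLOCK_ACCESS_FS_MAKE_DIR;
	case S_IFREG:
		return LANDLOCK_ACCESS_FS_MAKE_REG;
	case S_IFSOCK:
		return LANDLOCK_ACCESS_FS_MAKE_SOCK;
	case S_IFIFO:
		return LANDLOCK_ACCESS_FS_MAKE_FIFO;
	case S_IFBLK:
		return LANDLOCK_ACCESS_FS_MAKE_BLOCK;
	case S_IFLNK:
		return LANDLOCK_ACCESS_FS_MAKE_SYM;
	default:
		return 0;
	}
}
```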


> > To be clear, I am myself only 50% convinced whether the API would be
> > better.  The implementation would be easier (but that doesn't count
> > much in comparison).
> > 
> > 
> > > +Each permission flag names a single gateway operation whose control
> > > +transitively covers an open-ended set of downstream operations: for
> > > +example, exercising a capability enables privileged operations across
> > > +many subsystems; entering a namespace enables gaining capabilities in a
> > > +new context.
> > > +
> > > +Permission rules identify what to allow using constants defined by other
> > > +kernel subsystems (``CAP_*``, ``CLONE_NEW*``).  Unknown values are
> > > +silently ignored because deny-by-default ensures they are denied anyway.
> > > +In contrast, unknown ``LANDLOCK_PERM_*`` flags in ``handled_perm`` are
> > > +rejected (``-EINVAL``), since Landlock owns that namespace.
> > 
> > OK I played through the compatibility scenarios which puzzled me in my
> > reply to the cover letter, for both namespaces and capabilities.
> > Namespaces are OK, so I'm just including that for completeness and for
> > comparison, but I think the capabilities might be tricky?
> > 
> > 
> > Case A: Namespaces
> > 
> > In the scenario where a caller restricts
> > LANDLOCK_PERM_NAMESPACE_ENTER, but then adds a rule to allow a
> > non-existent namespace number like 1<<63.
> > 
> > Landlock ABI v9:
> > * The rule is accepted and the unknown value for the namespace type
> >   silently ignored
> > * It is not possible to enter the namespace because the namespace API
> >   doesn't exist for it.  (But that's appropriate.)
> 
> Yes, the namespace would just be unknown to the kernel, Landlock doesn't
> do anything here.
> 
> > 
> > Landlock ABI v_future (the namespace type 1<<63 exists now):
> > * The rule continues to be accepted.
> > * When trying to exercise the namespace type, it works.
> 
> It works because the kernel now knows about this namespace.  Again,
> nothing related to Landlock specifically.
> 
> > 
> > It seems that this scenario works fine.  In the earlier version,
> > entering the namespace already doesn't work because the kernel doesn't
> > have support for it.
> > 
> > 
> > Case B: Capabilities
> > 
> > When new capabilities are introduced, I see that people have used the
> > pattern where these capabilities are split off from operations which
> > were previously controlled by CAP_SYS_ADMIN.  An example is commit
> > a17b53c4a4b5 ("bpf, capability: Introduce CAP_BPF"), which states:
> > 
> >   Split BPF operations that are allowed under CAP_SYS_ADMIN into
> >   combination of CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN.  For backward
> >   compatibility include them in CAP_SYS_ADMIN as well.
> > 
> > (The same pattern was also used in the introduction of
> > CAP_CHECKPOINT_RESTORE and CAP_PERFMON.  CAP_AUDIT_READ is older and
> > did it differently.)
> 
> The key point here (and the architectural limitation) is that a new
> capability cannot completely replace an existing one.  The original
> capability check will remain forever.
> 
> > 
> > Let's say there is a frobnicate() syscall guarded by CAP_SYS_ADMIN.  A
> > future kernel introduces CAP_FOO and then checks for frobnicate() that
> > either one of CAP_FOO or CAP_SYS_ADMIN are present.
> > 
> > A caller creates a ruleset restricting capability use with Landlock,
> > and adds a rule to allow CAP_FOO but not CAP_SYS_ADMIN (e.g.,
> > ^CAP_SYS_ADMIN)
> > 
> > Landlock ABI v9:  (CAP_FOO doesn't exist)
> > * The rule for CAP_FOO is accepted and the unknown value for the
> >   capability silently ignored.
> > * The call to frobnicate() fails because the use of the capability is
> >   forbidden
> > 
> > Landlock ABI v10:  (CAP_FOO starts to exist)
> > * The rule continues to be accepted
> > * The call to frobnicate() **succeeds now**, because the new kernel guards
> >   the operation by either one of those capabilities.
> > 
> > 
> > So... for capabilities, it seems to be slightly incompatible if users
> > allow capabilities with a rule which are not known yet?  The reason
> > for that is the way how capabilities "fork off" from CAP_SYS_ADMIN.
> 
> The key point is that the compatibility is deferred to the other kernel
> subsystems.  User space needs to know which capabilities (or namespace
> types) are supported before using them.  It's not a Landlock
> compatibility issue.

Fair enough, OK then.  Paraphrasing, to make sure we are aligned: If you
allow-list one of the newer capabilities through landlock_add_rule, and
then run your program on a kernel where that capability doesn't exist
yet, you can not expect that to work.  Seems fair.
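
The Case B scenario can be played through with a toy model.  This is a
sketch under stated assumptions, not kernel code: CAP_FOO and
frobnicate() are the made-up examples from above, the numeric values are
arbitrary, and the process is assumed to hold all capabilities so that
the Landlock allow-list is the only filter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative numbers only; CAP_FOO does not exist. */
#define MODEL_CAP_SYS_ADMIN	21
#define MODEL_CAP_FOO		50

struct model_kernel {
	int cap_last_cap;	/* Highest capability this kernel knows. */
	bool foo_split_off;	/* Does frobnicate() also accept CAP_FOO? */
};

/* Allow-list as stored by the model: unknown bits have no effect. */
static uint64_t model_add_rule(const struct model_kernel *k, uint64_t caps)
{
	const uint64_t known = (k->cap_last_cap >= 63) ?
				       ~0ULL :
				       ((1ULL << (k->cap_last_cap + 1)) - 1);

	return caps & known;
}

/* Capability use is denied unless the bit is in the allow-list. */
static bool model_capable(uint64_t allowed, int cap)
{
	return (allowed & (1ULL << cap)) != 0;
}

/* Guarded by CAP_SYS_ADMIN, and by CAP_FOO once it has been split off. */
static bool model_frobnicate(const struct model_kernel *k, uint64_t allowed)
{
	if (model_capable(allowed, MODEL_CAP_SYS_ADMIN))
		return true;
	return k->foo_split_off && model_capable(allowed, MODEL_CAP_FOO);
}
```

With the same rule (only bit MODEL_CAP_FOO set), model_frobnicate() is
false on a kernel with cap_last_cap = 40 and true on one with
cap_last_cap = 50 and foo_split_off set, which is the behavior change
discussed above.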


> > I mean, I can see that it's a pretty fringe scenario if users pass
> > capabilities that don't exist yet, but it *is* strictly speaking an
> > incompatibility.  Should we check the range of the passed capabilities?
> > Am I overlooking any downsides to this if we force users to stay
> > between 0 and CAP_LAST_CAP?
> 
> Checking the range of known capabilities (or namespace types) could
> break the same Landlock rules on different kernels even if targeting the
> same Landlock ABI version, which would be much worse.  I definitely
> prefer to have idempotent/deterministic Landlock rules.

Hm, good point.  The list of supported capabilities can not be probed through
the Landlock ABI number.
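
For completeness, one way user space can probe this independently of
Landlock is /proc/sys/kernel/cap_last_cap.  A minimal sketch; the helper
names are made up for this example, and the upper bound of 63 is only a
sanity check that assumes capability sets stay 64 bits wide:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parses the file content; returns -1 on malformed input. */
static int parse_cap_last_cap(const char *text)
{
	char *end;
	long value;

	errno = 0;
	value = strtol(text, &end, 10);
	if (errno || end == text || value < 0 || value > 63)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;
	return (int)value;
}

/* Returns the highest capability known to the running kernel, or -1. */
static int read_cap_last_cap(void)
{
	char buf[16] = "";
	FILE *f = fopen("/proc/sys/kernel/cap_last_cap", "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_cap_last_cap(buf);
}

/* A capability is usable only if the running kernel knows about it. */
static int cap_is_supported(int cap, int cap_last_cap)
{
	return cap >= 0 && cap <= cap_last_cap;
}
```

A sandboxing program could call read_cap_last_cap() once and skip (or
warn about) allow-list entries for which cap_is_supported() is false.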

—Günther


* Re: [RFC PATCH v1 05/11] landlock: Enforce namespace entry restrictions
       [not found] ` <20260312100444.2609563-6-mic@digikod.net>
@ 2026-05-08 15:46   ` Günther Noack
  0 siblings, 0 replies; 3+ messages in thread
From: Günther Noack @ 2026-05-08 15:46 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Paul Moore, Serge E . Hallyn, Justin Suess,
	Lennart Poettering, Mikhail Ivanov, Nicolas Bouchinet,
	Shervin Oloumi, Tingmao Wang, kernel-team, linux-fsdevel,
	linux-kernel, linux-security-module

On Thu, Mar 12, 2026 at 11:04:38AM +0100, Mickaël Salaün wrote:
> Add Landlock enforcement for namespace entry via the LSM namespace_alloc
> and namespace_install hooks.  This lets a sandboxed process restrict
> which namespace types it can acquire, using
> LANDLOCK_PERM_NAMESPACE_ENTER and per-type rules.
> 
> Introduce the handled_perm field in struct landlock_ruleset_attr for
> permission categories that control broad operations enforced at single
> kernel chokepoints, achieving complete deny-by-default coverage.  Each
> LANDLOCK_PERM_* flag names a gateway operation (use, enter) whose
> control transitively covers downstream operations.  Rule values
> reference constants from other kernel subsystems (CLONE_NEW* for
> namespaces); unknown values are silently accepted because the allow-list
> denies them by default.  See the "Ruleset restriction models" section in
> the kernel documentation for the full design rationale.
> 
> Add two namespace hooks:
> 
> - hook_namespace_alloc() fires during unshare(CLONE_NEW*) and
>   clone(CLONE_NEW*) via __ns_common_init(), and checks the namespace
>   type against the domain's allowed set.
> 
> - hook_namespace_install() fires during setns() via validate_ns(),
>   performing the same type-based check.  Both hooks set namespace_type
>   in the audit data; hook_namespace_install() also sets inum for the
>   target namespace.
> 
> Both hooks perform a pure bitmask check: if the namespace's CLONE_NEW*
> type is not in the layer's allowed set, the operation is denied.  No
> domain ancestry bypass, no namespace creator tracking, just a flat
> per-layer allowed-types bitmask.
> 
> Add the perm_rules bitfield to struct layer_rights (introduced by a
> preceding commit) to store per-layer namespace type bitmasks.  The 8-bit
> NS field maps to the 8 known namespace types via
> landlock_ns_type_to_bit(), keeping the storage compact.
> 
> LANDLOCK_RULE_NAMESPACE uses struct landlock_namespace_attr with an
> allowed_perm field (matching the pattern of allowed_access in existing
> rule types) and a namespace_types bitmask of CLONE_NEW* flags.  Unknown
> namespace type bits are silently accepted for forward compatibility;
> they have no effect since the allow-list denies by default.
> 
> User namespace creation does not require capabilities, so Landlock can
> restrict it directly.  Non-user namespace types require CAP_SYS_ADMIN
> before the Landlock check is reached; when both
> LANDLOCK_PERM_NAMESPACE_ENTER and LANDLOCK_PERM_CAPABILITY_USE are
> handled, both must allow the operation.
> 
> Five KUnit tests verify the landlock_ns_type_to_bit() and
> landlock_ns_types_to_bits() conversion helpers.
> 
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Günther Noack <gnoack@google.com>
> Cc: Paul Moore <paul@paul-moore.com>
> Cc: Serge E. Hallyn <serge@hallyn.com>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
>  include/uapi/linux/landlock.h                |  58 +++++-
>  security/landlock/Makefile                   |   1 +
>  security/landlock/access.h                   |  42 ++++-
>  security/landlock/audit.c                    |   4 +
>  security/landlock/audit.h                    |   1 +
>  security/landlock/cred.h                     |  42 +++++
>  security/landlock/limits.h                   |   7 +
>  security/landlock/ns.c                       | 188 +++++++++++++++++++
>  security/landlock/ns.h                       |  74 ++++++++
>  security/landlock/ruleset.c                  |  11 +-
>  security/landlock/ruleset.h                  |  25 ++-
>  security/landlock/setup.c                    |   2 +
>  security/landlock/syscalls.c                 |  70 ++++++-
>  tools/testing/selftests/landlock/base_test.c |   2 +-
>  14 files changed, 509 insertions(+), 18 deletions(-)
>  create mode 100644 security/landlock/ns.c
>  create mode 100644 security/landlock/ns.h
> 
> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> index f88fa1f68b77..b76e656241df 100644
> --- a/include/uapi/linux/landlock.h
> +++ b/include/uapi/linux/landlock.h
> @@ -51,6 +51,14 @@ struct landlock_ruleset_attr {
>  	 * resources (e.g. IPCs).
>  	 */
>  	__u64 scoped;
> +	/**
> +	 * @handled_perm: Bitmask of permissions (cf. `Permission flags`_)
> +	 * that this ruleset handles.  Each permission controls a broad
> +	 * operation enforced at a kernel chokepoint: all instances of
> +	 * that operation are denied unless explicitly allowed by a rule.
> +	 * See Documentation/security/landlock.rst for the rationale.
> +	 */
> +	__u64 handled_perm;
>  };
>  
>  /**
> @@ -153,6 +161,11 @@ enum landlock_rule_type {
>  	 * landlock_net_port_attr .
>  	 */
>  	LANDLOCK_RULE_NET_PORT,
> +	/**
> +	 * @LANDLOCK_RULE_NAMESPACE: Type of a &struct
> +	 * landlock_namespace_attr .
> +	 */
> +	LANDLOCK_RULE_NAMESPACE,
>  };
>  
>  /**
> @@ -206,6 +219,24 @@ struct landlock_net_port_attr {
>  	__u64 port;
>  };
>  
> +/**
> + * struct landlock_namespace_attr - Namespace type definition
> + *
> + * Argument of sys_landlock_add_rule() with %LANDLOCK_RULE_NAMESPACE.
> + */
> +struct landlock_namespace_attr {
> +	/**
> +	 * @allowed_perm: Must be set to %LANDLOCK_PERM_NAMESPACE_ENTER.
> +	 */
> +	__u64 allowed_perm;
> +	/**
> +	 * @namespace_types: Bitmask of namespace types (``CLONE_NEW*`` flags)
> +	 * that should be allowed to be entered under this rule.  Unknown bits
> +	 * are silently ignored for forward compatibility.
> +	 */
> +	__u64 namespace_types;
> +};
> +
>  /**
>   * DOC: fs_access
>   *
> @@ -379,6 +410,31 @@ struct landlock_net_port_attr {
>  /* clang-format off */
>  #define LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET		(1ULL << 0)
>  #define LANDLOCK_SCOPE_SIGNAL		                (1ULL << 1)
> -/* clang-format on*/
> +/* clang-format on */
> +
> +/**
> + * DOC: perm
> + *
> + * Permission flags
> + * ~~~~~~~~~~~~~~~~
> + *
> + * These flags restrict broad operations enforced at kernel chokepoints.
> + * Each flag names a gateway operation whose control transitively covers
> + * an open-ended set of downstream operations.  Handled permissions that
> + * are not explicitly allowed by a rule are denied by default.  Rule
> + * values reference constants from other kernel subsystems; unknown values
> + * are silently accepted for forward compatibility since the allow-list
> + * denies them by default.
> + * See Documentation/security/landlock.rst for design details.

It needs an empty line before the "See Documentation/..." for that to be
its own paragraph.

As discussed on the documentation patch, there are a few mentions of
"chokepoints" and "gateways" here and elsewhere in this commit and commit
message, which should be updated as well if that phrasing changes in the
documentation.

(I suggested something like "critical enforcement points" there, and
added a suggestion which delineated the permission flags more in terms
of (a) not being about individual kernel objects and (b) doing
deny-by-default for an open-ended list of operations whose full list is
defined in a more core part of the kernel.)

> + *
> + * - %LANDLOCK_PERM_NAMESPACE_ENTER: Restrict entering (creating or joining
> + *   via :manpage:`setns(2)`) specific namespace types.  A process in a
> + *   Landlock domain that handles this permission is denied from entering
> + *   namespace types that are not explicitly allowed by a
> + *   %LANDLOCK_RULE_NAMESPACE rule.
> + */
> +/* clang-format off */
> +#define LANDLOCK_PERM_NAMESPACE_ENTER			(1ULL << 0)
> +/* clang-format on */
>  
>  #endif /* _UAPI_LINUX_LANDLOCK_H */
> diff --git a/security/landlock/Makefile b/security/landlock/Makefile
> index ffa7646d99f3..734aed4ac1bf 100644
> --- a/security/landlock/Makefile
> +++ b/security/landlock/Makefile
> @@ -8,6 +8,7 @@ landlock-y := \
>  	cred.o \
>  	task.o \
>  	fs.o \
> +	ns.o \
>  	tsync.o
>  
>  landlock-$(CONFIG_INET) += net.o
> diff --git a/security/landlock/access.h b/security/landlock/access.h
> index b3e147771a0e..9c67987a77ae 100644
> --- a/security/landlock/access.h
> +++ b/security/landlock/access.h
> @@ -42,6 +42,8 @@ static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_FS);
>  static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_NET);
>  /* Makes sure all scoped rights can be stored. */
>  static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_SCOPE);
> +/* Makes sure all permission types can be stored. */
> +static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_PERM);
>  /* Makes sure for_each_set_bit() and for_each_clear_bit() calls are OK. */
>  static_assert(sizeof(unsigned long) >= sizeof(access_mask_t));
>  
> @@ -50,6 +52,7 @@ struct access_masks {
>  	access_mask_t fs : LANDLOCK_NUM_ACCESS_FS;
>  	access_mask_t net : LANDLOCK_NUM_ACCESS_NET;
>  	access_mask_t scope : LANDLOCK_NUM_SCOPE;
> +	access_mask_t perm : LANDLOCK_NUM_PERM;
>  };
>  
>  union access_masks_all {
> @@ -61,14 +64,47 @@ union access_masks_all {
>  static_assert(sizeof(typeof_member(union access_masks_all, masks)) ==
>  	      sizeof(typeof_member(union access_masks_all, all)));
>  
> +/**
> + * struct perm_rules - Per-layer allowed bitmasks for permission types
> + *
> + * Compact bitfield struct holding the allowed bitmasks for permission
> + * types that use flat (non-tree) per-layer storage.  All fields share
> + * a single 64-bit storage unit.
> + */
> +struct perm_rules {
> +	/**
> +	 * @ns: Allowed namespace types.  Each bit corresponds to a
> +	 * sequential index assigned by the ``_LANDLOCK_NS_*`` enum
> +	 * (derived from ``FOR_EACH_NS_TYPE``).  Bits are converted from
> +	 * ``CLONE_NEW*`` flags at rule-add time via
> +	 * ``landlock_ns_types_to_bits()`` and at enforcement time via
> +	 * ``landlock_ns_type_to_bit()``.
> +	 */
> +	u64 ns : LANDLOCK_NUM_PERM_NS;
> +};
> +
> +static_assert(sizeof(struct perm_rules) == sizeof(u64));
> +
>  /**
>   * struct layer_rights - Per-layer access configuration
>   *
> - * Wraps the handled-access bitfields together with any additional per-layer
> - * data (e.g. allowed bitmasks added by future patches).  This is the element
> - * type of the &struct landlock_ruleset.layers FAM.
> + * Wraps the handled-access bitfields together with per-layer allowed
> + * bitmasks.  This is the element type of the &struct
> + * landlock_ruleset.layers FAM.
> + *
> + * Unlike filesystem and network access rights, which are tracked per-object
> + * in red-black trees, namespace types use a flat bitmask because their
> + * keyspace is small and bounded (~8 namespace types).  A single rule adds
> + * to the allowed set via bitwise OR; at enforcement time each layer is
> + * checked directly (no tree lookup needed).
>   */
>  struct layer_rights {
> +	/**
> +	 * @allowed: Per-layer allowed bitmasks for permission types.
> +	 * Placed before @handled to avoid an internal padding hole
> +	 * (8-byte perm_rules followed by 4-byte access_masks).
> +	 */
> +	struct perm_rules allowed;
>  	/**
>  	 * @handled: Bitmask of access rights handled (i.e. restricted) by
>  	 * this layer.
> diff --git a/security/landlock/audit.c b/security/landlock/audit.c
> index 60ff217ab95b..46a635893914 100644
> --- a/security/landlock/audit.c
> +++ b/security/landlock/audit.c
> @@ -78,6 +78,10 @@ get_blocker(const enum landlock_request_type type,
>  	case LANDLOCK_REQUEST_SCOPE_SIGNAL:
>  		WARN_ON_ONCE(access_bit != -1);
>  		return "scope.signal";
> +
> +	case LANDLOCK_REQUEST_NAMESPACE:
> +		WARN_ON_ONCE(access_bit != -1);
> +		return "perm.namespace_enter";
>  	}
>  
>  	WARN_ON_ONCE(1);
> diff --git a/security/landlock/audit.h b/security/landlock/audit.h
> index 56778331b58c..e9e52fb628f5 100644
> --- a/security/landlock/audit.h
> +++ b/security/landlock/audit.h
> @@ -21,6 +21,7 @@ enum landlock_request_type {
>  	LANDLOCK_REQUEST_NET_ACCESS,
>  	LANDLOCK_REQUEST_SCOPE_ABSTRACT_UNIX_SOCKET,
>  	LANDLOCK_REQUEST_SCOPE_SIGNAL,
> +	LANDLOCK_REQUEST_NAMESPACE,
>  };
>  
>  /*
> diff --git a/security/landlock/cred.h b/security/landlock/cred.h
> index 3e2a7e88710e..68067ff53ead 100644
> --- a/security/landlock/cred.h
> +++ b/security/landlock/cred.h
> @@ -153,6 +153,48 @@ landlock_get_applicable_subject(const struct cred *const cred,
>  	return NULL;
>  }
>  
> +/**
> + * landlock_perm_is_denied - Check if a permission bitmask request is denied
> + *
> + * @domain: The enforced domain.
> + * @perm_bit: The LANDLOCK_PERM_* flag to check.
> + * @request_value: Compact bitmask to look for (e.g. result of
> + *                 ``landlock_ns_type_to_bit(CLONE_NEWNET)``).
> + *
> + * Iterate from the youngest layer to the oldest.  For each layer that
> + * handles @perm_bit, check whether @request_value is present in the
> + * layer's allowed bitmask.  Return on the first (youngest) denying
> + * layer.
> + *
> + * Return: The youngest denying layer + 1, or 0 if allowed.
> + */
> +static inline size_t
> +landlock_perm_is_denied(const struct landlock_ruleset *const domain,
> +			const access_mask_t perm_bit, const u64 request_value)
> +{
> +	ssize_t layer;
> +
> +	for (layer = domain->num_layers - 1; layer >= 0; layer--) {
> +		u64 allowed;
> +
> +		if (!(domain->layers[layer].handled.perm & perm_bit))
> +			continue;
> +
> +		switch (perm_bit) {
> +		case LANDLOCK_PERM_NAMESPACE_ENTER:
> +			allowed = domain->layers[layer].allowed.ns;
> +			break;
> +		default:
> +			WARN_ON_ONCE(1);
> +			return layer + 1;
> +		}
> +
> +		if (!(allowed & request_value))
> +			return layer + 1;
> +	}
> +	return 0;
> +}
> +
>  __init void landlock_add_cred_hooks(void);
>  
>  #endif /* _SECURITY_LANDLOCK_CRED_H */
> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
> index eb584f47288d..e361b653fcf5 100644
> --- a/security/landlock/limits.h
> +++ b/security/landlock/limits.h
> @@ -12,6 +12,7 @@
>  
>  #include <linux/bitops.h>
>  #include <linux/limits.h>
> +#include <linux/ns/ns_common_types.h>
>  #include <uapi/linux/landlock.h>
>  
>  /* clang-format off */
> @@ -31,6 +32,12 @@
>  #define LANDLOCK_MASK_SCOPE		((LANDLOCK_LAST_SCOPE << 1) - 1)
>  #define LANDLOCK_NUM_SCOPE		__const_hweight64(LANDLOCK_MASK_SCOPE)
>  
> +#define LANDLOCK_LAST_PERM		LANDLOCK_PERM_NAMESPACE_ENTER
> +#define LANDLOCK_MASK_PERM		((LANDLOCK_LAST_PERM << 1) - 1)
> +#define LANDLOCK_NUM_PERM		__const_hweight64(LANDLOCK_MASK_PERM)
> +
> +#define LANDLOCK_NUM_PERM_NS		__const_hweight64((u64)(CLONE_NS_ALL))
> +
>  #define LANDLOCK_LAST_RESTRICT_SELF	LANDLOCK_RESTRICT_SELF_TSYNC
>  #define LANDLOCK_MASK_RESTRICT_SELF	((LANDLOCK_LAST_RESTRICT_SELF << 1) - 1)
>  
> diff --git a/security/landlock/ns.c b/security/landlock/ns.c
> new file mode 100644
> index 000000000000..fd9e00a295d2
> --- /dev/null
> +++ b/security/landlock/ns.c
> @@ -0,0 +1,188 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Landlock - Namespace hooks
> + *
> + * Copyright © 2026 Cloudflare
> + */
> +
> +#include <linux/lsm_audit.h>
> +#include <linux/lsm_hooks.h>
> +#include <linux/ns/ns_common_types.h>
> +#include <linux/ns_common.h>
> +#include <linux/nsproxy.h>
> +#include <uapi/linux/landlock.h>
> +
> +#include "audit.h"
> +#include "cred.h"
> +#include "limits.h"
> +#include "ns.h"
> +#include "ruleset.h"
> +#include "setup.h"
> +
> +/* Ensures the audit inum field can hold ns_common.inum without truncation. */
> +static_assert(sizeof(((struct common_audit_data *)NULL)->u.ns.inum) >=
> +	      sizeof(((struct ns_common *)NULL)->inum));
> +
> +static const struct access_masks ns_perm = {
> +	.perm = LANDLOCK_PERM_NAMESPACE_ENTER,
> +};
> +
> +/**
> + * hook_namespace_alloc - Check namespace entry permission for creation
> + *
> + * @ns: The namespace being initialized.
> + *
> + * Checks if the current domain allows entering (creating) this namespace
> + * type.  Fires during unshare(2) and clone(2) via __ns_common_init() in
> + * kernel/nscommon.c.
> + *
> + * Return: 0 if allowed, -EPERM if namespace creation is denied.
> + */
> +static int hook_namespace_alloc(struct ns_common *const ns)
> +{
> +	const struct landlock_cred_security *subject;
> +	size_t denied_layer;
> +
> +	WARN_ON_ONCE(!(CLONE_NS_ALL & ns->ns_type));
> +
> +	subject =
> +		landlock_get_applicable_subject(current_cred(), ns_perm, NULL);
> +	if (!subject)
> +		return 0;
> +
> +	denied_layer = landlock_perm_is_denied(
> +		subject->domain, LANDLOCK_PERM_NAMESPACE_ENTER,
> +		landlock_ns_type_to_bit(ns->ns_type));
> +	if (!denied_layer)
> +		return 0;
> +
> +	landlock_log_denial(subject, &(struct landlock_request){
> +					     .type = LANDLOCK_REQUEST_NAMESPACE,
> +					     .audit.type = LSM_AUDIT_DATA_NS,
> +					     .audit.u.ns.ns_type = ns->ns_type,
> +					     .layer_plus_one = denied_layer,
> +				     });
> +	return -EPERM;
> +}
> +
> +/**
> + * hook_namespace_install - Check namespace entry permission
> + *
> + * @nsset: The namespace set being modified.
> + * @ns: The namespace being entered.
> + *
> + * Checks if the current domain restricts entering this namespace type.
> + * Fires during setns(2) via validate_ns() in kernel/nsproxy.c.
> + * Uses the same type-based check as hook_namespace_alloc(): the
> + * restriction is on which namespace types the process can enter,
> + * regardless of who created the namespace.
> + *
> + * Return: 0 if entry is allowed, -EPERM if denied.
> + */
> +static int hook_namespace_install(const struct nsset *nsset,
> +				  struct ns_common *ns)
> +{
> +	const struct landlock_cred_security *subject;
> +	size_t denied_layer;
> +
> +	WARN_ON_ONCE(!(CLONE_NS_ALL & ns->ns_type));
> +
> +	subject =
> +		landlock_get_applicable_subject(current_cred(), ns_perm, NULL);
> +	if (!subject)
> +		return 0;
> +
> +	denied_layer = landlock_perm_is_denied(
> +		subject->domain, LANDLOCK_PERM_NAMESPACE_ENTER,
> +		landlock_ns_type_to_bit(ns->ns_type));
> +	if (!denied_layer)
> +		return 0;
> +
> +	landlock_log_denial(subject, &(struct landlock_request){
> +					     .type = LANDLOCK_REQUEST_NAMESPACE,
> +					     .audit.type = LSM_AUDIT_DATA_NS,
> +					     .audit.u.ns.ns_type = ns->ns_type,
> +					     .audit.u.ns.inum = ns->inum,
> +					     .layer_plus_one = denied_layer,
> +				     });
> +	return -EPERM;
> +}
> +
> +static struct security_hook_list landlock_hooks[] __ro_after_init = {
> +	LSM_HOOK_INIT(namespace_alloc, hook_namespace_alloc),
> +	LSM_HOOK_INIT(namespace_install, hook_namespace_install),
> +};
> +
> +__init void landlock_add_ns_hooks(void)
> +{
> +	security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
> +			   &landlock_lsmid);
> +}
> +
> +#ifdef CONFIG_SECURITY_LANDLOCK_KUNIT_TEST
> +
> +#include <kunit/test.h>
> +
> +/* clang-format off */
> +#define _TEST_NS_BIT(struct_name, flag) \
> +	do { \
> +		const u64 bit = landlock_ns_type_to_bit(flag); \
> +		KUNIT_EXPECT_NE(test, 0ULL, bit); \
> +		KUNIT_EXPECT_EQ(test, 0ULL, seen &bit); \
> +		seen |= bit; \
> +	} while (0);
> +/* clang-format on */
> +
> +static void test_ns_type_to_bit(struct kunit *const test)
> +{
> +	u64 seen = 0;
> +
> +	FOR_EACH_NS_TYPE(_TEST_NS_BIT)
> +
> +	KUNIT_EXPECT_EQ(test, GENMASK_ULL(LANDLOCK_NUM_PERM_NS - 1, 0), seen);
> +}
> +
> +static void test_ns_type_to_bit_unknown(struct kunit *const test)
> +{
> +	KUNIT_EXPECT_EQ(test, 0ULL, landlock_ns_type_to_bit(CLONE_THREAD));
> +}
> +
> +static void test_ns_types_to_bits_all(struct kunit *const test)
> +{
> +	KUNIT_EXPECT_EQ(test, GENMASK_ULL(LANDLOCK_NUM_PERM_NS - 1, 0),
> +			landlock_ns_types_to_bits(CLONE_NS_ALL));
> +}
> +
> +/* clang-format off */
> +#define _TEST_NS_SINGLE(struct_name, flag) \
> +	KUNIT_EXPECT_EQ(test, landlock_ns_type_to_bit(flag), \
> +			landlock_ns_types_to_bits(flag));
> +/* clang-format on */
> +
> +static void test_ns_types_to_bits_single(struct kunit *const test)
> +{
> +	FOR_EACH_NS_TYPE(_TEST_NS_SINGLE)
> +}
> +
> +static void test_ns_types_to_bits_zero(struct kunit *const test)
> +{
> +	KUNIT_EXPECT_EQ(test, 0ULL, landlock_ns_types_to_bits(0));
> +}
> +
> +static struct kunit_case test_cases[] = {
> +	KUNIT_CASE(test_ns_type_to_bit),
> +	KUNIT_CASE(test_ns_type_to_bit_unknown),
> +	KUNIT_CASE(test_ns_types_to_bits_all),
> +	KUNIT_CASE(test_ns_types_to_bits_single),
> +	KUNIT_CASE(test_ns_types_to_bits_zero),
> +	{}
> +};
> +
> +static struct kunit_suite test_suite = {
> +	.name = "landlock_ns",
> +	.test_cases = test_cases,
> +};
> +
> +kunit_test_suite(test_suite);
> +
> +#endif /* CONFIG_SECURITY_LANDLOCK_KUNIT_TEST */
> diff --git a/security/landlock/ns.h b/security/landlock/ns.h
> new file mode 100644
> index 000000000000..c731ecc08f8c
> --- /dev/null
> +++ b/security/landlock/ns.h
> @@ -0,0 +1,74 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Landlock - Namespace hooks
> + *
> + * Copyright © 2026 Cloudflare
> + */
> +
> +#ifndef _SECURITY_LANDLOCK_NS_H
> +#define _SECURITY_LANDLOCK_NS_H
> +
> +#include <linux/bitops.h>
> +#include <linux/bug.h>
> +#include <linux/compiler_attributes.h>
> +#include <linux/ns/ns_common_types.h>
> +#include <linux/types.h>
> +
> +#include "limits.h"
> +
> +/* _LANDLOCK_NS_CLONE_NEWCGROUP, */
> +#define _LANDLOCK_NS_ENUM(struct_name, flag) _LANDLOCK_NS_##flag,
> +
> +/* _LANDLOCK_NS_CLONE_NEWCGROUP = 0, */
> +enum {
> +	FOR_EACH_NS_TYPE(_LANDLOCK_NS_ENUM) _LANDLOCK_NUM_NS_TYPES,
> +};
> +
> +static_assert(_LANDLOCK_NUM_NS_TYPES == LANDLOCK_NUM_PERM_NS);
> +
> +/*
> + * case CLONE_NEWCGROUP:
> + *         return BIT_ULL(_LANDLOCK_NS_CLONE_NEWCGROUP);
> + */
> +/* clang-format off */
> +#define _LANDLOCK_NS_CASE(struct_name, flag) \
> +	case flag: \
> +		return BIT_ULL(_LANDLOCK_NS_##flag);
> +/* clang-format on */
> +
> +static inline __attribute_const__ u64
> +landlock_ns_type_to_bit(const unsigned long ns_type)
> +{
> +	switch (ns_type) {
> +		FOR_EACH_NS_TYPE(_LANDLOCK_NS_CASE)
> +	default:
> +		WARN_ON_ONCE(1);
> +		return 0;
> +	}
> +}
> +
> +/*
> + * if (ns_types & CLONE_NEWCGROUP)
> + *         bits |= BIT_ULL(_LANDLOCK_NS_CLONE_NEWCGROUP);
> + */
> +/* clang-format off */
> +#define _LANDLOCK_NS_CONVERT(struct_name, flag) \
> +	do { \
> +		if (ns_types & (flag)) \
> +			bits |= BIT_ULL(_LANDLOCK_NS_##flag); \
> +	} while (0);
> +/* clang-format on */
> +
> +static inline __attribute_const__ u64
> +landlock_ns_types_to_bits(const u64 ns_types)
> +{
> +	u64 bits = 0;
> +
> +	WARN_ON_ONCE(ns_types & ~CLONE_NS_ALL);
> +	FOR_EACH_NS_TYPE(_LANDLOCK_NS_CONVERT)
> +	return bits;
> +}
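
To illustrate the sparse-to-compact mapping done by the two helpers above, here is a rough userspace sketch. The table order is an assumption on my side (the kernel derives it from FOR_EACH_NS_TYPE, which is not shown in this patch), and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/sched.h>

#ifndef CLONE_NEWTIME
#define CLONE_NEWTIME 0x00000080 /* fallback for older UAPI headers */
#endif

/* Hypothetical order; the real one comes from FOR_EACH_NS_TYPE. */
static const uint64_t sketch_ns_flags[] = {
	CLONE_NEWCGROUP, CLONE_NEWIPC,	CLONE_NEWNS,   CLONE_NEWNET,
	CLONE_NEWPID,	 CLONE_NEWTIME, CLONE_NEWUSER, CLONE_NEWUTS,
};

#define SKETCH_NUM_NS (sizeof(sketch_ns_flags) / sizeof(sketch_ns_flags[0]))

static_assert(SKETCH_NUM_NS == 8, "eight namespace types assumed");

/* Maps exactly one CLONE_NEW* flag to its compact bit, or 0 if unknown. */
static uint64_t sketch_ns_type_to_bit(uint64_t ns_type)
{
	for (size_t i = 0; i < SKETCH_NUM_NS; i++) {
		if (sketch_ns_flags[i] == ns_type)
			return UINT64_C(1) << i;
	}
	return 0;
}

/* Maps a mask of CLONE_NEW* flags to compact bits; unknown flags are dropped. */
static uint64_t sketch_ns_types_to_bits(uint64_t ns_types)
{
	uint64_t bits = 0;

	for (size_t i = 0; i < SKETCH_NUM_NS; i++) {
		if (ns_types & sketch_ns_flags[i])
			bits |= UINT64_C(1) << i;
	}
	return bits;
}
```

Unlike the kernel helpers, this sketch does not warn on unknown input; it only shows how the scattered CLONE_NEW* values collapse into the low eight bits.
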
> +
> +__init void landlock_add_ns_hooks(void);
> +
> +#endif /* _SECURITY_LANDLOCK_NS_H */
> diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
> index a7f8be37ec31..7321e2f19b03 100644
> --- a/security/landlock/ruleset.c
> +++ b/security/landlock/ruleset.c
> @@ -53,15 +53,14 @@ static struct landlock_ruleset *create_ruleset(const u32 num_layers)
>  	return new_ruleset;
>  }
>  
> -struct landlock_ruleset *
> -landlock_create_ruleset(const access_mask_t fs_access_mask,
> -			const access_mask_t net_access_mask,
> -			const access_mask_t scope_mask)
> +struct landlock_ruleset *landlock_create_ruleset(
> +	const access_mask_t fs_access_mask, const access_mask_t net_access_mask,
> +	const access_mask_t scope_mask, const access_mask_t perm_mask)
>  {
>  	struct landlock_ruleset *new_ruleset;
>  
>  	/* Informs about useless ruleset. */
> -	if (!fs_access_mask && !net_access_mask && !scope_mask)
> +	if (!fs_access_mask && !net_access_mask && !scope_mask && !perm_mask)
>  		return ERR_PTR(-ENOMSG);
>  	new_ruleset = create_ruleset(1);
>  	if (IS_ERR(new_ruleset))
> @@ -72,6 +71,8 @@ landlock_create_ruleset(const access_mask_t fs_access_mask,
>  		landlock_add_net_access_mask(new_ruleset, net_access_mask, 0);
>  	if (scope_mask)
>  		landlock_add_scope_mask(new_ruleset, scope_mask, 0);
> +	if (perm_mask)
> +		landlock_add_perm_mask(new_ruleset, perm_mask, 0);
>  	return new_ruleset;
>  }
>  
> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
> index 900c47eb0216..747261391c00 100644
> --- a/security/landlock/ruleset.h
> +++ b/security/landlock/ruleset.h
> @@ -190,10 +190,9 @@ struct landlock_ruleset {
>  	};
>  };
>  
> -struct landlock_ruleset *
> -landlock_create_ruleset(const access_mask_t access_mask_fs,
> -			const access_mask_t access_mask_net,
> -			const access_mask_t scope_mask);
> +struct landlock_ruleset *landlock_create_ruleset(
> +	const access_mask_t access_mask_fs, const access_mask_t access_mask_net,
> +	const access_mask_t scope_mask, const access_mask_t perm_mask);
>  
>  void landlock_put_ruleset(struct landlock_ruleset *const ruleset);
>  void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset);
> @@ -303,6 +302,24 @@ landlock_get_scope_mask(const struct landlock_ruleset *const ruleset,
>  	return ruleset->layers[layer_level].handled.scope;
>  }
>  
> +static inline void
> +landlock_add_perm_mask(struct landlock_ruleset *const ruleset,
> +		       const access_mask_t perm_mask, const u16 layer_level)
> +{
> +	access_mask_t mask = perm_mask & LANDLOCK_MASK_PERM;
> +
> +	/* Should already be checked in sys_landlock_create_ruleset(). */
> +	WARN_ON_ONCE(perm_mask != mask);
> +	ruleset->layers[layer_level].handled.perm |= mask;
> +}
> +
> +static inline access_mask_t
> +landlock_get_perm_mask(const struct landlock_ruleset *const ruleset,
> +		       const u16 layer_level)
> +{
> +	return ruleset->layers[layer_level].handled.perm;
> +}
> +
>  bool landlock_unmask_layers(const struct landlock_rule *const rule,
>  			    struct layer_access_masks *masks);
>  
> diff --git a/security/landlock/setup.c b/security/landlock/setup.c
> index 47dac1736f10..a7ed776b41b4 100644
> --- a/security/landlock/setup.c
> +++ b/security/landlock/setup.c
> @@ -17,6 +17,7 @@
>  #include "fs.h"
>  #include "id.h"
>  #include "net.h"
> +#include "ns.h"
>  #include "setup.h"
>  #include "task.h"
>  
> @@ -68,6 +69,7 @@ static int __init landlock_init(void)
>  	landlock_add_task_hooks();
>  	landlock_add_fs_hooks();
>  	landlock_add_net_hooks();
> +	landlock_add_ns_hooks();
>  	landlock_init_id();
>  	landlock_initialized = true;
>  	pr_info("Up and running.\n");
> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
> index 2aa7b50d875f..152d952e98f6 100644
> --- a/security/landlock/syscalls.c
> +++ b/security/landlock/syscalls.c
> @@ -20,6 +20,7 @@
>  #include <linux/fs.h>
>  #include <linux/limits.h>
>  #include <linux/mount.h>
> +#include <linux/ns/ns_common_types.h>
>  #include <linux/path.h>
>  #include <linux/sched.h>
>  #include <linux/security.h>
> @@ -34,6 +35,7 @@
>  #include "fs.h"
>  #include "limits.h"
>  #include "net.h"
> +#include "ns.h"
>  #include "ruleset.h"
>  #include "setup.h"
>  #include "tsync.h"
> @@ -95,7 +97,9 @@ static void build_check_abi(void)
>  	struct landlock_ruleset_attr ruleset_attr;
>  	struct landlock_path_beneath_attr path_beneath_attr;
>  	struct landlock_net_port_attr net_port_attr;
> +	struct landlock_namespace_attr namespace_attr;
>  	size_t ruleset_size, path_beneath_size, net_port_size;
> +	size_t namespace_size;
>  
>  	/*
>  	 * For each user space ABI structures, first checks that there is no
> @@ -105,8 +109,9 @@ static void build_check_abi(void)
>  	ruleset_size = sizeof(ruleset_attr.handled_access_fs);
>  	ruleset_size += sizeof(ruleset_attr.handled_access_net);
>  	ruleset_size += sizeof(ruleset_attr.scoped);
> +	ruleset_size += sizeof(ruleset_attr.handled_perm);
>  	BUILD_BUG_ON(sizeof(ruleset_attr) != ruleset_size);
> -	BUILD_BUG_ON(sizeof(ruleset_attr) != 24);
> +	BUILD_BUG_ON(sizeof(ruleset_attr) != 32);
>  
>  	path_beneath_size = sizeof(path_beneath_attr.allowed_access);
>  	path_beneath_size += sizeof(path_beneath_attr.parent_fd);
> @@ -117,6 +122,11 @@ static void build_check_abi(void)
>  	net_port_size += sizeof(net_port_attr.port);
>  	BUILD_BUG_ON(sizeof(net_port_attr) != net_port_size);
>  	BUILD_BUG_ON(sizeof(net_port_attr) != 16);
> +
> +	namespace_size = sizeof(namespace_attr.allowed_perm);
> +	namespace_size += sizeof(namespace_attr.namespace_types);
> +	BUILD_BUG_ON(sizeof(namespace_attr) != namespace_size);
> +	BUILD_BUG_ON(sizeof(namespace_attr) != 16);
>  }
>  
>  /* Ruleset handling */
> @@ -166,7 +176,7 @@ static const struct file_operations ruleset_fops = {
>   * If the change involves a fix that requires userspace awareness, also update
>   * the errata documentation in Documentation/userspace-api/landlock.rst .
>   */
> -const int landlock_abi_version = 8;
> +const int landlock_abi_version = 9;
>  
>  /**
>   * sys_landlock_create_ruleset - Create a new ruleset
> @@ -249,10 +259,16 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
>  	if ((ruleset_attr.scoped | LANDLOCK_MASK_SCOPE) != LANDLOCK_MASK_SCOPE)
>  		return -EINVAL;
>  
> +	/* Checks permission content (and 32-bits cast). */
> +	if ((ruleset_attr.handled_perm | LANDLOCK_MASK_PERM) !=
> +	    LANDLOCK_MASK_PERM)
> +		return -EINVAL;
> +
>  	/* Checks arguments and transforms to kernel struct. */
>  	ruleset = landlock_create_ruleset(ruleset_attr.handled_access_fs,
>  					  ruleset_attr.handled_access_net,
> -					  ruleset_attr.scoped);
> +					  ruleset_attr.scoped,
> +					  ruleset_attr.handled_perm);
>  	if (IS_ERR(ruleset))
>  		return PTR_ERR(ruleset);
>  
> @@ -390,13 +406,57 @@ static int add_rule_net_port(struct landlock_ruleset *ruleset,
>  					net_port_attr.allowed_access);
>  }
>  
> +static int add_rule_namespace(struct landlock_ruleset *const ruleset,
> +			      const void __user *const rule_attr)
> +{
> +	struct landlock_namespace_attr ns_attr;
> +	int res;
> +	access_mask_t mask;
> +
> +	/* Copies raw user space buffer. */
> +	res = copy_from_user(&ns_attr, rule_attr, sizeof(ns_attr));
> +	if (res)
> +		return -EFAULT;
> +
> +	/* Informs about useless rule: empty allowed_perm. */
> +	if (!ns_attr.allowed_perm)
> +		return -ENOMSG;
> +
> +	/* The allowed_perm must match LANDLOCK_PERM_NAMESPACE_ENTER. */
> +	if (ns_attr.allowed_perm != LANDLOCK_PERM_NAMESPACE_ENTER)
> +		return -EINVAL;
> +
> +	/* Checks that allowed_perm matches the @ruleset constraints. */
> +	mask = landlock_get_perm_mask(ruleset, 0);
> +	if (!(mask & LANDLOCK_PERM_NAMESPACE_ENTER))
> +		return -EINVAL;
> +
> +	/* Informs about useless rule: empty namespace_types. */
> +	if (!ns_attr.namespace_types)
> +		return -ENOMSG;
> +
> +	/*
> +	 * Stores only the namespace types this kernel knows about.
> +	 * Unknown bits are silently accepted for forward compatibility:
> +	 * user space compiled against newer headers can pass new
> +	 * CLONE_NEW* flags without getting EINVAL on older kernels.
> +	 * Unknown bits have no effect because no hook checks them.
> +	 */
> +	mutex_lock(&ruleset->lock);
> +	ruleset->layers[0].allowed.ns |= landlock_ns_types_to_bits(
> +		ns_attr.namespace_types & CLONE_NS_ALL);
> +	mutex_unlock(&ruleset->lock);
> +	return 0;
> +}
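
The order of the checks in add_rule_namespace() determines which errno user space sees, so here is a small sketch of that order as I read it. The function name and the known_types parameter are hypothetical; the value of the permission bit is taken from the UAPI hunk. For simplicity the sketch stores raw flags rather than compact bits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Value from the quoted UAPI header. */
#define SKETCH_PERM_NAMESPACE_ENTER (UINT64_C(1) << 0)

/*
 * Mirrors the check order of the quoted add_rule_namespace(): returns 0 and
 * ORs the known types into *stored_types, or returns a negative errno.
 */
static int sketch_check_ns_rule(uint64_t handled_perm, uint64_t allowed_perm,
				uint64_t namespace_types, uint64_t known_types,
				uint64_t *stored_types)
{
	assert(stored_types != NULL);

	/* Useless rule: empty allowed_perm. */
	if (!allowed_perm)
		return -ENOMSG;
	/* Only the namespace permission is valid for this rule type. */
	if (allowed_perm != SKETCH_PERM_NAMESPACE_ENTER)
		return -EINVAL;
	/* The ruleset must handle the permission. */
	if (!(handled_perm & SKETCH_PERM_NAMESPACE_ENTER))
		return -EINVAL;
	/* Useless rule: empty namespace_types. */
	if (!namespace_types)
		return -ENOMSG;

	/* Unknown bits are accepted but have no effect. */
	*stored_types |= namespace_types & known_types;
	return 0;
}
```

Note the last case: a rule whose namespace_types contains only unknown bits succeeds but stores nothing, because the emptiness check runs before the masking.
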
> +
>  /**
>   * sys_landlock_add_rule - Add a new rule to a ruleset
>   *
>   * @ruleset_fd: File descriptor tied to the ruleset that should be extended
>   *		with the new rule.
>   * @rule_type: Identify the structure type pointed to by @rule_attr:
> - *             %LANDLOCK_RULE_PATH_BENEATH or %LANDLOCK_RULE_NET_PORT.
> + *             %LANDLOCK_RULE_PATH_BENEATH, %LANDLOCK_RULE_NET_PORT, or
> + *             %LANDLOCK_RULE_NAMESPACE.
>   * @rule_attr: Pointer to a rule (matching the @rule_type).
>   * @flags: Must be 0.
>   *
> @@ -446,6 +506,8 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
>  		return add_rule_path_beneath(ruleset, rule_attr);
>  	case LANDLOCK_RULE_NET_PORT:
>  		return add_rule_net_port(ruleset, rule_attr);
> +	case LANDLOCK_RULE_NAMESPACE:
> +		return add_rule_namespace(ruleset, rule_attr);
>  	default:
>  		return -EINVAL;
>  	}
> diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c
> index 0fea236ef4bd..30d37234086c 100644
> --- a/tools/testing/selftests/landlock/base_test.c
> +++ b/tools/testing/selftests/landlock/base_test.c
> @@ -76,7 +76,7 @@ TEST(abi_version)
>  	const struct landlock_ruleset_attr ruleset_attr = {
>  		.handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
>  	};
> -	ASSERT_EQ(8, landlock_create_ruleset(NULL, 0,
> +	ASSERT_EQ(9, landlock_create_ruleset(NULL, 0,
>  					     LANDLOCK_CREATE_RULESET_VERSION));
>  
>  	ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0,
> -- 
> 2.53.0
> 

Documentation remarks above are minor, please feel free to tag as reviewed.
I could not find any issues in the code.

Reviewed-by: Günther Noack <gnoack@google.com>

—Günther

* Re: [RFC PATCH v1 06/11] landlock: Enforce capability restrictions
       [not found] ` <20260312100444.2609563-7-mic@digikod.net>
@ 2026-05-08 15:54   ` Günther Noack
  0 siblings, 0 replies; 3+ messages in thread
From: Günther Noack @ 2026-05-08 15:54 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Christian Brauner, Paul Moore, Serge E . Hallyn, Justin Suess,
	Lennart Poettering, Mikhail Ivanov, Nicolas Bouchinet,
	Shervin Oloumi, Tingmao Wang, kernel-team, linux-fsdevel,
	linux-kernel, linux-security-module

On Thu, Mar 12, 2026 at 11:04:39AM +0100, Mickaël Salaün wrote:
> Add Landlock enforcement for capability use via the LSM capable hook.
> This lets a sandboxed process restrict which Linux capabilities it can
> exercise, using LANDLOCK_PERM_CAPABILITY_USE and per-capability rules.
> 
> The capable hook is purely restrictive: it runs after cap_capable()
> (LSM_ORDER_FIRST), so it can deny capabilities that commoncap would
> allow, but it can never grant capabilities that commoncap denied.
> 
> Add hook_capable() that uses landlock_perm_is_denied() to perform a pure
> bitmask check: if the capability is not in the layer's allowed set, the
> check is denied.  No domain ancestry bypass, no cross-namespace
> discriminant, just a flat per-layer allowed-caps bitmask, matching the
> same pattern used by LANDLOCK_PERM_NAMESPACE_ENTER.
> 
> Adding the 41-bit capability bitfield to struct perm_rules brings it to
> 49 out of 64 bits used (41 caps + 8 namespace types, 15 bits padding),
> keeping struct layer_rights at 16 bytes (8 bytes perm_rules + 4 bytes
> access_masks + 4 bytes tail padding) and the layers[] array at 256 bytes
> maximum.  The caps bitfield is placed first in struct perm_rules (before
> the ns bitfield) because capabilities use a direct BIT_ULL(cap) mapping
> that benefits from starting at bit 0 of the storage unit.
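
The bit budget described here can be checked with a few static assertions. This is only a userspace sketch with hypothetical sketch_* names; the field widths (41 and 8) are the ones stated in the commit message, and 64-bit bit-fields rely on the GCC/Clang behavior the kernel also depends on.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_PERM_CAP 41 /* capabilities, per the commit message */
#define SKETCH_NUM_PERM_NS 8 /* namespace types, per the commit message */

/* Caps first, so that capability N lands on bit N of the storage unit. */
struct sketch_perm_rules {
	uint64_t caps : SKETCH_NUM_PERM_CAP;
	uint64_t ns : SKETCH_NUM_PERM_NS;
};

static_assert(SKETCH_NUM_PERM_CAP + SKETCH_NUM_PERM_NS == 49, "49 bits used");
static_assert(64 - (SKETCH_NUM_PERM_CAP + SKETCH_NUM_PERM_NS) == 15,
	      "15 bits of padding");
static_assert(sizeof(struct sketch_perm_rules) == sizeof(uint64_t),
	      "both bit-fields share one 64-bit storage unit");

/* Adds one capability to the allowed set with a direct bit mapping. */
static void sketch_allow_cap(struct sketch_perm_rules *rules, unsigned int cap)
{
	assert(cap < SKETCH_NUM_PERM_CAP);
	rules->caps |= UINT64_C(1) << cap;
}
```
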
> 
> Non-user namespace operations require both LANDLOCK_PERM_NAMESPACE_ENTER
> (type allowed) and LANDLOCK_PERM_CAPABILITY_USE (CAP_SYS_ADMIN allowed)
> when both permissions are handled.  This follows naturally from the
> kernel calling capable(CAP_SYS_ADMIN) before namespace operations: both
> hooks fire independently and audit logs identify which permission was
> denied.
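
To make the "both permissions" rule concrete, here is a single-layer sketch of how the two independent checks combine for a non-user namespace. All names are hypothetical, and the real kernel evaluates the two hooks separately across all layers rather than in one function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CAP_SYS_ADMIN 21 /* CAP_SYS_ADMIN from linux/capability.h */

static_assert(SKETCH_CAP_SYS_ADMIN < 64, "capability must fit in a u64 mask");

/* Simplified view of one layer with both permission categories. */
struct sketch_layer {
	bool handles_ns; /* LANDLOCK_PERM_NAMESPACE_ENTER handled */
	bool handles_cap; /* LANDLOCK_PERM_CAPABILITY_USE handled */
	uint64_t allowed_ns; /* compact namespace bits */
	uint64_t allowed_caps; /* BIT(cap) mask */
};

/*
 * Returns true if a non-user namespace operation passes both checks: the
 * capable(CAP_SYS_ADMIN) check and the namespace type check.  A permission
 * that the layer does not handle is never denied by it.
 */
static bool sketch_ns_op_allowed(const struct sketch_layer *layer,
				 uint64_t ns_bit)
{
	const uint64_t admin = UINT64_C(1) << SKETCH_CAP_SYS_ADMIN;

	if (layer->handles_cap && !(layer->allowed_caps & admin))
		return false;
	if (layer->handles_ns && !(layer->allowed_ns & ns_bit))
		return false;
	return true;
}
```
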
> 
> The enforcement is purely at exercise time via the capable hook, not by
> modifying the credential's capability sets.  Stripping denied
> capabilities would give processes an accurate capget(2) view of their
> usable capabilities, but no LSM other than commoncap modifies capability
> sets; Landlock follows this convention and restricts use without
> altering what the process holds.  A sandboxed process inside a user
> namespace will see all capabilities via capget(2) but will receive
> -EPERM when attempting to use any denied capability.
> 
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Günther Noack <gnoack@google.com>
> Cc: Paul Moore <paul@paul-moore.com>
> Cc: Serge E. Hallyn <serge@hallyn.com>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> ---
>  include/uapi/linux/landlock.h |  31 ++++++++
>  security/landlock/Makefile    |   1 +
>  security/landlock/access.h    |  15 +++-
>  security/landlock/audit.c     |   4 +
>  security/landlock/audit.h     |   1 +
>  security/landlock/cap.c       | 142 ++++++++++++++++++++++++++++++++++
>  security/landlock/cap.h       |  49 ++++++++++++
>  security/landlock/cred.h      |   3 +
>  security/landlock/limits.h    |   4 +-
>  security/landlock/setup.c     |   2 +
>  security/landlock/syscalls.c  |  58 +++++++++++++-
>  11 files changed, 302 insertions(+), 8 deletions(-)
>  create mode 100644 security/landlock/cap.c
>  create mode 100644 security/landlock/cap.h
> 
> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> index b76e656241df..0e73be459d47 100644
> --- a/include/uapi/linux/landlock.h
> +++ b/include/uapi/linux/landlock.h
> @@ -166,6 +166,11 @@ enum landlock_rule_type {
>  	 * landlock_namespace_attr .
>  	 */
>  	LANDLOCK_RULE_NAMESPACE,
> +	/**
> +	 * @LANDLOCK_RULE_CAPABILITY: Type of a &struct
> +	 * landlock_capability_attr .
> +	 */
> +	LANDLOCK_RULE_CAPABILITY,
>  };
>  
>  /**
> @@ -237,6 +242,24 @@ struct landlock_namespace_attr {
>  	__u64 namespace_types;
>  };
>  
> +/**
> + * struct landlock_capability_attr - Capability definition
> + *
> + * Argument of sys_landlock_add_rule() with %LANDLOCK_RULE_CAPABILITY.
> + */
> +struct landlock_capability_attr {
> +	/**
> +	 * @allowed_perm: Must be set to %LANDLOCK_PERM_CAPABILITY_USE.
> +	 */
> +	__u64 allowed_perm;
> +	/**
> +	 * @capabilities: Bitmask of capabilities (``1ULL << CAP_*``) that
> +	 * should be allowed for use under this rule.  Bits above
> +	 * ``CAP_LAST_CAP`` are silently ignored for forward compatibility.
> +	 */
> +	__u64 capabilities;
> +};
> +
>  /**
>   * DOC: fs_access
>   *
> @@ -432,9 +455,17 @@ struct landlock_namespace_attr {
>   *   Landlock domain that handles this permission is denied from entering
>   *   namespace types that are not explicitly allowed by a
>   *   %LANDLOCK_RULE_NAMESPACE rule.
> + * - %LANDLOCK_PERM_CAPABILITY_USE: Restrict the use of specific Linux
> + *   capabilities.  A process in a Landlock domain that handles this
> + *   permission is denied from exercising capabilities that are not
> + *   explicitly allowed by a %LANDLOCK_RULE_CAPABILITY rule.  This hook
> + *   is purely restrictive: it can deny capabilities that the kernel
> + *   would otherwise grant, but it can never grant capabilities that the
> + *   kernel already denied.
>   */
>  /* clang-format off */
>  #define LANDLOCK_PERM_NAMESPACE_ENTER			(1ULL << 0)
> +#define LANDLOCK_PERM_CAPABILITY_USE			(1ULL << 1)
>  /* clang-format on */
>  
>  #endif /* _UAPI_LINUX_LANDLOCK_H */
> diff --git a/security/landlock/Makefile b/security/landlock/Makefile
> index 734aed4ac1bf..63311d556f93 100644
> --- a/security/landlock/Makefile
> +++ b/security/landlock/Makefile
> @@ -9,6 +9,7 @@ landlock-y := \
>  	task.o \
>  	fs.o \
>  	ns.o \
> +	cap.o \
>  	tsync.o
>  
>  landlock-$(CONFIG_INET) += net.o
> diff --git a/security/landlock/access.h b/security/landlock/access.h
> index 9c67987a77ae..65227b3064db 100644
> --- a/security/landlock/access.h
> +++ b/security/landlock/access.h
> @@ -72,6 +72,13 @@ static_assert(sizeof(typeof_member(union access_masks_all, masks)) ==
>   * a single 64-bit storage unit.
>   */
>  struct perm_rules {
> +	/**
> +	 * @caps: Allowed capabilities.  Each bit corresponds to a
> +	 * ``CAP_*`` value (e.g. ``CAP_NET_RAW`` = bit 13).  Bits are
> +	 * stored directly (sequential mapping) and masked with
> +	 * ``CAP_VALID_MASK`` at rule-add time.
> +	 */
> +	u64 caps : LANDLOCK_NUM_PERM_CAP;
>  	/**
>  	 * @ns: Allowed namespace types.  Each bit corresponds to a
>  	 * sequential index assigned by the ``_LANDLOCK_NS_*`` enum
> @@ -93,10 +100,10 @@ static_assert(sizeof(struct perm_rules) == sizeof(u64));
>   * landlock_ruleset.layers FAM.
>   *
>   * Unlike filesystem and network access rights, which are tracked per-object
> - * in red-black trees, namespace types use a flat bitmask because their
> - * keyspace is small and bounded (~8 namespace types).  A single rule adds
> - * to the allowed set via bitwise OR; at enforcement time each layer is
> - * checked directly (no tree lookup needed).
> + * in red-black trees, namespace types and capabilities use flat bitmasks
> + * because their keyspaces are small and bounded (~8 namespace types, 41
> + * capabilities).  A single rule adds to the allowed set via bitwise OR; at
> + * enforcement time each layer is checked directly (no tree lookup needed).
>   */
>  struct layer_rights {
>  	/**
> diff --git a/security/landlock/audit.c b/security/landlock/audit.c
> index 46a635893914..24b7800ec479 100644
> --- a/security/landlock/audit.c
> +++ b/security/landlock/audit.c
> @@ -82,6 +82,10 @@ get_blocker(const enum landlock_request_type type,
>  	case LANDLOCK_REQUEST_NAMESPACE:
>  		WARN_ON_ONCE(access_bit != -1);
>  		return "perm.namespace_enter";
> +
> +	case LANDLOCK_REQUEST_CAPABILITY:
> +		WARN_ON_ONCE(access_bit != -1);
> +		return "perm.capability_use";
>  	}
>  
>  	WARN_ON_ONCE(1);
> diff --git a/security/landlock/audit.h b/security/landlock/audit.h
> index e9e52fb628f5..fe5d701ea45d 100644
> --- a/security/landlock/audit.h
> +++ b/security/landlock/audit.h
> @@ -22,6 +22,7 @@ enum landlock_request_type {
>  	LANDLOCK_REQUEST_SCOPE_ABSTRACT_UNIX_SOCKET,
>  	LANDLOCK_REQUEST_SCOPE_SIGNAL,
>  	LANDLOCK_REQUEST_NAMESPACE,
> +	LANDLOCK_REQUEST_CAPABILITY,
>  };
>  
>  /*
> diff --git a/security/landlock/cap.c b/security/landlock/cap.c
> new file mode 100644
> index 000000000000..536e579f63a9
> --- /dev/null
> +++ b/security/landlock/cap.c
> @@ -0,0 +1,142 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Landlock - Capability hooks
> + *
> + * Copyright © 2026 Cloudflare
> + */
> +
> +#include <linux/capability.h>
> +#include <linux/cred.h>
> +#include <linux/lsm_audit.h>
> +#include <linux/lsm_hooks.h>
> +#include <uapi/linux/landlock.h>
> +
> +#include "audit.h"
> +#include "cap.h"
> +#include "cred.h"
> +#include "limits.h"
> +#include "ruleset.h"
> +#include "setup.h"
> +
> +static const struct access_masks cap_perm = {
> +	.perm = LANDLOCK_PERM_CAPABILITY_USE,
> +};
> +
> +/**
> + * hook_capable - Deny capability use for Landlock-sandboxed processes
> + *
> + * @cred: Credentials being checked.
> + * @ns: User namespace for the capability check.
> + * @cap: Capability number (CAP_*).
> + * @opts: Capability check options.  CAP_OPT_NOAUDIT suppresses audit logging.
> + *
> + * Pure bitmask check: denies the capability if it is not in the layer's
> + * allowed set.  This hook is purely restrictive: it runs after
> + * cap_capable() (LSM_ORDER_FIRST), so it can deny capabilities that
> + * commoncap would allow, but it can never grant capabilities that
> + * commoncap denied.
> + *
> + * Return: 0 if allowed, -EPERM if capability use is denied.
> + */
> +static int hook_capable(const struct cred *cred, struct user_namespace *ns,
> +			int cap, unsigned int opts)
> +{
> +	const struct landlock_cred_security *subject;
> +	size_t denied_layer;
> +
> +	subject = landlock_get_applicable_subject(cred, cap_perm, NULL);
> +	if (!subject)
> +		return 0;
> +
> +	denied_layer = landlock_perm_is_denied(subject->domain,
> +					       LANDLOCK_PERM_CAPABILITY_USE,
> +					       landlock_cap_to_bit(cap));
> +	if (!denied_layer)
> +		return 0;
> +
> +	/*
> +	 * Respects CAP_OPT_NOAUDIT to suppress audit records for
> +	 * capability probes (e.g., ns_capable_noaudit(),
> +	 * has_capability_noaudit()).
> +	 */
> +	if (!(opts & CAP_OPT_NOAUDIT))
> +		landlock_log_denial(subject,
> +				    &(struct landlock_request){
> +					    .type = LANDLOCK_REQUEST_CAPABILITY,
> +					    .audit.type = LSM_AUDIT_DATA_CAP,
> +					    .audit.u.cap = cap,
> +					    .layer_plus_one = denied_layer,
> +				    });
> +
> +	return -EPERM;
> +}
> +
> +static struct security_hook_list landlock_hooks[] __ro_after_init = {
> +	LSM_HOOK_INIT(capable, hook_capable),
> +};
> +
> +__init void landlock_add_cap_hooks(void)
> +{
> +	security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
> +			   &landlock_lsmid);
> +}
> +
> +#ifdef CONFIG_SECURITY_LANDLOCK_KUNIT_TEST
> +
> +#include <kunit/test.h>
> +
> +static void test_cap_to_bit(struct kunit *const test)
> +{
> +	KUNIT_EXPECT_EQ(test, BIT_ULL(0), landlock_cap_to_bit(0));
> +	KUNIT_EXPECT_EQ(test, BIT_ULL(CAP_NET_RAW),
> +			landlock_cap_to_bit(CAP_NET_RAW));
> +	KUNIT_EXPECT_EQ(test, BIT_ULL(CAP_SYS_ADMIN),
> +			landlock_cap_to_bit(CAP_SYS_ADMIN));
> +	KUNIT_EXPECT_EQ(test, BIT_ULL(CAP_LAST_CAP),
> +			landlock_cap_to_bit(CAP_LAST_CAP));
> +}
> +
> +static void test_cap_to_bit_invalid(struct kunit *const test)
> +{
> +	KUNIT_EXPECT_EQ(test, 0ULL, landlock_cap_to_bit(-1));
> +	KUNIT_EXPECT_EQ(test, 0ULL, landlock_cap_to_bit(CAP_LAST_CAP + 1));
> +}
> +
> +static void test_caps_to_bits_valid(struct kunit *const test)
> +{
> +	KUNIT_EXPECT_EQ(test, (u64)CAP_VALID_MASK,
> +			landlock_caps_to_bits(CAP_VALID_MASK));
> +	KUNIT_EXPECT_EQ(test, BIT_ULL(CAP_NET_RAW),
> +			landlock_caps_to_bits(BIT_ULL(CAP_NET_RAW)));
> +}
> +
> +static void test_caps_to_bits_unknown(struct kunit *const test)
> +{
> +	KUNIT_EXPECT_EQ(test, 0ULL,
> +			landlock_caps_to_bits(BIT_ULL(CAP_LAST_CAP + 1)));
> +}
> +
> +static void test_caps_to_bits_zero(struct kunit *const test)
> +{
> +	KUNIT_EXPECT_EQ(test, 0ULL, landlock_caps_to_bits(0));
> +}
> +
> +static struct kunit_case test_cases[] = {
> +	/* clang-format off */
> +	KUNIT_CASE(test_cap_to_bit),
> +	KUNIT_CASE(test_cap_to_bit_invalid),
> +	KUNIT_CASE(test_caps_to_bits_valid),
> +	KUNIT_CASE(test_caps_to_bits_unknown),
> +	KUNIT_CASE(test_caps_to_bits_zero),
> +	{}
> +	/* clang-format on */
> +};
> +
> +static struct kunit_suite test_suite = {
> +	.name = "landlock_cap",
> +	.test_cases = test_cases,
> +};
> +
> +kunit_test_suite(test_suite);
> +
> +#endif /* CONFIG_SECURITY_LANDLOCK_KUNIT_TEST */
> diff --git a/security/landlock/cap.h b/security/landlock/cap.h
> new file mode 100644
> index 000000000000..334b6974fb95
> --- /dev/null
> +++ b/security/landlock/cap.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Landlock - Capability hooks
> + *
> + * Copyright © 2026 Cloudflare
> + */
> +
> +#ifndef _SECURITY_LANDLOCK_CAP_H
> +#define _SECURITY_LANDLOCK_CAP_H
> +
> +#include <linux/bitops.h>
> +#include <linux/bug.h>
> +#include <linux/capability.h>
> +#include <linux/compiler_attributes.h>
> +#include <linux/types.h>
> +
> +/**
> + * landlock_cap_to_bit - Convert a capability number to a compact bitmask
> + *
> + * @cap: Capability number (CAP_*).
> + *
> + * Return: BIT_ULL(@cap), or 0 if @cap is invalid (with a WARN).
> + */
> +static inline __attribute_const__ u64 landlock_cap_to_bit(const int cap)
> +{
> +	if (WARN_ON_ONCE(!cap_valid(cap)))
> +		return 0;
> +
> +	return BIT_ULL(cap);
> +}
> +
> +/**
> + * landlock_caps_to_bits - Validate and mask a capability bitmask
> + *
> + * @capabilities: Bitmask of capabilities (e.g. from user space).
> + *
> + * Return: @capabilities masked to known capabilities.  Warns if unknown
> + * bits are present (callers must pre-mask for user input).
> + */
> +static inline __attribute_const__ u64
> +landlock_caps_to_bits(const u64 capabilities)
> +{
> +	WARN_ON_ONCE(capabilities & ~CAP_VALID_MASK);
> +	return capabilities & CAP_VALID_MASK;
> +}
> +
> +__init void landlock_add_cap_hooks(void);
> +
> +#endif /* _SECURITY_LANDLOCK_CAP_H */
> diff --git a/security/landlock/cred.h b/security/landlock/cred.h
> index 68067ff53ead..257197facbae 100644
> --- a/security/landlock/cred.h
> +++ b/security/landlock/cred.h
> @@ -184,6 +184,9 @@ landlock_perm_is_denied(const struct landlock_ruleset *const domain,
>  		case LANDLOCK_PERM_NAMESPACE_ENTER:
>  			allowed = domain->layers[layer].allowed.ns;
>  			break;
> +		case LANDLOCK_PERM_CAPABILITY_USE:
> +			allowed = domain->layers[layer].allowed.caps;
> +			break;
>  		default:
>  			WARN_ON_ONCE(1);
>  			return layer + 1;
> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
> index e361b653fcf5..43e832c0deb0 100644
> --- a/security/landlock/limits.h
> +++ b/security/landlock/limits.h
> @@ -11,6 +11,7 @@
>  #define _SECURITY_LANDLOCK_LIMITS_H
>  
>  #include <linux/bitops.h>
> +#include <linux/capability.h>
>  #include <linux/limits.h>
>  #include <linux/ns/ns_common_types.h>
>  #include <uapi/linux/landlock.h>
> @@ -32,11 +33,12 @@
>  #define LANDLOCK_MASK_SCOPE		((LANDLOCK_LAST_SCOPE << 1) - 1)
>  #define LANDLOCK_NUM_SCOPE		__const_hweight64(LANDLOCK_MASK_SCOPE)
>  
> -#define LANDLOCK_LAST_PERM		LANDLOCK_PERM_NAMESPACE_ENTER
> +#define LANDLOCK_LAST_PERM		LANDLOCK_PERM_CAPABILITY_USE
>  #define LANDLOCK_MASK_PERM		((LANDLOCK_LAST_PERM << 1) - 1)
>  #define LANDLOCK_NUM_PERM		__const_hweight64(LANDLOCK_MASK_PERM)
>  
>  #define LANDLOCK_NUM_PERM_NS		__const_hweight64((u64)(CLONE_NS_ALL))
> +#define LANDLOCK_NUM_PERM_CAP		(CAP_LAST_CAP + 1)
>  
>  #define LANDLOCK_LAST_RESTRICT_SELF	LANDLOCK_RESTRICT_SELF_TSYNC
>  #define LANDLOCK_MASK_RESTRICT_SELF	((LANDLOCK_LAST_RESTRICT_SELF << 1) - 1)
> diff --git a/security/landlock/setup.c b/security/landlock/setup.c
> index a7ed776b41b4..971419d663bb 100644
> --- a/security/landlock/setup.c
> +++ b/security/landlock/setup.c
> @@ -11,6 +11,7 @@
>  #include <linux/lsm_hooks.h>
>  #include <uapi/linux/lsm.h>
>  
> +#include "cap.h"
>  #include "common.h"
>  #include "cred.h"
>  #include "errata.h"
> @@ -70,6 +71,7 @@ static int __init landlock_init(void)
>  	landlock_add_fs_hooks();
>  	landlock_add_net_hooks();
>  	landlock_add_ns_hooks();
> +	landlock_add_cap_hooks();
>  	landlock_init_id();
>  	landlock_initialized = true;
>  	pr_info("Up and running.\n");
> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
> index 152d952e98f6..38a4bf92781a 100644
> --- a/security/landlock/syscalls.c
> +++ b/security/landlock/syscalls.c
> @@ -30,6 +30,7 @@
>  #include <linux/uaccess.h>
>  #include <uapi/linux/landlock.h>
>  
> +#include "cap.h"
>  #include "cred.h"
>  #include "domain.h"
>  #include "fs.h"
> @@ -98,8 +99,9 @@ static void build_check_abi(void)
>  	struct landlock_path_beneath_attr path_beneath_attr;
>  	struct landlock_net_port_attr net_port_attr;
>  	struct landlock_namespace_attr namespace_attr;
> +	struct landlock_capability_attr capability_attr;
>  	size_t ruleset_size, path_beneath_size, net_port_size;
> -	size_t namespace_size;
> +	size_t namespace_size, capability_size;
>  
>  	/*
>  	 * For each user space ABI structures, first checks that there is no
> @@ -127,6 +129,11 @@ static void build_check_abi(void)
>  	namespace_size += sizeof(namespace_attr.namespace_types);
>  	BUILD_BUG_ON(sizeof(namespace_attr) != namespace_size);
>  	BUILD_BUG_ON(sizeof(namespace_attr) != 16);
> +
> +	capability_size = sizeof(capability_attr.allowed_perm);
> +	capability_size += sizeof(capability_attr.capabilities);
> +	BUILD_BUG_ON(sizeof(capability_attr) != capability_size);
> +	BUILD_BUG_ON(sizeof(capability_attr) != 16);
>  }
>  
>  /* Ruleset handling */
> @@ -449,14 +456,57 @@ static int add_rule_namespace(struct landlock_ruleset *const ruleset,
>  	return 0;
>  }
>  
> +static int add_rule_capability(struct landlock_ruleset *const ruleset,
> +			       const void __user *const rule_attr)
> +{
> +	struct landlock_capability_attr cap_attr;
> +	int res;
> +	access_mask_t mask;
> +
> +	/* Copies raw user space buffer. */
> +	res = copy_from_user(&cap_attr, rule_attr, sizeof(cap_attr));
> +	if (res)
> +		return -EFAULT;
> +
> +	/* Informs about useless rule: empty allowed_perm. */
> +	if (!cap_attr.allowed_perm)
> +		return -ENOMSG;
> +
> +	/* The allowed_perm must match LANDLOCK_PERM_CAPABILITY_USE. */
> +	if (cap_attr.allowed_perm != LANDLOCK_PERM_CAPABILITY_USE)
> +		return -EINVAL;
> +
> +	/* Checks that allowed_perm matches the @ruleset constraints. */
> +	mask = landlock_get_perm_mask(ruleset, 0);
> +	if (!(mask & LANDLOCK_PERM_CAPABILITY_USE))
> +		return -EINVAL;
> +
> +	/* Informs about useless rule: empty capabilities. */
> +	if (!cap_attr.capabilities)
> +		return -ENOMSG;
> +
> +	/*
> +	 * Stores only the capabilities this kernel knows about.
> +	 * Unknown bits are silently accepted for forward compatibility:
> +	 * user space compiled against newer headers can pass new
> +	 * CAP_* bits without getting EINVAL on older kernels.
> +	 * Unknown bits have no effect because no hook checks them.
> +	 */
> +	mutex_lock(&ruleset->lock);
> +	ruleset->layers[0].allowed.caps |=
> +		landlock_caps_to_bits(cap_attr.capabilities & CAP_VALID_MASK);
> +	mutex_unlock(&ruleset->lock);
> +	return 0;
> +}
> +
>  /**
>   * sys_landlock_add_rule - Add a new rule to a ruleset
>   *
>   * @ruleset_fd: File descriptor tied to the ruleset that should be extended
>   *		with the new rule.
>   * @rule_type: Identify the structure type pointed to by @rule_attr:
> - *             %LANDLOCK_RULE_PATH_BENEATH, %LANDLOCK_RULE_NET_PORT, or
> - *             %LANDLOCK_RULE_NAMESPACE.
> + *             %LANDLOCK_RULE_PATH_BENEATH, %LANDLOCK_RULE_NET_PORT,
> + *             %LANDLOCK_RULE_NAMESPACE, or %LANDLOCK_RULE_CAPABILITY.
>   * @rule_attr: Pointer to a rule (matching the @rule_type).
>   * @flags: Must be 0.
>   *
> @@ -508,6 +558,8 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
>  		return add_rule_net_port(ruleset, rule_attr);
>  	case LANDLOCK_RULE_NAMESPACE:
>  		return add_rule_namespace(ruleset, rule_attr);
> +	case LANDLOCK_RULE_CAPABILITY:
> +		return add_rule_capability(ruleset, rule_attr);
>  	default:
>  		return -EINVAL;
>  	}
> -- 
> 2.53.0
> 
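
For readers following along, here is a minimal user-space sketch of what
the two cap.h helpers above compute, with the WARN_ON_ONCE() calls left
out.  The SKETCH_* names are hypothetical stand-ins for the kernel's
BIT_ULL(), CAP_LAST_CAP and CAP_VALID_MASK, and the value 40 is an
assumption (CAP_CHECKPOINT_RESTORE in current uapi headers); this is not
the patch's code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 40 is CAP_LAST_CAP in current headers; check your own. */
#define SKETCH_CAP_LAST_CAP 40
#define SKETCH_BIT_ULL(n) (1ULL << (n))
#define SKETCH_CAP_VALID_MASK (SKETCH_BIT_ULL(SKETCH_CAP_LAST_CAP + 1) - 1)

static_assert(SKETCH_CAP_LAST_CAP < 63, "mask must fit in 64 bits");

/* Models landlock_cap_to_bit(): one bit per valid capability, else 0. */
static inline uint64_t sketch_cap_to_bit(const int cap)
{
	if (cap < 0 || cap > SKETCH_CAP_LAST_CAP)
		return 0;

	return SKETCH_BIT_ULL(cap);
}

/* Models landlock_caps_to_bits(): drops bits above the last known cap. */
static inline uint64_t sketch_caps_to_bits(const uint64_t capabilities)
{
	return capabilities & SKETCH_CAP_VALID_MASK;
}
```

With these definitions, sketch_cap_to_bit(-1) and
sketch_cap_to_bit(SKETCH_CAP_LAST_CAP + 1) both yield 0, which mirrors
what test_cap_to_bit_invalid() expects from the real helper.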

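The order of checks in add_rule_capability() determines which errno user
space sees, so here is a rough user-space model of that order.  It
leaves out copy_from_user() and the ruleset lock, and every sketch_* and
SKETCH_* name is hypothetical.  In particular the numeric value of
SKETCH_PERM_CAPABILITY_USE is a placeholder, not the uapi value, and 41
valid capability bits is an assumption about current headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder value, NOT taken from the uapi header. */
#define SKETCH_PERM_CAPABILITY_USE (1ULL << 1)
/* Assumption: capabilities 0..40 are known to this kernel. */
#define SKETCH_RULE_CAP_VALID_MASK ((1ULL << 41) - 1)

struct sketch_capability_attr {
	uint64_t allowed_perm;
	uint64_t capabilities;
};

/*
 * Mirrors the check order of the quoted add_rule_capability():
 * empty perm, wrong perm, unhandled perm, empty capabilities, store.
 */
static inline int
sketch_add_rule_capability(const uint64_t handled_perm,
			   uint64_t *const allowed_caps,
			   const struct sketch_capability_attr *const attr)
{
	assert(allowed_caps && attr);

	if (!attr->allowed_perm)
		return -ENOMSG;
	if (attr->allowed_perm != SKETCH_PERM_CAPABILITY_USE)
		return -EINVAL;
	if (!(handled_perm & SKETCH_PERM_CAPABILITY_USE))
		return -EINVAL;
	if (!attr->capabilities)
		return -ENOMSG;

	/* Unknown bits are dropped here rather than rejected. */
	*allowed_caps |= attr->capabilities & SKETCH_RULE_CAP_VALID_MASK;
	return 0;
}
```

One consequence in this model: a rule whose capabilities field has only
unknown bits set passes the emptiness check, returns 0, and stores
nothing.  If I read the quoted code correctly the patch behaves the same
way, which fits the forward-compatibility comment.
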
Reviewed-by: Günther Noack <gnoack@google.com>

—Günther

end of thread, other threads:[~2026-05-08 15:55 UTC | newest]

Thread overview: 3+ messages
     [not found] <20260312100444.2609563-1-mic@digikod.net>
     [not found] ` <20260312100444.2609563-12-mic@digikod.net>
     [not found]   ` <20260422.5a7059c06fb0@gnoack.org>
     [not found]     ` <20260423.yipaikooJ6oo@digikod.net>
2026-05-08 15:13       ` [RFC PATCH v1 11/11] landlock: Add documentation for capability and namespace restrictions Günther Noack
     [not found] ` <20260312100444.2609563-6-mic@digikod.net>
2026-05-08 15:46   ` [RFC PATCH v1 05/11] landlock: Enforce namespace entry restrictions Günther Noack
     [not found] ` <20260312100444.2609563-7-mic@digikod.net>
2026-05-08 15:54   ` [RFC PATCH v1 06/11] landlock: Enforce capability restrictions Günther Noack
