From: "Günther Noack" <gnoack@google.com>
To: "Mickaël Salaün" <mic@digikod.net>
Cc: "Günther Noack" <gnoack3000@gmail.com>,
"Christian Brauner" <brauner@kernel.org>,
"Paul Moore" <paul@paul-moore.com>,
"Serge E . Hallyn" <serge@hallyn.com>,
"Justin Suess" <utilityemal77@gmail.com>,
"Lennart Poettering" <lennart@poettering.net>,
"Mikhail Ivanov" <ivanov.mikhail1@huawei-partners.com>,
"Nicolas Bouchinet" <nicolas.bouchinet@oss.cyber.gouv.fr>,
"Shervin Oloumi" <enlightened@google.com>,
"Tingmao Wang" <m@maowtm.org>,
kernel-team@cloudflare.com, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org
Subject: Re: [RFC PATCH v1 11/11] landlock: Add documentation for capability and namespace restrictions
Date: Fri, 8 May 2026 17:13:48 +0200
Message-ID: <af39llfstIu_weMM@google.com>
In-Reply-To: <20260423.yipaikooJ6oo@digikod.net>

On Thu, Apr 23, 2026 at 03:52:12PM +0200, Mickaël Salaün wrote:
> On Wed, Apr 22, 2026 at 10:38:33PM +0200, Günther Noack wrote:
> > Hello!
> >
> > On Thu, Mar 12, 2026 at 11:04:44AM +0100, Mickaël Salaün wrote:
> > > Document the two new Landlock permission categories in the userspace
> > > API guide, admin guide, and kernel security documentation.
> > >
> > > The userspace API guide adds sections on capability restriction
> > > (LANDLOCK_PERM_CAPABILITY_USE with LANDLOCK_RULE_CAPABILITY), namespace
> > > restriction (LANDLOCK_PERM_NAMESPACE_ENTER with LANDLOCK_RULE_NAMESPACE
> > > covering creation via unshare/clone and entry via setns), and the
> > > backward-compatible degradation pattern for ABI < 9. A table documents
> > > the per-namespace-type capability requirements for both creation and
> > > entry.
> > >
> > > The admin guide adds the new perm.namespace_enter and
> > > perm.capability_use audit blocker names with their object identification
> > > fields (namespace_type, namespace_inum, capability).
> > >
> > > The kernel security documentation adds a "Ruleset restriction models"
> > > section defining the three models (handled_access_*, handled_perm,
> > > scoped), their coverage and compatibility properties, and the criteria
> > > for choosing between them for future features. It also documents
> > > composability with user namespaces and adds kernel-doc references for
> > > the new capability and namespace headers.
> > >
> > > Cc: Christian Brauner <brauner@kernel.org>
> > > Cc: Günther Noack <gnoack@google.com>
> > > Cc: Paul Moore <paul@paul-moore.com>
> > > Cc: Serge E. Hallyn <serge@hallyn.com>
> > > Signed-off-by: Mickaël Salaün <mic@digikod.net>
> > > ---
> > > Documentation/admin-guide/LSM/landlock.rst | 19 ++-
> > > Documentation/security/landlock.rst | 80 ++++++++++-
> > > Documentation/userspace-api/landlock.rst | 156 ++++++++++++++++++++-
> > > 3 files changed, 245 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/Documentation/admin-guide/LSM/landlock.rst b/Documentation/admin-guide/LSM/landlock.rst
> > > index 9923874e2156..99c6a599ce9e 100644
> > > --- a/Documentation/admin-guide/LSM/landlock.rst
> > > +++ b/Documentation/admin-guide/LSM/landlock.rst
> > > @@ -6,7 +6,7 @@ Landlock: system-wide management
> > > ================================
> > >
> > > :Author: Mickaël Salaün
> > > -:Date: January 2026
> > > +:Date: March 2026
> > >
> > > Landlock can leverage the audit framework to log events.
> > >
> > > @@ -59,14 +59,25 @@ AUDIT_LANDLOCK_ACCESS
> > > - scope.abstract_unix_socket - Abstract UNIX socket connection denied
> > > - scope.signal - Signal sending denied
> > >
> > > + **perm.*** - Permission restrictions (ABI 9+):
> > > + - perm.namespace_enter - Namespace entry was denied (creation via
> > > + :manpage:`unshare(2)` / :manpage:`clone(2)` or joining via
> > > + :manpage:`setns(2)`);
> > > + ``namespace_type`` indicates the type (hex CLONE_NEW* bitmask),
> > > + ``namespace_inum`` identifies the target namespace for
> > > + :manpage:`setns(2)` operations
> > > + - perm.capability_use - Capability use was denied;
> > > + ``capability`` indicates the capability number
> > > +
> > > Multiple blockers can appear in a single event (comma-separated) when
> > > multiple access rights are missing. For example, creating a regular file
> > > in a directory that lacks both ``make_reg`` and ``refer`` rights would show
> > > ``blockers=fs.make_reg,fs.refer``.
> > >
> > > - The object identification fields (path, dev, ino for filesystem; opid,
> > > - ocomm for signals) depend on the type of access being blocked and provide
> > > - context about what resource was involved in the denial.
> > > + The object identification fields depend on the type of access being blocked:
> > > + ``path``, ``dev``, ``ino`` for filesystem; ``opid``, ``ocomm`` for signals;
> > > + ``namespace_type`` and ``namespace_inum`` for namespace operations;
> > > + ``capability`` for capability use.
> > >
> > >
> > > AUDIT_LANDLOCK_DOMAIN
> > > diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
> > > index 3e4d4d04cfae..cd3d640ca5c9 100644
> > > --- a/Documentation/security/landlock.rst
> > > +++ b/Documentation/security/landlock.rst
> > > @@ -7,7 +7,7 @@ Landlock LSM: kernel documentation
> > > ==================================
> > >
> > > :Author: Mickaël Salaün
> > > -:Date: September 2025
> > > +:Date: March 2026
> > >
> > > Landlock's goal is to create scoped access-control (i.e. sandboxing). To
> > > harden a whole system, this feature should be available to any process,
> > > @@ -89,6 +89,72 @@ this is required to keep access controls consistent over the whole system, and
> > > this avoids unattended bypasses through file descriptor passing (i.e. confused
> > > deputy attack).
> > >
> > > +Composability with user namespaces
> > > +----------------------------------
> > > +
> > > +Landlock domain-based scoping and the kernel's user namespace-based capability
> > > +scoping enforce isolation over independent hierarchies. Landlock checks domain
> > > +ancestry; the kernel's ``ns_capable()`` checks user namespace ancestry. These
> > > +hierarchies are orthogonal: Landlock enforcement is deterministic with respect
> > > +to its own configuration, regardless of namespace or capability state, and vice
> > > +versa. This orthogonality is a design invariant that must hold for all new
> > > +scoped features.
> > > +
> > > +Ruleset restriction models
> > > +--------------------------
> >
> > I have to second Justin, it's a good idea to introduce this explanation.
> >
> > > +
> > > +Landlock provides three restriction models, each with different coverage
> > > +and compatibility properties.
> > > +
> > > +Access rights (``handled_access_*``)
> > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +Access rights control **enumerated operations on kernel objects**
> > > +identified by a rule key (a file hierarchy or a network port). Each
> > > +``handled_access_*`` field declares a set of access rights that the
> > > +ruleset restricts. Multiple access rights share a single rule type.
> > > +Operations for which no access right exists yet remain uncontrolled;
> > > +new rights are added incrementally across ABI versions.
> > > +
> > > +Permissions (``handled_perm``)
> > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +Permissions control **broad operations enforced at single kernel
> > > +chokepoints**, achieving complete deny-by-default coverage. Each
> > > +``LANDLOCK_PERM_*`` flag maps to its own rule type. When a ruleset
> > > +handles a permission, all instances of that operation are denied unless
> > > +explicitly allowed by a rule. New kernel values (new ``CAP_*``
> > > +capabilities, new ``CLONE_NEW*`` namespace types) are automatically
> > > +denied without any Landlock update.
> >
> > I find the terminology of "chokepoints" and "gateways" in this and the
> > header documentation a bit vague; you could argue that opening a file
> > for reading is also a chokepoint/gateway for using read() later on;
> > it's not immediately clear to me how that's delineated.
>
> Yeah, I wanted to express something wider than a fine-grained access
> right. Any alternative words that would fit better?

I also find it difficult to explain. A "critical enforcement point",
maybe?

  Permissions control **permission checks at critical enforcement
  points**, independent of individual kernel objects. They guard
  critical features which are prerequisites for further access, such
  as entering namespaces and using capabilities, and do so in a
  deny-by-default manner (all namespace and capability types are
  denied without having to list these individually in the ruleset).

WDYT?

(FWIW, I also found the term "Policy Enforcement Point" on the web, but
that seems to be an Enterprise Software term which probably has a more
specific meaning there; probably better to avoid that name.)

> > In my mind, the handled_* groups of access rights are usually defined
> > by the "namespace" of the objects they are protecting, more than
> > anything else: handled_access_fs: file paths, handled_access_net:
> > struct sockaddr (which we only expose as "port" for now).
> >
> > To play the devil's advocate, a possible alternative would have been
> > to introduce:
> >
> > handled_access_ns with values LANDLOCK_ACCESS_NS_FOO_ENTER,
> > LANDLOCK_ACCESS_NS_BAR_ENTER, etc. (and documenting somewhere that
> > these are guaranteed to stay in sync; a static assert is enough to
> > make sure they do).
>
> That was actually one of my initial versions, but I couldn't find any
> meaningful other access rights that would both be useful for the
> sandboxing use case and worth the implementation. At the end I
> concluded that we needed "ambient" access rights for things that are not
> really tied to existing kernel objects, and to be able to fully express
> current and future properties, hence using non-Landlock UAPI
> (capabilities, namespace types...). The handled_perm name was the least
> ambiguous one I could find, which still makes sense.
>
> Another important property is that the permissions rules don't have
> access rights, only *one* permission bit which could be removed. I
> chose to keep it as a safeguard (for UAPI check) and to still be able
> to add new ones for such rule if one day we really find a useful use
> case. Anyway, it's basically free.

Yes, sounds fair. I also think these two points are the crucial ones
here, namely (a) it's not specific to a kernel object, and (b) the
deny-by-default property (you don't need to list out all the types in
the ruleset to block them all). (My suggested rephrasing above talks
about these too.)

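To illustrate point (b), here is a toy model in Python. It is not kernel
code and not the Landlock UAPI; the function names and the use of string
sets instead of bitmasks are assumptions made purely for illustration. It
contrasts an enumerated scheme, where only the listed operations are
restricted, with a deny-by-default scheme, where every value without an
allow rule is denied, including values the ruleset author did not know.

```python
# Toy model only: sets of names stand in for bitmasks; nothing here is real UAPI.

def enumerated_allows(handled, allowed, op):
    """handled_access_*-style: an operation is restricted only if it is handled."""
    if op not in handled:
        return True  # no access right was listed for it, so it stays uncontrolled
    return op in allowed


def deny_by_default_allows(allowed, op):
    """handled_perm-style: once the permission is handled, no rule means denied."""
    return op in allowed
```

In this model, `enumerated_allows({"CAP_SYS_ADMIN"}, set(), "CAP_NEW")`
lets the unlisted `CAP_NEW` through, whereas
`deny_by_default_allows({"CAP_NET_BIND_SERVICE"}, "CAP_NEW")` denies it.
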
> > handled_access_caps with values LANDLOCK_ACCESS_CAPS_USE_FOO,
> > LANDLOCK_ACCESS_CAPS_USE_BAR, etc., also guaranteed to stay in sync.
>
> Genuine question: what would be these FOO and BAR? I couldn't find
> anything worth it. The idea is to have a simple interface. In fact,
> initially I didn't have these suffixes (i.e. _USE, _ENTER), and they are
> not really needed, but these are also safeguards in the case we would
> need one, and the main motivation is to make the semantics clear to
> users (and more consistent with other Landlock access rights).

By "FOO" and "BAR" I meant to imply the different capabilities, e.g.,
LANDLOCK_ACCESS_CAP_USE_AUDIT_CONTROL,
LANDLOCK_ACCESS_CAP_USE_AUDIT_READ, LANDLOCK_ACCESS_CAP_USE_AUDIT_WRITE,
LANDLOCK_ACCESS_CAP_USE_BLOCK_SUSPEND, etc.

> > That way the blocked accesses would still be "operations", and we
> > would not need to have rules for them because the "object" being
> > protected are the processes within the Landlock domain, so to say.
>
> I'm not sure I understand, but another previous version was to just
> put the capability (and namespace type) bits directly in the ruleset
> struct. The issue with this approach is that it doesn't work well with
> a deny-by-default enforcement, and this would not be extensible, and
> this would not handle well compatibility (fields set to zero by
> default).
>
> >
> > Arguably, the LANDLOCK_ACCESS_FS_MAKE_* rights already follow a
> > similar pattern.
>
> Hmm, I'm not following.

What I meant is that these are "rolled out" in a similar way to my
LANDLOCK_ACCESS_CAP_USE_... examples above, because they list the
different file types in LANDLOCK_ACCESS_FS_MAKE_CHAR, ..._MAKE_DIR,
..._MAKE_REG, ..._MAKE_SOCK, etc.

> > To be clear, I am myself only 50% convinced whether the API would be
> > better. The implementation would be easier (but that doesn't count
> > much in comparison).
> >
> >
> > > +Each permission flag names a single gateway operation whose control
> > > +transitively covers an open-ended set of downstream operations: for
> > > +example, exercising a capability enables privileged operations across
> > > +many subsystems; entering a namespace enables gaining capabilities in a
> > > +new context.
> > > +
> > > +Permission rules identify what to allow using constants defined by other
> > > +kernel subsystems (``CAP_*``, ``CLONE_NEW*``). Unknown values are
> > > +silently ignored because deny-by-default ensures they are denied anyway.
> > > +In contrast, unknown ``LANDLOCK_PERM_*`` flags in ``handled_perm`` are
> > > +rejected (``-EINVAL``), since Landlock owns that namespace.
> >
> > OK I played through the compatibility scenarios which puzzled me in my
> > reply to the cover letter, for both namespaces and capabilities.
> > Namespaces are OK, so I'm just including that for completeness and for
> > comparison, but I think the capabilities might be tricky?
> >
> >
> > Case A: Namespaces
> >
> > In the scenario where a caller restricts
> > LANDLOCK_PERM_NAMESPACE_ENTER, but then adds a rule to allow a
> > non-existent namespace number like 1<<63.
> >
> > Landlock ABI v9:
> > * The rule is accepted and the unknown value for the namespace type
> > silently ignored
> > * It is not possible to enter the namespace because the namespace API
> > doesn't exist for it. (But that's appropriate.)
>
> Yes, the namespace would just be unknown to the kernel, Landlock doesn't
> do anything here.
>
> >
> > Landlock ABI v_future (the namespace type 1<<63 exists now):
> > * The rule continues to be accepted.
> > * When trying to exercise the namespace type, it works.
>
> It works because the kernel now knows about this namespace. Again,
> nothing related to Landlock specifically.
>
> >
> > It seems that this scenario works fine. In the earlier version,
> > entering the namespace already doesn't work because the kernel doesn't
> > have support for it.
> >
> >
> > Case B: Capabilities
> >
> > When new capabilities are introduced, I see that people have used the
> > pattern where these capabilities are split off from operations which
> > were previously controlled by CAP_SYS_ADMIN. An example is commit
> > a17b53c4a4b5 ("bpf, capability: Introduce CAP_BPF"), which states:
> >
> > Split BPF operations that are allowed under CAP_SYS_ADMIN into
> > combination of CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN. For backward
> > compatibility include them in CAP_SYS_ADMIN as well.
> >
> > (The same pattern was also used in the introduction of
> > CAP_CHECKPOINT_RESTORE and CAP_PERFMON. CAP_AUDIT_READ is older and
> > did it differently.)
>
> The key point here (and the architectural limitation) is that a new
> capability cannot completely replace an existing one. The original
> capability check will remain forever.
>
> >
> > Let's say there is a frobnicate() syscall guarded by CAP_SYS_ADMIN. A
> > future kernel introduces CAP_FOO and then checks for frobnicate() that
> > either one of CAP_FOO or CAP_SYS_ADMIN are present.
> >
> > A caller creates a ruleset restricting capability use with Landlock,
> > and adds a rule to allow CAP_FOO but not CAP_SYS_ADMIN (e.g.,
> > ^CAP_SYS_ADMIN)
> >
> > Landlock ABI v9: (CAP_FOO doesn't exist)
> > * The rule for CAP_FOO is accepted and the unknown value for the
> > capability silently ignored.
> > * The call to frobnicate() fails because the use of the capability is
> > forbidden
> >
> > Landlock ABI v10: (CAP_FOO starts to exist)
> > * The rule continues to be accepted
> > * The call to frobnicate() **succeeds now**, because the new kernel guards
> > the operation by either one of those capabilities.
> >
> >
> > So... for capabilities, it seems to be slightly incompatible if users
> > allow capabilities with a rule which are not known yet? The reason
> > for that is the way how capabilities "fork off" from CAP_SYS_ADMIN.
>
> The key point is that the compatibility is deferred to the other kernel
> subsystems. User space needs to know which capabilities (or namespace
> types) are supported before using them. It's not a Landlock
> compatibility issue.

Fair enough, OK then. Paraphrasing, to make sure we are aligned: If you
allow-list one of the newer capabilities through landlock_add_rule, and
then run your program on a kernel where that capability doesn't exist
yet, you cannot expect that to work. Seems fair.

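For reference, Case B can be played through with a toy model in Python.
This is only a sketch of the scenario as described in this thread:
frobnicate() and CAP_FOO are the hypothetical names from above, the
helper functions are made up, and the task is assumed to hold every
capability the running kernel knows about.

```python
# Toy model of Case B; frobnicate() and CAP_FOO are hypothetical names
# taken from the scenario, not real kernel interfaces.

def capable(cap, kernel_caps, held, landlock_allowed):
    """A check passes only if the kernel knows the capability, the task
    holds it, and the Landlock domain has an allow rule for it."""
    return cap in kernel_caps and cap in held and cap in landlock_allowed


def frobnicate_permitted(kernel_caps, held, landlock_allowed):
    """Old kernels check CAP_SYS_ADMIN only; kernels that know CAP_FOO
    accept either capability (the CAP_BPF-style split)."""
    accepted = ["CAP_SYS_ADMIN"]
    if "CAP_FOO" in kernel_caps:
        accepted.append("CAP_FOO")
    return any(capable(c, kernel_caps, held, landlock_allowed) for c in accepted)


OLD_KERNEL = {"CAP_SYS_ADMIN"}             # CAP_FOO does not exist yet
NEW_KERNEL = {"CAP_SYS_ADMIN", "CAP_FOO"}  # CAP_FOO was split off
RULES = {"CAP_FOO"}                        # allow CAP_FOO, not CAP_SYS_ADMIN
```

With the same RULES, `frobnicate_permitted(OLD_KERNEL, OLD_KERNEL, RULES)`
is False and `frobnicate_permitted(NEW_KERNEL, NEW_KERNEL, RULES)` is
True. The difference comes only from the kernel's set of capabilities,
not from anything in the Landlock rules.
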
> > I mean, I can see that it's a pretty fringe scenario if users pass
> > capabilities that don't exist yet, but it *is* strictly speaking an
> > incompatibility. Should we check the range of the passed capabilities?
> > Am I overlooking any downsides to this if we force users to stay
> > between 0 and CAP_LAST_CAP?
>
> Checking the range of known capabilities (or namespace types) could
> break the same Landlock rules on different kernels even if targeting the
> same Landlock ABI version, which would be much worse. I definitely
> prefer to have idempotent/deterministic Landlock rules.

Hm, good point. The list of supported capabilities cannot be probed through
the Landlock ABI number.

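A toy Python sketch of that determinism argument follows. The two
functions and the CAP_LAST_CAP values passed to them are assumptions for
illustration only; both modeled kernels are taken to advertise the same
Landlock ABI version.

```python
# Toy model; function names are made up and the CAP_LAST_CAP values used
# with them are illustrative, not tied to specific kernel releases.

def add_rule_range_checked(cap, cap_last_cap):
    """Variant that validates against the running kernel's CAP_LAST_CAP."""
    if not 0 <= cap <= cap_last_cap:
        raise ValueError("EINVAL")
    return True


def add_rule_unchecked(cap, cap_last_cap):
    """Variant where unknown values are accepted and silently ignored."""
    return True
```

For a capability number 41, the range-checked variant succeeds when
cap_last_cap is 41 but raises when it is 40, so the same ruleset would
behave differently on two kernels with the same ABI. The unchecked
variant accepts the rule on both.
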
—Günther